SoftBank doubles down with a record $40 billion loan while Google quietly solves one of AI's most expensive infrastructure problems.
💰 SoftBank Secures Record $40B Bridge Loan for OpenAI Stake
Decoded: SoftBank Group confirmed on Friday that it secured a $40 billion unsecured bridge loan, maturing in March 2027, to fund its investment in OpenAI. The capital was arranged by a banking syndicate including JPMorgan Chase, Goldman Sachs, Mizuho, SMBC, and MUFG, which underwrote one of the largest AI financing transactions on record. SoftBank had previously agreed to invest $30 billion in OpenAI through its Vision Fund 2; this loan fulfills that commitment and covers related costs. SoftBank and OpenAI are co-founders of the Stargate Project, which has committed up to $500 billion to U.S. AI infrastructure over four years. CEO Masayoshi Son has pledged $100 billion in U.S. AI and infrastructure investment through 2028. (Reuters, Bloomberg, March 27, 2026)
Why it matters: SoftBank's willingness to absorb $40 billion in unsecured debt maturing in just 12 months to fund an AI stake signals how determined it is not to miss the OpenAI window. For OpenAI, which is mid-restructuring into a for-profit model, SoftBank's commitment provides capital and valuation support heading into its most consequential year. For Microsoft (MSFT), OpenAI's largest existing backer, SoftBank's scale entry reduces concentration risk while potentially complicating governance. The deal confirms that the private AI infrastructure race has not slowed, even as public market AI valuations remain volatile.
🤖 Google's TurboQuant Cuts AI Model Memory Overhead by 6x
Decoded: Google Research published TurboQuant on March 26 — a vector quantization algorithm that shrinks AI model key-value (KV) cache memory at least sixfold with zero accuracy loss. KV caches are core inference infrastructure: they store the attention keys and values for every previous token, letting large language models process long contexts without recomputing attention from scratch on each pass, and they are a primary driver of memory costs at AI inference scale. Standard quantization methods reduce memory but introduce 1–2 bits of per-block overhead (scale and offset metadata), partially canceling the savings. TurboQuant eliminates that overhead through an optimized quantization approach. The algorithm will be presented at ICLR 2026. (Google Research official blog, The Verge, March 26, 2026)
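To make the memory stakes concrete, here is a back-of-the-envelope sketch of KV-cache sizing. The model dimensions below (layers, KV heads, head dimension) are illustrative assumptions, not any specific production model, and the code does not implement TurboQuant itself — it only shows what the headline 6x figure means in gigabytes:

```python
# Rough KV-cache sizing sketch. Dimensions are illustrative assumptions,
# not any real model; this is not TurboQuant's algorithm.

def kv_cache_bytes(seq_len, layers=32, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Memory for one sequence's KV cache: 2 tensors (K and V) per layer,
    each kv_heads * head_dim values per token, at fp16 (2 bytes/value)."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * seq_len

base = kv_cache_bytes(128_000)   # one 128k-token context at fp16
reduced = base / 6               # headline 6x reduction

print(f"fp16 KV cache:      {base / 2**30:.1f} GiB")
print(f"after 6x reduction: {reduced / 2**30:.1f} GiB")
```

Under these assumptions a single 128k-token context drops from roughly 15.6 GiB of KV cache to about 2.6 GiB — the difference between one long context dominating a GPU and several fitting comfortably.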
Why it matters: KV cache memory is one of the largest cost drivers in large-scale AI inference — every context token occupies GPU memory that must be provisioned and paid for. A 6x reduction with zero accuracy loss allows the same GPU fleet to serve significantly more concurrent users or process longer context windows at the same cost. For Google (GOOGL), publishing this openly gives cloud customers lower inference costs — a competitive move against AWS and Azure in inference-as-a-service. For Nvidia, efficiency gains reduce upgrade cycle urgency, as more efficient memory use delays the need for additional HBM capacity.
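The fleet-capacity claim above can also be sketched numerically. With a fixed HBM budget reserved for KV cache, a 6x smaller cache translates almost directly into more concurrent sequences. All figures here are illustrative assumptions (the same hypothetical model dimensions as before, and an arbitrary 60 GiB cache budget), not vendor numbers:

```python
# Sketch: concurrent 32k-token sequences per GPU before/after a 6x
# KV-cache reduction. All dimensions are illustrative assumptions.

KV_BYTES_PER_TOKEN = 2 * 32 * 8 * 128 * 2   # 2 (K,V) * layers * kv_heads * head_dim * fp16
SEQ_LEN = 32_000
CACHE_BUDGET = 60 * 2**30                   # hypothetical HBM set aside for KV cache

per_seq = KV_BYTES_PER_TOKEN * SEQ_LEN      # bytes per 32k-token sequence at fp16

print("fp16:", CACHE_BUDGET // per_seq, "concurrent sequences")
print("6x:  ", CACHE_BUDGET // (per_seq // 6), "concurrent sequences")
```

Under these assumptions the same memory budget goes from serving about 15 concurrent sequences to about 92 — which is why a pure software change can defer hardware purchases.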
Stay decoded. See you tomorrow.
— The Get AI Decoded Team
Enjoyed this article?
Subscribe free — AI news decoded for investors, every morning.
No spam. Unsubscribe anytime.