This chip startup just raised $135M on a bet that AI's biggest bottleneck isn't compute — it's memory
XCENA, a four-year-old chip startup, raises $135 million to develop a chip that places compute capabilities close to DRAM, aiming to solve AI's memory bottleneck.

Every time you ask ChatGPT a question, your request triggers a data relay race. Information leaves memory, passes through a CPU for preprocessing, travels to a GPU for heavy computation, and then makes its way back, and that entire journey repeats for every single word the AI generates.

The bottleneck is structural: every request is routed through some of the most expensive and power-intensive chips in the industry. That inefficiency is exactly what XCENA, a startup with offices in South Korea and the U.S., is trying to solve.

The four-year-old startup has designed a chip that places compute capabilities much closer to DRAM (the fast, short-term memory chips that store data a processor is actively using), allowing routine data operations to be handled near memory, without the costly round trips between CPUs, GPUs, and memory.

If it works at scale, the implications for AI infrastructure costs could be significant, which largely explains investor enthusiasm around the company. Indeed, XCENA just raised $135 million in a Series B at a valuation of $570 million, bringing its total raised to $185 million. XCENA CEO Jin Kim co-founded the startup in 2022 alongside CTO Dohun Kim and CPO Harry Juhyun Kim, all veterans of Samsung and SK Hynix, the memory giants that supply chips powering Nvidia's GPUs.

"CPUs and GPUs have both gotten smarter over the decades, but memory never did," Kim said in an interview. "The recent rise in memory prices and related stocks points to a broader shift in AI infrastructure toward memory-centric architectures."

This month, the three companies that dominate the global memory chip market (Samsung, SK Hynix, and Micron) each crossed a trillion-dollar valuation for the first time. XCENA is betting its business on the thesis that "inference isn't just a compute problem; it's increasingly a memory scaling problem," said Kim.

XCENA's chip, the MX1, connects to the CPU through CXL (Compute Express Link), essentially a dedicated express lane between the processor and memory, and processes data before it ever needs to leave the memory module. It brings compute to the data, not the other way around.

The company claims that what used to require 10 servers could potentially run on just one. While GPUs excel at matrix multiplication, the heavy math behind AI model training, much of the surrounding data orchestration, including preprocessing, KV cache management, and data caching, still runs on CPUs. "Our chip handles those tasks directly within the memory module itself," Kim said.

The MX1 is still a prototype. Mass-production chips are scheduled to roll off Samsung's foundry lines by the end of 2026, and the company expects to generate revenue starting in 2027.

Source: TechCrunch