CPU Quietly Returns to the Center Stage of AI Compute

Bitsfull, 2026/06/03 18:33

Summary:

It's not that the CPU is faster than the GPU now, but rather the AI **workload** has changed.


For the past three years, the story of AI computing power has revolved around the GPU.


From NVIDIA's H100 and H200 to the GB200 and GB300, and on to cloud vendors racing to build hundred-thousand-card clusters, every industry narrative has made the same point: the compute bottleneck is the GPU. In this story, the CPU has long been tacitly cast in a minor "complementary" role, trailing behind the GPU and handling the work the GPU is poorly suited to.


However, starting in 2026, this narrative began to show some cracks.


On June 1, Intel launched the Xeon 6+ processor in Beijing, designed for cloud-native, intelligent edge AI, and network-intensive workloads. This is the first data center CPU on Intel's 18A process.


In Intel's own description, the Xeon 6+ is not a GPU's "complement," but rather the AI infrastructure's "control plane," responsible for orchestration, concurrency, and data flow.


"The path to AI expansion lies not in the stacking of individual components, but in the system's collaborative operation." Kevork Kechichian, Executive Vice President and General Manager of Intel's Data Center Business, said during a briefing, "With AI transitioning to the era of intelligent edge, orchestration, concurrency, and data flow have become new limiting factors.


"This once again reinforces a core fact: the CPU remains the control plane of modern AI infrastructure."


This judgment is not Intel's alone. In February of this year, the independent semiconductor research firm SemiAnalysis released a 2026 data center CPU landscape report titled "CPU Returns," which reached a similarly direct assessment: as AI training and inference are deployed at massive scale, demand for CPUs is returning in a form quite different from that of the past three years.


Examined closely, however, this "return" is less the CPU reclaiming the spotlight than the CPU being redefined in a new position.


1. Cracks in the GPU-Centric Theory


To understand why the CPU is "coming back," we must first look at the changes happening within AI workloads themselves.


Over the past two years, the mainstream narrative of AI compute has been about training, with the scale of large model training increasing four to ten times each year. Training requires massive parallel computing, where GPUs play a crucial role. However, training is not the only workload of AI.


According to Intel's assessment at the briefing, the overall AI compute workload can be roughly divided into three categories:


The first category is foundational workloads such as storage, databases, web, microservices, CDN – these are not AI-specific but are essential infrastructure for AI. This part still primarily relies on traditional CPUs.


The second category is training: the training of cutting-edge large models, which relies almost entirely on GPUs and dedicated accelerators. This has been everyone's battleground for the past three years.


The third category is inference and intelligent agents, which is rapidly growing and has significant differences compared to training.


The key difference in the third category lies in the nature of the workload itself. Training is the process of creating a model from scratch, with extremely high parallelism and a high demand for peak compute. Inference and intelligent agents have a different profile: they take an already trained model and deploy it in real-world applications.


This means that a significant portion of the work is not about "calculating" but about orchestration: scheduling the collaboration of multiple models, managing contexts, coordinating data flow among different agents, handling user requests concurrently, and ensuring predictable latency.


These tasks are not the GPU's strong suit.
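The orchestration work described above can be illustrated with a minimal Python sketch: a CPU-side orchestrator that fans out many lightweight agent requests while capping how many are in flight at once. The function `call_model` is a hypothetical stand-in for a GPU-backed inference call, and `max_in_flight` is an assumed tuning knob. Neither comes from Intel or from any real serving stack.

```python
import asyncio


async def call_model(agent_id: int, prompt: str) -> str:
    """Hypothetical stand-in for a GPU-backed inference call.

    The short sleep simulates waiting on an accelerator; the CPU is free
    to schedule other agents during that wait.
    """
    await asyncio.sleep(0.001)
    return f"agent-{agent_id}: {prompt.upper()}"


async def orchestrate(prompts: list[str], max_in_flight: int = 8) -> list[str]:
    """Dispatch one agent per prompt, limiting concurrent requests."""
    limiter = asyncio.Semaphore(max_in_flight)  # assumed concurrency cap

    async def run(agent_id: int, prompt: str) -> str:
        async with limiter:
            return await call_model(agent_id, prompt)

    # gather() returns results in the same order the tasks were submitted
    return await asyncio.gather(*(run(i, p) for i, p in enumerate(prompts)))


results = asyncio.run(orchestrate([f"task {i}" for i in range(100)]))
print(len(results), results[3])
```

The heavy computation would happen inside `call_model`. The scheduling, context management, and concurrency control around it are the general-purpose work the article assigns to the CPU.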


"In this scenario, we can see a combination of GPU-level acceleration, but the main workload still revolves around a CPU-centric infrastructure," said Kevork Kechichian during the communication event.


Behind this lies a more specific industry fact. SemiAnalysis' "CPU Returns" report gives an example: in the "Fairwater" data center that Microsoft built for OpenAI, a 48-megawatt CPU and storage building supports a 295-megawatt GPU cluster.


In other words, to truly operate that 295-megawatt GPU cluster, thousands of CPUs are needed to process the petabyte-level data flow generated by the GPUs, schedule tasks, and manage storage.
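As a quick back-of-the-envelope check on the proportions, the two power figures quoted above can be compared directly. The sketch below uses only those two numbers. Treating them as the site's whole power budget is an assumption made for illustration, not a claim from the report.

```python
# Figures quoted in the article (megawatts)
cpu_storage_mw = 48
gpu_cluster_mw = 295

# CPU and storage power relative to the GPU cluster it supports
ratio_to_gpu = cpu_storage_mw / gpu_cluster_mw

# Share of the combined total, assuming these two figures are the whole budget
share_of_total = cpu_storage_mw / (cpu_storage_mw + gpu_cluster_mw)

print(f"CPU+storage vs GPU: {ratio_to_gpu:.1%}")
print(f"CPU+storage share of combined power: {share_of_total:.1%}")
```

On these figures, roughly one megawatt of CPU and storage capacity sits behind every six megawatts of GPU capacity.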


As the GPU's computing power is pushed higher, the "peripheral computing power demand" it generates becomes larger. And this peripheral computing power demand ultimately falls on the CPU.


This means that the CPU's resurgence is not about "the CPU becoming faster than the GPU again." Instead, as the form of AI computing power expands from "training a large model" to "running thousands of intelligent agents," orchestration and data flow become a bottleneck once again. This is something the GPU cannot solve but the CPU can.


This is the overlooked side of the AI narrative over the past three years.


2. What Path Does Xeon 6+ Bet On


Intel's bet is reflected in the product definition of Xeon 6+.


The most striking number is the core count: up to 288 cores, all of them efficient cores (E-cores).


The E-core and P-core represent a fork Intel has introduced in its CPU architecture over the past few years. The P-core is a performance core that pursues maximum single-core performance, the traditional design goal of a server CPU. The E-core is an efficient core with somewhat weaker single-core performance but a smaller footprint and lower power consumption, allowing more cores to be packed into the same chip area.


Xeon 6+ takes this fork to the extreme. 288 efficient cores mean that Intel's bet on a CPU is not about "how fast each core can be," but about "how many cores can be packed into a single CPU."


The logic behind this product definition is: the workload of AI agents is not about how fast a single core can run, but about whether thousands of lightweight tasks can run simultaneously. When a server needs to simultaneously orchestrate hundreds of agents, process thousands of inference requests, and maintain tens of thousands of concurrent connections, the throughput of 288 E-cores is far more important than the single-core performance of 64 P-cores.
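The throughput argument above can be made concrete with a simple model. The sketch below assumes perfectly independent lightweight tasks that spread evenly across cores, and it ignores memory bandwidth, caches, and scheduling overhead. The relative E-core speed of 0.5 is a purely hypothetical illustration, not a measured or published figure. Only the core counts (288 and 64) come from the text.

```python
def aggregate_throughput(cores: int, per_core_rate: float) -> float:
    """Tasks per unit time when independent tasks spread evenly over all cores."""
    return cores * per_core_rate


P_CORES = 64   # P-core count used in the article's comparison
E_CORES = 288  # Xeon 6+ E-core count quoted in the article

# Relative E-core speed (P-core = 1.0) at which both chips deliver equal throughput
break_even = P_CORES / E_CORES


def e_core_advantage(relative_e_core_speed: float) -> float:
    """Throughput of the E-core chip divided by that of the P-core chip."""
    return (aggregate_throughput(E_CORES, relative_e_core_speed)
            / aggregate_throughput(P_CORES, 1.0))


print(f"break-even relative speed: {break_even:.3f}")
print(f"advantage at hypothetical 0.5x per-core speed: {e_core_advantage(0.5):.2f}x")
```

Under this simplified model, the many-core design wins on aggregate throughput as long as each E-core delivers more than about 22% of a P-core's per-task speed. A single latency-sensitive task would still finish sooner on the faster core.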


This is a non-mainstream product definition. For the past few decades, the mainstream narrative of server CPUs has been focused on single-core performance, with higher clock speeds, stronger IPC, and larger caches. The E-core path fundamentally acknowledges: that narrative may be coming to an end.


But there are several things that must be considered together.


First, the high-density core approach is not exclusive to Intel. AMD introduced Bergamo in 2023, based on the density-optimized Zen 4c core. AWS' Graviton series and Ampere's AmpereOne series have also long followed the "high-density core + energy efficiency first" approach. In the AmpereOne Aurora roadmap that Ampere announced in 2024, the core count reaches 512.


This means that with Xeon 6+, Intel is catching up to an existing industry trend. Here Intel is a follower rejoining that trend, not its leader.


Second, Xeon 6+ is the first data center CPU on Intel's 18A process, a fact that may matter more in Intel's own context than the 288 E-cores.


Intel 18A is Intel's biggest bet in recent years. It concerns not just a single CPU but whether Intel Foundry, the company's contract manufacturing business, can stand on its own. If the 18A process cannot deliver a competitive product to the market, the Intel Foundry story will stall.


Xeon 6+, built on the 18A process, pushing the efficient core count to 288, and claiming "industry-leading performance density," is one of the answers Intel has handed to the market. Whether the market accepts it, and whether it can compete with same-generation TSMC N2 and Samsung 2nm products, is another question.


Third, several names with industry weight appear on the Xeon 6+ customer list: Ericsson is testing its 5G core network on Xeon 6+, and T-Systems, a Deutsche Telekom subsidiary, is using Xeon 6+ for private AI infrastructure. Both are long-standing, stable buyers of data center CPUs, so their procurement choices are themselves a market signal.


Taking these three things together, Xeon 6+ is betting on this path: gaining energy efficiency from the 18A process, achieving core density with 288 E-cores, and targeting the "high density, high energy efficiency, high throughput" workloads of AI inference and private AI.


This is not a story of CPUs returning to the forefront of computational power but a story of CPUs finding a new position.


3. The Bottom Line: Will This Happen?


Will Intel's story of "CPU Comeback" actually happen? It depends on several other variables in the industry.


The first variable is the response from GPU manufacturers.


In the past two years, NVIDIA has also been working on "orchestration"-related matters. The combination of the Grace CPU and the Hopper GPU is NVIDIA's way of filling the CPU gap. If GPU manufacturers make the integrated "CPU + GPU" solution mainstream, CPU manufacturers will be marginalized as independent players. The biggest challenge to Intel's "CPU as the control plane" narrative therefore comes not from AMD but from NVIDIA.


The second variable is the in-house CPU development by cloud providers.


AWS Graviton has already been deployed at scale in AWS's own data centers, handling a significant portion of AWS's internal general computing workloads. Microsoft is working on Cobalt, Google on Axion, and Alibaba on Yitian. Almost all major cloud providers are developing their own ARM architecture server CPUs.


These in-house CPUs are also following the "high-density, energy efficiency first" path, which puts them in direct competition with Xeon 6+ at the level of product definition.


In other words, cloud providers are building their own solutions for the very market segment Xeon 6+ is targeting. Intel needs to prove that a significant market remains beyond cloud providers' in-house CPUs, such as telecom operators, private clouds, and vertical industry data centers.


The third variable is the 18A process technology itself.


The Xeon 6+ is Intel's first data center CPU on the 18A process, which in itself implies that this chip bears far more industry significance than just the product. If the 18A process faces issues in mass production yield, performance stability, and customer validation, the market performance of Xeon 6+ will be affected. Conversely, if the 18A process performs well, Xeon 6+ might actually provide some breathing room for Intel Foundry.


However, the 18A process is not operating in a vacuum: TSMC's N2 process will begin mass production in the second half of 2026, and Samsung's 2nm process is also in progress. Intel's goal with 18A is not just to "make it" but to "lead after making it," which is a higher bar.


Taking these three variables together, the ultimate performance of Xeon 6+ depends not only on the chip itself, but also on whether NVIDIA comes to dominate the CPU role, whether cloud providers continue to develop their own CPUs, and whether Intel 18A is competitive against same-generation processes from TSMC and Samsung.


This is why the "CPU Renaissance" holds true from an industry perspective, while whether Intel can reap its benefits remains unknown.


The battle for CPU's position on the AI compute stage has been ongoing for three years.


The script of the past three years has been "GPU at the core, CPU in a supporting role." This script began to loosen in 2026—not because CPUs suddenly outpaced GPUs, but because AI compute itself is changing. As AI expands from "training a model" to "operating thousands of intelligent agents," orchestration, concurrency, and data movement become bottlenecks again, making CPUs irreplaceable in this position.


Intel is betting on this, and Xeon 6+ is its response. However, whether the bet holds and whether Intel can reap the benefits will ultimately be answered in customers' server rooms in 2027 and 2028. AMD, the ARM camp, cloud providers developing their own CPUs, and NVIDIA venturing into CPUs: each variable could alter the script.


The CPU renaissance is real, but who will lead this renaissance is still undetermined.


