Father of AlphaGo Starts New Venture, Raises $1.1 Billion in Seed Funding

Bitsfull · 2026/04/28 11:19

Summary:

He doesn't want human data; instead, he wants the model to generate its own experience from the environment.

David Silver was last seen in the spotlight in 2016 in a conference hall in Seoul. More precisely, he was "behind" the table across from Lee Sedol. Sitting at the table was AlphaGo.



Ten years later, he resigned from Google DeepMind and started anew in London. Less than 24 hours after the funding announcement, the European venture capital community collectively held its breath: $1.1 billion in a seed round, $5.1 billion valuation, with Sequoia and Lightspeed co-leading, alongside Nvidia, DST Global, Index, Google, and the UK's Sovereign AI Fund, a string of prominent names.


This is the largest seed round in European venture capital history.


"Largest Seed Round" Is Just the Beginning, Not the Focus


Let's start with the numbers.


The company is called Ineffable Intelligence. It was registered in November 2025, and Silver went full-time in January of this year after leaving DeepMind. From founding the company to closing this round took less than half a year.


The seed round gives a $5.1 billion valuation, nearly matching Mistral's Series B valuation from a year ago, surpassing the early valuations of any European AI startup during the same period. The investor list also remarkably combines European and American sovereign capital, top Silicon Valley VCs, and compute providers all at once. The UK government's Sovereign AI Fund participated in an early-stage round of this scale for the first time, which is a signal in itself.


Receiving this amount in a seed round is unconventional. Early-stage investors typically wait for at least one of the trifecta—product, revenue, or customers—to materialize before making a significant investment. In this round, Silver essentially bypassed all these steps and directly acquired a valuation comparable to that of a medium-sized public company.


The logic behind the funding is not surprising: investors are betting not on a product but on a paradigm shift. This assessment is elaborated below.


First, consider Silver's own resume. Over ten years at DeepMind, he led or co-led projects including playing Atari games from pixels, AlphaGo, AlphaZero (which mastered Go, chess, and shogi without human game records), and AlphaProof (which won a silver medal at the International Mathematical Olympiad). He is also a professor at UCL. In other words, he is arguably the single person who, over the past fifteen years, turned reinforcement learning from an academic niche into an industrial headline.



The value of this resume lies not in the number of papers but in a near-monopoly on authority. Worldwide, no more than five people simultaneously hold the trio of "academic reputation + engineering track record + textbook status."


He Doesn't Want Human Data


The truly counterintuitive part of the story lies here.


The GPT, Claude, and Gemini generation of models essentially stuffs everything humans have written into a network, compresses it into a ball of semantic probability, and then uses various post-training methods to "retrieve" it. They can write emails, write code, and perform a stand-up routine because humans have already written all of these things.


The goal Silver states on the Ineffable website is to build a superlearner: a system that relies on no human-generated data but learns from scratch through its own "experience," from the most basic motor skills all the way up to "profound intellectual breakthroughs."



This is not a marketing slogan; there is an essay behind it that is on its way to publication.


Last year, Silver and Richard Sutton, author of the standard reinforcement learning textbook and a Turing Award winner, co-authored an essay titled "The Era of Experience," an excerpt from the upcoming MIT Press book "Designing an Intelligence." The essay contains a frequently cited claim.


Most of what humans have ever written has already been read by models. From here on, the marginal returns of pretraining will grow thinner, and the scaling law will flatten into a line.


The way out for the next generation of AI is not larger corpora or more human feedback but letting models generate their own experience from the environment: experimenting, failing, interacting, and exploring things that no one has ever written down.
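The loop described above can be sketched in miniature. The toy below is my own hypothetical illustration, not anything from Ineffable or the essay: an agent learns which of three actions pays off purely from trial-and-error reward, with no human-labeled data anywhere in the loop.

```python
import random

def environment(action):
    """Hidden reward structure: action 2 pays off most often.
    The agent never sees this code; it only observes rewards."""
    payoff = {0: 0.1, 1: 0.3, 2: 0.8}
    return 1.0 if random.random() < payoff[action] else 0.0

def train(steps=5000, epsilon=0.1, seed=0):
    """Learn action values purely from self-generated experience."""
    random.seed(seed)
    value = [0.0, 0.0, 0.0]   # running estimate of each action's value
    count = [0, 0, 0]
    for _ in range(steps):
        # explore occasionally; otherwise exploit the current best estimate
        if random.random() < epsilon:
            a = random.randrange(3)
        else:
            a = max(range(3), key=lambda i: value[i])
        r = environment(a)                     # experience, not human data
        count[a] += 1
        value[a] += (r - value[a]) / count[a]  # incremental mean update
    return value

if __name__ == "__main__":
    v = train()
    # the agent discovers that action 2 is best, from rewards alone
    print(max(range(3), key=lambda i: v[i]))
```

The point of the caricature is that nothing in the loop references a human-written corpus: the only training signal is the reward the environment emits.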


The statement from the Sequoia Capital partner in the announcement is even bolder: "If successful, this will be a Darwinian-level scientific breakthrough. His law explained all life; our law will explain and construct all intelligence."


This kind of talk invites eye-rolling. But before dismissing it, note that it candidly states the investors' actual thesis: they are not betting on another chatbot or a vertical-industry copilot; they are betting on a new possibility.


In Fact, This Is Not a New Story


Those familiar with Silver's background will recognize that the superlearner concept is not new.


In 2017, AlphaZero did exactly that. Across Go, chess, and shogi, with no human game data, it relied solely on self-play and, within hours of training, defeated the previous top engines. In 2024, AlphaProof reached a silver medal at the International Mathematical Olympiad by the same route, training itself on self-generated formal proofs.



It all sounds very seductive. Yet over the past decade, almost every reinforcement learning (RL) lab has hit a wall on this approach.


The reason is that self-play requires a clean environment. Go has a 19x19 board and a binary win-or-lose outcome; chess has an 8x8 board and a clear win condition. In such closed environments, the model knows unambiguously what "winning" means and can therefore optimize unambiguously.
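A tiny sketch of what a "clean environment" buys you (my own toy example, not DeepMind code): in a closed game like tic-tac-toe, a few lines of rules yield an exact, free reward signal, so self-play can run indefinitely with no human judgment in the loop.

```python
import random

# The eight winning lines of a 3x3 board, indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None.
    This IS the reward signal: exact, cheap, and unambiguous."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def self_play_game(rng):
    """Two copies of the same (here: random) policy play until
    the rules themselves decide the outcome."""
    board = ['.'] * 9
    player = 'X'
    while winner(board) is None and '.' in board:
        move = rng.choice([i for i, s in enumerate(board) if s == '.'])
        board[move] = player
        player = 'O' if player == 'X' else 'X'
    return winner(board)   # 'X', 'O', or None for a draw

if __name__ == "__main__":
    rng = random.Random(42)
    results = [self_play_game(rng) for _ in range(1000)]
    # under random play, the first mover X wins more often than O
    print(results.count('X') > results.count('O'))
```

Contrast this with "write a contract that gets the client to pay": there is no `winner()` function you can write in ten lines, which is exactly the wall the article describes.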


But change the task to "write a contract that gets the client to pay," "prove an open mathematical conjecture," or "drive a taxi back to the hotel in an unfamiliar city," and you face the questions of how to define the reward signal and how to set up the environment, questions that have not been truly answered in the past decade.


This time, Silver's bet is to openly admit that this problem has not been solved yet, and then start afresh with $1.1 billion, a new team, and a brand-new organization.


Why Now?


Why, in 2026, is the market willing to bet $1.1 billion on AI that "does not read human data"? The answer lies in several interconnected signals from the past 12 months.


OpenAI's o3 and o4 series have leaned increasingly on reinforcement learning post-training; their "thinking" and "reasoning" capabilities no longer come from larger pretraining runs but from an RL phase of environment interaction. Right after that, DeepSeek's R1 turned the reinforcement-learning fine-tuning path into an open-source template: any team with some engineering capability can now replicate a small model that "thinks." RL is no longer DeepMind's in-house mystery; it has become industry common knowledge.


A deeper layer is the debate over the pretraining scaling law plateauing, with new papers appearing almost monthly since the second half of 2025. With the high-quality tokens of the human language corpus mostly depleted, the marginal returns of further model scaling have begun to diminish sharply. Capital has been quietly shifting gears: a growing share of Silicon Valley's AI mega-deals over the past six months has poured into RL, world models, and agents, the "post-pretraining" realm, rather than into yet another large language model (LLM) factory.


The market has long been stockpiling ammunition for the "post-pretraining era"; it was just waiting for someone to carry the flag. Silver is almost the textbook answer for that role: in the RL space he has not only the public recognition of AlphaGo but also the engineering credentials of AlphaZero and AlphaProof, plus the credibility of co-authoring with Sutton.


A $1.1 billion bet is, in essence, the market voting with its money: reinforcement learning is not just one technological path; it is the next paradigm.


What We Will See in 12 Months


$1.1 billion can build a chip fab, buy a football club, or produce several movies. Can it be used to create a general intelligence without the need for human data?


No one knows. Silver hasn't said.


But several observation points for the next 12 months are already set. The most direct is whether Ineffable first tackles a "self-learning" benchmark harder than AlphaProof's. Mathematical Olympiads provide a clean, closed environment; if the next step is informally defined, research-level mathematics, the difficulty jumps to another order. Whether it clears this checkpoint largely determines the direction of the whole story.


Next, watch Sequoia's moves. When a top VC bets this heavily at the seed stage, the pace of the Series A shapes outside judgment of the project. If a $3 billion Series A appears within 12 months, early results have exceeded expectations; if not, the market will recalibrate the valuation.


Also, keep an eye on DeepMind. After Silver's departure, how the original RL team writes its next paper, who gets credited, and whether anyone follows him out: these are the signals for judging whether the startup can grow from a "single star player" into an "institutional-level research force."


Lastly, there is China. With DeepSeek already on the R1 path and ByteDance with its Seed team, will they publicly unveil their own "human-data-free" explorations in the second half of 2026? If this path works, it will not remain exclusive to one company in London.


Regardless of whether the superlearner path ultimately succeeds, $1.1 billion has at least put one question on the table. While everyone else competes to make AI mimic human speech ever more closely, someone has started asking: why must AI first become like us in order to become better than us?

