Jensen Huang's Latest Podcast Transcript: NVIDIA's Future, Embodied Intelligence and Agent Development, Soaring Demand for Inference, and AI's PR Crisis

Bitsfull | 2026/03/20 17:24



Editor's Note: In the current environment of escalating AI narratives, the focal point of market discussion is shifting from "how strong the model is" to "how the system can be deployed." Over the past two years, the industry has successively experienced breakthroughs in large-model capabilities, a training-compute race, and the expansion of generative applications. However, as these phases gradually become consensus, new questions have emerged: when AI is no longer just answering questions but beginning to perform tasks, embed itself in enterprise processes, and enter the physical world, what underlying conditions support its continued advance?


This article is an excerpt from the well-known tech podcast All-In Podcast. As one of Silicon Valley's most influential investor podcasts, the show is hosted by four long-time frontline investors and is known for its in-depth discussions on technology, business, and macro trends.


The show's four hosts are:


· Jason Calacanis, early internet entrepreneur and angel investor, well known for investing in companies like Uber, Robinhood, etc.;

· Chamath Palihapitiya, Founder of Social Capital, former Facebook executive, who has invested in companies like Slack, Box, and many other tech companies;

· David Sacks, Partner at Craft Ventures, one of the "PayPal Mafia" members, founded Yammer and sold it to Microsoft for around $1.2 billion, also an early investor in Airbnb, Uber;

· David Friedberg, Founder of The Production Board, focused on investing in agriculture, climate, and life sciences, founded The Climate Corporation (later acquired by Monsanto).


This episode's guest is Jensen Huang, Co-Founder and CEO of NVIDIA, considered one of the most critical drivers in the current AI infrastructure wave.




The entire interview can be roughly summarized into three levels.


First, AI infrastructure is undergoing a transformation. In the past, the market's understanding of AI was largely based on more powerful GPUs and more data centers. However, Jensen Huang emphasized that future competition is no longer about a single chip but about entire systems. As inference demand rises, the variety of models increases, and agents begin to handle more complex tasks, AI computation is shifting from a relatively uniform mode to a more complex, specialized form of system collaboration. NVIDIA is therefore attempting to evolve from a chip company into a builder of "AI factories."


Second, AI is transitioning from "generating content" to "completing tasks." This is the most critical insight of the interview. While ChatGPT let the public intuitively experience AI's capabilities for the first time, Huang believes the even bigger change is AI entering workflows in the form of agents: not just answering questions, but invoking tools, breaking down tasks, collaborating on execution, and ultimately getting things done. Consequently, what users are willing to pay for will gradually shift from "getting an answer" to "getting a result." This implies greater demand for inference, higher system complexity, and a potential rewriting of software development, organizational management, and knowledge-work processes.


Lastly, AI is extending from the digital world to the real world. Throughout the interview, whether discussing autonomous driving, robotics, healthcare, digital biology, or what Huang refers to as Physical AI, they all fundamentally point to the same trend: the value of AI is not only reflected within screens but will increasingly manifest in factories, hospitals, cars, end devices, and everyday life. However, this also means that the challenges AI will face next will not only be technical but will also include more complex real-world constraints such as supply chains, policies, regulations, manufacturing capabilities, and geopolitics. In other words, the next phase of AI expansion will be a true industrialization process.


From this perspective, the most noteworthy aspect of this conversation is not a specific product or a certain revenue figure but a judgment repeatedly conveyed by Jensen Huang: AI is transitioning from the "model era" to the "system era." The future competition will not only be about whose model is larger or whose computing power is stronger but about who understands the industry better, who can more deeply embed AI into real processes, and who can organize these capabilities into a runnable and scalable system.


This also extends the scope of this article beyond NVIDIA itself. The real question it is trying to answer is: as AI gradually becomes infrastructure, how will the next round of industrial restructuring unfold, and where will new value be created.


The following is the original content (slightly reorganized for better readability):


TL;DR


· AI infrastructure is transitioning from a "single GPU" to a disaggregated architecture. Different computing tasks will be handled collaboratively by GPUs, CPUs, network chips, and inference chips like Groq.


· NVIDIA is evolving from a GPU company into an "AI factory company" providing complete systems. It sells the entire infrastructure, not just individual chips.


· The key measure of AI cost is not data center construction cost, but token cost and throughput efficiency. A more expensive system might actually be cheaper.


· AI is moving from generative models to the agent era. What users are truly willing to pay for is "getting things done," not just getting answers.


· Computational demand is growing explosively. From generation to reasoning to agents, the scale may have increased by over 10,000x in a short period and is still accelerating.


· The future of software development will change. Engineers will no longer just write code; they will define problems, design architectures, and collaborate with agents.


· In the long run, the biggest opportunity lies in deep specialization in vertical domains, rather than the general model itself. Those who understand the industry better will have the stronger moat.


Original Interview


Jason Calacanis (Prominent Angel Investor | All-In Podcast Host | Early Investor in Uber):
This week is a special episode. We are putting aside our regular programming for the week, something we only do for three types of people: President Trump, Jesus, and Jensen Huang (NVIDIA's founder and CEO). You can decide the ranking yourself. You've been on fire lately, and this GTC was a great success.


Jensen Huang (NVIDIA CEO):
The entire industry has arrived. All tech companies, almost all AI companies are here.


Jason Calacanis:
It's so incredible, truly remarkable. One of the most significant moves of the past year was Groq. When you acquired Groq, did you realize how much it would make Chamath even more "insufferable"?



Jensen Huang:
I had a vague premonition.


Jason Calacanis:
We have to deal with him every week.


Jensen Huang:
I know. You also have to endure the full six-week lock-up with him.


Jason Calacanis:
That's right.


From GPU Company to "AI Factory" Company


Jensen Huang:

In fact, many of our strategies are announced years in advance at GTC. Two and a half years ago, I introduced the AI factory's operating system called Dynamo.


As you know, Dynamo was originally a device, invented by Siemens, that could convert the energy of water into electricity, driving the factory system in the last industrial revolution. So I think this name is very suitable as the name of the "factory operating system" in the next industrial revolution. And in Dynamo, one of the most core technologies is disaggregated inference.


Jason Calacanis:

Jensen, I know you have a special understanding of technology. Come on, you define it. I don't want to steal your thunder.


Jensen Huang:

Thank you. Disaggregated inference means this: the entire inference pipeline is extremely complex, possibly the most complex type of computing problem today.


Its scale is astonishing, containing a large number of mathematical computations of different forms and sizes. Our idea is to break down the entire processing pipeline, with one part running on one type of GPU and another part running on another type of GPU. Furthermore, this made us realize that perhaps disaggregated computing itself is a sound direction: we can have computing resources of entirely different types and natures working together.


The same idea later led us to Mellanox. Look at today: NVIDIA's computing is already distributed across GPUs, CPUs, switches, scale-up switches, scale-out switches, and network processors. Now, we also need to add Groq to the mix.


Our goal is to place the appropriate workload on the appropriate chip. In other words, we have evolved from being a GPU company to being an AI factory company.
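The idea of "placing the appropriate workload on the appropriate chip" can be made concrete with a minimal sketch. This is purely illustrative, not NVIDIA code: the pool names, stage names, and the `route` function are all hypothetical, chosen to show how a disaggregated system might map pipeline stages (e.g. compute-bound prefill vs. bandwidth-bound decode) to different hardware pools.

```python
# Illustrative sketch of disaggregated workload placement (all names hypothetical).
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    kind: str  # e.g. "prefill" is compute-bound; "decode" is bandwidth-bound

# Hypothetical mapping from workload profile to the hardware pool suited to it.
POOLS = {
    "prefill": "compute-optimized GPU pool",
    "decode": "bandwidth-optimized pool",
    "embedding": "CPU pool",
}

def route(stage: Stage) -> str:
    """Place a pipeline stage on the pool matching its compute profile."""
    return POOLS.get(stage.kind, "general GPU pool")

pipeline = [
    Stage("prompt ingest", "prefill"),
    Stage("token generation", "decode"),
    Stage("retrieval lookup", "embedding"),
]
placements = {s.name: route(s) for s in pipeline}
```

The point of the sketch is only the shape of the decision: once the pipeline is broken into stages with distinct profiles, placement becomes a routing problem rather than a single-chip problem.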


David Sacks (Craft Ventures Partner | Former PayPal COO | All-In Host):

For me, this is probably the most important takeaway. What you are seeing now is a fundamental disaggregation. In the past, there was only one choice, the GPU; now there are increasingly different forms of computing, and these choices will coexist in the future.


You mentioned one point on stage that I think everyone involved in high-value reasoning should take seriously: you said that approximately 25% of the space in the data center should be allocated to Groq's LPU.



Jensen Huang:

That's right, in the data center, Groq could occupy about 25% of the Vera Rubin system.



David Sacks:

So can you talk about how the industry currently views this direction? Essentially, you are building the next-generation disaggregated architecture: prefill and decode separated, the inference flow split apart. How do you think people will react?


Jensen Huang:

Let's take a step back first. When we added this capability to the system, it was because the entire industry had already shifted from large language model processing to agentic processing, which is essentially agent-based processing.


When you run an agent, it accesses working memory, long-term memory, calls tools, putting a lot of pressure on storage. You will also see agents collaborating with other agents. Some agents use large models, some use small models; some use diffusion models, some use autoregressive models. In other words, within this data center, you will find a wide variety of completely different types of models coexisting. We built Vera Rubin to address this extreme diversity of workloads.


So, in the past, we were a "one-rack" company, and now we have added four more racks. In other words, NVIDIA's TAM, the addressable market, has suddenly expanded by roughly 33% to 50%.


And within this additional 33% to 50%, a large part will be storage processors, namely BlueField; a part (I personally hope a significant part) will be Groq processors; a part will be CPUs; and of course there will also be many network processors. All of these together ultimately run the "new type of computer" of the AI revolution, which is agents. It is the operating system of modern industry.


Chamath Palihapitiya (Social Capital Founder | Former Facebook Executive | All-In Podcast Host):

What about embedded applications? For example, in my daughter's teddy bear at home, if it wants to talk to her, what would be inside it? A custom ASIC? Or, in the future, will we see a broader TAM in edge and embedded scenarios, with different tools for different scenarios?



Jensen Huang:

We believe that, on this question, there are actually three computers involved.


The first computer, at the largest scale, is the computer used to train AI models, develop AI, and create AI.


The second computer is the computer used to evaluate AI. For example, look around you, everywhere there are robots, self-driving cars, and similar things. You must first put them into a virtual environment that can represent the physical world for evaluation. In other words, this software itself must adhere to the laws of physics. We call this system Omniverse.


The third computer is the computer deployed at the edge, which is the robot's computer. It can be a self-driving car, a robot, or even a small teddy bear.


For devices like teddy bears, one very important direction, which is what we are doing, is: turning telecom base stations into part of AI infrastructure. In this way, the entire $2 trillion telecom industry will gradually become an extension of AI infrastructure in the future. So, wireless devices will become edge devices, factories will become edge devices, and warehouses as well.


In short, all three types of foundational computers are essential.


David Friedberg (Founder of The Production Board | Host of All-In Podcast):

Jensen, I think last year you saw things before the rest of the world did. At the time, you said the growth in inference demand would not be just 1,000 times.


Jensen Huang:

Did I dig a hole for myself?


David Friedberg:

It would be 1 million times? 1 billion times? Right?


Back then, I think many people thought that was too exaggerated, because everyone was still focused on scaling up training. But now, if you look, inference has truly taken off, and the industry has become "inference constrained." You have now released an "inference factory" with throughput 10 times higher than the next-generation factory.


But if you look at the external discussions, many people will say: your inference factory will cost $40 to $50 billion, while those alternative solutions, such as custom ASIC, AMD, and so on, will only cost $25 to $30 billion, so you will lose market share.


So why don't you just tell us: What did you really see? How do you view market share? Are these customers willing to pay close to double the premium? Is it worth it?


Why Does a More Expensive System Produce Cheaper Tokens?


Jensen Huang:

The most important and core point is: do not equate the factory's price with the token's price, nor equate it with the token's cost.


It is very likely, and I can prove it, that the $50 billion factory actually produces the lowest-cost token. The reason is that our efficiency in generating these tokens is astonishingly high, more than 10 times that of the alternatives.


You see, the price difference between $50 billion and $20 billion is largely just land, electricity, and the building shell. On top of that, you still need to purchase storage, networking, CPUs, servers, and cooling systems. So whether the GPU itself is full price or half price won't bring the total cost from $50 billion straight down to $30 billion. Pick any number you like; more realistically, it might only drop from $50 billion to $40 billion.


And if a $50 billion data center has 10 times higher throughput, then this price difference is actually not that significant.
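The argument above is just arithmetic, and it can be checked directly. This is a back-of-envelope sketch using the hypothetical figures from the conversation ($50B system with 10x throughput vs. a $30B system at baseline throughput); the function name and normalization are ours, purely for illustration.

```python
# Back-of-envelope check: what matters is cost per token,
# i.e. total system cost divided by relative token throughput.

def cost_per_token(system_cost_usd: float, relative_throughput: float) -> float:
    """Effective cost per unit of token output (throughput normalized to 1.0)."""
    return system_cost_usd / relative_throughput

pricier_system = cost_per_token(50e9, 10.0)  # $50B factory, 10x throughput
cheaper_system = cost_per_token(30e9, 1.0)   # $30B factory, baseline throughput

# The pricier factory yields a 6x lower effective cost per token.
advantage = cheaper_system / pricier_system
```

Under these assumptions the "expensive" factory's tokens cost one sixth as much, which is the sense in which a more expensive system produces cheaper tokens.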


Jason Calacanis: Got it.


Jensen Huang:

That's why I have always said: for many chips, if you can't keep up with the technological frontier, can't keep up with the pace we are advancing at, then even if the chips are given away for free, they are still not cheap enough.


David Sacks:

I want to ask a more macro strategic question. You are now operating the world's most valuable company. Next year's revenue may exceed $350 billion, with $200 billion in free cash flow, and still growing at a crazy compounded rate.


How do you make decisions? How do you get information? Everyone now knows about your famous email system, but how do you really form intuitions, shape the market, decide where to focus heavily, where to shrink, where to enter new areas? How does this information flow to you? How do you make the final judgment?


Jensen Huang:

That's the CEO's job.


David Sacks:

Right.


Jensen Huang:

Our responsibility is to define the vision, to define the strategy. Of course, we are inspired and informed by the outstanding computer scientists, technical experts, and countless excellent employees in the company, but ultimately, shaping the future is our responsibility.


One criterion is: Is this thing ridiculously hard? If it's not hard enough, then we should stay away from it. The reason is simple: if something is easy to do, competitors will definitely pile on.


Is it something that has never been done before and is ridiculously hard? Can it conveniently tap into our company's unique "superpowers"? So I have to find this intersection point: it must meet both of these criteria at the same time.


And finally, you also have to know that doing this kind of thing will definitely come with a lot of pain and torment. No great invention ever came about because it was simple and succeeded effortlessly on the first try.


If something is super hard, never been done before, then it basically means you will go through a lot of pain and suffering. So you better enjoy the process.


David Sacks:

Can you pick three or four more "long-tail" businesses to talk about? For example, the ones you mentioned: data centers in space, ADAS and cars, and the biotech direction. Give us a sense: when will these curves start to turn upward? How do you see these long-term businesses?



Jensen Huang:

Sure. Physical AI is a large category. I just said, we have three computing systems, and all the software platforms built on top of them. Physical AI is the tech industry's first real opportunity to serve a $50 trillion industry that has hardly been deeply transformed by technology in the past. To do this, we must reinvent all the necessary technology.


I have always seen this as a 10-year journey. We started 10 years ago, and now we are finally seeing it turn upward. For us, this is already a multibillion-dollar business, with the current scale approaching $100 billion annually. So it's already a significant business, and it's still growing exponentially. That's the first point.


The second direction, I think in digital biology, we are really close to its ChatGPT moment.


We are gradually learning how to represent and understand genes, proteins, and cells. For chemicals, we already know how to deal with them. So, being able to represent and understand the basic components of biology and their dynamic behavior, I think, will probably happen in two, three, maybe five years. Within five years, I very much believe digital biology will have a huge impact on the entire healthcare industry.


These are all very important directions. Agriculture is also one of them.


Chamath Palihapitiya:

It has already begun.


Jensen Huang:

No doubt.


Jason Calacanis:

I want to shift the focus from the data center back to the desktop. The early days of the company were largely built on enthusiasts, gamers, and GPU users. On stage today, facing roughly ten thousand people, you mentioned Claude Code, OpenClaw, and the revolution brought by agents.


Especially the enthusiast community, we see a lot of energy and innovation bursting from them now, and many breakthroughs are happening on the desktop. You also released a desktop device this time, I remember it's the Dell 60800? It's a very powerful workstation that can run local models, with 750GB of memory. Now the Mac Studio is sold out everywhere. Our company is now fully transitioning to OpenClaw. Friedberg is using it, Chamath is using it, everyone is very obsessed.


What does this open-source agent movement starting from enthusiasts, the desktop open-source ecosystem, mean to you? Where is it heading?


The Age of Agent: Why Computing Demand Will Expand Another Ten Thousand Times


Jensen Huang:

First, take a step back. In the past two years, we have actually seen three inflection points.


The first one is Generative AI. ChatGPT brought AI into the mainstream, making everyone realize its importance. In fact, this technology was clearly there a few months before ChatGPT appeared. It's just that until ChatGPT gave it a user-friendly interface, Generative AI truly took off.


And Generative AI, as you know, will generate tokens, both for internal consumption and external consumption. Internal consumption, essentially, is "thinking," which further drives the development of reasoning.


Next, more grounded, real-world capabilities began to emerge, making AI not just answer questions but provide more reliable, more useful answers. You also started to see an inflection-point-like surge in OpenAI's revenue and business model.


Then, the third inflection point was initially visible only within the industry, and that is Claude Code. It was the first truly useful agentic system, and extremely revolutionary.


But before Claude Code, this set of capabilities was mainly focused on enterprises, and many outsiders had never seen it. It wasn't until OpenClaw that "what an AI agent can actually do" entered the public eye.


So, the cultural significance of OpenClaw is this: it was the first time that the general public truly became aware of the abilities of an agent.


The second key point is that OpenClaw is open.


More crucially, it has constructed a brand-new computing model, essentially reinventing computation itself. It has a memory system: scratch space serves as short-term memory, and the file system as long-term storage. It has scheduling capabilities: it can run cron jobs, spawn new agents, decompose tasks, and perform causal reasoning and problem-solving. It has an I/O subsystem: it can take input, produce output, and connect to WhatsApp. And it has an API suite that can run various types of applications, known as skills.


And these four elements fundamentally define a computer. So, we now effectively have: a personal artificial intelligence computer.


And it is open-source, truly open-source, able to run on almost anything. This is the blueprint of modern computing. In a sense, it is already the operating system of modern computing and will be ubiquitous in the future.


Of course, we still have one thing to address: as long as you have agentic software, it may access sensitive information, execute code, communicate externally. So we must ensure that: all of this is governed, secure enough, strategically constrained, allowing these agents to have two out of the three capabilities, but not all three simultaneously.
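The "two out of three" constraint described above can be stated precisely. This is a minimal illustrative sketch, not any real OpenClaw or NVIDIA API: the capability names and the `is_policy_compliant` function are hypothetical, chosen only to pin down the rule that an agent may hold any two of the three risky capabilities, but never all three at once.

```python
# Hypothetical sketch of the "two of three" governance rule for agents.
# The three risky capabilities named in the conversation:
RISKY = {"read_sensitive_data", "execute_code", "communicate_externally"}

def is_policy_compliant(capabilities: set) -> bool:
    """Permit at most two of the three risky capabilities, never all three."""
    return len(set(capabilities) & RISKY) < 3

# Any pair is acceptable; the full triple is not.
coding_agent = {"execute_code", "read_sensitive_data"}
unconstrained_agent = {"execute_code", "read_sensitive_data", "communicate_externally"}
```

The design intent is that no single agent can simultaneously see secrets, run arbitrary code, and talk to the outside world, which closes the most obvious exfiltration path.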


In terms of governance, we have also made contributions. Peter Steinberger is also here today. We have many great engineers working with him to help make this system more secure, more robust, enabling it to protect both privacy and security.


Chamath Palihapitiya:

Jensen, has this paradigm shift already rendered many of the previously passed AI regulatory laws in the United States obsolete?


Many proposals were originally based on old models. Can you speak to how quickly this paradigm shift has made a large number of existing regulatory approaches outdated? AI regulation has now become a very hot topic in American politics.


Jensen Huang:

In this regard, we must always stay ahead of policymakers, and you have done an excellent job in this area. We must proactively approach them, telling them what stage the technology has reached, what it is, and what it is not. It is not a living being, not an alien, it does not have consciousness. It is computer software.


Also, we often hear statements like "we simply do not understand this technology at all." But that is not true; we actually understand a lot. So, first, we must continue to provide policymakers with accurate information; do not let doomsday scenarios and extremism influence their understanding of this technology.


At the same time, we must also recognize that technology is developing rapidly and not let policy lag too far behind. From a national perspective, my biggest concern is this: the United States' greatest national security risk in AI is not AI itself, but that other countries are adopting AI while we, due to anger, fear, or paranoia, are unwilling to let our industry and society embrace AI.


Therefore, what I am truly most concerned about is: AI is not spreading fast enough in the United States.


David Sacks:

One more question. If you were sitting in the Anthropic boardroom, looking at their involvement with the "Department of War" incident, what would you think? This actually follows the point you just made: people do not know how to understand AI, so there is an added layer of resentment, fear, and distrust. If it were you, what different things would you advise Dario and his team to do to change today's outcome and public perception?


Jensen Huang:

First, I want to say that Anthropic's technology is very impressive. We are significant users of Anthropic technology ourselves. I admire their focus on security, their persistence in security culture, and the technical excellence in driving these efforts—it's really great.


Furthermore, they want to remind the public of the capability boundaries of this technology, and I think that is a good thing in itself. However, we must realize that the world is a spectrum: reminding is good, but scaring people is not that great.


Jason Calacanis: Right.


Jensen Huang: Because this technology is too important to us. I think trying to predict the future is certainly fine, but we need to be more cautious and humble. In fact, we cannot fully predict the future.


If some very extreme, very catastrophic judgments are made without evidence that these things will actually happen, the harm it causes may be greater than people imagine.


And now, we are already leaders in the tech industry. No one used to listen to us before, but things are different now. Technology has deeply embedded itself in the social structure, becoming a highly critical industry closely tied to national security. Every word we say carries great weight.


Therefore, I believe we must be more cautious, restrained, balanced, and thoughtful.


David Friedberg:

I will nominate you to do this. AI has only a 17% approval rating in the United States. We have seen what happened in the nuclear energy field: we basically shut down the entire nuclear industry, and now China is building 100 fission reactors while the US has none. Now we are starting to hear calls for moratoriums on data centers. So I think we need to be more proactive.


However, I want to go back to what you mentioned about the internal agent explosion happening inside companies: efficiency gains, productivity gains. Many people are now debating ROI, right? Coming into this year, the biggest question was: will the revenue materialize? Will revenue expand the way AI itself does? And then we saw something like an "Oppenheimer Moment": in February, Anthropic reached a monthly revenue of 5 to 6 billion dollars.



How do you see the upcoming trends? Today, you also mentioned that Blackwell and Vera Rubin already have trillion-dollar-level visibility in the coming years. Coupled with the momentum shown by Anthropic and OpenAI, do you think we have already reached that inflection point and will we see revenue accelerate and expand like intelligence in the future?


Jensen Huang:

Let me answer from a few different perspectives. Look at the audience here today; indeed, both Anthropic and OpenAI are here. But in reality, 99% of the AI represented here is neither Anthropic nor OpenAI. The reason behind this is the extreme diversity of AI itself.


I would say that, as a category, the second most popular model is actually the open model. The number one spot is, of course, OpenAI and its broader ecosystem. The second spot is open models and open weights, and there is a significant gap between it and the third spot. The third spot goes to Anthropic.


This illustrates the overall scale of all AI companies here, so first and foremost, one has to recognize that.


Now, speaking of computational load. When we transition from generative AI to reasoning, the required computational load increases by about 100 times; when we move from reasoning to agentic, the computational load likely increases by another 100 times. In other words, in just two years, the computational demand has roughly increased by 10,000 times. At the same time, people will pay for information, but what they truly prefer to pay for is actually the outcome of work.


David Friedberg: Yes.


Jensen Huang:

Having a conversation with a chatbot, getting an answer, of course, is great. Helping me with research is also fantastic. But what truly makes me willing to pay is it helping me get things done. And this is exactly where we are now, agentic systems are truly getting the job done. They are helping our software engineers get the job done.


So think about it, on one side is 10,000 times more computation, on the other side, there may already be 100 times more consumption demand. And, we haven't even truly begun large-scale expansion yet. We are absolutely on the road to a 1,000,000-fold growth.
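The growth claims above compound multiplicatively, which is worth making explicit. The multipliers here are the rough estimates stated in the conversation (100x for the move to reasoning, another 100x for the move to agents), not measured figures; the sketch only shows how they stack.

```python
# The compounding arithmetic behind "10,000x in roughly two years":
# each phase transition multiplies the compute requirement.
phase_multipliers = {
    "generative -> reasoning": 100,  # Huang's rough estimate
    "reasoning -> agentic": 100,     # Huang's rough estimate
}

total_growth = 1
for multiplier in phase_multipliers.values():
    total_growth *= multiplier
```

One more 100x phase on top of this would land at the 1,000,000-fold figure mentioned above, which is why the multiplicative framing matters: the steps are not additive.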


Jason Calacanis:

I think this segues nicely into a question, how many people are in your company?


Jensen Huang:

We have 43,000 employees, with around 38,000 being engineers.


Jason Calacanis:

We often discuss a topic on the podcast: Wow, the token usage in our company is growing like crazy. Some even ask, when joining a company, "How much token allocation can I get?" because they want to be efficient employees. I remember you mentioned in that two-and-a-half-hour long keynote, that was really long, but great.


Jensen Huang:

Thank you. It could have actually been shorter.


Jason Calacanis:

You mentioned that the token allocation per engineer could be around $75,000. Does that mean that NVIDIA's engineering team would have to spend $1 billion, $2 billion a year on tokens?


Jensen Huang:

That's what we're thinking. Let me give you a thought experiment: Suppose you hire a software engineer or AI researcher with a salary of $500,000 a year, which is quite common here.


By the end of the year, if I ask him, "How much money have you spent on tokens this year?" and he says "$5,000," I'll explode, really. If an engineer earning a $500,000 annual salary consumes less than $250,000 worth of tokens in a year, I would be very cautious. At its core, this is no different from a chip designer saying, "I've decided to only use pen and paper; I don't need CAD tools."
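The thought experiment above implies a simple rule of thumb: token spend should be a meaningful fraction of salary. This is an illustrative sketch, not a stated NVIDIA policy; the 0.5 ratio and the function name are ours, back-derived from the $500,000 salary / $250,000 token example in the conversation.

```python
# Illustrative rule of thumb implied by the conversation (not a real policy):
# an engineer's annual token budget as a fraction of their salary.

def minimum_token_budget(salary_usd: float, ratio: float = 0.5) -> float:
    """Floor on expected annual token spend, as a fraction of salary."""
    return salary_usd * ratio

floor = minimum_token_budget(500_000)  # the $500k engineer from the example
```

By this yardstick, the hypothetical engineer who spent only $5,000 is running at 2% of the expected floor, which is the gap Huang is reacting to.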


Jason Calacanis:

This is truly a paradigm shift. Your understanding of these top employees almost reminds me of LeBron James discussed in an MBA classroom: he spends $1 million a year maintaining his body, so he can keep playing at 41. Why shouldn't these top knowledge workers have 'superhuman abilities'?


Jensen Huang:

Exactly.


Jason Calacanis:

If we push this trend forward two or three years, what will the efficiency of these top NVIDIA employees look like? What will they be able to accomplish?


Jensen Huang:

First of all, the idea of "this is too hard" will disappear. The thought of "this will take too long" will also disappear. The idea of "we need a lot of people" will also disappear.


This is like in the last industrial revolution, no one would say, "That building looks too heavy." And no one would say, "That mountain is too big." All thoughts about "too big, too heavy, too time-consuming" will be dissolved.


David Sacks:

What will be left is only creativity: what you can actually think of.


Jensen Huang:

Exactly right. In other words, the future problem will be: how do you collaborate with these agents.


Essentially, this is just a whole new way of programming. In the past, we wrote code, in the future what we write will be ideas, architectures, and specifications; we will organize teams; we will define assessment criteria, tell the system what's good, what's bad, what constitutes excellent results; we will iterate and brainstorm with it repeatedly.


This is really what you have to do. I believe that every engineer will have 100 agents in the future.
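The "write specs and evaluation criteria, then iterate" workflow Huang describes can be sketched as a loop: a spec goes in, an agent proposes, and a machine-checkable evaluator scores the result and feeds back. Everything below is illustrative, not NVIDIA's actual tooling; the agent is a deliberate stub standing in for an LLM call.

```python
# A minimal sketch of the spec -> generate -> evaluate -> iterate loop.
# `stub_agent` stands in for an LLM API call and is purely hypothetical.

SPEC = "Write a function `slug(s)` that lowercases s and replaces spaces with '-'."

def stub_agent(spec: str, feedback: str) -> str:
    # Stand-in for an LLM call; returns candidate code as text.
    if "replace" in feedback:
        return "def slug(s):\n    return s.lower().replace(' ', '-')"
    return "def slug(s):\n    return s.lower()"

def evaluate(candidate: str) -> tuple[bool, str]:
    # The "assessment criteria" Huang mentions: explicit, machine-checkable tests
    # that tell the system what is good, what is bad.
    namespace = {}
    exec(candidate, namespace)
    slug = namespace["slug"]
    if slug("Hello World") != "hello-world":
        return False, "fails on spaces: use replace(' ', '-')"
    return True, "ok"

feedback = ""
for round_no in range(3):
    candidate = stub_agent(SPEC, feedback)
    passed, feedback = evaluate(candidate)
    if passed:
        break

print(round_no, passed)  # prints: 1 True (converges after one feedback round)
```

The human's remaining job is exactly what the quote says: writing the spec and the evaluator, not the code.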


Jason Calacanis:

Going back to the PR issue. Entrepreneurs like David Friedberg are doing substantial things with your technology and AI at Ohalo: increasing food production and improving the supply of high-quality calories. Friedberg, how far do you think this can drive down costs? What impact will this vision have on what you are doing?


David Friedberg:

We just did zero-shot genome modeling, and it worked. The moment it actually worked was mind-blowing. And it was happening against the backdrop of people replacing their entire enterprise software stack overnight.


I did it myself: in 90 minutes, I replaced the entire software stack and a bunch of workflows. Starting Sunday night at 10 pm, everything was done and deployed by 11:30 pm.


After I, as the CEO, did it, I had all my management team members do the same exercise over the weekend. By Monday, what we saw was: done.


Speaking of something more technical and scientific. We used auto research and a set of data to do something in 30 minutes. If we had followed the traditional path, this would have been a PhD-level result, potentially taking 7 years, and might have been one of the most respected doctoral works in the field, enough to be published in Science.


But what we did was just on a desktop computer, downloading auto research from GitHub, feeding in the newly obtained data, and running it in 30 minutes. Everyone's expression changed at that moment. The potential it unleashed was truly incredible.


So I think this acceleration is expanding everyone's possibilities in an unprecedented way.


But back to this point on auto research: what do you think? A weekend, 600 lines of code, producing such results, and being able to run locally, handling such diverse datasets.


Doesn't this indicate that we are still in a very early stage, whether in algorithm optimization or hardware optimization?


Jensen Huang:

The reason OpenClaw is so amazing is, first, because it perfectly coincided with the breakthrough of large language models; its timing was spot on.


To a large extent, if it weren't for Claude, GPT, and ChatGPT reaching the level they are at today, Peter probably wouldn't have created this thing. The models have indeed reached a very high level.


Second, it brought a new capability: letting these models access the tools we have created over the years, such as browsers and Excel; in chip design, Synopsys and Cadence; and also Omniverse, Blender, Autodesk, and so on. These tools will continue to be used in the future.


Some people are now saying the enterprise IT software industry is about to be disrupted. But I'll give you another perspective: the scale of the enterprise software industry has always been limited by the number of "butts in seats," that is, the number of seats. In the future there will be 100 times more agents than seats, and those agents will query SQL databases, vector databases, Blender, Photoshop.


The reason is simple: first, these tools are already doing a great job; second, these tools are essentially the "intermediary interface" between us and the machine. Ultimately, when the work is done, the results must be presented back to me in a way I can control. And I know how to operate these tools.


So I hope everything will eventually come back to Synopsys, back to Cadence, because that is where I have control, where I can validate against definitive standards.
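Huang's "intermediary interface" point can be sketched in code: agents do not replace existing tools, they call them through the same entry points humans already use, and hand results back in an inspectable form. The registry, table, and query below are hypothetical stand-ins, not any real agent framework.

```python
# Toy tool registry: an agent's capability is just access to existing,
# well-defined tool interfaces (here, SQL via Python's stdlib sqlite3).
import sqlite3

TOOLS = {}

def tool(name):
    """Decorator registering a callable as an agent-accessible tool."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("sql")
def run_sql(query: str):
    # Agents query the same database humans do, through the same interface.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE seats (product TEXT, count INTEGER)")
    db.execute("INSERT INTO seats VALUES ('cad', 100), ('agents', 10000)")
    return db.execute(query).fetchall()

# An agent's "tool call" is a lookup into the shared interface layer:
rows = TOOLS["sql"]("SELECT product, count FROM seats ORDER BY count DESC")
print(rows)  # prints: [('agents', 10000), ('cad', 100)]
```

The design point matches the transcript: the tool stays authoritative, and the human can always re-run the same query to verify what the agent did.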


The Next Battlefield for AI: Open Source, Verticalization, and Global Proliferation


David Sacks:

I have a question about open source. Now we have closed-source models, which are excellent; and we also have open-weight models, with many amazing Chinese models that are really strong.


Two days ago (maybe you were busy on stage at the time and didn't see it), on Subnet 3 of a crypto project called Bittensor, someone completed a training run: they fully trained a 40-billion-parameter Llama model in a distributed manner. A random group of people contributed the compute, yet they managed the entire training process in a stateful way. I think this is technically wild, because the participants are completely randomly distributed.
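The "stateful" training described here can be illustrated with a toy: a coordinator holds the model state, averages whatever gradients arrive each step, and keeps training despite worker churn. This one-parameter sketch with simulated dropouts is only an illustration of the idea, not Bittensor's actual protocol.

```python
import random

# Toy fault-tolerant data-parallel training: randomly distributed workers
# compute gradients; the coordinator averages whatever arrives each step.
# Objective: minimize (w - 3)^2, whose gradient is 2 * (w - 3).

random.seed(0)
NUM_WORKERS, LR = 8, 0.1
w = 0.0  # shared model state, kept by the coordinator across steps

for step in range(100):
    grads = []
    for _ in range(NUM_WORKERS):
        if random.random() < 0.3:      # this worker randomly offline this step
            continue
        grads.append(2 * (w - 3.0))    # surviving worker's local gradient
    if grads:                          # proceed with whoever showed up
        w -= LR * sum(grads) / len(grads)

print(round(w, 3))  # prints: 3.0 (converges despite worker churn)
```

The state (here just `w`) survives every dropout, which is the property that makes a run over unreliable, anonymous contributors possible at all.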


Jensen Huang:

This is like our era of Folding@home.



David Sacks:

That's right. So how do you see the endgame for open source? Do you see the architecture decentralizing and the compute decentralizing too, supporting a path toward open weights and full open source, and thus making AI truly accessible?


Jensen Huang:

I believe that fundamentally we need two things at once: first, models as first-class commercial products, proprietary products; and second, models in open-source form.


This is not an A or B relationship, but rather both A and B are needed. Without a doubt. The reason is that the model is first and foremost a technology, not a final product. The model is a technology, not a service.


For the vast majority of users, at that horizontal level, at the general intelligence level, I actually don't want to fine-tune a model myself. I prefer to continue using ChatGPT, Claude, Gemini, X. They each have their own characteristics, depending on my mood and the problem I want to solve. So this part of the industry will develop very well, it will be very prosperous.


However, all the domain knowledge and professional skill in these industries must be captured in a form those industries can control themselves, and that can only come from open models. The open-model ecosystem is very close to the frontier, and we are investing heavily in it.


Frankly, even if open models really catch up with the cutting edge, I still believe that model-as-a-service, world-class commercial product models, will continue to thrive.


Jason Calacanis:

Almost every startup we invest in now starts with open-source models and then moves toward proprietary ones.


Jensen Huang:

Yes. And the wonderful thing is: as long as you have an excellent router, then from day one, every day, you can access the best models in the world. At the same time, that buys you time to cut costs, fine-tune, and specialize. So you start with world-class capability and then slowly build your moat.
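The router Huang mentions can be sketched as a simple policy: send each task to a cheap specialized model once it is good enough for that task, and to a frontier model otherwise. The model names, costs, and quality scores below are invented for illustration only.

```python
# Minimal model-router sketch. Costs/quality are illustrative assumptions,
# not real pricing; "specialized" stands for your own fine-tuned model.

ROUTES = {
    "frontier":    {"cost_per_1k_tokens": 15.0, "quality": 0.95},
    "specialized": {"cost_per_1k_tokens": 0.5,  "quality": 0.70},
}

def route(task_difficulty: float, specialized_quality: float) -> str:
    """Route to the cheap model once it clears the task's difficulty bar."""
    if specialized_quality >= task_difficulty:
        return "specialized"
    return "frontier"

# Day one: the fine-tune is weak, so a hard task still goes to the frontier model.
print(route(task_difficulty=0.9, specialized_quality=0.70))  # prints: frontier
# After more fine-tuning, the same task can be served cheaply.
print(route(task_difficulty=0.9, specialized_quality=0.92))  # prints: specialized
```

This is the "start world-class, then build your moat" dynamic: traffic migrates to the specialized model only as it earns it, with the frontier model as a permanent fallback.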


David Friedberg:

Jensen, I want to ask a geopolitical question. Of course, no one wants the United States to win the global AI race more than you. But a year ago, during the Biden era, that diffusion rule was actually preventing U.S. AI technology from spreading globally.


Now the new administration has been in power for a year. What score would you give it? When it comes to the global spread of AI, are we at A, B, or C now? What has been done well and what has not been done well?


Jensen Huang:

First of all, President Trump wanted the U.S. industry to lead, he wanted the U.S. tech industry to lead, he wanted the U.S. tech industry to win, he wanted U.S. technology to spread globally, he wanted the U.S. to become the richest country in the world. He wanted it all to happen.


But at this moment, NVIDIA has lost its original 95% market share in the world's second-largest market, now down to 0%. President Trump wants us to reclaim that.


The first step is to get licenses for the companies we can sell to. Many companies have already applied, and we have applied for licenses on their behalf, with Commerce Secretary Lutnick already approving some. Next, we have informed Chinese companies, many of whom have placed orders with us. So we are now restarting the supply chain and sending out goods.


On a higher level, I think we should acknowledge one thing: when we can't get microcontrollers, rare earth minerals, our national security is weakened; when we can't control our communication network, national security is weakened; when we can't provide sustainable energy for the country, national security is also weakened. Each of these industries is a story that I hope the AI industry will not repeat.


As we look to the future and ask "What does a truly globally leading U.S. technology industry, U.S. AI industry look like," we must be honest: AI models cannot be dominated worldwide by one U.S. company, that outcome doesn't make sense.


But we can envision: the U.S. technology stack, from chips to computing systems to platforms, being widely adopted globally. People around the world can build their own AI, public AI, private AI on this U.S. technology stack and then serve their societies. I hope the U.S. technology stack can cover 90% of the globe. I really hope so.


Otherwise, if the end result looks like what happened with solar, rare earths, magnets, motors, and communications equipment, I would consider that a very bad outcome for U.S. national security.


Chamath Palihapitiya:

How closely are you following global conflict situations right now? To what extent does this worry you? For example, the Middle East could impact helium supply, and helium is a potential supply chain risk for semiconductor manufacturing. How concerned are you about these issues? How much effort are you putting into this?



Jensen Huang:

First, speaking of the Middle East, we have 6,000 families there. We have many Iranian employees in the company whose families are still in Iran. So, we have many families there.


The first thing is this: They are now very anxious, very worried, very afraid. We have been thinking of them all along, closely monitoring the situation. They will have our full support. Some have also asked me, given the current situation in the Middle East, whether we will continue to stay in Israel? My answer is: We will stay in Israel 100%. We fully support the families there. We will continue to stay in the Middle East 100%.


Some have also asked, since the situation in the Middle East is as it is, do we still think it's worth expanding AI there? My view is: The reason there is war is because everyone wants a more stable outcome. And I believe, after the war, the Middle East will be more stable than before. So, if we were willing to consider it before the war, we should take it even more seriously after the war. So on this issue, I am also 100% committed.


We have three things to do. First, we must quickly reindustrialize America, whether it's chip factories, computer factories, or AI factories.


Jason Calacanis:

How is that progressing?


Jensen Huang:

Progress is very good. The reason we have been able to advance at an astonishing speed in Arizona, Texas, and California is that we have received strategic support, friendship, and assistance from the supply chain in Taiwan. They are truly our strategic partners. They deserve our support, our friendship, our generosity, and they are doing their best to help us accelerate the manufacturing process.


Second, we must diversify the manufacturing supply chain. Whether it's South Korea, Japan, or Europe, we need to diversify the supply chain to make it more resilient. Third, as we enhance diversification and resilience, we must also exercise restraint and avoid unnecessary pressure.


Jason Calacanis:

You mean, we need to be patient.


Chamath Palihapitiya:

What about helium? Many reports have mentioned this issue.


Jensen Huang:

I think helium could be a problem. However, on the other hand, supply chains usually have a lot of buffer inventory, and such systems generally leave a certain margin.


Jason Calacanis:

You have made great strides in autonomous driving and also made a major announcement. You have added many partners, including Uber. Recently, I saw a video of you in a Mercedes-Benz autonomous driving car. You and Uber also announced that you will be deploying more vehicles with many automakers.


I understand your bet is this: there will be an open platform similar to Android in the future, and you will play a key role in it, serving dozens of car manufacturers; on the other side, there may be a closed system like iOS, such as Tesla or Waymo.


What is your strategy? How will this game unfold? Because it seems like you are both cooperating in some areas and competing in others, and your stack is very deep.


Jensen Huang:

First, we believe that everything that moves in the future will someday be fully or partially autonomous. Second, we don't want to build self-driving cars ourselves, but we want to empower every car company in the world to build self-driving cars.


So we have built three computers: a training computer, a simulation and evaluation computer, and an in-vehicle computer. We have also developed the world's safest driving operating system.


At the same time, we have also developed the world's first autonomous driving system with reasoning capabilities. It can break down complex scenes into simpler ones and navigate through them one by one, just like a reasoning model. This reasoning system is called Alpamayo, and it has enabled us to achieve amazing results.


We will pursue vertical optimization and horizontal innovation; then let each manufacturer decide for themselves. Do you only want to buy one of our computers? Like Elon and Tesla, then they buy our training system; or do you want to buy the training system plus the simulation system? Or do you want to work with us to integrate all three sets, and even install the in-vehicle computer in your car?


Our attitude has always been that we want to solve problems, but we do not insist that only we can provide the unique answer. Regardless of how you choose to cooperate with us, we are happy.


David Sacks:

Following up on this question, which I find particularly interesting: you are building a platform to let a thousand flowers bloom. But some of those flowers now want to move down to the bottom of the stack and compete with you. Google has TPU, Amazon has Inferentia and Trainium; almost everyone is working on their own "I can beat NVIDIA" version, even though they are also your major customers.


How do you handle these relationships? What do you think will happen in the long run? What role will these products ultimately play in the entire ecosystem?


Jensen Huang:

This is a very good question.


First, we are the only true AI company. We build our own foundational models, and we are at the forefront in many areas. We construct every layer, every stack from top to bottom. We are also the only AI company that collaborates with all AI companies in the world.


They never show me what they are doing, but I always make it clear to them what I'm doing. So our confidence comes from one thing: we are very happy to compete on "whose technology is the best." As long as we can continue to run fast, I believe that continuing to purchase from NVIDIA will still be one of their most economical choices. I am very confident about this.


Second, we are the only architecture that can be deployed on all cloud platforms. This brings a fundamental advantage. We are also the only architecture that can be taken down from the cloud and placed in on-prem data centers, cars, any region, or even in space.


So, we actually have a significant portion of the market, about 40% of the business. If you don't have the CUDA stack, if you are not able to provide the entire AI factory, customers simply don't know how to collaborate with you. They are not looking to buy chips; they are building AI infrastructure. So what they need is: you come in with a complete stack, and we happen to have the complete stack.


So, surprisingly, if you look now, NVIDIA's market share is actually still rising.


David Sacks:

So, you mean, these companies tried around, found out, "Wow, this is too complicated," and then came back? Is that why your market share continues to grow?


Jensen Huang:

Market share growth has several reasons.


First, our pace of advancement is too fast. Second, we make everyone realize: the problem is not in making the chip, but in making the system, and this system is extremely difficult to build. So their scale of cooperation with us is still increasing.


Take AWS, for example: I remember they announced just yesterday that they are going to buy 1 million chips over the coming years. That is a very large purchase, on top of the huge volume they have already bought. We are certainly glad to supply them.


In addition, in recent years, our market share has grown because now Anthropic is here, Meta is also here, the growth of open models is amazing, and all of this is happening on NVIDIA.


So our market share is rising; on the one hand, the number of models is increasing; on the other hand, more and more of these companies are coming out of the cloud and growing in regional deployments, enterprise scenarios, and industry edge scenarios.


And that whole market, if you are just making an ASIC, it is actually very difficult to break into.


David Friedberg:

On a related note, without going into the details of the numbers, analysts don't seem to believe you.


You mentioned that compute might increase by 1,000,000 times, but the market's consensus expectation is that you will grow 30% next year, 20% the year after, and by 2029, which should be an explosive growth year, only 7%. If you fit your Total Addressable Market (TAM) to these growth numbers, the implication is that your market share will plummet.


So, from the future order book you've seen, are there any signs to support this assessment?
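Friedberg's arithmetic can be checked directly: compound the consensus growth rates and compare the result against a faster-growing TAM, and the implied share loss falls out. The 2028 growth rate and the TAM multiple below are assumptions added purely for illustration.

```python
# Compounding the consensus growth rates quoted above against a hypothetical
# TAM that triples over the same window. The 12% middle-year rate and the 3x
# TAM multiple are illustrative assumptions, not figures from the transcript.

consensus_growth = [0.30, 0.20, 0.12, 0.07]  # 2026..2029; 2028 rate assumed
revenue = 1.0                                 # normalized starting revenue
for g in consensus_growth:
    revenue *= 1 + g

tam_multiple = 3.0                 # assumption: TAM triples over four years
share_change = revenue / tam_multiple

print(round(revenue, 2))       # prints: 1.87 (consensus implies ~1.9x revenue)
print(round(share_change, 2))  # prints: 0.62 (share shrinks to ~62% of today's)
```

In other words, the consensus numbers are only consistent with Huang's TAM view if NVIDIA loses a large slice of the market, which is exactly the tension Friedberg is pointing at.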


Jensen Huang:

First of all, they simply do not understand the scale and breadth of AI.


David Sacks:

Yes, I think that's true too.


Jensen Huang:

Most people think AI is just the domain of those five super-giant cloud companies.


Jason Calacanis:

Right.


David Sacks:

There's also an investment orthodoxy that believes "the larger the scale, the harder the sustainable growth." They have to go back and explain the model to the investment bank's risk management committee; they cannot easily believe that "five trillion can rise to fifteen trillion." At most, they are willing to go up to seven trillion, anything beyond that they cannot accept.


Jason Calacanis:

They cannot fathom a $10 trillion market-cap company.


David Sacks:

Essentially, it's a kind of self-protective modeling; things that have never happened in history, they dare not factor in.


Jensen Huang:

Moreover, you must redefine what you are actually doing.


Recently, someone asked: Jensen, how could NVIDIA surpass Intel in server market size? The reason is quite simple: the entire data center CPU market is roughly $250 billion a year. And we, as you all know, could reach about $250 billion in roughly the time it has taken us to sit here and chat.


Jason Calacanis:

Nice.


Jensen Huang:

Of course, this is a joke.


Chamath Palihapitiya:

Nothing said in podcasts counts as official performance guidance.


Jensen Huang:

That's right, not performance guidance. But the key point is: how big you can grow ultimately depends on what you are actually building.


NVIDIA is not in the business of making chips, that's the first point. Second, making only chips is no longer enough to address the issues of AI infrastructure; this matter is too complex. Third, most people have a narrow understanding of AI, limited to only the part they see, hear, and discuss.


OpenAI is very powerful, it will be very large; Anthropic is also very powerful, it will also be very large. But AI itself will be even larger than the sum of them. And what we serve is precisely that larger part.


David Sacks:

So, can you explain the "space data center" business to the average person? How should it be understood compared to those large data centers on the ground?


Jensen Huang:

We are already in space.


David Sacks:

How should ordinary people understand this business?


Jensen Huang:

First, of course, we should do well on the ground, after all, we are now on the ground. Second, we should also be prepared to enter space. There is, of course, a lot of energy in space. The problem lies in heat dissipation. You cannot rely on conduction and convection like you do on the ground, so you can only rely on radiation for heat dissipation, which requires a very large surface area. This is not an unsolvable problem, after all, there is plenty of space in space, but the cost is still very high. However, we will explore.


Moreover, we are already there. Our hardware has been radiation-hardened, and CUDA is already running in many satellites worldwide. They are doing imaging, image processing, and AI image analysis. These things should be done in space, instead of transmitting all data back to the ground first for image analysis. So, indeed, a lot of work should be done in space.


At the same time, we will continue to research: what should a data center in space look like? This will take many years. That's okay, I have plenty of time.
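The radiation-only cooling constraint Huang describes is governed by the Stefan-Boltzmann law, and a back-of-the-envelope sizing shows why the required surface area is so large. The power level, panel temperature, and emissivity below are illustrative assumptions, not figures from the transcript.

```python
# Sizing a deep-space radiator with the Stefan-Boltzmann law:
#   P = epsilon * sigma * A * (T_panel^4 - T_space^4)
# Assumptions (illustrative): 1 MW heat load, 300 K panel, emissivity 0.9.

SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W / (m^2 K^4)

def radiator_area(power_w: float, t_radiator_k: float,
                  emissivity: float = 0.9, t_space_k: float = 3.0) -> float:
    """Radiating area (m^2) needed to reject `power_w` watts to deep space."""
    flux = emissivity * SIGMA * (t_radiator_k**4 - t_space_k**4)  # W per m^2
    return power_w / flux

# Rejecting 1 MW at a 300 K panel temperature:
area = radiator_area(1e6, 300.0)
print(round(area))  # prints: 2419 (~2,400 m^2 of radiator per megawatt)
```

Since gigawatt-class AI factories are the relevant scale on the ground, the same arithmetic implies millions of square meters of radiator in orbit, which is exactly why Huang calls heat rejection, not energy, the hard problem.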


The Future of Robotics, Healthcare, and Work: How Will AI Ultimately Enter the Real World


Jason Calacanis:

I would like to further inquire about healthcare.


As we reach a certain age, we begin to think about lifespan and healthspan. We all look good, some people may look better. Jensen, I really don't know what your secret is. Are you into anti-aging? What exactly can't you eat? You have to tell me privately.


From the perspective of healthcare system development, where is this direction headed? What progress have we actually made?


I was just using Claude for analysis, looking into what's going on with these medical billing codes in the U.S. The U.S. is spending twice as much as others, yet the health output seems to be only half.


From what I've gathered, about 15% to 25% of the money is actually spent on initial primary-care physician visits. Honestly, we all know that today a large language model can already do a more consistent and better job at that initial visit.


So, what is still missing to break through regulation and allow AI to truly impact the entire healthcare system?


Jensen Huang:

Our main involvement in healthcare is in several directions.


First is AI physics, which serves AI biology, using AI to understand and represent biology and its behaviors. This is crucial in drug discovery.


Second is AI agents, used in scenarios like diagnostic assistance. OpenEvidence is a great example, as is Hippocratic. I really enjoy working with these companies. I truly believe agentic technology will fundamentally change the way we interact with doctors and the healthcare system.


The third part is physical AI.


To recap: the first part is AI physics, using AI to predict physics; physical AI is its counterpart, AI that understands physical laws, which can be applied in robotic surgery. That area is already very active. In the future, every instrument you encounter in a hospital, whether it's ultrasound, CT, or any other device, will become agentic.