
COMPUTEX 2026 Arm CEO Keynote: The foundation for the agentic AI era

Arm® June 3, 2026 48m 6,887 words

About this transcript: This is a full AI-generated transcript of COMPUTEX 2026 Arm CEO Keynote: The foundation for the agentic AI era from Arm®, published June 3, 2026. The transcript contains 6,887 words with timestamps and was generated using Whisper AI.

"There's a specific kind of silence right before the world changes, it's that split second of realisation, the moment when you know that life as you've lived it is about to become something entirely new. For more than 35 years we've learnt to recognise that moment, from Cambridge to the US to Taiwan"

[00:00:00] There's a specific kind of silence right before the world changes, it's that split [00:00:29] second of realisation, the moment when you know that life as you've lived it is about [00:00:36] to become something entirely new. [00:00:44] For more than 35 years we've learnt to recognise that moment, from Cambridge to the US to Taiwan [00:00:58] and to the world. [00:01:01] We partnered with our ecosystem, solving problems together. [00:01:09] We've been here from the start of AI, preparing the world for what's next. [00:01:18] We're powering intelligence everywhere, in how we live, work, play and move. [00:01:29] Together, this is our moment. [00:01:34] Because when you know you have the power to change everything, you've stepped forward. [00:01:39] You've stepped forward. [00:01:40] Let's step forward. [00:02:00] Let's welcome the Arm Chief Executive Officer, René Haas. [00:02:06] Please welcome Arm Chief Executive Officer, René Haas. [00:02:19] Hello, welcome. I apologize for the delay, but we will get moving as quickly as possible. [00:02:26] We have a lot of things to share with you this afternoon. [00:02:31] June means Computex, and June means a muggy evening and afternoon in Taipei, [00:02:38] but it is wonderful to be back here. [00:02:41] I think my first Computex was 2004, 2005-ish, [00:02:45] so it's 20-plus years, plus or minus, when COVID hit. [00:02:49] Arm started in 1990, and it was not long after 1990 that Taiwan and Arm started a relationship. [00:02:59] Taiwan has built Arm. [00:03:01] We are nowhere without the ecosystem and partners that exist inside Taiwan.
[00:03:08] Now, going back in time, we think probably around 1993-ish, a few years after we started, [00:03:16] the first Arm chip was designed here. [00:03:19] Those were early days. [00:03:21] SoCs were kind of a foreign thing. [00:03:23] Design tools, physical design, EDA that could support [00:03:28] SoCs really didn't exist, but we were working with Etree back in the day, [00:03:33] who did some initial work with us to test out our IP, our methodologies. [00:03:38] Not long after that, the first Arm chip was manufactured in Taiwan. [00:03:46] We believe it was TSMC. [00:03:47] Looking back, it may have been UMC. [00:03:49] It was some early test chips. [00:03:50] We didn't get into production really until later in the decade, [00:03:55] but the first Arm chip was packaged and tested here. [00:03:57] So really, not long after Arm started, we were linked to Taiwan. [00:04:02] Some of the significant volumes, though, that really embody what Arm is all about, [00:04:07] started in the 2000s. [00:04:10] And this is before the iPod. [00:04:12] If folks remember these little tiny MP3 players that had maybe 256 songs, [00:04:18] fit in your pocket from Creative, Diamond, Rio, companies like that. [00:04:24] Those were all Arm-based. [00:04:26] And, of course, then the iPod, which in many ways was the catapult [00:04:31] for the Arm technology being everywhere, was here. [00:04:36] And that was in a chip designed by PortalPlayer that went into the iPod, the very first MP3 player [00:04:41] that took volume. [00:04:44] But it was really in 2008 when we had the revolution in mobile. [00:04:52] Here we go. [00:04:53] The mobile revolution was really launched in Taiwan. [00:04:56] Now, we were involved, obviously, with the early GSM phones, as folks know from Nokia and LG, etc. [00:05:04] But it was really the launch of the iPhone and then the Android phones that soon followed.
[00:05:10] And that revolution really, really launched the growth of Arm into a set of volumes we'd not seen before. [00:05:17] So it was really that period that was the most significant for us. [00:05:20] And today, and I'll talk more about Arm server CPUs, 100% of those CPUs are built here. [00:05:30] And when we look at aggregate across everything that we've done in our history with all our partners, [00:05:37] about 250 billion chips have been built in Taiwan, more than any other region on the planet. [00:05:50] I cannot tell you the gratitude we have as a company for the ecosystem, the people, the talent, the partners here. [00:05:57] Arm is nowhere without the partners of Taiwan. [00:06:01] Now, some very cool products have come out of the Taiwan ecosystem. [00:06:06] When we look at the edge, products such as the Amazon Edge, Oppo and Vivo phones, [00:06:12] Apple MacBooks, a product I use constantly. [00:06:15] I don't mean this as a promo, but these Meta Ray-Ban glasses, they are amazing. [00:06:19] I use them for phone calls, videos, messages, all here in Taiwan. [00:06:26] Physical AI, the humanoids, the most advanced in the world, [00:06:31] whether it's Tesla, Figure, Techman, all the chips here built in the Taiwan ecosystem. [00:06:40] And then, of course, cloud AI, whether it's the TPU racks, the racks by NVIDIA, [00:06:48] Graviton, everything here, as I mentioned, 100% of our ecosystem is built in Taiwan. [00:06:57] So, without Taiwan, there really is no Arm. [00:07:02] Thank you again. [00:07:09] Now, what seems like a long time ago, and in the world that we're living in with AI, we're living at light-year speed, [00:07:18] we did an event called Arm Everywhere back on March 24th. [00:07:22] And at that time, we were looking at what was going on relative to the growth of agents and agentic AI. [00:07:31] And at that time, this is March 24th, not so long ago, we showed a slide about the growth of OpenClaw relative to Linux and Kubernetes.
[00:07:42] GitHub stars on the left are exactly what you think they are. [00:07:45] They are stars that rate the popularity or the stickiness of a certain application. [00:07:51] OpenClaw reached levels almost beyond parabolic in terms of the takeoff. [00:07:57] And this is back in March 24th. [00:08:00] And what that told us was that the growth of these agentic platforms were driving demand for CPUs in a way we had not seen before. [00:08:12] And the logic behind that is quite simple. [00:08:15] GPUs, XPUs are amazing at generating tokens. [00:08:19] That is their purpose. [00:08:22] Whether it's training to generate the learning or inference to deliver the tokens, [00:08:26] the token machine, the token factory is the accelerator. [00:08:30] But agents, unlike humans, don't sleep. [00:08:34] And agents beget agents that beget agents. [00:08:37] And all of those tokens that need to be distributed, managed, orchestrated, delivered to the destination, [00:08:44] that's only a workload that CPUs can do. [00:08:47] CPUs, of course, in conjunction with a full system design. [00:08:50] So we made a comment back on March 24th. [00:08:54] And I think we were probably one of the very first to do this. [00:08:57] It said we believe going forward that four times the number of CPU cores needed in the same power envelope going forward. [00:09:06] Now, that multiplier ended up getting so many questions relative to show me the math and how do you figure that out. [00:09:14] And not long after that, you started hearing numbers of 4X, 8X, 10X. [00:09:19] It's a hard number to predict just based upon the growth rate of these agents. [00:09:24] But what we do know is as follows. [00:09:27] If we look at today what we're seeing in terms of agentic growth, even fast forwarding from the 24th of March, this is just exploding. 
[00:09:41] We're seeing this with SaaS companies, whether it's Snowflake or Salesforce or ServiceNow, who are developing all the agents relative to running in the back plane. [00:09:54] The explosion of Anthropic with Claude Code, Codex from OpenAI. [00:09:59] All these agentic workloads are driving more demand. [00:10:02] And what that does in turn is drive a very, very significant growth in terms of where the CPUs go. [00:10:12] So if you change the axis on the Y side to units and you look forward in terms of what the growth rate looks like, CPUs are growing even faster than we had thought. [00:10:25] And we are seeing this across the board. [00:10:27] It's not just Arm. [00:10:29] Of course, I'll be promoting Arm a little bit more later. [00:10:31] But we're seeing this from anyone who's in the CPU business. [00:10:35] The demand for these CPUs continues to explode because the agents beget agents beget agents. [00:10:42] Now, is the number four to one? [00:10:45] Is the number six to one? [00:10:47] Is the number eight to one? [00:10:50] I don't know. [00:10:51] But what I do know is that it's getting bigger. [00:10:54] It's that the agents continue to accelerate relative to the growth. [00:10:58] And with that, CPU growth is also rising. [00:11:03] We threw out a number back on March 24th, around a CPU TAM in five years, going to north of $100 billion, $120 billion. [00:11:14] And again, at the time when we did that event, we had a lot of questions from media, investors, analysts, saying that number seems a little too aggressive, not sure how you got there. [00:11:25] Fast forward, the numbers that people are talking about are almost twice that number, if not larger. [00:11:31] What we do know is that AI agentic workloads, because of the more tokens you generate, the more information that's being used, the more that they are agentic, drive demand for compute. [00:11:46] And of course, we have an answer for that. [00:11:50] The Arm AGI CPU.
[00:11:55] Now, this CPU, as I mentioned before, 100% built in Taiwan. [00:12:03] I'm going to show you a video that we showed on March 24th. [00:12:06] I'm going to show it to you again for those that didn't see it. [00:12:08] I want to show it again because, frankly, I love it. [00:12:11] It's a great video and says everything you want to know about the product, but also emphasizes the importance of the Taiwan ecosystem. [00:12:18] [Video plays.] [00:13:48] I think I can watch that video every single day. [00:13:56] I just get so motivated and enthused by what I see there. [00:14:00] So the Arm AGI CPU, built in Taiwan. [00:14:04] TSMC, our partner, we are now in production of this product. [00:14:09] One of the things that we emphasized early on when we talked about potentially delivering solutions into the marketplace was that we didn't want to talk about the product until we had customers, the product was shipping. [00:14:21] And equally importantly, that we had partners who could help deliver the product to market. [00:14:25] We understand that in this world, it's not just about delivering a chip, but it's delivering a full system with partners. [00:14:33] And we've worked with some of the best on the planet all here in Taiwan. [00:14:39] I understand there are actually some that are outside there in the demo area. [00:14:44] I think we may even have a full rack, I've heard, from Supermicro sitting out there. [00:14:49] But whether it's ASRock or TSMC, our fab partner, Quanta, InCycle, Supermicro, ASPEED, all fantastic partners who enabled our ecosystem to deliver amazing solutions. [00:15:03] Now, this product comes in two flavors from a system standpoint, and one of the things that we really emphasize with the Arm AGI CPU is maximum performance, density, and efficiency.
[00:15:16] Of course, our hallmark is around energy efficiency. [00:15:20] We were born from mobile phones. [00:15:22] We designed a custom CPU way back in the day that had to fit into a plastic package and run off batteries. [00:15:30] And that is a mindset that sits inside our engineers and everything that we do, and it translates to amazing solutions and products. [00:15:37] An air-cooled rack, 36 kilowatts, 8,000 cores, and a liquid-cooled rack that has over 45,000 cores, 200 kilowatts. [00:15:50] So two different solutions. [00:15:53] But what's key about this product line is the performance per rack, performance per watt. [00:16:01] Two times the performance per rack versus the comparable x86 system. [00:16:07] Basically means same power envelope, two times the benefit in terms of performance. [00:16:12] If you want half the power, you still have equivalent performance. [00:16:17] So it's incredibly efficient. [00:16:19] But more importantly, when you think about what goes into these giant data centers, and we're seeing announcements literally daily. [00:16:29] In fact, the parent company of Arm, SoftBank, just announced a partnership in France for a 5 gigawatt data center. [00:16:36] These data centers are incredibly capital intensive. [00:16:39] The energy costs are huge, so having the benefit of performance per rack, more CPU in the same power envelope, has huge, huge benefits versus the competition. [00:16:51] We estimate, for about 10 gigawatts of capacity, up to $10 billion of savings. [00:16:58] But as we go forward, and we have more and more CPUs inside the systems, you'll get even more benefit relative to using the Arm AGI CPU. [00:17:10] Now, we were super proud back in March to talk about our partners, people who had embraced the solution, customers that we had signed up, Meta, Rebellions, SAP, Cerebras, OpenAI, SK Telecom.
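The rack figures quoted above can be turned into a quick density comparison. The following is a minimal sketch, not anything shown in the keynote: the function names are hypothetical, the liquid-cooled core count is taken as the lower bound 45,000 (the talk says "over 45,000"), and the last helper assumes performance scales linearly with power, which is a simplification.

```python
# Rack figures as quoted in the keynote. The liquid-cooled core count is
# stated as "over 45,000", so 45_000 is used here as a lower bound.
RACKS = {
    "air-cooled": {"cores": 8_000, "kilowatts": 36},
    "liquid-cooled": {"cores": 45_000, "kilowatts": 200},
}


def cores_per_kilowatt(cores: int, kilowatts: float) -> float:
    """Core density per kilowatt of rack power."""
    return cores / kilowatts


def watts_per_core(cores: int, kilowatts: float) -> float:
    """Average rack-level power budget per core, in watts.

    This is whole-rack power divided by cores, not the CPU's own draw.
    """
    return kilowatts * 1000 / cores


def relative_power_for_same_performance(perf_ratio: float) -> float:
    """Fraction of baseline power needed to match baseline performance.

    Assumes performance scales linearly with power (an assumption made
    here for illustration, not a claim from the talk).
    """
    return 1 / perf_ratio


if __name__ == "__main__":
    for name, spec in RACKS.items():
        density = cores_per_kilowatt(**spec)
        budget = watts_per_core(**spec)
        print(f"{name}: {density:.1f} cores/kW, {budget:.2f} W/core")
    # The "two times the performance per rack" claim, read as a ratio of 2.0
    print(relative_power_for_same_performance(2.0))
```

Under these assumptions both configurations land at roughly the same density (about 222 and 225 cores per kilowatt), which suggests the liquid-cooled rack mainly scales total capacity rather than per-core efficiency.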
[00:17:24] And that was just on March 24th, and we talked about our customer base and who had adopted the product. [00:17:32] I'm proud to say that since that time, even more companies have joined the family. [00:17:39] Oracle, huge partner with OCI. [00:17:42] We have a long history with Oracle. [00:17:44] They've now joined the Arm AGI CPU family, as well as ByteDance. [00:17:48] Two new partners, part of the family, validating that the Arm AGI CPU solves real-world problems. [00:17:57] Now, we talked about this back in March, and I want to emphasize it again. [00:18:02] We are now a full end-to-end solution provider. [00:18:07] So, while we do have production silicon of the Arm AGI CPU, not everyone wants to buy the Arm AGI CPU, and that's okay. [00:18:15] We have compute subsystems, many partners who take that, and we have just standalone IP in this space. [00:18:22] Whether it's Google, whether it's Amazon, whether it's NVIDIA, whether it's Microsoft, we have many, many customers who are on the left-hand side of that slide. [00:18:31] And we intend to provide solutions to whatever the customers want to see. [00:18:36] And that is very important, because the momentum is really increasing for us now with agentic AI, whether it's our own CPU or our partners'. [00:18:48] A very significant announcement took place last month, where Google announced for their TPU 8T and 8I, [00:18:57] that the head node, the CPU that interfaces into the accelerators, is going to move from x86 to Axion, which is their internal chip, using Arm Neoverse. [00:19:11] 60% less power at the same performance. [00:19:17] Andy Jassy had a great quote, I think on one of the earnings calls, that basically said for Graviton, [00:19:25] we had two customers say, can we buy everything that you have? [00:19:30] More than half of their design starts are now based on Graviton versus x86. [00:19:39] A few years ago, that was zero.
[00:19:43] And of course, NVIDIA, who announced Vera, amazing partners, Vera is an amazing product, the list of partners is far larger than here, I didn't have a slide big enough for all of them, but NVIDIA has had tremendous momentum with Vera. [00:20:01] Now, our intentions are very clear for the Arm AGI CPU, we intend to be in this for the long term, it's multi-generational, Arm AGI CPU2 is already underway, and as you can imagine, it has more cores, is more power efficient, and has better performance. [00:20:18] And Arm AGI CPU3 is on the way, but these are all based on the compute subsystems that we intend to deliver, along with the chips. [00:20:28] And they'll be lined up roughly on the same cadence, so the CSSs that we deliver to our partners, those are what we use to enable our end devices. [00:20:37] So that's Arm AGI CPU, which has had incredible momentum. [00:20:42] Now, I want to switch gears a little bit, because Computex to me, always having come here 20 years ago for the very first one, was always about the old exhibition hall, floppy disk controllers, USB cables, all kinds of things in terms of IT malls. [00:21:05] And you could go into these shops and buy almost anything under the sun, it was like a mini Fry's, 20 of them on a floor in a building that was 10 stories high. [00:21:16] And that's how people bought PCs back in the day, in terms of how they shopped for them. [00:21:24] And if you think about how these PCs were built, and how we used to buy, it was very interesting. [00:21:35] You'd have literally every single price point you could think about, whether it was a base entry laptop, raise your hand if you remember the netbook. [00:21:47] I knew the NVIDIA guys would remember that one, we have battle scars from that one, all the way up to high-end gaming machines. [00:21:55] But literally, these units were priced at $50 price points. [00:21:59] You had feeds and speeds, clock frequency, memory size, et cetera, et cetera.
[00:22:05] So much has changed, obviously, not only in how we buy PCs, but more importantly, how we use these products. [00:22:20] How we use the products has really, really evolved with, obviously, what the smartphone has done, what the web has done, what applications have done. [00:22:32] And what we see is that they've really started to bifurcate into kind of two areas, I would say. [00:22:41] One is, and I think many of you can identify this on the bottom left, is I need a machine that is on the go, battery life is really good, connects everywhere, and I need it to kind of look like a large phone with a keyboard where I can do work, but it maps very closely to what my phone does. [00:23:03] And if I think about myself personally, I have one of these flip phones, which I use for reading documents and reviewing presentations, and I'm a CEO, so I create very little these days. [00:23:15] I review many things. [00:23:17] But what I find is I go back and forth a lot between that smartphone that flips like a tablet into the PC, but it's really super important that the PC and phone are synchronized and they can do things back and forth very, very quickly. [00:23:33] There's also an extreme performance workload, and that is, I'm either running agents, I'm either running models, I'm doing some development work, I need some very, very extreme level of performance. [00:23:50] So there's really two different components in two different areas in terms of how they all work. [00:23:55] So only ARM really enables this for PCs, and I think that's a very, very key distinction in terms of the way we used to think about this category back in the day, where literally you had every single price point covered, every single feed and speed. [00:24:14] Now you want two different ends of the spectrum. [00:24:18] And whether it's long battery life, great AI experience, we're in that bottom category, but if you also want the agentic type of performance, we're there as well. 
[00:24:30] Now, specifically, when we look at the units that are there, you can see that you've got the Acer device, Mac Neo, pretty interesting product, the Google Book, Microsoft Surface, Mac Studio, of course, the NVIDIA RTX Spark, which was just announced, which I'll talk about. [00:24:52] But these two broad categories are very unique to Arm. [00:24:55] And I get lots of questions, you know, over the years about Windows on Arm, and when is Arm going to really take its place as a significant player in laptops in the compute space. [00:25:04] I would argue that we are now actually there. [00:25:09] Because when we look across the spectrum of the operating systems that are supported, whether it's Linux, whether it's macOS, which is 100% on Arm today, Chrome, Windows, only Arm can enable this across the board. [00:25:31] And this would not be done without huge, huge cooperation from all of our partners who are listed there, the folks on the operating system side that we work so closely with. [00:25:42] We've worked for decades with Apple. [00:25:44] We've worked for decades with Google and Microsoft. [00:25:48] This work does not happen overnight. [00:25:50] There is a huge amount of effort to go off and make this happen, and I want to give a round of applause and thanks to all of our partners for making this work. [00:25:58] Now, I want to talk about a product that we knew was being worked on, and we are proud to be a partner with NVIDIA on the RTX Spark powered by Arm. [00:26:18] 20 cores, Arm-based cores, in the custom Grace CPU. [00:26:24] I believe that is the most CPU cores that you can find in a laptop anywhere. [00:26:30] But when you pair it with Blackwell, the world's most powerful GPU for agentic, you have an incredibly special product. [00:26:41] One petaflop of FP4, huge amount of memory, full Windows native on Arm. [00:26:50] Amazing product. [00:26:51] And of course, as you'd expect, partners who are there already.
[00:27:05] Acer, Asus, Dell, Gigabyte, HP, Lenovo, Microsoft, MSI. [00:27:14] I think I saw a Surface Ultra that was announced, an amazing product. [00:27:19] Congratulations again to the NVIDIA team for making all this happen. [00:27:23] Now, our role here was working very closely with NVIDIA and with MediaTek using our CSS strategy. [00:27:37] And again, for those who are not familiar with what our compute subsystems do, the CSS is basically the building blocks that we use to put together everything to build a full solution. [00:27:50] The CPUs, the GPUs, the system IP, the memory controllers, everything that goes into building a custom SoC. [00:27:59] We provide these to our customers. [00:28:02] We did this with MediaTek as either full solutions they can take or building blocks that they can start with. [00:28:10] So, we see a very significant opportunity, again, given the strategy we talked about with IP and compute subsystems around the Arm AGI CPU, very, very similar to what we're doing with the CPUs for the CSSs. [00:28:26] And I think the PC space is going to be a very, very interesting domain, as I said, going forward, because with these use cases on the bottom left, again, the kind of user that I am, relative to using the systems for creation and things of that nature. [00:28:44] So, the high-end systems, when we start thinking about where agents can go and how agents interface with us, it's going to be a very, very different domain. [00:28:53] So, I'm not sure if the systems are available yet, but we actually got access to some of the hardware and technology, and we decided to take it for a spin. [00:29:14] So, again, complete Surgeon General's warning here, this following video was AI-generated, so please don't have your legal teams contact us. [00:29:30] But let's take a quick look.
[00:30:20] [Video plays.] [00:31:13] Now, I know you're probably saying, I'm not sure that's AI because the dude always wears [00:31:19] the same clothes. [00:31:20] But, on the other hand, those are events that I would not actually do myself. [00:31:24] But, I think it's just a small example of the kind of creation that can be done, you [00:31:29] know, on these computers. [00:31:30] And, where I think we're going to go with agentic AI.
[00:31:35] Now, I want to be able to talk more about the product, but I'm kind of thinking that there's [00:31:40] probably someone better than me to join me on stage to talk about the RTX Spark and everything [00:31:46] that NVIDIA does. [00:31:47] So, I'm going to introduce a special guest here. [00:31:50] If my clicker behaves. [00:31:52] That's a pretty cool video of René, superstar, action hero, not just a superstar, he's an [00:32:17] action hero. [00:32:18] I think the night market was the part I thought was the coolest. [00:32:20] Yeah, well, that's the most exciting part of your video. [00:32:24] Yeah, well, thank you for joining. [00:32:26] I appreciate it. [00:32:27] Yeah. [00:32:28] So, tell me, Jensen, congratulations on the RTX Spark. [00:32:33] Amazing. [00:32:34] Thank you. [00:32:35] Windows on Arm is not a new thing. [00:32:37] Yeah. [00:32:38] Why is this one going to be different? [00:32:40] Look at his stock price. [00:32:42] I announce a product, look at his stock price. [00:32:48] Every product I announce, his stock price goes up. [00:32:52] Nothing happens to mine. [00:32:54] Let's also state for the record. [00:33:01] I'm very happy about that. [00:33:03] Let's also state for the record that you were a shareholder and you sold. [00:33:06] Yeah, yeah. [00:33:07] Well, I needed the cash. [00:33:10] What were we talking about? [00:33:14] RTX Spark. [00:33:15] RTX Spark. [00:33:16] How is it going to be different this time? [00:33:17] Well, we wanted to reinvent the computer. [00:33:20] You know, the PC has been here for 40 years and the operating system, code written by hand, [00:33:26] is now going to be replaced with applications that are agentic. [00:33:31] Now, these agentic systems, agentic AIs, will use the PC, will use the tools in the PC. [00:33:39] And so, when we imagined this future, we thought, let's see, how would we change the architecture?
[00:33:45] And how would we change the operating system? [00:33:48] And reinvent the computer? [00:33:51] And, you know, that's kind of where we are. [00:33:53] And so, one of the things that we realized is that an agentic system really wants to have excellent CPUs, [00:34:00] which is the reason why we used ARM. [00:34:02] And it has a 20-core CPU. [00:34:03] It has to have excellent single-threaded performance. [00:34:06] The parameters, the memory has to hold a lot of parameters. [00:34:10] And so, we created a new numerical format called NVFP4 so that we can compress the large language models [00:34:18] as much as possible and fit a very smart AI into the system memory. [00:34:23] We also wanted to unite CUDA that is for accelerated computing. [00:34:29] And CUDA tiles are tensor core processing into one processor. [00:34:34] And the reason for that is because when you're operating these agents [00:34:37] and they're thinking and they're using the tools, the agents are fast. [00:34:42] And when the agents are fast, they expect the tools to respond quickly. [00:34:46] And so, that's why we're accelerating all of the tools. [00:34:49] We're accelerating Adobe. [00:34:50] Adobe announced they're going to re-architect Adobe Photoshop and Premiere [00:34:54] so that it's CUDA accelerated and agentically accessible. [00:34:59] And so, we're accelerating applications. [00:35:01] We accelerated Blender with RTX. [00:35:03] We accelerated, you know, we're going to accelerate everything. [00:35:05] We accelerated Adobe, Autodesk, Dassault, Siemens. [00:35:09] We're going to accelerate every tool. [00:35:11] And once these tools are accelerated, then they can respond to the agents very quickly. [00:35:15] And so, in order to build this computer, this SOC, unless you have the ability to integrate with the CPU and adapt the CPU to exactly the shape of the computer, [00:35:27] it's really quite impossible, which is the reason why Arm is perfect. [00:35:30] Well, thank you. 
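The point about compressing a model to fit system memory comes down to simple arithmetic on bits per parameter. The sketch below is illustrative only: the helper names are hypothetical, the 128 GB and 100-billion-parameter figures are the ones quoted later in this conversation, and NVFP4 is treated as exactly 4 bits per weight, ignoring any scaling-factor metadata, activations, and KV cache (all simplifying assumptions).

```python
def weight_footprint_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate storage for model weights in gigabytes (1 GB = 1e9 bytes).

    Counts weights only; scale factors, activations, and KV cache are
    deliberately left out of this estimate.
    """
    return num_params * bits_per_param / 8 / 1e9


def fits_in_memory(num_params: float, bits_per_param: float,
                   memory_gb: float, reserve_gb: float = 0.0) -> bool:
    """True if the weights plus a caller-chosen reserve fit in memory_gb."""
    needed = weight_footprint_gb(num_params, bits_per_param) + reserve_gb
    return needed <= memory_gb


if __name__ == "__main__":
    params = 100e9        # 100-billion-parameter model, as quoted in the talk
    system_memory = 128   # gigabytes, as quoted in the talk
    for label, bits in (("16-bit", 16), ("8-bit", 8), ("4-bit", 4)):
        size = weight_footprint_gb(params, bits)
        verdict = fits_in_memory(params, bits, system_memory)
        print(f"{label}: {size:.0f} GB of weights, fits: {verdict}")
```

Under these assumptions a 100-billion-parameter model needs about 200 GB at 16 bits but about 50 GB at 4 bits, which is why a 4-bit format leaves room in 128 GB for the operating system, applications, and the model's working memory.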
[00:35:31] And when you think about the agents running locally… [00:35:34] And the key word there is "Arm's perfect". [00:35:42] The other key word is "thank you". [00:35:45] Agents running locally… [00:35:46] Naughty, naughty. [00:35:53] Fair fight at this point. You're welcome. Agents running locally versus agents running in the cloud: [00:36:01] how do you think about that as a trade-off, and where do you think that goes over time? Well, [00:36:07] you know, ultimately these personal computers are going to become [00:36:13] agents that are running all the time. They're autonomously running all the time. I could [00:36:17] imagine, today, if I left my laptop at home or I left my laptop in the hotel, I won't use it again [00:36:24] until I get there. But in the future, you just pick up your phone and you chat with your agent. You're [00:36:31] chatting with your PC in the future. And maybe there's something that you needed to have [00:36:37] done and sent to you. Maybe there's a speech I need to have quickly written. And so, you know, I'll [00:36:43] be working with my agent, working with my assistant, and that is now the Arm personal computer. And [00:36:50] so the PC is working in the back. While you're not there, it's working. Yeah. And so if I want to do [00:36:55] something that requires a cloud API, of course I'll call into the cloud API. But whatever I can [00:37:01] do locally, we're going to continue to do on the PC, which is kind of the nature of the PC. Yeah. The nature of [00:37:06] a personal computing device is that whatever you can do on the device, you do. Yeah. You don't have to [00:37:10] worry about metering. You don't have to worry about the time spent. But whatever you need to do in the [00:37:15] cloud, you will. And when you think about the complexities of the models, do you think PC performance [00:37:20] and
architecture can scale? I mean, you guys are doing incredible work with Blackwell and then [00:37:25] Rubin, etc. How do you think that all maps together in terms of scaling the systems? Well, if you look at [00:37:30] the RTX Spark PC, it's got 128 gigabytes of memory. If it was completely compressed into NVFP4, [00:37:38] then you can have a hundred-billion-parameter model working on your PC all the time. Wow. And a hundred- [00:37:44] billion-parameter open model, say Nemotron 3 Super, that's a really, really good model. And so it could [00:37:52] do a lot of the basic work, and whatever deep thinking and frontier model that you need to [00:37:57] use, it's just connected to the cloud anyhow. Do you think that changes what happens in the cloud in terms of this [00:38:03] classic client-cloud model? Do I need as much compute in the cloud versus on the client, or [00:38:08] do you think there's just so much compute that needs to get done? Well, these agents are going [00:38:12] to be, you're going to have agents and sub-agents and teams of agents. They're going to be [00:38:19] working in the cloud, they're going to be working on devices. And so it's just like today in a lot of ways. [00:38:24] Yeah. Mobile cloud is not cloud only, not mobile only. It's mobile and cloud. And so it allows you to [00:38:31] have a really great personal computing experience, you know, your own experience. But whatever you need [00:38:37] to connect to the cloud, you will. And so, it may be a bit of a provocative question, but [00:38:41] as these agents are running in the background and they're doing a lot of the work, does the operating [00:38:46] system matter? Is the agent really the OS, if you will, and it does the work and isn't so reliant [00:38:52] on what's under the hood? Where do you think that goes over time? Well, the operating system is going to be just [00:38:56] as important as ever before, if not more important. And the reason for that, and this
is the [00:39:01] controversial part. People say AI comes along, software is dead. You know, nothing [00:39:07] is further from the truth. And now people are starting to realize that when agents are here, they're going to [00:39:13] use tools, and so those tools are more important than ever. And so they're going to use Adobe Photoshop, [00:39:19] they're going to use Adobe Premiere, they're going to use Canva, they're going to use, you know, [00:39:23] Dassault's tools, Siemens tools. They're going to use, you know, tools, whatever they [00:39:27] have on the device. This is the incredible part: today most of us probably know [00:39:34] 10, 15, 20 percent of the features of a tool. If you know how to use Photoshop, you know, use Lightroom, [00:39:41] unless you're an expert like my son, it's kind of hard for you to know all of the features. But now, [00:39:47] with your agent, you tell the agent what you're looking for, and the agent knows exactly how to use [00:39:53] the tools because it's read a skills file. It's essentially read the manual of that tool. Yeah. And [00:39:59] so now it goes and uses the MCP or the CLI connected to that tool and it does everything you needed it to [00:40:06] do. It's going to unlock... Yeah. So it's going to unlock all these tools. These tools are going to be more [00:40:10] useful, more valuable than ever, and these tools run on the operating system. So we're going to need [00:40:15] Windows, we're going to need, you know, all these APIs and all these tools for a long [00:40:19] time. So, NVIDIA's involved, understatement, in everything around AI. I mean, you guys do [00:40:26] everything around the networking, the systems. You know where all the bottlenecks are. When you think [00:40:31] about the next number of years, where are the constraints to growth? Where do you think they [00:40:37] are? Well, it's probably going to be everywhere. At this point, if you
look at our [00:40:43] evolution, first Hopper was designed for training. Then Grace Blackwell was of course great at training, [00:40:51] but we also specialized NVLink 72 for inference. And at first people thought, you know, inference was easy, [00:40:58] and we explained to people that for MoE large language models, to be able to inference very quickly and [00:41:03] generate these tokens as efficiently as possible, you're going to need a very complicated computer. And so [00:41:09] GB, or Grace Blackwell NVLink 72, is the most efficient, and we produce the lowest-cost tokens in [00:41:18] the world. Okay. And so that was a big breakthrough, and now people understand that you [00:41:23] want very advanced systems to generate tokens at very low cost. Vera Rubin took, of course, all of that and we [00:41:32] extended it to run agents. At first, when I said that two years ago, most people had a hard time [00:41:38] understanding what that meant, but now they realize that an agent is orchestrating, thinking, is using [00:41:44] tools. It's accessing long-term memory, it's dealing with short-term memory, you know, working memory, and [00:41:50] it's compacting, doing memory compaction to remember, to think about what should I remember for [00:41:55] the future. How do I index SQL memory? How do I index structured memory? How do I index unstructured [00:42:02] memory? And so how do I deal with all of that? That agentic system is what Vera Rubin is, and it's a large [00:42:10] system. And so people are now starting to understand that when we were thinking about agentic [00:42:16] systems, we were really thinking about a new computing application pattern, and that it really requires a [00:42:22] new architecture. Well, now the big breakthrough, of course: these agents now are producing useful AI, [00:42:30] and that's the reason why all of our growth, right, your growth, my growth, it's just so incredible. Because [00:42:36] when AI becomes useful, then the tokens
that are being generated are profitable, and when token generation [00:42:44] is profitable, everybody wants to generate a trillion times more tokens. The other part is that the [00:42:51] application, this agent compute pattern, is a thousand times, maybe a hundred thousand times, and depending [00:42:59] on the work, a million times more than chatting. Yeah. And so you could see that the agents are working, [00:43:06] they're working for, you know, minutes, hours, sometimes days, sometimes weeks. And so instead of a chatbot, which [00:43:14] responds from one click, now the AI is thinking, using tools, reading, thinking some more, planning, trying. And so [00:43:23] the amount of tokens that we have to generate has increased tremendously. The profitability of tokens [00:43:31] obviously is driving demand. So the compound effect of [00:43:36] needing more compute with more demand, that compounded effect is what you and I are experiencing. And so we're seeing, you know, [00:43:42] constraints almost everywhere. In our case, you know, we were fortunate that we planned. You know, one of [00:43:48] the best things about Arm is that they don't have to worry about the supply chain. [00:43:54] You know, the supply chain of IP is electrons, and you could use as many electrons as you need. [00:43:59] Okay. And so I love his business model. I mean, as you know, I tried to buy it. I tried to [00:44:07] become Arm. You know, we were willing. I was trying to become Arm. [00:44:13] We were willing. René and I used to work together, and then we tried to work together again, [00:44:20] but anyways, that was okay. I'm sad still, I'm a little sad, but this is a happy [00:44:28] meeting. So my point is, in our case, we saw agents coming and we saw Vera Rubin coming, so we did a good job [00:44:35] planning our supply chain, and so our supply chain can support our very robust growth. We grew almost [00:44:41] a hundred
percent year over year this year. We're going to grow very aggressively next year. And so [00:44:45] our supply chain could support our growth, but the fact of the matter is demand is even higher than [00:44:50] that. Yeah. I was talking with C.C. and Kevin this week and they were saying, you know, at [00:44:56] some point gravity has to take over. They've never seen four consecutive years of a semiconductor cycle that [00:45:01] looks this good. But when you look at the things that you just described, there's no reason it can't [00:45:07] continue in terms of the... That's right. Take a step back. What's happening? Yeah. Take a step back and think [00:45:11] what's happening. What's happening is the computer industry was limited by the number of people using [00:45:18] the computers. Yep. And now we have agents that are autonomously using computers. And so we're going to have, [00:45:25] instead of one billion humans using computers, we will have tens of billions, yeah, maybe more than that, of [00:45:32] agents and robots and self-driving cars using computers. And so the question is, how large can the computer [00:45:39] industry be? Yeah. And so, you know, my sense is that at this point it's a foregone conclusion that what is [00:45:45] a trillion-dollar, multi-trillion-dollar industry is likely 10 times larger. Yeah. And so, all right, we're [00:45:51] on our way to... And that's why NVIDIA is the, you know, the largest market cap company in the world. And if [00:45:55] you combine the two companies, we'd be the largest in the world still. I love that. I love that. [00:46:02] That's such a great idea. So, you know, thank you. Congratulations on RTX Spark. Just amazing. Well, [00:46:09] congratulations on everything you guys are doing. I have a small gift for you. Really? Someone's going to give [00:46:14] here. Yeah. So for those who may not recognize what this is, and I'm going to sign it, this is very, [00:46:23] very real by the way. The very first... this is the... Jensen talks a lot
about resiliency and sticking with [00:46:29] things. Tegra 3 was the first Windows on Arm laptop. That was how come when we were younger. [00:46:44] And I have to tell you, I think I aged better. [00:46:51] Do you guys agree? [00:46:53] I feel like I aged pretty well. Come here. You're my guest. It's to me better, it's to [00:47:00] you. If I sign it back to you, it's because treasure. No, you sign it back to me. There's a contract, there's [00:47:07] invoices. We can't do that. We know that game. All right. Thank you very much. Thanks, guys. [00:47:17] One of those things was real up there. That actually was a real system that we worked on, and [00:47:32] fish and cows did, those guys will remember on that. I think I aged a little bit better than he did, by [00:47:37] the way. So, to wrap up: one agentic platform, cloud to edge. Showed these products before. It's the [00:47:46] Arm AI compute platform that enables systems from the very, very smallest to the very, very largest, [00:47:53] and we do this through a very consistent effort with software: 22 million developers, the largest [00:47:59] developer community across the planet for any compute platform. But as I said, none of this happens [00:48:06] without incredible cooperation and dedication from our partners. And again, I just want to say thank you [00:48:12] to Taiwan. Arm is nowhere without Taiwan, the ecosystem, the people, the engineers, the supply chain managers, [00:48:19] and thank you so much for everything you've done. Thank you for attending today.
