Microsoft Build event in 25 minutes

[00:00:00] Speaker 1: Good morning. And so today one of the things that I'm really excited about in order to tap into all this compute power is to expand the scope of Windows ML and Windows AI. We are also announcing two very cool new models that are all going to run in-box on Windows. The first is a new SLM. It's sort of a more efficient model, ION Instruct, and it's a great reasoning model. And then we have the planning model, ION Plan, which is a local agentic loop. I mean, think about it, right? You now have a full local agentic loop. You can give it tools access. And, of course, that brings us to NVIDIA and RTX Spark. This is a next-generation SoC for PCs. It brings together the CPU, the GPU, as well as the AI capabilities into a single SoC. We are really thrilled that one of the first devices that we built was the Surface Ultra. We are very excited to see this later this fall, right? What if we could just max the compute, build out that developer machine that's the dream machine, and that's what we're announcing today. Surface RTX Spark dev box. It's got one petaflop of AI compute, 20 CPU cores. All of those things have 128 gigabytes of unified memory access. So super excited about this coming in the fall, and you can join the waitlist. I'm on the waitlist as well, so we'll get there. What if we did one more thing? So a new, I describe it as the desktop data center, which you can have right there, and running a one trillion parameter model locally. I mean, just to sort of put it in perspective, it's pretty close to what perhaps we had when we built GPT-2-5 or 3, right? One of the first supercomputers, so it's pretty crazy to think that we've come this far where you can now have a data center on your desktop. We have tons and tons of updates, right? We're starting, by the way, with one of the favorite things for all of us, which is a distraction-free dev environment. We're also introducing an intelligent terminal, which has got built-in GitHub Copilot, right? 
Terminal with sort of the Copilot intelligence. And, of course, there's lots and lots of Linux love across Windows now, so grep in full glory is now available for regular Windows access. We're also bringing things from the Mac that you love, things like Starship, Z shell, and Homebrew are going to be native on Windows as well, so you'll be able to switch to Windows. And today, we're announcing WSL containers. This has, I know, been a point of pain when you're managing all these environments. So, having first-class support for containers will really help us be in the flow when you are building and deploying locally. [00:03:14] Speaker 2: So, what we have here is the default experience on the Surface RTX Spark dev box. After popular demand, we're excited to announce that Vertical Taskbar is now available in Windows Insider builds. The new Surface RTX Spark also has a bunch of key dev tools already installed, like Python, Node, and many more of your favorites. And if you want to get the same experience today on your device, we're making this file available to everyone right now. Now, one cool thing that I have running here is PowerToys' new utility called Grab and Move, which lets you hold Alt and move the window around from anywhere. Another tip is that you can enable End Task, which lets you end the process without having to open Task Manager. Also, File Explorer is Git aware. We've got stuff like last change author name, last change message, the status of each file. Plus, my favorite is that the branch name is on the bottom left. This is an experimental experience called Intelligent Terminal that makes working with agents even more seamless. So, for example, here's an error being generated. My agent pane is able to detect it and provide a fix, which is great when I don't remember the syntax, especially for something like regex. So, I'm going to work on OpenClaw, and I've already built it using WSL Container. 
WSL Container is a native container experience on Windows, plus it can leverage the GPU, which is perfect for the Surface RTX Spark. It can also reference your existing container files, just like the one in the OpenClaw project. So, here's one of the files open in Microsoft Edit, which ships in Windows by default and just got syntax highlighting in its latest version. We're providing a WSL profile that's designed to feel comfortable for those of you who use tools like Starship, Zsh, and Homebrew. So, the Surface RTX Spark is designed for developer-heavy workloads, including serving large local models for coding. So, here's a quick view of my usage, and we can see how many tokens I've used locally. So, we're looking at about 3.4 million tokens leveraged on the device itself. And just so we don't have to watch me type, I'm going to use Copilot's voice feature, which is also leveraging its own local model. So, I'll just hold Space Bar and tell it what I want it to do. Find any Console.WriteLine or Debug.WriteLine calls in the tray and node projects, and convert them to the standard logger used elsewhere in the code base. There we go. Now, as developers, while we're debugging, we're often looking through log files to diagnose any issues. Sometimes, finding the location of the log files is a challenge. I'd love to be able to just type something like grep log and find them all. Ah, sweet. So, on top of already adding curl, tar, and sudo to Windows, now we're adding over 75 command line utilities like env, head, tail, and touch for those of us who love to live in the terminal. And you can see 90 gigs of RAM being utilized by the GPU, truly showcasing the full power of the Surface RTX Spark. [00:06:12] Speaker 1: Now, let's move to the cloud. The driving equation for us remains the same, which is tokens per dollar per watt. Today, Azure spans more than 500 data centers in 80 regions. 
We have the most expansive hyperscaler footprint out there. And we have added more data center capacity in the last 18 months than the first decade of Azure, just to put that in perspective. Maia 200 is continuing to scale. In fact, it's live in Iowa and Arizona. We'll deploy it internationally later this year. It delivers 30% improved tokens per dollar compared to sort of what's the leading GPU today. And we have validated it with GPT-5.5, and we are going to use that to power Microsoft 365 Copilot. Today, we are announcing the preview of Cobalt 200 VMs, our next-generation Arm-based CPU designed for both cloud-native and agent workloads. So, it's exciting to see Cobalt make progress as well. Cobalt delivers 50-plus percent better performance than Cobalt 100 on cloud native. But we started benchmarking them using the GitHub Copilot traces. And we're now seeing 33% lower latency for the agent calls, 14% faster speed, 23% higher throughput. We've talked about the edge and the cloud. The same form factor, but unbelievable new functionality because of the onboard AI capability. Can you build a new platform even for the agent era? And that is the motivation behind Project Solara. [00:07:58] Speaker 3: But with so many possible forms, which one do you pick? What is the next device? Today, we're previewing two very broad categories. The first is stationary, and the second is portable. The first device is designed for your desk, and it's built on MediaTek silicon. With Hello for Business, just walking up to the device securely signs you in, giving you direct access to your agents, just like Nathan's about to show you here. And with a simple glance, it surfaces what matters next in your workday, helping you think, plan, and even act by delegating tasks to your agents with a simple tap or just using your voice. 
It even supports experiences like handoff between devices, acting as a companion to your existing Windows PC. Or it can even let you access your cloud PC through Windows 365 and a connected monitor. How cool is that? Now, the second device is portable. Built using Qualcomm silicon for wearables, this digital badge is a lightweight form factor designed for agent interactions on the go. All right. I have here an early prototype of the badge. And so, thanks. And using my fingerprint, I tap to unlock the device, and I have access now to all my agents in a secured manner. And would you look at that? I already have a task. And it says, gather content for your social media post for today. So, why not just do it right now, right? So, I'm going to hit record. And then now the device's camera is recording. I'm going to pan across. I hope you don't mind. I'm going to take your shots. Yes. Thank you. Copilot, find some good shots from this, clean them up, and then send them to me for me and my team to review. All right. And then there you have it. Now my agent's off, running through multiple tasks to actually clean this up and send them to me and the team. That's pretty cool. And while this is an early look, we're really excited that AccuWeather, Best Buy, CVS Health, Levi's, Target, and others are working towards exploring how specialized agents and devices can improve their workflows. [00:10:11] Speaker 1: Agents are continuously storing, retrieving, reasoning, acting, and learning, right? That's sort of what's happening in a continuous loop, which now brings me to a very exciting new service, HorizonDB, which is our fully managed PostgreSQL service on Azure. Really thrilled to have this. We built a PostgreSQL managed service from the ground up for high availability and scale-out. It's zone-redundant with automated failover, 128 terabytes of storage per cluster, 15 read replicas. I mean, the read-heavy workloads you can scale with this managed service. 
Bringing GPU acceleration to Fabric is super key, and we are seeing 7x performance gain, so it's really thrilling. The first domain is the web, and that's why we are really, really excited to announce today WebIQ. It has web, it has news, images, video, so agents can ground responses in fresh, verifiable content, and WebIQ leads across all of the three key criteria, right? It's best-in-class in quality, it's best-in-class in speed, as well as in cost. And so we are bringing together Foundry, Fabric, and Microsoft 365 as this unified IQ layer, right? Continuously updated understanding of your organization. [00:11:51] Speaker 4: I'm here at a Power Utilities Control Center, and we'll start by running a long-running agent. This agent is going to help us assess the current grid operations incident and produce a brief for us so we can respond accordingly. And while that runs, let me show you how we got here with Context from Microsoft IQ. We built our agent in Microsoft Foundry. It's connected to various tools, and it's also wired to a Foundry IQ knowledge base, a single grounded source that packages our documents, operational data, and people into Context the agent can reason over. I'm going to ask about current electricity prices in SF. For this, our agent pulls in the first IQ in our toolbox, WebIQ. WebIQ constantly indexes fresh, official sources from across the web, and additionally, WebIQ does a great job with semantic documents. Let's see how we handled a previous incident. So after using WebIQ to gather external info, we asked for details about our potentially at-risk substations. For this information, our agent pulls in the next layer of Microsoft IQ, Fabric IQ. Here is Brightline's grid represented as a Fabric Ontology, an operational model of the live grid. And critically, we didn't build this from scratch. This model is coupled with live telemetry, so it reflects the real operational state of the grid minute by minute. 
By asking what are the steps to respond to a substation trip, we activate the final layer of Microsoft IQ, WorkIQ. This is Brightline's response procedure in SharePoint. It's the playbook the team actually reaches for when something goes wrong. And the important thing is the agent isn't working from a stale upload or copied snapshot. It's answering from the same source the team maintains day to day. Now let's go check back in on our long-running agent. Let's check the backup really quick. And boom. Our task finished. Here we can see every step the agent took. First, beginning with WebIQ, connecting it to the outside world. Second, Fabric IQ through Foundry, anchoring it in the real state of our operations. And third, WorkIQ, grounding it in our people and procedures. [00:14:16] Speaker 1: Today, we're introducing Microsoft Execution Containers, or MXC. MXC is a new policy layer that lets Windows apply isolation and containment using AI-native or other OS-native primitives, right? You need to bake this into the operating system. And today, we are really thrilled to announce that OpenClaw runs on Windows leveraging MXC. [00:14:46] Speaker 5: We've been collaborating in the open on GitHub to bring you all an OpenClaw Windows companion app. It's going to help you set up your own claws or connect to existing ones, whether they're hosted in Windows or in WSL. And in the Windows companion, we're going to sandbox the OpenClaw tool calls to keep you and your system safe. [00:15:04] Speaker 6: Yeah, you'll see the OpenClaw Windows companion app running right now in the background. Go ahead and right click on it, Scott. That looks awesome. You'll notice immediately it looks like a native Windows app because it is. It's written in WinUI 3. It's got all kinds of information about my gateway, other machines that are participating in my claw, my sessions, and my usage. We've got lots of permissions options along with our sandbox configuration. 
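[Editor's sketch] The talk describes MXC as a policy layer that contains agent tool calls, but it shows no API. The Python below is therefore only a toy model of a per-folder access policy and calls nothing from MXC. The `Access` modes, the `POLICY` table, the folder paths, and the `access_for` and `is_allowed` helpers are all hypothetical names chosen for illustration.

```python
from enum import Enum
from pathlib import PurePosixPath


class Access(Enum):
    """Hypothetical access modes for a sandboxed folder."""
    HIDDEN = "hidden"
    READ_ONLY = "read_only"
    READ_WRITE = "read_write"


# Hypothetical policy table: folder -> access mode. Not a real MXC format.
POLICY = {
    PurePosixPath("/home/user/Desktop"): Access.READ_ONLY,
    PurePosixPath("/home/user/projects"): Access.READ_WRITE,
    PurePosixPath("/home/user/secrets"): Access.HIDDEN,
}


def access_for(path: str, policy=POLICY) -> Access:
    """Return the mode of the most specific folder containing `path`.

    Paths that match no folder in the policy default to HIDDEN.
    """
    target = PurePosixPath(path)
    best = None
    for folder, mode in policy.items():
        if target == folder or folder in target.parents:
            if best is None or len(folder.parts) > len(best[0].parts):
                best = (folder, mode)
    return best[1] if best else Access.HIDDEN


def is_allowed(operation: str, path: str, policy=POLICY) -> bool:
    """Decide whether a tool call may perform `operation` on `path`."""
    mode = access_for(path, policy)
    if operation == "read":
        return mode in (Access.READ_ONLY, Access.READ_WRITE)
    if operation in ("write", "delete"):
        return mode is Access.READ_WRITE
    return False  # unknown operations are denied


if __name__ == "__main__":
    # A delete against a read-only folder is refused; a read is permitted.
    print(is_allowed("delete", "/home/user/Desktop/photo.jpg"))
    print(is_allowed("read", "/home/user/Desktop/photo.jpg"))
```

In this model the decision is made outside the agent, so an agent that keeps retrying a delete keeps getting the same refusal. A real containment layer would enforce this in the operating system rather than in application code.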
[00:15:29] Speaker 5: Now, this sandbox is really interesting because this is using MXC, the Microsoft Execution Containers. And for this, we're going to be using process isolation. Now, I can see that I've got one-click security option settings, but Samantha, talk to me about custom folders. [00:15:43] Speaker 6: Yeah, you've got full support for choosing what files and folders you want OpenClaw to have access to and really granular security features like clipboard access or talking to the Internet itself. For the purposes of this demo, I'm going to do something really scary and ask OpenClaw to delete all the files on your desktop. [00:16:02] Speaker 5: So what we've done is we've asked OpenClaw to delete those files from the Windows node. And the only thing that is going to keep that from happening is MXC because we've turned off all of the many layers that OpenClaw offers. But our IT, in this case Samantha, has set it to read-only. So it's trying to go and delete all of those files. We can actually see the different attempts where it's going and deleting and then checking the directory and then deleting again. Because it's very persistent. It wants these files gone and I want them to stay. The read-only sandbox is there. 94 JPEGs are still on the desktop. Absolutely. My desktop icons are safe from Samantha's reign of terror. [00:16:38] Speaker 7: Watching a Claw try to delete all your desktop files and just fail made me really happy. Because six months ago that totally would have worked. You know, we changed how access works. It's not all or nothing anymore. You can pick which folders should be read-only, which ones should be writable or hidden. And we even made the harness itself a plugin. You can bring your own. Copilot, Codex, whatever you already trust and your rules come right with it. [00:17:12] Speaker 1: Introducing our new GitHub Copilot app. [00:17:16] Speaker 8: I can't wait to show you the new GitHub Copilot app. 
When you open up the app from the start, you see this home screen here. But also before I get into the serious stuff, you can drag Mona around and there's a game. Look. It's so fun. Let's just get back to you can kick off a new agentic coding session. So I started off one a little bit earlier here and it gave me a review of a bunch of release blockers. This app will now kick off a separate session for every single issue here. I don't have to worry about stashing or coding complex or anything because the app takes care of that with Git worktrees. Git worktrees are isolated environments for each session that you run so your agents can work in parallel without stepping on each other. If I head over to this issue here, I can run agent merge. And when I enable agent merge, Copilot will continuously babysit this PR through CI checks, code review, and merge conflicts. I can see a focused view of all of my activity and just projects loaded in the app, issues and PRs, everything here. And then under automations, I have a bunch of reusable sessions and workflows that can run locally or in the cloud. If I want to add a new repository, I can click that button here and it can pull from a local repo or from a GitHub repository. And then if I were to just add one, I can add a session in Pocket Cal. This is an open source repo. I can start a session anywhere and it just loads it. I don't have to clone. I don't have to pull. It just works. Now, when I look at a session within this repository, let me look at this other one over here. I get an integrated browser. There's a terminal. I can see the chat. It's all loading. I can even toggle light mode and dark mode in here. And there's also this great button, pick and polish, where if I click that, I can pick and polish anything in this app and it adds it to the chat. And I can say, hey, I want you to add reordering to this list and it'll just work all living in there. 
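[Editor's sketch] The demo credits Git worktrees for letting each session work in its own checkout. How the Copilot app drives Git is not shown, so the Python below is only a minimal illustration built on the standard `git worktree add` command. The `agent/<session>` branch naming, the sibling-directory layout, and the function names are assumptions made for the example.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, session: str) -> list[str]:
    """Build a `git worktree add` command for one session.

    Assumed layout (illustrative only): each session gets a new branch
    `agent/<session>` checked out in a sibling directory of the repo.
    """
    branch = f"agent/{session}"
    checkout_dir = repo.parent / f"{repo.name}-{session}"
    return ["git", "-C", str(repo), "worktree", "add",
            "-b", branch, str(checkout_dir)]


def create_worktrees(repo: Path, sessions: list[str]) -> None:
    """Create one isolated worktree per session (requires git and a real repo)."""
    for session in sessions:
        subprocess.run(worktree_add_command(repo, session), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    for issue in ["issue-101", "issue-102"]:
        print(" ".join(worktree_add_command(Path("/tmp/demo-repo"), issue)))
```

Because every worktree has its own working directory and branch while sharing one object store, two sessions can edit and build at the same time without stashing or switching branches in the main checkout.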
I have access to all the most popular models via my single GitHub Copilot subscription, including those from OpenAI, Anthropic, and Google. The canvas is how an agent can build a custom UI to communicate with you. What if your AI could see? Everyone say, demo gods, bless us. Okay, let's see if it works. Here is a fun canvas where, if I get the camera going, okay, the agent shows your PRs down here and I can toggle it with a thumbs up or a thumbs down. Let's approve it. Yay! It's so fun. This is a signal box app. It's 100% agent built. It's containerized with a database backend. Would you be able to deploy this to your enterprise with no questions asked? Exactly. No, but you can with Rayfin. All I have to do is type Rayfin up and then demo gods bless us. Come on. It will maybe deploy. Blamo. It's happening. Yes. And all hosted on Microsoft Fabric. [00:20:24] Speaker 1: Today, we're announcing a number of updates, including the GA of Agent 365 SDK, and we're expanding it to your local agents running on Windows and elsewhere and the claws you just saw earlier. Please help me welcome Alex and Drew from the Chainsmokers. Alex. [00:20:49] Speaker 9: Sure. Well, uh, hey, I'm sure you guys are wondering what timeline you're on where the Chainsmokers are at Microsoft Build. But hey, how are you? Always important to have your authenticity when it comes to creativity. But on the investment side, I think we're moving from producing outputs to producing actions. So instead of humans, you know, producing outputs, it's machines producing outputs and rethinking what that entire space looks like in that context. [00:21:14] Speaker 1: Thank you so much. But today we're introducing something completely new. Autopilots. We can think of autopilots as enterprise-grade claws. These are autonomous, long-running agents with full enterprise compliance that run in your tenant. The first autopilot we are introducing is Scout. 
Scout works where you work, joining group chats in Teams, handling threads in Outlook. Starting today, for those of you who are on Copilot Frontier, you can try out Scout. And in the coming months, we will build this out to a complete digital team of autopilots right inside Copilot, right? So you can go to the Copilot app. Scout is the one that comes by default, but you can build more of these autopilots. [00:22:07] Speaker 10: So today, we are very excited to announce a family of seven new models across image, voice, transcription and coding. MAI Image 2.5 and its Flash variant, two super-strong models that deliver a step change in quality. Now at number two on the leaderboard, surpassing the score of Nano Banana 2 on image editing. Next up, we've got MAI Transcribe 1.5. State-of-the-art accuracy across 43 languages, beating out Gemini and OpenAI's flagship transcription models. For any bespoke use case, five times faster than all rival models. So paired with that, we've got MAI Voice 2. This is our latest speech generation model, and it's available in 15 languages with many more coming soon. Next up, our text foundation model, MAI Thinking 1. This is our first reasoning model, and it's exceptionally strong in our target use cases of reasoning and SWE tasks. It's achieved 97% on AIME 2025. It's now at 53% on SWE-Bench Pro, which places it right alongside Opus 4.6, at least on the toughest coding benchmark that's out there. So we're very happy with that. Now, finally, I'm incredibly excited to announce MAI Code 1 Flash. It achieves 51% on SWE-Bench Pro, despite having just 5 billion parameters. And it's rolling out today inside of VS Code. So today, we're very proud to be announcing that we're partnering with Mayo Clinic to jointly develop a new frontier model for health, and then deploy it around the world in their hospitals and beyond. 
[00:23:48] Speaker 1: We're also continuing to make rapid progress on our long-term goal of building a scalable quantum computer. I'm really thrilled to announce Majorana 2. And so this is Majorana 2. Majorana 2 implements the next-generation material stack that we used Discovery to discover and build and help fabricate. Other common approaches deliver a lifetime of just microseconds or even milliseconds. Majorana 2 provides a mean qubit lifetime of 20 seconds, or up to even a minute. Essentially, a thousand times higher than what we were able to achieve with Majorana 1. There are really two stories people can tell about this moment. One is that technology concentrates power, reduces human agency, and leaves society to absorb the consequences. The other is that we use this next wave to unlock opportunity for developers, scientists, enterprises, and every community. And our job is to make the second story true. That's our North Star for the frontier ecosystem. Let's all go build together. Thank you all very, very much. Thank you.
