Jonathan Frankle, Databricks — Databricks Data+AI Summit 2026 — Full Transcript (June 20, 2026)

[00:00:00] John Furrier: Welcome back to theCUBE's live stream here in San Francisco at Moscone Center for the Databricks Data and AI Summit 2026. I'm John Furrier, host of theCUBE. Back for his fourth time on theCUBE, Jonathan Frankle, chief AI scientist at Databricks. You know, the show is really making an impact, probably 31,000-plus people, bursting at the seams. If this continues, it's on to Vegas or a bigger venue. Databricks continues to grow. And the question we're asking is, can they be the runtime layer of enterprise cognition? We've seen the waves: Hadoop, to Spark, the cloud era, the lakehouse era. Now it's the context, reasoning, and autonomous execution layer. Can they make the case that AI is not just going to reason, but take action? Jonathan, you are the chief.

Jonathan Frankle: I got to push back a little bit here. You said the runtime layer. What about the application layer too? Everything.

John Furrier: He's right back into it. All right, let's get into it. Okay, the preamble there was kind of a setup, but runtime and application are tied together. We're talking about a system now. One of our first interviews was at the NYSE. We were really riffing on the systems revolution. I think we talked about the systems mindset. This is where you've got the runtime, you've got the application layer. But when you inject intelligence into an organization, that concept of bringing AI in, let's put intelligence in, shit happens. Okay, that changes everything. So take me through what you see in this market, what you've been doing behind the scenes, and the launch we saw here, the unification, great stuff. But what's going on with AI? How do you make that intelligence work?

[00:01:42] Jonathan Frankle: Yeah, so I'm going to tell you the way I think about this, and this is all off the top of my head. I think there are four layers we're working on right now. It's not just one layer, it's four.
So, you know, fourth time on theCUBE, four layers we're working on. The first layer is the systems layer, and we haven't exactly sat still on that. I mean, we have Lakebase now, and Lakebase isn't just another kind of database. Cool, we've got Spark, we've got Databricks SQL, we've got our lakehouse, and now we've got Lakebase. What's the point? Lakebase has two different things that make it awesome for agents and that I've been using pretty extensively. The first is separation of storage and compute. So you create a Lakebase, you create a bunch of data in it. If you're not using it, you don't have to pay for the compute. You don't have to set up and provision a bunch of compute anymore. That goes away. That's huge for reinforcement learning and for agents. I'll tell you why in a moment. The other part is branching. You can take a database, make some changes to it, and fork it like a GitHub repo, and then you can merge those changes or revert those changes. Let's say I'm letting an agent loose in a giant Databricks workspace. You know, I have a Databricks workspace that might have hundreds of petabytes of data that I'm letting agents loose in right this minute, and I want to make sure it doesn't break anything. How do I do that? It's going to create a bunch of new databases. Fine. If I don't use them, they're not going to cost me much money, or any money. And also, if it makes a bunch of changes, that's going to be a giant fork of my Databricks workspace via Lakebase. So the systems layer makes it possible for agents to really work fundamentally with the data primitives. The second layer up is the models. We've been working really hard to make sure that basically you've got as much choice as possible. That means you've got open source models. That means you've got access to every closed model out there. And that means we have our own custom models that I've been working really hard on.
We've talked about a couple of them already recently. Some of our Agent Bricks products are now powered entirely by custom models. There are a couple of other things we may discuss tomorrow on that front that, you know, I would say keep your eyes out for. But I don't want to preempt my colleagues on that one or I'll get in a lot of trouble.

[00:03:30] John Furrier: Come on, lay it out. No, let's get into some of the demos. I really was impressed by Databricks very humbly and transparently saying, look, these agents will get things wrong. I mean, just right there. So, I mean, it's hyped up, but it's almost ready for prime time. What have you been doing in the past year? Last year we talked about evaluation. What did that drive you to do this year? What was your main focus?

[00:03:55] Jonathan Frankle: Yes, so once you can measure, you can improve. And I've spent this whole year working with my team on how to improve models, how to take custom, you know, open source models, make them really fast, really good, and really cheap at doing whatever task you care about. And it turns out there's a repeatable process for that. We've been doing a lot of reinforcement learning. We have a crack team to do that. And, you know, you're starting to see those results show up not only in our products but in our customers' products, because we're helping them do it with our new AI runtime. You can use custom Databricks GPUs. You can access the reinforcement learning code that my team has put together and do the same stuff we're doing. The idea is, you know, we're building all these cool applications using our stack. You can get access to the same tools we're using and build your applications.

[00:04:33] John Furrier: I think, actually, we riffed, it was either last year or the year before, I think we had a talk thread about getting close to the hardware for efficiency purposes. That's kind of what you're doing from a model standpoint, right?
[00:04:45] Jonathan Frankle: We're working our asses off on that. I don't know if I'm allowed to say that on air, but, you know, we're working our asses off on making sure the models are as efficient as humanly possible. And we've released a bunch of results this year, you know, one on interacting with data, models that can do more than 200 calls to retrieve things. You know, we call it RAG, but really, once you're doing 200 calls to databases, it's way beyond RAG. These are models that can do that and are, you know, multiples cheaper, multiples faster, and better than closed source models at doing this, once they've been trained to really reason about data effectively. And those models are now deployed in our Agent Bricks products.

[00:05:18] John Furrier: Do you see your work at Databricks as a template or a path for customers? Or is this something that you think Databricks has to do as a core competency to make the products better? Does your work extend into the practitioner landscape or not? What do you see?

[00:05:32] Jonathan Frankle: Everything that I use is now available to our customers. And I'm personally working with some of our most advanced customers, who are really spending a lot of money on tokens, to get that bill down by building custom models together. Like, you know, it's not just for us. It's for everybody you see around here. When they're ready to scale up, we're ready to help them scale up and do so efficiently.

[00:05:49] John Furrier: Yeah, there's been a big token maxing vibe. I compare that to, like, you know, when you're in high school and everyone starts smoking, right? It's like, hey, let's get high. No one really thinks about the consequences. But I don't really buy that whole leaderboard thing, because it's like a coder saying, look how much code I wrote. So I think we're going to shift to a token efficiency, outcome-type metric.
So I want to ask you, because you've been working on evaluation. How do you see that scoping out? Because I could definitely see a metric where, okay, you're really good at token usage, whether that's premium tokens, policy. Did you get the outcome? Does it really matter? This is where it matters. Outcome, token. What are your thoughts on that?

[00:06:29] Jonathan Frankle: Yeah, the way I look at it, I actually gave a talk on this last night with, you know, the bigger context. But, you know, I thought about, like, what were we talking about in December? Like, I'm sure you were thinking in December, what are the big trends going to be in AI in 2026? It was going to be things like, we're all going to use a lot of tokens. We're going to deploy agents everywhere. AI for coding. All this stuff. And now what we're talking about is cost, governance, and security. Yeah. A lot's happened. Like, we have a lot of PTSD from all the interesting stuff that's happened in AI in the past few months. And, yeah, it's no longer about token maxing. It's about value. AI can get really expensive. So what I tell my customers to do is, first, prove that AI can actually solve your problem at all. It's okay to spend more money than you'd like. Just get the proof of concept out there. And then let's focus on efficiency. And efficiency can come from lots of places. It's about smaller models. It's about better harnesses. It's prompt optimization, GEPA. And then for the really high-use use cases, it's about RL and, you know, fine-tuning open-source models.

[00:07:21] John Furrier: It's funny. I was talking to Ali Ghodsi, the CEO. I've heard of him. And I said to him, do you buy this AGI hype? And, of course, it was a setup question. Well, he had a good point. He said, and this is like six months ago, if you took the AI of today and showed me that five years earlier, I would have said that's AGI. I like that comment. But then he said something different.
He goes, look, there are so many hard problems to solve. We're just focused on those. It was not a canned answer. It was a very pragmatic answer. And so execution becomes the factor now. So what have you learned about when you get out of the buzz phase? Okay, we're buzz. It's all fun. You see the value. It's fun to play with because it's better than anything else. So now the fun's over. I want to get down to it and I've got to start scaling. What's the prescription for the customer with agents specifically? Is there a one, two, three step? Do I have to go through some progression?

[00:08:18] Jonathan Frankle: There's not an obvious progression. The word that I've been using the most is respect. Respect AI. Respect how powerful it is and how fallible it is. And respect the human beings and the people who are doing work today that you hope to bring AI to bear on. It turns out that the world is complicated and a lot of stuff is really, really hard. It's easy to say, like, oh yeah, we built AI for finance. What do you mean by finance? Do you mean insurance? Do you mean banking? Do you mean investment? And then what do you mean within insurance and banking? Do you mean fraud? Do you mean underwriting? Like, the world is fractal. This is the word that I keep using. The world is fractal in complexity. Every problem is really deep and complicated. And honestly, the one, two, three step process is, first, go talk to the people. In fact, bring the people along with you. Go figure out what they do every day and spend some time with them. Ask them what's most frustrating, what's most boring and most tedious. And then try to get AI working and ask them how it's going. That allows you to start bootstrapping an eval. There are formal frameworks that you can now use for this. We built them into MLflow so that you can basically follow the same process my team has been following for the past year. And then you can do prompt optimization. You can kind of slowly improve things over time.
The humans are part of your team. The humans need to tell you when AI is doing a good job. And ideally, you make their lives better because, you know, they don't have to focus on the boring stuff that they do by rote. They're focused on the boundaries, the hard stuff, because AI is helping them with the easy stuff. But really, it's respect and it's a fractal problem.

[00:09:38] John Furrier: If I can read the tea leaves, I would say that you're kind of running a lab within Databricks, in the definition of labs. Because they're calling all the model companies lab companies. It's interesting how they call them lab companies. They're labs. They're AI labs. They're companies. Okay. So I like the labs vibe. So I have to ask you, what have you learned between the model labs and your lab? What jumped out at you? What were some of the things that you discovered? Because they're trying to get better and they are getting better, but they're doing their thing. You're starting to see the synergy between models and systems. What's your big takeaway on where they are going from today to the next step?

[00:10:16] Jonathan Frankle: My big takeaway is I think there are two ways to look at the world when you're trying to solve problems with AI. One is a very top-down, centralized, kind of, you know, one-size-fits-all approach of, you know, we will build AI that is intelligent, then we'll build AI for finance, and we'll build AI for math. And, you know, these big general sweeps. And the other is bottom-up. Like, just go get your hands dirty and solve one specific problem, learn from that, build some methods, figure out how to solve a couple of similar problems, and work your way bottom-up. And, I don't know, at Databricks, we're the bottom-up kind of folks.
We like to get our hands dirty. You know, what Ali always taught me was, like, until you can solve one customer's problem, you're never going to solve a thousand customers' problems, so go spend a lot of time with that one customer. And that's what I do. The customers are sometimes internal. Like, you know, I'm working on an experimental form of Genie that hasn't been released yet, but, you know, we're always iterating on the experimental form. It'll never be released, because by the time it's not experimental, it's already been shipped in the product, and we're working on the next experimental thing. We've got one right now that's really cool that we have a bunch of custom models for, but, you know, I'm working with the team on that.

[00:11:12] John Furrier: How do your customers think about custom models? Because there's been a big discussion of, you don't have to distill, you can actually run models in certain memory footprints, and you should focus on maybe interfacing with certain parts of the models. What are you seeing there? Because it's almost foreign to most IT people or most enterprises, unless you have a computer science degree or have done machine learning. How should they think about working with models for their own stuff?

[00:11:38] Jonathan Frankle: The way I think about it is, it's about your level of commitment. If you've got a use case, and you know it's going to be around for a while, and you know that you're going to spend a lot of tokens on it because a lot of people use it, it's time to commit. It's time to get married to your model and your use case, and that's a good time to start fine-tuning and get a custom model. If you bother to fine-tune and do the work, it's going to be faster, it's going to be cheaper, it's going to be easier to use. The trade-off is that, you know, it's a commitment and there's some upfront cost to doing it. So you shouldn't do it first. It's never the first thing I tell people to do.
First prove that there's a reason to do this. But if you're spending a lot of money on tokens for one specific use case, and that's been happening for a while, it's time to commit.

[00:12:13] John Furrier: It's kind of like dating. You date for a while, and then you get married, but you don't want to be too promiscuous, because there's a trade-off to the point where once you commit, it's hard to switch. Talk about the switching cost, because if you're committed and you divorce yourself from the model, there's a huge impact from either recompiling or setting everything up again. Can you scope that piece? I've been trying to get my head around the, okay, it didn't work out here, I'm going to go over there. It's hard, there are a lot of hidden costs. Yeah. What's your view on that? How do you think about that?

[00:12:51] Jonathan Frankle: So I think there are two pieces to it. One is kind of, you know, is the use case going to change? Are you really committed to this use case? Or you tried this, it didn't work. You're going to try that, it didn't work. In that case, yeah, use a base model and do prompt optimization. Use GEPA, use MLflow, and kind of, you know, make your switching cost low, make your upfront cost low, and just figure out that AI can actually solve this problem. And then even once it's time to commit, okay, you've committed, you've built this model, and now, you know, the closed models have gotten better. Like, the bar has been raised, you want to do more. Do you have to go through and retrain this again? So I don't think of the commitment as a one-time thing. Relationships take work. Yeah, yeah. You know, you don't get married and then decide you're going to get lazy. You still need to keep working on the relationship. Yeah, yeah, yeah. You need to keep putting in that same effort again. Cool. Qwen 3.5 was great. Qwen 3.6 is here. Run your pipeline again. Yeah.
But you keep accumulating value over time by learning how to build better data, how to build better evals.

[00:13:41] John Furrier: Well, certainly, you and I have a side hustle called model counseling. We can certainly help people through their relationships with models. But it brings up the commitment to the future. Okay, so let's take an example. What happens when models disagree? So say agents are going around, or say you have an espionage model that's hiding in plain sight and, you know, getting some counterintel. It could be a hacker. Models will have interactions. How does your work on evaluation take it to that next level? Because the humans aren't involved. The machines are now involved.

[00:14:12] Jonathan Frankle: Yeah, I think there are two aspects to that. One is, you know, everything can be measured. I really, truly believe it. The most qualitative things in the world can be measured in a meaningful way. Security can be measured. We're putting a lot of energy into that. I mean, you saw the acquisition on that front that we announced today. But we're putting a lot of effort into making it easier and easier to measure more and more. And, again, I've been spending all my time on that front, making MLflow the best place to do that. The second part is, okay, models interact with each other. I think the bottom line is the more you own of your own intelligence, the more control you have over what happens. If you're dependent on, you know, the next version of the model coming out, is the government going to get rid of this model that you were relying on? Is the new version actually going to be better than the old version at your specific task? Maybe it's better overall at general intelligence. Did it get worse at your task? And we're seeing, I think, for a lot of our in-house stuff, from generation to generation of each model, you know, incremental release to incremental release, things are a little bit all over the place depending on the task.
Sometimes an upgrade is not really an upgrade. And sometimes you have to rethink your prompts and your harness in such a fundamental way that you may as well fine-tune again.

[00:15:11] John Furrier: I love that line, you've got to own your intelligence. And that brings up the agency of the human, and the agency of the agents that have an execution plan. Because that really speaks to the benefit. That's a moat. How do people own their intelligence?

[00:15:25] Jonathan Frankle: I think you own your intelligence through your data. I mean, that's why I love working at Databricks. Because your intelligence is what you know and how it relates to each other and how people use it. And agents are just another user and just another interaction, another source of creating and consuming and improving data. But at the end of the day, your long-term durable knowledge, kind of what you are, taking away, you know, what's happening on this day and time, what you are in the long run is the data you've accumulated over years and decades and, for some of our customers, centuries.

[00:15:53] John Furrier: Yeah, it's like a corporate brain. It's why I brought up the cognition, the runtime and the applications, which are generative. So runtime makes more sense, I think. All right, I want to get your thoughts since I have you here, because you're really good at explaining things. You know, when I studied computer science in the 80s, we did ontologies. We had to build them by hand. They were hard as you know what. So I love that ontologies are getting into the mainstream, and Palantir claims that they pioneered the term, whatever. It's been around. But with the superpowers of supercomputing and the data architectures that are out there, ontologies are actually in the flow of conversation and architecture. Can you explain, for the average enterprise user or tech person, what is the modern definition of an ontology?
[00:16:42] Jonathan Frankle: Oh God, I am way out of my depth if you want me to answer this one. Remember, my title is Chief AI Scientist. That "AI" is doing a lot of work there. But I think the bottom line for me when it comes to ontologies is, you know, they're complicated as hell when you're a human. They're a little bit more parsable if you're an agent and you have all the time in the world to make sense of this. But they're even more important when you're an agent because, you know, the agent has to get started really quickly. You don't want to have an agent implicitly rediscover an ontology by exploring your data every time you ask it a question. So I look at ontology as another form of agent memory. How do you give the agent somewhere, like a scratch pad, where it can keep track of what your data really means? It can find things faster. It can use the work it did last time to, you know, reduce the work it has to do in the future. Memory is important for that. You don't want to, like, take a SQL database and have the agent think that it's seeing it for the first time even though it's seen it for the thousandth time. It has to leave a little breadcrumb, a little bit of memory, and understand how it relates to other things. So to me, ontology is just a form of memory that allows agents to make sense of data faster, so that the tokens you spend on this agent call are a valuable investment for future agent calls. And that's ontology to me as an AI guy.

[00:17:47] John Furrier: I was trying to explain it to someone. I probably used a bad example, but I want to get your reaction. There was a movie that came out called Limitless where the guy could use all of his brain and get superpowers. I mean, ontologies with agents allow you to create pathways and then come back and just keep doing that.
It's almost like creating, you know, neural connections, almost like a graph of the company, which I think really speaks to how agents and scale kick in. Because then it's done a lot of heavy lifting. And by the way, there's no wrong answer, because the data is the data. What's your reaction to that? Because, like, it's hard for people to actually understand. Like, they hear ontology, and they think it means some higher-order function. We see that.

[00:18:33] Jonathan Frankle: You know, I may have the fancy title, but I'm a very simple person. The way I look at it is, you know, a very simple measurement when it comes to agents and ontologies. You know, an ontology is not something new created whole cloth from somewhere. It's something that's derived from the data. You can figure it out bottom-up from the data. There's no right or wrong answer. But the question is, if you were to say, let's ask an agent 100 questions independently, where, you know, the agent isn't allowed to check its homework from the last one, how long does it take to answer those questions and how good are the answers it gets? And then what if you allow the agent to remember what it did on the last question and leave itself some notes or breadcrumbs for future questions? You let it do a sequence of 100 questions and ask, how much faster was that and how much better were the answers? An ontology is a way that an agent can leave itself breadcrumbs and knowledge from past questions that it can use to make each answer better. That's the core of what ontology means for agents.

[00:19:25] John Furrier: We don't have a lot of time, but I want to get this in there. You saw the paradigm of AI go from RAG and some marketing copy, then to coding, and now agents are opening up a lot more value. But most people think of generative AI as getting an answer, and it's more than that.
If you optimize for a search paradigm, you'll get the first answer and return it. That's a chatbot mentality. I'm kind of setting up the question. What is the preferred way to think about the outcome of the environment? Because it's not search. There's a use case for search, but there's multi-step reasoning, reinforcement learning, the things you're working on, patching together models so they can be configured properly if they need to be. I mean, it's a lot of runtime, a lot of specific things going on, almost like scheduling and a scheduler. It's an operating system. How should people think about moving from the search paradigm to letting the agents figure it out?

[00:20:25] Jonathan Frankle: Oh, I think the bottom line for me is, and you've heard this from me a million times, everything is a measurement problem, and everything is a Pareto curve. Everything is a trade-off between cost and quality or speed and quality. And so the way I think about it is, yeah, if you want the fastest possible answer, we released version 2.0 of our knowledge assistant product, and the goal was to make it really fast. And it's 3x faster, and it's a lot more fun to use. Answer quality is even a little bit higher, but we weren't optimizing for better quality. We were optimizing for speed. But, you know, we released this work called Carl earlier this spring. This often takes, you know, minutes to answer a question. It'll do 200 tool calls, but it gets you a really freaking good answer. And that was optimizing, you know, for quality. We made it a lot cheaper than, you know, closed models for the same task. But we're always navigating this Pareto curve, and the question is, are you just trying to minimize cost? Are you trying to maximize quality, or is there some sweet spot trade-off that you want to hit?
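
To make the cost-versus-quality trade-off described above concrete, here is a minimal sketch of how a Pareto frontier could be computed over candidate agent configurations. The configuration names, costs, and quality scores are hypothetical numbers made up for illustration; they are not Databricks results, and this is not presented as how Databricks does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str       # hypothetical configuration label
    cost: float     # hypothetical dollars per 1,000 questions
    quality: float  # hypothetical eval score between 0 and 1


def pareto_frontier(configs):
    """Keep only configs that no other config beats on both cost and quality."""
    frontier = []
    best_quality = float("-inf")
    # Walk from cheapest to most expensive; on a cost tie, higher quality first.
    for cfg in sorted(configs, key=lambda c: (c.cost, -c.quality)):
        # A more expensive config earns its place only by raising quality.
        if cfg.quality > best_quality:
            frontier.append(cfg)
            best_quality = cfg.quality
    return frontier


# Illustrative candidates (all values are assumptions, not measurements).
candidates = [
    Config("small-prompted", 2.0, 0.71),
    Config("small-finetuned", 3.0, 0.84),
    Config("large-prompted", 20.0, 0.80),
    Config("large-agentic", 45.0, 0.91),
]

for cfg in pareto_frontier(candidates):
    print(f"{cfg.name}: cost={cfg.cost}, quality={cfg.quality}")
```

In this made-up data, "large-prompted" drops off the frontier because "small-finetuned" is both cheaper and better. Choosing among the remaining points is the "sweet spot" decision: minimize cost, maximize quality, or pick something in between.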
[00:21:17] John Furrier: Speaking of Pareto curves, this brings up the next question, which is model routing and data routing, because we saw the new Pareto curve established with Vera Rubin now. They just started shipping. That's high performance. Huge money involved for the gear and the software. So now you have this quality-of-service set of Pareto curves emerging in all areas. That's going to open up intelligence around just-in-time, least-cost path. I mean, it's a routing concept. How do you think about that in the AI context? Because, I mean, whether you're routing packets or data, it seems to be very synergistic in concepts.

[00:21:53] Jonathan Frankle: Yeah. I look at it two ways. There's kind of, you know, the way that you route between different kinds of models, pick the right model for the right task. And, you know, we announced Unity AI Gateway. The whole point is to give you control over all the calls that are going out to your models. And you know what you can do underneath that? You can route. You can swap out models. You can do cost control. You can set budgets. All of that stuff is the beginning of doing really smart routing. Now, routing is hard. We've watched, you know, OpenAI, when GPT-5 first came out, had a router. It's since gone away. Anthropic, for Claude 4.7, had adaptive intelligence. That went away for Claude 4.8. It's actually really hard to get routing right. So it's something I'm personally putting a lot of work into right now. Because I think we can get to an answer on that, you know, as Databricks. I think we have a lot of customers with a lot of diverse tasks. And I think we can build some really good routers.

[00:22:37] John Furrier: I think it's going to be a critical opportunity. All right. So I guess my final question is, what are you working on? It sounds like routing is going to be a big part of it. Holistically, what's the patch of innovation that you're working on for this next year?
[00:22:50] Jonathan Frankle: For me, you know, I'll come back to where I started. What did we think the trends were going to be last December? What are the trends now? The trends right now are cost and value and getting ROI, governance and security. That is where I am entirely focused right now. How can I make sure that all of our customers have access to our very best agents at a great cost and with a great experience? And I think that's going to require a lot of custom models, a lot of custom RL. How do we deliver that same capability to our customers so they can do that for their customers? And I'm thinking a lot about governance and quality. Because at the end of the day, if you don't trust a model and you're worried about what it's going to do, you're never going to deploy it. I want to get rid of those barriers, and then we're going to get to all the cool stuff we wanted to do back in December.

[00:23:29] John Furrier: Yeah, we've moved from the buzz phase to the execution phase, and governance speaks to that. I mean, in the past year since we last talked, I've probably said the word governance more than in the past 10 years combined. It's front and center, and it reminds me of the shift-left days in cloud, when security became core to the developer. Now governance is a first-class citizen along with security.

[00:23:52] Jonathan Frankle: If only there was a company that had the world-class leading governance solution called Unity Catalog. Whoever that company is, is incredibly well-placed to do it for AI, and things like Unity AI Gateway are the next step there. So I feel pretty damn good about how we can really help the world to govern AI agents and do all the cool stuff that we wish we could have done a few months ago.

[00:24:09] John Furrier: It's always a pleasure, Jonathan. As always, Databricks doesn't disappoint on the goodness of the tech. And again, you've got a good loyal customer base. Again, you're busting out the seams here.
31,000, and probably, based on my experience with AWS, this means you move on to Vegas. Yeah, I mean, next year, 30,000 is going to seem like a small number. And I'm really excited to see that crowd. The 50,000 number is pretty brutal. I'll be walking through a crowd to go to the bathroom. It's pretty crazy. I've been there before. Great to see you. That's a sign that everybody loves data intelligence. Thank you so much. We have the chief AI scientist on here, breaking it down. There's so much action happening with the models. There are so many new opportunities. The rising tide is happening. And again, governance and security are built in. We're not yet there on the agents, but it's coming super fast. Databricks is doing its part. We're doing our part here on theCUBE, sharing high-velocity data. I'm your agent, your host, John Furrier. Thanks for watching.
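
As an illustration of the routing and budget controls discussed earlier in the conversation ("pick the right model for the right task", "set budgets"), here is a minimal sketch of a budget-aware router. It assumes you already have a measured eval score per model per task. The class names, model names, scores, and costs are all hypothetical; this is not the Unity AI Gateway API and makes no claim about how any real router works.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str            # hypothetical model label
    cost_cents: int      # hypothetical cost per call, in cents
    quality: dict        # task name -> measured eval score (assumed to exist)


class BudgetRouter:
    """Send each call to the cheapest model that clears the quality bar."""

    def __init__(self, models, budget_cents):
        self.models = models
        self.remaining_cents = budget_cents

    def route(self, task, min_quality):
        # Eligible = good enough on this task AND affordable with what is left.
        eligible = [
            m for m in self.models
            if m.quality.get(task, 0.0) >= min_quality
            and m.cost_cents <= self.remaining_cents
        ]
        if not eligible:
            return None  # nothing meets the bar within budget
        choice = min(eligible, key=lambda m: m.cost_cents)
        self.remaining_cents -= choice.cost_cents
        return choice.name


# Illustrative setup (all values are assumptions, not measurements).
models = [
    Model("small-custom", 1, {"sql": 0.90, "summarize": 0.60}),
    Model("large-closed", 10, {"sql": 0.92, "summarize": 0.88}),
]
router = BudgetRouter(models, budget_cents=15)

first = router.route("sql", min_quality=0.85)        # cheap model is good enough
second = router.route("summarize", min_quality=0.80)  # only the large model qualifies
third = router.route("summarize", min_quality=0.80)   # budget no longer covers it
print(first, second, third)
```

The sketch only works if the per-task quality numbers are trustworthy, which ties back to the measurement theme of the interview: a router is only as good as the evals behind it.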
