AI Engineer and Researcher: Why Who You Know Is More Important Than Your Code

[00:00:00] Speaker 1: Most people who I was in college with at the time thought I was crazy like why are you doing this like this is not on the syllabus it's not on the exam you know and for years I didn't even talk to anyone about like machine learning like there was no one else cared about it that I knew. [00:00:12] Speaker 2: You open up like a mystery box of potential interesting projects that you can work on right if you're struggling with oh what project should I work on but you talk to 20 different people chances are that at least one of them will have something interesting to work on. When I was at Sakana I also tried out many different ideas and many of them fail some of them work and I learned a lot from them but if you know how to use AI coding tools what they're good for and whatnot and they're pretty damn amazing and we can touch on how we use them which I think is really exciting because this is like a skill that everyone just has to know. [00:00:48] Speaker 1: What does this mean for programmers or software engineers? It's still important to understand the infrastructure around what's going on. It's still important to understand software infrastructure. It's still important to understand you know to be able to prompt correctly. [00:01:02] Speaker 2: I'm a research scientist lean more into the research and I would say it's a bit overrated to have to read every paper especially in full. [00:01:12] Speaker 1: So I actually come in on the opposite side of the research appreciation spectrum to you in the sense that I actually like things that are almost like near-term impact like things that can be used soon like I love when I read a paper and I think that's a really clever idea. And what I would do differently is like I would become. [00:01:32] Speaker 2: Welcome to the first episode of the signal and stories podcast. In this first episode the guests of the podcast are none other than the two hosts themselves. 
My co-host is Max Buckley, a senior machine learning engineer at Google, and then there's me, Boris Meinardus, a junior ML researcher who worked at AI startups like Sakana AI and now Radical Numerics. In this episode we discuss why we want to create this podcast and what it is all about, and share more about our own, possibly unconventional, paths into the world of AI and how we got to where we are now. We also discuss how to approach reading papers, where we debate the value of reading many papers, how coding agents change the game and how we use them, and how we think people can stand out nowadays, especially in the age of coding agents changing how we work and what the new important skills are. In essence, in this first episode we want to extract as much signal as possible out of our own stories, learning together about the different paths into AI and the different jobs in AI, and learning together how to actually do this podcast thing. Enjoy. Okay, Max, we're finally doing it. What is this? What are we doing here on a high level? What is this podcast that we want to do? And why is there room for another podcast? [00:03:12] Speaker 1: Well, we're making a new podcast about AI, machine learning, and the people behind it and their careers, and kind of sense-making as to what's going on right now in the industry, in the world even. And why there's room for a new podcast is, as software engineering is being automated by Claude Code, people need something to do with their time. And that is listening to podcasts. [00:03:41] Speaker 2: We just want to have fun, but while having fun also educate people and help people with interesting stories and give them an idea of the different career paths there are. And there's just so much to talk about. And there's just so much more than just the headlines you see and just, oh, LLM company here, LLM company there.
And yeah, even just the difference between engineering and research, even within research, there's a lot of difference, right? I mean, look at us, we're two co-hosts here. One of us is a big tech ML engineer, well, senior ML engineer, actually. That's you, Max Buckley. And then there's me, Boris Minaris, a junior AI researcher, let's say, who's more in the AI startup world. I, after graduating, I worked at Sakana AI for a year, and now I'm at a new startup called Radical Numerics. And it's fascinating. There's just so much exciting stuff going on in any part of the world of AI. And yeah, as, as you said yourself, Max, there's just so many interesting stories that aren't being told in the current AI content landscape, basically. I mean, there are really amazing podcasts out there as well that, that we also really enjoy. But those are often cover, often, in my opinion, or in our opinion, cover the end product, or the final paper that's being discussed, or the new product release that's being discussed. But for me, actually, having now worked at Sakana, for example, and having have had, having had the privilege to talk to so many interesting researchers and also PhD students, I got to see everything behind the scenes on how to create a paper, how to write a paper, how to do research. And there's just so much that's going on behind the final paper or the final project that's not being talked about, like so many failures. And how does a day-to-day life, how does a day-to-day work look like for these people? And yeah, then the question is, if we want to share these stories, what kind of guests or topics are we planning to, to feature here, Max? 
[00:06:06] Speaker 1: Yeah, I mean, I think a diverse set, I suppose, you know, people who are working in the industry, the tech industry, I suppose, you know, research as an industry, startups, but also people who are working in other industries that are being kind of disrupted or innovated in because of these [00:06:26] Speaker 2: technologies, right? Yeah, definitely. I mean, again, like, we have, I think, this, this unique combination, again, of having you, which as an ML engineer, more, yeah, in the engineering space and me as a researcher, where we also got to know a lot of people, especially you, Max, you are, you are the guy who knows a guy. It's, it's, it's crazy how many interesting stories you just could share with me by you just knowing other people and just having a coffee with them. And I found this so fascinating. I, I thought to myself, every time you told me like these stories of people that you met, I would love to hear more about these stories. And I think I could also learn a lot again, like these, these failures that people don't see behind the scenes, I think, super valuable. [00:07:15] Speaker 1: Yeah. I mean, I mean, networking is a whole other topic that's worth talking about at some point. But yeah, I mean, one of the biggest values, I would say something like this could have is to sort of synthesize what we're learning, what we're seeing, right? Like, as an example, I spent a last week in Germany, partly on a business trip, partly visiting a university there. I met beyond those things. I met several founders of AI startups. I met, you know, people who are in working in the industry. And now, you know, we shared a lot of thoughts. I learned from them. And I mean, if I learned some stuff from them, then likely many people, many listeners aren't familiar with these things either. Right. 
And some of these things are new, like, you know, new research papers that came out this week, or new GitHub repos, or new plugins for things. So, you know, this is just, again, a way to find these things, a kind of consolidation of knowledge, right? You know, you're hoovering up information in Japan, in your sphere, and I'm hoovering up information here in Zurich and Switzerland on my side. And so hopefully, by us discussing this stuff, people will listen to it and learn. I mean, it's not going to give someone great detail, you know, us discussing it, but it will give them a kicking-off point, right? That they can say, oh, maybe I should try this thing or look into this library or look into this startup, you know, maybe it's relevant to what they are doing. Maybe it inspires them to make their own startup. [00:08:47] Speaker 2: Yeah. I think this is a really exciting thing to touch on because, as already kind of mentioned before, I feel like everything is just LLMs, and you see big tech companies, OpenAI, Anthropic, and all of them are doing amazing work. Like, don't get me wrong, I love something like Claude Code, for example, and we'll touch on that as well later, but there's just so much more. Like, I'm now at a startup that's doing AI for science, basically. Right. And I feel like 2026 is kind of becoming more and more exciting in the realm of AI for science. But you rarely ever hear about the breakthroughs that people make there, except for AlphaFold, which is phenomenal, right? They won a Nobel Prize for that work. But there's just so much more happening there. And it's also just very interesting to work in that space. But in my experience, people reach out to me and ask: how do you maneuver this AI space? And how do you find a job? And how do you get a job working at these big LLM companies?
And I'm like, there's more there. There's not just the big LLM companies. And it's hard to get into those companies. Like, the average age of people being hired at OpenAI, I've heard, is 30. Maybe we'll do a fact check on that, but, um, I guess it's hard. And there's just so much more interesting stuff. When I was at ICCV 2025 in Hawaii, amazing time. Obviously it's a computer vision conference, but yeah, I did not see any LLM papers. I saw human pose estimation papers. I saw NeRF papers. I saw virtual avatar generation. I saw hand tracking, and all of those were papers that were presented there. And all of that is relevant to certain parts of the industry. I mean, imagine how it is if you want to do some AI for games or whatever, right? I mean, Fei-Fei Li, for example, has her startup, I think it's called World Labs or something like that. Right. And they're doing, yeah, 3D world generation, whether it's being used for games alone, or at some point for 3D simulations that can be used for RL training or for autonomous driving, all of that. There's just so much more to the space of AI than I feel like is being shown in the mainstream. And I think that's also where people can benefit, from seeing all of these different startups that exist, because startups are especially exciting nowadays. I mean, Google used to be the end goal of many people, like, I want to someday work at Google, and it still is to a certain extent, but even now people at Google seem to be going to startups more and more, because startups are also just an exciting place to be. And startups and big tech both have pros and cons. But just to circle back to what people should expect when they listen to this podcast, or what I hope they can get out of the podcast: there are a lot of interesting domains within the space of AI that people can explore.
And I hope that people get, even if it's not super in depth in certain parts, just to know that these interesting areas exist can just broaden the scope of people's understanding and views. [00:12:45] Speaker 1: Absolutely. Yeah. Just this last week, just to say, very salient, but, uh, you know, just meeting all those different startups. Like if we wanted to host any of these people as guests, I'm sure they'd be incredibly, uh, grateful, but it would also, you know, immediately give them some exposure, but also, you know, people listening will be able to check out what they're doing. Not, not just what they're doing, what they offer as a product, but maybe inspire them to say, Hey, they can do that. I can do that. [00:13:14] Speaker 2: Yeah, definitely. I mean, yeah. So, so many interesting things that we want to talk about, but I guess the question is now for people that do not know us, let's maybe do like an act like a, like a more deeper introduction to who we are. I guess this, this podcast is called Signals and Stories, right? Where we want to get as much signal out of the stories behind the people and the stories behind papers and products. So I guess I always thought the signals was like the machine learning, you know, the signal to noise kind of signal to noise ratio. You want to extract a signal out of all the noise and out of the nice stories behind the, behind those people. It's, it's, it's, yeah, it's a fun name. That's what's generalizable here. What is generalizable and what we want to learn is who are Max and who is, who are Max and Boris and who are they to tell us what these stories are and why it is important. Let's dive deep into who we are. Max, who are you? How did you get into where I, where are, who, where are you now? And where do you start? [00:14:20] Speaker 1: Um, I, I, it's funny, I've told this story on a few podcasts and whenever I tell the same thing, multiple times, I like to tell it differently to keep myself entertained. 
Um, of course that means the, the reader or the listener has the extra challenges to kind of integrate the information. But, um, yeah, born and raised in Ireland, um, always, uh, grew up with a computer. So I was always like an IT kid. I always preferred being on my computer than being outside. Um, I didn't trust the outside world. Um, but, um, ended up studying business in university as my bachelor subject, um, in spite of coming from a kind of technical, more technical sort of self-education. Um, didn't really like business. It was very fluffy and sort of subjective, but kind of got back into programming on the side and data science things through Coursera and online courses. Um, and then through some combination of good grades and weird, cool side projects, I kind of captured the attention of some recruiters at Google and got an internship at Google, which quickly turned into a full-time job, um, as a business analyst. And then I joined Google and spent, I've been there 12 plus years now, um, doing about 10 different roles because I changed team quite a lot because I get bored reasonably quickly. And, uh, in parallel to these 12 years at Google, I did multiple postgraduate studies and master's degrees, um, in ML related subjects. So I did a post-grad certificate in statistics, a master's in business analytics, a master's in software engineering, and most recently a diploma in data science in ETH Zurich. Um, so it's kind of unusual in the sense that I came from this like fuzzy or soft qualitative subject background into a more quantitative space. Um, you know, moved from being a business analyst into the software engineering roles, um, and then have kind of moved even from more pure infrastructure software engineering into AI research, applied research. Um, yeah, [00:16:28] Speaker 2: eclectic background. That's what I would say. No, I think that's again, like super interesting. 
Again, the mix here is very interesting because you have this, yeah, more uncommon background for getting into ML and AI. Whereas for me, it also was kind of a mix, but I knew quite early on that I wanted to do ML. And so for me, it started when I finished high school. I had math and physics as my main subjects for 11th and 12th grade. Right. But I never coded before. I never had even an idea of what coding was, really. So there I was, just finished high school, ready to start my college degree. And I was like, what do I do? And then I did an internship at a biotech company that did pacemakers, where I got in touch with programming for the very first time. So I was like, okay, I'll start a programming internship, but I've never touched a single line of code. So I literally went to like a library and bought like some Java book and was like, okay, let's try and learn Java, because I Googled what's the best programming language to learn. That was in 2018. And I saw, okay, Java. I bought like some weird Java book and tried to learn programming out of a book, which, yeah, it feels weird. It wasn't the best, but it was enough. I got into the internship. I then worked with, like, C# and, yeah, not the best language, but it was fun. It was just so much fun that I could just tell the computer what to do and it would do what I told it to do, even if it was wrong, because I'm the error, not the computer. [00:18:23] Speaker 1: Just to interrupt you for a second. I had a very similar experience when I was studying business: at some point I decided I wanted to get into programming. And I asked a friend who was studying software development if he could recommend a book on Java, because for some reason I decided that was the one I should use. He lent me a book called Java in Two Semesters.
And we used to have a month off in January. It used to be like the semester ran from February to June or whatever. So over Christmas break, in January, I read the entire book, the 600-page Java in Two Semesters. And I implemented several of the programming assignments that were at the end of chapters, and I built the application I wanted to build. So that was how I kind of started in Java. [00:19:04] Speaker 2: Very similar. Yeah. I mean, and for you, it was even further back, right? That would have been like 2010 or something. Wow. We both learned Java from a book. [00:19:18] Speaker 1: And there was no vibe coding back then. You had to actually type it yourself. [00:19:24] Speaker 2: There was no vibe coding, but now, that changes everything. That changes how I would recommend people learn programming, fundamentally. Right. But I mean, so yeah, then I studied computer engineering because I was like, okay, programming, but I want to program some robots. So I needed some physics and some electrical engineering. That was my idea. Very simple. And then towards the end of 2020, I heard of a module, like, machine learning. Huh. So I can teach the computer to do something that I want it to do, not just tell it. That was interesting. That was before ChatGPT. Right. So I took that course and I was instantly hooked. It was phenomenal. I loved it. For my bachelor thesis, I wanted to do some sort of robotics and machine learning and reinforcement learning at that point. So I joined a project where I worked on mostly engineering. I did train some RL agents. Again, the term agent changes all the time, but some RL agents to drive around, autonomous driving, but it was mostly engineering work and actually deploying it onto real robots.
And for people who have worked with real robots: it's a pain working with physical robots. It's such a pain. It's horrible. But it was fun. I worked there. I co-authored a paper there that was also published at a nice robotics conference. And then I took only AI courses, had a blast, got decent grades because I just enjoyed it as well. And for my master's thesis, I worked on vision language modeling and, more precisely, video language modeling. During all that time, I worked as a student researcher, be it at the Fraunhofer Society in Germany, and then as a student researcher at my university lab, where I did my thesis. And during the time that I was doing my thesis, I already had a PhD offer. I was ready to do a PhD, but during all that time, during my master's, I was applying to internships. I applied to dozens of cool internships and got invited to interviews at a company called Nuro, which did autonomous driving. But I failed that interview. I got invited to an interview at Amazon for an applied scientist role. I failed that interview, because those were my first real interview experiences, and I deserved to fail them. Like, I genuinely deserved to fail them. So that's okay. I had an interview with a DeepMind researcher. I failed that. I don't know why, but I guess I failed that as well. So I was like, okay, one after the other, failing, but I don't care. I'll continue. I'll just have fun and do my research. I have a PhD offer ready, I have next steps to take. But then I saw this Sakana AI company, a new startup in Tokyo, with amazing founders: Llion Jones, co-author of the Transformer paper, and David Ha, an amazing researcher, quite popular, who has quite a large following and does really interesting research, in my opinion at least.
And I was like, okay, I want to live in Japan as well because Japan is super cool. Um, so I was like, okay, uh, I won't apply because they won't, they won't invite me anyway. I'm, I'm just some weird master's students and I'm, I'm, I would be applying for the full-time research scientist position. Yeah, doesn't matter. But then I did, of course, I still, I did convince myself, okay, let's, let's just try it. What's the worst that can happen? And I got invited to an interview. I, I, I, I don't like the super hard toxic productivity hustle idea, but I worked pretty hard for this interview because I was like, I failed multiple interviews before. I now know how the game roughly works and I'll work hard for this interview. So I did and I passed and that's how I, great, great success. Hey, persistence, persistence works out. Right. I mean, as long as you keep going, it's, it was really cool. I, it was, it was, it was so fascinating. Right. So I, I graduated from it with my masters, not even a PhD and joined this dope startup as my first job, basically. That was, that was really exciting. But then after one year, we parted ways. I still love Sakana and they are phenomenal. I'm rooting for them all the way. But as a junior, I wanted, I just wanted a different environment and more, a working as a big team together on one targeted goal. Whereas at Sakana, you have the, the unique environment where you as a researcher can do whatever, whatever you want. You have full research freedom, which is very rare at industry research labs. But that also meant that it's a bit less one big team working towards one big goal. And that's kind of what I wanted to try out next. I wanted to work on that. And I want to work in this kind of environment. And that's where I'm now at Radical Numerics, which is still, we're still working. It's a very early state startup and we have a lot of work to do, but it's just, it's just really [00:25:25] Speaker 1: exciting. So that's the long story. Yeah. 
I think one of the common threads that's interesting in both of our stories, even though maybe it's not as obvious, is this concept of agency. It's funny, I feel like the word agentic being applied to people is a recent innovation, but, you know, it's been applied to me multiple times. And I also think it obviously applies to you, right? Which is that you don't kind of passively sit and wait for something to happen. You know, you do things all the time, and some of them land and some of them don't. And sometimes even when they do land, you then cancel them, roll them back, don't do it. You know what I mean? But, you know, you miss all the shots you don't take. And so for you, I think a big thing that helped you get to Sakana was the fact that you had a YouTube channel, that you had been teaching hundreds of thousands of people ML and AI concepts online for several years. All right. I mean, I've had the experience where I've applied to roles and I've had an interview, and the first thing the interviewer says to me is, I know you from LinkedIn, I've followed you for several years. And I'm like, sweet, interview complete. I obviously haven't taken the job, but it happens, you know. I have other interesting stories about, you know, just kind of meeting people, opportunities being presented just through weird stuff that I did. Right. You know, you think, hey, wouldn't it be cool to do this? Then you go and do it. Or you think, I should blog about this, or I should dig into this. Right. And later it becomes useful in some way. And that's something I think that everyone can do, everyone can benefit from. I mean, you can choose your degree of agency. Right. But, um, [00:27:22] Speaker 2: yeah, no, I 100% agree. And yeah, you're right.
I, my YouTube, I have my YouTube channel and there I, I try to also teach people about the non-standard advice that you might get. Right. It's okay. You get a college degree, you take all the courses, you get good grades. That's, that's fair and good and important, but it's, it's, it's going with the current, right. And it's not, it's what you just describe it's no own agency or initiative, right? Side projects are so important. And that's what's, what, what, what shows who you are. And that's where you learn the most as well. That said, it's very hard for many people to do this self-guided learning without some sort of curriculum, especially in the beginning. If you have never coded, where do you start? If you've never seen ML, what, where do you start learning? So I think the initial bootstrapping of your career is, it's, it's just, it's exploration where you learn a lot of different concepts. You have to get through the math. You have to get like an intuition for all of the things. And once you get past the first hurdle, which, which, which can take years. Then you can do whatever you can write blog posts, you can create side projects, you can start collecting GitHub stars on your repos. You can just, you can create a YouTube channel, heck if you want to. But somehow showing agency and talking with people is super, super valuable. This is something that I [00:29:06] Speaker 1: learned way too late. I mean, you learned it earlier than I would say most people. [00:29:14] Speaker 2: Well, but, but, but in fact, that, that, that's the one reason that I started my YouTube channel is because I wanted to put myself out there. I was bad at talking to people in person or walking up to people. And in, in college, I had like one best friend that I did everything with, um, who then left me and went to Munich for his master's degree. And I was alone in Berlin, in Berlin, where I then found and found out. 
I didn't have only one friend, but I had, I had a, I had a handful of friends, let's say, and it was really bad at just getting to know other people. I never went to some parties or never did, did hackathons, which I really regret, for example. It was hard for me. Right. I wasn't that kind of guy. So I was like, okay, you know what, I'll just talk to a camera and then people will see me and they will talk to me maybe someday. [00:30:06] Speaker 1: Make it pull rather than push marketing. Yeah. [00:30:08] Speaker 2: Yeah, exactly. Um, and it worked, it worked after two years of, of, of constantly doing it. Like, but yeah, it worked when, in fact, funny story. When I came, when I joined Sakana, um, when I, when I had my introduction or in fact, in most cases where I introduce myself and people recognize me, they're like, Oh wait, I saw your face on YouTube. And then once one person says that and everyone picks up and then, what, where to show me, show me a YouTube channel. And then we're like, Oh wow. You have so many subscribers. And then that's the number one topic. Oh, what papers have you written? Oh, let's start with like, no, it's YouTube. All those, all these insanely smart researchers. Let's talk about YouTube and just how that is because it's interesting because it's just interesting to people because it's unique. There are not many people that they have seen with this YouTube channel and it's not just the YouTube channel. Again, it doesn't, or rather it doesn't have to be YouTube channel. It can be a LinkedIn account with a lot of valuable content there. It can be a super valuable blog. It can be just a large following on X because you do like really cool short, short, short, short form content there. Um, or you're just having, have like a very strong ML meme game. That's also possible. Right. I mean, it's, I mean, I think, I think there's actually, [00:31:40] Speaker 1: there's two parts to it, right? 
Like there's agency and being agentic and like doing stuff. And there's a question about publicity and being public, you know, like you can be agentic and make the most of your time and your life, right? You can, you know, go out and do things and say, yes, I like the movie, yes, man with Jim Carrey. It's a kind of funny meme, but you know, you can go out and do things, you know, when opportunities are presented to yourself, you can take them, you know, you can proactively, you know, read and learn and implement and present to conferences, whatever, without necessarily putting yourself out there in the, in the sort of internet sense. Right. And then obviously doing it on the internet takes it up another notch, right? That like, once you'd put it out there on the internet, then potentially you're going to get like first hundreds and then thousands and then tens of thousands or thousands of eyeballs. Right. Um, cause for me, as an example, I was very quiet online until the beginning of 2024, but I was very much like agentic in the sense that like, you know, from, for my whole life, obviously, but you know, if I think about my, my getting into Google, right, like I did a bunch of, uh, Coursera's back in 2012, 2013, when that first shipped at some point, I was one of the world's top Coursera completers. I had done 40. Um, you know, I was featured on the Coursera blog for my, some of my stories, um, you know, but, and I was doing other stuff. Like, I mean, obviously in work, I was learning and doing work stuff, but like on the side, I was doing, you know, postgraduate studies and whatever I was going to conferences. I was, you know, presenting various things at conferences, occasionally I had like meetups and stuff, but again, not putting it online. So there was no, or, or very limited like digital trace of these things. Right. 
Um, but then of course in 2024, I basically started posting on, on LinkedIn and I would say quickly gained quite a bit of like traction or following. Right. And I, I wasn't really doing anything different. I was just writing about it. Right. And by writing about it online, way more people find out than however many people were in the room when you presented about something. Right. Like if I was to go and present at a meetup who, I don't know, 20 people, 30 people will see it and recognize me and say, okay, Max presented at a meetup. If I put it on LinkedIn, I'll probably get like at least 15,000 views, you know, like, you know, and not only is it 15,000 views, right. But it's like 15,000 random people are saying, oh, Max presented at that meetup. Like I'll go to work and some guy I'll be like, oh, good job at that meetup last week. I'm like, were you there? He's like, no, you know, um, so I think, you know, you can be agentic without being public. If you're not that way inclined, if you're a bit nervous about the feedback or, you know, you're nervous about public presence. Um, but of course you can do both. And I think that that's also even more valuable, right. In the sense that if you're willing to be agentic and public, then a lot of stuff can flow to you, right? Like, you know, as you said, when people recognize you, they come to you, you don't have to go to them. You know, that people say, you know, associate your name with, um, particular area of expertise or whatever, and then they reach out to you. And that's obviously amazing. And not likely if you're just presented to a room of 30 people, right? Like, it's not likely that someone in that room is like, have you thought about writing for our company or have you thought about joining our company? You know, um, obviously if you get your name in front of hundreds of [00:35:11] Speaker 2: thousands of people, that's a bit more likely. Yeah, though, I, I, I 100% agree on that. 
And as you said yourself, you don't have to put yourself on the internet as the first step. I think it's valuable because, again, it's learning in public. It's teaching you skills that are important or can be important in many different ways. But that's actually a question that I also wanted to ask you: if you were to restart this journey, what is the one thing that you would do differently? And I'll start here, because for me it's very obvious. For me, it is trying to get to know more people. Again, it's very important that you have the technical skills. That's just the foundation, right? But once you have the fundamentals and you enjoy what you're doing and you have fun learning, then also get to know people, people who work on different things. You open up like a mystery box of potential interesting projects that you can work on, right? If you're struggling with, oh, what project should I work on, but you talk to 20 different people, chances are that at least one of them will have something interesting to work on. Chances are that at some point you might find one or two people who would be happy to go to a hackathon with you. I would love to do a hackathon if I could start over. I think those are super exciting and super valuable, even if nowadays it is just some LLM wrapper that, I don't know, writes bedtime stories or whatever. It's getting to know people who also have the agency to go beyond the standard beaten track. So yeah, that's something that I would like to do.
Again, I did my YouTube channel, which I'm very proud of, but man, there were some interesting people who, looking back now, have gone down interesting pathways, doing PhDs and other interesting projects, and I could have had the chance to get to know them, but I didn't. [00:37:37] Speaker 1: Yeah, I definitely think, I mean, well, I suppose the big question was originally, what would I do differently? I definitely think I've always been relatively proactive in getting to know people. The internet obviously makes that set of people much bigger and broader. When I was younger, I kind of knew people who were physically adjacent, right? Even when I would have thought about something like, should I do a startup, the people I would have considered are the people who were my friends. And your friends don't necessarily have the relevant professional skills and interests that you care about. Right. And similarly, when I started doing Courseras and learning about the things that I was learning about, most people who I was in college with at the time thought I was crazy. Like, why are you doing this? This is not on the syllabus, it's not on the exam, you know? And for years, I didn't even talk to anyone about machine learning. There was no one else I knew who cared about it. I was just in my little world of learning and reading papers. And at some point, a colleague of mine went to California to work with the Google Brain team. And then he came back to Dublin and he spoke about some of the research at the time. And that was the first time I ever heard a human talk live about all these papers that I had already read.
Now, you don't have to go to that extreme, right? Of completely rabbit-holing yourself for years in isolation. Nowadays, obviously, if you talk about these things publicly, people who have a similar interest will find you and talk to you. And I think if you have a kind of peer group, you'll definitely learn from them, especially if they are themselves motivated, interested people, right? They'll do stuff and you'll see that it's interesting. Maybe you'll dig into it. Maybe you won't. [00:39:30] Speaker 2: Yeah, I think it's this, again, agency and doing things that are outside of the syllabus, which honestly is also a hard pill to swallow, because it's more work. It is more work, right? Unless you try to have a very good symbiosis between what you're learning in your standard coursework and what you can then apply to side projects or side ventures. [00:39:58] Speaker 1: Now, another thing, let me point out. One thing that I've really enjoyed in the last year or two is that I feel like I've managed to align my personal interests with my work and with my side projects in this beautiful kind of flywheel, right? The things I do in my free time are meeting people who are working in the relevant industries, or reading papers, or hacking away on programming side projects, things I think are interesting. And then of course I'm in work and people are asking questions about something, and I can encyclopedically answer their question because I literally read the paper at the weekend and wrote my own thoughts on it. Right. And then of course this satisfies them, or hopefully impresses them, and then takes me back into, well, I should read more papers in this field. Or maybe they ask me a question I can't answer.
And I go, let me dig into that by reading two more papers and writing about those. And then of course I meet some random friend and I'm like, have you read the paper? And they're like, no. And then they're very grateful to get that tip, you know? But yeah, sorry, what I would do differently is I would become more public much earlier. Knowing what I know now, I think posting publicly has been really valuable. In the not even two years I've been posting publicly, I've gotten so many interesting opportunities. I've met so many interesting people. I've made so many interesting friends, like yourself, that I would never have met if I was just beavering away, head down, in Zurich. Right. And if I had done that 10 years ago, well, the kind of stuff that I was doing is stuff that people broadly value. So likely I would just have a bigger platform and more opportunities. And I haven't really had any negative experience yet. Occasionally you get cheeky, rude people leaving snarky comments, but I don't entertain it. I don't really read them. [00:42:05] Speaker 2: You do get them, but in my personal experience it's a very, very small fraction. The vast majority of people either just don't comment, or it's just friendly comments, mostly. Because people are friendly people, everyone wants to learn together and grow together. And there are just some people, no matter whether it's in AI or in some other domain, who are a bit more rude, and they will have their own challenges in life. And that's okay. Everyone has their own challenges to battle. But I think it also just depends on where you are in the process, right?
Again, if you're just a very, very early-stage engineer or researcher, learning the core skills, and actually learning by doing, is the number one priority. And if you then find the spare time to also make it public, it amplifies your learning. But I always envision this XY plane where on the X axis you have your knowledge. If you're at the far end, you're a real expert in one domain. And on the Y axis, you have your public voice. If you're in the top left corner, no expertise, no knowledge, but just a loud voice, you're some AI influencer who a few years ago was a Web3 influencer or Bitcoin influencer and has no idea what he's talking about. It's just pure clout. But almost equally as bad is the other extreme: you are an expert, but nobody knows you, nobody reads your work, and your value is lost, which is very sad. So you have to find this balance. It doesn't have to be right on the straight line, but I think that's where it's valuable. [00:44:10] Speaker 1: That's where it's nice to separate agency from publicity, right? If you know literally nothing, if it's your first time using something and you don't know how to use it and it's not working for you, I don't think it's worth just writing a post about, I tried to use this and it doesn't work, because probably you just did it wrong. Right. If you can't, for example, use an embedding model, or you're just getting a type error, don't write a blog post about it. But, you know, if you are already somewhat in bed, you know, but you can still be agentic. It's still good that you're doing that.
It's still good that you're educating yourself, that you're downloading a model from Hugging Face, that you're getting as far as a GPU compilation issue. So keep it up, you're doing great. But at some point, as you become more confident and more expert, you can share things more publicly if you so choose. And about the bootstrap problem you mentioned, the cold start, right? For me, for example, I first heard of machine learning as a concept on Coursera, and at first I thought, this is great. It sounds like all the best parts of statistics without the boring bits. The boring bits, which I later realized are actually really important, which a lot of machine learning doesn't realize yet. But I discovered it through this, and then I learned a lot about it through lots of Courseras. I don't think that was the optimal way, and later I started reading more papers and things. But it's definitely real that you've got to start somewhere, right? [00:45:46] Speaker 2: Yeah. I mean, actually, I would somewhat disagree with your take. Even if you're early stage and not an expert yet, I think it's still valuable to post online that, oh, I made this mistake, but I then figured out the fix, and this is what I learned. Just small posts here and there like that. [00:46:11] Speaker 1: I mean, I don't mean to say that it's not valuable, but I think it's personality dependent. It's like, how much do you need to save face publicly? If you think that you'd personally feel very humiliated being wrong, then obviously you're not going to want to put things out there. Right. If you're more confident, then maybe you're okay. Each person can choose for themselves.
I don't think we need everybody to publicly post everything. Right. And of course, if we go down that road, then there's obviously more competition for eyeballs, and then you'll get fewer viewers because everyone is posting their stuff. But yeah, I think it's just about being agentic and just doing stuff. Don't feel limited by, oh, I studied the wrong subject, or, I don't know about that. Obviously you don't know about some advanced topic if you haven't looked into it. But you go, you read, and now that you've talked to an LLM about it, now you know something, right? You try to do it, you fail, you know. [00:47:25] Speaker 2: Yeah, that's actually also an interesting point to touch on, being an expert in something. On my path, I jumped around a lot, basically very similar to you. When I started, I did robotics and reinforcement learning for one and a half years. While I was doing that, I had a part-time student researcher role where I was doing time series forecasting with graph neural networks. And then in college I also did a bit of computer vision in the healthcare space, recognizing aneurysms in the brain and stuff like that. Just fun college projects. And then I did video language modeling. And then when I was at Sakana, I also tried out many different ideas, and many of them failed. Some of them worked, and I learned a lot from them. But especially if you want to go into research at some point, many people say that it's valuable to be an expert in one domain rather than a jack of all trades but an expert in none. And if people were to ask me, I would say in the beginning you should just try out things and learn different skills, because you're nowhere near being an expert.
No one is an expert in anything, but at some point you have to specialize a bit, because it's just impossible to know everything. As for what I did, I don't know if it is bad or good. My intuition would say that it has advantages, but it makes your life more difficult if you just jump around a lot, because with every new subject that you get into, there's a lot of overhead in reading up on all of the relevant literature for this new domain that you're all of a sudden in. Again, I was doing RL and robotics, so I had to read up on all of those papers. Graph neural networks, what are graph neural networks? I had to learn all about that. Time series forecasting, video language modeling. Now I had to learn about all of the multimodal large language model papers and all of that. And a lot of different things. It's okay, but it's not the easiest path, in my opinion. And I don't know, I'm curious about what you have to say about that. Because, again, I know that you've also been jumping teams a lot, and it has its advantages, but also disadvantages. [00:49:49] Speaker 1: Yeah. I mean, I kind of have a different thought here, a different perspective. Obviously, if you want to become really expert in something, you need to focus on it. Right. But I also think that you can't specialize too early, because you don't have a broad enough base. The analogy I would give here is, let's say, physical exercise, right? If you're someone who's desperately unfit, you can't just become super strong. I mean, you can try, you can try and do maximal lifts, but it's better to become all-round fit and healthy, and that will make your strength levels increase. And then later, once you've already got a solid base, you can specialize in one or more sports. A lot of the best athletes come from other sports because they have this well-rounded, solid foundation.
And I think that if someone told me they want to become an expert in neural networks and they just start reading the key papers in neural networks, they don't necessarily yet understand the math or the adjacent computer science well enough to really fully get what these papers are about. Does that make sense? [00:51:07] Speaker 2: It does. For me, it's more about at what point you decide that you want to specialize more, right? [00:51:22] Speaker 1: I mean, for me, it's kind of like, there's a rate at which you're learning. And of course, there's also a rate at which you're forgetting, right? And obviously, at a certain point in time, you've accumulated a certain stock of knowledge. I know a lot about computer science and machine learning and stuff, right? If tomorrow I started learning Swiss law and just became a lawyer, my stock of machine learning knowledge would be both getting outdated and kind of decaying, because I'm forgetting some of it. Of course, five years from now, someone could ask me a question, and many things I could still answer, but the more nuanced things I'd have forgotten. I think if you're jumping around completely, then that's going to hurt. But my feeling is, if you're jumping around in related areas, there are a lot of relevant connections, right? And again, I have definitely got a bias to exploration, so maybe you're talking to the wrong person. I know people who are much more narrowly focused. But I would say, if tomorrow someone says, hey, do you want to work on my graph neural network startup? I've read several papers on GNNs. I've implemented some GNNs and things. I'm definitely not an expert in it, but how quickly could I become an expert versus some random undergraduate student? I would absolutely be confident that I can school them. You know what I mean?
Because I already have the basis and these adjacent competencies, right? If I said, okay, I'm going to start reading GNN papers tonight, I could read multiple, and not only would I read them, I would already be familiar with, okay, this is the, you know, generalized attention network, or message passing, and all these other things, right? I'm already familiar with the concepts. So it's a question of filling in the details or fleshing it out. Right. [00:53:14] Speaker 2: Yeah, I agree with that. And actually, yeah. [00:53:21] Speaker 1: I was gonna say, one thing I've always found strange personally is how people so narrowly scope themselves. Right now, some people might say, oh, Max is a RAG guy, because I've talked about RAG at some conferences and things, but I don't think of myself as a RAG guy. I think I talk about whatever I'm focusing on. I meet developers who sometimes describe themselves as, oh, I'm a C# developer, and I'm kind of like, I mean, I've never written C#, but I use a dozen languages. So it seems like a very narrow scope. I mean, don't get me wrong, there are some people who truly are deep experts in, say, Java. I have a friend who wrote a book on Java, and if you ask him about how a thing works, he doesn't just tell you how it works. He tells you how it used to work in different versions and why those changes were made. And he tells you about the discourse, the discussion between the standards committees. Right. So absolutely, he's a Java expert, but, you know. [00:54:19] Speaker 2: No, I agree, actually. Nice. And I think it also depends a lot on personality.
Especially in research, usually you do a PhD, and your PhD is on one certain topic, where you try to make progress on one specific problem in your space. For example, computer use. If you're doing a PhD on a computer use topic, your PhD will be one, two, three, four papers to improve computer use using different techniques. Right. You usually have to have this red line that you can follow, so that you have a coherent story through your final thesis. And that's where you usually get this expertise. Whereas for me, where I think it's actually a good thing is that this shows that once you just stick to it, and you learn a lot of the fundamentals, and you learn how to learn, you can pick up a lot of the skills on the job, right? Learning on the job. You're essentially optimizing for learning on the job by spreading wide. Again, I worked on many different things, and now I joined this new startup and all of a sudden I'm working on AI for science, which I have basically never done before. But I know AI. I know how to implement most models, not any kind, I would not go that far. But even if there is a model that I've never implemented before, I'll read up on it and I'll catch up in days or in weeks, depending on how comfortable I am. And then I'm good to go. But again, getting to this point is really hard. When you start as any kind of researcher or even engineer, starting to read your first papers, you'll spend a week on one paper, and getting to that point feels impossible. And especially nowadays with AI tools, it's crazy. You can iterate so quickly. You can learn so much. And that's where I want to ask you, what is your take on the AI tools? Or do you want to first?
[00:56:46] Speaker 1: Let's get to that in a second, but about the thing about papers and expertise, I think papers are a terrible place to start. A good paper, to me, is relatively short. I don't like one that's a hundred pages long, because then it's too large to read in one comfortable sitting. I love papers that are in the 15 to 30 page range, but again, they're not fully self-contained. I mean, the reason they have all those references is so that you can understand them. Right. And even for me, I recently gave a talk on RAG and I had 44 papers cited in the slides. I have read all the 44 of them fully. And it's funny, because I'm giving this talk on RAG and I'm pretty familiar with the literature. But with any one of those papers, if you don't know anything, the references are pointing all over the place, and you're like, I haven't read all of these things. And I still haven't read all of the papers in the space. I've just read many. So you can read it again and say, oh, maybe I should know more about this REALM model, or maybe I should know more about this other one. Right. You need to get the high-level concepts and your base, your basic fitness, as per my analogy, before you go deep, right? I don't think reading the RAG paper is a good way to learn what RAG is, because what most people use or describe as RAG is not what the RAG paper does. The RAG paper is much more complicated and detailed. And so if you look at the RAG paper, you might think, oh, this is hard. This is complicated. What do you mean, end-to-end backprop? And you're optimizing the generator and the query encoder, and you're potentially doing retrieval on every token generated.
Whereas for most people, RAG is chunking, retrieval, context window, boom. So a blog post will get you that, right? Or even that one sentence I just said. You don't need to read the RAG paper. Now, if you want to know the area, know the literature, know where it's coming from or where it's going, then you need to read the paper. [00:58:53] Speaker 2: That actually brings me to somewhat of a hot take. If I were to ask you what is one best practice that you think is not really established enough, and what do you think is secretly a bit overrated: on the overrated part, I'm a research scientist, I lean more into the research, and I would say it's a bit overrated to have to read every paper, especially in full. Like to think, oh, you have to read a paper every day. First of all, it depends on which phase you are in, in your project. Sometimes it's just pure engineering, where you just have to code up the entire infra and you don't have time for reading papers in that phase, or at least for fully reading them. And then also, you have to know your literature, but it's more about knowing the few seminal papers that define the space you're in, that define the vision and the key core intuition behind what you want to do. Those you have to know in depth, in my opinion, and many follow-up works are incremental improvements. With many of them it's, okay, look, results, numbers go up. Cool. Figure one or figure two, depending on how they do it, or whether they do a teaser image or not. You look at the system architecture and you're like, okay, it's basically exactly like this paper, just that they changed this one component. And you're like, okay, let me quickly look at the ablations. Do they clearly show what happens if they take this component out? Hmm. Okay.
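[Editor's note: the simplified pipeline Speaker 1 describes above (chunking, retrieval, context window) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method from the RAG paper and not any particular library's API. The function names `chunk`, `retrieve`, and `build_prompt` are hypothetical, and word-count cosine similarity stands in for the embedding model a real system would more likely use.]

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 12) -> list[str]:
    """Split text into chunks of at most `size` words (toy fixed-size chunking)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = bag_of_words(query)
    return sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Place the retrieved chunks in the context window ahead of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"
```

[The resulting prompt string would then be sent to whatever language model is in use. Nothing here is trained end to end, which is the contrast with the paper that Speaker 1 is drawing.]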
Let me try to integrate this simple technique into my code, just have a flag to toggle it on or off, run an experiment, and see: does it improve whatever metric I'm interested in or not? Right. I mean, you can get a lot of insights from papers that are 50 pages long. And I think those are very valuable if you are not already in this space and don't yet have the intuition and the seminal works figured out. But to circle back to my hot take, I think you don't need to read every single paper, because they are often just incremental improvements, and you have to see what you need for your project. [01:01:41] Speaker 1: So, I mean, I totally agree. You don't need to read every paper. I also think the concept of reading a paper, or having read a paper, as a binary variable is super flawed, right? And the reason is, as I said, papers are very much incomplete. So it's a combination of your own pre-trained knowledge that you've baked into your brain and what's in the paper, right? I have this experience routinely where I read a paper, and then years later, if it's seminal, I read it again and I understand so much more. Or you read a paper, and then a few months or a quarter later you want to implement it, or part of it, in a system. And then you really reread those sections of the paper, and you're really granularly extracting the specifics that maybe you glossed over. Or maybe there are even things that are not specified in the paper, and you didn't realize it until you tried to implement it. And similarly, when I post about papers, usually just the act of spending a few hours reading it, taking notes, editing the notes together into some coherent form, and potentially iterating on my write-up before posting it really helps to improve my own internalization of the paper and my recall later.
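[Editor's note: the "flag to toggle it on or off" workflow Speaker 2 describes earlier in this passage could look roughly like the sketch below. Everything in it is an illustrative assumption: the `Config` fields, the toy objective f(x) = x^2, and the choice of momentum as the borrowed component are hypothetical stand-ins for a real training setup and a real paper's technique.]

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Config:
    steps: int = 50
    lr: float = 0.1
    use_momentum: bool = False  # the single component being toggled (hypothetical)
    beta: float = 0.9


def run_experiment(cfg: Config) -> float:
    """Minimize f(x) = x^2 by gradient descent from x = 5.0; return the final loss."""
    x, velocity = 5.0, 0.0
    for _ in range(cfg.steps):
        grad = 2.0 * x
        if cfg.use_momentum:
            # Heavy-ball momentum: accumulate gradients into a velocity term.
            velocity = cfg.beta * velocity + grad
            x -= cfg.lr * velocity
        else:
            x -= cfg.lr * grad
    return x * x


def ablate(base: Config) -> dict[str, float]:
    """Run the same experiment with the flag off and on, changing nothing else."""
    return {
        "off": run_experiment(replace(base, use_momentum=False)),
        "on": run_experiment(replace(base, use_momentum=True)),
    }


if __name__ == "__main__":
    for name, loss in ablate(Config()).items():
        print(f"use_momentum {name}: final loss = {loss:.3e}")
```

[Keeping every other setting identical between the two runs is what makes the comparison meaningful. Whether the toggled component helps depends on the setup, which is the point of running the experiment rather than trusting the paper's numbers.]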
I'd love to do a little experiment where we get one set of people to read the paper, and another set of people to read the paper and write about it, and literally measure their recall six months later, 12 months later, five years later. And I'm pretty sure that the group that has to go through the extra time and effort will have much higher recall. Again, that's not news, right? Because they have spent more hours on the task. But this idea that, oh, I've read that paper, so it's done, is so flawed. [01:03:27] Speaker 2: No, I think also, for example, on Twitter, many researchers, when they have a paper, just create a thread on their paper, summarizing the paper even more into the really important bits, the ones that are important to you, because papers have a certain vibe to them. They have to have a certain language and a lot of boilerplate text and all of that. And even with these Twitter threads, when I read one in full and see the five experiments that they did and the insights that you can get from that, I'm like, okay, I guess I read the paper now, right? Because you got the value that you needed in that moment. But still, again, reading a paper is very context dependent. Just reading it because you're curious is one thing. But if you're working on an active project, the way you read a paper that you need for this project is very different from just reading it for the sake of having read every paper in your space, just having the high-level intuition. Again, I think the seminal papers are very important. And if you are working on a certain project and you encounter a problem, oh, I want to improve this, let me look up how other papers do it, then you have a reason to read, or you have a quest for what you're looking for. Right.
So it's difficult, but I think the idea of, oh, you have to read a paper every day, like, come on. And first of all, what does it mean? If you just scroll through Twitter and you see some Twitter threads, I guess that already counts as reading a paper nowadays. [01:05:16] Speaker 1: But yeah. Yeah. I mean, I think it's not just about papers, right? It's about learning or knowledge in general. As I mentioned earlier, I've done a lot of part-time study. I mean, I did some full-time study, obviously, and then I did part-time study and also online study. And one thing that's interesting to me is that certain things come up again and again, right? Linear regression is the best example I can think of right now. If you do any stats class, it usually starts with linear regression. If you do many machine learning classes, they start with linear regression. So I've done linear regression, I don't know, probably a dozen times over different Courseras, statistics courses, and machine learning courses. And oftentimes these linear regression learning experiences are months or even years apart. Right. And what's interesting is it's the same mathematics, it's the same model, but I always appreciate the slightly different perspective, a different take. I think about how much more nuanced my understanding is now than it was when I learned about it the previous time, or even the first time. And it's the same with everything in life, right? You can think about things in a kind of binary sense of, oh, I did that, I saw that movie. But it's a very different experience when you see the movie again 10 years later, when you're older and wiser, and 10 years later again, right? Because you can contextualize it more. Maybe you've lived through the themes that are in the story. Maybe you lost someone that you loved, or whatever else. Right.
And previously you saw it as a kid, and a kid can watch a movie and describe the plot. But someone who's a bit older, with a bit more life experience, can maybe understand the film structure that's behind it. They can predict the ending based on something that happens in the beginning, because it's following a kind of narrative arc, even if they've never seen it before. Does that make sense? [01:07:08] Speaker 2: That's fascinating. Yeah, there's just a lot that one can do. I think the core challenge here is deciding on what to do, because there's just so much that you can do. [01:07:25] Speaker 1: For example, there's more to do than can ever be done. There's more to see than can ever be seen, you know? [01:07:29] Speaker 2: Yeah. For example, coding for me. [01:07:33] Speaker 1: That brings us into coding tools. [01:07:34] Speaker 2: That brings us into coding. Exactly. Where I want to get to now is, I don't like the act of coding itself. Some people love coding and love to do coding competitions and whatnot. I like coding because it's real. It can make my vision work. It puts my vision and my idea into reality, if I can implement it successfully, but I don't like the act of coding itself. I don't know if that makes sense, which is why I really like coding agents. They are not a magic pill that can do anything, but I really like this interface a lot, because if you know how to use AI coding tools, what they're good for and whatnot, and again, they will keep improving. I don't think the speed is up for interpretation or guessing how fast it will continue, but they are pretty damn amazing. And we can touch on how we use them, which I think is really exciting, because this is a skill that everyone just has to know.
But yeah, me thinking about what experiment I want to implement, what feature I want to implement, what visualization I want to have to understand the data better, to understand the problem better, to understand the results better, to interpret the behavior of the model better: all of those are things I want to do, I'm excited to do. But if I had to implement it, I'd be like, man, I wish I could just get what I want, because that's the interesting and difficult problem to solve, and not the coding itself. Depends on the problem. And now I have it, more or less. Again, there are limits, but when I use something like Claude Code, I have never been this productive in my life. I know there is research that shows how much software engineers think their productivity has risen versus how much it has actually fallen when using AI coding agents. I really don't know what those people are doing. But for me, I cannot imagine having this much output without a coding agent. And I'm not just talking about lines of code written, because I don't care about lines of code written. I care about experiments being run, where I can see the behavior, where I can see gradients flowing, where I can really understand the behavior. It's just that I don't have to use my finger muscles anymore to type it out; I actually just use my brain more in sketching out what I want to have and knowing the whole infrastructure. Knowing what the code does and knowing where to place the feature that I want, and then just telling my agent in exact detail: this is the infrastructure, this is how it looks, I want to implement this new behavior, let's do this, let's create a planning document. And then I iterate on this planning document, because I know what I want to have as a behavior downstream. I know some design patterns that I want to maintain so that the code is roughly maintainable. And then I let the agent go. And if I spend enough time defining the scope of the design doc, basically it's pretty damn good. It's crazy.
[01:11:17] Speaker 1: So, I'm obviously someone who's working in this space, you know, AI, right. Both of us are. And I'm also someone who's not only working in the space: my friends are in the space, I spend most of my time traveling and talking to people who know about this space, or reading news about the space. And even for me, it's moving incredibly fast. Like, it's funny, this research that we have, you know, that could be from six months or a year ago. And even if the research was released at a certain point, it was done strictly before the point it was released, right. It's potentially looking at a different set of models, a different set of tools, and hence it's already out of date. Research from earlier this year, or I should say earlier in 2025 and not earlier this year, because there's so much time left in this year, but research from earlier in 2025 was likely done at the end of 2024, right, possibly using GPT-4 or whatever it was at the time. And new models are progressing at this kind of crazy pace. Like, I personally had not used Claude Code until last weekend. And I'm now using it for my second side project. And it's incredible. It's really quite impressive how well it works. And again, I'm not new to vibe coding. I'm not new to using code generation tools, right. I've obviously used a lot of different things from different providers. You know, as early as ChatGPT appearing, I was very proactive in creating a whole bunch of interesting side projects. But, I mean, it's really a question right now still of how to use these things best. The best practices are still emerging. Right.
Like if you give it a kind of prompt to just go and do something, it can do it. But if you can give it, as you said, a rich structure, like iterate on the design, then it can do something much grander, much more complex. Um, you know, there's recent research from Cursor suggesting that these things can run for days, unsupervised, executing very large designs, generating millions of lines of code. I mean, that's insane. Like, I'm reading anecdotes of people, you know, running this thing overnight and it, you know, outputting whole features or whatever. And I believe it. From the two use cases I've seen, I have no reason to doubt these claims. [01:14:00] Speaker 2: Um, I mean, no, I just, yeah, I also cannot fathom how powerful it is. Again, it's not a solution to everything. And as with an intern that you want to supervise, the quality of the intern's output rises proportionally to the quality of the input that you give the intern. Right. The intern does a better job if you explain better what the intern is supposed to do. And that's how it feels with Claude Code. Now there are constant developments in other open source wrappers, more or less around Claude Code, that implement this planning phase for you, so that you don't have to write this careful prompt yourself, but you write what you want at a high level. And then you get prompted to fill in the missing gaps. So it's constantly evolving. And you don't have to be on top of every single new version of agentic tools. You don't have to be hopping around. Oh, now Claude Code is this. Oh, now it's Codex. Oh, now it's Gemini CLI. Oh, now it's the Cursor Bugbot or whatever. I mean, if the overhead kills your productivity, it's not worth it. Right. I mean, sure, Claude Code has its bugs. Sure, Claude Code has this, like, flickering or whatever if you have too much context on the terminal screen and the terminal is too small.
Like there are bugs. Sure. But come on, guys, it's already crazy what it can do. Um, again, you need a new kind of discipline with these kinds of tools, which is to not just let it go completely freely. You still have to know what it's doing. So you still need this kind of discipline to actually look at the code that it's writing and not be like, it'll be okay. Unless it's stuff like visualization tools or whatever. Right. Then if the visualization looks like it should, we're good. Right. [01:16:08] Speaker 1: Um, yes, we're good. The visualization is safe. [01:16:12] Speaker 2: And the visualization is safe, unless you're OpenAI and you have a public press release and then the bar chart is a bit wrong. But the metrics have to be correct. Okay. I guess that's true. [01:16:26] Speaker 1: But I mean, it does strike me though. Like, I mean, some of the discourse here is going to be about, you know, what does this mean for programmers or software engineers? Um, and it does strike me that it's still important to understand the infrastructure around what's going on. It's still important to understand software infrastructure. It's still important to understand, you know, to be able to prompt correctly. Right. Like obviously if you say, hey, I want a website, it can generate you something, but obviously [01:16:57] Speaker 2: there's all these things online as well. And then you can share the localhost link with your friends. Yeah. [01:17:02] Speaker 1: Yeah. Right. Um, you know, similarly, like if you want to make an interesting open source project, you know, there's things like: what build system do we want to use? You know, do we want to use Docker? Do we want to do this or that? And, you know, do we want to put our credentials into a public GitHub repo? You know, um, and again, I mean, the accessibility here is incredible.
Um, I think this really does mean that even non-technical people can code quite productively. Um, but I definitely think the higher level skills, as you said, of designing and planning and understanding what a computer is doing, without getting bogged down in the, like, should I use a for loop or a while loop, is a game changer. Um, actually, so one of the guys who I met on Friday, um, a founder of a startup in Berlin, he's not technical. So I used to work with him 11 years ago or something in Google Dublin. He was in sales. He's since gone on to various other companies and then later some startups. And he's founded his own AI startup called formidable.ai. Um, I believe they don't have a website yet. Um, but I think he'll be glad for the plug. Um, but he's building applications using Claude Code. And I was asking him, like, why Claude Code and not Lovable or one of these, like, IDE-based, more accessible, more kind of friendly things. And he sort of suggested, well, Claude Code can do everything, and I don't need to be constrained by any of the constraints of the IDE. He doesn't really know IDEs historically anyway, so it's not like it feels familiar. It's like, um, it's incredible. And he was even showing us, he was like, oh, you know, I had some demo to some investors and I vibe coded this, you know, in Claude Code, this amazing feature tool. [01:19:02] Speaker 2: I want to, but I also want to be clear here that implementing a demo is easier than a full-blown production system with full security systems in the back end. So I am certain that you still need humans in the loop for a while. How these humans will be integrated into the loop is going to change every six months, I guess, or whatever.
But yeah, it's like, the scaffolding around the infrastructure that you're building, like the systems of testing such that you don't have bugs in this, you don't have security leaks and all of that, might be even more important than actually the implementation. Because if you have some, yeah, guardrails around the system that you want to implement, so that you catch known errors or known vulnerabilities, then you can maybe even give your agent more freedom. I don't know. This is just me guessing. And I think people just need to figure out how to best use these tools, right? Again, they can't solve everything, but they are just very, very powerful. The, well, I just realized that I was actually [01:20:16] Speaker 1: at a really interesting workshop at NeurIPS in December, which we can link in the show notes, because there's a whole set of papers that they were presenting on this exact thing, which is like agentic code generation, how to optimize the models to do better at these things. And so there's a whole lot of interesting research that people can read and educate themselves on where it's going. But a lot of what they're doing is focusing on non-functional requirements right now. So, you know, when you generate the code, can it pass the tests is one question, but you can also have like verifiers, or run the code and measure certain qualities, to see if you have other parameters you want to optimize for, right? So for example, if you have a style guide, if you want to say we only use two spaces rather than four spaces in Python, or, you know, we don't use certain sorts of classes or libraries, or we do certain things, if you can check this style at, you know, RL time, then you can RL for that, right? So the code is not only functional, but it's also aligned with your style guide.
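A style check of the kind described here can be written as a small verifier that runs next to the functional tests. The sketch below is only an illustration, not something presented at the workshop: the function names, the two-space rule, and the forbidden-module list are all assumptions made for the example.

```python
import ast
import io
import tokenize


def uses_indent_width(source: str, width: int = 2) -> bool:
    """Return True if every new indentation level adds exactly `width` spaces.

    Assumes `source` is valid Python; tokenize raises on malformed input.
    """
    levels = [0]  # stack of indentation depths currently open
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.INDENT:
            depth = len(tok.string)  # tok.string holds the line's leading whitespace
            if "\t" in tok.string or depth - levels[-1] != width:
                return False
            levels.append(depth)
        elif tok.type == tokenize.DEDENT:
            levels.pop()
    return True


def forbidden_imports(source: str, forbidden: set[str]) -> set[str]:
    """Return the top-level module names in `forbidden` that `source` imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in forbidden:
                found.add(top_level)
    return found


def style_ok(source: str, forbidden: set[str]) -> bool:
    """Hypothetical house rule: two-space indents and no forbidden modules."""
    return uses_indent_width(source, 2) and not forbidden_imports(source, forbidden)
```

A check like this returns a plain yes or no for each generated sample, which is what makes it usable as a training signal as well as a review-time guardrail.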
Similarly, if you can measure the runtime of the code, then you can not only do functional completion, but also performance, right? So optimize for the ones that are faster, not slower. And similarly, for things like security, you know, if you have some sort of static analysis that can identify known vulnerabilities, you can penalize the model whenever it uses those patterns, such that the probability of those gets pushed very strongly towards zero. And so a lot of these extra bells and whistles can be added on top of the RL from verifiable feedback in companies that are building these foundational models for code generation. And then of course, the probability that Claude Code uses a forbidden library goes sort of close to zero, which is nice. [01:22:01] Speaker 2: Yeah. Fascinating. Yeah. This is fascinating. And I think this is especially interesting because again, this just shows that there's just still so much interesting work to do. And this is again, still only within the scope of LLMs, because they are very fascinating. But yeah, now I think I need to think about how I can implement better guardrails for my little agent. My guardrails, it's still manual guardrails. My guardrails is me. [01:22:37] Speaker 1: Yeah. I mean, what's interesting here is there's like, you can do stuff and then also the model provider can do stuff. Like, you know, you can obviously run some static analyzers or eyeball checking of things over the code and say, ooh, we should never have done this. But, um, maybe it's even better if the model providers can do this, right? That like, when it generates code, not only does it generate code that is correct for the problem, but it's also optimal in its performance characteristics and also compliant with the PEP 8 Python style guide and has no known security vulnerabilities. Then you're like, I mean, most engineers cannot do all of the above, right?
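The idea in this exchange, combining pass/fail tests with style, speed, and security signals, can be sketched as a single scalar reward. This is a minimal illustration under stated assumptions: the field names, the penalty weights, and the rule that failing tests zero out the reward are all invented for the example, and nothing here describes how any model provider actually trains.

```python
from dataclasses import dataclass


@dataclass
class CheckResults:
    """Outcome of running the verifiers on one generated solution (hypothetical schema)."""
    tests_passed: int
    tests_total: int
    lint_violations: int       # e.g. count reported by a linter
    vulnerable_patterns: int   # e.g. count reported by a static analyzer
    runtime_seconds: float


def reward(
    results: CheckResults,
    runtime_budget: float = 1.0,
    lint_penalty: float = 0.05,
    vuln_penalty: float = 0.5,
    slow_penalty: float = 0.1,
) -> float:
    """Combine functional and non-functional checks into one reward in [0, 1]."""
    # Functional correctness gates everything else: no credit unless all tests pass.
    if results.tests_total == 0 or results.tests_passed < results.tests_total:
        return 0.0
    score = 1.0
    score -= lint_penalty * results.lint_violations
    score -= vuln_penalty * results.vulnerable_patterns
    if results.runtime_seconds > runtime_budget:
        score -= slow_penalty
    return max(score, 0.0)
```

With these assumed weights a known vulnerability costs ten times as much as a lint violation, which is one way to express "push that pattern strongly towards zero" while still treating style as a soft preference.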
[01:23:18] Speaker 2: It's like, just shove everything into the CLAUDE.md file. Here, follow this: you always use the Ruff linter, always use Black, or whatever your preference here is. Well, I mean, say it can go in the CLAUDE.md file or, yeah, [01:23:36] Speaker 1: or it can be baked into the model, right? Like, because if you think about it, I mean, how an RL verifiable feedback loop works is it generates some code and then some tests are executed against the code, and those tests obviously, you know, might be running Python or C++ or Golang or whatever, but you can also quickly, in parallel, run the linter, right? And then, you know, the model doesn't just get 1.0 if it passes the test. It gets 1.0 if it passes tests and the linter is satisfied and the LLM-as-a-judge thinks that there's, you know, a good level of commenting or [01:24:11] Speaker 2: whatever, you know? You know what this brings me to? This is exactly what I also think is really interesting, but that to me is where the very fine line between research and just productization is, you know. And this is like an open question also to me: what's the difference between research and productization here? And as a researcher, I have my own subjective position on what is research and what is engineering. And everything that you just described, and a lot of the work that is happening at these frontier labs, is just a lot of engineering and productization: making the models more efficient such that the margins can be higher, building out the infrastructure, making, like, research on, for example, um, DeepSpeed, right? This is like a huge paper, but that's just pure engineering, right? Basically. And what you just said: the main challenge of all of RL post-training is infra. I mean, everyone is still using GRPO at this point. Okay, sure. Then you have Dr.
GRPO that just says, ooh, don't use a KL penalty. And then another paper says, ooh, maybe the clipping range, just make it non-symmetric, and like, ooh, okay, that's a full paper. What? But it's all just, the whole challenge here is infra and engineering. Um, so for me, it's going hardcore into the productization phase, which makes sense. I view these tools, I view these, yeah, these products as tools for me, whether it's AGI or not. Honestly, this is also maybe like a hot take, but I don't really care about whether this is considered AGI or not. I think it's a really, really interesting philosophical question: what is AGI, and when have we achieved it, and whatnot. I really like it from a research perspective, but from what is currently going on, it feels like it's just, yeah, people have done research on how to scale up these models, how to decide on patterns that work. And now it's making it scalable, building cool products, building something like Claude Code, building something like Deep Research, building something like, I don't know, integrating Gemini into Google Sheets and Google Docs and all of that. And how to context engineer: is that research, context engineering? I mean, it can be. [01:27:15] Speaker 1: So, I think that, I mean, the line here in industry is very blurry, uh, between the two. But I think that, you know, I mean, there's different degrees of research, right? There's research where it's not groundbreaking, blue sky, crazy far out, like there are those things, but rather more small incremental progress, right? You know, there's a paper that shows something works, right, at some small scale with some toy dataset. And then there's another paper that shows that it works at a large scale on a more real dataset, right? That's still research. It's very incremental.
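For readers who haven't seen the pieces named a moment ago (GRPO, leaving out the KL penalty, a non-symmetric clipping range), here is a rough sketch of how small those changes look in code. It is a simplified, per-sample illustration and not a faithful implementation of any specific paper: the function names and default values are assumptions, and the KL term is simply omitted.

```python
from statistics import mean, pstdev


def group_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Group-relative advantages: score each sample against its own group.

    GRPO-style training samples several completions per prompt and normalizes
    each reward by the group's mean and standard deviation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def clipped_term(
    ratio: float,
    advantage: float,
    clip_low: float = 0.2,
    clip_high: float = 0.2,
) -> float:
    """PPO-style clipped surrogate for one sample, with no KL penalty added.

    `ratio` is new_policy_prob / old_policy_prob. Setting clip_high larger
    than clip_low gives the non-symmetric clipping range.
    """
    clipped_ratio = min(max(ratio, 1.0 - clip_low), 1.0 + clip_high)
    return min(ratio * advantage, clipped_ratio * advantage)
```

Moving from symmetric to non-symmetric clipping is a one-argument change here, which fits the point being made: the algorithmic deltas are small, and most of the effort sits in the infrastructure around them.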
Like, it's not a new paradigm. It's not a new infrastructure, a new way of viewing the world, but it certainly provides value. Um, you know, obviously once something is fully known and fully understood, then it can just become engineering, in the sense that you can say, okay, here's a team, go do this. Right. But, you know, if you think about a lot of the work I just described, right, this RL for A, B, C, D, all of these non-functional qualities, all of that work would have started in parallel in different universities around the world or different companies. And of course, to some degree, I mean, RL from verifiable feedback was kind of known to work, right? People had already started doing it for mathematics, for, um, programming. But, you know, initially the question was just like, can it pass the tests? And of course, I don't think it was super hard to believe that if it can generate code and it can generate code to a specification, like, of course it can, you know, be learned, it can be baked in. Right. So I don't think it was ever going to be groundbreaking research to say, hey, we use the linter at RL time. But, you know, it's a nice paper. It explains it well. Once that paper's out there, it helps other people who maybe don't have as much expertise to kind of grok the concept. Right. And maybe they apply it in their work, or maybe they just generalize it. Maybe they say, well, if it can do a linter, it can do whatever other static analyzer. Um, so I mean, the boundary is fuzzy. Um, but, you know, both are valuable. Right. And historically in companies, um, there's been this problem, this gap between exploration and exploitation, right? You often have some part of the company that's in charge of exploration. They produce research, often in the form of research papers that go public. And then you have other parts of the company that do exploitation.
They build products, products to have, you know, users generate revenue. And oftentimes these things are not well translated or communicated. The team that makes the research is being rewarded for putting papers out there, maybe for getting citations, the team that makes the products wants to make money, but they don't know about these papers or if they do, it's through their own personal exploration. Um, and so they don't necessarily implement the latest and greatest things, right. That are available. So it's hard for organizations to kind of square this circle, right. To pipe these ideas over. So if you have this stepwise innovation that kind of says, not just that this is a thing, but like here, it actually works, you know, here, it's kind of almost like a how to recipe on how to do it. Then of course, you're going to get greater application or utilization of it. [01:30:55] Speaker 2: Yeah. No, for me, I, I, I like blue sky research. I think it's a really, really interesting. That's that I, that's why I'm just, I, I, I respect any kind of work that provides value to people. So I, I, I really love research as well, because that is as, as you described other people doing work for multiple months that you then read up on and you're like, okay, ah, cool. If assuming the work has been done properly and there's been no cheating or like bugs and all of that, which is always bugs. Which there always is. You can be like, okay, cool. Nice work team. Now we can use that for our work and we don't have to do this exploration on our own. Right. I, I really, really love that about research. Um, but just subjectively speaking, ah, I like more blue sky research for me. Research feels like, I want to understand reality itself. I want to understand what is intelligence. What does it mean to be an intelligent being and creating an intelligent artificial intelligence, I guess. Right. Whereas for me, I'll, at some point in my career, I was worried. 
I fell for this hype cycle. Like, oh my God, I will lose my job, I'll be worthless and all of that. But now, or like a year ago at some point, I really got more and more into research and figuring out what the fundamentally limited aspects of this work are. For me, it's like, these are amazing, amazing tools that will change the world, or how we operate in this world. But I'm not in the camp of, okay, we have solved intelligence. Right. For me, I also don't really care about that. All I care about is just learning cool stuff, no matter which domain: AI for science, um, open-endedness, graph neural networks, reinforcement learning. I like learning, and these tools make it easier. They make our lives easier. Just as at some point in life, people were writing assembly code and all of a sudden you had a programming language like Visual Basic or whatever, which was like an upgrade. It was an abstraction higher up, and it changed how the world of programming worked. Right. That was a big jump, going from assembly to, like, programming languages. And now it's another similar jump, where you now have a different interface, which is not perfect from the beginning. People will have to know how to use it, yada, yada, yada, everything that we already talked about. But yeah, I view these as just really cool products that make my life easier. But there's still a lot of interesting and challenging stuff to do in this world, which I think is a good thing, and which hopefully takes a bit of the stress off other people. Because it's like, okay, maybe Dario says that all code will be written by AI by the end of this year or next year. Maybe it will happen. Maybe not. But I think that there will still be challenges in this world that need people who can solve them using the tools that we have at hand.
[01:34:35] Speaker 1: So I actually come in on the opposite side of the research appreciation spectrum to you in the sense that I actually like things that are almost like near term impact, like things that can be used soon. Um, like I, I, I mean, I guess that's the space I work in by choice, but, um, you know, I like this applied research. Like I love when I read a paper and I think that's a really clever idea. It makes sense to me how that works and that it would be like, not hard to implement it. I think that's the ones that really like excite me, um, kind of things that I read and I immediately want to go to work and hack it together, uh, or, you know, implement as a side project or something. [01:35:16] Speaker 2: But, Oh, yeah, a hundred percent. Like I, I agree though. Everything has its value. It's, it's, it's mostly just semantics, I guess. Um, but I, I mean, I do like, there is obviously these papers [01:35:26] Speaker 1: where they're much more grand in their vision and it's much more abstract and it's much more the kind of thing that will get discussed on MLST in great detail. And I, you know, those are amazing sometimes, but personally, I definitely gravitate towards these like incremental papers in a, in a space, especially once I'm in that space. Right. I mean, the vast majority of papers are incremental. Like most people are not thinking about these blue sky, grand visions, right? They're, they're thinking, how can I get like enough publications to get a PhD or [01:35:57] Speaker 2: um, yeah. Yeah, no, definitely. Yeah, Max, this, this was, this is the first episode of this podcast and it will only become better from this point. We'll only be learning how to be better hosts. Um, I think in this episode, we touched on the rough vision of this podcast that we call signals and stories because we want to get a lot of, we want to filter out the signal of the noise, out of the noise. 
And we want to learn about the stories of the people behind the work that's been done in AI, be it research or product or engineering or whatever. And investments, just everyone that's related to AI. It's just learning about all the different diverse possibilities in this space and just trying to share value with people. Right. That's why we talk about this interesting concepts, interesting ideas. And hopefully out of all of this chit chat here, there was hopefully some new knowledge that the listeners could take away, um, and just find interesting or directly implement [01:37:10] Speaker 1: into their own work. Yeah. I mean, there's two things I want to say before finishing. One is actually when we were trying to record this, um, it kept failing. Um, so I kept getting an error that whenever Boris would do the, uh, the test recording that I would get like this error message that said that it wasn't able to record, even though it would do the countdown. And, uh, we went through some of the debugging instructions on the applications website to no avail. I restarted my computer to no avail. And then we asked Claude and Claude said, you need, you should potentially disable the GPU acceleration settings of Google Chrome. And then it will work. I didn't know Chrome had GPU acceleration settings, but I disabled them and then it worked. That's pretty impressive. [01:37:55] Speaker 2: You know what? I, I, I take back my statement that we don't have AGI. When, when this happened, I was like, this, this, this can't be like Claude just fixes this. And oh, okay. AGI confirmed. AGI confirmed. Now that was, yeah, it's, it's, it's great. There's, there's, there's still so much to talk about. Um, and I hope that in the following episodes, we'll get interesting guests. 
Again, be it from research, PhD students, um, full-blown research scientists from startups, from big tech, engineers from startups, from big tech, whoever we can get our hands on who has an interesting background and diverse views on AI. Um, I think all of that can bring out a lot of the signal [01:38:42] Speaker 1: out of the noise. One thing that I find fascinating right now, producing any kind of content, is the fact that this will be the training data for future models. Um, like, this video will go out on YouTube or somewhere and it will get transcribed, and the video might be ingested, but also the transcript itself will be used as training data. Right. And so whatever things we talk about will be internalized into the future model, you know, so potentially we can bake in backdoors. Oh, but it's interesting, because when someone asks questions that are related to this, there'll be some ghosts of our conversation in the response that they get in the future. Right. [01:39:23] Speaker 2: Ghosts in a shell. Our ghosts are the shell of the LLM. [01:39:27] Speaker 1: Exactly. Yeah. I think that's really cool. Uh, because, you know, especially if we're talking about something that's not super mainstream, that's maybe a little more off the beaten track, maybe it will be more valuable as future training data. [01:39:45] Speaker 2: It is. That's true. Okay. Wait, we're having fun, but now we're going to call it a day. This was lovely. Thank you everyone for watching and for staying. Yeah. It's half past 11 in the evening here on a Sunday evening, but it's phenomenal. Okay. Everyone, thank you for sticking with us this far into the podcast. I hope you had just as much fun as we did, and let us know what you want to learn about. Actually, do you have any guests in mind? Do you have any recommendations?
And we are very happy to read all of the comments that you have and build up this podcast such that you also enjoy the content and you can hear whatever you want to hear. Max, have a lovely rest of your day. I'll get ready for work tomorrow. Really excited. Enjoy. Okay guys. Thank you everyone. Bye. All right.
