Wednesday, August 13, 2025

GPT-5, Google DeepMind Genie 3, Cloudflare vs. Perplexity, OpenAI’s Open Source Models, Claude 4.1 & New Data on AI Layoffs

GPT-5 finally landed, and the hype was matched with backlash. In this episode, Paul and Mike share their takeaways from the new model, provide insights into the gravity of DeepMind’s photorealistic Genie 3 world-model, unravel Perplexity’s stealth crawling controversy, touch on OpenAI’s open-weight release and rumored $500 billion valuation, and more in our rapid-fire section. 

Listen or watch below, and scroll down for show notes and the transcript.

Listen Now

Watch the Video

Timestamps

00:00:00 — Intro

00:04:57 — GPT-5 Launch and First Reactions

00:25:29 — DeepMind’s Genie 3 World Model

00:32:20 — Perplexity vs. Cloudflare Crawling Dispute

00:37:37 — OpenAI Returns to Open Weights

00:41:21 — OpenAI $500B Secondary Talks

00:44:26 — Anthropic Claude Opus 4.1 and System Prompt Update

00:49:57 — AI and the Future of Work

00:56:02 — OpenAI “Universal Verifiers”

01:00:42 — OpenAI Offers ChatGPT to the Federal Workforce

01:02:59 — ElevenLabs Launches AI Music

01:05:32 — Meta Buys AI Audio Startup

01:09:46 — Google AI Pro for Students

Summary:

GPT-5 Launch and Initial Reactions

OpenAI has unveiled GPT-5, calling it its smartest, fastest, and most useful model yet.

It’s the first “unified” system from the company, combining quick-response chat with deeper reasoning when needed. You don’t need to tweak any settings. Instead, GPT-5 routes each request to the right type of model for the job, depending on whether it needs to think for longer or act fast.
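Conceptually, that routing works like a dispatcher sitting in front of several back-end models. The sketch below is purely illustrative: the model names and heuristics are invented for this example, since OpenAI has not published how GPT-5’s actual router decides.

```python
# Toy model router: picks a backend from crude prompt heuristics.
# The backend names and rules are hypothetical, not OpenAI's real logic.

REASONING_HINTS = ("prove", "plan", "analyze", "step by step", "debug")

def route(prompt: str) -> str:
    """Return the name of the (hypothetical) backend for a prompt."""
    text = prompt.lower()
    # Long prompts or prompts with reasoning-style verbs go to the
    # slower model that "thinks" longer; everything else stays fast.
    if any(hint in text for hint in REASONING_HINTS) or len(text.split()) > 200:
        return "deep-reasoning"
    return "fast-chat"

print(route("What's the capital of France?"))          # fast-chat
print(route("Plan a step by step migration to Rust"))  # deep-reasoning
```

A production router would likely use a learned classifier rather than keyword rules, but the shape is the same: one entry point, several specialized models behind it.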

The company says it outperforms earlier versions in coding, writing, health advice, and multimodal reasoning, with big reductions in hallucinations and a more honest approach when tasks can’t be completed. 

It also has a context window of 400,000 tokens and a maximum of 128,000 output tokens. And OpenAI notes that it hallucinates significantly less, with responses about 45% less likely to contain factual errors than GPT-4o’s.
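To put those numbers in perspective, a request only fits if the prompt plus the output reserved for the model stays inside the context window. A quick sanity check (the token counts passed in are made-up inputs, and the assumption that output tokens count against the shared window is ours, not something OpenAI has spelled out here):

```python
CONTEXT_WINDOW = 400_000   # GPT-5 context window, per OpenAI
MAX_OUTPUT = 128_000       # maximum output tokens, per OpenAI

def fits(prompt_tokens: int, output_tokens: int = MAX_OUTPUT) -> bool:
    """True if a prompt plus its reserved output fits in the context window."""
    if output_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + output_tokens <= CONTEXT_WINDOW

print(fits(250_000))  # True:  250k in + 128k out = 378k <= 400k
print(fits(300_000))  # False: 300k in + 128k out = 428k >  400k
```

In practice you would measure `prompt_tokens` with the model’s tokenizer rather than guessing.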

For coders, GPT-5 can spin up full apps from a single prompt, with better design sensibility and debugging skills. For health, it’s far less error-prone and more proactive about flagging issues, though it’s still no substitute for a doctor. Creative work also gets a lift, with more nuanced writing and “better taste” in design.

The launch includes GPT-5 Pro for extended reasoning, new preset personalities that change how the model responds, and API access in three sizes. Free users now get GPT-5 as the default, while Plus and Pro subscribers get higher limits and Pro access.

DeepMind’s Genie 3 World Model 

Google DeepMind has unveiled Genie 3, a breakthrough “world model” that can generate fully interactive, photorealistic environments in real time. Unlike earlier versions, Genie 3 can render at 24 frames per second, maintain visual and physical consistency for minutes at a time, and respond instantly to both navigation and text-based prompts.

The model can simulate anything from volcanic landscapes to enchanted forests, or recreate historical sites like ancient Athens—all based on a short description. Worlds evolve dynamically as you explore, and “promptable world events” let users change conditions on the fly, from altering weather to adding new objects.

This realism isn’t just for show. DeepMind sees world models as a key step toward AGI, offering limitless training grounds for AI agents to learn and adapt. Genie 3’s long-horizon consistency means agents can now tackle multi-step goals, opening the door for complex simulations in robotics, education, and science.

Still, the tech has limits: short interaction durations, constrained actions, and challenges with simulating multiple agents or perfectly accurate real-world locations. For now, it’s in a limited research preview, but DeepMind calls it a “significant moment” in the evolution of generative environments.

Perplexity vs. Cloudflare Crawling Dispute

Cloudflare says AI search startup Perplexity has been disguising its web crawlers to bypass site blocks, a practice known as “stealth crawling.”

According to Cloudflare, when Perplexity’s bots hit a robots.txt rule or a firewall block, they sometimes swap their declared user agent from “PerplexityBot” to something resembling Google Chrome on macOS, and rotate through IP addresses that aren’t on Perplexity’s official list.
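robots.txt enforcement is advisory and keyed entirely off the user-agent string a crawler chooses to declare, which is why swapping that string sidesteps the rule. Python’s standard library shows the mechanics; the robots.txt content below is a made-up example for illustration, not Perplexity’s behavior or any real site’s policy:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks PerplexityBot, allows everyone else.
rules = """
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A crawler that honestly declares itself is blocked...
print(parser.can_fetch("PerplexityBot", "https://example.com/article"))  # False

# ...but the same request under a browser-like user agent is allowed,
# because the rules only match the declared name.
print(parser.can_fetch(
    "Mozilla/5.0 (Macintosh; Intel Mac OS X) Chrome/126.0",
    "https://example.com/article"))  # True
```

Nothing in the protocol verifies identity, which is why Cloudflare falls back on network-level signals (IP ranges, TLS fingerprints) to detect crawlers that misdeclare themselves.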

Cloudflare says the company also changes its network identifiers to dodge detection, a tactic it claims has been used across tens of thousands of domains, making millions of requests each day. Perplexity is pushing back hard against Cloudflare’s claims.

In a detailed rebuttal, Perplexity denies intentional wrongdoing, calling Cloudflare’s post a “publicity stunt” and saying the company mixed up legitimate, user-triggered requests with bot activity, and even confused some of it with unrelated traffic from a tool called BrowserBase.

According to Perplexity, its AI assistants aren’t traditional web crawlers. They don’t systematically scrape and store the internet. Instead, they fetch specific pages in real time when a user asks a question, use that content to answer, and discard it with no training or long-term storage.

Perplexity argues this is no different from a browser or email client fetching a page on a user’s behalf, and warns that labeling such requests as “malicious” risks breaking legitimate tools and creating a two-tier internet where access depends on infrastructure gatekeepers.

Cloudflare has now delisted Perplexity as a verified bot and rolled out new methods to block its crawlers.


This episode is brought to you by our Academy 3.0 Launch Event.

Join Paul Roetzer and the SmarterX team on August 19 at 12pm ET for the launch of AI Academy 3.0 by SmarterX, your gateway to personalized AI learning for professionals and teams. Discover our new on-demand courses, live classes, certifications, and a smarter way to master AI. Register here.


This week’s episode is also brought to you by Intro to AI, our free, virtual monthly class, streaming live on Aug. 14 at 12 p.m. ET. Reserve your seat and attend for a chance to win a 12-month AI Mastery Membership.

For more information on Intro to AI and to register for this month’s class, visit www.marketingaiinstitute.com/intro-to-ai.

Read the Transcription

Disclaimer: This transcription was written by AI, thanks to Descript, and has not been edited for content. 

[00:00:00] Paul Roetzer: So the question has always been, does OpenAI have a secret sauce? Is there something they’re doing that was gonna allow them to get that six to 12 month lead over everybody else? The answer is no. Welcome to the Artificial Intelligence Show, the podcast that helps your business grow smarter by making AI approachable and actionable.

[00:00:20] My name is Paul Roetzer. I’m the founder and CEO of SmarterX and Marketing AI Institute, and I’m your host. Each week I’m joined by my co-host and Marketing AI Institute Chief Content Officer, Mike Kaput, as we break down all the AI news that matters and give you insights and perspectives that you can use to advance your company and your career.

[00:00:41] Join us as we accelerate AI literacy for all.

[00:00:48] Welcome to episode 161 of the Artificial Intelligence Show. I’m your host, Paul Roetzer, along with my co-host Mike Kaput. We are recording Monday, August 11th, at 11:00 [00:01:00] AM-ish Eastern Time. Our long-awaited GPT-5 has arrived. Our team was messaging us on Friday, like, are we gonna do an emergency podcast to talk GPT-5?

[00:01:11] And I’m like, you’re gonna get these AI Academy courses finished, or you’re gonna get an emergency podcast. So Mike and I chose to focus on getting the AI Academy courses ready for launch instead of the emergency pod, but we’ll have plenty to discuss about GPT-5 today. All right, so this episode is brought to us by AI Academy by SmarterX, which I was just talking about.

[00:01:35] We are having our kind of relaunch event, I guess. We first introduced AI Academy in 2020. We have spent the last almost year now completely re-imagining what Academy is, how it functions, the technology behind it, how to infuse AI into it, the overall learner experience, how to build learning journeys. Like, everything has just been completely revised, updated, [00:02:00] improved.

[00:02:01] And so on August 19th at noon Eastern Time, we will have a launch event. There’s a webinar you can sign up for to hear all about it. We’re gonna go through the vision and roadmap for AI Academy. We’re gonna talk about all the new on-demand courses and professional certificates that we are developing and launching that day.

[00:02:19] A bunch of ’em are coming out that day. We’re gonna talk about the new AI Academy Live, which I’m super excited about, which is gonna be a regularly scheduled occurrence where members are actually gonna be able to join in live and talk, you know, not only with Mike and me, but go through deep dives, AI transformation spotlights, book clubs, things like that.

[00:02:36] There’s a new learning management system coming later this year. We’re gonna preview that, how to build personalized learning journeys. We’re gonna talk about new business accounts, where companies, universities, people can come in and get five-plus licenses. You get a whole bunch of features and benefits specific to those, plus dramatically reduced pricing.

[00:02:56] And then we’re gonna have an Ask Us Anything session with me and Mike and Kathy. [00:03:00] So all kinds of stuff coming out. We have a new AI Fundamentals series, a third edition of our Piloting AI series, a second edition of our Scaling AI series, which I am finalizing literally between meetings and the podcast today.

[00:03:14] Mike did a new AI for Professional Services series, and Mike also created a new AI and Marketing series. So all of these are launching along with a bunch of other stuff. So go to SmarterX.ai; at the top of the page, there’s a banner you can click on to register for the webinar, and we’ll also drop that link in the show notes.

[00:03:33] So again, that webinar is free and it is happening on August 19th. This episode is also brought to us by Intro to AI. This is a class I have been teaching free every month since November or October of 2021. We are having our 50th edition of Intro to AI. This is happening Thursday, August 14th at noon, so you can register.

[00:03:57] We’ve had, I think, close to [00:04:00] 40,000 people go through this class since I started doing it almost four years ago. It’s about 30 to 35 minutes. I do a live, kind of, walk through the fundamentals of AI, and then we leave the last 25 minutes for questions. We usually get anywhere between 50 and a hundred questions.

[00:04:16] We do our best to answer as many as we can, and then for the ones we can’t get to, the week later we do an Intro to AI special for the podcast where we go through a bunch of the other questions that we got. So, Intro to AI, 50th edition, coming Thursday, August 14th, and then we’ll do a follow-up podcast with some questions we didn’t get to.

[00:04:38] So I’ll put a link to Intro to AI in the show notes as well. And we will share all of that information. Alright, so, two great live events coming up, August 14th and August 19th. Check those out. And now, Mike, the long-awaited GPT-5. Let’s get into it.

[00:04:57] GPT-5 Launch and First Reactions

[00:04:57] Mike Kaput: All right, so first [00:05:00] topic: predictably, OpenAI has unveiled GPT-5.

[00:05:03] They’re calling it their smartest, fastest, and most useful model yet. It is the first unified system from the company. It combines quick-response chat with deeper reasoning when needed. You don’t really need to tweak any settings. Instead, GPT-5 will route your requests to the type of model that it deems correct for the job, depending on whether it needs to think for longer or act faster.

[00:05:31] The company says it outperforms earlier versions in coding, writing, health advice and multimodal reasoning. There are big reductions in hallucinations, and it says it has a more honest approach when tasks cannot be completed. It also has a context window of 400,000 tokens and 128,000 max output tokens.

[00:05:51] Now, another note on those hallucinations: OpenAI says it has significantly fewer hallucinations than GPT-4o and is [00:06:00] 45% less likely to contain factual errors compared to GPT-4o. For coders, GPT-5 can spin up full apps from a single prompt. It’s got really good design sensibility and debugging skills.

[00:06:12] For health, it is far less error prone and more proactive about flagging issues. And creative work has also gotten a lift, with more nuanced writing and better taste in design. Now, this launch includes GPT-5 Pro for extended reasoning. There are new preset personalities that change how the model responds, and API access across three different model sizes.

[00:06:35] Now, free users are getting GPT-5 as the default, while Plus and Pro subscribers get higher limits and access to GPT-5 Pro. Now Paul, there’s a lot to unpack. There are a few different angles we’re gonna talk about here, but maybe let’s kick off by saying: what are your initial impressions of GPT-5?

[00:06:56] Paul Roetzer: A lot of my initial impressions come from [00:07:00] curating opinions of other people online whom I trust. And, you know, I’ve read lots of their reviews, and I have experimented with it a bit myself. I was working on the courses all weekend, so I couldn’t, like, really put it through a bunch of experiments, but I was, you know, dabbling in it.

[00:07:18] So when you follow the people we follow online, they generally were the people who weren’t super happy about this. So my caveat here is: it seems like a really good model. It is not this life-changing model that we all kind of have been anticipating for, like, a year and a half now with GPT-5.

[00:07:41] It’s always been like, well, once GPT-5 gets here, then everything changes. So I will say, one, as part of AI Academy, we are introducing a new Gen AI app series, and Mike and I were talking this morning and he’s gonna do a GPT-5 review as the first course in [00:08:00] that series. So we’ll have more to say.

[00:08:01] It’s like a 15-, 20-minute product review, basically. So that’ll be dropping next week for Academy members. But here’s my take, Mike. You know, Ethan Mollick has a bunch of great stuff. Brian Brickman, our friend, too. People have done these great reviews.

[00:08:19] Allie Miller had it, people who had access to it beforehand. There are all these great reviews. So I’m gonna give more of, like, a zoom-out: what’s the impact here? So first, it is not multimodal from the ground up. So when they say unified model, what they mean is it’s still, like, four or five different models that are packaged as one thing called GPT-5.

[00:08:42] And then there’s a router that, based on your prompt, decides which model it’s gonna use: if it’s gonna use one that has reasoning, if it’s gonna use the traditional chat, if it’s gonna use image generation, video generation. Like, all of that’s not in a single model. So I, you [00:09:00] know, I assume GPT-6 will be that; it’ll be truly multimodal from the ground up.

[00:09:04] As far as I know, they didn’t give any updates on image generation or Sora, their video generation, as part of this. I think they made some tweaks to voice capabilities, maybe; I think they improved the voice a little bit. So we, on this podcast, have for a while talked about the confusion of model choice.

[00:09:24] When you would go into ChatGPT last week, there were eight models to choose from. And the point we always made was the average user has no idea what the difference is between those. You know, 4o or o3-mini, the average user has no idea. And so they would just use whatever default. And so our point was always: why, for the average user, would you make them choose from a list of models when they don’t understand the differences?

[00:09:51] And so it would seem that this router is sort of heading in the right direction, but it actually caused chaos, because [00:10:00] there is a small fraction of ChatGPT users who do understand what the different models are and have preferred models that they like to use. And what OpenAI did, sort of their first misstep, and we’ll go through a series of missteps that they made in this process, is that they almost just ignored

[00:10:20] the loudest, the most vocal online users, who do actually understand the different models and really liked some of the other models. ’Cause what OpenAI did is they turned on GPT-5 and removed all the other models. And then, when the router was doing its work, I’d go into ChatGPT, I’d give a prompt, help me write a business plan for this, you know, idea I have.

[00:10:43] I would have no idea which model it was actually using. So there was no transparency into what model was actually being used. And if there was a model I used to like, where I liked the tone, the personality, the style, the format, it was gone. And so people were pissed. By, like, the end of the day Thursday, people were [00:11:00] like, gimme my model back.

[00:11:01] Like, I want 4o. Like, I like talking to 4o. And so, kind of surprisingly, Mike, it’s almost like OpenAI didn’t understand their user base. Like, yeah, there were obviously people who wanted that choice, and then there was this other faction of people who obviously were very attached to specific models, almost, like, emotionally attached to 4o.

[00:11:27] And 5 is a very different personality. It responds in, like, shorter bursts. It’s not as, like, comforting and things like that. Like, it’s just missing some of that. So there was one user, and I didn’t know this guy previously on X, but I thought he gave a great synopsis.

[00:11:46] I’ll just read this one and put the link in. Alistair McClay is his name, and he said: OpenAI forgot who actually matters. Power users always lead the culture curve. They set the vibes for a product, especially in consumer software. They’re the [00:12:00] loudest, most passionate, and have the highest expectations. They are your biggest asset as a consumer company, and you need to keep them front of mind at all times.

[00:12:08] With the GPT-5 launch in ChatGPT, OpenAI seems to have been so focused on the benefits their new router, which automatically switches the underlying model without telling them, could provide to their less sophisticated users that they totally overlooked the user group that actually matters the most.

[00:12:25] If you put yourself in the shoes of a ChatGPT power user, it’s blatantly obvious. They will continue to want the ability to hard-switch between models. It’s obvious they will expect transparency into which model is being used by the router at any point in time. And most important of all, it’s obvious.

[00:12:40] They will expect to have a reasonable notice period before their existing models are deprecated. The response we saw was inevitable. The power users, who make up the majority of the noise online, quickly set the vibes of frustration, disappointment, and broken trust. People who used 4o or 4.5 for writing were suddenly left with no good [00:13:00] alternative.

[00:13:00] Plus users who had access to o4-mini and o3 suddenly found themselves with a 200-message weekly cap on GPT-5 Thinking and a router that wouldn’t tell them which model they were actually talking to. Not to mention, most people I’ve spoken to had no idea there’s now a cap on GPT-5 Thinking; you only find out when you hit it and lose access for the rest of the week.

[00:13:21] So that’s a pretty good synopsis of what was going on. And then OpenAI immediately realized this. Like, Sam Altman was in full-blown crisis communications mode by Thursday night, which told you they just missed this. Like, damn, they didn’t think this through. So Altman tweeted, and we’ll put links to all these tweets.

[00:13:43] So this was August 10th, on Sunday: If you’ve been following the GPT-5 rollout, one thing you might be noticing is how much of an attachment some people have to specific AI models. It feels different and stronger than the kinds of attachments people have to previous kinds of technology.

[00:13:59] And so [00:14:00] suddenly deprecating old models that users depended on in their workflows was a mistake. This is something we’ve been closely tracking for the past year or so, but it still hasn’t gotten much mainstream attention. People have used technology, including AI, in self-destructive ways.

[00:14:17] If a user’s in a mentally fragile state and prone to delusion, we do not want the AI to reinforce that. Most users can keep a clear line between reality and fiction or role play, but a small percentage cannot. We value user freedom as a core principle, but we also feel responsible in how we introduce new technology with new risks.

[00:14:34] So that’s the attachment thing. The rate-limit thing, like, almost just sideswiped people. Mm-hmm. So this is an interesting one, Mike, because not only did Sam tweet about this on Sunday, other OpenAI researchers were also tweeting about this. So you know that this one was, like, a real hot button internally and with their users.

[00:14:54] And the thing that I think about with this one is their restrictions [00:15:00] on capacity, compute capacity to do inference. So quick, like, you know: there’s compute to train these models, but then when you and I use them, that’s inference, when it delivers an answer. Reasoning, which is now baked into this, requires way more compute at inference than a standard chat, as does video, as does image, things like that.

[00:15:20] And so the fact that they’re straight up saying this is an issue with capacity opens the door for Google, in my opinion. Like, this is a really interesting play, where OpenAI’s lack of maturity in infrastructure when it comes to compute and data centers mm-hmm. is not as much of an issue for Google.

[00:15:41] So here was Sam’s tweet, again on Sunday. He said: Today we are significantly increasing rate limits for reasoning for ChatGPT Plus users, and all model-class limits will shortly be higher than they were before GPT-5. And then today, being Monday, or Tuesday, they expect to share their thinking on how we are going to make capacity trade-offs [00:16:00] over the coming months.

[00:16:01] Meaning: a lot of people like our product, we have 700 million users, and the more they use reasoning, the more we’re gonna just run outta capacity. Like, we have to set rate limits, but people don’t want ’em. And then there were a couple other OpenAI people who also talked about the rate limits.

[00:16:17] Then the other one was the first time we’ve seen this data, which I thought was very fascinating, Mike. We assumed, and we’ve talked about this, like I’ve said, I give talks all the time, I ask rooms of hundreds of people, like, who’s ever used a reasoning model? Who’s used o3?

[00:16:33] And you get, like, five hands. And so our, like, vibe check, or just, you know, eyeball check was: less than 1%, less than 3% of people have any clue what a reasoning model even is. And this was as of, like, a month ago. OpenAI verified that for us. So yes, the vast majority of OpenAI’s users have no clue that reasoning models exist or what they do.

[00:16:55] So they have 700 million users, and for many [00:17:00] people, GPT-5 is the first time they’re going to interact with a reasoning model, but they probably won’t know it now because it’s just baked into it. So Sam tweeted: The percentage of users using reasoning models each day is significantly increasing. For example, for free users, we went from less than 1% to 7%, and for Plus users, from 7% to 24%.

[00:17:25] Now, that’s a big jump, but that means that the people who were paying… the Plus is 200 bucks a month, right, Mike? Isn’t that the Plus?

[00:17:31] Mike Kaput: Plus is 20 and then Pro is 200. Okay, so the paying tiers. Yeah.

[00:17:36] Paul Roetzer: So of the people paying 20 bucks a month, only 7% were using the reasoning models, which is wild. Yeah. And that would tell you, like, once you go from 7 to 24, now all of a sudden the compute capacity demand becomes massive.

[00:17:49] And then three other quick thoughts here. The big question with GPT-5 that we’ve all been waiting for an answer to is: was it going to be a leap [00:18:00] forward over the other frontier models? GPT-4, when it came out in March 2023, was state of the art for a year and a half. It took a year and a half for Google and others to create something on par with GPT-4.

[00:18:12] So the question has always been, does OpenAI have a secret sauce? Is there something they’re doing that was gonna allow them to get that, you know, even six-to-12-month lead over everybody else? The answer is no. Like, my guess is Gemini 3 from Google, the next version of Claude, the next version of Grok, they will all leapfrog over GPT-5.

[00:18:35] There are some arguments that, like, Gemini 2.5 Pro is probably already better than GPT-5 in some capacities. So we kind of have our answer: the frontier models have been commoditized. Like, there is no apparent secret sauce at the moment, which means we’re back into the game of distribution: who can put a comparable model in front of enough users?

[00:19:00] So OpenAI has 700 million, that’s huge. But Apple, like, you’re back in the game. Like, if you’re Apple, you realize, hey, we don’t need the best; we don’t have to build our own frontier model. If you’re Google, you have seven products with over a billion users. Seven platforms and products. Like, distribution becomes massive again.

[00:19:20] And then the big question I had was, well, what about GPTs? I didn’t hear anything about GPTs. No. And so I went and looked, and it looks like the only thing that changed is the model selector: the recommended model, as the creator of the GPT, is now 5, 5 Thinking, or 5 Pro. Like, that’s kind of all I can see.

[00:19:36] Yeah. So again, I just wanted to zoom out and be, like, high-level. The things we were really waiting for were, like, yeah, the model-choice issue, and was it gonna be a different frontier model than everything else that would cause people to switch back to ChatGPT if they were, like, loving Claude or Gemini, things like that.

[00:19:52] And, overall, it just seems like it’s probably a really, really smart model. The average user isn’t gonna notice the difference. [00:20:00] And there’s lots they touted, but there’s very little that seems truly differentiated at this point. But you spent more time with it, Mike. Did you have any other different impressions of it or any other initial feedback?

[00:20:14] Mike Kaput: Yeah, no, I largely agree with your take. I will say it just really struck me how much preferences matter here because personally, and this will seem crazy to some people, I love this model. Yeah. Like, I genuinely find it more useful simply because it is smarter, it is faster, really fast, which is really helpful.

[00:20:33] I get a lot more done. All of my prompts and workflows I’ve tested so far with it work better, which is amazing. I personally don’t have as much preference for switching models. I thought 4o was a little too dumb. Mm-hmm. o3 was brilliant, and o3 Pro is very much, like, my favorite model to use.

[00:20:52] However, I would also sometimes get a bit frustrated with the formatting and the slowness of not being [00:21:00] able to just go back and forth rapidly and kind of iterate and converse. For me, this model, like, squares that circle and really provides the perfect balance, for me personally.

[00:21:10] I like the tone a lot more. That’s all personal preference. I’m really glad we have it. I think some people hate that it exists. It’s really interesting to see. And I would also add, too, if you want to go down a horrifying rabbit hole, go to the ChatGPT subreddit, because the stories of people, I don’t know how much of this is, like, played up and viral, but there are so many posts

[00:21:35] of people deeply emotionally attached to 4o that you feel like the posts are written by people going through withdrawal. Yeah. And it’s really, really weird.

[00:21:44] Paul Roetzer: And that’s, I think, what Sam was referring to with that. Like, hey, some people get really attached to these as therapists, as friends, as companions, and, like, we have a tough job here to balance, like, what is unhealthy.

’Cause they can see the chats. [00:22:00] Like, yes, they know what people are doing with these things, and they’re trying to balance, like, what is good for mental health versus, like, what is acceptable personal choice.

[00:22:10] Mike Kaput: It’s really interesting to just see that play out. And they did have an interesting emphasis on health throughout all their, yeah,

[00:22:19] launch materials. So I think they’re really just understanding that people, for better or for worse, are turning to this for emotional and physical health needs.

[00:22:27] Paul Roetzer: Very, very much. Have you run a comparison? Like, do you use 2.5 Pro from Gemini much? Yeah. How do you think it compares head to head?

[00:22:35] Like have you done any side-by-side? 

[00:22:37] Mike Kaput: I haven’t done too much yet. I really like and rely on Gemini 2.5 Pro for a lot of things, but I usually just cycle between that and either o3 or 4o, depending on the use case. Obviously it’s way better than 4o, but just in terms of speed or the complexity of it, that’s kind of my next big thing: like, okay, let’s run, you know, ’cause I have GPTs and Gems built out for [00:23:00] some of the same stuff.

[00:23:01] Let’s see how these stack up. I’ll be interested to see how that plays out. And also, I think we’ve been seeing more and more chatter, even this morning, that Google is releasing something, like, today or tomorrow. I’m convinced they’re

[00:23:14] Paul Roetzer: just sitting there waiting. Like, I think they know that they probably have

[00:23:18] maybe something that’ll perform better, at least on the evals. And they were just like, it was a game of chicken. Like, you wanna go ahead and release yours first? Yeah, for sure. ’Cause OpenAI has done that to them so many times. So I would not be surprised at all if Google came out with something comparable or better in ways.

[00:23:34] Mike Kaput: And just one kind of final note or impression or kind of perspective here: I genuinely would encourage people, just go, without any bias, go use this model as extensively as you can. I mean, again, I find it extremely impressive. I also think we all might need to take a breath, too. Mm-hmm. Because it’s so easy when we’re in this bubble to be like, you know, you’re gonna see whatever Google comes out with and you’ll be like, OpenAI is dead.

[00:23:59] Or ChatGPT [00:24:00] sucks. And it’s like, this is, like, the first thing that felt like minimum viable AGI to me, to be perfectly honest. But I feel like you could make that argument about 4o in a different context. Right. So I think it’s worthwhile to keep some perspective, because this is a genuinely useful model to me, and it just works a lot of the time, and I really appreciate that.

[00:24:20] Paul Roetzer: Yep. Yeah, I agree. And I think, get in there, try it. And again, like, if people weren’t using reasoning models, yes, and all GPT-5 does is inject reasoning into their workflows without them even knowing it, it will feel like a leap forward. Yes. Because that’s the biggest thing, as Mike and I have talked about many times: using 2.5 Pro, using o3 from ChatGPT, that is, at least for me, the majority of my use is reasoning models now, for higher-level strategic thinking.

[00:24:51] Mm-hmm. So if you weren’t using those, then you don’t really comprehend how far along these models are [00:25:00] to changing

[00:25:01] Mike Kaput: work, the nature of work. And I wonder once we get past this kind of initial freak out, like how many other stories we’ll see given those numbers you shared. I mean, giving 4, 5, 6 x the amount of people suddenly access to using reasoning models based on those numbers and how they’ve jumped.

[00:25:17] I, I wonder what we’re going to hear people say about this model moving forward too. 

[00:25:23] Paul Roetzer: Yeah. All good. Well, I’m looking forward to your course next week. Yeah, me too. Awesome. 

[00:25:29] DeepMind’s Genie 3 World Model

[00:25:29] Mike Kaput: All right, so next up, Google DeepMind has unveiled Genie 3. This is a breakthrough in what they call a world model, one that can generate fully interactive photorealistic environments in real time.

[00:25:41] So, unlike earlier versions of Genie, Genie 3 can render at 24 frames per second, maintain visual and physical consistency for minutes at a time, and respond instantly to both navigation and text-based prompts. So this model can do things like simulate an entire virtual world: volcanic landscapes, enchanted [00:26:00] forests. It can recreate historical sites like ancient Athens, all based on a short description.

[00:26:06] And those worlds evolve. Think of being in a kind of dynamically evolving video game. They evolve as you explore, and there are these promptable world events that let users change conditions on the fly, from altering weather to adding new objects. So DeepMind actually says they see world models as a key step towards AGI, because they give a kind of limitless virtual training ground for AI agents to use to learn and adapt.

[00:26:34] So Genie 3’s long-horizon consistency essentially means agents can now tackle multi-step goals. So this kind of opens the door for really complex simulations in fields like robotics, education and science. But right now, this is still somewhat limited. There’s pretty short interaction durations, constrained actions.

[00:26:55] And it is in a limited research preview. So you can go to the, we’ll provide the link in the show [00:27:00] notes. You can go test out some kind of pre-made examples, but you cannot directly use this yourself. But DeepMind still calls it kind of a pretty significant moment in the evolution of these generative environments.

[00:27:13] Now, Paul, I mean, I realize like world models, this can kind of seem a little bit sci-fi to a lot of people. It’s not available yet to the general public. We’ve got massive news with GPT-5 coming out. But we did wanna talk about this because it seems like world models are pretty important to the trajectory of where AI is going long term.

[00:27:32] So maybe you could talk us through why they matter so much. 

[00:27:36] Paul Roetzer: Yeah, it’s been a pursuit of labs for years, this idea of giving the machine the ability to understand the physical world, to create simulations that follow the laws of physics. And DeepMind in particular, and Demis Hassabis specifically, have been talking a lot more about them over the last year.

[00:27:56] Like, I was going back when I was kind of getting ready for today, [00:28:00] just looking at the different times that we’ve featured quotes from Demis on the podcast where he was talking about world models and their importance. And they talked about, like, even with Veo, the video generation model, how it just, I mean, this is their words, like, it just emerges.

[00:28:16] Like when you train it on enough video data, it starts to like understand the laws of physics. And when you then ask it to produce simulations, it just seems to do it. Now there’s tons of limitations and they highlight those in the launch post. But I mean, in essence it does open all of these possibilities for applications.

[00:28:37] And you know, I think that this idea of the path to AGI, when they really start to think about embodying intelligence, and like humanoid robots, and those robots being able to see something happening and kind of think out ahead, because, I understand the laws of physics, I understand human nature, like what is likely to be happening next.

[00:28:56] And that comes whether you’re, you know, training autonomous vehicles or you’re training a [00:29:00] robot to work in a human environment. All of these things become kind of essential. And so there’s some cool examples. As you mentioned, Mike, you can play around with, like, modeling physical properties of the world.

[00:29:10] So, like, water and lightning and complex environmental interactions; simulating the natural world. So they talk about generating vibrant ecosystems, from animal behaviors to intricate plant life. So again, it just kind of learns and then it’s able to recreate these things. And so this could come into play in storytelling, where you’re trying to create these narratives, or video game development, where it’s rendering the environment in real time.

[00:29:34] So imagine, like, right now, programmers write all the code to create everything that happens in the game. They create all the environments, that stuff. This is what they’re envisioning. Elon Musk talks a lot about this. He actually tweeted this week, and he thinks by next year this will be a reality, where you could go in and prompt your own video game and everything just starts happening in real time, creating everything that you see.

[00:29:57] And that’s kind of wild. And [00:30:00] then even, like, another tangible example is, right now in a Tesla, when you have autonomous driving going, it shows a very video game-like simulation. It’s showing your car and it shows cars of approximate size. It’ll show a truck or a motorcycle, but it’s not like watching a live stream video of the road around you.

[00:30:20] What this is saying and what, what Elon Musk implies Tesla is going toward is when you’re driving a Tesla and you’re watching the full self-driving do its thing, it will actually render the physical world to show on the display. But it’s not a live stream. It’s actually like a rendering occurring where it’s simulating this whole world.

[00:30:40] It’s, yeah, it’s really crazy, and it becomes massive in robotics, because now you can simulate these environments and the robots can train in them and all these kinds of things. So world models are huge. We talked about Fei-Fei Li’s spatial intelligence company, the one she created. I forget what episode that is.

[00:30:56] We can drop the link in the show notes, but she’s [00:31:00] someone who’s been working intensely on this in addition to the research that’s going on in the major labs. 

[00:31:05] Mike Kaput: Yeah, it’s a good reminder too that we will, regardless of the hype or the buildup of something like GPT-5, regardless of where the verdict ends on that, I mean, progress is happening on a lot of different fronts in AI, and it is not slowing down on many of them.

[00:31:22] Paul Roetzer: Yeah. And it’s commonly like six to 12 months ahead of what the public is aware of. Mm-hmm. So if they’re releasing this, they’re obviously already probably far beyond this within the lab itself. Yeah. And you get people like Elon Musk who just straight up tweet and say, yeah, I think this is coming in three months.

[00:31:40] And yeah. So I mean, again, like, you have to filter the stuff from Elon Musk you wanna read, but if you want a true inside, like, just clear train of thought of what someone thinks is possible, nobody is more honest than Musk about what he thinks is gonna happen and his [00:32:00] opinions of these other models and kind of where they’re going.

[00:32:02] And while he has a history of sort of over-hyping when technology will arrive, dude built a frontier model in like a year and a half that caught up to the best models in the world. So he knows a few things about science and technology; he’s kind of worth paying attention to from that side.

[00:32:20] Perplexity vs. Cloudflare Crawling Dispute

[00:32:20] Mike Kaput: Alright, our next, or third big main topic this week, is that Cloudflare says that AI search startup Perplexity

[00:32:28] has been disguising its web crawlers to bypass site blocks. This is a practice known as stealth crawling. According to Cloudflare, when Perplexity bots hit a robots.txt rule or a firewall block, they sometimes swap their identity from what’s called PerplexityBot to something like Google Chrome on macOS, and rotate IP addresses that aren’t on its official list.
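To make the mechanism concrete: robots.txt enforcement is keyed to the user-agent string a crawler declares about itself, so swapping that string changes which rules apply. Here is a minimal sketch using Python’s standard library, with a hypothetical robots.txt file (not any real publisher’s, and not Perplexity’s or Cloudflare’s actual code):

```python
# Minimal sketch of why user-agent swapping defeats robots.txt (illustrative
# only; the robots.txt content below is hypothetical).
from urllib.robotparser import RobotFileParser

# A publisher might serve rules like this to opt out of AI crawling.
robots_txt = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://example.com/article"

# A crawler that honestly identifies itself is blocked...
print(parser.can_fetch("PerplexityBot", url))  # False

# ...but the identical request under a generic browser user-agent is allowed,
# which is the loophole stealth crawling exploits.
browser_ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Chrome/124.0"
print(parser.can_fetch(browser_ua, url))  # True
```

Compliance with robots.txt is voluntary, which is why Cloudflare’s countermeasure is network-level blocking rather than another rule in the file.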

[00:32:52] So basically, Cloudflare says the company is doing things to dodge detection, including also changing its [00:33:00] network identifiers, which is a tactic it claims has been used across tens of thousands of domains making millions of requests each day. Perplexity has pushed back pretty hard against Cloudflare’s claims in a detailed rebuttal.

[00:33:14] They deny intentional wrongdoing, called Cloudflare’s post a publicity stunt, and say the company mixed up legitimate user-triggered requests with bot activity. Now, according to Perplexity, its AI assistants aren’t really traditional web crawlers. They don’t systematically scrape and store the internet.

[00:33:34] Instead, they fetch specific pages in real time when a user asks a question, use that content to answer it, and then discard it, with no training or long-term storage. So in response, Cloudflare has now delisted Perplexity as a verified bot and rolled out new methods to block its crawlers. Now Paul, this seems a little technical on the surface, kind of in the weeds, but it does seem like a pretty important issue, because, correct me if I’m wrong, it seems [00:34:00] like at its core this is about how AI companies are or are not respecting the boundaries set up by publishers and websites for how their content can and can’t be accessed and used.

[00:34:12] And there’s this big fear, given how models were trained, how the content’s already been used, that this material is going to get scraped and used to train models, or used to essentially bypass websites entirely.

[00:34:25] Paul Roetzer: Yeah, which has been going on for the last few years. Like, that’s the thing is like none of this is, well, I mean, I guess the agent side is new, but 

[00:34:33] Mike Kaput: Yeah.

[00:34:34] Paul Roetzer: I mean, part of the issue with, like, the New York Times lawsuit against OpenAI and others was that they were bypassing paywalls, like, to get access to information and stuff. And so, you know, I think in the case of Perplexity, the problem that we’re running into here is this is their MO. Like, there was, I forget, I’d have to go back and find the podcast episode we talked about it, when Aravind was literally bragging about the fact that they used to scrape LinkedIn against the terms of use, [00:35:00] like that, that is just what they do.

[00:35:02] And he was proud of the fact that they did it. And it’s kinda like we’re gonna do it until we get caught. So when you’re on the record saying you constantly do these kinds of things, it’s really hard to have credibility when you come out saying, no, we’re not doing anything wrong. It’s like, dude, you’ve, you’ve admitted to things like this before.

[00:35:23] So you have to consider the company itself and its history when you’re looking at this. But when you remove that, the reality at the end of the day is the rules of the web and business are being rewritten. Yeah. Like, we’re gonna have these messy instances where you have semantics of, like, yeah, but we’re not really scraping.

[00:35:43] It’s an agent, and an agent’s being requested by a user, so it’s actually really the user that’s visiting the website. So, you know, how this gets played out, whether it’s through business agreements or court cases or whatever, we’re gonna have this very prolonged [00:36:00] transitional phase where we start running into these kinds of issues, and AI agents are gonna be a massive part of this.

[00:36:05] Like, yeah, the more traffic on the web that comes from AI agents, the more challenging it’s gonna be for brands to deal with, for publishers to deal with. It’s kind of similar to, you know, how we’re struggling with copyright and like, were the models allowed to steal it or weren’t they allowed to steal it?

[00:36:21] Was it fair use or not fair use? There’s just gonna be so many unanswered questions that we’re gonna come up against as agents permeate the web and more and more of the traffic and actions taken online are taken by agents.

[00:36:34] Mike Kaput: Yeah. The fact they’re already having issues with this now, before we even have a real explosion of AI agents, tells me that we are not ready for whatever’s about to happen.

[00:36:46] Paul Roetzer: Yeah. And I mean, as a publisher of a website, as a brand, you can just, like, say, well, we don’t want these users or these agents or these, you know, bots to crawl our site. But then what? You’re [00:37:00] just gonna stay out of the chatbot, AI assistant, AI agent economy. Like, your content’s not gonna show up anywhere.

[00:37:08] Yeah. There’s no simple answers. But, and again, like, when you look at, like, where’s the future of work? Like, there’s gonna be people whose job is just to kind of figure this sort of stuff out, to, like, wade through all the issues and challenges and figure out plans for this stuff. But yeah, this is kind of a messy one.

[00:37:28] I think it’s just the tip of the spear basically. Like there’s a lot more coming 

[00:37:32] Mike Kaput: for sure. Alright, let’s dive into rapid fire this week. First up, 

[00:37:37] OpenAI Returns to Open Weights

[00:37:37] Mike Kaput: OpenAI has released its first open-weight language models since GPT-2. There are two new models, gpt-oss-120b and gpt-oss-20b, that are free to download under the Apache 2.0 license, meaning anyone can run them locally, fine-tune them, and even use them commercially.

[00:37:58] They support chain of [00:38:00] thought reasoning, tool use and code execution. The 120B and 20B stand for 120 billion and 20 billion parameters, and the smaller 20 billion parameter version is able to run on a high-end consumer laptop. OpenAI says the models perform on par with some of its proprietary systems and, in certain benchmarks, even exceed them, all while being cheaper and faster to operate.
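A rough way to sanity-check the “runs on a laptop” claim is to do the memory math on the weights alone. The bit-widths below are assumptions for illustration (open-weight releases are commonly quantized to around 4 bits per weight), not official specs:

```python
# Back-of-envelope memory math (assumed bit-widths, not official specs): why a
# 20B-parameter model can plausibly fit on a high-end consumer laptop.
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate GB needed just to hold the weights (ignores KV cache etc.)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# At 16-bit precision the 20B model would need ~40 GB, out of laptop range.
print(round(weight_memory_gb(20, 16)))   # 40
# At ~4-bit quantization it drops to ~10 GB, which fits alongside 16-24 GB RAM.
print(round(weight_memory_gb(20, 4)))    # 10
# The 120B model at 4 bits is ~60 GB, still workstation/datacenter territory.
print(round(weight_memory_gb(120, 4)))   # 60
```

This is weights-only arithmetic; actual runtime memory also depends on context length, KV cache, and the inference stack.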

[00:38:25] CEO Sam Altman framed this release as a way to keep innovation in open models happening in the US amid competition from places like China’s DeepSeek. So Paul, I’m curious about OpenAI’s motivations here. Obviously they’ve got a few things on their plate at the moment. So why spend a bunch of precious time and resources competing in open source at all when your entire business model relies on selling access to closed models?

[00:38:53] Paul Roetzer: Yeah, I mean, they’ve talked about the fact they were going to do this for a long time, that they were committed to, you know, the open source [00:39:00] community, or just, you know, open weights. So we’ve known it was coming. I think the way the labs are looking at this now, and we’ve talked a little bit about this before, I know Demis has said point blank,

[00:39:13] this is what they’re doing: the open-source versions that they’ll release are basically like last year’s proprietary models. So the proprietary models that they’re selling keep getting better, keep getting smarter, more generally capable; let’s say every eight to 12 months is the release cycle for the next version.

[00:39:34] GPT-5 obviously took a little longer, but for the most part the labs are looking at kind of that eight to 12 month release cycle for the next version. And so every roughly 12 months, the prior version that’s now kind of outdated, you open source it, as long as it’s safe to open source. And the belief obviously is that the paying users are still going to pay for the premier [00:40:00] version of what’s available.

[00:40:01] Plus, you know, they’re still able to, you know, service the developer community, build those relationships, integrate. You know, APIs still drive a lot of revenue for these labs, specifically OpenAI and Anthropic. It’s a ton of their revenue through their APIs. So it’s just having to service that developer community and be a part of it.

[00:40:21] And then just overall, like the mission of the organization. Now we’ve seen some pullback a little bit on this, like Zuckerberg, who’s been the ultimate champion of open source 

[00:40:28] Mike Kaput: Yeah. 

[00:40:29] Paul Roetzer: Has said already, like they, they may move off of that. They, they may, you know, keep some of their technology more in-house.

[00:40:35] But again, I think what they’ll do is keep the current frontier model proprietary, and then you open source the prior generations, accepting that there’s a small portion of users who will just use the open source and not pay for the other stuff. It’s just kind of the standard model the labs seem to be following now.

[00:40:54] Mike Kaput: So it’s kind of a no-risk way, at least no risk of cannibalizing your existing products, [00:41:00] to get developer goodwill, move the ecosystem forward, remain relevant with people still building on your open source

[00:41:06] Paul Roetzer: model. Yeah, and I mean in some organizations they’re gonna want to build on the open source too.

[00:41:10] Like, you get into an enterprise. So you may have enterprises that have 5,000 ChatGPT Enterprise licenses, but then the IT team is, you know, also building on top of the open source model, things like that.

[00:41:21] OpenAI $500B Secondary Talks

[00:41:21] Mike Kaput: Alright, next up, more OpenAI news. They are in early talks to let employees cash out some of their shares at a valuation of around $500 billion.

[00:41:30] So this is a secondary stock sale. It’s a deal that would potentially be worth billions, giving current and former staff a way to turn their paper wealth into real money while helping the company retain talent in an era where Meta is trying to poach people for, like, nine figures. This would basically create a huge jump in OpenAI’s valuation, going to $500 billion from the last $300 billion valuation, when they did a $40 billion financing round led by SoftBank.[00:42:00]

[00:42:00] And it comes on the heels of an $8.3 billion funding boost that was oversubscribed, and as OpenAI aggressively pushes on product. So we’ve got open-weight models, GPT-5, and we’ll talk in a second about a federal deal to provide ChatGPT to the federal government. Paul, I guess as we’re looking at employees being able to cash out of OpenAI, what motivates a move like this right now?

[00:42:25] Paul Roetzer: Yeah, I mean, they’re being drawn by a lot of money from other labs, and you have to find ways to, you know, motivate people to stay. You have to give them that ability to get something off the table, so it makes sense. I’m just looking, Mike, real quick. I searched largest companies in the world by market cap,

[00:42:44] just to provide some perspective on the significance of half a trillion dollars. So ExxonMobil, which was the largest company in the world for quite a while, their market cap is 455 billion. Mm. Netflix is 515 billion. Mastercard’s [00:43:00] 519, Visa’s 649. And at the trillion-dollar-plus mark,

[00:43:08] we have Tesla, Berkshire Hathaway, TSMC (or TSM), Broadcom, Meta, Amazon, Alphabet, Apple, Microsoft, Nvidia. That’s it. That’s the list of companies in the world that are at a trillion or more. Yeah. And there’s actually only two between half a trillion and a trillion, so, well, no, I guess there’s seven.

[00:43:32] It’s one of, like, the 20 to 25 biggest companies in the world. Yeah. At half a trillion, is what I’m saying.
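For reference, here is the same comparison as a quick sort, using only the rough figures quoted in this exchange (in billions of dollars; not live market data, and the OpenAI number is a rumored valuation, not a market cap):

```python
# Where a rumored $500B OpenAI valuation would slot in among the market caps
# Paul cites (figures in billions, as quoted in the episode; not live data).
caps = {
    "ExxonMobil": 455,
    "Netflix": 515,
    "Mastercard": 519,
    "Visa": 649,
    "OpenAI (rumored)": 500,
}

# Sorting descending puts OpenAI just below Netflix and above ExxonMobil.
for name, cap in sorted(caps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:18s} ${cap}B")
```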

[00:43:37] Mike Kaput: That’s incredible. 

[00:43:38] Paul Roetzer: It’s a big number. 

[00:43:40] Mike Kaput: So we’re gonna start seeing a whole host of other AI researchers being deca-millionaires, hundred-millionaires, billionaires at some point. Yeah, there

[00:43:49] Paul Roetzer: was a crazy stat.

[00:43:50] I’d have to find it, but, so don’t quote me on the exact numbers here, but go look it up: the number of Nvidia employees who are [00:44:00] millionaires, and the number who are worth, like, more than 25 million. It’s absurd, because of their stock in the company. If they’ve been there for any amount of time, like, go back, say, nine years or more, you’re worth 10, 20 million.

[00:44:13] Like, it’s crazy. That’s wild. Yeah. It’s a large percentage. But that’s what’s gonna happen within some of these, you know, massive AI companies, is everybody who’s a part of ’em is just gonna make a ton of money.

[00:44:26] Anthropic Claude Opus 4.1 and System Prompt Update

[00:44:26] Mike Kaput: All right. Next up, Anthropic has released Claude Opus 4.1, and it is a notable step up from Opus 4.

[00:44:32] In coding, research and reasoning tasks, it hits a 74.5% rating on SWE-bench, a benchmark that is a tough test for real-world coding. Some companies are reporting it’s better at pinpointing exact corrections in code without making unnecessary changes. The coding startup Windsurf says the improvement is roughly on par with the leap from Sonnet 3.7 to Sonnet 4 on their junior developer benchmark. [00:45:00] And beyond

[00:45:00] code, Opus 4.1 has stronger agentic search and detail tracking. It’s more effective for deep research and data analysis. And this upgrade is available to paid users via Claude Code, the API, Amazon Bedrock and Google Cloud’s Vertex AI, all at the same price as before. Now, interestingly, and related to this, just after the release, Anthropic researcher Amanda Askell shared some more information about the overall updates to Claude’s system prompt.

[00:45:30] This is the master prompt that essentially influences how the model behaves and responds. So in addition to a new model, we get a look kind of under the hood at how Claude works. These are basically a bunch of updates and tweaks to how Claude interacts with users. So, for example, Askell shared that one change was made that reins in overly casual language and needless swearing from the model.

[00:45:52] Another nudges Claude to be even-handed and critical, rather than hyping up every idea it hears. Claude will also be more direct if it suspects someone might [00:46:00] be dealing with a mental health issue, instead of only dropping subtle hints. So Paul, really cool. I mean, in any other news cycle, this would be a huge story.

[00:46:09] Obviously GPT-5 overshadows everything, but it was really cool to see Amanda giving us a peek under the hood of the system prompt too, because, I mean, correct me if I’m wrong, this is at least more transparent than it seems some of the labs have been about system prompts, at least until they’re forced to be.

[00:46:27] Right? When there’s a huge change to a system prompt, like when GPT-4o had the really controversial change in its personality, or, unfortunately, when Grok had some really recent unhinged racist behavior due to some system prompt issues. So maybe talk me through what was cool to see about this system prompt stuff here.

[00:46:47] Paul Roetzer: Yeah, Amanda’s sort of the lead on the personality behind Claude, so she’s great to follow. She’s pretty transparent on X about that stuff. The system prompts, you know, the labs aren’t very forthright with [00:47:00] them, but they’re not hard to extract. So there’s a, I assume, I think it’s a guy, I don’t know, but there’s a user on X called Pliny the Liberator.

[00:47:10] The handle is @elder_plinius. So we’ll put a link in, and the guy drops the system prompts within, like, an hour after every major update. So he’s a hacker, and he’s able to get into, you know, the system and figure out what the system prompts are. And then he publishes the entire system prompt on X.

[00:47:29] So, like, if you ever wanna know what the system prompt is, just follow Pliny and you’ll know it. And I know he’s been recruited by a lot of the labs. Anthropic in particular was trying to hire him recently, and he talked a little bit about that online. So the system prompts are intriguing.

[00:47:42] You actually learn a lot by seeing how they, you know, tell the systems to behave and things like that. Semi-related: I’ve been grinding to get these courses done, and, like, my brain has been on overdrive every day. So I’ve started a new thing where [00:48:00] I just go for a run every night.

[00:48:01] So I run, like, three miles or something, and I’ve been listening to a lot of podcasts. I put it on 1.75 speed, and you can get through a lot of podcasts, you know, taking a three-mile run every night. And so I had, like, five I listened to last week that were all really good, and maybe I’ll list them out in the newsletter this weekend. But one in particular, just to the whole point of the story: the Big Technology Podcast had an interview with Dario Amodei.

[00:48:27] And Mike, you gotta listen to this interview. He was pissed. Like, it was the most, I don’t know, he’s generally a pretty authentic guy, and he kind of seems to wear his emotions on his sleeve a little bit. But there was a quote where Jensen Huang, CEO of Nvidia, had sort of accused him of being a doomer. And here’s the quote:

[00:48:51] I get very angry when people call me a doomer, when someone says this guy’s a doomer, he wants to slow things down. He says, you heard what I just said, and he’s talking about [00:49:00] his efforts to advance and accelerate AI. My father died because of cures that could have happened a few years later. I understand the benefits of technology.

[00:49:09] I’m sure you’ve heard the criticism, this is now the host asking, I’m sure you’ve heard the criticism from people like Jensen who say, well, Dario thinks he’s the only one who can build this safely and therefore wants to control the entire industry. Dario said, I’ve never said anything like that.

[00:49:24] That’s an outrageous lie. That’s the most outrageous lie I’ve ever heard. And he was just, he was edgy. Yeah, like the whole thing. It’s fascinating about their model, their rivalry with OpenAI, how they make money, all this stuff, but, like, the doomerism and Anthropic’s approach to safety and how they choose to release models, when they release them, things like that.

[00:49:46] The safety of the model. So we’ll put the link in. It is a really good interview, it’s like an hour long, but it’s worth it. It’s good.

[00:49:57] AI and the Future of Work

[00:49:57] Mike Kaput: Alright, next up, we are still kind of trying to get [00:50:00] a clear picture of AI’s impact on the economy, and we might be making a little progress. So first, we got a report that outplacement firm Challenger, Gray & Christmas announced that more than 10,000 US job cuts were directly linked to employers adopting generative AI in the first seven months of 2025.

[00:50:20] They also said that AI appears in four times as many descriptions compared to the previous period. Now, at the same time, though, according to some other reports, including one in the Wall Street Journal, a core question is baffling economists: if AI is so valuable in, say, replacing human labor or producing productivity gains,

[00:50:40] why isn’t it showing up, in the form of increased productivity, at the macroeconomic level? Because so far, economists say that AI is not showing up at all in GDP numbers, which is where they would expect to see AI’s impact if it were truly transforming the economy. [00:51:00] But according to a new study from researchers including Erik Brynjolfsson, who we’ve mentioned before, and who studies AI’s impact on the economy,

[00:51:08] AI’s impact may be showing up in some other numbers. So Brynjolfsson and his colleagues argue that while government data barely registers the value of generative AI, Americans gained an estimated $97 billion in what they call consumer surplus from free or low-cost AI tools in 2024 alone. Now, the way they define and quantify this is:

[00:51:33] They basically estimated how much money a US adult would need to be paid to give up the usage of a free or low cost AI tool. And they estimated this based on a survey they ran at $98 per month. In other words, kind of the implicit estimate of the value that the user was getting out of those tools each month.

[00:51:53] Then they went and multiplied that by an estimated number of regular users of AI, and they come up with that $97 [00:52:00] billion number. Essentially, they say consumers are getting $97 billion in value out of these tools. These are benefits that don’t appear in GDP because they accrue to users, not companies.

[00:52:11] Traditionally, GDP counts only market transactions, so this kind of thing would be invisible. And Brynjolfsson and his colleagues say this is similar to the paradox that economists spotted with computers starting in the 1980s: you start to see the technology everywhere except in the productivity stats. So, Paul, it’s interesting to see real data on AI’s job impact, those 10,000 jobs.

[00:52:37] Seems clear it’s having an impact. We know anecdotally through the conversations we’re having, it’s having an impact, but it’s not showing up in the economic data really. Can you maybe walk through the contradictions here in what we’re seeing? 

[00:52:51] Paul Roetzer: So the opinion piece is based on a forthcoming paper called GDP-B: accounting for the value of

[00:53:02] new and free goods. So I read this article three times, I think, trying to comprehend what they’re saying. So where I kind of landed on this, since this is a rapid-fire item: the logic of the value not being counted in the GDP makes sense. The reason they give as to why it’s not showing up in GDP is very logical and pretty straightforward.

[00:53:23] The math to get to 97 billion seems pretty subjective, like some math gymnastics. It’s a really nice number to put in a headline, 97 billion. And the consumer surplus concept, and how they calculate it, by, like, saying, Mike, how much would it take for you to not use ChatGPT? And you’re like, I don’t know, a hundred dollars.

[00:53:52] Like how do you, how do you come up with that number? So again, I will withhold any judgment. I love the fact that we’re [00:54:00] doing this. I love that economists are trying to find other ways to measure value. I think it’s great, and the paper itself may end up being exceptional and make perfect sense.

[00:54:12] In the form of a 500-word opinion piece, it’s kind of hard to understand how they’re coming up with that number and how valid that number is. It makes for a nice headline though, and it’s probably research worth reading through when it comes out.

[00:54:24] Mike Kaput: Yeah, I feel like they should have waited for the paper. I don't get it.

[00:54:27] Paul Roetzer: It's way too complex a concept to try to do in a 500-word opinion piece.

[00:54:33] Mike Kaput: I won't go down the rabbit hole here since it is rapid fire, but the point here too is, even if this research ends up being terrible, people are scratching their heads about AI. We're seeing productivity gains in our own work. Is it just not diffused enough into the economy? Like, where are the numbers showing up?

[00:54:50] But as we've talked about in the past, we're also sometimes skeptical: are economists measuring the right thing? Are they aware of the productivity gains happening in other areas? So it's definitely [00:55:00] a relevant conversation that we need to keep tabs on.

[00:55:02] Paul Roetzer: Yeah. Again, I don't wanna spend too much time on this, but this is what it says.

[00:55:07] Rather than asking what people pay for a good, we ask what they would need to be paid to give it up. So let's play this out with ChatGPT. Let's assume you were a ChatGPT user, maybe paying 20 bucks a month, who was in the camp that had never tried the reasoning model and didn't know the full value of the system.

[00:55:23] Mike Kaput: Yeah. 

[00:55:23] Paul Roetzer: So I ask you, as someone who's never used the reasoning model, what would it take for you to give it up? And it's like, I don't know, 25 bucks, 50 bucks, a hundred bucks. You ask me or Mike, it's like, dude, I don't even know, 5,000. It's just worth a lot of money to us. And so then it says, our own survey found their average valuation to forego these tools for one month is $98.

[00:55:44] Multiply that by 82 million users and 12 months, and the $97 billion surplus surfaces. It's like, wait, what? It just seems like quite a leap to get to $97 billion. But again, I like the direction and I'm anxious to see the actual paper. They're respected [00:56:00] economists and authors. So yeah.
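For what it's worth, the back-of-the-envelope math the hosts are questioning can be checked directly. This is just the arithmetic implied by the figures quoted above (a $98 average monthly valuation, 82 million users, 12 months), not anything from the paper itself:

```python
# Sanity check on the consumer-surplus figure discussed above, using
# only the numbers quoted from the opinion piece.
avg_monthly_valuation = 98      # dollars respondents said they'd need to forgo the tools for a month
users = 82_000_000
months = 12

surplus = avg_monthly_valuation * users * months
print(f"${surplus / 1e9:.1f} billion")  # -> $96.4 billion, i.e. roughly the $97B headline number
```

So the headline figure does follow mechanically from those three inputs; the hosts' objection is to how subjective the $98 survey input is, not the multiplication.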

[00:56:02] OpenAI “Universal Verifiers”

[00:56:02] Mike Kaput: Next up, a couple new articles are giving us a peek under the hood of ChatGPT.

[00:56:05] One of them tackles it from a highly technical perspective, the other from a behavioral one. Both are pretty important to understand if you want to understand where ChatGPT and AI are headed. So first, The Information reports that OpenAI is now using something called a quote, universal verifier, as a quote, secret weapon within ChatGPT.

[00:56:27] So basically, a universal verifier is a technique for checking whether an AI's answers are not just plausible, but actually correct. It's like a referee AI model grading another model's work, pulling in research from multiple sources. For example, in math, it would essentially have an AI verifying each step the other AI follows to solve a math problem.

[00:56:50] The Information speculates that universal verifiers may have actually helped OpenAI's latest model score a gold medal at the International Math Olympiad, which we talked about in [00:57:00] past weeks. Researchers say the approach could boost performance in domains that are subjective or hard to score, from business decision-making to creative tasks.
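The Information hasn't published implementation details, so as a purely illustrative sketch of the "referee grading each step" idea described above, here is a toy step-level verifier. The function name and the use of `eval` as a stand-in for a second model are assumptions for the example, not OpenAI's actual design:

```python
# Hypothetical sketch of a step-level "referee" checking a worked
# arithmetic solution, rather than only grading the final answer.
# In a real system the checker would be another model; here we simply
# re-evaluate each claimed step.

def verify_steps(steps):
    """Each step is (expression, claimed_value). Return the index of the
    first incorrect step, or None if every step checks out."""
    for i, (expr, claimed) in enumerate(steps):
        if eval(expr) != claimed:
            return i
    return None

# A model "solves" (3 + 4) * 5 but slips on the second step:
solution = [("3 + 4", 7), ("7 * 5", 40)]
print(verify_steps(solution))  # -> 1 (step 2 is wrong: 7 * 5 is 35)
```

The point of step-level checking is exactly what the hosts describe: a wrong intermediate step can be localized even when the final answer happens to look plausible.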

[00:57:09] Now second, OpenAI themselves published a post called "What We're Optimizing ChatGPT For." In it, they lay out a short philosophy for how they're optimizing ChatGPT. They say they are not trying to keep you in the app longer; they're trying to help you get what you need and get back to your life.

[00:57:29] They wrote, quote, instead of measuring success by time spent or clicks, we care more about whether you leave the product having done what you came for. They also point out that people are increasingly relying on ChatGPT for emotional and personal needs, and some new updates reflect that. ChatGPT will now give gentle break reminders during long sessions.

[00:57:51] It will refuse to make decisions for you on high-stakes personal matters, and provide more thoughtful, grounded support when you are struggling. Apparently, [00:58:00] OpenAI says they have worked with more than 90 physicians in over 30 countries, plus researchers in mental health and human-computer interaction, to fine-tune how the model responds in sensitive moments.

[00:58:11] So Paul, these are two really different looks at how ChatGPT works under the hood, but I think they're both useful to understand. So maybe first let's quickly touch on why universal verifiers matter, and then talk about OpenAI's emotional and behavioral approach to how this works.

[00:58:30] Paul Roetzer: The verification gap that we've talked about numerous times sort of illuminates why the verifiers would be so valuable. The more you can have other agents or AI that can look at the output, the better. So say you get a deep research product that's 42 pages long and the human has to go through and verify it.

[00:58:50] Well, if they build a really smart verifier on top of that, and it checks all the stats, makes sure all the citations are correct and the data's real, and does [00:59:00] lookups of those things, it's just increasingly able to do higher-value work for humans. So they're gonna be critical not only in the training of the models, the reinforcement learning of the models, but the actual use of them.

[00:59:14] Being a secret weapon seems like probably a bit of an exaggeration. I know for a fact the other labs are working on these kinds of things; they've talked about them publicly. So I can't imagine, I mean, maybe OpenAI is a month or two ahead on their use of a verifier, but that seems like a pretty standard practice within labs, building agents that can do the verification process.

[00:59:35] Mike Kaput: And it did strike me too that some of their commentary around the other side of it, the emotional, behavioral stuff, was really interesting. I feel like there were a couple companies they weren't naming that they were taking aim at in saying, you know, we're not trying to engage you on the app and keep you clicking and eyeballed on it, et cetera.

[00:59:57] Paul Roetzer: Yeah. I think it was also part recruiting [01:00:00] and part retention of talent. They're basically saying, listen, if you go work for xAI or Meta, you're just selling yourself off to monetize this technology and keep people on platform. That's what they need to do with their social platforms: it's clicks and time on site and daily active use, hourly active use, whatever their metrics are.

[01:00:22] And that's not what we're doing here. So it's sort of a mission thing: it's more than money, we're here to actually make the world better, not make more money on ads and clicks and time on site. So yeah, it was a pretty not-so-subtle dig at, I would imagine, Meta and xAI in particular.

[01:00:42] OpenAI Offers ChatGPT to the Federal Workforce

[01:00:42] Mike Kaput: All right, next up, OpenAI has struck a deal to make ChatGPT Enterprise available across the entire US federal executive branch for the next year. Under the agreement, each agency that participates will, for just $1 per agency, get access to OpenAI's top [01:01:00] models, and get an extra 60 days of unlimited use of advanced tools like deep research and advanced voice mode.

[01:01:06] This also includes some custom training, a dedicated government user community, and consulting support from Slalom and Boston Consulting Group. So obviously this program aims to cut time spent on red tape and paperwork, freeing public servants to focus on core missions. OpenAI cites some early pilots that show promise.

[01:01:26] In Pennsylvania, employees saved about 95 minutes a day on routine tasks; in North Carolina, 85% of staff in a 12-week trial reported positive experiences. So Paul, the focus on the executive branch is interesting. They literally call out the AI Action Plan in the announcement, so I'm guessing this is somewhat related to, or motivated by, that.

This definitely seems like a trend of OpenAI getting more embedded in federal and local governments, doesn't it?

[01:01:56] Paul Roetzer: Yeah, and obviously the [01:02:00] administration is just very, very aggressively moving in and doing deals on these things. Like, it came out over the weekend that Nvidia is now allowed to sell their H20 chips, I think it is, to China.

[01:02:10] And then I think the Financial Times had the story that they, in essence, bribed the government to allow it to happen. So like 15% of the revenue for all those sales goes back to the federal government. They basically bought an exclusion on the tariffs.

[01:02:24] Mike Kaput: Yeah. 

[01:02:25] Paul Roetzer: And so we know that the government is wheeling and dealing all over the place.

[01:02:28] And so, yes, on its surface, great. It is probably gonna make for more efficient government, no doubt. My guess is sometime within the next 30 days, The Information or the Financial Times or Bloomberg, somebody, has the story of what the quid pro quo was here.

What did OpenAI get in exchange for giving the federal government these licenses for a dollar? I don't know. There's always layers to this stuff. But on the surface, great. It'll make for more efficient government, if people are trained how to actually use this stuff.

[01:02:59] ElevenLabs Launches AI Music

[01:02:59] Mike Kaput: Right?[01:03:00] 

[01:03:00] All right, next up. ElevenLabs, which is best known for its AI voice technology, is now stepping into music with Eleven Music, an AI generator that can create fully produced songs from a simple text prompt in minutes. It can generate any genre or style, with or without vocals, and blend instruments and traditions into seamless original tracks.

[01:03:20] It is apparently built for both creativity and commerce. It has licensing options for film, TV, ads, gaming, podcasts, and more, and the company frames it as a way for creators to skip the stock music grind and produce fully unique soundscapes. Interestingly, AI expert and copyright advocate Ed Newton-Rex, who we talk about often, posted about how the company's approach, at least initially, seems to differ from market incumbents.

[01:03:45] He said a co-founder of ElevenLabs confirms that their new AI music model is trained only on songs they've licensed. That is really good to see. When a handful of AI companies try to tell you generative AI can only be built with scraped copyrighted work, remember that the majority [01:04:00] of AI music models license their training data, including now ElevenLabs' model.

[01:04:05] Very embarrassing for the couple of AI music companies that are known to train on people's music without permission. Now, Paul, Ed Newton-Rex, in some follow-up comments in this thread on X, did say he'd like to see evidence backing up the claim of ElevenLabs' co-founder. He also asked a few times if they trained their voice model only on audio they've licensed.

He did not get an answer, but at least this does seem like a step in the right direction.

[01:04:31] Paul Roetzer: Yeah, the tech's awesome. But it's like anything else: all these tools are great, image generation, video generation, music, whatever, but there's always this underlying, yeah, but was it trained legally at some point?

I mean, I don't want the story to go away, per se. People like Ed need to keep the pressure on these labs and find ways to [01:05:00] compensate creators. I don't know the answer to how that happens, but so many of the AI labs just seem to have kind of moved on.

[01:05:06] It's like, man, of course we took their stuff, leave us alone. That's the general gist of how the labs respond whenever they're called out on it. It just is what it is. I don't know when we're gonna finally have a court case that changes anything, or some industry agreement that changes things.

[01:05:24] But up until then, every time we talk about how awesome something is, there's always the, yeah, but they still copied material.

[01:05:32] Meta Buys AI Audio Startup

[01:05:32] Mike Kaput: Now some more AI audio news. Meta has quietly snapped up a company called WaveForms, a fast-rising AI voice startup, for an undisclosed sum. It is Meta's second major AI audio acquisition in just about a month; it follows their purchase of Play AI.

[01:05:50] And this is all part of their new AI unit, Superintelligence Labs. WaveForms was founded only eight months ago, but had already raised $40 million from Andreessen [01:06:00] Horowitz and hit a $160 million valuation. The company's tech is focused on passing the so-called quote, speech Turing test.

[01:06:08] So basically, making AI speech indistinguishable from humans, and building what they call emotional general intelligence to detect and respond to emotional cues. The two co-founders, Alexis Conneau, a former Meta and OpenAI researcher who helped develop GPT-4o's Advanced Voice, and Coralie Lemaitre, a former Google ads strategist, have both reportedly joined Meta as part of this.

[01:06:33] So Paul, Meta acquired Play AI back in June. That's, quote, a startup that uses AI to generate human-sounding voices. WaveForms is building emotional general intelligence. We've been talking in past episodes about Meta's aspirations to build personal superintelligence. This really seems to me like we're heading in the direction of Meta building hyper-personalized voice assistants or companions.

What do you think?

[01:06:59] Paul Roetzer: [01:07:00] It definitely seems to be going in that direction. I mean, Zuckerberg's been on record in recent podcasts talking about voice plus glasses. They basically think that touch largely goes away as an interface, and that most of your interactions with intelligence, with agents, with assistants, happen through voice, in your interactions with the world around you.

And so it makes sense that they would be making lots of investments in this direction. And again, it gets back to that distribution question. Obviously OpenAI is going in the same direction; they've been putting a ton into voice. It seems like OpenAI probably had a lead.

[01:07:41] Maybe they still do on voice. Google's obviously making major plays into voice. I do think, as you were saying this, the one thing that crossed my mind, I dunno if you have this issue, Mike, 'cause I think you use ChatGPT voice as well. I love it, but I often use it when I'm driving, and it drops [01:08:00] in dead zones all the time.

It drives me crazy. And that goes to the whole open-source angle, or the opportunity for Apple to put a smaller voice model on the phone, on device, where I don't have to be going off device to have that conversation. Those are the windows of opportunity for someone like a Google with Pixel, or Apple with the iPhone, where I don't have to leave and I can just have that uninterrupted voice conversation. Because right now I'm talking, talking, talking, and three minutes goes by, and then I realize I lost the connection and the voice wasn't there anymore.

[01:08:32] And you're like, oh, everything I just said was perfect. I don't want to have to repeat that.

[01:08:36] Mike Kaput: A hundred percent. That happens all the time. And I feel like, despite how amazing advanced voice mode is, voice is underrated or underutilized at the moment. So it's not only about having it on device, but the type of device, right? The phone is the form factor right now. We know OpenAI is coming out with some type of device.

We don't know what; wearables, maybe, is the play. I feel like [01:09:00] AirPods would be incredible. It just feels like this could be a real big unlock.

[01:09:05] Paul Roetzer: Yeah, it just seemed like a year ago OpenAI was knocking on the door. They had basically solved it with their Whisper technology and built it in, and then it just feels like they lost momentum, or they ran outta compute.

[01:09:17] It's very possible they just couldn't launch it because they didn't have enough compute to do all the other stuff. But again, this is where the Apples and Googles, the stalwarts, the people with the distribution and the devices, that's where the opportunity is.

[01:09:32] I assume whatever they're building with Jony Ive is probably tied to voice in some capacity. So yeah, I think there's just gonna be a lot more to come with voice, probably still in 2025. Yeah, for sure.

[01:09:46] Google AI Pro for Students

[01:09:46] Mike Kaput: All right, last but not least: Google is making a big push to offer its most advanced AI tools to college students for free.

[01:09:56] It is committing a billion dollars to AI education, [01:10:00] training, and research in the US. Starting now, students in the US, and they also added Japan, Indonesia, Korea, and Brazil, can sign up for a free 12-month Google AI Pro plan. That includes Gemini 2.5 Pro for homework help and research, NotebookLM for organizing ideas.

[01:10:17] Veo 3 for AI-generated videos, higher limits on Google's AI coding agents, and two terabytes of storage. This release also debuts Guided Learning, a mode in Gemini that doesn't just give answers, but actually walks students through problems step-by-step to deepen their understanding.

[01:10:36] In the US, Google also reports that over a hundred colleges have already joined its new AI for Education Accelerator, which is offering free AI training and Google Career Certificates to college students. CEO Sundar Pichai says the goal is to put top-tier AI in students' hands and teach them how to use it well, helping them thrive as the first true generation of what he calls, quote, AI natives.

Now, [01:11:00] Paul, I feel like this might have flown a bit under the radar with all the other news. I would have to benchmark it, but a billion dollars in commitments to US schools over three years seems pretty significant, as does the offer of free AI training and Google Career Certificates to every student. I have a fair amount of conversations,

[01:11:19] and I know you do too, with teachers and higher-ed institutions. This feels like something that could really move the needle if they stick the landing on it.

[01:11:27] Paul Roetzer: Yeah, it's great to see. And I don't know what the connection is, but in April there was the executive order from the White House on advancing artificial intelligence education for American youth.

[01:11:37] And then they just came out, I think it was last month or something, with the policy plan, because the executive order basically said it is the policy of the United States to promote AI literacy and proficiency among Americans by promoting the appropriate integration of AI into education, providing comprehensive AI training for educators, and fostering early exposure to AI concepts and technology to develop an AI-ready workforce and the next generation of [01:12:00] American innovators.

So that was saying, hey, we're gonna do this, we're gonna create a task force, and in 90 days, 180 days, whatever, this is the plan. I don't know if this Google commitment is connected to that, but it would seem they're very closely aligned, at least. So yeah, I think this is great.

[01:12:17] I think we're seeing more and more of this from the major AI companies, whether it's Microsoft, OpenAI, or Anthropic, which has been releasing some great stuff. And so I would say, as you're building out, and ironically, I was building the AI Academy course this morning about building internal AI academies.

So this is very, very top of mind for me. Think about these things as you're building personalized learning journeys for your teams. It's like, okay, we're gonna have our core curriculum, but what can we pull from a Google? And obviously this is more K to 12, but conceptually, what can we pull from these different resources that can really enhance our people and prepare them for the future of work?

[01:12:52] And as you're even starting to hire, look at what kind of curriculum people have gone through with their AI education. Where are they already at with their [01:13:00] understanding and competency in this stuff? So yeah, it's awesome to see this really large focus, not just from Google but from the White House and other major companies, on the idea that AI literacy is absolutely critical to the future of work and innovation, not just in the US,

[01:13:16] Mike Kaput: but beyond that.

Yeah, a hundred percent. All right, Paul, we made it through GPT-5 week. Thanks for breaking it all down. How do you feel?

[01:13:24] Paul Roetzer: Different. I think my overall take is, it seems awesome. It's just, after a year and a half of waiting, you thought the world was gonna change after GPT-5 came out.

[01:13:35] It feels like they did more backtracking than actually accelerating.

[01:13:40] Mike Kaput: Yeah, I don't know. We'll see. I feel like it's going to be much more impactful than I even realize now, but it's gonna be a lot more subtle. You know, like when

[01:13:47] Paul Roetzer: When we look back in 30 days, 90 days, it's like, oh wait, that actually was a bigger deal than maybe it seemed in those first 48 hours.

[01:13:54] Mike Kaput: Hey, what I just said could be out of date by this afternoon when Google releases something. But yeah, I do [01:14:00] think we're going to look back and be like, hmm, okay, that might've been a subtle turning point. But again, it just shows the bubble, the hype, is outta control.

Yeah, and we all live in that.

[01:14:09] Paul Roetzer: Anyone listening to this show, at least, we generally live in a bubble, and most of your peers have no idea that GPT-5 came out or what it is. It's funny, my dad, who listens to the podcast every week, will often text me things, and he texted me, I think the morning after, and he goes, nothing on the news today.

[01:14:28] So he was watching the news to see if GPT-5 was even talked about in mainstream media, and he's like, nothing. And so that, again, tells you we're not to that point. We waited a year and a half, and the general public could care less. It's a non-event to them.

[01:14:45] Mike Kaput: Until the next Studio Ghibli filter goes viral or something. Right. All right, Paul. Well thanks. Good stuff. Thanks again. 

[01:14:52] Paul Roetzer: All right, thanks everyone. We'll talk to you next week. Thanks for listening to the Artificial Intelligence Show. Visit SmarterX.ai to [01:15:00] continue on your AI learning journey, and join more than 100,000 professionals and business leaders who have subscribed to our weekly newsletters, downloaded AI blueprints, attended virtual and in-person events,

[01:15:12] taken in online AI courses and earned professional certificates from our AI Academy, and engaged in the Marketing AI Institute Slack community. Until next time, stay curious and explore AI.


