0:00 Why do I have a photo of a tree here?
0:02 More on that in just a second. So,
0:04 Google just took their smartest AI
0:06 model, Gemini 2.5 Pro, and made it even
0:10 better. Now, confusingly, instead of
0:12 calling it 2.6 Pro, it's still called
0:14 Gemini 2.5 Pro, but they've added 05-06
0:18 to the end, which signifies the date of
0:20 the release. Anyways, this model is
0:23 seriously impressive. Apparently, it
0:25 dominates the LM Arena leaderboard
0:28 across all categories, making it the
0:31 most performant, most intelligent AI
0:33 model out there. So, in this video, I'm
0:35 going to show you how and where to use
0:37 it. Plus, I'll show you some cool things
0:39 it can do, and of course, we'll go over
0:41 its specs, performance, and benchmark
0:43 scores. Let's jump right in. Thanks to
0:46 HubSpot for sponsoring this video. All
0:48 right, first of all, where can you use
0:50 this? At least at the time of this
0:52 recording, it's only available in
0:54 Google's AI Studio, which I'll link to
0:56 in the description below. At the top
0:58 right in the model dropdown, you should
1:00 see this new Gemini 2.5
1:04 Pro 05-06. Note that another place to use
1:07 Google's Gemini models is the Gemini
1:10 platform, which I'll also link to in the
1:12 description below. It's just
1:14 gemini.google.com. But if you select the
1:16 model dropdown, at least for now, it's
1:18 not clear if this 2.5 Pro is the latest
1:21 05-06 version. So, in this video, I'm
1:24 mostly going to use AI Studio to show
1:26 you some cool examples. And this is what
1:29 I personally prefer because you can
1:31 switch from all these different models,
1:33 including this really powerful image
1:35 editor from Gemini 2.0 Flash. And then
1:38 note that for this latest Gemini 2.5
1:40 Pro, it has a context window of over a
1:43 million tokens. This is basically how
1:46 much information you can fit into your
1:48 prompt at once. So a million tokens is
1:50 roughly 700,000 words or an hour of
1:55 video. This is like five times larger
1:57 than what other leading AI models can
1:59 take in at once. And then we also have
2:01 this really handy temperature slider
2:04 which determines the creativity in its
2:06 responses. So, if you drag this all the
2:08 way to two, for example, this wouldn't
2:10 follow your prompt as much, and this
2:13 allows it to be more creative. If you
2:15 drag this all the way to the left, it's
2:16 going to process your prompt a lot more
2:18 literally. So, for us, I'm just going to
2:21 leave it at the default of one. And then
2:23 you also have various toggles over here.
2:26 So, structured output basically forces
2:28 the AI to format its response in a
2:30 structured way. So for example, if you
2:32 want it to only output JSON or output a
2:36 data table with specified columns, then
2:38 this would be a good toggle to turn on.
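As a side note, if you want these same settings outside of the AI Studio UI, you can set them when calling the Gemini API directly. Here's a minimal sketch in Node.js; the video only shows the UI, so treat the SDK package, the exact model ID string, and the prompt below as my assumptions rather than something from the demo:

```javascript
// Minimal sketch: the same temperature + structured (JSON) output settings via the Gemini API.
// The model ID string is an assumption for the 05-06 release; AI Studio can export the exact
// snippet for whatever settings you've toggled.
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({
  model: "gemini-2.5-pro-preview-05-06",  // assumed model ID
  generationConfig: {
    temperature: 1.0,                      // same default as the slider in AI Studio
    responseMimeType: "application/json",  // "structured output": force JSON-only replies
  },
});

const result = await model.generateContent(
  "List three major cities in Japan as a JSON array of {name, population} objects."
);
console.log(result.response.text());
```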
2:41 And then for code execution, this allows
2:43 Gemini to also execute code in the
2:45 prompt. And then for function calling,
2:47 if you enable this, the AI can use
2:49 external tools or APIs to retrieve
2:52 information. And then finally, this one
2:53 is also super useful. If you toggle this
2:56 on, it basically searches the web with
2:59 Google so that it can fetch the latest
3:01 information. Notice for the leading AI
3:03 models out there, including Gemini 2.5
3:06 Pro, they can already do simple stuff
3:09 like writing an essay or simple Q&A. So,
3:11 if you're going for simple stuff like
3:13 that, it doesn't really matter which one
3:15 of these models you use. They're all
3:17 really good. But what makes the top
3:19 models particularly useful, including
3:22 Gemini 2.5 Pro and OpenAI's o3 and o4-mini, is their
3:25 ability to think and reason and solve
3:27 more complex problems for STEM subjects
3:30 like coding, math, and science. So, in
3:33 this video, that's mostly what I'm going
3:34 to show you. I'm going to test it on
3:36 some really challenging STEM related
3:38 prompts. Plus, the awesome thing about
3:40 Gemini is that it's multimodal. So, it
3:42 can take in and understand multiple
3:44 formats, including audio, images, and
3:47 video. In fact, for the first example,
3:49 I'm going to take this video where I
3:52 draw out a diagram of the app I want and
3:55 explain what I wanted to do. Let me play
3:57 you the video first. I want you to
3:59 create an interactive earthquake
4:02 visualization of Japan. So, let's say we
4:04 have a map of Japan like this. First, I
4:07 want you to list or show me all the
4:09 major cities in Japan on the map. And
4:12 then there's going to be a left sidebar
4:14 where I can adjust various settings like
4:17 earthquake magnitude, etc., etc. So
4:20 these are the settings that I can
4:22 adjust. And whenever I click somewhere
4:24 on the map, so let's say I click here,
4:27 then you would start to create an
4:29 earthquake. So it's going to be an
4:31 animation effect that slowly ripples and
4:34 ripples all the way until it hits one of
4:38 these cities. And based on the magnitude
4:40 of the earthquake, I want you to
4:42 calculate how severe the impact would be
4:45 for each major city. So, I've uploaded
4:48 that to YouTube. And then I'm going to
4:50 paste the YouTube link in here like
4:52 this. So, notice it automatically knows
4:54 how to extract and analyze the YouTube
4:56 video. And this takes up around 14,000
4:58 tokens out of the million tokens. Now,
5:01 to make sure it's actually analyzing the
5:03 video and understanding everything in
5:05 the video for my prompt, I'm not going
5:07 to mention anything about earthquakes or
5:09 Japan, I'm just going to write put
5:11 everything in a standalone single HTML
5:14 file. So, let's click run and see what
5:16 that gives us. All right, so here's its
5:18 thinking process. For all the top models
5:20 out there, usually they have this
5:22 thinking function where it takes some
5:23 time to think through its answer and
5:26 correct itself before it gives you its
5:28 final response. So let's look at its
5:30 thinking process really quickly. Here
5:32 it's breaking down the requirements
5:35 which I specified in the video. A map of
5:37 Japan, left sidebar for settings, click
5:39 to create earthquake, earthquake
5:41 animation, impact calculation, etc.,
5:43 etc. And then it starts with a plan of
5:45 attack. So phase one is the basic
5:47 structure and the map. And then phase
5:49 two is adding the sidebar controls.
5:51 Phase three is the earthquake animation.
5:53 Phase four is impact calculation, etc.,
5:56 etc. So afterwards, it proceeds to give
5:59 me the entire code. So I'm just going to
6:01 scroll all the way down and then
6:03 download this HTML. And then I'm just
6:05 going to open the HTML in my browser.
6:07 And here's what we get. Indeed, we have
6:10 an interactive map of Japan, which you
6:12 can move around. And if I click here,
6:14 for
6:15 example, it does cause an earthquake.
6:17 And we can see the impact of the
6:19 earthquake on all the cities. This is so
6:22 cool. Now, if we change the magnitude,
6:24 let's change this lower. And then if I
6:27 click here again, note that the severity
6:30 is lower than the previous earthquake,
6:32 which has a larger magnitude. And then
6:35 if I drag this all the way to like 10,
6:37 for example, and let's say I click here,
6:40 notice that the severity is a lot
6:42 higher, reaching 100 for some of these
6:45 cities that are nearby. And then let's
6:47 see what wave factor does. I think this
6:50 is like the speed of the ripples. So if
6:52 I drag this to a lower value and I click
6:54 here again. Yeah. So this ripples a bit
6:58 slower. Anyways, a really cool app. It
7:00 totally understood my really lousy
7:03 explanation and illustration from the
7:05 video. And this opens up a ton of
7:08 possibility. Instead of just typing out
7:10 a prompt and not fully being able to
7:13 explain how you want to design an app,
7:15 you can record yourself just drawing out
7:17 an illustration explaining what each
7:19 component of the app does. And then you
7:21 just plug the video into Gemini and it
7:22 would generate the app for you. All
7:24 right, next up again because Gemini is
7:27 multimodal and it can understand images.
7:30 I'm going to upload this image of a tree
7:32 and then I'm going to ask it what is
7:35 here. Let's click run and see if it can
7:37 figure this out. It only thought for
7:39 like 5 seconds. And here it has
7:41 correctly identified that it's a mossy
7:44 leaf-tailed gecko camouflaged on the tree
7:46 trunk, and it even gave me the
7:49 scientific name. This is indeed correct. So
7:51 for those of you who have no idea what
7:54 the hell you're looking at, there is a
7:56 gecko like over here. So this is its
7:58 head. It's pointing down. You can see
8:00 here are its eyes. And if you follow my
8:04 mouse, this is roughly the outline of
8:06 its head. This is a really cool gecko
8:09 found in, I believe, Madagascar. And
8:11 it's really good at camouflage. So as
8:13 you can see, Gemini has no problem
8:16 analyzing and understanding images.
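A quick aside on how this looks outside of AI Studio: programmatically, an image (and, similarly, an uploaded video) is just another part of the prompt. This is a rough sketch of my own; the model ID and the file name are placeholders, not taken from the video:

```javascript
// Sketch of multimodal input via the Gemini API: a text question plus an inline image.
// The model ID and file name are placeholders.
import { GoogleGenerativeAI } from "@google/generative-ai";
import { readFileSync } from "node:fs";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-2.5-pro-preview-05-06" });

const imagePart = {
  inlineData: {
    data: readFileSync("gecko.jpg").toString("base64"), // placeholder image file
    mimeType: "image/jpeg",
  },
};

const result = await model.generateContent(["What is here?", imagePart]);
console.log(result.response.text());
```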
8:18 Speaking of Google's Gemini, if you're
8:20 in marketing and you find yourself
8:22 spending hours on research, strategy,
8:24 and content creation, it's time to
8:26 rethink your approach with AI. Check out
8:29 this free guide, Google Gemini at Work
8:32 by HubSpot. Inside, you'll discover the
8:34 Gemini Marketing Stack. These are AI
8:37 tools that make your research, campaign
8:39 planning, and content creation way more
8:41 productive. And my favorite part, it
8:44 provides a ton of pre-built prompts and
8:46 templates which you can just copy and
8:48 paste. You'll get step-by-step
8:50 instructions on how to do research 10
8:52 times faster using Gemini Deep Research.
8:55 They also show you how to use
8:57 NotebookLM to connect your campaign data,
9:00 competitor research, and customer
9:02 feedback into one powerful dashboard
9:04 that actually thinks for you. No more
9:07 digging through folders or piecing
9:09 insights together manually. There's even
9:11 a four-week implementation plan at the
9:14 end, so you can start small and see
9:16 results right away. This resource was
9:19 made by HubSpot, the sponsor of this
9:21 video. I recommend you download it for
9:23 free via the link in the description
9:25 below. Next, I'm going to upload this
9:27 image of a hike I did like a few years
9:30 ago. And this isn't even like the main
9:32 lake or attraction of the hike. It's a
9:35 pretty normal, you know, hiking photo of
9:37 mountains and a lake. This could be
9:39 anywhere. And then I'm going to paste
9:40 the image in here and then ask it where
9:43 is this. So let's click run and see what
9:45 that gives us. All right, here's what we
9:47 get. Let's expand its thinking process.
9:50 So it's analyzing the key visual
9:51 elements. It has turquoise water, steep
9:53 tree covered slopes, glaciers in the
9:56 background. It could be all of these
9:58 options. Based on the feel, it seems
10:00 like Canadian Rockies or BC coast. And
10:03 then it's actually searching for
10:05 specific lakes. Now, because I didn't
10:08 turn on grounding with Google search,
10:10 it's not actually using Google to
10:12 search. It's just searching mentally in
10:14 its head based on the knowledge it was
10:16 trained on. And then it has found all
10:18 these turquoise lakes. And then after
10:20 some additional clues, it has concluded
10:23 that this is indeed Joffre Lakes. The
10:26 crazy thing is it has kind of even
10:29 identified that this looks like the
10:31 middle lake, which I believe is correct.
10:33 So, those were some tests on its image
10:36 and video analysis capabilities. Next,
10:38 let's test its knowledge on coding. So,
10:41 I'm going to get it to build a Windows
10:43 XP desktop with the following apps.
10:45 Paint. Clicking on this should open a
10:47 new window with an interactive canvas.
10:50 Video player. Clicking on this should
10:51 open a window where I can enter a
10:53 YouTube URL and press play. And then for
10:56 calculator, clicking on this should open
10:58 a window with a working calculator. Use
11:01 CSS, JS, and HTML in a single HTML file.
11:05 This is a key phrase I like to use to
11:07 keep everything in a self-contained
11:10 standalone file. So after pressing run,
11:13 if we expand its thought process, again,
11:15 it's breaking this down step by step. So
11:18 it's first understanding the request,
11:20 then it's structuring the HTML, and then
11:22 next it's handling styling, so the
11:24 desktop look and feel. And then next
11:26 it's covering the functionality with
11:28 JavaScript. And then it's refining
11:30 everything. And then here is a really
11:32 interesting observation. So it's also
11:34 self-correcting and improving its
11:36 response. So here's its initial thought,
11:39 but here it has corrected itself. You
11:42 also need dragging and stacking. So you
11:44 might need to implement this. And then
11:46 for the YouTube player, would it just be
11:48 this? The correction is no. You also
11:50 need an embedded player. And then also
11:53 for this, is it safe? etc., etc. So,
11:56 it's kind of like evaluating its own
11:58 response and then revising it further.
12:00 And so, afterwards, it has given me this
12:03 code. So, I'm just going to scroll all
12:05 the way down and download the HTML. All
12:08 right. And if I open this up, you can
12:10 indeed see a classic Windows XP desktop
12:14 with the appropriate colors. We even
12:15 have a start menu and the clock over
12:18 here. And if I click on paint, indeed,
12:21 it gives me a window. And let's try
12:23 painting this. This does work. Let me
12:25 change the color a bit. And let me
12:27 change the
12:28 size. And the size and color also work.
12:33 Really impressive. So, let me exit out
12:35 of this. Next, I'm going to open this
12:37 video player. And then, let me paste in
12:39 a YouTube URL. I'm just going to paste
12:41 in my earthquake video and then press
12:43 play. I want you to create an
12:46 interactive earthquake visualization of
12:49 Japan. So, let's say we have a map of
12:51 Japan like this. First, I want you to
12:54 list or show me all the major cities in
12:57 Japan. Very nice. So, that works
13:00 perfectly. And then finally, we have
13:02 this calculator app. Let's do like 3 *
13:06 9. And yes, that equals 27. So, all
13:09 three apps are working. So, it's able to
13:11 code up a Windows XP desktop with three
13:14 functional apps in just one prompt.
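If you're wondering what the "dragging and stacking" it talked itself into actually takes, it's a small amount of vanilla JavaScript. This is just the generic pattern with placeholder IDs and class names; Gemini's real output will differ:

```javascript
// Generic draggable-window pattern: grab the title bar, move the absolutely positioned window.
// "paint-window" and ".title-bar" are placeholder names, not from Gemini's actual code.
const win = document.getElementById("paint-window"); // assumes position: absolute in CSS
const titleBar = win.querySelector(".title-bar");
let zIndexCounter = 10;

titleBar.addEventListener("mousedown", (e) => {
  win.style.zIndex = ++zIndexCounter; // "stacking": the clicked window comes to the front
  const offsetX = e.clientX - win.offsetLeft;
  const offsetY = e.clientY - win.offsetTop;

  const onMove = (e) => {
    win.style.left = `${e.clientX - offsetX}px`;
    win.style.top = `${e.clientY - offsetY}px`;
  };
  const onUp = () => {
    document.removeEventListener("mousemove", onMove);
    document.removeEventListener("mouseup", onUp);
  };
  document.addEventListener("mousemove", onMove);
  document.addEventListener("mouseup", onUp);
});
```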
13:16 Super impressive. All right. Next up,
13:18 let's get it to create some cool
13:20 visualizations. So my prompt is create a
13:23 particle cloud visualizer that can
13:25 change shape, color, and other
13:27 properties. Make it interactive. Use
13:30 three.js. This is a JavaScript library for us
13:33 to create 3D animations. And also
13:36 anime.js. This is also another library
13:38 that can help us create smooth and
13:40 dynamic animations. And then again, my
13:42 key phrase that I like to use, put
13:44 everything in a single HTML file. So
13:47 let's click run and see what that gives
13:48 us. All right, here's what we get. And
13:50 if I expand this again, it's breaking
13:52 down the core request. Then it's
13:55 planning out the structure of the HTML.
13:57 It's setting up three.js. It's setting up
14:00 the particle system. And then it's
14:02 coding up the interactivity,
14:04 implementing shape transitions, color
14:06 changes, etc., etc. And then finally, we
14:10 also have this really important
14:12 self-correction and refinement section
14:14 where it evaluates its own response and
14:16 revises it further. So afterwards, it
14:19 has given me this HTML code, which I'm
14:22 just going to scroll all the way down
14:23 and then click download. All right.
14:25 Next, I'm going to open this up on my
14:27 browser again. And holy smokes, what do
14:32 we have here? So, it looks like this
14:35 particle cloud is slowly forming into
14:37 the sphere. Oh my god, this looks so
14:40 cool. And I can like drag my mouse
14:42 around to view this further. If I
14:44 increase the particle size, it does
14:47 increase. Very nice. And then I can also
14:50 change the color like this. Very nice.
14:53 And then if I toggle this, apparently it
14:55 also has this shape morph color option. Let me
14:58 try changing the color of this and see
15:00 what happens. Okay, so it looks like
15:02 it's turning the color into a gradient
15:04 now. And then for shape, right now it's
15:07 a sphere. Let's turn this into a
15:09 cube.
15:11 Whoa. Holy smokes. This is such a cool
15:15 animation. Look at that. And then let's
15:19 turn this into a
15:20 torus. This is so impressive. Look at
15:24 that. And then finally, let's turn this
15:27 into a
15:29 plane. And indeed, it turns it into a
15:32 flat plane like this. Really cool. Let
15:35 me turn this back into a sphere. And
15:38 indeed, it creates a sphere from this.
15:41 So there you go. It also just nailed
15:44 this zero shot with just one prompt. In
15:46 fact, let me refresh the page again. I
15:48 really like the initial animation where
15:50 it turns into a sphere from this
15:52 particle cloud. Look how cool this is. I
15:56 really love that effect. All right.
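For a rough idea of what's happening under the hood here, a three.js particle cloud like this usually boils down to one Points object whose vertex positions get eased toward target shapes every frame. The following is my own stripped-down sketch with made-up counts and colors, not the code Gemini generated:

```javascript
// Stripped-down particle cloud: scattered points easing toward positions on a sphere.
// All counts, sizes, and colors are made up for illustration.
import * as THREE from "three";

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(60, innerWidth / innerHeight, 0.1, 100);
camera.position.z = 5;
const renderer = new THREE.WebGLRenderer();
renderer.setSize(innerWidth, innerHeight);
document.body.appendChild(renderer.domElement);

const COUNT = 5000;
const positions = new Float32Array(COUNT * 3); // current positions (start scattered)
const targets = new Float32Array(COUNT * 3);   // sphere positions to morph toward

for (let i = 0; i < COUNT; i++) {
  positions.set([Math.random() * 6 - 3, Math.random() * 6 - 3, Math.random() * 6 - 3], i * 3);
  const theta = Math.random() * Math.PI * 2;
  const phi = Math.acos(2 * Math.random() - 1);
  targets.set(
    [2 * Math.sin(phi) * Math.cos(theta), 2 * Math.sin(phi) * Math.sin(theta), 2 * Math.cos(phi)],
    i * 3
  );
}

const geometry = new THREE.BufferGeometry();
geometry.setAttribute("position", new THREE.BufferAttribute(positions, 3));
const cloud = new THREE.Points(geometry, new THREE.PointsMaterial({ color: 0x66ccff, size: 0.02 }));
scene.add(cloud);

(function animate() {
  requestAnimationFrame(animate);
  // ease every coordinate a little closer to its target: the "forming into a sphere" effect
  for (let i = 0; i < positions.length; i++) {
    positions[i] += (targets[i] - positions[i]) * 0.02;
  }
  geometry.attributes.position.needsUpdate = true;
  renderer.render(scene, camera);
})();
```

Swapping the target array to cube, torus, or plane coordinates is what drives those shape transitions.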
15:58 Next, let's test its ability to
16:00 understand physics. So, here the prompt
16:02 is make a Galton board simulation with a
16:04 grid of pegs, sidewalls, and separate
16:07 dividers at the bottom. Drop balls from
16:09 the top upon button click. Use
16:12 matter.js. This is another really
16:14 important JavaScript library that can
16:16 simulate physics very well. And then
16:18 here's my key phrase to put everything
16:20 in a single HTML file. Let's click run
16:24 and see what that gives us. All right,
16:25 here's its response. I'm just going to
16:27 scroll all the way down and download the
16:29 HTML. And then afterwards, let me open
16:32 this up. And here you can see a perfect
16:35 Galton board with perfect physics
16:37 understanding. So if I press on drop
16:39 ball, the ball indeed drops, and it drops
16:42 down randomly into a certain container
16:44 based on gravity and physics. So, let's
16:47 click this a few more times so you can
16:49 see a few more examples. This is again a
16:52 flawless app that it created. Zero shot.
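For reference, the matter.js setup behind a Galton board like this is surprisingly small: static pegs and walls, plus dynamic circles for the balls. Here's a bare-bones sketch of my own; the coordinates, counts, and the "drop" button ID are placeholders, and it assumes matter.js is already loaded as the global Matter object:

```javascript
// Bare-bones Galton board with matter.js: staggered static pegs, side walls, bottom dividers,
// and a ball dropped on each button click. All numbers are placeholder values.
const { Engine, Render, Runner, Bodies, Composite } = Matter;

const engine = Engine.create();
const render = Render.create({
  element: document.body,
  engine,
  options: { width: 400, height: 600 },
});

const bodies = [];
for (let row = 0; row < 8; row++) {          // staggered grid of pegs
  for (let col = 0; col < 10; col++) {
    const x = 40 + col * 36 + (row % 2 ? 18 : 0);
    bodies.push(Bodies.circle(x, 100 + row * 40, 4, { isStatic: true }));
  }
}
bodies.push(Bodies.rectangle(0, 300, 20, 600, { isStatic: true }));    // left wall
bodies.push(Bodies.rectangle(400, 300, 20, 600, { isStatic: true }));  // right wall
bodies.push(Bodies.rectangle(200, 600, 400, 20, { isStatic: true }));  // floor
for (let i = 0; i <= 10; i++) {              // dividers forming the bins at the bottom
  bodies.push(Bodies.rectangle(20 + i * 36, 540, 4, 100, { isStatic: true }));
}

Composite.add(engine.world, bodies);
Render.run(render);
Runner.run(Runner.create(), engine);

document.getElementById("drop").addEventListener("click", () => {
  // drop a slightly jittered ball from the top; gravity and collisions do the rest
  Composite.add(
    engine.world,
    Bodies.circle(200 + (Math.random() - 0.5) * 10, 20, 8, { restitution: 0.5 })
  );
});
```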
16:56 Super impressive. All right, here's
16:58 another cool example. Show me a
17:00 visualizer with animations upon mouse
17:03 hover. In a sidebar, I can choose from
17:05 different effects like blur, liquid
17:08 chrome, particles, waves, grid
17:10 distortion, iridescence, hyperspeed,
17:12 add more. Some of these effect names I
17:15 just made up. I'm not even sure what
17:16 it's going to give me. And then I'm
17:18 going to use anime.js, which is again
17:20 really good for creating animations on
17:23 web pages. So, let's click run and see
17:25 what that gives us. All right, here's
17:27 its response. Again, it has the usual
17:30 thought process where it breaks down
17:32 everything and tackles it step by step.
17:35 And then at the end of its thinking
17:36 process, it's also correcting itself and
17:39 then also doing a final check on all the
17:42 requirements. And then I'm just going to
17:44 scroll all the way down to the end of
17:46 the code and then press download. All
17:48 right, let's open this up and see what
17:51 we get. So here the first effect is
17:53 blur. If I hover my mouse over this, it
17:56 indeed blurs these circles. And if I
17:59 take my mouse off the screen, the
18:01 circles are sharp again. Really cool.
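By the way, if you want to wire up this kind of hover-triggered animation yourself, the anime.js pattern is tiny: one animation on mouseenter, the reverse on mouseleave. This is a generic sketch with a placeholder class name; the generated file presumably animates a CSS blur filter the same way, but I'm showing scale and opacity here:

```javascript
// Generic anime.js hover pattern: animate on mouseenter, animate back on mouseleave.
// ".circle" is a placeholder selector; swap the animated properties for whatever effect you want.
document.querySelectorAll(".circle").forEach((el) => {
  el.addEventListener("mouseenter", () => {
    anime({ targets: el, scale: 1.2, opacity: 0.5, duration: 400, easing: "easeOutQuad" });
  });
  el.addEventListener("mouseleave", () => {
    anime({ targets: el, scale: 1.0, opacity: 1, duration: 400, easing: "easeOutQuad" });
  });
});
```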
18:03 So, blur works. Next, let's move on to
18:05 particles. If I hover my mouse, wow,
18:08 look at that. If I move my mouse along
18:10 the screen, it automatically creates
18:12 these particle fireworks. So, you can
18:15 use Gemini to easily add these really
18:18 cool and complex animations on your
18:20 website. Next, let's try waves. All
18:23 right, here's what we get. Now, if I
18:25 place my mouse on the screen, that is
18:27 what it does. Let me just do this a few
18:29 more times so you can see the effect of
18:31 my mouse hover. Really cool. And then
18:34 next up, we have grid distortion. And
18:37 here's what grid distortion does. Again,
18:40 a very interesting effect. And then
18:42 hyperspeed. If I place my mouse on the
18:45 screen, this is so cool. So, notice that
18:48 the stars are now moving at a much
18:50 faster pace. And then if I take my mouse
18:52 off the screen, the stars now revert to
18:55 a slower pace. Let's do this again so
18:58 you can see the effect. Very nice. Next,
19:01 let's try glitch and see what that does.
19:05 Very cool. So, depending on where I
19:07 hover my mouse, it will add this glitch
19:10 effect over the text. And then let's see
19:13 what pixel stretch does. Whoa, really
19:16 interesting. So, it seems like it's just
19:18 stretching the letters either
19:20 horizontally or vertically. This kind of
19:22 looks like a barcode as well. And then
19:24 next we have liquid chrome. Let's see
19:27 what this does. Wow, this is also so
19:31 cool. Depending on where I move my
19:33 mouse, it's creating this effect which I
19:35 can't even describe. And then finally,
19:38 we have iridescence, which looks like
19:40 this. I don't even know what to expect
19:42 for iridescence, but uh this does look
19:44 like an iridescent orb. And if I move my
19:48 mouse on this sphere, I'm not sure what
19:51 happens. It does kind of change the
19:53 color slightly, but um yeah, again, I
19:56 don't really know what to expect for a
19:58 lot of these effects, so I'm not
20:00 expecting much here. The fact that it
20:01 was able to even create an iridescent
20:04 looking orb is already really
20:06 impressive. So, those are some of my
20:08 tests. Notice that this new Gemini 2.5
20:11 Pro 05-06 is not like way better than the
20:15 earlier version. This is just like
20:16 marginally better. And in fact, I
20:18 already did a full review of the
20:21 original Gemini 2.5 Pro where I go over
20:23 some really insane demos. I got it to
20:26 create a Pokédex, an interactive night
20:29 sky viewer with constellations. I got it
20:31 to analyze a ton of financial reports
20:34 and even create a 3D tourist map of Hong
20:37 Kong. So, I'm not going to repeat too
20:39 many of those examples in this video. If
20:42 you want to learn more, check out this
20:43 video if you haven't already. Finally,
20:46 here are some demos by Google
20:48 themselves. So, again, because Gemini is
20:50 multimodal, it can understand images.
20:52 You can upload an image of this tree and
20:55 then get it to transform this image into
20:57 a code-based representation of its
20:59 natural behavior. And this is what you
21:02 get. And instead of a tree, if you
21:04 upload a photo of a spiderweb with the
21:06 same prompt, it would create this app.
21:09 And then here is a photo of a fire with
21:12 the same prompt. Here is a photo of
21:14 fireflies. We also have clouds, a flock
21:18 of birds, and this photo of a fern. I
21:21 really like this animation. And then
21:23 here we have some water ripples, and I
21:26 don't even know what this is. Is this
21:27 like fungus growing or something? And
21:29 then it can even create this lightning
21:32 simulator. Really cool. Here's another
21:35 awesome demonstration by Demis Hassabis
21:37 where he just drew out a really rough
21:40 sketch of the app he wants to create and
21:42 then he simply wrote, "Can you code this
21:44 app?" And this is the final
21:51 result. Or here's another example where
21:53 the user prompts to code a game based on
21:57 his dog. He's going to upload a photo of
21:59 his dog with a Sakura background and it
22:03 actually creates a Sakura related game
22:06 with his dog as the character. How
22:08 incredible is that? All right, next
22:11 let's go over its specs and performance.
22:14 So, first up is this chatbot arena where
22:17 people can blind test different AI
22:19 models side by side. And apparently for
22:22 this latest version of Gemini 2.5 Pro,
22:25 not only is it ranked number one
22:27 overall, but across all these
22:30 categories, including style control,
22:32 hard prompts, coding, math, creative
22:34 writing, instruction following, and
22:36 longer query. And by the way, the margin
22:39 is absolutely huge. So if you look at
22:41 the next top three models, which are
22:44 OpenAI's o3, GPT-4o, and Grok 3, these
22:48 only differ by about 10 points, but
22:51 Gemini 2.5 Pro beats the next
22:54 best one by 37 points, which is an insane
22:58 lead. Now, instead of LM Arena, here's
23:01 another popular leaderboard called
23:02 LiveBench by Abacus AI. And
23:05 interestingly, in this leaderboard, the
23:08 latest version of Gemini 2.5 Pro does
23:10 not perform so well. Now, this is based
23:12 on their own benchmarks. These are not
23:14 blind tests from other users, so keep
23:17 that in mind. Notice that o3 High is
23:20 still ranked number one on their
23:22 leaderboard. And then Gemini 2.5 Pro is
23:24 in third place. It underperforms o3 in
23:28 terms of reasoning and coding and
23:30 language, but it does outperform o3 in
23:32 terms of mathematics and data analysis.
23:36 I also tried going to another
23:37 independent evaluator called artificial
23:39 analysis, but it looks like they have
23:41 not added the latest version of Gemini
23:43 2.5 Pro yet. So, this is still the March
23:46 version. Here's another really useful
23:48 benchmark called Fiction.LiveBench,
23:50 which tests the AI's ability to analyze
23:53 really long prompts. So, for example, if
23:55 the story is like 120,000 words in
23:59 length and you ask it some really
24:00 specific questions, can the AI model
24:03 actually get it right? And surprisingly,
24:05 OpenAI's o3 got it correct 100% of the
24:08 time, whereas the latest version of
24:10 Gemini 2.5 Pro gets it 71.9% of the
24:14 time. Keep in mind that this is the same
24:16 score as the previous version of Gemini
24:18 2.5 Pro. So, if you want to feed it a
24:21 ton of information at once and then ask
24:23 it specific questions, according to this
24:26 leaderboard, o3 might be the better
24:28 option. By the way, if you're interested
24:30 in learning more about OpenAI's o3 and
24:32 o4-mini, I also did a full review on that,
24:34 and it has some crazy abilities, so
24:37 definitely check out this video if you
24:39 haven't already. Next up, we have
24:41 another leaderboard called Humanity's
24:43 Last Exam. This name is really
24:45 misleading. It does not mean that we are
24:47 screwed once AI can get 100%. This is
24:51 basically a test of some really specific
24:54 knowledge on really obscure and
24:57 specialized scientific domains. And
24:59 interestingly, Gemini 2.5 Pro, the
25:02 latest version, actually scores a bit
25:06 below the earlier version that was
25:08 released in March, as you can see from
25:10 the score here. However, based on the
25:13 confidence intervals, this is not a
25:15 significant difference. So, in fact, all
25:17 five of these models do not have a
25:20 significant difference in terms of their
25:22 performance. So, they're all kind of
25:24 tied for number one place. Finally, if
25:26 you look at this leaderboard called
25:28 GeoBench, this basically tests the AI's
25:31 ability on guessing the location based
25:34 on a photo, like I did with the Joffre
25:36 Lakes example. And you can see here that
25:39 Gemini 2.5 Pro is currently ranked
25:42 number one. And if you add search to it,
25:44 which is kind of cheating, but if you
25:46 do, it performs even better. It's also
25:49 really important that the AI model
25:51 actually gives you factually correct
25:53 information and doesn't make stuff up.
25:56 So, here's a really useful leaderboard
25:58 that lists out the hallucination rates
26:00 of these AI models, or basically how
26:02 often they make stuff up. Now, they
26:04 haven't released the results for the
26:06 latest version of Gemini 2.5 Pro yet,
26:09 but as you can see from the March
26:11 version, it hallucinates 1.1% of the
26:14 time. If you really want your
26:16 information to be factually correct,
26:18 like if this is for scientific or legal
26:20 research, then at least according to
26:22 this leaderboard, you should use Gemini
26:24 2.0 Flash instead. Finally, I also want
26:28 to go over the cost of this. So in their
26:30 official blog it says this improved
26:33 version will be available at the same
26:35 price. So if you look at the price of
26:37 Gemini 2.5 Pro, notice that it is cheaper
26:40 than Claude 3.7 Sonnet, Grok 3, and OpenAI's o3,
26:45 which is crazy expensive. So not only is
26:47 this one of the best models out there
26:49 but it's also cheaper than the other
26:51 ones making it really cost effective.
26:53 Anyways that sums up my review on this
26:56 latest version of Gemini 2.5 Pro. For
26:59 me, I think the most useful feature is I
27:01 can record a video explaining exactly
27:04 how I want an app to look and function,
27:06 and it would actually understand
27:08 everything and create the app for me.
27:10 This is way more effective than just
27:12 using a text prompt. But let me know in
27:14 the comments what you think. And if
27:16 you've had a chance to play around with
27:17 this latest version, what are some other
27:19 cool and impressive things you were able
27:22 to come up with? As always, I will be on
27:24 the lookout for the top AI news and
27:27 tools to share with you. So, if you
27:29 enjoyed this video, remember to like,
27:30 share, subscribe, and stay tuned for
27:33 more content. Also, there's just so much
27:35 happening in the world of AI every week,
27:37 I can't possibly cover everything on my
27:40 YouTube channel. So, to really stay up
27:42 to date with all that's going on in AI,
27:45 be sure to subscribe to my free weekly
27:47 newsletter. The link to that will be in
27:49 the description below. Thanks for
27:51 watching, and I'll see you in the next
27:53 one.