YouTube Transcript: Gemini 2.5 just leveled up. And it's a BEAST
Video Transcript
Why do I have a photo of a tree here? More on that in just a second. So, Google just took their smartest AI model, Gemini 2.5 Pro, and made it even better. Now, confusingly, instead of calling it 2.6 Pro, it's still called Gemini 2.5 Pro, but they've added 05-06 to the end, which signifies the date of the release. Anyways, this model is seriously impressive. Apparently, it dominates the LM Arena leaderboard across all categories, making it the most performant, most intelligent AI model out there. So, in this video, I'm going to show you how and where to use it. Plus, I'll show you some cool things it can do, and of course, we'll go over its specs, performance, and benchmark scores. Let's jump right in. Thanks to HubSpot for sponsoring this video. All right, first of all, where can you use this? At least at the time of this recording, it's only available in Google's AI Studio, which I'll link to in the description below. At the top right in the model dropdown, you should see this new Gemini 2.5 Pro 05-06. Note that another place to use Google's Gemini models is the Gemini platform, which I'll also link to in the description below. It's just gemini.google.com. But if you select the model dropdown there, at least for now, it's not clear if this 2.5 Pro is the latest 05-06 version. So, in this video, I'm mostly going to use AI Studio to show you some cool examples. And this is what I personally prefer, because you can switch between all these different models, including the really powerful image editor from Gemini 2.0 Flash. Also note that this latest Gemini 2.5 Pro has a context window of over a million tokens. This is basically how much information you can fit into your prompt at once. A million tokens is roughly over 700,000 words, or an hour of video. This is like five times larger than what other leading AI models can take in at once. And then we also have this really handy temperature slider, which determines the creativity in its responses. If you drag this all the way to two, for example, it won't follow your prompt as strictly, which allows it to be more creative. If you drag it all the way to the left, it's going to process your prompt a lot more literally. For us, I'm just going to leave it at the default of one. And then you also have various toggles over here. Structured output basically forces the AI to format its response in a structured way. For example, if you want it to only output JSON, or output a data table with specified columns, then this would be a good toggle to turn on. Code execution allows Gemini to also execute code in the prompt. For function calling, if you enable this, the AI can use external tools or APIs to retrieve information. And then finally, this one is also super useful: if you toggle on grounding with Google Search, it searches the web with Google so that it can fetch the latest information.
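For developers, those same AI Studio settings map onto the Gemini API. Here's a minimal sketch of that mapping using the @google/generative-ai Node SDK; the model id and the prompt are assumptions from my side, so check the current docs before relying on them.

```js
// Minimal sketch (not from the video): AI Studio's settings expressed as API config.
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);

const model = genAI.getGenerativeModel({
  model: "gemini-2.5-pro-preview-05-06",   // preview id at the time; may change
  generationConfig: {
    temperature: 1.0,                      // the "creativity" slider (0 to 2)
    responseMimeType: "application/json",  // structured output: force JSON
  },
  // Code execution, function calling, and Google Search grounding are all
  // configured through the `tools` field; see the SDK docs for the exact shapes.
});

const result = await model.generateContent(
  "Return a JSON array of the three largest cities in Japan."
);
console.log(result.response.text());
```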
Now, the leading AI models out there, including Gemini 2.5 Pro, can already do simple stuff like writing an essay or simple Q&A. So, if you're going for simple stuff like that, it doesn't really matter which one of these models you use. They're all really good. But what makes the top models particularly useful, including Gemini 2.5 Pro and o3 and o4-mini, is their ability to think and reason and solve more complex problems in STEM subjects like coding, math, and science. So, in this video, that's mostly what I'm going to show you. I'm going to test it on some really challenging STEM-related prompts. Plus, the awesome thing about Gemini is that it's multimodal, so it can take in and understand multiple formats, including audio, images, and video. In fact, for the first example, I'm going to take this video where I draw out a diagram of the app I want and explain what I want it to do. Let me play you the video first. "I want you to create an interactive earthquake visualization of Japan. So, let's say we have a map of Japan like this. First, I want you to list or show me all the major cities in Japan on the map. And then there's going to be a left sidebar where I can adjust various settings like earthquake magnitude, etc., etc. So these are the settings that I can adjust. And whenever I click somewhere on the map, so let's say I click here, then you would start to create an earthquake. So it's going to be an animation effect that slowly ripples and ripples all the way until it hits one of these cities. And based on the magnitude of the earthquake, I want you to calculate how severe the impact would be for each major city." So, I've uploaded that to YouTube, and then I'm going to paste the YouTube link in here like this. Notice it automatically knows how to extract and analyze the YouTube video, and this takes up around 14,000 tokens out of the million tokens. Now, to make sure it's actually analyzing the video and understanding everything in it for my prompt, I'm not going to mention anything about earthquakes or Japan. I'm just going to write "put everything in a standalone single HTML file." So, let's click run and see what that gives us. All right, so here's its thinking process. All the top models out there usually have this thinking function, where the model takes some time to think through its answer and correct itself before it gives you its final response. So let's look at its thinking process really quickly. Here it's breaking down the requirements which I specified in the video: a map of Japan, a left sidebar for settings, click to create an earthquake, earthquake animation, impact calculation, etc., etc. And then it starts with a plan of attack. Phase one is the basic structure and the map. Phase two is adding the sidebar controls. Phase three is the earthquake animation. Phase four is impact calculation, etc., etc. Afterwards, it proceeds to give me the entire code. So I'm just going to scroll all the way down, download this HTML, and then open it in my browser. And here's what we get. Indeed, we have an interactive map of Japan, which you can move around. And if I click here, for example, it does cause an earthquake, and we can see the impact of the earthquake on all the cities. This is so cool. Now, if we change the magnitude, let's change this lower. And then if I click here again, note that the severity is lower than the previous earthquake, which had a larger magnitude. And then if I drag this all the way up to 10, for example, and click here, notice that the severity is a lot higher, reaching 100 for some of the cities that are nearby. And then let's see what wave factor does. I think this is the speed of the ripples. So if I drag this to a lower value and click here again, yeah, the ripples are a bit slower. Anyways, a really cool app. It totally understood my really lousy explanation and illustration from the video.
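To give a sense of what an app like that has to compute, here's a purely hypothetical sketch of a magnitude-plus-distance impact score. It is not the code Gemini produced; the city coordinates and the falloff constant are made up for illustration.

```js
// Hypothetical impact scoring: severity grows with magnitude and decays with
// distance from the click point (the epicenter). Not the generated code.
const cities = [
  { name: "Tokyo", x: 520, y: 340 },
  { name: "Osaka", x: 430, y: 390 },
  { name: "Sapporo", x: 560, y: 120 },
];

function severity(magnitude, epicenter, city) {
  const distance = Math.hypot(city.x - epicenter.x, city.y - epicenter.y);
  const raw = (magnitude * 100) / (1 + distance / 50); // inverse-distance falloff
  return Math.min(100, Math.round(raw));               // cap the score at 100
}

const epicenter = { x: 480, y: 360 }; // where the user clicked on the map
for (const city of cities) {
  console.log(`${city.name}: severity ${severity(8, epicenter, city)}`);
}
```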
And this opens up a ton of possibilities. Instead of just typing out a prompt and not fully being able to explain how you want to design an app, you can record yourself drawing out an illustration and explaining what each component of the app does. Then you just plug the video into Gemini and it will generate the app for you. All right, next up: again, because Gemini is multimodal, it can understand images. I'm going to upload this image of a tree and then ask it what is here. Let's click run and see if it can figure this out. It only thought for like 5 seconds, and here it has correctly identified that it's a mossy leaf-tailed gecko camouflaged on the tree trunk. It even gave me the scientific name. And this is indeed correct. So for those of you who have no idea what the hell you're looking at, there is a gecko over here. This is its head, pointing down. You can see here are its eyes, and if you follow my mouse, this is roughly the outline of its head. This is a really cool gecko found in, I believe, Madagascar, and it's really good at camouflage. So as you can see, Gemini has no problem analyzing and understanding images. Speaking of Google's Gemini, if you're in marketing and you find yourself spending hours on research, strategy, and content creation, it's time to rethink your approach with AI. Check out this free guide, Google Gemini at Work by HubSpot. Inside, you'll discover the Gemini Marketing Stack. These are AI tools that make your research, campaign planning, and content creation way more productive. And my favorite part: it provides a ton of pre-built prompts and templates which you can just copy and paste. You'll get step-by-step instructions on how to do research 10 times faster using Gemini Deep Research. They also show you how to use NotebookLM to connect your campaign data, competitor research, and customer feedback into one powerful dashboard that actually thinks for you. No more digging through folders or piecing insights together manually. There's even a four-week implementation plan at the end, so you can start small and see results right away. This resource was made by HubSpot, the sponsor of this video. I recommend you download it for free via the link in the description below. Next, I'm going to upload this image of a hike I did a few years ago. And this isn't even the main lake or attraction of the hike. It's a pretty normal hiking photo of mountains and a lake. This could be anywhere. So I'm going to paste the image in here and ask it where is this. Let's click run and see what that gives us. All right, here's what we get. Let's expand its thinking process. It's analyzing the key visual elements: turquoise water, steep tree-covered slopes, glaciers in the background. It could be all of these options. Based on the feel, it seems like the Canadian Rockies or the BC coast. And then it's actually searching for specific lakes. Now, because I didn't turn on grounding with Google Search, it's not actually using Google to search. It's just searching mentally, in its head, based on the knowledge it was trained on. And then it has found all these turquoise lakes, and after some additional clues, it has concluded that this is indeed Joffre Lakes. The crazy thing is it has even identified that this looks like the middle lake, which I believe is correct. So, those were some tests on its image and video analysis capabilities.
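If you'd rather do the same image Q&A through the API instead of AI Studio, here's a minimal sketch using the same @google/generative-ai Node SDK; the local file name and the model id are assumptions.

```js
// Sketch: attach a photo as inline base64 data and ask a question about it.
import { GoogleGenerativeAI } from "@google/generative-ai";
import { readFileSync } from "node:fs";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-2.5-pro-preview-05-06" });

const image = {
  inlineData: {
    data: readFileSync("tree.jpg").toString("base64"), // hypothetical file
    mimeType: "image/jpeg",
  },
};

const result = await model.generateContent(["What is here?", image]);
console.log(result.response.text());
```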
Next, let's test its knowledge of coding. I'm going to get it to build a Windows XP desktop with the following apps. Paint: clicking on this should open a new window with an interactive canvas. Video player: clicking on this should open a window where I can enter a YouTube URL and press play. And for calculator, clicking on this should open a window with a working calculator. Use CSS, JS, and HTML in a single HTML file. This is a key phrase I like to use to keep everything in a self-contained, standalone file. So after pressing run, if we expand its thought process, again it's breaking this down step by step. First it's understanding the request, then it's structuring the HTML, next it's handling styling, so the desktop look and feel, then it's covering the functionality with JavaScript, and then it's refining everything. And here is a really interesting observation: it's also self-correcting and improving its response. So here's its initial thought, but here it has corrected itself. You also need dragging and stacking, so you might need to implement this. And then for the YouTube player, would it just be this? The correction is no, you also need an embedded player. And then also for this, is it safe? Etc., etc. So, it's kind of evaluating its own response and then revising it further. Afterwards, it has given me this code, so I'm just going to scroll all the way down and download the HTML. And if I open this up, you can indeed see a classic Windows XP desktop with the appropriate colors. We even have a start menu and the clock over here. And if I click on Paint, indeed, it gives me a window. And let's try painting this. This does work. Let me change the color a bit. And let me change the size. The size and color also work. Really impressive. So, let me exit out of this. Next, I'm going to open this video player and paste in a YouTube URL. I'm just going to paste in my earthquake video and then press play. "I want you to create an interactive earthquake visualization of Japan. So, let's say we have a map of Japan like this. First, I want you to list or show me all the major cities in Japan." Very nice. So, that works perfectly. And then finally, we have this calculator app. Let's do 3 * 9. And yes, that equals 27. So, all three apps are working. It's able to code up a Windows XP desktop with three functional apps in just one prompt. Super impressive. All right. Next up, let's get it to create some cool visualizations. My prompt is: create a particle cloud visualizer that can change shape, color, and other properties. Make it interactive. Use Three.js. This is a JavaScript library for creating 3D animations. And also anime.js, which is another library that can help us create smooth and dynamic animations. And then again, my key phrase that I like to use: put everything in a single HTML file. So let's click run and see what that gives us. All right, here's what we get. And if I expand this, again it's breaking down the core request. Then it's planning out the structure of the HTML. It's setting up Three.js. It's setting up the particle system. And then it's coding up the interactivity, implementing shape transitions, color changes, etc., etc. And then finally, we also have this really important self-correction and refinement section, where it evaluates its own response and revises it further. So afterwards, it has given me this HTML code, which I'm just going to scroll all the way down and then click download.
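For context on what a file like that contains, here's roughly what a bare-bones Three.js particle cloud looks like: a Points mesh over a BufferGeometry of random positions. This is a sketch of the general technique, not the code Gemini produced.

```js
// Minimal Three.js particle cloud: thousands of points rendered as one mesh.
import * as THREE from "three";

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(60, innerWidth / innerHeight, 0.1, 100);
camera.position.z = 5;

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(innerWidth, innerHeight);
document.body.appendChild(renderer.domElement);

const COUNT = 5000;
const positions = new Float32Array(COUNT * 3);
for (let i = 0; i < positions.length; i++) {
  positions[i] = (Math.random() - 0.5) * 4; // random cloud around the origin
}

const geometry = new THREE.BufferGeometry();
geometry.setAttribute("position", new THREE.BufferAttribute(positions, 3));
const material = new THREE.PointsMaterial({ size: 0.02, color: 0x66ccff });
const points = new THREE.Points(geometry, material);
scene.add(points);

function animate() {
  requestAnimationFrame(animate);
  points.rotation.y += 0.002; // slow spin so the cloud feels alive
  renderer.render(scene, camera);
}
animate();
```

Morphing between shapes like a sphere, cube, torus, or plane then comes down to tweening each point toward a new target position, which is exactly the kind of job anime.js gets used for.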
All right. Next, I'm going to open this up in my browser. And holy smokes, what do we have here? It looks like this particle cloud is slowly forming into a sphere. Oh my god, this looks so cool. And I can drag my mouse around to view this further. If I increase the particle size, it does increase. Very nice. And then I can also change the color like this. Very nice. And then if I toggle this, apparently it also uses this shape mode color. Let me try changing the color of this and see what happens. Okay, so it looks like it's turning the color into a gradient now. And then for shape, right now it's a sphere. Let's turn this into a cube. Whoa. Holy smokes. This is such a cool animation. Look at that. And then let's turn this into a torus. This is so impressive. Look at that. And then finally, let's turn this into a plane. And indeed, it turns it into a flat plane like this. Really cool. Let me turn this back into a sphere. And indeed, it creates a sphere from this. So there you go. It also just nailed this zero-shot, with just one prompt. In fact, let me refresh the page again. I really like the initial animation where it turns into a sphere from this particle cloud. Look how cool this is. I really love that effect. All right. Next, let's test its ability to understand physics. Here the prompt is: make a Galton board simulation with a grid of pegs, sidewalls, and separate dividers at the bottom. Drop balls from the top upon button click. Use Matter.js. This is another really important JavaScript library that can simulate physics very well. And then here's my key phrase to put everything in a single HTML file. Let's click run and see what that gives us. All right, here's its response. I'm just going to scroll all the way down and download the HTML. And then afterwards, let me open this up. And here you can see a perfect Galton board with perfect physics understanding. If I press drop ball, the ball indeed drops, and it falls down randomly into one of the containers based on gravity and physics. So, let's click this a few more times so you can see a few more examples. This is again a flawless app that it created, zero-shot. Super impressive.
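For anyone curious what a setup like that involves, here's a minimal sketch of a Galton board in Matter.js: static pegs, walls, and dividers, plus dynamic balls dropped on a button click. It's a sketch of the general pattern, not the file Gemini generated, and the dimensions and the "drop" button id are made up.

```js
// Minimal Matter.js Galton board: static pegs/walls/dividers + dropped balls.
import Matter from "matter-js";

const { Engine, Render, Runner, Bodies, Composite } = Matter;

const engine = Engine.create();
const render = Render.create({
  element: document.body,
  engine,
  options: { width: 400, height: 600, wireframes: false },
});

// Static pegs in an offset grid.
const pegs = [];
for (let row = 0; row < 8; row++) {
  for (let col = 0; col < 10; col++) {
    const x = 40 + col * 36 + (row % 2 ? 18 : 0);
    const y = 100 + row * 40;
    pegs.push(Bodies.circle(x, y, 4, { isStatic: true }));
  }
}

// Floor, side walls, and bottom dividers forming the collection bins.
const statics = [
  Bodies.rectangle(200, 600, 400, 20, { isStatic: true }), // floor
  Bodies.rectangle(0, 300, 20, 600, { isStatic: true }),   // left wall
  Bodies.rectangle(400, 300, 20, 600, { isStatic: true }), // right wall
];
for (let i = 1; i < 10; i++) {
  statics.push(Bodies.rectangle(i * 40, 540, 4, 120, { isStatic: true }));
}

Composite.add(engine.world, [...pegs, ...statics]);
Render.run(render);
Runner.run(Runner.create(), engine);

// Drop a slightly bouncy ball near the top, with a little horizontal jitter.
document.getElementById("drop")?.addEventListener("click", () => {
  const ball = Bodies.circle(200 + (Math.random() - 0.5) * 10, 20, 8, {
    restitution: 0.4,
  });
  Composite.add(engine.world, ball);
});
```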
All right, here's another cool example. Show me a visualizer with animations upon mouse hover. In a sidebar, I can choose from different effects like blur, liquid chrome, particles, waves, grid distortion, iridescence, hyperspeed, add more. Some of these effect names I just made up. I'm not even sure what it's going to give me. And then I'm going to use anime.js, which again is really good for creating animations on web pages. So, let's click run and see what that gives us. All right, here's its response. Again, it has the usual thought process where it breaks down everything and tackles it step by step. And then at the end of its thinking process, it's also correcting itself and doing a final check on all the requirements. And then I'm just going to scroll all the way down to the end of the code and press download. All right, let's open this up and see what we get. So here the first effect is blur. If I hover my mouse over this, it indeed blurs these circles. And if I take my mouse off the screen, the circles are sharp again. Really cool. So, blur works. Next, let's move on to particles. If I hover my mouse, wow, look at that. If I move my mouse along the screen, it automatically creates these particle fireworks. So, you can use Gemini to easily add these really cool and complex animations to your website. Next, let's try waves. All right, here's what we get. Now, if I place my mouse on the screen, that is what it does. Let me just do this a few more times so you can see the effect of my mouse hover. Really cool. And then next up, we have grid distortion, and here's what grid distortion does. Again, a very interesting effect. And then hyperspeed. If I place my mouse on the screen, this is so cool. Notice that the stars are now moving at a much faster pace, and if I take my mouse off the screen, the stars revert to a slower pace. Let's do this again so you can see the effect. Very nice. Next, let's try glitch and see what that does. Very cool. Depending on where I hover my mouse, it adds this glitch effect over the text. And then let's see what pixel stretch does. Whoa, really interesting. It seems like it's stretching the letters either horizontally or vertically. This kind of looks like a barcode as well. And then next we have liquid chrome. Let's see what this does. Wow, this is also so cool. Depending on where I move my mouse, it's creating this effect which I can't even describe. And then finally, we have iridescence, which looks like this. I didn't even know what to expect for iridescence, but this does look like an iridescent orb. And if I move my mouse on this sphere, I'm not sure what happens. It does kind of change the color slightly, but again, I don't really know what to expect for a lot of these effects, so I'm not expecting much here. The fact that it was able to even create an iridescent-looking orb is already really impressive.
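Hover effects like these ultimately come down to a handful of tween calls. Here's a minimal sketch of the basic pattern with the anime.js v3 API (animate in on mouseenter, animate back out on mouseleave); the element selectors are hypothetical, and this is not the code Gemini generated.

```js
// Basic hover-effect pattern with anime.js: tween properties in on hover,
// tween them back out when the pointer leaves.
import anime from "animejs";

const card = document.querySelector(".effect-card"); // hypothetical container
const circles = card.querySelectorAll(".circle");    // hypothetical children

card.addEventListener("mouseenter", () => {
  anime({
    targets: circles,
    scale: 1.3,
    opacity: 0.5,
    translateY: -10,
    delay: anime.stagger(40), // ripple the animation across the circles
    duration: 400,
    easing: "easeOutQuad",
  });
});

card.addEventListener("mouseleave", () => {
  anime({
    targets: circles,
    scale: 1,
    opacity: 1,
    translateY: 0,
    duration: 400,
    easing: "easeOutQuad",
  });
});
```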
So, those are some of my tests. Notice that this new Gemini 2.5 Pro 05-06 is not way better than the earlier version; it's just marginally better. And in fact, I already did a full review of the original Gemini 2.5 Pro where I go over some really insane demos. I got it to create a Pokédex and an interactive night sky viewer with constellations, I got it to analyze a ton of financial reports, and I even got it to create a 3D tourist map of Hong Kong. So, I'm not going to repeat too many of those examples in this video. If you want to learn more, check out that video if you haven't already. Finally, here are some demos by Google themselves. Again, because Gemini is multimodal, it can understand images. You can upload an image of this tree and then get it to transform the image into a code-based representation of its natural behavior, and this is what you get. And instead of a tree, if you upload a photo of a spiderweb with the same prompt, it would create this app. And then here is a photo of a fire with the same prompt. Here is a photo of fireflies. We also have clouds, a flock of birds, and this photo of a fern. I really like this animation. And then here we have some water ripples, and I don't even know what this is. Is this like fungus growing or something? And then it can even create this lightning simulator. Really cool. Here's another awesome demonstration by Demis Hassabis, where he just drew out a really rough sketch of the app he wants to create and then simply wrote, "Can you code this app?" And this is the final result. Or here's another example where the user prompts it to code a game based on his dog. He uploads a photo of his dog with a sakura background, and it actually creates a sakura-themed game with his dog as the character. How incredible is that? All right, next let's go over its specs and performance. First up is the Chatbot Arena (LM Arena), where people can blind-test different AI models side by side. And for this latest version of Gemini 2.5 Pro, not only is it ranked number one overall, but also across all these categories, including style control, hard prompts, coding, math, creative writing, instruction following, and longer query. And by the way, the margin is absolutely huge. If you look at the next top three models, which are OpenAI's o3, GPT-4o, and Grok 3, they only differ within about 10 points, but Gemini 2.5 Pro beats the next best one by 37 points, which is an insane lead. Now, besides LM Arena, here's another popular leaderboard called LiveBench by Abacus AI. And interestingly, on this leaderboard, the latest version of Gemini 2.5 Pro does not perform so well. Now, this is based on their own benchmarks; these are not blind tests from other users, so keep that in mind. Notice that o3 High is still ranked number one on their leaderboard, and Gemini 2.5 Pro is in third place. It underperforms o3 in terms of reasoning, coding, and language, but it does outperform o3 in terms of mathematics and data analysis. I also tried going to another independent evaluator called Artificial Analysis, but it looks like they have not added the latest version of Gemini 2.5 Pro yet, so this is still the March version. Here's another really useful benchmark called Fiction.LiveBench, which tests the AI's ability to analyze really long prompts. For example, if the story is like 120,000 words in length and you ask it some really specific questions, can the AI model actually get it right? And surprisingly, OpenAI's o3 got it correct 100% of the time, whereas the latest version of Gemini 2.5 Pro gets it right 71.9% of the time. Keep in mind that this is the same score as the previous version of Gemini 2.5 Pro. So, if you want to feed it a ton of information at once and then ask it specific questions, according to this leaderboard, o3 might be the better option. By the way, if you're interested in learning more about OpenAI's o3 and o4-mini, I also did a full review on those, and they have some crazy abilities, so definitely check out that video if you haven't already. Next up, we have another leaderboard called Humanity's Last Exam. This name is really misleading. It does not mean that we are screwed once AI can get 100%. This is basically a test of some really specific knowledge in really obscure and specialized scientific domains. And interestingly, the latest version of Gemini 2.5 Pro actually scores a bit below the earlier version that was released in March, as you can see from the score here. However, based on the confidence intervals, this is not a significant difference. So, in fact, all five of these models do not have a significant difference in terms of their performance; they're all kind of tied for first place. Finally, if you look at this leaderboard called GeoBench, it basically tests the AI's ability to guess the location based on a photo, like I did with the Joffre Lakes example. And you can see here that Gemini 2.5 Pro is currently ranked number one. And if you add search to it, which is kind of cheating, it performs even better. It's also really important that the AI model actually gives you factually correct information and doesn't make stuff up. So, here's a really useful leaderboard that lists out the hallucination rates of these AI models, or basically how often they make stuff up.
Now, they haven't released the results for the latest version of Gemini 2.5 Pro yet, but as you can see from the March version, it hallucinates 1.1% of the time. If you really want your information to be factually correct, like if this is for scientific or legal research, then at least according to this leaderboard, you should use Gemini 2.0 Flash instead. Finally, I also want to go over the cost. In their official blog, it says this improved version will be available at the same price. So if you look at the price of Gemini 2.5 Pro, notice that it is cheaper than Claude 3.7, Grok 3, and OpenAI's o3, which is crazy expensive. So not only is this one of the best models out there, but it's also cheaper than the others, making it really cost-effective. Anyways, that sums up my review of this latest version of Gemini 2.5 Pro. For me, the most useful feature is that I can record a video explaining exactly how I want an app to look and function, and it will actually understand everything and create the app for me. This is way more effective than just using a text prompt. But let me know in the comments what you think. And if you've had a chance to play around with this latest version, what are some other cool and impressive things you were able to come up with? As always, I will be on the lookout for the top AI news and tools to share with you. So, if you enjoyed this video, remember to like, share, subscribe, and stay tuned for more content. Also, there's just so much happening in the world of AI every week that I can't possibly cover everything on my YouTube channel. So, to really stay up to date with all that's going on in AI, be sure to subscribe to my free weekly newsletter. The link to that will be in the description below. Thanks for watching, and I'll see you in the next one.