This content details a comparative coding test where three advanced AI models (GPT Codex, Claude Code, and Google's Antigravity) were tasked with building a game for a custom arcade cabinet, evaluating their planning and execution capabilities.
Today we're going to be doing a vibe
coding test between GPT Codex, Claude
Code using Claude Opus 4.5, and Google's
Antigravity using Gemini 3 Pro. Now,
this is going to be an interesting test
because their task is going to be the
exact same thing to build a game that
works on the arcade cabinet behind us,
which you may be familiar with because I
did a video on it, its introduction to
the channel using GLM 4.7 Flash with
rather impressive results. So, I'm very
excited to see what these models are
capable of putting out today. And to
jump right into it, I want to say do not
be concerned by the fact that I have the
web chat interfaces for all of these up
right now. The reason for this is we're
going to be giving them an initial
prompt as well as some reference so that
they can go ahead and lay out tasks for
the specific agent from each of these
model providers. So essentially
what I'm saying right here is each of
these models is just tasked with
outlining the plan of action that the
agent will subsequently carry out
using Codex, Claude Code, and Google's
Antigravity. So really, I think the
first thing to do is probably just
quickly go over the prompt that we're
giving each of these models, which are
going to act as the initial orchestrator
for this task. And then we'll jump into
some agentic coding because the rest of
this video will be done either through
the terminal for Codex and Claude Code
or using the UI that comes
bundled with Google's Antigravity. So
this is going to be rather exciting, at
least for myself, and hopefully the
viewers feel that way as well. So feel
free to subscribe and join as a channel
member for less than 10 cents a day and
let's get into it. The prompt is basically
telling them that the task is to build a
game to run on a custom device which we
have also attached a picture of because
these models can visually reference this
and that may influence some of their
design decisions. So this is the
specific image that we've attached
alongside our prompt. In addition to
that, there is also a Python script that
just shows the proper control mapping
for the inputs for this device, mainly
the two arcade buttons, the steering
wheel, one additional kind of blinker
stalk button, and then also the joystick.
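As a rough illustration of what such a control-mapping reference could contain, here is a minimal Python sketch. The actual script is not shown in the video, so every name and raw code below (`Control`, `BUTTON_MAP`, `AXIS_MAP`, the button and axis indices, the 16-bit axis range, and the deadzone) is an assumption made purely for illustration.

```python
from enum import Enum
from typing import Optional, Tuple


class Control(Enum):
    """Logical controls named in the video (identifiers are hypothetical)."""
    BUTTON_A = "arcade_button_a"
    BUTTON_B = "arcade_button_b"
    BLINKER_STALK = "blinker_stalk"
    STEERING = "steering_wheel"
    JOYSTICK_X = "joystick_x"
    JOYSTICK_Y = "joystick_y"


# Assumed raw indices; the real cabinet's codes are not given in the source.
BUTTON_MAP = {0: Control.BUTTON_A, 1: Control.BUTTON_B, 2: Control.BLINKER_STALK}
AXIS_MAP = {0: Control.STEERING, 1: Control.JOYSTICK_X, 2: Control.JOYSTICK_Y}


def normalize_axis(raw: int, lo: int = -32768, hi: int = 32767,
                   deadzone: float = 0.05) -> float:
    """Scale a raw axis reading to [-1.0, 1.0], zeroing small jitter."""
    value = (raw - lo) / (hi - lo) * 2.0 - 1.0
    if abs(value) < deadzone:
        return 0.0
    return max(-1.0, min(1.0, value))


def translate(kind: str, code: int, raw: int) -> Optional[Tuple[Control, float]]:
    """Map one raw event to (logical control, value), or None if unmapped."""
    if kind == "button" and code in BUTTON_MAP:
        return BUTTON_MAP[code], 1.0 if raw else 0.0
    if kind == "axis" and code in AXIS_MAP:
        return AXIS_MAP[code], normalize_axis(raw)
    return None
```

Keeping the raw-to-logical mapping in plain tables like this is one way a reference file can describe the controls without implying that the game itself has to be written in Python.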
That's just so they understand the
controls. We don't want to leave that
open to interpretation because not
having the controls mapped properly can
just mean a lot of painful debugging,
which is kind of outside the scope of
just seeing what games they make.
Regardless of that, we
can see that we're just instructing them
to basically outline a plan of action
for this agent, providing us with
the following things: an initial prompt
to send to the agent. If there are to be
multiple prompts, meaning if they want
to use a planning mode initially and
then a build mode, they should specify
that as well as the subsequent prompts
for either specific mode they choose to
use. And in addition to that, they
should also include any additional
information that they want placed in the
root project directory, such as a
markdown file instructing the agent on
specific things. I have told them the
only thing contained within the project
directory is the Python script for the
control schema, so that the models can
reference that and the agents can see,
okay, these are the controls that I'm
mapping it to. I don't want them to
just default to Python for this, as it
would kind of be restrictive in terms of
how cool the end result can be, so I
have specifically mentioned that the
script being a Python file is not
indicative of wanting a Python project.
It is just a reference point for them to
understand how the controls are
implemented. So with that we're basically
ready to go ahead, and I will probably
just start all of these at the same time,
being that I know GPT 5.2 on extended
thinking, or heavy thinking as we see
right here, is going to take a little
while. I have extended thinking enabled
for Opus 4.5, and Gemini 3 is set to Pro,
and we also do have the canvas mode
enabled. Game design document: Steel Thunder.
All right, so now we're going to go
ahead and see a 3D tank combat arena
game designed specifically for the dual
controller arcade cabinet. Okay, so
that's first and foremost what we see
for Claude. I'm not gonna read through
the entire thing verbatim right here.
Let's see what Gemini 3 Pro... I get so
excited. I love this stuff.
Seriously, I very much enjoy this.
All right, so what are you going to be
giving us today?
My mistake. I was stuck trying to parse
the canvas, thinking that was the only
output of it. It's made a
bit more difficult by having these in
this 33% window size from having
three of them up at a time. Tank
commander or mech simulator. Very
interesting. So, two of them have opted
to try to think of a tank thing. I will
say I do like that Gemini 3 Pro has
started out by complimenting the hardware
setup. So, I do appreciate that very
much. Now, I don't see any specific
thing with the GPT output. Like with the
other two, we kind of had tank style
games that were suggested. This one just
says use something that has the
controls. I think it's leaning more
towards some form of racing game, being
that I did see some vehicle selection
things mentioned here. And I was wrong.
I had actually neglected to see this
specific design markdown file from GPT
where it does mention the game design
seed is to be a neon drift corer. That
actually sounds pretty cool. I am
personally probably more interested in
seeing this result than the two tank
games. Just personal preference, but we
definitely will have a wide variety of
different options to play, hopefully,
assuming they work. I have all of the
repositories seeded now with the
markdown files and every other bit of
information that these models wanted for
their agents. So now I'm just going to
sequentially begin actually coding with
these. And I think this needs to be done
one at a time at least for the initial
start because I will get overwhelmed and
probably copy paste the wrong thing into
the wrong model. Now let's go ahead and
set this to the highest end model. So if
we type model right here, we are going
to use 5.2 Codex. But if we press
enter, we can select a different
reasoning effort which is going to be
extra high. And they do give us a
warning that this can consume tokens at
a significantly higher rate. However, I
am looking for maximum performance and
we'll see how it does. GPT has not told
me to do anything specific. It just says
prompt A and prompt B. So, this
hypothetically should just be sent to
the agent right now. And then following
that, once that's done, I should just do
this. So, really, unless I've misread
something, it seems like we can just
basically copy paste this in and then
press enter. And that begins our GPT 5.2
in Codex with extra-high thinking. Now, I was
initially thinking perhaps I could try
these like one video for each specific
model here just to, you know, stretch
out the max amount of content from a
YouTuber perspective. I decided not to
do that. It'd be more interesting with
all of these in one video. I am now
somewhat starting to regret that, just
because this is a bit chaotic, though
we'll see what happens. So Claude
had put all of the things that it wants
us to do in terms of specific modes,
specific prompts in this action plan
markdown file, which I neglected to
notice on my initial look through the
generated results from these online chat
agents. But now we're going to go ahead
and just send this: with extended
thinking enabled, high reasoning budget,
validate the design, plan architecture,
and identify risks. Now, interestingly
enough, Claude has opted to use the
Godot game engine for this, something I've
personally never in my life used, though
I do find it very interesting. So
there's going to perhaps be some form of
learning curve for myself, being that I
did not restrict how they go about
making these games. So Godot is a
perfectly reasonable way to go. I just
may have to do a little more work than
anticipated. Okay, so Claude's given me
a few different options, one of which is
the tab key. So we can use that or we
can just use the ultrathink keyword.
Now personally I think it's not the best
to have to have like a bunch of
different keywords here to prompt it to...
Okay, good. So, Claude agrees with me
that it is a janky solution to have to
put these weird keywords in, but okay.
So, we'll just do what it says. And now
we're going to send our ultrathink plan.
And now Claude is running, as well as
GPT. So, this is really a disaster
to wrangle at the same time. I'm
wholeheartedly regretting not having
just done separate videos for each of
these, which I suppose I still could do,
being that this video hasn't been put
out yet, but I'm committed. I'm now
invested. So, let's start Antigravity.
So, I just got a warning that
but it's still open.
So, I'm probably just going to opt to
ignore that. I do have planning mode
selected as Gemini 3 said we should do
for our initial prompt. And we do also
have Gemini 3 Pro (High) enabled. So we're
just going to paste our initial prompt
here into Antigravity as well. And
we now have all three of these embarked
on the initial phase, since all of
them basically wanted an initial prompt.
So as I was saying, all of these
basically began with an initial prompt
to allow the agent to plan based off of
the outline or architecture for the game
that was decided by the web chat
interface agents. So now we have all of
them working. I'm going to wait till
they have all completed the planning
phase to go into the next prompt which
will embark upon the actual build of
this just to keep things a little
coherent. I think that this is where we
may begin to stray from the prompt
because I almost want to just allow it
to continue now as opposed to sending
the prompt that GPT had said I should
subsequently send once this has been done.
But I suppose I will. This is tough now.
I kind of want to let the agent just
work how it decides to. So, being that
that is the case, and keep in mind this
may be wrong,
we'll just allow them to work in the way
they want to. I think I was maybe
supposed to use an arrow key there to
select one of those. It seems to have
understood what specifically was going
on. And Gemini is just asking for
permission initially to create the
plan markdown file, as I had not granted
it permission to create files in that
directory. And it is going ahead and
that to happen so we can have them all
building at the same exact time. Key
decisions. Okay, I'm fine with the
concept. The critical risk is that it
does not know if Godot is going to be
able to properly enumerate and
distinguish these HID devices,
and it's giving me a confidence level
here. In that case, I would probably
just tell it to use something else, because
I kind of don't want to have to deal
with a bunch of Godot stuff right now,
being that I've genuinely never used it.
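The enumeration risk flagged here can be checked outside the engine before committing to it. What follows is a minimal sketch, assuming the cabinet runs Linux (Ubuntu comes up later in the video) and exposes its controllers through the kernel's `/proc/bus/input/devices` listing. The function names are hypothetical, and this is not what Claude's plan actually ran.

```python
from typing import Dict, List


def parse_input_devices(text: str) -> List[Dict[str, object]]:
    """Parse a /proc/bus/input/devices listing into name/handler records."""
    devices = []
    # The kernel separates devices with a blank line.
    for block in text.strip().split("\n\n"):
        name, handlers = None, []
        for line in block.splitlines():
            if line.startswith("N: Name="):
                name = line[len("N: Name="):].strip().strip('"')
            elif line.startswith("H: Handlers="):
                handlers = line[len("H: Handlers="):].split()
        if name is not None:
            devices.append({"name": name, "handlers": handlers})
    return devices


def joystick_devices(devices: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Keep only devices the kernel exposes through a jsN joystick handler."""
    return [d for d in devices
            if any(h.startswith("js") for h in d["handlers"])]
```

On the cabinet itself, reading the text of `/proc/bus/input/devices` and passing it through both functions would show whether the wheel and the joystick appear as two separately named devices, which is the distinction the engine would need to make.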
I'm actually fine with letting it try
this even using Godot because it has
baked into the implementation order that,
first and foremost, we need to just verify
that the inputs actually work. So if
they don't work with Godot, then we can
go back and have it use a different
language or something of the sort. But I
figure I want to allow it to continue
doing what it wants to do, at least until
we know, make or break, if the controls
are actually going to work. I would
assume they're going to. Now we have
Google Agent Manager. And I could have
sworn that I had set the settings in
this to make everything a bit more
visible. Okay, so Antigravity has given
us our design document for a game called
Bore Core. I don't want to laugh at that
because there may be some technical term
that actually makes sense for this
kind of trench driver tunnel shooter.
Oh, it's going to use Python and Ursina.
This is going to be quite a task to make
look good compared to GPT choosing
to just use like Three.js or something. So,
this will be neat. We do have our
document we can kind of take a quick
peek at right here. But to be honest
with you, I'm just going to let them go
ahead and do whatever they want. Oh, it
has given me multiple different concepts
that I could choose from though. And it
did go ahead and select a specific
concept just from all of these, which
I'm cool with then because that
eliminates me having to choose one
specifically and then potentially making
the wrong choice subjectively speaking.
So, I'm just going to let this go ahead.
Okay, do whatever you want. Now, they're
all building except for Claude. Or is
Claude building right now as well? It is
building. Very cool. So, we now have
Codex 5.2 Extra High, Opus 4.5 with
ultrathink, and Gemini 3 Pro vibe coding
games for our arcade cabinet. Very
exciting. So, apparently Bore Core is done,
which was very quick, which concerns me
a bit. There is one window here that
will specifically show us the files it
has created. And we see we have our
Bore Core game and a bunch of other
things. So, I suppose we can just
briefly look at these while we're... Oh,
wow. Okay. So, the changes in zoom level
were definitely reflected more
prominently in this Antigravity window.
I just basically want to see how long
these scripts are. I know length is no
measure of intricacy, but I'll be very
interested to see how this game
works. "I've completed the complete
foundation for Steel Thunder." I like it.
I'm just going to let this proceed with
phase two, and I will manually handle the
Godot installation so that we don't
waste tokens on stuff that a human can
easily do, compared to actually
coding this entire thing with rapid quickness.
So we're allowing Claude to continue
with phase 2. And again, uh, yeah, I didn't
have anything to say there. GPT is still
working. We're down to 87% context
length left. So this will be very
interesting. Claude
is obviously known for its fantastic
aesthetic capabilities, but when using
specific paywalled models from
OpenAI like GPT 5.2 Pro, I have seen
some very promising 3D design
capability. And being that this is
opting to just do this in a simpler way
with like some Three.js, this could be very
aesthetically pleasing potentially. I'm
probably just going to figure out how to
install Godot on this dev, uh, arcade...
I can now say that I have used Godot,
because that was probably one of the
simplest application installs that I've
seen on Ubuntu. Well done. So, all of
these are at some initial level of us
being able to play them, to varying
degrees. So, we see Claude right here is
allowing me to proceed with phase 3 if I
want to. But before I continue
with it, I want to test this in Godot
first, being that it was initially
unsure if the controls would actually
work in that engine. GPT has given us
our project as well, and it gives us some
potential next steps that
we can take, such as just trying the
prototype, testing it with the hardware,
and then we can also expand. But I think
we're at a level now where we can safely
go ahead and try all of these which is
exciting. So, we're now ready to try our
first result. And I think I'm going to
try this from easiest to hardest in
terms of level of complexity to actually
get them set up with dependencies and
everything. Meaning that our Gemini 3
Pro result is the easiest to run. So,
we'll test this one first, followed by
the ChatGPT result, then finally the
Claude result. I am very excited and
we'll see, uh, fingers crossed, how this...
Did something go wrong? Cuz...
So, I'm very happy to report that as we
can see on the screen of the arcade
machine, the Gemini 3 Pro result has a
ton of errors. Basically, it's missing
shapes and things that would make this
game not look like a 2D piece of
garbage. So, this is something we can
give this now for our second iteration
in the true vibe coding fashion, and it
will inevitably make it look far more
impressive than the, um, red square on
screen. So, now I'm just going to hop in
and try the Codex 5.2 extra-high result.
It has given me a readme file as
well, which I'm just taking a look
through to see how it wants me to
go about actually starting this. All
right, let's get our first look at the...
As a first draft, that's not horrible.
And basically, so far, while they have
not necessarily been what I expected,
the controls are correctly working with
both the Gemini 3 Pro and the Codex
results, which is good to see. This one
is obviously a bit more immersive and it
has a low poly aesthetic. Not
necessarily what I hoped to see when it
said it was making a drift game, but
perhaps my expectations were a bit high,
although I do have justifiable reason
for that, which we can showcase later in
the video. Now, our next step for the
initial testing of these results in our
first try is to go ahead and figure out
how to use Godot to run the Claude Code result.
So, while that was perhaps visually the
most impressive, it didn't actually
work. The control mappings did not work,
which is quite upsetting, being that
the other two did work with the controls.
I don't want to have to try to
troubleshoot this while working with this
specific game engine. Although, I'm
going to be completely honest, in terms
of my first impressions with using
Godot and just what I see, I very much
like it. So, this may be something I
play more with later, but for right now,
I'm more interested in just getting like
a decent looking web game. And here's
where I'm at right now. So, I'm
extremely frustrated because last night
I used Claude Code for basically 20
minutes and I had it iterate upon the
game we created with GLM 4.7 Flash
saying, "I want you to make a little 3D
low poly city with this. Have separate
views for the cars and stuff like that."
So, I was kind of anticipating more of
what we just saw instead of the
three very unique results we received
today. I almost have to wonder if how I
went about this initially was incorrect,
where I had the online versions of these
models, and not even versions, just like
the ChatGPT website one, plan out the
implementation for this. I almost
feel like if I just prompted them and
said here's a file that maps the
controls for my arcade cabinet. Make me
a 3D low poly highway racing game. We
probably would have received a better
result. And now I don't know what to do.
I basically hate all of these results so
much that I don't even want to spend
time trying to make them better. I want
to call this like phase one, which So
maybe we'll go into phase two now and
see if we can just get something decent