0:01 Today we're going to be doing a vibe
0:04 coding test between GPT Codex, Claude
0:07 Code using Claude Opus 4.5, and Google's
0:10 Antigravity using Gemini 3 Pro. Now,
0:12 this is going to be an interesting test
0:14 because their task is going to be the
0:16 exact same thing to build a game that
0:18 works on the arcade cabinet behind us,
0:20 which you may be familiar with because I
0:22 did a video on it, its introduction to
0:25 the channel using GLM 4.7 Flash with
0:27 rather impressive results. So, I'm very
0:28 excited to see what these models are
0:30 capable of putting out today. And to
0:32 jump right into it, I want to say do not
0:34 be concerned by the fact that I have the
0:36 web chat interfaces for all of these up
0:38 right now. The reason for this is we're
0:40 going to be giving them an initial
0:42 prompt as well as some reference so that
0:44 they can go ahead and denote tasks for
0:46 the specific agent that is from either
0:49 of these model providers. So essentially
0:50 what I'm saying right here is each of
0:52 these models is just tasked with
0:54 outlining the plan of action that will
0:57 subsequently be carried out by the agent
0:59 using Codex, Claude Code, and Google's
1:01 Antigravity. So really, I think the
1:03 first thing to do is probably just
1:04 quickly go over the prompt that we're
1:06 giving each of these models, which are
1:09 going to act as the initial orchestrator
1:11 for this task. And then we'll jump into
1:12 some agentic coding because the rest of
1:14 this video will be done either through
1:17 the terminal for Codex and Claude Code
1:19 or using the UI that comes
1:21 bundled with Google's Antigravity. So
1:23 this is going to be rather exciting, at
1:25 least for myself, and hopefully the
1:27 viewers feel that way as well. So feel
1:29 free to subscribe and join as a channel
1:31 member for less than 10 cents a day and
1:32 let's get into it. It's basically
1:34 telling them that the task is to build a
1:36 game to run on a custom device which we
1:38 have also attached a picture of because
1:40 these models can visually reference this
1:42 and that may influence some of their
1:43 design decisions. So this is the
1:45 specific image that we've attached
1:47 alongside our prompt. In addition to
1:49 that, there is also a Python script that
1:51 just shows the proper control mapping
1:53 for the inputs for this device, mainly
1:55 the two arcade buttons, the steering
1:57 wheel, one additional kind of blinker
1:59 stalk button, and then also the joystick.
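[Editor's sketch: the control-mapping script itself is never shown on screen, so the following is only a guess at what such a reference file might look like. Every name, index, and the deadzone value here is hypothetical and was chosen for illustration; only the list of inputs (two arcade buttons, steering wheel, blinker stalk button, joystick) comes from the video.]

```python
# Hypothetical control map for the arcade cabinet described in the video.
# The indices and names below are assumptions, not the real script's values.
CONTROL_MAP = {
    ("button", 0): "arcade_button_a",
    ("button", 1): "arcade_button_b",
    ("button", 2): "blinker_stalk",
    ("axis", 0): "steering_wheel",
    ("axis", 1): "joystick_x",
    ("axis", 2): "joystick_y",
}


def translate(kind, index, value, deadzone=0.1):
    """Turn a raw (kind, index, value) input event into a named game action.

    Returns None for inputs that are not in the map. Axis values smaller
    than the deadzone are clamped to 0.0 so a slightly off-center wheel
    or joystick does not register as movement.
    """
    name = CONTROL_MAP.get((kind, index))
    if name is None:
        return None
    if kind == "axis" and abs(value) < deadzone:
        value = 0.0
    return (name, value)


if __name__ == "__main__":
    print(translate("button", 0, 1))    # first arcade button pressed
    print(translate("axis", 0, 0.05))   # tiny wheel jitter, clamped to 0.0
    print(translate("button", 9, 1))    # unmapped input
```

[A file like this gives an agent one unambiguous place to look up which physical input drives which action, regardless of the language or engine it picks for the game itself.]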
2:01 So, just so they understand that and we
2:02 don't want to leave that open to
2:04 interpretation because not having the
2:06 controls mapped properly can just be a
2:08 lot of painful debugging which kind of
2:10 is outside the scope of just seeing what
2:12 games they make. Regardless of that, we
2:14 can see that we're just instructing them
2:16 to basically outline a plan of action
2:18 for this agent tasked with providing us
2:20 the following things: an initial prompt
2:23 to send to the agent. If there are to be
2:25 multiple prompts, meaning if they want
2:26 to use a planning mode initially and
2:28 then a build mode, they should specify
2:30 that as well as the subsequent prompts
2:32 for either specific mode they choose to
2:34 use. And in addition to that, they
2:36 should also include any additional
2:38 information that they want placed in the
2:40 root project directory, such as a
2:42 markdown file instructing the agent on
2:44 specific things. I have told them the
2:46 only thing contained within the project
2:48 directory is the Python script for the
2:50 control schema, so that the models can
2:51 reference that and the agents can see,
2:53 "Okay, these are the controls that I'm
2:55 mapping it to." I also
2:57 don't want them to just try Python for
2:59 this, as it would kind of be restrictive
3:01 in terms of how cool the end result is, so
3:03 I have specifically mentioned that that
3:05 script being a Python file is not
3:08 indicative of wanting a Python project.
3:10 It is just a reference point for them to
3:12 understand how the controls are
3:14 implemented. So with that, we're basically
3:16 ready to go ahead and I will probably
3:18 just start all of these at the same time
3:21 being that I know GPT 5.2 on extended
3:22 thinking or heavy thinking as we see
3:24 right here is going to take a little
3:26 while. I have extended thinking enabled
3:30 for Opus 4.5, and Gemini 3 is set to Pro,
3:33 and we also do have the canvas mode
3:36 enabled. "Game design document: Steel Thunder."
3:39 All right, so now we're going to go
3:42 ahead and see a 3D tank combat arena
3:44 game designed specifically for the dual
3:45 controller arcade cabinet. Okay, so
3:46 that's first and foremost what we see
3:48 for Claude. I'm not gonna read through
3:50 the entire thing verbosely right here.
3:53 Let's see what Gemini 3 Pro... I get so
3:54 excited. I love this stuff.
3:57 Seriously, I like very much enjoy this.
3:59 All right, so what are you going to be
4:01 giving us today?
4:03 My mistake. I was stuck trying to parse
4:05 the canvas thinking that was the only
4:07 output of it. It's made a
4:09 bit more difficult by having these in
4:12 this 33% segment window size for having
4:13 three of them up at a time. "Tank
4:16 Commander" or "Mech Simulator." Very
4:17 interesting. So, two of them have opted
4:20 to try to think of a tank thing. I will
4:22 say I do like that Gemini 3 Pro has
4:23 started out by complimenting the hardware
4:25 setup. So, I do appreciate that very
4:27 much. Now, I don't see any specific
4:29 thing with the GPT output. Like with the
4:31 other two, we kind of had tank style
4:33 games that were suggested. This one just
4:34 says use something that has the
4:36 controls. I think it's leaning more
4:38 towards some form of racing game, being
4:40 that I did see some vehicle selection
4:41 things mentioned here. And I was wrong.
4:43 I had actually neglected to see this
4:46 specific design markdown file from GPT
4:47 where it does mention the game design
4:50 seed is to be a neon drift corer. That
4:52 actually sounds pretty cool. I am
4:53 personally probably more interested in
4:55 seeing this result than the two tank
4:57 games. Just personal preference, but we
4:59 definitely will have a wide variety of
5:00 different options to play, hopefully,
5:02 assuming they work. I have all of the
5:04 repositories seeded now with the
5:06 markdown files and every other bit of
5:07 information that these models wanted for
5:09 their agents. So now I'm just going to
5:11 sequentially begin actually coding with
5:14 these. And I think this needs to be done
5:15 one at a time at least for the initial
5:17 start because I will get overwhelmed and
5:19 probably copy paste the wrong thing into
5:21 the wrong model. Now let's go ahead and
5:24 set this to the highest end model. So if
5:26 we type model right here, we are going
5:28 to use 5.2 codecs. But if we press
5:30 enter, we can select a different
5:32 reasoning effort which is going to be
5:34 extra high. And they do give us a
5:36 warning that this can consume tokens at
5:38 a significantly higher rate. However, I
5:40 am looking for maximum performance and
5:42 we'll see how it does. GPT has not told
5:44 me to do anything specific. It just says
5:46 prompt A and prompt B. So, this
5:48 hypothetically should just be sent to
5:50 the agent right now. And then following
5:52 that, once that's done, I should just do
5:55 this. So, really, unless I've misread
5:58 something, it seems like we can just
6:02 basically copy paste this in and then
6:06 press enter. And that begins our GPT 5.2
6:08 in Codex with extra-high thinking. Now, I was
6:10 initially thinking perhaps I could try
6:12 these like one video for each specific
6:14 model here just to, you know, stretch
6:16 out the max amount of content from a
6:18 YouTuber perspective. I decided not to
6:19 do that. It'd be more interesting with
6:21 all of these in one video. I am now
6:22 somewhat starting to regret that just
6:25 because this is a bit chaotic though.
6:27 Um, we'll see what happens. So Claude
6:29 had put all of the things that it wants
6:31 us to do in terms of specific modes,
6:33 specific prompts in this action plan
6:35 markdown file, which I neglected to
6:37 notice upon initial look through the
6:39 generated results from these online chat
6:41 agents. But now we're going to go ahead
6:42 and just see send this with extended
6:44 thinking enabled, high reasoning budget,
6:46 validate the design, plan, architecture,
6:48 and identify risks. Now, interestingly
6:50 enough, Claude has opted to use the Godot
6:52 game engine for this, something I've
6:54 personally never in my life used, though
6:56 I do find it very interesting. So
6:58 there's going to perhaps be some form of
7:00 learning curve for myself being that I
7:01 did not restrict how they go about
7:04 making these games. So Godot is a
7:06 perfectly reasonable way to go. I just
7:08 may have to do a little more work than
7:10 anticipated. Okay, so Claude's given me
7:11 a few different options. One of which is
7:14 the tab key. So we can use that or we
7:16 can just use the ultrathink keyword.
7:18 Now personally I think it's not the best
7:20 to have to have like a bunch of
7:22 different keywords here to prompt it to
7:30 Okay, good. So, Claude agrees with me
7:32 that it is a janky solution to have to
7:36 put these weird keywords in, but okay.
7:39 So, we'll just do what it says. And now
7:42 we're going to send our ultrathink plan.
7:45 And now Claude is running as well as GPT
7:49 has run. So, this is really a disaster
7:51 to wrangle these at the same time. I'm
7:53 wholeheartedly regretting having not
7:55 just done separate videos for each of
7:56 these, which I suppose I still could do
7:58 being that this video hasn't been put
8:00 out yet, but I'm committed. I'm now
8:03 invested. So, let's start Antigravity.
8:04 So, I just got a warning that
8:12 but it's still open.
8:14 So, I'm probably just going to opt to
8:16 ignore that. I do have planning mode
8:19 selected as Gemini 3 said we should do
8:20 for our initial prompt. And we do also
8:23 have Gemini 3 Pro high enabled. So we're
8:24 just going to paste our initial prompt
8:26 now here into Antigravity as well. And
8:28 we now have all three of these embarked
8:30 on the initial phase, which was all of
8:34 them basically wanted an initial prompt.
8:36 So as I was saying, all of these
8:38 basically began with an initial prompt
8:40 to allow the agent to plan based off of
8:42 the outline or architecture for the game
8:44 that was decided by the web chat
8:46 interface agents. So now we have all of
8:48 them working. I'm going to wait till
8:50 they have all completed the planning
8:52 phase to go into the next prompt which
8:54 will embark upon the actual build of
8:56 this just to keep things a little
8:59 coherent. I think that this is where we
9:01 may begin to stray from the prompt
9:03 because I almost want to just allow it
9:05 to continue now as opposed to sending
9:07 the prompt that GPT had said I should
9:09 subsequently send once this has been done.
9:14 But I suppose I will. This is tough now.
9:15 I kind of want to let the agent just
9:18 work how it decides to. So being that
9:20 that is the case and keep in mind this
9:23 may be wrong, but
9:25 we'll just allow them to work in the way
9:27 they want to. I think I was maybe
9:30 supposed to use an arrow key there to
9:32 select one of those. It seems to have
9:34 understood what specifically was going
9:36 to go on. And Gemini is just asking for
9:38 permission initially to create the
9:40 plan markdown file, as I had not granted
9:42 it permission to create files in that
9:44 directory. And that is going ahead and
9:45 creating that now. So we'll wait for
9:47 that to happen so we can have them all
9:49 building at the same exact time. Key
9:50 decisions. Okay, I'm fine with the
9:52 concept. The critical risk is that it
9:54 does not know if Godot is going to be
9:56 able to properly enumerate and
9:59 distinguish these HID devices
10:00 and it's giving me a confidence level
10:03 here. In that case, I would probably
10:06 just tell it then use something else cuz
10:07 I kind of don't want to have to deal
10:09 with a bunch of go dot stuff right now
10:11 being that I've genuinely never used it.
10:12 I'm actually fine with letting it try
10:14 this even using Godot because it has
10:16 baked in the implementation order first
10:18 and foremost. We need to just verify
10:20 that the inputs actually work. So if
10:21 they don't work with Godot, then we can
10:23 go back and have it use a different
10:25 language or something of the sort. But I
10:26 figure I want to allow it to continue
10:29 doing what it wants to do at least until
10:31 we know make or break if the controls
10:32 are actually going to work. I would
10:34 assume they're going to. Now we have
10:36 Google Agent Manager. And I could have
10:38 sworn that I had set the settings in
10:40 this to make everything a bit more
10:42 visible. Okay, so Antigravity has given
10:45 us our design document for a game called
10:47 Bore Core. I don't want to laugh at that
10:49 because there may be some technical term
10:51 that that actually makes sense for this
10:54 kind of trench driver tunnel shooter.
10:56 Oh, it's going to use Python and Ursina.
10:58 This is going to be quite a task to make
11:01 look good compared to GPT choosing
11:04 to just use like Three.js or something. So,
11:06 this will be neat. We do have our
11:07 document we can kind of take a quick
11:09 peek at right here. But to be honest
11:10 with you, I'm just going to let them go
11:12 ahead and do whatever they want. Oh, it
11:14 has given me multiple different concepts
11:15 that I could choose from though. And it
11:17 did go ahead and select a specific
11:20 concept just from all of these, which
11:21 I'm cool with then because that
11:23 eliminates me having to choose one
11:25 specifically and then potentially making
11:28 the wrong choice subjectively speaking.
11:30 So, I'm just going to let this go ahead.
11:33 Okay, do whatever you want. Now they're
11:35 all building except for Claude. Or is
11:37 Claude building right now as well? It is
11:39 building. Very cool. So, we now have
11:42 Codex 5.2 Extra High, Opus 4.5 with
11:45 ultrathink, and Gemini 3 Pro vibe coding
11:47 games for our arcade cabinet. Very
11:49 exciting. So, apparently Bore Core is done,
11:54 which was very quick, which concerns me
11:56 a bit. There is one window here that
11:58 will specifically show us the files it
12:00 has created. And we see we have our
12:01 Bore Core game and a bunch of other
12:03 things. So, I suppose we can just
12:04 briefly look at these while we're... Oh,
12:07 wow. Okay. So, the changes in zoom level
12:08 were definitely reflected more
12:11 prominently in this anti-gravity window.
12:12 I just basically want to see how long
12:15 these scripts are. I know length has no
12:17 measure of intricacy, but I'll be very
12:17 interested to see how this game
12:21 works. I've completed the complete
12:24 foundation for Steel Thunder. I like it.
12:25 I'm just going to let this proceed with
12:27 phase two and I will manually handle the
12:29 Godot installation so that we don't
12:31 waste tokens on stuff that a human can
12:33 easily do compared to actually
12:35 coding this entire thing with rapid quickness.
12:39 So we're allowing Claude to continue
12:43 with phase 2. And again uh yeah I didn't
12:45 have anything to say there. GPT still
12:47 working. We're down to 87% context
12:49 length left. So this will be very
12:51 interesting. Claude
12:53 is obviously known for its fantastic
12:55 aesthetic capabilities. When using
12:58 specific paywalled models from
13:01 OpenAI like GPT 5.2 Pro, I have seen
13:03 some very promising 3D design
13:05 capability. And being that this is
13:07 opting to just use this in a simpler way
13:10 with like some Three.js, this could be very
13:12 aesthetically pleasing potentially. I'm
13:14 probably just going to figure out how to
13:17 install Godot on this dev, uh, arcade
13:23 I can now say that I have used Godot
13:25 because that was probably one of the
13:27 simplest application installs that I've
13:30 seen on Ubuntu. Well done. So, all of
13:32 these are at some initial level of us
13:34 being able to play them at varying
13:36 levels. So, we see Claude right here is
13:38 allowing me to proceed with phase 3 if I
13:41 want to. But before I continuously go
13:42 with it, I want to test this in Godot
13:44 first. being that it was initially
13:46 unsure if the controls would actually
13:48 work in that engine. GPT has given us
13:51 our project as well and it gives us some
13:54 potential next steps that
13:56 we can take such as just trying the
13:58 prototype, testing it with the hardware
14:00 and then we can also expand. But I think
14:02 we're at a level now where we can safely
14:04 go ahead and try all of these which is
14:06 exciting. So, we're now ready to try our
14:08 first result. And I think I'm going to
14:10 try this from easiest to hardest in
14:12 terms of level of complexity to actually
14:14 get them set up with dependencies and
14:16 everything. Meaning that our Gemini 3
14:18 Pro result is the easiest to run. So,
14:20 we'll test this one first, followed by
14:20 the ChatGPT result, then finally the
14:24 Claude result. I am very excited and
14:26 we'll see uh fingers crossed how this
15:06 Did something go wrong? Cuz
15:12 So, I'm very happy to report that as we
15:14 can see on the screen of the arcade
15:16 machine, the Gemini 3 Pro result has a
15:17 ton of errors. Basically, it's missing
15:19 shapes and things that would make this
15:22 game not look like a 2D piece of
15:24 garbage. So, this is something we can
15:26 give this now for our second iteration
15:28 in the true vibe coding fashion, and it
15:29 will inevitably make it look far more
15:32 impressive than the um red square on
15:34 screen. So, now I'm just going to hop in
15:37 and try the Codex 5.2 extra-high result,
15:39 which it has given me a readme file as
15:41 well, which I'm just taking a look
15:42 through just to see how it wants me to
15:44 go about actually starting this. All
15:46 right, let's get our first look at the
16:30 as a first draft. That's not horrible.
16:32 And basically, so far, while they have
16:35 not necessarily been what I expected,
16:36 the controls are correctly working with
16:39 both the Gemini 3 Pro and the Codex
16:40 results, which is good to see. This one
16:43 is obviously a bit more immersive and it
16:44 has a low poly aesthetic. Not
16:47 necessarily what I hoped to see when it
16:49 said it was making a Drift game, but
16:51 perhaps my expectations were a bit high,
16:53 although I do have justifiable reason
16:54 for that, which we can showcase later in
16:57 the video. Now, our next step for the
16:59 initial testing of these results in our
17:01 first try is to go ahead and figure out
17:03 how to use Godot to run the Claude Code result.
17:37 So, while that was perhaps visually the
17:39 most impressive, it didn't actually
17:42 work. The control maps did not work,
17:45 which is quite upsetting. Uh, being that
17:47 the other two did work with the controls,
17:50 I don't want to have to try to
17:52 troubleshoot this while working with this
17:54 specific game engine. Although, I'm
17:55 going to be completely honest, in terms
17:57 of my first impressions with using Godot
17:59 and just what I see, I very much
18:01 like it. So, this may be something I
18:03 play more with later, but for right now,
18:05 I'm more interested in just getting like
18:08 a decent looking web game. And here's
18:09 where I'm at right now. So, I'm
18:11 extremely frustrated because last night
18:13 I used Claude Code for basically 20
18:16 minutes and I had it iterate upon the
18:19 game we created with GLM 4.7 Flash
18:20 saying, "I want you to make a little 3D
18:23 low poly city with this. Have separate
18:25 views for the cars and stuff like that."
19:04 So, I was kind of anticipating more of
19:06 what we just saw instead of the
19:07 three very unique results we received
19:10 today. I almost have to wonder if how I
19:12 went about this initially was incorrect
19:14 where I had the online versions of these
19:16 models (not even versions, just like
19:19 the ChatGPT website one) plan out the
19:21 implementation for this. I almost
19:23 feel like if I just prompted them and
19:25 said, "Here's a file that maps the
19:27 controls for my arcade cabinet. Make me
19:29 a 3D low poly highway racing game," we
19:31 probably would have received a better
19:34 result. And now I don't know what to do.
19:36 I basically hate all of these results so
19:38 much that I don't even want to spend
19:39 time trying to make them better. I want
19:42 to call this like phase one, which... So
19:44 maybe we'll go into phase two now and
19:46 see if we can just get something decent