0:02 Everybody wants production-ready AI
0:04 code, but the brutal truth is AI code
0:06 breaks in production more often than
0:08 not. And in this video, I'm going to
0:09 solve that problem for you because I'm
0:11 going to teach you how you can make your
0:14 AI agent test its own code using
0:16 test-driven development. This means that
0:19 AI will first write the tests to
0:20 actually know whether its implementation
0:23 is correct or not, thereby being a
0:25 self-testing AI agent. And to prove that this
0:27 works, I'm going to be implementing this
0:29 strategy on a codebase that uses both
0:31 Python as well as Java. So that no
0:33 matter what programming language you
0:35 use, you know how to use the strategy
0:37 properly like a senior engineer. So
0:39 let's get started. Welcome to our test
0:42 application of today, an auction site
0:43 that's implemented in both Python as
0:45 well as Java. And this front end
0:47 interacts with both of these backends at
0:49 once. And these backends actually
0:51 communicate over the same database. So,
0:53 for example, I can actually create a
0:55 sample auction on the Python site here.
0:57 And then let's go ahead and bid on this
0:59 Raspberry Pi cluster kit with Python
1:03 Pete. I'm going to bid $250 like so. And
1:04 then you can indeed see that now I am
1:06 the highest bidder. And if I switch to
1:08 the Java side of things, I can actually
1:10 enter a bid here as well. I can, for
1:12 example, say that I want to bid 270
1:14 bucks with Java Jane. So, I'm going to
1:16 go ahead and place a bid. And you can
1:17 actually see now that my bid must be at
1:19 least $275
1:21 because this actually has a current
1:23 price and then a minimum increment. So,
1:25 okay, we'll go ahead and actually bid
1:29 275 bucks then. And there we go. We can
1:30 even, for example, close the current auction
1:34 and then go ahead and create a new
1:36 sample auction in the Java side as well
1:38 just to show you that these backends are
1:40 interoperable. They both have the same
1:42 endpoints. Now, we are going to be
1:43 implementing a new feature on this
1:46 auctioning website, namely the ability
1:48 to set the price at which an item can be
1:50 immediately purchased. You see this a
1:51 lot on platforms like eBay, right? Where
1:53 you're able to just bid a specific
1:55 amount and then the item is guaranteed
1:57 to be yours. So, let's go ahead and see
1:59 how we're going to implement that in
2:01 both of these backends using test-driven
2:03 development. First, we have to explore
2:04 the codebase. So, let's go ahead and
2:06 jump right into Visual Studio Code. In
2:08 here, you can actually see that we have
2:10 a couple of folders. You can see here
2:12 how we have a Java backend, a Python
2:14 backend, and this simple web interface,
2:17 which is just an HTML file. And to build
2:19 this new feature, I actually have
2:21 prepared a prompt. And this prompt will
2:22 actually be included in the description
2:24 down below. No need to worry about that
2:26 for now. And this prompt actually
2:29 describes how Claude Code should implement
2:31 a new feature using test-driven
2:33 development. So, let's actually go ahead
2:34 and check out what we're going to be
2:36 building. We're going to be building a
2:38 buy it now feature for the auctioning
2:40 system. Now the thing is there are a lot
2:42 of requirements here about adding an
2:45 optional field, setting a certain buyout
2:47 price that must be higher than the
2:50 starting price etc. But the process of
2:52 building this feature doesn't start with
2:53 actually creating the code for the
2:56 feature itself. No, actually test-driven
2:58 development is a form of development
3:00 where you first write the tests for your
3:02 feature which will obviously fail
3:03 because the code has not been
3:05 implemented yet. And then from that
3:07 point forward, your AI agent will
3:09 implement the actual code. And the
3:11 beautiful part is after it has
3:13 implemented the code, it can run all of
3:16 the tests again. And if they all pass,
3:18 then you know that your code is actually
3:20 genuinely functional. And of course, you
3:22 can have a bit of a human in the loop
3:23 element here as well, where you can, for
3:26 example, first check the unit tests
3:28 before you let the AI agent actually
3:30 write the code to make sure that the
3:32 tests already match your expectation. So
3:34 in this case, what's actually going to
3:35 happen is we're going to be implementing
3:38 eight test cases. For example, creating
3:40 an auction with a valid buyout price,
3:42 but we also want to test some edge
3:43 cases, right? That's how you're going to
3:46 get production ready code. For example,
3:49 we want to make sure that auctions can
3:51 still be created without a buyout price
3:52 to make sure that we have backwards
3:54 compatibility with how the system worked
3:57 before. So that's really how this prompt
3:59 works on the high level. But in order to
4:01 do test-driven development, you do
4:03 actually need an existing test suite. In
4:04 this case, if we open for example the
4:07 Python backend folder, you will see how
4:09 we actually have a test auction service
4:12 Python file and this file contains our
4:15 current unit tests. Similarly on the
4:16 Java side of things, if we go into
4:18 source, you can actually see we have a
4:19 test folder here as well. And you can
4:21 see how we actually have various tests,
4:23 for example, for the auction service.
4:25 And this is basically the test suite
4:27 that Claude Code is going to be
4:29 extending with test-driven development
4:31 before it actually implements the new
4:33 bidding feature. In any case, enough
4:35 talking. Let's get coding. So, what I'm
4:37 going to do is I'm actually going to go
4:40 ahead and open two new windows because I
4:42 want to implement this feature on both
4:44 the Java back end as well as the Python
4:45 back end. But I'm not going to make
4:47 Claude Code do both backends at once.
4:49 That's just asking for problems. I want
4:51 Claude Code to be able to focus on one
4:52 programming language at a time. All
4:54 right. So, I set up two terminal
4:56 windows, one for the Python backend and
4:58 then one for the Java backend. And all
4:59 I'm going to do now is actually just
5:01 start up two Claude Code windows. And
5:03 I'm going to say proceed for this one. I
5:06 do trust my own back end, of course. And
5:08 then what I'm going to do is I'm
5:09 actually just going to paste this
5:11 test-driven development feature prompt
5:13 because it already includes all of the
5:15 directives that Claude Code needs to get
5:17 started on writing the actual test
5:18 first. So, we're going to paste the
5:21 exact same prompt into both of these
5:23 Claude Code sessions. And in a way, we're
5:25 actually going to be parallelizing this
5:26 effort, right? Because we're going to
5:29 have two agents working on both of the
5:32 actual backends at the same time. Now,
5:34 you can actually see here that it's
5:36 going to be starting to implement this
5:38 feature by using strict test-driven
5:40 development methodology. And while it's
5:41 coming up with the first test, I just
5:42 wanted to let you know that watching
5:44 this video until the very end is very
5:46 important because there's so much
5:48 content out there nowadays that tries to
5:50 make you believe that AI coding will 100x
5:52 your productivity and that AI code can
5:54 just one-shot the most complex
5:56 applications out there. But this is not
5:58 true. AI coding has been a great
6:00 productivity booster for me as an actual
6:02 senior engineer, but it has its limits.
6:04 But by using real software methodologies
6:07 like test-driven development, you can
6:09 actually write production-ready AI code.
6:11 You just have to follow tutorials like
6:13 this one and really understand how to
6:15 write proper AI code first. There are a
6:17 lot of distracting methodologies out
6:19 there that people try to teach you like
6:21 the BMAD method. And I'm not saying that
6:23 these methods are bad. It's just that they
6:25 don't compensate for a lack of skills.
6:27 If you for example don't know how to
6:29 code, then you are going to get stuck
6:31 with AI coding no matter what method
6:32 you're using because it's going to make
6:34 a mistake now and then and then if you
6:36 don't know any Python or Java then how
6:38 are you going to fix this application?
6:40 That's right, you will not be able to
6:42 and that's where you get stuck. So
6:43 actually understanding how to code
6:46 properly is super important and that's
6:47 what you're learning today because
6:48 test-driven development has been a
6:50 tested framework that has been used in
6:52 software development for many years now.
6:54 Okay, enough theoretical talk. Let's
6:56 actually see what Claude Code is up to
6:58 right now. And you can actually see that
7:00 it's understood the codebase structure
7:02 and of course it's explored different
7:04 files for the Java side of things
7:07 compared to the Python files. The Python
7:08 files are a lot flatter. There's a lot
7:10 more included in one single file whereas
7:12 the Java files are a little bit more
7:13 split apart. It's an interesting
7:15 difference between writing Java and
7:18 Python code. Right. And now you can see
7:19 that the first failing test has been
7:21 written. I'm going to give this terminal
7:22 a little bit more room so you can see
7:23 what's going on. I'm going to go ahead
7:26 and allow it to make these edits. And
7:28 then if we check out our git work tree,
7:29 you can see that finally we have our
7:32 first test here. This is a new test that
7:34 will actually fail because if you look
7:36 here, you can actually see that set
7:38 buyout price is not even a valid method
7:40 because it's not been implemented yet.
7:41 That's a good thing. That's how we're
7:43 actually approaching this test-driven
7:45 development properly. So now what you
7:47 can see is that Claude Code is going to
7:50 run all tests to confirm that these new
7:52 tests do fail. So here you go. It's
7:54 written a bunch of new failing tests and
7:55 now it's actually going to go ahead and
7:58 run all of the tests. And I think I can
8:00 actually show you if I do Ctrl+R here
8:02 that a lot of these tests are failing.
8:04 That's exactly what we want. So we can
8:06 go ahead and toggle back and you can
8:08 indeed see that in this case that's a
8:09 great thing. Claude Code is aware that
8:11 it's supposed to be failing. And now
8:12 it's actually going to be implementing
8:15 the minimum code to make the tests pass.
8:17 And this is a super important element as
8:19 well. A lot of the times AI code is
8:21 super verbose and it will write way more
8:23 code than it needs to. In this case,
8:25 Claude Code will write the minimum amount
8:27 of code that it needs to in order to
8:29 make the tests pass. And now on the left
8:30 side here, you can see that it's
8:32 starting to actually finish up those
8:34 tests on the Python file as well. So
8:36 that's great. I'm going to go ahead and
8:38 approve all of that there too. And you
8:40 can see here how the approach is the
8:42 exact same. It doesn't matter whether
8:44 you're using a strict language like Java
8:45 or a more loosely typed language like
8:47 Python. You can use this method
8:49 regardless of the programming language.
8:51 And indeed here you can see that the
8:52 Python tests are failing as well which
8:54 is actually expected. Now I'm going to
8:56 go ahead and give Claude Code the time
8:57 that it needs to actually write the
8:59 implementation code. And then we'll have
9:02 a look at whether the tests will pass.
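To make the "minimum code to make the tests pass" idea concrete, here is a small Python sketch of the "green" step. Everything in it (the `Auction` class, `set_buyout_price`, the two test names) is a hypothetical example under my own assumptions, not the implementation that Claude Code produced in the video.

```python
import unittest
from decimal import Decimal
from typing import Optional


class Auction:
    """Hypothetical auction model with the smallest change that satisfies the tests."""

    def __init__(self, starting_price: Decimal) -> None:
        self.starting_price = starting_price
        # Optional field: auctions created without a buyout price keep working.
        self.buyout_price: Optional[Decimal] = None

    def set_buyout_price(self, price: Decimal) -> None:
        self.buyout_price = price


class BuyoutPriceTest(unittest.TestCase):
    def test_set_valid_buyout_price(self) -> None:
        auction = Auction(starting_price=Decimal("50"))
        auction.set_buyout_price(Decimal("150"))
        self.assertEqual(auction.buyout_price, Decimal("150"))

    def test_auction_without_buyout_price(self) -> None:
        # Backwards compatibility: no buyout price is still a valid auction.
        auction = Auction(starting_price=Decimal("50"))
        self.assertIsNone(auction.buyout_price)


def run_suite() -> unittest.TestResult:
    """Run the test case in-process and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(BuyoutPriceTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    print("all tests pass:", run_suite().wasSuccessful())
```

Nothing beyond what the tests demand is added here; validation rules only arrive once a test asks for them.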
9:03 So on the bottom right you can see after
9:05 a while Claude Code has finished the
9:07 implementation and now when running all
9:09 of the tests you can see that all the
9:11 tests are actually passing which is
9:13 perfect. And now you can actually see
9:15 here on the left in our change log that
9:17 we actually have modifications in the
9:19 actual root application. So for example,
9:21 we now have a new BigDecimal buyout
9:22 price. And then if we go into the
9:24 auction service, you can actually see
9:27 that if the buyout price is included, we
9:29 do a couple of validations which all
9:31 have to do with making sure that these
9:34 tests that it was creating earlier can
9:36 actually pass. For example, there are
9:38 tests here like fail when buyout price
9:41 is less than or equal to starting price,
9:43 which is a great test case, right?
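A rule like this could be sketched in Python as follows. This is only an assumption of what such a check might look like; the function name `validate_buyout_price` and the error message are hypothetical and not taken from the video's codebase.

```python
from decimal import Decimal
from typing import Optional


def validate_buyout_price(
    starting_price: Decimal, buyout_price: Optional[Decimal]
) -> None:
    """Raise ValueError unless the buyout price is absent or above the starting price."""
    if buyout_price is None:
        # The field is optional, so auctions without a buyout price stay valid.
        return
    if buyout_price <= starting_price:
        raise ValueError("buyout price must be greater than the starting price")


def is_valid(starting_price: str, buyout_price: Optional[str]) -> bool:
    """Small helper for trying the rule with string inputs."""
    parsed = None if buyout_price is None else Decimal(buyout_price)
    try:
        validate_buyout_price(Decimal(starting_price), parsed)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    print(is_valid("50", "150"))  # higher than the starting price
    print(is_valid("50", "50"))   # equal to the starting price
    print(is_valid("50", None))   # no buyout price at all
```

Using `Decimal` rather than floats mirrors the `BigDecimal` field on the Java side and avoids rounding surprises with money.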
9:44 Because we want to make sure that the
9:46 buyout price has to be more than the
9:47 starting price. It doesn't make sense
9:49 for the auction system otherwise. So,
9:51 you can see here how it actually works
9:53 very well. Now, let's go ahead and see
9:55 how far ahead it is with the Python
9:57 implementation. And you can see that
9:58 it's running all the tests, but there
10:00 are some issues here. It seems like
10:02 decimal places is not valid for
10:04 Pydantic's decimal field. Now, it's
10:06 interesting that it actually one-shot all
10:08 the Java unit tests, but it's having
10:10 some trouble with Python. And that has
10:12 to do with the fact that Java is a much
10:14 more strictly typed language, which is
10:16 also something that is really beneficial
10:18 for an AI coding mechanism. Because if
10:19 you look here on the bottom right, you
10:21 can actually compile the code as well
10:23 with a language like Java, which gives
10:26 you a lot of guarantees on the I guess
10:27 baseline quality of the code. Just
10:29 because code is compiling doesn't mean
10:30 that the code is perfect but it's
10:32 definitely a step forward and you know
10:34 that the code is at least meeting some
10:36 kind of minimum requirement there right
10:38 so of course if we try and run clean
10:40 compile it will actually work because
10:43 Claude Code was able to one-shot the Java
10:46 implementation here whereas here on the
10:48 Python side of things finally it did
10:50 actually manage to implement the test
10:52 suite correctly but it's actually a
10:53 relatively simple feature and you can
10:55 see here already how the behavior drifts
10:57 between these two different programming
10:59 languages. And that's another great
11:00 learning point for you from this video.
11:02 You should pick the language that you're
11:04 the most comfortable with, but it can be
11:07 beneficial to learn a more strictly typed
11:10 language like Java or C. You can also
11:11 implement types in a programming
11:13 language like Python, but it's still not
11:15 really the same as a language that can
11:18 compile in a real way like a Java
11:21 application. Anyway, I digress. We now
11:23 have code that runs on the back end of
11:25 both the Java application and the Python
11:27 one, which is great. But I'm sure that
11:29 you want to see some proof of this, some
11:31 actual proof in the web application
11:32 instead of it just being in the back
11:34 ends. So let's go ahead and implement a
11:37 change in our HTML page so we can
11:39 actually interact with this new feature.
11:41 What I'm going to be doing is I'm going
11:43 to go to my files here and then I'm
11:45 going to go ahead and drag in index.html
11:47 and I'm going to do that inside of
11:49 the Java chat session that I have
11:51 because the Java implementation is
11:53 strictly typed. So I trust this a little
11:54 bit more compared to the Python
11:56 implementation since I have the luxury
11:58 to choose anyway. And I'm going to say
12:01 the following. This is our web page to
12:06 interact with the back end. In fact, we
12:10 interact with both a Java and Python
12:13 back end with the same implementation.
12:20 Given your latest Java addition, rework
12:24 the HTML/JavaScript
12:28 to include the ability to handle the buyout
12:34 price. The sample auctions that the
12:38 front end calls should include a buyout
12:42 price and this price should of course be
12:46 displayed in the front end. Here we go.
12:48 This is what I want to do now. I want it
12:50 to rework that HTML file. So while Claude
12:52 code is working on this implementation,
12:54 I just wanted to let you know that these
12:56 kinds of real AI coding strategies are
12:58 what I focus on in my AI native engineer
13:00 community. And in this community, you
13:01 can learn how to accelerate yourself
13:03 with AI regardless of whether you are
13:06 working on your career or a business. So
13:07 you can check out the community in the
13:08 link in the description below.
13:10 Otherwise, I'll see you in just a second
13:12 and we'll check out how this has been
13:14 implemented in our front end. So it
13:15 seems like it's done with the front end
13:17 implementation. It wants to test the
13:18 front end itself, but I'm just going to
13:20 go ahead and exit out of the session
13:21 because we're going to do that manually,
13:24 right? So in our application, I can now
13:25 go ahead and create a new sample
13:27 auction. And then you will see that we
13:29 actually have a buyout price set of
13:31 $150. So I can actually just create an
13:34 initial bid of 55 bucks which will work
13:35 just fine. And then now I can actually
13:39 create a buyout price bid of 155 bucks.
13:40 So I'm going to go ahead and place a bid
13:42 here. And then actually something seems
13:44 to go wrong. So what seems to happen
13:46 here is that I place a bid and the
13:48 auction is closed off. But the thing is
13:50 my front end doesn't really know that
13:52 that is a possibility. My front end
13:54 continuously tries to fetch the latest
13:56 auction and it doesn't really have a way
13:58 of knowing that the auction was actually
14:00 closed off because our front end doesn't
14:02 actually have any logic for when a
14:04 buyout price is reached. And this
14:06 actually shows you why test-driven
14:08 development is so important. We did not
14:10 do test-driven development for our front
14:13 end. So our front end does sort of work
14:14 now, but it's already running into
14:16 issues. And that is the reality of AI
14:18 coding without a proper framework like
14:20 test-driven development. So what I have
14:23 to do now is now I have to go back into
14:24 Claude and actually just communicate
14:27 that this issue exists and then let's
14:28 see if we can fix it. So I can for
14:32 example say here the front end does not
14:36 know how to deal with a bid that's
14:41 placed that actually buys out the item
14:44 because the front end continuously
14:47 refreshes the auction.
14:50 I actually get an ID error. And you can
14:52 see here that now I have to go back to
14:54 Claude and try to fix the error. If I
14:55 had actually done test-driven
14:57 development for my front end from the
14:59 very beginning, I probably could have
15:01 one-shot that implementation as well.
15:03 This shows you the reality of AI coding.
15:05 Other content would probably not show
15:06 you this and just act like everything is
15:08 working. But this is the truth that you
15:10 see here on this channel. You have to
15:12 use the right coding methodology to
15:14 actually get success out of AI coding.
15:16 Looks like we're done. Let's go back
15:17 into our front end. Give it a full
15:19 refresh just to make sure. We're going
15:20 to go ahead and create a new sample
15:21 auction. And then I'm just going to go
15:23 ahead and buy that out straight away
15:25 with 500 bucks. Going to go ahead and
15:27 place a bid. And there you go. Now we
15:29 can actually see that the front end is
15:31 able to deal with the new buying out
15:33 logic. And this just shows you how
15:35 powerful using the right methodologies
15:37 can be for AI coding. So I hope that
15:39 from this video you've learned that
15:41 using the right methodology to do AI
15:43 coding can give you so many amazing real
15:45 results. If you want to escape the trap
15:48 of vibe coding and actually get
15:50 productive with AI as an engineer, you
15:51 should definitely check out my AI native
15:53 engineering community in the link in the
15:55 description below. And I hope to see you there.