0:02 Hey there. This is not the Claude
0:04 website. This is actually a one-to-one
0:07 clone that I built with over 200 unique
0:10 features. And I didn't write a single
0:13 line of code myself. An AI agent built
0:16 this entire thing while I was sleeping.
0:18 So, here's the problem. When you're
0:20 building a project this big, we're
0:22 talking full conversations,
0:26 projects, artifacts, file uploads, all
0:29 of it, you hit the wall pretty quickly.
0:31 The context window fills up and the
0:33 agent loses track of what it was doing.
0:34 And if you've tried to build anything
0:36 this substantial with coding agents
0:38 before, you know exactly what I'm
0:40 talking about. And compacting the
0:42 conversation is just not good enough.
0:44 The workaround that a lot of people use
0:47 is to manually orchestrate everything.
0:49 You would create an implementation plan
0:51 using your agent, maybe store that plan
0:53 somewhere in your project folder. You
0:55 could even use something like Spec Kit and
0:58 BMAD to do this. And you then get the
1:00 agent to implement these features one by
1:02 one. You then clear the conversation
1:04 after each session and ask the agent to
1:07 implement the next feature. Rinse and
1:09 repeat. This works, but it's exhausting,
1:12 especially for larger projects. You're
1:14 effectively babysitting the agent the
1:16 entire time. What I'm about to show you
1:18 is completely different. You give your
1:20 requirements once and an initialization
1:23 agent will break everything down into a
1:26 detailed feature list. And then coding
1:28 agents take over implementing one
1:31 feature at a time, testing, committing
1:34 the changes, clearing the context
1:36 window, and picking up the next feature
1:39 automatically. This even does regression
1:41 testing before moving on to the next
1:44 feature. This ran for hours while I did
1:46 absolutely nothing. And by the end of
1:49 the process, we had a fully functional
1:51 clone of the Claude website. In this
1:53 video, I'll show you exactly how to set
1:55 this up yourself. I've really simplified
1:57 the process so you don't have to be a
1:59 developer to follow along. And as an
2:01 added bonus, I'll show you how to
2:04 integrate with n8n to get real-time
2:06 updates as your agent is making
2:08 progress. In this instance, the agent
2:11 sent me notifications to Telegram every
2:13 time it completed a new feature. This is
2:15 all based on an article written by
2:19 Anthropic about an effective harness for
2:22 long-running agents. This is a brilliant
2:24 article and I actually recommend you
2:26 read it. It's all about getting agents
2:28 to perform tasks that would take a lot
2:31 of time and context. As AI agents become
2:34 more capable, developers are relying on
2:37 these agents to implement way more
2:39 complex tasks. And these tasks can take
2:42 hours if not days to implement. So the
2:44 challenge when you're using something
2:46 like Spec Kit and BMAD or even just the
2:49 planning mode in your IDE is that agents
2:51 will actually have to work in sessions
2:53 because the context window will fill up
2:56 as it's working through the solution and
2:58 at some point the quality is going to
3:00 decrease and you might actually have to
3:02 compact the session which will summarize
3:04 the conversation dropping off a lot of
3:07 important context. Imagine software
3:09 engineers working in shifts, where each
3:12 new engineer arrives with no memory of
3:14 what happened in the previous shift.
3:16 That is exactly the problem here. Even
3:18 if you clear the context and ask the
3:20 agent to implement the next feature, it
3:22 has no idea what's been implemented
3:25 already. So what this project proposes
3:27 is that we use a two-fold solution where
3:30 we can use something like the Claude Agent
3:33 SDK to plan and implement the solution
3:35 in two phases. First, we'll have an
3:37 initialization agent which will
3:39 basically take in your prompt and create
3:42 a feature list from that and it will
3:44 also set up the basic project structure.
3:46 Once that's done, the framework will use
3:49 coding agents to implement the features
3:51 one task at a time. So, these agents
3:54 will make incremental progress in every
3:56 session. Now, they don't mention it
3:57 here, but something I really like about
4:00 their solution is when the coding agent
4:02 starts a session, it will pick two
4:04 features that have already been
4:06 implemented at random and do regression
4:09 testing on them and then fix any issues
4:11 before moving on to the next feature.
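That regression step is easy to picture in code. Here's a minimal sketch, assuming each entry in the feature list carries a passes flag (which the video shows later); the helper itself is my illustration, not code from the repo:

```python
import random

def pick_regression_targets(features, count=2):
    """Pick up to `count` already-implemented features, at random,
    to regression-test before starting the next feature."""
    implemented = [f for f in features if f.get("passes")]
    return random.sample(implemented, min(count, len(implemented)))

# Illustrative feature list (the shape is an assumption)
features = [
    {"id": 1, "description": "User can log in", "passes": True},
    {"id": 2, "description": "User can upload files", "passes": True},
    {"id": 3, "description": "User can create projects", "passes": False},
]

targets = pick_regression_targets(features)
print(sorted(f["id"] for f in targets))  # → [1, 2]
```

If either regression target fails, the agent fixes it before touching new work.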
4:13 So, you can definitely read through this
4:15 article, but what I do want to focus on
4:17 is their quick start where they give you
4:19 access to an example project that
4:21 implements all of this. Now, the setup
4:24 process is not too complicated, so you
4:25 can definitely try this yourself, but
4:27 I'm actually going to show you an even
4:29 easier way to get going. In the
4:31 description, you'll find a link to this
4:33 repository. I simply took their project
4:35 and modified it slightly, so it's a bit
4:38 easier to work with. So really all you
4:40 have to do is click on Code, and you can
4:42 either download this as a zip file or, if
4:45 you've got Git installed, simply copy
4:47 this link and clone it. Then extract the contents of
4:49 that zip file and open the folder
4:52 in a code editor. I'm using Cursor, but
4:55 you can use VS code or whatever editor
4:57 you want. Now the project is really
4:59 straightforward. There's a bunch of
5:02 Python files, like the agent, autonomous agent
5:05 demo, and client files. This basically
5:08 uses the agent SDK to set up this entire
5:11 project. Now, one file you might want to
5:13 go through is the readme file. This is
5:15 where I give you detailed instructions
5:18 on how to set everything up. So, there
5:19 are a few dependencies that we have to
5:22 install and it also shows you how to set
5:25 up any environment variables and finally
5:28 how to start this project. But we'll go
5:30 through all of that in detail. Now,
5:32 since this project uses Python, I do
5:34 recommend setting up a virtual
5:36 environment. If you're new to Python,
5:38 this is really easy to set up. Let's
5:40 create a new terminal window. And in the
5:43 terminal, let's run python. If you're
5:46 using Mac or Linux, it's python3,
5:48 but for Windows, it's just python.
5:53 Then -m venv venv.
5:58 So, it looks something like this. This
6:00 will create a new virtual environment
6:02 within this folder. Now, we have to
6:05 activate this virtual environment. On
6:08 Linux and Mac, it's this command. Or if
6:10 you're using Windows like I am, the
6:12 command looks something like this. So,
6:14 then press enter. And if everything was
6:16 done correctly, you should see the
6:19 virtual environment name over here. So,
6:21 why do we need a virtual environment?
6:22 Well, we're going to install a whole
6:24 bunch of Python dependencies. And by
6:26 using a virtual environment, those
6:28 dependencies will only be installed in
6:31 this project. So it's only scoped to this
6:33 project. If you don't activate the
6:35 virtual environment, everything will
6:36 still work. But all of these
6:38 dependencies will be installed globally
6:40 on your machine, which could affect
6:42 other projects or scripts on your
6:44 machine. So really, this is not a lot of
6:46 effort. Just activate your virtual
6:48 environment. So let's install our Python
6:52 dependencies by running pip install -r
6:55 requirements.txt. Now again, all of
6:57 this is in that readme file. Cool. We've
6:59 now installed the project dependencies.
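For reference, the setup steps so far boil down to this (shown in the Linux/Mac flavor; on Windows swap python3 for python and activate with venv\Scripts\activate, and run it from the repo root where requirements.txt lives):

```shell
# Create a virtual environment inside the project folder
python3 -m venv venv

# Activate it so installed packages stay scoped to this project
source venv/bin/activate

# Install the repo's dependencies
pip install -r requirements.txt
```
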
7:01 Now, this framework uses the anthropic
7:04 models for the initialization agent and
7:06 the coding agent. This also means we
7:09 have to provide an anthropic API key.
7:11 And if you're using the quick start from
7:13 anthropic, they only allow you to use
7:15 the API key, which can actually be
7:18 really, really expensive. But I'm going
7:20 to show you a way cheaper solution.
7:23 First, let's rename this .env.example
7:27 file. So let's rename it to .env. Now in
7:29 this file you have a choice of two
7:32 variables. We can either provide the
7:35 anthropic API key which is the default
7:37 or we can use our Claude Code OAuth
7:39 token. So if you're already using Claude
7:41 Code and you've got a Claude subscription,
7:43 you can simply piggyback on your
7:46 subscription. And trust me, this agent
7:48 uses a lot of tokens and it runs for
7:51 hours. So, in my opinion, using the
7:54 Anthropic API key is simply not an
7:56 option. So, if you've got the basic $20
7:58 Claude subscription, you can run this
8:01 process for hours and for days and for
8:03 weeks without ever going over that
8:05 subscription cost. So, I'm actually
8:07 going to comment out this anthropic API
8:10 key and I'm going to use my Claude code
8:12 subscription instead. Now, I had no idea
8:13 that you could use the Claude Code
8:17 OAuth token in the Agent SDK. So, I do
8:19 want to give a shout out to a friend of
8:21 the channel, Web Dev Cody. He worked
8:23 with me on Discord to get all of this
8:25 working and he's got some brilliant
8:28 content on agentic coding. Cody also has
8:29 a fantastic course on learning how to
8:32 use agentic coding to build full stack
8:33 applications. So, definitely go to
8:36 agenticjumpstart.com and tell him Leon
8:38 sent you. I'm not getting paid for this
8:40 at all. He's a good friend of the
8:42 channel and I highly recommend checking
8:44 his stuff out. So just run the command
8:48 claude setup-token. You will be asked to
8:50 authorize this token. So just click on
8:53 authorize. You can now close the browser
8:55 window. Then in the terminal you can
8:58 simply copy the token and add it to the
9:00 .env file. Now before we move off the
9:03 file, you will also notice this optional
9:06 variable for the progress n8n webhook. So if
9:09 you want, you can uncomment this variable
9:11 and provide a link to your n8n
9:13 instance. So as the agent is making
9:15 progress, it will send some valuable
9:18 status updates to this endpoint and then
9:20 you can do whatever you want with it.
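Put together, the finished .env ends up looking roughly like this. The variable names below are my reconstruction from the audio, so double-check them against the repo's .env.example before relying on them:

```
# Option 1: pay-per-token (expensive for long runs)
# ANTHROPIC_API_KEY=sk-ant-...

# Option 2: piggyback on a Claude subscription (token from: claude setup-token)
CLAUDE_CODE_OAUTH_TOKEN=<paste the token here>

# Optional: n8n webhook that receives progress updates
# PROGRESS_N8N_WEBHOOK_URL=https://your-n8n-instance/webhook/autocoder
```
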
9:23 You could email the results to yourself.
9:25 You could send updates to Telegram,
9:27 whatever. I'll simply leave this
9:29 commented out for now. Now we can
9:32 finally test this application. Now this
9:34 prompts folder is really important. This
9:38 contains three files: the appspec, which
9:41 is critical. This appspec file actually
9:44 drives the entire solution and this is
9:46 something you have to provide. So this
9:47 is where you can explain what the
9:49 project is about. So you've got this
9:52 overview section, the tech stack for the
9:54 front end, the back end, communication
9:57 layer. We can also specify prerequisites
10:00 and of course all the core features. And
10:04 this is a massive list of features. Now
10:05 don't worry, you don't have to type all
10:07 of this stuff out by hand. You can of
10:10 course just give this file to an
10:12 agent and say, hey, here's an example
10:15 appspec file. You can replace all of
10:18 this with my app's requirements. And of
10:19 course on my channel we have a look at
10:22 very cool ways to simplify this even
10:24 further. I'll show you in a second. Now
10:26 we also have this coding prompt file and
10:29 this will be used by the coding agent.
10:31 The same with the initializer prompt.
10:33 Now you don't really have to modify
10:35 these files. I personally made quite a
10:37 few changes to these files in this
10:39 project because I actually used this
10:42 extensively in the last week and I felt
10:44 that the anthropic demo actually still
10:46 had a few gaps in it. As an example, I
10:48 noticed that the coding agent would
10:49 create the app with a whole bunch of
10:52 pages and these pages would show
10:54 results, but those results were all
10:57 hardcoded mock data. And when the agent
10:59 did testing, it looked at the page and
11:01 it simply said, "Oh, it looks like
11:02 everything is working. The page is
11:04 showing up and I can see a bunch of
11:06 values." But at no point did it consider
11:09 that this might be mock data and that
11:11 mock data needs to be replaced with real
11:13 time data. So I added a lot of steps in
11:15 these prompts to force the agents to
11:18 ensure that the data it is looking at
11:20 is actually real. Now the only thing you
11:22 might want to change yourself in this
11:25 initialization prompt is this section
11:27 where it says you need to create a
11:30 feature list with 200 detailed test
11:33 cases. Now, this really depends on your
11:35 application. If you're building a simple
11:37 to-do list app that only you will use,
11:39 then you definitely don't need 200
11:41 features, right? Or if you're building
11:43 something massive like an enterprise
11:46 scale application, you might want to
11:48 bump this up to 500 features. Now,
11:50 again, I'm giving you a really simple
11:52 way to automate all of this. So, instead
11:54 of trying to type out all of this
11:57 manually, I added a custom prompt to
12:00 this .claude folder. This create-spec
12:02 file. Now, this is a really detailed
12:04 prompt, but this is going to help the
12:06 agent populate all of the stuff for you.
12:09 So, let's open up our terminal. I'm
12:11 actually just going to open up another
12:14 session and I'm going to start Claude Code.
12:16 So, all we have to do is run the custom
12:19 command /create-spec. Right? So,
12:21 the agent's going to ask us a few
12:23 questions like what do you want to call
12:25 this project in your own words? What are
12:28 you building? And who will use it? Just
12:30 you or others too? So this will tell the
12:32 agent whether or not user authentication
12:34 is required. Help me build an
12:36 application that I can use to come up
12:39 with unique YouTube titles. So I will
12:42 provide the topic and idea of the video.
12:45 And this app will then call open router
12:47 to generate unique YouTube ideas. And
12:49 what I also want is for a second agent
12:52 to review the titles to give feedback to
12:54 the first agent. And then that agent
12:57 needs to rewrite the titles until we get
13:00 really good high clickthrough rate title
13:03 ideas. Only I will use this application
13:06 and no one else. We can just call this
13:08 TitleSmith. I don't know, something
13:12 like that. So let's simply run this. And
13:14 I'm currently in editing mode. It really
13:16 doesn't matter. If you want you can just
13:18 go into planning mode to make sure the
13:20 agent won't accidentally make any
13:22 changes. So this custom prompt will
13:24 force Claude Code to ask you clarifying
13:27 questions and I really love this. So you
13:29 can choose between quick mode and
13:31 detailed mode. In quick mode, we can
13:33 describe the app at a high level without
13:35 really providing any details on the
13:37 technical architecture. This could be
13:39 ideal for vibe coders or for someone
13:41 that really doesn't understand this tech
13:43 stack. Or if you really want to dive
13:44 into the weeds of how everything should
13:47 work, you can go into detailed mode.
13:49 I'll just go with quick mode. So how
13:52 complex is your application? So simple,
13:55 medium or complex. By the way, this will
13:57 determine how many features we will add
13:59 to this initialization prompt. So this
14:02 value over here. But as you can see, I'm
14:03 really trying to abstract all of that
14:06 away. So let's just say simple. Any
14:08 technology preferences or should I
14:11 choose sensible defaults? I'll just go
14:13 with defaults. Right. The agent is
14:16 asking us a few more questions like how
14:18 do we envision the output to work and
14:21 the generation process. I'm actually
14:23 just going to say you choose. Of course,
14:25 in your application, you probably want
14:27 to be a bit more involved in this, but
14:29 for tutorial sake, let's just get the
14:32 agent to decide. And cool. So, this app
14:34 spec file was updated. The project name
14:36 is now titlesmith with a proper
14:39 overview. And our agent now populated
14:41 the tech stack. So it covers the front
14:44 end, back end, the prerequisites,
14:46 security and access control, and of
14:49 course all of these key features. And
14:51 looking at the initializer prompt, our
14:54 agent decided to create 150 unique
14:56 test cases. So now that we have our
14:58 appspec, we can finally go ahead and
15:00 implement this solution. And for this,
15:02 let's go back to that Python
15:04 environment. Now to start this process,
15:06 we have to run the following command. In
15:08 fact, let's go to the readme file under
15:11 quick start. We can simply copy this
15:13 command and let's paste it into the
15:15 terminal. Now all we have to change is
15:18 the name of the project folder. So I'll
15:20 just call this titlesmith. And that's
15:22 really it. Let's run this. The
15:24 initializer agent is now running. And
15:27 this is going to create a subfolder. So
15:30 if we go to the generations folder, we
15:32 can now see a subfolder called
15:34 titlesmith. And the initializer agent is
15:36 now doing a lot of work. It's going to
15:39 create a feature list file. And by the
15:40 way, this can take a few minutes to
15:43 complete. These feature list files are
15:45 massive. It will then also set up the
15:48 basic project structure, right? Our
15:49 initializer just created this feature
15:51 list file. So, let's have a quick look
15:55 at it. This file is massive. And for a
15:57 small app like this, this file is
15:59 already 1,922
16:01 lines long. Each and every feature
16:04 contains a description on what it is, as
16:06 well as all of the steps needed to
16:08 implement this feature. And each feature
16:12 also contains a property called passes
16:14 which is false by default. So as the
16:16 agent works through this list, it will
16:19 implement a change, test it, and then
16:22 set passes to true. It will then move on
16:24 to the next feature. What's really cool
16:25 is that these coding agents have
16:28 instructions to retrieve two features
16:30 that have already been implemented at
16:33 random and then do regression testing on
16:36 those features and fix any bugs. So this
16:38 means that if any feature actually broke
16:40 one of the existing features, the agent
16:42 will automatically pick up the issue
16:44 and address it. Besides the feature
16:46 list, this initializer agent will also
16:49 set up all of the project dependencies.
16:51 So it will create the project structure
16:53 and install any dependencies. All right,
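To make the feature-list mechanics concrete, here's a sketch of what one entry might look like and how a coding agent could choose its next task. The passes flag matches what's shown on screen; the other field names, and the assumption that a lower priority number means more important, are mine, not the repo's exact schema:

```python
# Illustrative entries from the generated feature list
features = [
    {"id": 7, "priority": 2, "description": "Reviewer agent scores titles", "passes": False},
    {"id": 1, "priority": 1, "description": "Generate titles from a topic", "passes": True},
    {"id": 9, "priority": 1, "description": "Rewrite titles from feedback", "passes": False},
]

# The coding agent's loop: take the most important feature not yet passing
todo = [f for f in features if not f["passes"]]
next_feature = min(todo, key=lambda f: f["priority"])
print(next_feature["id"])  # → 9

# After implementing and testing it, the agent flips the flag
next_feature["passes"] = True
```

Because the flag lives in the file rather than the conversation, a fresh session with a cleared context can still see exactly what's done and what isn't.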
16:55 so the initialization agent has now set
16:58 up the project and the feature list file
17:00 and now it's updating this Claude progress
17:02 text file. This file is really useful
17:04 for keeping track of the current
17:06 progress. Now, this is really where the
17:09 fun begins. The agent SDK is now going
17:11 to use the coding agent to implement all
17:13 of these features. And honestly, you can
17:15 now step back and let the agent do its
17:17 thing. This coding agent will now have a
17:19 look at the feature list and retrieve
17:21 any features that have not yet been
17:23 implemented. So, any feature where
17:26 passes equals false. It will then look
17:28 at the highest priority feature and
17:30 implement that first. It will also do
17:32 regression testing on any features that
17:35 have already been implemented. Now,
17:36 there are a few things that I do want to
17:38 mention about the coding agent. First,
17:41 if we go to this autonomous agent demo
17:44 file and we scroll down, we can see that
17:46 we're currently using Opus to implement
17:49 this project. By default, the Anthropic
17:52 demo actually uses Sonnet. So, if you
17:54 prefer to use Sonnet, you can simply
17:56 comment out this line and save this
17:59 file. But honestly I just prefer Opus.
18:01 Then the second thing is if we go to
18:04 this client file we can see all the MCP
18:07 servers and tools that are available to
18:09 this agent. So if we go down to this
18:12 Claude SDK client section, here we can
18:15 see all the MCP servers. The Anthropic
18:18 demo actually uses Puppeteer for end-to-end
18:20 testing, but I did a side-by-side
18:23 comparison and Playwright is way faster.
18:25 I'm not sure why they decided on
18:27 Puppeteer. Maybe you can tell me in the
18:29 comments. But honestly, Playwright was
18:32 just so much faster. And you might be
18:33 wondering, well, what are Puppeteer and
18:36 Playwright used for? This coding agent
18:38 really likes to do end-to-end testing.
18:40 It does this by opening the browser
18:43 window. Then it takes a screenshot of
18:45 the browser window and it uses the
18:47 agent's vision to analyze the image and
18:49 it will then determine if there's any UI
18:53 issues, etc. Now, I find that process to
18:55 be really slow. So I'm actually running
18:58 Playwright in headless mode. The agent
18:59 will still be able to see all the
19:02 elements by actually just looking at the
19:04 HTML code. But if for some reason you
19:06 want the agent to use the browser, you
19:09 can simply comment out this first line
19:11 and add back the second line. So this
19:14 will run the Playwright MCP server where
19:16 it will actually use the browser window.
19:19 And I'm just providing a viewport size.
19:22 So the screenshots are not too big. Now,
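In practice, the headless/headed switch comes down to which arguments the Playwright MCP server is launched with. This is a sketch of that choice in the common command/args shape for MCP server configs, not the repo's exact code; --headless and --viewport-size are real Playwright MCP options:

```python
# Headless: agent inspects pages via the HTML, no visible browser
playwright_headless = {
    "command": "npx",
    "args": ["@playwright/mcp@latest", "--headless"],
}

# Headed: agent drives a visible browser and takes screenshots;
# a fixed viewport keeps those screenshots a manageable size
playwright_headed = {
    "command": "npx",
    "args": ["@playwright/mcp@latest", "--viewport-size", "1280,720"],
}

# Swap which entry the client registers to change the behavior
mcp_servers = {"playwright": playwright_headless}
print(mcp_servers["playwright"]["args"][-1])  # → --headless
```
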
19:24 this process can run for hours, days, or
19:26 even weeks. It really depends on how
19:29 large and complex your project is. Now,
19:31 I personally wanted some way of
19:33 receiving updates every time the agent
19:35 makes progress. I don't want to go and
19:37 babysit my monitor and see what's going
19:40 on. So, this is totally optional, but if
19:43 you want to receive notifications, I've
19:45 actually integrated n8n into this
19:48 workflow. So in the .env file there's
19:50 this progress n8n webhook URL
19:53 variable. I'm actually going to uncomment
19:55 this and I'm going to stop this
19:58 process just for now so that I can
20:00 actually show you how to implement this.
20:02 By the way, you can stop and resume this
20:04 workflow at any time. You just press
20:07 Ctrl+C to stop the process. And as you
20:10 can see here, to resume, simply run the
20:12 same command again. So we'll restart it
20:15 in a second. I'm just going to save this
20:17 .env file. And now all we have to do is
20:20 provide this n8n webhook URL. Again,
20:22 this is totally optional. You're more
20:23 than welcome to let this process run in
20:26 the background, but I personally want to
20:28 receive notifications. So, of course,
20:29 the first thing you need to do is open
20:32 up n8n and create a new workflow. If you
20:35 don't yet have an n8n instance, then what
20:37 you can simply do is use the link in the
20:39 description to go to this page.
20:41 Hostinger is without a doubt the
20:43 cheapest way to host an n8n instance.
20:45 So what you can do is choose a plan like
20:49 the KVM 1 plan, which is only $5 per month. I'll
20:51 go with the KVM 2 plan. Select your
20:54 application as n8n, and then under the
20:56 discount code you can enter the code
20:58 Leon and this will give you an
21:00 additional 10% off. You don't have to go
21:02 with 24 months either of course. You can
21:05 just go month-to-month or maybe a 12-month
21:07 period. Then simply continue with the
21:09 checkout process. Then after setting
21:11 your root password, Hostinger will build
21:13 your n8n instance and you'll have access
21:16 to this dashboard. All you really need
21:18 to do is click on manage app, and you will
21:20 now have access to your very own n8n
21:23 instance. How awesome is that? Cool.
21:25 Let's create our workflow. I'll just
21:27 give it a name like autocoder
21:29 notifications. Then let's add our
21:32 trigger node. And for this we need the
21:34 web hook trigger. Let's change the
21:37 method from get to post. Let's give it a
21:40 path name like autocoder.
21:42 And that's actually it. What you can do
21:45 then is grab your production URL. Let's
21:47 just copy this and let's add that to
21:49 this variable. And the last thing we
21:52 have to do in n8n is to simply save this
21:55 workflow and let's activate it as well.
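Once the agent starts reporting in, the webhook body will carry something along these lines. The field names here are my approximation of what shows up in the executions view later in the video, not the framework's exact schema:

```python
import json

# Progress update the agent POSTs to the n8n webhook (illustrative shape)
passing, total = 37, 150
payload = {
    "event": "feature_completed",
    "tests_passing": passing,
    "tests_total": total,
    "percent_complete": round(100 * passing / total, 1),
    "completed_tasks": [
        "Generate five title ideas from a topic",
        "Reviewer agent scores each title",
    ],
}

body = json.dumps(payload)  # what actually goes over the wire
print(payload["percent_complete"])  # → 24.7
```
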
21:57 So let's restart this process. Now
21:58 thankfully it won't run the
22:00 initialization agent again as it's
22:02 already run. The coding agent will
22:04 simply pick up from where it left off.
22:06 And as this agent is working through
22:08 these changes, I can already see that
22:10 n8n was triggered. So if I go to
22:13 executions, I can see one execution
22:15 already. This is everything our
22:18 autonomous agent just sent to n8n. So it
22:20 includes this body property which
22:23 includes the name of the event, how many
22:25 tests are passing, how many there are in
22:28 total, the percentage completed, as well
22:31 as a list of completed tasks. And now of
22:33 course then you can use that information
22:36 to send emails or WhatsApp messages or
22:39 Telegram messages to yourself. The sky
22:41 really is the limit. So I decided to
22:43 send Telegram messages, and I just sent
22:46 like the project name, the tests
22:49 completed and whatever else. And that
22:51 resulted in something that looks like
22:53 this. So it's got the project name, the
22:55 list of tests that were completed, the
22:58 total tests, etc. And this way I could
23:01 get notifications to my phone every time
23:03 something was implemented. If you are
23:05 curious to see how I implemented that
23:07 Telegram integration, then you can
23:09 download it from my community which I'll
23:10 link to in the description of this
23:12 video. I hope you found this video
23:14 useful. If you did, hit the like button
23:16 and subscribe to my channel for more
23:19 Claude Code and Agentic Coding content.
23:21 Thank you for watching. I'll see you in