0:02 Hey there. This is not the Claude
0:04 website. This is actually a one-to-one
0:07 clone that I built with over 200 unique
0:10 features. And I didn't write a single
0:13 line of code myself. An AI agent built
0:16 this entire thing while I was sleeping.
0:18 So, here's the problem. When you're
0:20 building a project this big, we're
0:22 talking full conversations,
0:26 projects, artifacts, file uploads, all
0:29 of it, you hit the wall pretty quickly.
0:31 The context window fills up and the
0:33 agent loses track of what it was doing.
0:34 And if you've tried to build anything
0:36 this substantial with coding agents
0:38 before, you know exactly what I'm
0:40 talking about. And compacting the
0:42 conversation is just not good enough.
0:44 The workaround that a lot of people use
0:47 is to manually orchestrate everything.
0:49 You would create an implementation plan
0:51 using your agent, maybe store that plan
0:53 somewhere in your project folder. You
0:55 could even use something like Spec Kit and
0:58 BMAD to do this. And you then get the
1:00 agent to implement these features one by
1:02 one. You then clear the conversation
1:04 after each session and ask the agent to
1:07 implement the next feature. Rinse and
1:09 repeat. This works, but it's exhausting,
1:12 especially for larger projects. You're
1:14 effectively babysitting the agent the
1:16 entire time. What I'm about to show you
1:18 is completely different. You give your
1:20 requirements once and an initialization
1:23 agent will break everything down into a
1:26 detailed feature list. And then coding
1:28 agents take over implementing one
1:31 feature at a time, testing, committing
1:34 the changes, clearing the context
1:36 window, and picking up the next feature
1:39 automatically. This even does regression
1:41 testing before moving on to the next
1:44 feature. This ran for hours while I did
1:46 absolutely nothing. And by the end of
1:49 the process, we had a fully functional
1:51 clone of the Claude website. In this
1:53 video, I'll show you exactly how to set
1:55 this up yourself. I've really simplified
1:57 the process so you don't have to be a
1:59 developer to follow along. And as an
2:01 added bonus, I'll show you how to
2:04 integrate with n8n to get real-time
2:06 updates as your agent is making
2:08 progress. In this instance, the agent
2:11 sent me notifications to Telegram every
2:13 time it completed a new feature. This is
2:15 all based on an article written by
2:19 Anthropic about an effective harness for
2:22 long-running agents. This is a brilliant
2:24 article and I actually recommend you
2:26 read it. It's all about getting agents
2:28 to perform tasks that would take a lot
2:31 of time and context. As AI agents become
2:34 more capable, developers are relying on
2:37 these agents to implement way more
2:39 complex tasks. And these tasks can take
2:42 hours if not days to implement. So the
2:44 challenge when you're using something
2:46 like Spec Kit and BMAD or even just the
2:49 planning mode in your IDE is that agents
2:51 will actually have to work in sessions
2:53 because the context window will fill up
2:56 as it's working through the solution and
2:58 at some point the quality is going to
3:00 decrease and you might actually have to
3:02 compact the session which will summarize
3:04 the conversation dropping off a lot of
3:07 important context. Imagine software
3:09 engineers working in shifts, where each
3:12 new engineer arrives with no memory of
3:14 what happened in the previous shift.
3:16 That is exactly the problem here. Even
3:18 if you clear the context and ask the
3:20 agent to implement the next feature, it
3:22 has no idea what's been implemented
3:25 already. So what this project proposes
3:27 is that we use a two-fold solution where
3:30 we can use something like the Claude Agent
3:33 SDK to plan and implement the solution
3:35 in two phases. First, we'll have an
3:37 initialization agent which will
3:39 basically take in your prompt and create
3:42 a feature list from that and it will
3:44 also set up the basic project structure.
3:46 Once that's done, the framework will use
3:49 coding agents to implement the features
3:51 one task at a time. So, these agents
3:54 will make incremental progress in every
3:56 session. Now, they don't mention it
3:57 here, but something I really like about
4:00 their solution is when the coding agent
4:02 starts a session, it will pick two
4:04 features that have already been
4:06 implemented at random and do regression
4:09 testing on them and then fix any issues
4:11 before moving on to the next feature.
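That regression step is easy to picture in code. Here's a minimal sketch, assuming each entry in the feature list carries a passes flag (which the video shows later); the helper itself is my illustration, not code from the repo:

```python
import random

def pick_regression_targets(features, count=2):
    """Pick up to `count` already-implemented features, at random,
    to regression-test before starting the next feature."""
    implemented = [f for f in features if f.get("passes")]
    return random.sample(implemented, min(count, len(implemented)))

# Illustrative feature list (the shape is an assumption)
features = [
    {"id": 1, "description": "User can log in", "passes": True},
    {"id": 2, "description": "User can upload files", "passes": True},
    {"id": 3, "description": "User can create projects", "passes": False},
]

targets = pick_regression_targets(features)
print(sorted(f["id"] for f in targets))  # → [1, 2]
```

If either regression target fails, the agent fixes it before touching new work.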
4:13 So, you can definitely read through this
4:15 article, but what I do want to focus on
4:17 is their quick start where they give you
4:19 access to an example project that
4:21 implements all of this. Now, the setup
4:24 process is not too complicated, so you
4:25 can definitely try this yourself, but
4:27 I'm actually going to show you an even
4:29 easier way to get going. In the
4:31 description, you'll find a link to this
4:33 repository. I simply took their project
4:35 and modified it slightly, so it's a bit
4:38 easier to work with. So really all you
4:40 have to do is click on Code, and you can
4:42 either download this as a zip file or, if
4:45 you've got Git installed, simply copy
4:47 this link and clone it. Then extract the contents of
4:49 that zip file and open the folder
4:52 in a code editor. I'm using Cursor, but
4:55 you can use VS code or whatever editor
4:57 you want. Now the project is really
4:59 straightforward. There's a bunch of
5:02 Python files, like the agent, autonomous agent
5:05 demo, and client files. This basically
5:08 uses the agent SDK to set up this entire
5:11 project. Now, one file you might want to
5:13 go through is the readme file. This is
5:15 where I give you detailed instructions
5:18 on how to set everything up. So, there
5:19 are a few dependencies that we have to
5:22 install and it also shows you how to set
5:25 up any environment variables and finally
5:28 how to start this project. But we'll go
5:30 through all of that in detail. Now,
5:32 since this project uses Python, I do
5:34 recommend setting up a virtual
5:36 environment. If you're new to Python,
5:38 this is really easy to set up. Let's
5:40 create a new terminal window. And in the
5:43 terminal, let's run python. If you're
5:46 using Mac or Linux, it's python3,
5:48 but for Windows, it's just python.
5:53 Then -m venv venv.
5:58 So, it looks something like this. This
6:00 will create a new virtual environment
6:02 within this folder. Now, we have to
6:05 activate this virtual environment. On
6:08 Linux and Mac, it's this command. Or if
6:10 you're using Windows like I am, the
6:12 command looks something like this. So,
6:14 then press enter. And if everything was
6:16 done correctly, you should see the
6:19 virtual environment name over here. So,
6:21 why do we need a virtual environment?
6:22 Well, we're going to install a whole
6:24 bunch of Python dependencies. And by
6:26 using a virtual environment, those
6:28 dependencies will only be installed in
6:31 this project. So it's only scoped to this
6:33 project. If you don't activate the
6:35 virtual environment, everything will
6:36 still work. But all of these
6:38 dependencies will be installed globally
6:40 on your machine, which could affect
6:42 other projects or scripts on your
6:44 machine. So really, this is not a lot of
6:46 effort. Just activate your virtual
6:48 environment. So let's install our Python
6:52 dependencies by running pip install -r
6:55 requirements.txt. Now again, all of
6:57 this is in that readme file. Cool. We've
6:59 now installed the project dependencies.
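For reference, the setup steps so far boil down to this (shown in the Linux/Mac flavor; on Windows swap python3 for python and activate with venv\Scripts\activate, and run it from the repo root where requirements.txt lives):

```shell
# Create a virtual environment inside the project folder
python3 -m venv venv

# Activate it so installed packages stay scoped to this project
source venv/bin/activate

# Install the repo's dependencies
pip install -r requirements.txt
```
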
7:01 Now, this framework uses the anthropic
7:04 models for the initialization agent and
7:06 the coding agent. This also means we
7:09 have to provide an anthropic API key.
7:11 And if you're using the quick start from
7:13 anthropic, they only allow you to use
7:15 the API key, which can actually be
7:18 really, really expensive. But I'm going
7:20 to show you a way cheaper solution.
7:23 First, let's rename this .env.example
7:27 file. So let's rename it to .env. Now in
7:29 this file you have a choice of two
7:32 variables. We can either provide the
7:35 anthropic API key which is the default
7:37 or we can use our Claude Code OAuth
7:39 token. So if you're already using Claude
7:41 Code and you've got a Claude subscription,
7:43 you can simply piggyback on your
7:46 subscription. And trust me, this agent
7:48 uses a lot of tokens and it runs for
7:51 hours. So, in my opinion, using the
7:54 Anthropic API key is simply not an
7:56 option. So, if you've got the basic $20
7:58 Claude subscription, you can run this
8:01 process for hours and for days and for
8:03 weeks without ever going over that
8:05 subscription cost. So, I'm actually
8:07 going to comment out this anthropic API
8:10 key and I'm going to use my Claude code
8:12 subscription instead. Now, I had no idea
8:13 that you could use the Claude Code
8:17 OAuth token in the Agent SDK. So, I do
8:19 want to give a shout out to a friend of
8:21 the channel, Web Dev Cody. He worked
8:23 with me on Discord to get all of this
8:25 working and he's got some brilliant
8:28 content on agentic coding. Cody also has
8:29 a fantastic course on learning how to
8:32 use agentic coding to build full stack
8:33 applications. So, definitely go to
8:36 agenticjumpstart.com and tell him Leon
8:38 sent you. I'm not getting paid for this
8:40 at all. He's a good friend of the
8:42 channel and I highly recommend checking
8:44 his stuff out. So just run the command
8:48 claude setup-token. You will be asked to
8:50 authorize this token. So just click on
8:53 authorize. You can now close the browser
8:55 window. Then in the terminal you can
8:58 simply copy the token and add it to the
9:00 .env file. Now before we move off the
9:03 file, you will also notice this optional
9:06 variable for the progress n8n webhook. So if
9:09 you want, you can uncomment this variable
9:11 and provide a link to your n8n
9:13 instance. So as the agent is making
9:15 progress, it will send some valuable
9:18 status updates to this endpoint and then
9:20 you can do whatever you want with it.
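Put together, the finished .env ends up looking roughly like this. The variable names below are my reconstruction from the audio, so double-check them against the repo's .env.example before relying on them:

```
# Option 1: pay-per-token (expensive for long runs)
# ANTHROPIC_API_KEY=sk-ant-...

# Option 2: piggyback on a Claude subscription (token from: claude setup-token)
CLAUDE_CODE_OAUTH_TOKEN=<paste the token here>

# Optional: n8n webhook that receives progress updates
# PROGRESS_N8N_WEBHOOK_URL=https://your-n8n-instance/webhook/autocoder
```
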
9:23 You could email the results to yourself.
9:25 You could send updates to Telegram,
9:27 whatever. I'll simply leave this
9:29 commented out for now. Now we can
9:32 finally test this application. Now this
9:34 prompts folder is really important. This
9:38 contains three files: the appspec, which
9:41 is critical. This appspec file actually
9:44 drives the entire solution and this is
9:46 something you have to provide. So this
9:47 is where you can explain what the
9:49 project is about. So you've got this
9:52 overview section, the tech stack for the
9:54 front end, the back end, communication
9:57 layer. We can also specify prerequisites
10:00 and of course all the core features. And
10:04 this is a massive list of features. Now
10:05 don't worry, you don't have to type all
10:07 of this stuff out by hand. You can of
10:10 course just give this file to an
10:12 agent and say, hey, here's an example
10:15 appspec file. You can replace all of
10:18 this with my app's requirements. And of
10:19 course on my channel we have a look at
10:22 very cool ways to simplify this even
10:24 further. I'll show you in a second. Now
10:26 we also have this coding prompt file and
10:29 this will be used by the coding agent.
10:31 The same with the initializer prompt.
10:33 Now you don't really have to modify
10:35 these files. I personally made quite a
10:37 few changes to these files in this
10:39 project because I actually used this
10:42 extensively in the last week and I felt
10:44 that the anthropic demo actually still
10:46 had a few gaps in it. As an example, I
10:48 noticed that the coding agent would
10:49 create the app with a whole bunch of
10:52 pages and these pages would show
10:54 results, but those results were all
10:57 hardcoded mock data. And when the agent
10:59 did testing, it looked at the page and
11:01 it simply said, "Oh, it looks like
11:02 everything is working. The page is
11:04 showing up and I can see a bunch of
11:06 values." But at no point did it consider
11:09 that this might be mock data and that
11:11 mock data needs to be replaced with real
11:13 time data. So I added a lot of steps in
11:15 these prompts to force the agents to
11:18 ensure that the data it is looking at
11:20 is actually real. Now the only thing you
11:22 might want to change yourself in this
11:25 initialization prompt is this section
11:27 where it says you need to create a
11:30 feature list with 200 detailed test
11:33 cases. Now, this really depends on your
11:35 application. If you're building a simple
11:37 to-do list app that only you will use,
11:39 then you definitely don't need 200
11:41 features, right? Or if you're building
11:43 something massive like an enterprise
11:46 scale application, you might want to
11:48 bump this up to 500 features. Now,
11:50 again, I'm giving you a really simple
11:52 way to automate all of this. So, instead
11:54 of trying to type out all of this
11:57 manually, I added a custom prompt to
12:00 this .claude folder. This create-spec
12:02 file. Now, this is a really detailed
12:04 prompt, but this is going to help the
12:06 agent populate all of the stuff for you.
12:09 So, let's open up our terminal. I'm
12:11 actually just going to open up another
12:14 session and I'm going to start Claude Code.
12:16 So, all we have to do is run the custom
12:19 command /create-spec. Right? So,
12:21 the agent's going to ask us a few
12:23 questions like what do you want to call
12:25 this project in your own words? What are
12:28 you building? And who will use it? Just
12:30 you or others too? So this will tell the
12:32 agent whether or not user authentication
12:34 is required. Help me build an
12:36 application that I can use to come up
12:39 with unique YouTube titles. So I will
12:42 provide the topic and idea of the video.
12:45 And this app will then call open router
12:47 to generate unique YouTube ideas. And
12:49 what I also want is for a second agent
12:52 to review the titles to give feedback to
12:54 the first agent. And then that agent
12:57 needs to rewrite the titles until we get
13:00 really good high clickthrough rate title
13:03 ideas. Only I will use this application
13:06 and no one else. We can just call this
13:08 TitleSmith. I don't know, something
13:12 like that. So let's simply run this. And
13:14 I'm currently in editing mode. It really
13:16 doesn't matter. If you want you can just
13:18 go into planning mode to make sure the
13:20 agent won't accidentally make any
13:22 changes. So this custom prompt will
13:24 force Claude Code to ask you clarifying
13:27 questions and I really love this. So you
13:29 can choose between quick mode and
13:31 detailed mode. In quick mode, we can
13:33 describe the app at a high level without
13:35 really providing any details on the
13:37 technical architecture. This could be
13:39 ideal for vibe coders or for someone
13:41 that really doesn't understand this tech
13:43 stack. Or if you really want to dive
13:44 into the weeds of how everything should
13:47 work, you can go into detailed mode.
13:49 I'll just go with quick mode. So how
13:52 complex is your application? So simple,
13:55 medium or complex. By the way, this will
13:57 determine how many features we will add
13:59 to this initialization prompt. So this
14:02 value over here. But as you can see, I'm
14:03 really trying to abstract all of that
14:06 away. So let's just say simple. Any
14:08 technology preferences or should I
14:11 choose sensible defaults? I'll just go
14:13 with defaults. Right. The agent is
14:16 asking us a few more questions like how
14:18 do we envision the output to work and
14:21 the generation process. I'm actually
14:23 just going to say you choose. Of course,
14:25 in your application, you probably want
14:27 to be a bit more involved in this, but
14:29 for tutorial sake, let's just get the
14:32 agent to decide. And cool. So, this app
14:34 spec file was updated. The project name
14:36 is now titlesmith with a proper
14:39 overview. And our agent now populated
14:41 the tech stack. So it covers the front
14:44 end, back end, the prerequisites,
14:46 security and access control, and of
14:49 course all of these key features. And
14:51 looking at the initializer prompt, our
14:54 agent decided to create 150 unique
14:56 test cases. So now that we have our
14:58 appspec, we can finally go ahead and
15:00 implement this solution. And for this,
15:02 let's go back to that Python
15:04 environment. Now to start this process,
15:06 we have to run the following command. In
15:08 fact, let's go to the readme file under
15:11 quick start. We can simply copy this
15:13 command and let's paste it into the
15:15 terminal. Now all we have to change is
15:18 the name of the project folder. So I'll
15:20 just call this titlesmith. And that's
15:22 really it. Let's run this. The
15:24 initializer agent is now running. And
15:27 this is going to create a subfolder. So
15:30 if we go to the generations folder, we
15:32 can now see a subfolder called
15:34 titlesmith. And the initializer agent is
15:36 now doing a lot of work. It's going to
15:39 create a feature list file. And by the
15:40 way, this can take a few minutes to
15:43 complete. These feature list files are
15:45 massive. It will then also set up the
15:48 basic project structure, right? Our
15:49 initializer just created this feature
15:51 list file. So, let's have a quick look
15:55 at it. This file is massive. And for a
15:57 small app like this, this file is
15:59 already 1,922
16:01 lines long. Each and every feature
16:04 contains a description on what it is, as
16:06 well as all of the steps needed to
16:08 implement this feature. And each feature
16:12 also contains a property called passes
16:14 which is false by default. So as the
16:16 agent works through this list, it will
16:19 implement a change, test it, and then
16:22 set passes to true. It will then move on
16:24 to the next feature. What's really cool
16:25 is that these coding agents have
16:28 instructions to retrieve two features
16:30 that have already been implemented at
16:33 random and then do regression testing on
16:36 those features and fix any bugs. So this
16:38 means that if any feature actually broke
16:40 one of the existing features, the agent
16:42 will automatically pick up the issue
16:44 and address it. Besides the feature
16:46 list, this initializer agent will also
16:49 set up all of the project dependencies.
16:51 So it will create the project structure
16:53 and install any dependencies. All right,
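To make the feature-list mechanics concrete, here's a sketch of what one entry might look like and how a coding agent could choose its next task. The passes flag matches what's shown on screen; the other field names, and the assumption that a lower priority number means more important, are mine, not the repo's exact schema:

```python
# Illustrative entries from the generated feature list
features = [
    {"id": 7, "priority": 2, "description": "Reviewer agent scores titles", "passes": False},
    {"id": 1, "priority": 1, "description": "Generate titles from a topic", "passes": True},
    {"id": 9, "priority": 1, "description": "Rewrite titles from feedback", "passes": False},
]

# The coding agent's loop: take the most important feature not yet passing
todo = [f for f in features if not f["passes"]]
next_feature = min(todo, key=lambda f: f["priority"])
print(next_feature["id"])  # → 9

# After implementing and testing it, the agent flips the flag
next_feature["passes"] = True
```

Because the flag lives in the file rather than the conversation, a fresh session with a cleared context can still see exactly what's done and what isn't.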
16:55 so the initialization agent has now set
16:58 up the project and the feature list file
17:00 and now it's updating this Claude progress
17:02 text file. This file is really useful
17:04 for keeping track of the current
17:06 progress. Now, this is really where the
17:09 fun begins. The agent SDK is now going
17:11 to use the coding agent to implement all
17:13 of these features. And honestly, you can
17:15 now step back and let the agent do its
17:17 thing. This coding agent will now have a
17:19 look at the feature list and retrieve
17:21 any features that have not yet been
17:23 implemented. So, any feature where
17:26 passes equals false. It will then look
17:28 at the highest priority feature and
17:30 implement that first. It will also do
17:32 regression testing on any features that
17:35 have already been implemented. Now,
17:36 there are a few things that I do want to
17:38 mention about the coding agent. First,
17:41 if we go to this autonomous agent demo
17:44 file and we scroll down, we can see that
17:46 we're currently using Opus to implement
17:49 this project. By default, the Anthropic
17:52 demo actually uses Sonnet. So, if you
17:54 prefer to use Sonnet, you can simply
17:56 comment out this line and save this
17:59 file. But honestly I just prefer Opus.
18:01 Then the second thing is if we go to
18:04 this client file we can see all the MCP
18:07 servers and tools that are available to
18:09 this agent. So if we go down to this
18:12 Claude SDK client section, here we can
18:15 see all the MCP servers. The Anthropic
18:18 demo actually uses Puppeteer for end-to-end
18:20 testing, but I did a side-by-side
18:23 comparison and Playwright is way faster.
18:25 I'm not sure why they decided on
18:27 Puppeteer. Maybe you can tell me in the
18:29 comments. But honestly, Playwright was
18:32 just so much faster. And you might be
18:33 wondering, well, what are Puppeteer and
18:36 Playwright used for? This coding agent
18:38 really likes to do end-to-end testing.
18:40 It does this by opening the browser
18:43 window. Then it takes a screenshot of
18:45 the browser window and it uses the
18:47 agent's vision to analyze the image and
18:49 it will then determine if there's any UI
18:53 issues, etc. Now, I find that process to
18:55 be really slow. So I'm actually running
18:58 Playwright in headless mode. The agent
18:59 will still be able to see all the
19:02 elements by actually just looking at the
19:04 HTML code. But if for some reason you
19:06 want the agent to use the browser, you
19:09 can simply comment out this first line
19:11 and add back the second line. So this
19:14 will run the Playwright MCP server where
19:16 it will actually use the browser window.
19:19 And I'm just providing a viewport size.
19:22 So the screenshots are not too big. Now,
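In practice, the headless/headed switch comes down to which arguments the Playwright MCP server is launched with. This is a sketch of that choice in the common command/args shape for MCP server configs, not the repo's exact code; --headless and --viewport-size are real Playwright MCP options:

```python
# Headless: agent inspects pages via the HTML, no visible browser
playwright_headless = {
    "command": "npx",
    "args": ["@playwright/mcp@latest", "--headless"],
}

# Headed: agent drives a visible browser and takes screenshots;
# a fixed viewport keeps those screenshots a manageable size
playwright_headed = {
    "command": "npx",
    "args": ["@playwright/mcp@latest", "--viewport-size", "1280,720"],
}

# Swap which entry the client registers to change the behavior
mcp_servers = {"playwright": playwright_headless}
print(mcp_servers["playwright"]["args"][-1])  # → --headless
```
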
19:24 this process can run for hours, days, or
19:26 even weeks. It really depends on how
19:29 large and complex your project is. Now,
19:31 I personally wanted some way of
19:33 receiving updates every time the agent
19:35 makes progress. I don't want to go and
19:37 babysit my monitor and see what's going
19:40 on. So, this is totally optional, but if
19:43 you want to receive notifications, I've
19:45 actually integrated n8n into this
19:48 workflow. So in the .env file there's
19:50 this progress n8n webhook URL
19:53 variable. I'm actually going to uncomment
19:55 this and I'm going to stop this
19:58 process just for now so that I can
20:00 actually show you how to implement this.
20:02 By the way, you can stop and resume this
20:04 workflow at any time. You just press
20:07 Ctrl+C to stop the process. And as you
20:10 can see here, to resume, simply run the
20:12 same command again. So we'll restart it
20:15 in a second. I'm just going to save this
20:17 .env file. And now all we have to do is
20:20 provide this n8n webhook URL. Again,
20:22 this is totally optional. You're more
20:23 than welcome to let this process run in
20:26 the background, but I personally want to
20:28 receive notifications. So, of course,
20:29 the first thing you need to do is open
20:32 up n8n and create a new workflow. If you
20:35 don't yet have an n8n instance, then what
20:37 you can simply do is use the link in the
20:39 description to go to this page.
20:41 Hostinger is without a doubt the
20:43 cheapest way to host an n8n instance.
20:45 So what you can do is choose a plan like
20:49 the KVM 1 plan, which is only $5 per month. I'll
20:51 go with the KVM 2 plan. Select your
20:54 application as n8n, and then under the
20:56 discount code you can enter the code
20:58 Leon and this will give you an
21:00 additional 10% off. You don't have to go
21:02 with 24 months either of course. You can
21:05 just go month-to-month or maybe a 12-month
21:07 period. Then simply continue with the
21:09 checkout process. Then after setting
21:11 your root password, Hostinger will build
21:13 your n8n instance and you'll have access
21:16 to this dashboard. All you really need
21:18 to do is click on manage app, and you will
21:20 now have access to your very own n8n
21:23 instance. How awesome is that? Cool.
21:25 Let's create our workflow. I'll just
21:27 give it a name like autocoder
21:29 notifications. Then let's add our
21:32 trigger node. And for this we need the
21:34 web hook trigger. Let's change the
21:37 method from get to post. Let's give it a
21:40 path name like autocoder.
21:42 And that's actually it. What you can do
21:45 then is grab your production URL. Let's
21:47 just copy this and let's add that to
21:49 this variable. And the last thing we
21:52 have to do in n8n is to simply save this
21:55 workflow and let's activate it as well.
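Once the agent starts reporting in, the webhook body will carry something along these lines. The field names here are my approximation of what shows up in the executions view later in the video, not the framework's exact schema:

```python
import json

# Progress update the agent POSTs to the n8n webhook (illustrative shape)
passing, total = 37, 150
payload = {
    "event": "feature_completed",
    "tests_passing": passing,
    "tests_total": total,
    "percent_complete": round(100 * passing / total, 1),
    "completed_tasks": [
        "Generate five title ideas from a topic",
        "Reviewer agent scores each title",
    ],
}

body = json.dumps(payload)  # what actually goes over the wire
print(payload["percent_complete"])  # → 24.7
```
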
21:57 So let's restart this process. Now
21:58 thankfully it won't run the
22:00 initialization agent again as it's
22:02 already run. The coding agent will
22:04 simply pick up from where it left off.
22:06 And as this agent is working through
22:08 these changes, I can already see that
22:10 n8n was triggered. So if I go to
22:13 executions, I can see one execution
22:15 already. This is everything our
22:18 autonomous agent just sent to n8n. So it
22:20 includes this body property which
22:23 includes the name of the event, how many
22:25 tests are passing, how many there are in
22:28 total, the percentage completed, as well
22:31 as a list of completed tasks. And now of
22:33 course then you can use that information
22:36 to send emails or WhatsApp messages or
22:39 Telegram messages to yourself. The sky
22:41 really is the limit. So I decided to
22:43 send Telegram messages, and I just sent
22:46 like the project name, the tests
22:49 completed and whatever else. And that
22:51 resulted in something that looks like
22:53 this. So it's got the project name, the
22:55 list of tests that were completed, the
22:58 total tests, etc. And this way I could
23:01 get notifications to my phone every time
23:03 something was implemented. If you are
23:05 curious to see how I implemented that
23:07 Telegram integration, then you can
23:09 download it from my community which I'll
23:10 link to in the description of this
23:12 video. I hope you found this video
23:14 useful. If you did, hit the like button
23:16 and subscribe to my channel for more
23:19 Claude Code and Agentic Coding content.
23:21 Thank you for watching. I'll see you in