YouTube 字幕：
'Prompting' Just Split Into 4 Skills. You Only Know One. Here's Why You Need the Other 3 in 2026.

不必从头看完视频——获取完整字幕，搜索关键词，一键复制。

AutoDub

听懂YouTube外语视频

沉浸式YouTube翻译中文配音

告别语言障碍，拥抱全球优质内容

免费使用

视频字幕

视频摘要

Summary

Core Theme

The way we interact with AI, termed "prompting," has fundamentally shifted due to the advent of autonomous agents. Traditional chat-based prompting is becoming obsolete, necessitating a new, multi-layered skill set to effectively leverage AI for complex, long-running tasks.

Mind Map

点击展开

点击探索完整互动思维导图

If you're prompting like it's last

month, you're already too late. And I'm

not just doing that for clickbait. If

you haven't updated how you think about

prompting since January 2026, you're

already behind. Opus 4.6, Gemini 3.1

Pro, and GPT 5.3 codecs have all shipped

in the past few weeks with autonomous

agent capabilities that make the

chatbased prompting most people are

practicing functionally obsolete for

serious work. These models don't just

answer better. They work autonomously

for a long time, for hours, for days

against specs without really checking

in. That changes what good at prompting

means on a fundamental level. And it's

time to revisit how we think about

prompting as a result. Not because

prompting stopped mattering. It actually

matters more than ever, but because the

word prompting is now hiding four

completely different skill sets, and

most people are only practicing one of

them. And the gap between the people who

see all four of them and the people who

don't is already 10x and widening. In

this piece, I'm going to lay out what

those four skills are, why the

distinction matters now, and exactly how

to build the skills you're missing. This

builds on my earlier work on intent

engineering, but it goes way beyond it

to lay out a full framework for how to

think about prompting post February

2026. Intent engineering is just one

layer in a larger stack. This is the

full stack for prompting post February,

post these new autonomous models. First,

what changed? The prompting skill that

mattered since 2024 has been really

conversational. You sit in a chat

window, you type a request, you read the

output, you iterate, you get better at

phrasing things, you provide examples,

you structure instructions. If you're

good at that, and if you've been

following this video series, you

probably are. You've been building real

skills. They work. you're faster than

you were a year ago. But that

fundamental chatbased skill has a

ceiling. And in early 2026, a lot of

people are hitting it because the models

have stopped really being chat partners

and started being workers. Workers that

run for a long time. I'm not kidding

when I say days and sometimes weeks. And

the thing about a worker that runs for a

long time is that everything you relied

on in a conversation, like your ability

to catch mistakes in real time, your

ability to provide missing context

accurately when the model asks, your

ability to course correct when things

drift. All of that must be encoded

before the agent starts. Not during the

course of a conversation, but at the

top. This is a fundamentally different

skill. It's not a harder version of the

same skill. It's actually different.

I've talked before about the importance

of thinking about prompting even in the

chat window as providing the relevant

context for the LLM to provide you an

accurate response. But this goes way

beyond that. If you were giving an agent

a longunning task, which is where most

of AI is going, even if you're not a

coder, then you have to think not about

how do I build for a chat response, but

how do I build for economically real

work that this agent will do and provide

the agent the relevant context for that.

And this shift is happening really

quickly. Between October of 2025 and

January of 2026, in just 3 months, the

longest autonomous cloud code sessions

nearly doubled, and they've doubled

again since then. Agents are running

into the hundreds and thousands in

production systems at major companies.

And this is just from publicly available

information. We have Telus reporting

that they have 13,000 custom AI

solutions internally. We have Zapia

reporting that they have over 800 agents

internally. Look, whenever they release

a press release, you got to assume that

company feels behind and is releasing

something to help them feel better about

themselves. The companies that are

really serious about AI don't feel the

need for press releases and have an

order of magnitude or more agents. So,

this is not about a world that is

coming. This is about a world that has

landed. But this still might not feel

concrete enough for you. So, let me talk

to you about a random Tuesday. Two

people sit down with the same model,

same subscription, same context window.

One of them uses 2025 skills. You type a

request, you get back something that is

80% right. You spend 40 minutes cleaning

it up and you have a good use of AI

because you saved it a lot of time. It

might have saved 50% on your time, might

have saved 60%, maybe higher. Let's make

the gap concrete. Let's say two people

are sitting down with the same model on

a Tuesday morning in February of 2026.

Same subscription, same context window.

The only difference is that one of them

is using 2025 prompting skills and one

of them is using 2026 prompting skills.

So the 2025 person types a request and

they're asking for a PowerPoint deck,

right? They get back something that's

about 80% correct. Maybe there's some

formatting issues. Maybe the font has

some collisions in it. Maybe there's

some styling issues. They spend about 40

minutes cleaning it up, but they're

pretty happy because this deck would

have taken two or three hours. That's a

2025 prompting skull application. it

would have been good in 2025.

Person B sits down with 2026 prompting

skills. They write a structured

specification in 11 minutes. They take

longer to prompt. Then they hand it off

to the same chatbot, but they're

thinking of it and using it as an

autonomous agent. They go to make

coffee. They come back to a completed

PowerPoint that hits every quality bar

defined up front. And they're able to do

this for five other decks before lunch.

In other words, they are now doing a

week's worth of work in a morning

easily. Same model, same Tuesday, 10x

gap. And if you want to replicate this,

you can replicate this experiment

directly in Claude Opus 4.6 in the

co-work model, which is available on

Windows and Mac. And you can see exactly

how this plays out. And this did not

happen because the person with 2026

prompting skills is smarter or because

she's more technical. It's because she's

practicing a different skill than person

one and person one doesn't know that

kind of prompting skill exists. I think

it's worth paying attention to Shopify

CEO Toby Lutkkey here. Unlike most CEOs,

Toby is a technical guy and he does not

just dig into AI from a LinkedIn

perspective. He has a folder of prompts

that he runs against every new model

release and he really deeply thinks

about how new model releases change his

workflow. He uses the term context

engineering because he believes the

fundamental skill that we're all facing

is the ability to state a problem with

enough context in a way that without any

additional pieces of information, the

task becomes plausibly solvable. I think

that's a really elegant way to describe

what person B did in the example I just

showed you. Person B put all of the

information that the model needed to

build a deck in one task very clearly

defined and the model could just go to

work. This isn't about clever prompt

tricks. This isn't about magical words

that an AI can use to produce a better

output. It's about a communication

discipline. Can you state a problem so

completely with so much relevant

surrounding information that a capable

system can solve it without going out

and fetching more context? Can you make

your request as self-contained as

possible? This is a really big deal

because it demands a much higher bar for

communication from us humans than we're

used to. And that's something that Toby

called out when he reflected on the

impact of AI on his own leadership

style. One of the things he mentioned is

that by being forced to provide AI with

complete context, he is now better at

communicating as a CEO. His emails are

tighter, his memos are better, his

decision-making frameworks are stronger,

and Toby has gone farther than most

people thinking about the implications

of context engineering. I think one of

his most provocative assessments is that

a lot of what people in big companies

call politics is actually bad context

engineering for humans. What he suggests

is that essentially good context

engineering would surface disagreements

about assumptions that are never

surfaced explicitly but play out as

politics and grudges in large companies.

And he says that happens because humans

tend to be sloppy communicators who rely

on shared context that doesn't actually

exist. I think that's a really

interesting thesis. I think one of the

implications of getting this February

2026 prompting lesson deeply ingrained

in ourselves is that our communication

humanto human is likely to improve and

our organizations are likely to have

cleaner decisioning and cleaner

communication even between humans as a

result. So I bet you're wondering what

are these four disciplines? What does it

mean when prompting becomes multiple

skills that we need to learn? Well, I

don't want to hide the ball here. Here

is the framework that I would lay out

that describes what prompting should be

in February 2026. And I've built it to

be futurep proof. So we look at the

direction that agents are going, how

they're developing this year. These four

disciplines are going to matter even as

we expect agents to continue to scale.

This represents a significant update on

how I've taught prompting before 2026

because the way we prompted before 2026

was helpful as a foundation. You're not

losing something by having learned it,

but it's not enough as agents get more

capable. And I think we're due for a

reset. So, fundamentally, prompting is

the broad skill of providing input to AI

systems so that they can do useful work.

Prompting has diverged into four

distinct disciplines, but it's not

taught that way. And so, I'm laying this

out for the first time here. Each of

these disciplines is operating at a

different altitude and time horizon. And

you need to understand them all to

prompt well. Just because they're not

often prompted this way, just because

they're not often taught this way,

doesn't mean that good prompters don't

intuitively know this. What I'm taking

is intuitive knowledge that I see in

excellent prompters and boiling it down

into four key disciplines that you can

practice and learn from. And these build

on each other. If you skip one, I'm

presenting them in order. If you skip

one, you're creating the kind of

failures we tend to see at scale in the

enterprise, but you're creating it for

yourself in your own prompting. And I'll

kind of get at what that means and

you'll get the idea. So discipline one

is prompt craft. This is the original

skill. This is the skill I have taught

and many others have taught for the last

year or two. It's synchronous. It's

sessionbased and it's an individual

skill. You you sit in front of a chat

window. You write an instruction. You

evaluate the output. Then you iterate.

The skill here is knowing how to

structure a query. And I've talked a lot

about this in the past. I'll rehash it

briefly here. You must have clear

instructions. You must include relevant

examples and counter examples. You need

to include appropriate guard rails. You

need to include an explicit output

format. And you should be very clear

about how you resolve ambiguity and

conflicts. So the model doesn't have to

make it up on the fly. This is what

anthropics prompt engineering

documentation covers. Open AI talks

about this. Google talks about this.

It's on a thousand blog posts and

LinkedIn courses. Prompt craft has not

become irrelevant. Don't hear that. It's

just become table stakes. It's sort of

the way knowing how to type with 10

fingers was once a professional

differentiator and now it's just table

stakes. It's just assumed. If you can't

write a clear, well ststructured prompt

in 2026, you're the person in 1998 who

couldn't send an email. Is it important

to be able to do it? Yes. Is it going to

differentiate you in the workforce? No,

not really. The key shift is that

promptcraft was the whole game when AI

interactions were synchronous and

sessionbased. You wrote something, you

got something back, and you refined it

in real time. As a human interacting

with that model, you were acting as the

intent layer, as the context layer, and

as the quality layer so that longunning

tasks could get done. You did all the

breaking out of those tasks. That model

of prompting broke the moment agents

started running for hours without

checking in. Discipline 2 we've been

talking about for a few months now. It's

called context engineering. Enthropic

published the foundational piece on this

back in September of 2025, but there's a

lot of other good pieces out there as

well. I've written a fair bit on it. I

define context engineering as the set of

strategies for curating and maintaining

the optimal set of tokens during an LLM

task. And that's not just me defining

that. That's a pretty commonly held

definition. Wang Shane's Harrison Chase

was even bluntter about what context

engineering is during a recent Sequoia

Capital interview when he said

everything is context engineering. It

actually describes everything we've done

at lane chain without knowing the term

existed. So that's actually somewhat

dangerous because context engineering is

only one of four levels and people have

misunderstood it to mean everything. And

one of the things I'm trying to get us

to move toward is a world where we

understand context engineering as a

specific skill where we're providing

relevant tokens to the LLM for

inference. And yes, it is certainly

foundational. It is certainly

significant. It is where the industry's

attention is focused today. It is the

shift from crafting a single instruction

to curating the entire information

environment and agent operates within.

all of the system prompts, all of the

tool definitions, all of the retrieved

documents, all of the message history,

all of the memory systems, the MCP

connections. The prompt you write might

be 200 tokens. The context window it

lands in might be a million. Your 200

tokens are 002% of what the model sees.

The other 99.98%

that's context engineering. This is the

discipline that produces claw.md files,

agent specifications, rag pipeline

design, memory architectures. It's the

discipline that determines whether a

coding agent understands your project's

conventions, whether a research agent

has access to the right documents,

whether a customer service agent can

retrieve relevant account history.

Anthropics engineering team identified

the core challenge precisely. LLM

degrade as you give them more

information. That's correct. They do.

And the point therefore is to include

relevant tokens because the issue is not

that they can't hold the tokens. It's

that retrieval quality does drop as

context grows. The practical implication

is that people who are 10x more

effective with AI than their peers are

not writing 10x better prompts. They're

building 10x better context

infrastructure. Their agents start each

session with the right project files,

the right conventions, the right

constraints already loaded. The prompt

itself can be relatively simple because

the context does the heavy lifting. I

have seen this for myself as I've built

out my own context engineering

scaffolding. And if you're wondering how

to do it for yourself, I'm putting a

guide together with this video over on

the Substack and I think it'll be

helpful. Discipline number three. This

one we don't talk a lot about. I think

is where we're going as these agents

start to do much longer autonomous

running tasks. I wrote about this one at

length in a prior piece. I did a video

on it. I'm going to be brief here. I'm

going to contextualize where it sits in

the stack. Context engineering tells

agents what to know. Intent engineering

tells agents what to want. It's the

practice of encoding organizational

purpose, your goals, your values, your

trade-off hierarchies, your decision

boundaries into infrastructure that

agents can act against. Claro story is

the proof case I talked about. Their AI

agent resolved 2.3 million customer

conversations in the first month, but it

optimized for the wrong thing. It

slashed resolution times, but it didn't

optimize for customer satisfaction. And

as a result, CLA got into big trouble

and had to rehire a bunch of human

agents and is still dealing with the

customer trust aftermath. So, intent

engineering sits above context

engineering the way strategy sits above

tactics. You can have perfect context

and terrible intent alignment. You

cannot have good intent alignment

without good context, though, because

the agent needs information to act on

the intent. So these disciplines again

they're cumulative. Another thing I want

you to notice is that failure as we

progress up this hierarchy is getting

more and more serious. When you as an

individual sit down and you screw up a

prompt, it might waste your morning at

worst. When you as a human being sit

down and screw up context engineering or

intent engineering, you are screwing up

for the entire team, your entire org,

your entire company. The stakes get

higher. And because the stakes get

higher, our attention to detail matters

and the value of the work we do

increases commensurately. What I am

talking about when I talk about context

engineering and intent engineering can

be a full-time role at a big company.

And if it's not, it is a highstakes

human skill that has a lot of

transferable value. Level four is

specification engineering. We're just

starting to talk about this now, even

though the best practitioners are

already doing it. Specification

engineering is the practice of writing

documents across your organization that

autonomous agents can execute against

over extended time horizons without

human intervention. This is a level

above everything I've described because

all of the first three levels focused on

how you prepare work directly for an

agent. Specification engineering is

really about thinking about your entireformational

entireformational

corpus in your organization as agent

fungeible, agent readable. Everything

you write has to be something the agent

can access and do something with. It's

not really about prompting per se. It's

not about an individual agent's context

window. It's not even about the intent

you've given agents. Specifications

are complete. They're structured.

They're internally consistent

descriptions of what an output should be

for a given task. They look at how

quality is measured. Specifications are

a mindset you bring to your documents

that allow you to apply agents across

large swaths of your current company's

context with the confidence that what

the agent reads is going to be relevant.

I think an interesting example from

Anthropic actually comes from the team's

struggles with the Opus 4.5 agent which

is one generation ago now. They were

trying to build a production quality web

app. But if you give the agent only a

highlevel prompt like build a clone of

claw.ai, the agent tries to do too much

at once, runs out of context

mid-implementation, and leaves the next

session guessing at what happened. The

fix, it turned out, was not a better

model. It was specification engineering.

It was a pattern that you could specify

where an initial laser agent sets up the

environment. A progress log documents

what's been done and a coding agent then

makes incremental progress against a

structured plan every session. The

specification became the scaffolding

that let multiple agents produce

coherent output over days. So the shift

from prompt to specification mirrors a

transition that happened in human

engineering decades ago. When you're

building something small, verbal

instructions and conversations work

really well. When you're building

something large enough to require a team

or span multiple sessions, you need

blueprints. Anthropic needed blueprints

in the Opus 4.5 example. And even though

we've now moved to Opus 4.6, the need

for specification engineering has not

gone down, it's gone up because Opus 4.6

can do even more work. That's true for

Codeex 5.3. It's true for Gemini 3.1 Pro

as well. The smarter models get, the

better you need to get at specification

engineering. Which is why I deliberately

started this section zooming out and

saying the entire org's document corpus

should be viewed as a form of

specification engineering. And yes, this

is a fractal insight. You can also think

about specification engineering for your

individual agent task where you think

about what is the log that the agent

has. How do we assign tasks across this

agent build? How do we make sure the

agent has a clearly specified

requirements list to work from? But all

of that gets way way easier to put

together if you think of your entire

organizational document corpus as

specifications that are agent readable.

Your corporate strategy is a

specification. Your product strategy is

a specification. Your OKRs are a

specification. Everything ends up being

a specification that your agent can use.

And that's different from context

engineering because the art of context

engineering is really about shaping the

context window in a way that's relevant

for the agent. Right? If you look at

these four levels, the prompt is you and

the agent and you're working on crafting

clear instructions. The context window

is how do you shape relevant tokens.

Intent engineering is how do you

communicate goals and objectives to the

agent that allow the agent to work

autonomously for long periods of time in

a direction consistent with company

strategy. Specification engineering is

how do you think about your entire

corporate document structure, the

knowledge, the context that makes the

corporation work as a form of

specification. And yes, for individual

agent runs, how do you refine that

specification that becomes something

where you give the agent a good

specification? It is going to start to

keep the context window cleaner than

before. And so these start to interplay,

right? If you write good spec, if you

have a good task log, if the agent

understands what the spec is from the

broader organizational context, they're

less likely to go off the rails because

of intent engineering conflicts. They're

less likely to bloat out with bad

context. And so all of these start to

interplay, but the highest level is to

think about specification as the way

your organization does business. You

specify the outputs you want. The agent

does the work. the outputs are produced.

That is the highest level description of

what business is going to look like in

the next couple of years and it starts

with understanding how to specify. This

is where anthropics best practices

documentation for cloud code becomes

really revealing. The recommended

workflow for complex features is

relatively simple. Interview me in

detail. Ask about technical

implementation, UIUX, edge cases,

concerns, and trade-offs. Don't ask

obvious questions. Dig into the hard

parts. The agent then writes the spec

with the human. I think that that is an

artifact of this moment in time. And I

think we will get to a point where the

agent will only be asking us for places

where the broader specification corpus

is in conflict or ambiguous and we have

to talk about what it means to us for

this task to be accomplished. Well,

because the entire organizational

infrastructure is going to be agent. So

the practical skill going forward is not

writing code. It's not crafting prompts.

It's the ability to describe an outcome

with enough precision and completeness

that an autonomous system can execute

against it for days or weeks. I hope

you're seeing here how that is a

fundamentally different skill from

writing a good prompt in a chat window.

And the people who are excellent at one

of these layers are not automatically

excellent at all of them. Context

engineers spend a lot of time thinking

about how to compress tokens and get

good tokens into context windows and how

to keep bad tokens out. That is a

different mindset than thinking about

your information environment is agent

translatable, agent readable, agent

fungeible. We have to have all of these

skills in order to effectively bring AI

into the enterprise or even into a small

business in 2026. And I will tell you

one-person businesses have the greatest

advantage right now because if you are a

oneperson business and you can just

convert your notion to be agent

readable, you're off to the races today.

There's no gigantic effort required to

make all of your SharePoint agent

readable. It's simple. It's easy. You

just get it into notion and you're done.

This comes back to the core idea that in

2026, speed is going to matter because

agents are going to keep getting better

quickly. What we have now as days and

weeks is going to become weeks and

months by the end of the year. And the

corresponding impact of getting

specification engineering correct is

going to be even higher. The

corresponding impact of getting all four

levels translated into specific roles,

people who are responsible, DRIs, teams

who handle this, that's going to be even

more valuable in 2026. And so what I

mean by that is that if you are at a

large company, you should have people

who are doing context engineering and

that's all they're doing. You should

have people who are doing specification

engineering and thinking about how

agents can read the enterprise. You

should have people who are thinking

about intent engineering and how you

translate goals into a set of objectives

that an agent can read and value and a

set of verifiable guardrails the agent

can follow. Look, the mental model most

people carry is that prompting is good

instructions for the AI and that fails

for a very specific reason. And I hope

you're seeing it here. That entire model

assumes synchronous interaction. In the

synchronous AI human partnership model,

you're always there at the computer. You

see the output in real time. You correct

mistakes right away. You provide

additional context when the model asks

or when you notice it going off track.

longunning agents break every single

assumption in that model. So if you've

relied on the assumptions of synchronous

prompting, you have a structural

vulnerability in the way you think about

AI. You need to start thinking about AI

as if your real time oversight is

embedded in the specification before the

agent begins to work. The planner worker

architecture that's dominating

production agent deployments really

reflects this reality. A capable model

plans the work, decomposes it into

subtasks, defines the acceptance

criteria, and assigns work out to models

and then cheaper, faster models do the

work. The planning phase, you could call

it the specification phase, determines

the quality ceiling. The planning phase,

basically taking your specification and

expanding it and enriching it and

breaking it out and planning against it.

That's what determines the quality of

the work of the overall system.

Execution without that specification

step produces broken work and it

requires extensive human rework to be a

value at all. So the shift from fixing

it in real time, which is what we do

with a lot of prompting in the chat

window to we must get the spec right up

front changes your bottleneck skill.

Right? Real time prompting rewards

verbal fluency. It rewards quick

iteration. It rewards a good eye for

output quality. Specification

engineering rewards completeness of

thinking, anticipation of edge cases,

clear articulation of acceptance

criteria, and the ability to decompose

really complicated outcomes into

independently executable components.

Different people are good at these

things in different ways. Some people

are going to be naturally exceptional at

synchronous prompting and they're going

to struggle with specification work. And

some people will be mediocre at

chatbased interaction, but they might

actually be excellent spec engineers. My

challenge to you is that you don't

tolerate whatever your natural

propensity is as your ceiling. Think of

this as a learnable skill and go after

it. Now, you might ask, if we're going

after it, what are the foundational

elements to learn? Specification is

ironically very vague. Nate, please tell

me what you mean. I want to suggest to

you that in 2026 we can define the

primitives that go into good

specifications in ways that are useful

for us to learn. And I'm going to go

ahead and define them right here. I

think these are the foundation that we

need to learn if we want to get better

at specifying and we want to get better

at the prompting skills that will matter

in 2026 and beyond. Primitive number

one, self-contained problem statements.

This is actually Toby's insight, but

it's not only his insight. It's been

primitive. Number one is self-contained

problem statements. Think back to what

Toby said when he talked about the idea

that we have to give the model

everything it needs to do the work. Can

you state a problem with enough context

that the task is plausibly solvable

without the agent going out and getting

more information? The discipline of

self-containment forces you to be clear.

It surfaces hidden assumptions. It makes

you articulate constraints you normally

leave implicit because you trust the

human on the other end to fill in the

gaps. AI doesn't fill in gaps reliably.

It fills them with statistical

plausibility, and that's a polite way of

saying it guesses in ways that are often

subtly wrong. So if you're trying to

train this primitive, I would say take a

request you would normally make

conversationally, like update the

dashboard to show the Q3 numbers and

rewrite that as if the person receiving

it has never seen your dashboard,

doesn't know what Q3 means in your org

context, doesn't know what database to

query, and has no access to any

information other than what you include.

That is the level of self-containment

you should be challenging yourself with

if you want to get better at this

primitive. Primitive two, learn about

acceptance criteria. If you can't

describe what done looks like, an agent

can't know when to stop or more

precisely, it will stop at whatever

point its internal huristics say the

task is complete, which may bear no

relationship to what you needed. This is

why the 80% problem is a big issue for

agent system design. the specification.

Let's say the specification said build a

build a login page that handles email

passwords, social ooth via Google and

GitHub, progressive disclosure of 2FA

session persistence for 30 days and rate

limiting after five failed attempts. If

you're training on this primitive, you

want to get to the latter, not the

former. For every task you delegate,

write three sentences that an

independent observer could use to verify

the output without asking you any

questions whatsoever. If you can't write

those sentences down, you probably do

not understand the task well enough to

give it to an agent. I have had that

happen where I've been in a conversation

with an AI agent and I realize I don't

know enough to delegate the task and I

have to come back later. That's okay.

It's good to realize that before you

assign the work. Primitive three is

constraint architecture. Learn

constraint architecture. What the agent

has to do, what the agent cannot do,

what the agent should prefer when

multiple valid approaches exist, what

the agent should escalate rather than

decide autonomously. These four

categories, the musts, the must notss,

the preferences, and the escalation

triggers form the constraint

architecture that turns a loose

specification into a very reliable one.

The claw.md pattern that's emerging in

the coding community is a practical

implementation of constraint

architecture. The best claw.md files are

not long lists of rules. They're

concise. They're extremely high signal

constraint documents. Use these build

commands. Follow these code conventions.

Run these tests before marking a task

complete. Never modify these files

without explicit instructions. The

community consensus is very strongly

that every line in an file needs to earn

its place. If you ask would removing

this line cause the AI to make mistakes

and the answer is no it really wouldn't

then kill the line. So if you want to

train this primitive before delegating a

task you should be writing down what a

smart well-intentioned person might do

that would technically satisfy the

request but produce the wrong outcome.

Those failure mos end up being your

constraint architecture. Encode them.

Primitive four is decomposition. Large

tasks need to be broken into components

that can be executed independently,

tested independently, and integrated

predictably. This is software

engineering's oldest lesson, modularity,

but is being applied to AI task

delegation. Anthropic's longunning agent

harness splits every complex project

into an environment setup phase, a

progress documentation phase, and an

incremental coding session, each

independently verifiable. You get

similar task decomposition automatically

inside codecs. A marketing content audit

requires the same decomposition as a

coding task. So I'm not just talking to

engineers here. You would have to

decompose your marketing content into

quality scoring, gap analysis,

recommendation generation, etc. If you

want to train on the primitive of task

decomposition, take any project that you

would estimate at a few days of work and

decompose it into subtasks that each

take less than 2 hours for you to do.

Have clear input output boundaries and

can be verified independently of the

other tasks. That is the granularity at

which agents work best and it's the

granularity at which specification

engineering tends to operate. Now, in

2026, you do not have to pre-specify all

of those 2-hour tasks when you are

writing a prompt, but you do have to

understand what all of those tasks are.

And you have to understand how to

describe for a planner agent what done

looks like and what decomposible

pieces look like in such a way that the

planner agent can reliably break the

work into those 50 or 60 subtasks. In

other words, your job increasingly is

not to manually write the subtasks for

the agent. Your job is to provide the

break patterns that a planner agent can

use to break up larger work in a

reliable executable fashion. That's a

level of abstraction even above

decomposition and that is a lot of where

we're going as agents start to run.

Primitive 5 is evaluation design. This

is critical to do not just in an

individual level but at an org level.

Organizations need to think about every

level of AI deployment in terms of eval.

How do you know the output is good? Not

does it look reasonable, which is how

most people evaluate AI output. But can

you prove measurably consistently that

this is good. If prompt craft is the art

of the input, evaluation design is the

art of knowing whether that input

worked. And in a world where agents can

run for a really long time, eval design

is the only thing standing between AI

generated output that I can't use and AI

generated output we can really use asis.

If you want to train this primitive for

every recurring AI task in your world,

build an eval build three to five test

cases with known good outputs and run

them periodically, especially after

model updates. This is going to catch a

regression. It'll build your intuition

for where models fail. It will create

institutional knowledge about what good

looks like for your specific use cases,

your team, you, your org. You need to be

doing this systematically. If you're

listening to all this and you're

wondering where to start, I gave you

those four layers for a reason in order.

Start by closing the prompt craft gap.

Most people are worse at basic prompting

than they think. You should be rereading

prompting documentation. I've written a

bunch on the Substack. I'm writing more

for this piece. You should do

interactive tutorials. You can head over

to AI Cred and see where you are on

prompting. You should be building a

folder of tasks that you do regularly,

writing your best prompt against each

one and saving the outputs as your

baseline and then revisiting them over

time. Take prompt craft seriously.

Second, once you feel like you start to

have a handle on that, start to build

your personal context layer. You should

be writing a cla.markown equivalent for

your work. I don't care if you use

claude. You still need to have an idea

of your goals, your constraints, your

communication preferences, your quality

standards, the institutional context

that a new team member would need 6

months to absorb written down. Start AI

sessions by loading this context. The

difference in output quality should be

immediate and obvious. Then get into

specification engineering. Take a real

project, not a toy problem, and write a

specification for it. Then start to get

into intent infrastructure. This is an

organizational layer. So if you manage

people or systems, you can start

encoding the decision frameworks your

team uses implicitly. If you are an

individual contributor, you should be

encoding the decision frameworks you

understand and trying to be a champion

that pushes for this at the

organizational level. A lot of teams

like to talk about adopting AI. We'll

talk about it in terms of building

intent infrastructure. Talk about what

good enough looks like for each category

of work together. Talk about what gets

escalated by AI versus what AI can

decide. Write it down, structure it,

make it available to agents. And then

practice specification engineering. Take

a real project, not a toy problem. Write

a spec for it before touching AI. Talk

about acceptance criteria, constraint

architectures, decomposition, etc. Hand

that spec to an agent and see what comes

back. And oh yes, from an org

perspective, start to think about every

doc you touch as a spec that the agent

will need to read and operate against.

Your org is a system of business

processes, even if you're a team of one.

And those business processes should be

agent readable and they should be

speckable. This has some of the

downstream implications that Toby talked

about where he said a lot of

organizational politics is just bad

context engineering at the org level. If

we practice better specification

engineering for our documents, we will

expose a lot of the implicit assumptions

that we end up being political about

inside orgs and we will start to make

those agent readable and they will start

to become fungeible and we will start to

have fewer issues. Practicing

specification engineering is a way for

us to clearly describe intent at

organizational scale and clearly

translate that intent in a way that

agents can read it. And yes, it nests

down to individual agent runs and it

ladders up to the full organizational

context. And that's why it's the last

and most difficult skill to learn. The

progression here from prompt craft all

the way up to spec engineering is not a

ladder where you can abandon lower

rungs. It's a stack where each layer

makes the layers above it possible. You

cannot write good spec if you can't

write good prompts. You can't build

effective agent systems if you don't

understand context engineering. You

can't align agent behavior with

organizational goals without

understanding how intent works and how

that plays into context engineering.

They all go together. There's a final

dimension to this that goes beyond AI,

and I want to spend some time on it. I

hinted at it when I talked about the

idea that Toby found that he

communicated better when he got better

at prompting. The best human managers

that I've worked with already operate

with that degree of clarity. They give

complete context when they delegate.

They specify acceptance criteria to

their team members. They articulate

constraints. They're effectively

following the four disciplines of AI

input with their people. And that makes

for effective leadership. What's

happening right now, I think, if we step

back, is that AI is enforcing a

communication discipline that the best

leaders have always practiced

intuitively and now everyone needs it in

order to be effective. You cannot just

rely on shared context with the machine.

You cannot just assume that AI will

know. And that is something that is a

gift to us because so many of our

colleagues don't know what we mean

either. How many times have you sat in a

meeting where someone is referring to a

document and you don't know what that

document is and you're afraid to ask?

That is a wonderful example of the kind

of poor communication quality that goes

into human meetings. This is not a

framing you'll see in a lot of how to

prompt courses. I think it should be.

The skill of providing highquality input

to intelligent systems turns out to be a

skill that's translatable for AIs and

for humans. It turns out to be a

fundamental skill of the agent age that

benefits us as humans and how we work

together. I think the people who develop

these skills, this collection of skills

around prompting for 2026 are going to

end up being the leaders who will run

organizations where agents and humans

both perform at their ceilings. And the

people who don't, the people who are

stuck in 2025 prompting skills are going

to wonder why their AI investments keep

producing partial value. And meanwhile,

their human teams keep having alignment

issues. The prompt by itself is dead.

The specification, the context, the

organizational intent. That is where the

value in prompting is moving toward

because agents are starting to work for

longer and longer periods and look in a

lot of ways like junior employees. the

specification done right turns out to be

just what clear thinking has always

looked like really made explicit because

machines don't let us be lazy about it

and I'm really excited for the way that

kind of communication clarity can clean

up our organizations and our humanto

human communication as well. Good luck

with prompting the humans and agents in

your life. Cheers. And yes, there's lots

粘贴 YouTube 链接

输入任意 YouTube 视频链接，获取完整字幕

大多数字幕 5 秒内即可准备好

安装 Chrome 扩展

无需离开 YouTube，一键获取视频字幕。安装我们的 Chrome 扩展，直接在视频页面访问任意视频的完整字幕。

免费添加到 Chrome

支持 YouTube、Coursera、Udemy 等主流教育平台

快速获取字幕：直接修改地址栏中的域名即可！

YouTube

←

→

↻

https://www.youtube.com/watch?v=UF8uR6Z6KLc

YoutubeToText

←

→

↻

https://youtubetotext.net/watch?v=UF8uR6Z6KLc

YouTube 字幕正在为您准备结果……

YouTube 字幕：'Prompting' Just Split Into 4 Skills. You Only Know One. Here's Why You Need the Other 3 in 2026.