Hang tight while we fetch the video data and transcripts. This only takes a moment.
Connecting to YouTube player…
Fetching transcript data…
We’ll display the transcript, summary, and all view options as soon as everything loads.
Next steps
Loading transcript tools…
[NEW SKILL] Context Engineering is the REAL Secret to Better AI 🔍🧠 | Agility AI | YouTubeToText
YouTube Transcript: [NEW SKILL] Context Engineering is the REAL Secret to Better AI 🔍🧠
Skip watching entire videos - get the full transcript, search for keywords, and copy with one click.
Share:
Video Transcript
So, here's a question we're hearing a
lot. How do we really get the most out
of these large language models, these
LLMs, especially now, things are moving
so fast, and there's this term that
keeps popping up, gaining some serious
traction. Context engineering.
We've seen people like Toby Lipkkey,
Shopify's CEO, talk about it. He
actually prefers context engineering to
prompt engineering. Says it better
captures uh the art of providing all the
context for the task to be plausibly
solved by the LLM, which sounds pretty comprehensive.
comprehensive.
It does, and that feeling is definitely
shared. You see McKay Wrigley agreeing
saying things like all the alpha is in
assembling context well to reduce the
fog of war for the model. He points out
how models are you know converging to
humanish info needs. It signals a real
shift in thinking.
Right? So that's our mission for this
deep dive. Then we want to unpack what
context engineering really means. Why is
it suddenly so critical? And maybe most
importantly, what does it imply for you?
Whether you're using LLM day-to-day or
actually building applications with
them. Okay, let's get into it. Now when
chat GPT first landed, everyone was
talking prompt engineering.
Mhm. The tips and tricks.
Exactly. It was all about finding those
little uh quirks, the right way to ask.
But models are getting smarter fast and
a lot of those old tricks, they just
don't really work anymore. Plus, you see
UI is kind of hiding the prompting
process now. Like I used recently, gave
it a pretty simple prompt. Fun
retrofuturistic cover for the quest for
the solopreneur unicorn 1950s
mid-century modern. And it turned that
into this super detailed multi-sense
description for the image way beyond
what I typed. That's a great example
because in this newer way of thinking,
context is really all the information
you give the LLM that helps it get the
answer right. Yeah. So context
engineering is fundamentally about
making sure the LLM has the right
information it needs. And like you said,
it's way beyond just deciding, you know,
which documents should I upload? It's
about the whole informationational environment.
environment.
Okay. And digging into this, it seems
like there are two key aspects or
domains to context engineering. They're
connected but distinct. First up,
there's what we might call deterministic
context. This feels like the maybe the
smaller parts
potentially. Yeah. It's the stuff we can
directly control. Things like system
instructions, the rules you set for a
specific chat session, any documents you
upload, static prompts, knowledge bases,
data feeds,
right? The controllable inputs. And
initially, a lot of the talk around this
was really focused on the context
window. You know, how to use those
limited tokens efficiently because of
token burn, the cost and limits.
There were technical approaches like
chain of draft where the LLM uses its
own symbols or shorthand instead of full
sentences to sort of think on paper
saving tokens but keeping the logic. It
was very much about micromanaging that
window. But then there's the other side,
the potentially much larger part,
probabilistic context. Okay,
this emerges when the LM starts
accessing external tools, especially the web.
web.
Suddenly, if an LLM can browse that
deterministic context you carefully
craft it, it can become well a drop in
the bucket. I mean, you could give
Claude Opus a document asking to do some
research and it might hit, you know,
400, 500, 600 websites to answer you.
Wow. Okay, that's a lot. How does it
even stay focused on my original
question when it's waiting through
hundreds of websites? What keeps it on
track? Yeah. And here's where it gets
really interesting. It stays focused
because it's been reinforcement learned
and trained specifically to zero in on
the user's ask. Ah,
Ah,
so the responsibility for shaping that
huge probabilistic context on all those
websites actually shifts back to the
prompt itself. The prompt becomes
probabilistic too in a sense.
So the prompt isn't just telling it what
to do, but also guiding how it explores
this vast external space.
Precisely. It's less about controlling
every bit of info it sees and more about
shaping the environment it explores to
get more correct, more useful, more uh
congruent answers. And from an
engineering perspective, this is rapidly
becoming the main job. Cognition, the
folks behind Devon, put it bluntly.
Context engineering is effectively the
number one job of engineers building AI agents.
agents.
Okay. Number one job. That's a strong
statement. And they tie this into the
problems with multi-agent systems,
right? That brittleleness.
Exactly. They actually argued don't
build multi- aents because of this
context issue.
They used that Flappy Bird clone
example, didn't they?
Yeah. If you give one agent the
background task and another the bird
movement task, they might misinterpret
the context. You end up with like a
Super Mario background and a bird sprite
that isn't even a game asset and doesn't
move. Right.
Right. Because they don't share the same
understanding. The final agent gets
garbage in, garbage out. Basically,
pretty much miscommunications
everywhere. So, Cognition's idea is this
singlethreaded linear agent.
The same agent breaks down the task,
does the subtask, combines the results.
It maintains continuous context much smoother.
smoother.
That makes sense. But what about really
huge tasks? Can one agent hold all that context?
context?
That's the next challenge. They
acknowledge that for really big
problems, the context window might
overflow. So, they're thinking about
things like a sidelong context
compression LLM. Basically, another LLM
that watches the main agent and
compresses the conversation actions into
just the key moments and decisions. That
compressed summary then informs the next
steps, keeping the context manageable
but relevant.
That sounds complex. It reminds me of
that analogy Andre Carpathy used that
Lance Martin highlighted
OS analogy.
Yeah. That LLMs are a kind of new
operating system. The LLM is the CPU and
its context window is the RAM.
It's a good one.
And context engineering then becomes
essentially packaging and managing the
data, the context that needs to be
loaded into that RAM for the LLM CPU to
process a task. Anthropic said something
similar, didn't they, about agents
needing careful context management over
hundreds of conversational turns.
They did. And Langchain really doubles
down on this. They see context
engineering as building these dynamic
systems to provide the right
information, the right tools in the
right format. They argue it's the most
important skill for an AI engineer now
because applications are becoming these
complex dynamic agentic systems. And
crucially, when these agent systems
fail, it's increasingly because they
simply didn't have the appropriate
context, not because the core model was dumb.
dumb.
Okay, so it's fundamental for engineers,
but what about for the rest of us, the
end users? How does this translate? We
learned prompt engineering sort of. Do
we now need to become mini context engineers?
engineers?
I think so. Yes, in a way, users will
get better at understanding what's the
right amount of information to give a
specific model and also which models
handle different types of information
better. Think about GPT3.5 Pro. Remember
how it was specifically optimized to be
better at context? There was that piece,
God is hungry for context, showing that
when they fed 3.5 Pro a huge amount of
company info, pass, meetings,
recordings, everything, it produced a
much better strategy than GPT3 did with
less context. That's user level context
engineering right there. Choosing the
right model and knowing what kind of
information and how much to give it.
Exactly. Model selection and information strategy.
strategy.
So thinking about managing this
probabilistic context especially for
engineers building these systems. It
sounds tricky. You mentioned hundreds of
websites. What are the key principles to
keep in mind to avoid you know total chaos?
chaos?
It is tricky. There are probably five
critical things. First, expect discovery
and design for semantic highways.
You have to assume the agent will find
unexpected stuff when it searches
broadly. So you design systems, these
semantic highways to guide it towards
the desired information consistently.
Focus on the rate at which good stuff
comes back.
Okay. Build good paths through the
noise. What's second?
Second, monitor information sources. You
absolutely have to track the quality of
the sources the agent relies on. We've
all seen chat GPT do deep research and
site some uh pretty sketchy websites
sometimes. Even if the output seems
okay, you might need to constrain its
sources or at least be aware of where
it's going.
Makes sense. Third.
Third, take security seriously. This is
huge. LLM injection attacks where
malicious data found during web searches
or in databases tries to manipulate the
agent. It's not if, it's when. You have
to anticipate this,
right? Security can't be an
afterthought. Number four.
Fourth, measure overall decision
accuracy with relevant scoring. Forget
just precision and recall. Sometimes for
these probabilistic contexts, how
relevant and reliable the sources are is
actually more predictive of the final
output quality. So score the sources.
Interesting. Focus on the inputs to
predict the output quality than the last one.
one.
Fifth, version everything. This sounds
basic, but it's crucial. Those prompts
that shape the agent search. You need to
test them, version them meticulously.
Tiny changes can lead to vastly
different explorations and results.
Wow. Okay. So, those principles really
paint a picture. We're moving towards
evaluating systems based on source
quality across these huge probabilistic
contexts. And that deterministic
context, the stuff we directly control.
Its main job isn't just saving tokens anymore.
anymore.
Not primarily though.
It's more about its power to shape and
guide the exploration within that much
larger probabilistic window. How your
specific input steers the massive search.
search.
Exactly. And look, context engineering,
the idea of giving information isn't
brand new, but the term really fits
what's becoming an incredibly important skill.
skill.
The fundamental shift is that chat bots
aren't just chat bots anymore. Many are
becoming agents in a trench coat. You
know, using complex agent behavior on
the back end.
That's a great way to put it, agents in
a trench coat. So, wrapping up, what
does this all mean for you listening?
How do you adapt? Whether you're just
trying to get better answers from your
AI assistant or you're an engineer
building these complex agentic systems,
managing context, even the parts you
can't fully control, seems absolutely
critical. This feels like an incredibly
dynamic area, maybe even more important
in the long run than prompt engineering
ever was. Definitely something to watch
Click on any text or timestamp to jump to that moment in the video
Share:
Most transcripts ready in under 5 seconds
One-Click Copy125+ LanguagesSearch ContentJump to Timestamps
Paste YouTube URL
Enter any YouTube video link to get the full transcript
Transcript Extraction Form
Most transcripts ready in under 5 seconds
Get Our Chrome Extension
Get transcripts instantly without leaving YouTube. Install our Chrome extension for one-click access to any video's transcript directly on the watch page.