A new foundational infrastructure stack for AI agents is rapidly emerging, analogous to the shifts to cloud computing and microservices, but it's currently complex and difficult to navigate, requiring deep understanding for effective development and deployment.
Right now, a new infrastructure stack is
being assembled in public for AI, and
most of us aren't paying attention to
it. It's got billions and billions of
capital behind it. It's not software.
It's not agents. It's the layer
underneath both of them. It's the layer
that lets agents actually do things in
the world. And the problem right now is
that almost nobody on the outside of
that category can figure out what's real
and what's hype. And this video is all
about disentangling that, giving you a
way to understand this new
infrastructure category for agents
that's being created and helping you
make sense of what the big pieces of
this category are and how you can think
about what's needed for your agents and
your deployments. I want to be clear, we
have seen this movie before. In fact,
we've seen it twice before. Between 2006
and 2010, computing infrastructure
shifted from on-premise servers into the
cloud: EC2, S3, Lambda, all of these
cloud compute interfaces. The
builders who understood the new stack
started companies that are now dominant
like AWS. And then between 2012 and
2016, we saw the movie again. Monolithic
applications gave way very rapidly into
decomposed applications with APIs in
between them and what we call our
microservices architecture. Now we're
watching yet another shift. We're moving
from human-first tools to agent-first
primitives. I believe it is foundational
in the same way that moving to cloud was
foundational. So the new customer for
infrastructure is going to be the agent
in the same way that the new customer
for compute became the enterprise
renting compute from data centers in the
2010s. We are talking about a shift at
least as big as cloud when we're moving
to agentic primitives. But because it's
so new and because most of the startups
are so small, it's been really, really
hard to distinguish the noise in the
space from the actual signal. And so I'm
going to break the category down for you
here. And first, a word of warning. I
love Legos. You know that. These are not
the same thing as Lego bricks for
agents. I want them to be. That's the
hope. They're sold to you as Lego bricks
for agents, but right now, you don't
have the same degree of composability
and predictability that you would have
in a Lego analogy. These are not all
bricks with the same size knobs that you
can just slap together and they're going
to make a little wall. Right now, it's
as if you have Legos and wooden blocks
and they're all marketing themselves as
Legos. You don't know which is which and
you don't know how to snap them
together. And that is one of the biggest
problems in the space right now. I think
a more reliable analogy for where we're
at in the space today is something like
system calls. Agents need defined,
reliable interfaces that help them
figure out identity, that help them
figure out compute, that help them
figure out memory and persistence and
communication and payments. And the
companies that are building those system
calls are essentially building the
operating system that agents will need
to do their work in the new economy. And
I want to go through and decompose
that stack and give you layer by layer
what agents look like and where the
agent primitives are going in this
economy. Number one, layer one, compute
and sandboxing. This one is perhaps the
most production-ready in the stack. It's
already bearing the load for agents. And
what's the premise? It's really simple.
The agent needs somewhere safe to run
code. It should not run on your laptop.
It should not run in production. And it
should not run unsupervised. It needs
isolated, sandboxed, auditable
execution. I think that this layer has
the most mature competition at this
point. E2B has roughly $32
million in funding. It uses Firecracker
microVMs, the same tech that AWS
Lambda uses. And it's intended to give
each agent a session with its own
dedicated kernel. But it's not the only
one, right? Daytona raised a $24 million
Series A recently and took a different
architectural bet. They use Docker
containers with a shared kernel, and it's
optimized for speed. I think they claim
a 90 millisecond cold start, which is
insanely fast, and also persistent
state. Modal is a startup that
targets GPU-heavy workloads. Browserbase
is valued at $300 million after its
Series B, and it focuses on headless
browser automation, giving agents the
ability to interact with web pages as if
they were human users. And then there are
newer entrants in the space. Alibaba's
OpenSandbox is a good example. The
interesting split in this space is
really philosophical: are you going to be
ephemeral with your agents, or are you
going to be persistent? E2B treats
sandboxes as disposable. You spin one up,
you run code, you spin it down. And
Sprites treat these spaces as long-lived,
and they assume that your agent can
install dependencies that it can create
files and that your agent in some sense
will come back later. They assume a
degree of agentic persistence. This is
not a style preference. This is not
optional for you to think about. This is
an architectural bet from these startups
on how long agent sessions in the new
economy will run and whether state
matters for those agent sessions. Both
camps will probably survive because the
agent economy is going to be that big.
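To make the ephemeral-versus-persistent split concrete, here is a minimal Python sketch. It is not E2B's or Sprites' API, and it is not a real sandbox: a child process with its own working directory stands in for the isolated environment, and the names `run_ephemeral`, `PersistentWorkspace`, and `COUNTER` are hypothetical. All it shows is what happens to state between runs.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_dir(code: str, workdir: Path) -> str:
    """Run a Python snippet in a child process whose cwd is workdir.

    NOTE: a child process is NOT real isolation. It only stands in for
    the microVM or container a real sandbox provider would give you.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        cwd=workdir,
        capture_output=True,
        text=True,
        timeout=30,
        check=True,
    )
    return result.stdout.strip()


def run_ephemeral(code: str) -> str:
    """Disposable style: fresh directory per run, deleted afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        return run_in_dir(code, Path(tmp))


class PersistentWorkspace:
    """Long-lived style: the same directory survives across runs."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def run(self, code: str) -> str:
        return run_in_dir(code, self.root)


# A snippet that counts how many times it has run in its directory.
COUNTER = (
    "from pathlib import Path\n"
    "p = Path('runs.txt')\n"
    "n = int(p.read_text()) + 1 if p.exists() else 1\n"
    "p.write_text(str(n))\n"
    "print(n)\n"
)
```

Run `COUNTER` twice through `run_ephemeral` and both runs see a clean slate. Run it twice through the same `PersistentWorkspace` and the second run sees what the first one left behind. That difference is the whole bet.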
But you're going to have to think about
what your workloads need. If you want to
look at this part of the stack, this is
a really high durability component,
right? Whatever you want to bet on,
whether you want to bet on persistent
agents or disposable agents, your agents
need safe spaces to run code. And it
totally makes sense that they're going
to have virtual boxes to do it on. There
will be a lot more startups working to
solve this problem over the next year or
so. But this is an area where it's
relatively mature now and you have
multiple options and you need to think
about which one you want to use if you
want to deploy your agents in any kind
of computing environment that is
virtual. What is layer two on top of the
compute piece? Layer two is identity and
communication. This is in a weird space.
It's transitional and I think that we
need to be thoughtful about what we
consider basic in this space. So right
now we know that an agent needs to exist
on the internet as an entity. It needs
to send and receive messages of some
sort. It needs to authenticate with
services. It needs to hold some kind of
verifiable identity that other systems
can recognize. And today one of the
pragmatic answers to that is well the
agent needs an email address. So AgentMail
raised a $6 million seed round from
General Catalyst a couple of weeks ago,
and Paul Graham and HubSpot CTO
Dharmesh Shah are both angels in AgentMail.
The API lets you programmatically
create email inboxes for agents:
real addresses, full threading,
attachments, labels, search. The
onboarding API even lets agents sign
themselves up. So far so good. The
thesis is very sharp. AgentMail's CEO
frames email not as a communication
tool, but as a fundamental identity
layer. And he's not the only one, by the
way. There's like half a dozen startups
in the space going after email. Email is
today a universal key to the internet.
Every SaaS service needs one at signup.
Every verification flow sends codes to
it. If you give an agent an email address,
the idea goes, you're essentially giving
it an identity. But what if that is a
shim? Email works today because it's
everywhere, not because it's the right
protocol for agents, but because it's
the right protocol for humans who built
the internet. And right now, we all have
problems with email as humans, right?
Threading is brittle. We have rate
limits designed to prevent spam and
automated agents in particular. We have
a signal to noise ratio that's terrible
for agent context windows. The real need
underneath the presenting need for email
is a need to give an agent identity and
communication protocols, something that
doesn't require pretending to be human.
And multiple teams are working on this,
right? You have on-chain agent identity
options. You have dedicated agent-to-agent
communication standards. You have
MCP-based service discovery. Nothing has
a defined right to win yet in this
space. If you are in this space, be
thoughtful about what an agent native
protocol looks like. Now, by the way, I
am not the person to bet against email.
Email has been famously cockroach-like in
its ability to survive lots and lots of
revolutions. It just seems to stick
around. So, I'm not the one to say
agents will never use email. But you
should be aware if you're betting on
agent email that you're making a
pragmatic bet, not necessarily an
architectural bet. Okay. So, agents need
compute. That's a layer that's kind of
mature. Agents need identity and
communication. That one's really in
flux. Agents need memory and
statefulness. This is early. This is
real. And the platform risk is just as
real. Now, the need is clear, right?
Agents need to be able to remember what
happened in many cases, not just within
a session, but across many sessions,
many tasks, many days. Mem0 is
the clear leader here. They've raised
$24 million, hit some 41,000 GitHub
stars and 14 million downloads, and were
selected by AWS as the exclusive memory
provider for its agent SDK. So, of
course, their API has done very well.
Call volume has gone up fivefold,
the whole thing. What Mem0 gets
right is the insight that memory isn't
there to save the conversation, which is
the old chat way of doing things. It's
actually that memory is active
curation. So their system will store
important information. It will
deliberately forget outdated,
conflicting details and only recall
relevant context when you are inferring
something with an LLM query. So the
architecture is very much a hybrid data
store to reflect that. It has a network
graph. It has a vector database. And it
also has a key-value store. And those
three different formats allow Mem0
to treat memory as managed
infrastructure, not just a feature
bolted onto the model. And you get
better results, right? On the LOCOMO
benchmark, they outperform OpenAI's
built-in memory by 26% on accuracy with
91% lower latency and 90% reduced token
usage. It's a clear win. But keep
in mind that every frontier lab is
obsessed with building memory into its
models. OpenAI already has big
investments in long-term memory and is
going to continue working on it.
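Stepping back to the curation idea for a second, here is a toy Python sketch of what "store, deliberately forget, selectively recall" can mean. It is not Mem0's API or architecture: there is no graph or vector store here, only a dictionary, and the class name `CuratedMemory` and its methods are hypothetical. A newer fact about the same subject and attribute overwrites the older one, and recall returns only the facts that share words with the query.

```python
class CuratedMemory:
    """Toy curated memory: one current value per (subject, attribute)."""

    def __init__(self) -> None:
        self._facts: dict[tuple[str, str], str] = {}

    def remember(self, subject: str, attribute: str, value: str) -> None:
        # Writing to the same key drops the older, conflicting value.
        # That overwrite is the "deliberate forgetting" in this sketch.
        self._facts[(subject, attribute)] = value

    def recall(self, query: str, limit: int = 3) -> list[str]:
        # Crude relevance: count words shared between query and fact.
        # A real system would use embeddings; this is an assumption-free
        # stand-in that keeps the example self-contained.
        query_words = set(query.lower().split())
        scored = []
        for (subject, attribute), value in self._facts.items():
            text = f"{subject} {attribute} {value}"
            overlap = len(query_words & set(text.lower().split()))
            if overlap:
                scored.append((overlap, text))
        scored.sort(key=lambda pair: -pair[0])  # stable: ties keep insert order
        return [text for _, text in scored[:limit]]
```

The point of the sketch is the contrast with "save the whole conversation": the store holds one current answer per question, and only the relevant slice goes back into the context window.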
Anthropic is building memory into
Claude. If memory becomes a model-level
feature that the labs build in, the same
way that search got sort of built into
ChatGPT and not as a separate feature,
then all of these standalone memory
companies are at risk because the model
makers can just grab them. Now, if
you're on Mem0's side of things, the
counter thesis is portability, right?
And we've talked about this the idea
that no one should own your memory.
That's a really important concept that
you should be able to own your memory.
I've talked about this in relation to
the frontier project that OpenAI is
doing with AWS. Do you feel confident
with your context layer, which is a step
above memory, being something that a
company owns and you don't? So I think
the question here is really it's a
question of which thesis wins and I
think it's a very uncertain outlook.
Will this be a situation where the
market decides that they want a memory
solution that does not belong to a
hyperscaler? Or is it a situation where
the convenience the hyperscalers offer
is so compelling that the market as a
whole just decides to go with it and
throw memory at the hyperscalers and say
you can solve the problem. And we look
back in two years and we see companies
like Mem0 and we say, yeah, they didn't
last because they just weren't
convenient enough. I don't know. It
feels a little bit like a coin flip and
I think we get to shape that by what we
demand. Okay, so we've talked about
compute and sandboxes. We've talked
about identity and communication. We've
talked about agent memory and state.
Let's talk about tools and integration
for agents.
This one is growing explosively quickly
as a layer and it solves real and
immediate pain points. Any agent needs
to interact with tools to do its work,
right? whether it's interacting with
Slack or with Jira or with Salesforce or
with GitHub or with Google Workspace and
those are all integrations or with basic
primitives like it's running Unix or
it's running Python or whatever it is
and you have to have those tools if
you're going to do work at an enterprise
level and you have to have those
integrations. Anything that makes that
easier is going to coin money. And so Composio,
with $29 million in funding from
Lightspeed, provides a managed integration
layer for agents, right? It provides
authentication handling without
complicated OAuth flows. It provides
pre-built connectors to a couple of
hundred solutions, and it provides
observability on every tool call. So
they don't necessarily build agents. All
they're doing is equipping agents with
the plumbing that the agents need to
navigate enterprise environments
successfully and safely. The
problem is the classic N×M integration
nightmare, right? Without middleware,
every single agent builder independently
is going to manage their credentials,
their auth flows, their rate limits,
their error handling, their API schema
changes for every single tool that agent
touches, which is just an enormous
combinatorial problem. That's
unsustainable even at small scale. That is
why it's N×M, right? It's the
number of agent builders times the number
of tools. And at enterprise scale, where an
agent might need to touch your CRM, your
ticketing, your email, your calendar,
it's just impossible to keep up with.
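The N×M arithmetic is easy to check. This is a back-of-the-envelope Python sketch, not anyone's pricing or architecture: the function names are hypothetical, and it simply counts how many integrations somebody has to build and maintain with and without a shared layer in the middle.

```python
def hand_rolled_integrations(agent_builders: int, tools: int) -> int:
    """Every builder wires up every tool themselves: N x M integrations."""
    return agent_builders * tools


def shared_layer_integrations(agent_builders: int, tools: int) -> int:
    """Each tool is connected once to the shared layer, and each builder
    connects once to that layer: N + M integrations."""
    return agent_builders + tools
```

With 10 agent builders and 200 tools, that is 2,000 hand-rolled integrations versus 210 through a shared layer, and each of the 2,000 carries its own credentials, auth flows, rate limits, error handling, and schema changes.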
And so I would argue that this space is
a very durable space to operate in. If
you're building, as long as the
ecosystem for tools remains fragmented,
and if anything, it's going to fragment
more as companies build their own stuff,
agents are going to need an integration
layer. The long-term risk here is
standardization. If MCP truly becomes a
universal standard, the value of a managed
integration is going to start to
diminish. So you're essentially betting
if you're building in the space like
Composio that agents are going to need
something to handle all of the massive
enterprise integration touch points for
a long time to come because so many of
these companies are going to be slow to
roll out MCPs, and then many of the
enterprises that have those tools are
going to be slow to adopt those MCPs, and
that gap is where your entire company
thesis sits if you're Composio. That one's
going to stick around for a while. I do
not believe in fast-changing
enterprises at scale. I think a lot of
enterprises move like dinosaurs and it's
going to stick around.
Okay, so compute and sandbox is done,
identity done, memory done, tool access,
we've talked about that. Layer five for
agents: provisioning and billing. This
is brand new. It's the trust layer, and
it's just starting to arrive now. Agents
essentially need to be able to acquire
services and pay for them securely. And
this is where Stripe Projects fits. It
launched this past week. It's the first
credible trust layer for
agent-to-service transactions. The agent uses the
same CLI commands as a human does and it
can provision its own database. It can
upgrade a hosting tier. Stripe can
tokenize payment credentials for that
agent. And the developer's raw card
details never leave Stripe's vault. The
problem is very simple. Since the start
of this year, agents have been able to
do almost everything when it comes to
spinning up a project except for the
part where they need to create accounts
and provision infrastructure because
that has always required a human for
authentication. And that's the gap that
Stripe is closing here. Their databases
are ready in roughly 350 milliseconds or
a third of a second. They're free to
start. They scale to zero when inactive.
Every design choice is optimized for how
quickly an agent can provision something
that the agent is building. It's not for
human speed dashboard clicking. You are
assuming terminal access here. We
still have some missing pieces here. I
think we're going to see growth around
agent-to-agent payments. We're going to
see growth around metered billing that
maps to agent compute patterns. We're
going to see growth around dynamic
budget allocation where like agent A can
spend so many dollars without human
approval and agent B can spend so many
dollars with human approval. We're going
to have a lot of like observability
layers that we build in. I think this is
a case where Stripe as usual has made an
excellent product decision. They're
known for that, and this is something
that's going to stick around. This
is immediately looking like fundamental
infrastructure for how agents build on
the web. I think we're going to see a
few more players in this space, but
fundamentally it's going to be focused
not on human readability first, but on
agent legibility and agent buildability
and then human observability over the
top. But regardless, in any agentic
economy future, you're going to have
provisioning and billing. And so it's
newly here, but it's here to stay.
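The dynamic budget allocation idea from a moment ago, where one agent can spend up to some amount on its own and another needs a human to sign off, can be sketched in a few lines of Python. This is not Stripe's API. `SpendPolicy`, `Decision`, and the cent amounts are all hypothetical, and a real system would also need persistence, idempotency, and an audit log.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    NEEDS_HUMAN = "needs_human_approval"
    DENIED = "denied"


@dataclass
class SpendPolicy:
    """Hypothetical per-agent spending rules, in integer cents."""

    auto_approve_cents: int  # largest single purchase with no human involved
    budget_cents: int        # total this agent may spend
    spent_cents: int = 0

    def authorize(self, amount_cents: int) -> Decision:
        # Reject nonsense amounts and anything that would blow the budget.
        if amount_cents <= 0:
            return Decision.DENIED
        if self.spent_cents + amount_cents > self.budget_cents:
            return Decision.DENIED
        # Small purchases go straight through and are recorded.
        if amount_cents <= self.auto_approve_cents:
            self.spent_cents += amount_cents
            return Decision.APPROVED
        # Within budget but above the auto-approve line: escalate.
        return Decision.NEEDS_HUMAN
```

Under these assumptions, agent A might be `SpendPolicy(auto_approve_cents=2000, budget_cents=10000)` and agent B `SpendPolicy(auto_approve_cents=0, budget_cents=10000)`, so everything B buys goes to a human first.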
Okay, the last layer is orchestration
and coordination. This is the biggest
opportunity in the stack because you can
leverage the power of multiple agents.
It's also a big gap right now. So, the
agent needs to work with other agents
reliably at scale, with fallback
handling, with audit trails, with cost
controls, and it matters a lot to
get that right. And there's just so
little done well around this. And it's
not for lack of interest. Gartner
reported a 1,445%
surge, which works out to
roughly 15x, in multi-agent system
inquiries between Q1 2024 and Q2 2025,
let alone 2026, which is going to be up
again. Look, quite bluntly, the current
tooling is at the framework level not
the infrastructure level. So LangChain
lets you stand up a multi-agent workflow,
right? But there's a gap between "I can spin up
three agents in a notebook" and "I can
reliably run 50 agents across enterprise
systems with failure recovery and cost
controls and audit logging and human
escalation paths." That latter piece that
we all need, that is something that
we're all hand-rolling right now. And so
individual agent capabilities are
something we've largely solved. What's
been missing is the layer that makes
those capabilities composable and
parallel and reliable. And so if you're
building in this space, here's what
doesn't exist yet that needs to exist.
Number one, you need a scheduling and
life cycle layer for agents. Not in the
container sense, right? I'm not saying
that it's a Kubernetes container. I'm
saying that you need something that
handles agent creation and assignment
and health checking and scaling and
termination as a managed service. Number
two, you need merge and coordination
infrastructure that is built from the
ground up for parallel agent work.
Right? When five agents work on related
tasks simultaneously, you need merge
queues, you need conflict detection, you
need resolution protocols. And today
this is like a bunch of duct tape and a
bunch of git worktrees. It can be
so much better. Number three, you need
supervision hierarchies, right? Meta
agents that monitor and evaluate and
course correct other agents. Not as a
framework pattern you have to code
yourself, which so many of us have to do
now, but as infrastructure that you can
configure. You need financial
observability, right? Across multiple
agent workflows, what did this
agent spend? What was the outcome
quality? What's the cost per successful
task? This is like FinOps for agents,
and it's brand new. It barely exists.
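"Cost per successful task" is worth spelling out, because failed runs still cost money. Here is a minimal Python sketch under stated assumptions: `TaskRun` and its fields are hypothetical, the costs are made up, and a real setup would pull these records from tool-call and billing logs.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRun:
    """One agent task attempt (hypothetical record shape)."""

    agent: str
    cost_usd: float
    succeeded: bool


def cost_per_successful_task(runs: list[TaskRun]) -> dict[str, Optional[float]]:
    """Total spend per agent divided by its number of successful tasks.

    Failed runs count toward spend but not toward successes. An agent
    with no successes maps to None rather than dividing by zero.
    """
    spend: dict[str, float] = defaultdict(float)
    wins: dict[str, int] = defaultdict(int)
    for run in runs:
        spend[run.agent] += run.cost_usd
        if run.succeeded:
            wins[run.agent] += 1
    return {
        agent: (spend[agent] / wins[agent]) if wins[agent] else None
        for agent in spend
    }
```

An agent that spends $1.50 across three attempts and succeeds on two of them costs $0.75 per successful task, not $0.50 per attempt. That gap is what this metric is meant to surface.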
Last but not least, you need standard
failure patterns and standard recovery
patterns. So when an agent's tool call
fails, instead of making it up on an
individual team basis, you have to have
some like standard provisioning around
what happens, right? You shouldn't have
to depend on the tool, the framework,
and what the PM had for lunch that day
to decide whether or not your agent
recovered. Look, this layer, this
orchestration layer, this is the layer
where the next infrastructure defining
company is going to get built. The
orchestration problem for agents is
structurally analogous to the container
orchestration problem that Kubernetes
solved. Right? Not the compute itself,
but the scheduling, the scaling, the
health checking, the life cycle
management that makes compute usable at
enterprise scale. So, whoever solves
orchestration at infrastructure grade is
going to own the most valuable position
in the agent stack. And it is too early
to call a winner. So, what does all of
this mean? You've seen the six layers of
the stack. What does it mean for
builders right now? I want to give you
three lessons that you should take away
if you're building on the agent stack
today that are truisms for 2026. I don't
think they're going to change a lot this
year, because the stack would have to evolve
significantly first. Number one, right now,
reliability is compounding in the wrong
direction. When your agent depends on
five different primitives, your
end-to-end reliability is the product of
five different reliabilities. So if each
delivers 99% uptime, your system delivers
only about 95%. If it's at 97% each, it's at
about 86%. You get the idea. Essentially you
are stacking the liabilities of all your
agentic primitives right now because you
have to compose so much of this layer by
hand. Reliability is really hard to
engineer these days. Number two
transitional lock-in is a real risk this
year. Right? Building on shims like
email as identity creates migration
costs when native protocols arrive.
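Going back to lesson one for a moment, the compounding arithmetic is simple enough to check yourself. This is a minimal Python sketch that assumes the primitives fail independently and that your agent needs all of them to be up at once. The function name is hypothetical.

```python
import math


def end_to_end_reliability(component_reliabilities: list[float]) -> float:
    """Probability that every component is up at the same time,
    assuming independent failures: the product of the individual values."""
    return math.prod(component_reliabilities)


five_at_99 = end_to_end_reliability([0.99] * 5)  # about 0.951
five_at_97 = end_to_end_reliability([0.97] * 5)  # about 0.859
```

Those are the 95% and 86% figures from lesson one. Add a sixth primitive at 99% and you are already down to about 94%.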
Every single shim you adopt is a bet
that it either becomes the standard or
it becomes something that you're willing
to swap out. So think about your choices
and think strategically about what is
truly agent native and what is something
that is a practical bet and make a
choice about what you think is going to
be correct over the next two or three
years. We're all living through this
together. I can't tell you if email, like
a wonderful cockroach, is going to
survive forever, because it might, or if
we're actually going to get true
agent-to-agent communication that is
post-email. Number three, and this is a big
one, agent sprawl is coming. This is the
same problem that plagued microservices
back in 2018, when you would
get people literally walking into
startups from places like Amazon and
Microsoft, and the first thing they did
when this little startup had a tiny
little codebase and just wanted to ship,
they would say, "Well, it all needs to
be a microservices architecture."
Did it? No, it didn't. It's a monolith.
Ship fast. In the same way, people are
looking at agents and they're saying
everything needs to be an agent and it's
sprawling all over the enterprise and
they're taking unexpected actions and
you don't have observability and you
don't have an orchestration layer and so
you're just kind of guessing and vibing.
That is going to be a bigger and bigger
problem over the course of 2026 unless
people invest now in orchestration
layers that yes, you're going to have to
hand roll. Look, I've talked about what
I think the new builder skills are. I'll
just reiterate them for you here.
Context engineering matters a lot these
days because what you feed the agent
matters for the outcomes you drive. Eval
driven development matters because you
have to be able to get the agent to
autonomously drive against a result to
avoid a lot of the bottlenecks that
come from human-reviewed code, and stack
literacy is going to be really important,
right? You have to know which layer in
the stack is your competitive advantage
and why and you have to build
relentlessly against that. In the world
of agents, the builders who survive, and
I don't care if you're building in the
space as an entrepreneur, if you're
building as an individual with your
OpenClaw, or if you're building as a leader
and you're building on top of the stack
and you need to get agents implemented
regardless, the ones who survive, you're
going to have to have stack literacy.
You're going to have to understand how
these six layers work. You're going to
have to keep a weather eye on which
pieces of the stack are changing and how
they affect your business. There is no
excuse for lack of stack literacy. And
part of the reason this matters, part of
the reason I'm talking about it in this
channel is because if you don't have
that, even as a business leader, even as
a non-CTO, non-tech leader, you are
going to be in big trouble because
agents drive so much business outcome
and business leverage now that is so
dependent on these pieces of the stack.
And so if you want to have an agent that
has tremendous blast radius across your
customer success, you got to understand
what's driving that. What parts of the
stack actually work? What parts of the
stack are you hand rolling? What are
your shims that you're betting on? If
you don't have that detailed
understanding, you're just kind of
hoping and praying that the agent works.
That's not a good strategy for the long
term. And so, we need to have better
stack literacy. And that's why this
video is important. So, share it with
someone who doesn't understand the agent
stack because I guarantee you there are
a lot of people who are walking around
with a lot of LinkedIn buzzwords in
their heads and they don't understand
the agent stack. And that's going to
lead to a lot of suffering and pain.
Frankly, for a lot of IC engineering
teams, they're going to be asked to
build stuff that doesn't make any sense.