YouTube transcript:
You're Building AI Agents on Layers That Won't Exist in 18 Months. (What this Means for You)

Summary

Core Theme

A new foundational infrastructure stack for AI agents is rapidly emerging, analogous to the shifts to cloud computing and microservices, but it's currently complex and difficult to navigate, requiring deep understanding for effective development and deployment.

Right now, a new infrastructure stack is

being assembled in public for AI, and

most of us aren't paying attention to

it. It's got billions and billions of

capital behind it. It's not software.

It's not agents. It's the layer

underneath both of them. It's the layer

that lets agents actually do things in

the world. And the problem right now is

that almost nobody on the outside of

that category can figure out what's real

and what's hype. And this video is all

about disentangling that, giving you a

way to understand this new

infrastructure category for agents

that's being created and helping you

make sense of what the big pieces of

this category are and how you can think

about what's needed for your agents and

your deployments. I want to be clear, we

have seen this movie before. In fact,

we've seen it twice before. Between 2006

and 2010, computing infrastructure

shifted from on-premise servers into the

cloud. Uh EC2, S3, Lambda, all of these

like cloud compute interfaces. The

builders who understood the new stack

started companies that are now dominant

like AWS. And then between 2012 and

2016, we saw the movie again. Monolithic

applications gave way very rapidly into

decomposed applications with APIs in

between them and what we call our

microservices architecture. Now we're

watching yet another shift. We're moving

from human-first tools to agent-first

primitives. I believe it is foundational

in the same way that moving to cloud was

foundational. So the new customer for

infrastructure is going to be the agent

in the same way that the new customer

for compute became the enterprise

renting compute from data centers in the

2010s. We are talking about a shift at

least as big as cloud when we're moving

to agentic primitives. But because it's

so new and because most of the startups

are so small, it's been really, really

hard to distinguish the noise in the

space from the actual signal. And so I'm

going to break the category down for you

here. And first, a word of warning. I

love Legos. You know that. These are not

the same thing as Lego bricks for

agents. I want them to be. That's the

hope. They'll be sold to you as Lego bricks

for agents, but right now, you don't

have the same degree of composability

and predictability that you would have

in a Lego analogy. These are not all

bricks with the same size knobs that you

can just slap together and they're going

to make a little wall. Right now, it's

as if you have Legos and wooden blocks

and they're all marketing themselves as

Legos. You don't know which is which and

you don't know how to snap them

together. And that is one of the biggest

problems in the space right now. I think

a more reliable analogy for where we're

at in the space today is something like

system calls. Agents need defined,

reliable interfaces that help them

figure out identity, that help them

figure out compute, that help them

figure out memory and persistence and

communication and payments. And the

companies that are building those system

calls are essentially building the

operating system that agents will need

to do their work in the new economy. And

I want to go through and decompose

that stack and give you layer by layer

what agents look like and where the

agent primitives are going in this

economy. Number one, layer one, compute

and sandboxing. This one is perhaps the

most production-ready in the stack. It's

already bearing the load for agents. And

what's the premise? It's really simple.

The agent needs somewhere safe to run

code. It should not run on your laptop.

It should not run in production. And it

should not run unsupervised. It needs

isolated, sandboxed, auditable

execution. I think that this layer has

the most mature competition at this

point. Right. E2B has roughly $32

million in funding. It uses Firecracker

microVMs, the same tech that AWS

Lambda uses. And it's intended to give

each agent a session with its own

dedicated kernel. But it's not the only

one, right? Daytona raised a $24 million

series A recently and took a different

architectural bet. They have Docker

containers with a shared kernel and it's

optimized for speed. I think they claim

a 90 millisecond cold start which is

insanely fast and also a persistent

state. Uh Modal is a startup that

targets GPU-heavy workloads. Browserbase

is valued at $300 million after the

Series B and it focuses on headless

browser automation giving agents the

ability to interact with web pages as if

they were human users. And then there are

newer entrants in the space. Alibaba's

OpenSandbox is a good example. The

interesting split in this space is

really philosophical: are you going to be

ephemeral with your agents or are you

going to be persistent? E2B treats

sandboxes as disposable: you spin one up,

you run code, you spin it down. And

Sprites treats these spaces as long-lived

and they assume that your agent can

install dependencies that it can create

files and that your agent in some sense

will come back later. They assume a

degree of agentic persistence. This is

not a style preference. This is not

optional for you to think about. This is

an architectural bet from these startups

on how long agent sessions in the new

economy will run and whether state

matters for those agent sessions. Both

camps will probably survive because the

agent economy is going to be that big.
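To make the ephemeral versus persistent split concrete, here is a minimal sketch in Python. It is not the E2B or Sprites API: `run_ephemeral` and `PersistentWorkspace` are hypothetical names, and a plain child process with a working directory stands in for a real sandbox, so it provides no actual isolation. It only shows the difference in how state is handled between runs.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_dir(code: str, workdir: Path) -> str:
    """Run a Python snippet in a child process with workdir as its cwd."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        cwd=workdir,
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout.strip()


def run_ephemeral(code: str) -> str:
    """Disposable session: a fresh directory per call, deleted afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        return run_in_dir(code, Path(tmp))


class PersistentWorkspace:
    """Long-lived session: files written by one run are visible to the next."""

    def __init__(self) -> None:
        self.root = Path(tempfile.mkdtemp(prefix="agent-ws-"))

    def run(self, code: str) -> str:
        return run_in_dir(code, self.root)

    def destroy(self) -> None:
        shutil.rmtree(self.root, ignore_errors=True)


# Hypothetical agent steps: write a note, then check whether it survived.
WRITE = "open('notes.txt', 'w').write('step 1 done')"
CHECK = "import os; print(os.path.exists('notes.txt'))"

run_ephemeral(WRITE)
ephemeral_sees_file = run_ephemeral(CHECK)   # state is gone between calls

workspace = PersistentWorkspace()
workspace.run(WRITE)
persistent_sees_file = workspace.run(CHECK)  # state carries over
workspace.destroy()

print(ephemeral_sees_file, persistent_sees_file)
```

If your agent installs dependencies or builds up files over a long task, the second shape is what you are paying for; if every task is self-contained, the first is simpler to reason about.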

But you're going to have to think about

what your workloads need. If you want to

look at this part of the stack, this is

a really high durability component,

right? Whatever you want to bet on,

whether you want to bet on persistent

agents or disposable agents, your agents

need safe spaces to run code. And it

totally makes sense that they're going

to have virtual boxes to do it on. There

will be a lot more startups working to

solve this problem over the next year or

so. But this is an area where it's

relatively mature now and you have

multiple options and you need to think

about which one you want to use if you

want to deploy your agents in any kind

of computing environment that is

virtual. What is layer two on top of the

compute piece? Layer two is identity and

communication. This is in a weird space.

It's transitional and I think that we

need to be thoughtful about what we

consider basic in this space. So right

now we know that an agent needs to exist

on the internet as an entity. It needs

to send and receive messages of some

sort. It needs to authenticate with

services. It needs to hold some kind of

verifiable identity that other systems

can recognize. And today one of the

pragmatic answers to that is well the

agent needs an email address. So AgentMail

raised a $6 million seed round from

General Catalyst a couple of weeks ago,

and Paul Graham and HubSpot CTO

Dharmesh Shah are both angels on

AgentMail. The API lets you programmatically

create email inboxes for agents:

real addresses, full threading,

attachments, labels, search. The

onboarding API even lets agents sign

themselves up. So far so good. The

thesis is very sharp. AgentMail's CEO

frames email not as a communication

tool, but as a fundamental identity

layer. And he's not the only one, by the

way. There's like half a dozen startups

in the space going after email. Email is

today a universal key to the internet.

Every SaaS service needs one at signup.

Every verification flow sends codes to

it. If you give an agent an email address,

the idea goes, you're essentially giving

it an identity. But what if that is a

shim? Email works today because it's

everywhere, not because it's the right

protocol for agents, but because it's

the right protocol for humans who built

the internet. And right now, we all have

problems with email as humans, right?

Threading is brittle. We have rate

limits designed to prevent spam and

automated agents in particular. We have

a signal to noise ratio that's terrible

for agent context windows. The real need

underneath the presenting need for email

is a need to give an agent identity and

communication protocols, something that

doesn't require pretending to be human.

And multiple teams are working on this,

right? You have on-chain agent identity

options. You have dedicated agent-to-agent

communication standards. You have

MCP-based service discovery. Nothing has

a defined right to win yet in this

space. If you are in this space, be

thoughtful about what an agent-native

protocol looks like. Now, by the way, I

am not the person to bet against email.

Email has been famously cockroach-like in

its ability to survive lots and lots of

revolutions. It just seems to stick

around. So, I'm not the one to say

agents will never use email. But you

should be aware if you're betting on

agent email that you're making a

pragmatic bet, not necessarily an

architectural bet. Okay. So, agents need

compute. That's a layer that's kind of

mature. Agents need identity and

communication. That one's really in

flux. Agents need memory and

statefulness. This is early. This is

real. And the platform risk is just as

real. Now, the need is clear, right?

Agents need to be able to remember what

happened in many cases, not just within

a session, but across many sessions,

many tasks, many days. Mem0 is

the clear leader here. They've raised

$24 million, hit 41,000 some GitHub

stars and 14 million downloads and were

selected by AWS as the exclusive memory

provider for its agent SDK. So, of

course, their API has done very well.

Call volume has gone up fivefold,

the whole thing. What Mem0 gets

right is the insight that memory isn't

there to save the conversation, which is

the old chat way of doing things. It's

actually that memory is an act of active

curation. So their system will store

important information. It will

deliberately forget outdated or

conflicting details and only recall

relevant context when you are inferring

something with an LLM query. So the

architecture is very much a hybrid data

store to reflect that. It has a network

graph. It has a vector database. And it

also has a key value store. And those

three different formats allow Mem0

to treat memory as managed

infrastructure, not just a feature

bolted onto the model. And you get

better results. Right? On the LoCoMo

benchmark, they outperform OpenAI's

built-in memory by 26% on accuracy with

91% faster latency and 90% reduced token

usage. It's a clear win. But keep

in mind that every frontier lab is

obsessed with building memory into its

models. OpenAI already has big

investments in long-term memory and is

going to continue working on it.

Anthropic is building memory into

Claude. If memory becomes a model-level

feature that the labs build in the same

way that search got sort of built into

ChatGPT and not as a separate feature,

then all of these standalone memory

companies are at risk because the model

makers can just grab them. Now, if

you're on Mem0's side of things, the

counter thesis is portability, right?

And we've talked about this, the idea

that no one should own your memory.

That's a really important concept that

you should be able to own your memory.

I've talked about this in relation to

the frontier project that OpenAI is

doing with AWS. Do you feel confident

with your context layer which is a step

above memory being something that a

company owns and not you own? So I think

the question here is really it's a

question of which thesis wins and I

think it's a very uncertain outlook.

Will this be a situation where the

market decides that they want a memory

solution that does not belong to a

hyperscaler? Or is it a situation where

the convenience the hyperscalers offer

is so compelling that the market as a

whole just decides to go with it and

throw memory at the hyperscalers and say

you can solve the problem. And we look

back in two years and we see companies

like Mem0 and we say, yeah, they didn't

last because they just weren't

convenient enough. I don't know. It

feels a little bit like a coin flip and

I think we get to shape that by what we

demand. Okay, so we've talked about

compute and sandboxes. We've talked

about identity and communication. We've

talked about agent memory and state.

Let's talk about tools and integration

for agents.

This one is growing explosively quickly

as a layer and it solves real and

immediate pain points. Any agent needs

to interact with tools to do its work,

right? whether it's interacting with

Slack or with Jira or with Salesforce or

with GitHub or with Google Workspace and

those are all integrations or with basic

primitives like it's running Unix or

it's running Python or whatever it is

and you have to have those tools if

you're going to do work at an enterprise

level and you have to have those

integrations. Anything that makes that

easier is going to coin money. And so Composio,

with $29 million in funding from

Lightspeed, provides a managed integration

layer for agents, right? It provides

authentication handling without

complicated OAuth flows. It provides

pre-built connectors to a couple of

hundred solutions and it provides

observability on every tool call. So

they don't necessarily build agents. All

they're doing is equipping agents with

the plumbing that the agents need to

navigate enterprise environments

successfully and safely. The

problem is the classic N×M integration

nightmare, right? Without middleware,

every single agent builder independently

is going to manage their credentials,

their auth flows, their rate limits,

their error handling, their API schema

changes for every single tool that agent

touches, which is just an enormous

combinatorial problem. That's

unsustainable at small scale. That is

why it's N×M, right? It's the

number of agents times the number of

tools. And at enterprise scale, where an

agent might need to touch your CRM, your

ticketing, your email, your calendar,

it's just impossible to keep up with.
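A quick way to see why the N×M framing matters is to count integrations. The sketch below assumes a simplified model that is not from the talk: without middleware every agent builder maintains its own connector to every tool, and with a shared integration layer each agent and each tool connects once to that layer. The function names are illustrative.

```python
def point_to_point(agents: int, tools: int) -> int:
    """Every agent maintains its own connector to every tool."""
    return agents * tools


def via_shared_layer(agents: int, tools: int) -> int:
    """Each agent and each tool connects once to a shared integration layer."""
    return agents + tools


# Illustrative sizes only: (number of agents, number of tools).
for agents, tools in [(3, 4), (10, 50), (50, 200)]:
    direct = point_to_point(agents, tools)
    shared = via_shared_layer(agents, tools)
    print(f"{agents} agents x {tools} tools: {direct} direct vs {shared} via a shared layer")
```

Each of those direct connectors carries its own credentials, auth flow, rate limits, and schema drift, which is the maintenance burden a managed layer is selling relief from.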

And so I would argue that this space is

a very durable space to operate in. If

you're building, as long as the

ecosystem for tools remains fragmented,

and if anything, it's going to fragment

more as companies build their own stuff,

agents are going to need an integration

layer. The long-term risk here is

standardization. If MCP truly becomes

universal, the value of a managed

integration is going to start to

diminish. So you're essentially betting

if you're building in the space like

Composio that agents are going to need

something to handle all of the massive

enterprise integration touch points for

a long time to come because so many of

these companies are going to be slow to

roll out MCPs, and then many of the

enterprises that have those tools are

going to be slow to adopt those MCPs. And

that gap is where your entire company

thesis sits if you're Composio. That one's

going to stick around for a while. I do

not believe in fast-changing

enterprises at scale. I think a lot of

enterprises move like dinosaurs and it's

going to stick around.

Okay, so compute and sandbox is done,

identity done, memory done, tool access,

we've talked about that. Layer five for

agents, provisioning and billing. This

is just brand new. It's the trust layer and

it's just starting to arrive now. Agents

essentially need to be able to acquire

services and pay for them securely. And

this is where Stripe Projects fits. It

launched this past week. It's the first

credible trust layer for agent-to-service

transactions. The agent uses the

same CLI commands as a human does and it

can provision its own database. It can

upgrade a hosting tier. Stripe can

tokenize payment credentials for that

agent. And the developer's raw card

details never leave Stripe's vault. The

problem is very simple. Since the start

of this year, agents have been able to

do almost everything when it comes to

spinning up a project except for the

part where they need to create accounts

and provision infrastructure because

that has always required a human for

authentication. And that's the gap that

Stripe is closing here. Their databases

are ready in roughly 350 milliseconds or

a third of a second. They're free to

start. They scale to zero when inactive.

Every design choice is optimized for how

quickly an agent can provision something

that the agent is building. It's not for

human-speed dashboard clicking. You are

assuming terminal access here. We

still have some missing pieces here. I

think we're going to see growth around

agent-to-agent payments. We're going to

see growth around metered billing that

maps to agent compute patterns. We're

going to see growth around dynamic

budget allocation where like agent A can

spend so many dollars without human

approval and agent B can spend so many

dollars with human approval. We're going

to have a lot of like observability

layers that we build in. I think this is

a case where Stripe as usual has made an

excellent product decision. Uh they're

known for that and this is something

that's going to stick around. like this

is immediately looking like fundamental

infrastructure for how agents build on

the web. I think we're going to see a

few more players in this space, but

fundamentally it's going to be focused

not on human readability first, but on

agent legibility and agent buildability

and then human observability over the

top. But regardless, in any agentic

economy future, you're going to have

provisioning and billing. And so

it's newly here, but it's here to stay.
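The dynamic budget allocation idea mentioned above (one agent can spend up to some amount on its own, larger amounts need a human) can be sketched as a small policy check. This is not Stripe's API: `BudgetPolicy`, `Decision`, and the dollar limits are all hypothetical, chosen only to show the decision logic.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    NEEDS_HUMAN = "needs_human"
    DENIED = "denied"


@dataclass
class BudgetPolicy:
    auto_approve_limit: float  # largest single purchase allowed without a human
    total_budget: float        # hard cap across all purchases
    spent: float = 0.0

    def request(self, amount: float) -> Decision:
        """Decide whether an agent's purchase goes through, escalates, or is refused."""
        if amount <= 0 or self.spent + amount > self.total_budget:
            return Decision.DENIED
        if amount > self.auto_approve_limit:
            return Decision.NEEDS_HUMAN
        self.spent += amount
        return Decision.APPROVED

    def record_human_approval(self, amount: float) -> None:
        """Book a purchase after a human has signed off on it."""
        if self.spent + amount > self.total_budget:
            raise ValueError("approval would exceed the total budget")
        self.spent += amount


# Hypothetical limits for one agent.
agent_a = BudgetPolicy(auto_approve_limit=20.0, total_budget=100.0)

small = agent_a.request(15.0)    # under the limit: goes through
large = agent_a.request(50.0)    # over the limit: escalate to a human
huge = agent_a.request(500.0)    # over the total budget: refused outright
agent_a.record_human_approval(50.0)

print(small, large, huge, agent_a.spent)
```

Keeping the policy separate from the agent is the design choice worth noticing: the agent asks, the policy answers, and the human only sees the escalations.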

Okay, the last layer is orchestration

and coordination. This is the biggest

opportunity in the stack because you can

leverage the power of multiple agents.

It's also a big gap right now. So, the

agent needs to work with other agents

reliably at scale, with fallback

handling, with audit trails, with cost

controls, and it matters a lot to

get that right. And there's just so

little done well around this. And it's

not for lack of interest. Like, Gartner

reported a 1,445%

surge. I don't even know, what is that, a

roughly 15x surge in multi-agent system

inquiries between Q1 2024 and Q2 2025,

let alone 2026, which is going to be up

again. Look, quite bluntly, the current

tooling is at the framework level not

the infrastructure level. So LangChain

lets you stand up a multi-agent workflow,

right? But there's a gap between "I can spin up

three agents in a notebook" and "I can

reliably run 50 agents across enterprise

systems with failure recovery and cost

controls and audit logging and human

escalation paths." That latter piece that

we all need, that is something that

we're all hand-rolling right now. And so

individual agent capabilities are

something we've largely solved. What's

been missing is the layer that makes

those capabilities composable and

parallel and reliable. And so if you're

building in this space, here's what

doesn't exist yet that needs to exist.

Number one, you need a scheduling and

lifecycle layer for agents. Not in the

container sense, right? I'm not saying

that it's a Kubernetes container. I'm

saying that you need something that

handles agent creation and assignment

and health checking and scaling and

termination as a managed service. Number

two, you need merge and coordination

infrastructure that is built from the

ground up for parallel agent work.

Right? When five agents work on related

tasks simultaneously, you need merge

queues, you need conflict detection, you

need resolution protocols. And today

this is like a bunch of duct tape and a

bunch of git worktrees. Like it can be

so much better. Number three, you need

supervision hierarchies, right? Meta-agents

that monitor and evaluate and

course-correct other agents. Not as a

framework pattern you have to code

yourself, which so many of us have to do

now, but as infrastructure that you can

configure. You need financial

observability, right? Across multiple

agent workflows, what did this

agent spend? What was the outcome

quality? What's the cost per successful

task? This is like FinOps for agents

and it's brand new. It barely exists.

Last but not least, you need standard

failure patterns and standard recovery

patterns. So when an agent's tool call

fails, instead of making it up on an

individual team basis, you have to have

some like standard provisioning around

what happens, right? You shouldn't have

to depend on the tool, the framework,

and what the PM had for lunch that day

to decide whether or not your agent

recovered. Look, this layer, this

orchestration layer, this is the layer

where the next infrastructure-defining

company is going to get built. The

orchestration problem for agents is

structurally analogous to the container

orchestration problem that Kubernetes

solved. Right? Not the compute itself,

but the scheduling, the scaling, the

health checking, the life cycle

management that makes compute usable at

enterprise scale. So, whoever solves

orchestration at infrastructure grade is

going to own the most valuable position

in the agent stack. And it is too early

to call a winner. So, what does all of

this mean? You've seen the six layers of

the stack. What does it mean for

builders right now? I want to give you

three lessons that you should take away

if you're building on the agent stack

today that are truisms for 2026. I don't

think they're going to change a lot this

year. The stack has to evolve

significantly. Number one, right now,

reliability is compounding in the wrong

direction. When your agent depends on

five different primitives, your

end-to-end reliability is the product of

five different reliabilities. So if each

delivers 99% uptime, your system delivers

only about 95% (0.99^5 ≈ 0.951). If it's at 97% each, it's at

about 86% (0.97^5 ≈ 0.859). You get the idea. Essentially you

are stacking the liabilities of all your

agentic primitives right now because you

have to compose so much of this layer by

hand. Reliability is really hard to

engineer these days. Number two,

transitional lock-in is a real risk this

year. Right? Building on shims like

email as identity creates migration

costs when native protocols arrive.
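One way to keep that migration cost down is to put the shim behind a narrow interface, so the rest of your agent code never knows it is talking over email. The sketch below is an assumption about how you might structure this, not something from the talk: `AgentChannel`, `EmailShimChannel`, and `NativeProtocolChannel` are hypothetical names, and both channels are in-memory stand-ins rather than real transports.

```python
from typing import Protocol


class AgentChannel(Protocol):
    """The only surface the rest of the agent code is allowed to depend on."""

    def send(self, recipient: str, body: str) -> str: ...


class EmailShimChannel:
    """Today's pragmatic bet: reach the other party through an email address."""

    def __init__(self) -> None:
        self.outbox: list[tuple[str, str]] = []

    def send(self, recipient: str, body: str) -> str:
        self.outbox.append((recipient, body))
        return f"email:{len(self.outbox)}"


class NativeProtocolChannel:
    """Placeholder for a future agent-native protocol (entirely hypothetical)."""

    def __init__(self) -> None:
        self.sent: list[dict[str, str]] = []

    def send(self, recipient: str, body: str) -> str:
        self.sent.append({"to": recipient, "payload": body})
        return f"native:{len(self.sent)}"


def notify_vendor(channel: AgentChannel, vendor: str) -> str:
    # Business logic sees only the interface, never the transport behind it.
    return channel.send(vendor, "Order is ready for pickup")


first = notify_vendor(EmailShimChannel(), "vendor@example.com")
second = notify_vendor(NativeProtocolChannel(), "agent://vendor")
print(first, second)
```

Swapping the shim out then means writing one new class instead of touching every call site.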

Every single shim you adopt is a bet

that it either becomes the standard or

it becomes something that you're willing

to swap out. So think about your choices

and think strategically about what is

truly agent native and what is something

that is a practical bet and make a

choice about what you think is going to

be correct over the next two or three

years. We're all living through this

together. I can't tell you if email like

a wonderful cockroach is going to

survive forever because it might or if

we're actually going to get true

agent-to-agent communication that is

post-email. Number three, and this is a big

one, agent sprawl is coming. This is the

same problem that plagued microservices

back in 2018, when you would

get people literally walking into

startups from places like Amazon and

Microsoft and the first thing they did

when this little startup had a tiny

little codebase and just wanted to ship,

they would say, "Well, it all needs to

be a microservices architecture."

Did it? No, it didn't. It's a monolith.

Ship fast. In the same way, people are

looking at agents and they're saying

everything needs to be an agent and it's

sprawling all over the enterprise and

they're taking unexpected actions and

you don't have observability and you

don't have an orchestration layer and so

you're just kind of guessing and vibing.

That is going to be a bigger and bigger

problem over the course of 2026 unless

people invest now in orchestration

layers that yes, you're going to have to

hand roll. Look, I've talked about what

I think the new builder skills are. I'll

just reiterate them for you here.

Context engineering matters a lot these

days because what you feed the agent

matters for the outcomes you drive.

Eval-driven development matters because you

have to be able to get the agent to

autonomously drive against a result to

avoid a lot of the bottlenecks that

come from human-reviewed code. And stack

literacy is going to be really important,

right? You have to know which layer in

the stack is your competitive advantage

and why, and you have to build

relentlessly against that. In the world

of agents, the builders who survive, and

I don't care if you're building in the

space as an entrepreneur, if you're

building as an individual with your

OpenClaw, or if you're building as a leader

and you're building on top of the stack

and you need to get agents implemented

regardless, the ones who survive, you're

going to have to have stack literacy.

You're going to have to understand how

these six layers work. You're going to

have to keep a weather eye on which

pieces of the stack are changing and how

they affect your business. There is no

excuse for lack of stack literacy. And

part of the reason this matters, part of

the reason I'm talking about it in this

channel is because if you don't have

that, even as a business leader, even as

a non-CTO, non-tech leader, you are

going to be in big trouble because

agents drive so much business outcome

and business leverage now that is so

dependent on these pieces of the stack.

And so if you want to have an agent that

has tremendous blast radius across your

customer success, you got to understand

what's driving that. What parts of the

stack actually work? What parts of the

stack are you hand rolling? What are

your shims that you're betting on? If

you don't have that detailed

understanding, you're just kind of

hoping and praying that the agent works.

That's not a good strategy for the long

term. And so, we need to have better

stack literacy. And that's why this

video is important. So, share it with

someone who doesn't understand the agent

stack because I guarantee you there are

a lot of people who are walking around

with a lot of LinkedIn buzzwords in

their heads and they don't understand

the agent stack. And that's going to

lead to a lot of suffering and pain.

Frankly, for a lot of IC engineering

teams, they're going to be asked to

build stuff that doesn't make any sense.
