Andrew Ng: State of AI Agents | LangChain Interrupt
I'm really excited for this next section. We'll be doing a fireside chat with Andrew Ng, and Andrew probably doesn't need any introduction to most folks here; I'm guessing a lot of people have taken some of his classes on Coursera or DeepLearning.AI. But Andrew's also been a big part of the LangChain story. I met Andrew a little over two years ago at a conference, when we started talking about LangChain, and he graciously invited us to do a course on LangChain with DeepLearning.AI. I think it must have been the second or third one they ever did, and I know a lot of people here probably watched that course or got started on LangChain because of it. So Andrew has been a huge part of the LangChain journey, and I'm super excited to welcome him on stage for a fireside chat. Let's welcome Andrew in.
[Music] [Applause]

Thanks for being here.

By the way, Harrison was really kind. I think Harrison and his team have taught six short courses so far on DeepLearning.AI, and by our metrics, net promoter score and so on, Harrison's courses are among our most highly rated. So go take all of Harrison's courses. I think the recent LangGraph one had the clearest explanation I have personally seen of a bunch of agent concepts. They've definitely helped make our own courses and explanations better, so thank you for that as well.
You've obviously touched on and thought about so many things in this industry, but one of your takes that I cite a lot, and that people have probably heard me talk about, is the idea of talking about the agenticness of an application rather than whether something is an agent. And now that we're here at an agent conference, maybe we should rename it an agentic conference. Would you mind clarifying that take? I think it was almost a year and a half or two years ago that you said it, so I'm curious whether things have changed in your mind since then.
So I remember Harrison and I both spoke at a conference over a year ago, and at that time I think both of us were trying to convince other people that agents are a thing and we should pay attention to them. That was before, maybe midsummer last year, when a bunch of marketers got hold of the agentic term and started sticking that sticker on everything until it lost meaning. But to Harrison's question: about a year and a half ago, I saw that a lot of people were arguing about whether this is an agent or not, with different arguments about whether something is truly autonomous enough to count as an agent. I felt it was fine to have that argument, but that we would succeed better as a community if we just said there are degrees to which something is agentic. If you want to build an agentic system with a little bit of autonomy or a lot of autonomy, that's all fine; there's no need to spend time arguing about whether something is truly an agent. Let's just call all of these things agentic systems with different degrees of autonomy. I think that hopefully reduced the amount of time people wasted arguing about whether something is an agent, so we could just call them all agentic and get on with it. I think it's actually worked out.
Where on that spectrum, from a little autonomy to a lot of autonomy, do you see people building these days?
Yeah. So my team routinely uses LangGraph for our hardest problems, the ones with complex flows and so on. But I'm also seeing tons of business opportunities that are frankly fairly linear workflows, or linear with just occasional side branches. In a lot of businesses, there are opportunities where right now you have people looking at a form on a website, doing a web search, checking a database to see if there's a compliance issue, or whether this is someone we shouldn't sell certain stuff to, then taking something, copy-pasting it, maybe doing another web search, and pasting it into a different form. So in business processes there are actually a lot of fairly linear workflows, or linear with very small loops and occasional branches, where a branch usually connotes a failure, because you reject that item from the workflow.

So I see a lot of opportunity there. But one challenge I see businesses have is that it's still pretty difficult to look at some work being done in your business and figure out how to turn it into an agentic workflow. What is the granularity with which you should break the job down into micro tasks? And then, after you build your initial prototype, if it doesn't work well enough, which of these steps do you work on to improve performance? That whole bag of skills, how to look at a bunch of stuff people are doing, break it into sequential steps with a small number of branches, put evals in place, all of that skill set is still far too rare, I think. And then of course the much more complex agentic workflows, which I think you've heard a bunch about here, with very complex loops, are very valuable as well. But in terms of the sheer number of opportunities, there are a lot of simpler workflows that I think are still being built out.
Let's talk about some of those skills. You've been doing a lot of courses at DeepLearning.AI in pursuit of helping people build agents. What are some of the skills that you think agent builders all across the spectrum should master and get started with?
Boy, that's a good question, and I wish I knew the answer. I've actually been thinking a lot about this recently. I think a lot of the challenge is this: if you have a business process workflow, you often have people in compliance, legal, HR, whatever, doing these steps. How do you put in place the plumbing, either through a LangGraph-type integration, or we'll see if MCP helps with some of that too, to ingest the data? And then how do you prompt, or process, and do the multiple steps in order to build this end-to-end system? One thing I see a lot is the importance of putting in place the right eval framework, not only to understand the performance of the overall system, but to trace the individual steps, so you can hone in on the one step that is broken, the one prompt that is broken, and work on that. I find that a lot of teams probably wait longer than they should, just using human evals, where every time you change something, you sit there and look at a bunch of outputs yourself. I see most teams being slower than is ideal to put systematic evals in place.

But I find that having the right instincts for what to do next in a project is still really difficult. Teams that are still learning these skills will often go down blind alleys, where you spend a few months trying to improve one component, while a more experienced team will say, you know what, I don't think this can ever be made to work, so let's just find a different way around this problem. I wish I knew more efficient ways to convey this kind of almost tactile knowledge. Often you're there, you look at the output, you look at the trace, and you've got to make a decision, in minutes or hours, about what to do next. And that's still very difficult.
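A minimal sketch of the kind of per-step tracing alluded to above, so a broken step can be spotted quickly. The decorator and step names here are hypothetical; in practice a tracing tool such as LangSmith would typically handle this:

    import functools, time

    trace_log = []  # one record per executed step

    def traced(step_name):
        """Record each step's input, output, and latency."""
        def wrap(fn):
            @functools.wraps(fn)
            def inner(*args, **kwargs):
                start = time.time()
                out = fn(*args, **kwargs)
                trace_log.append({
                    "step": step_name,
                    "input": args,
                    "output": out,
                    "seconds": time.time() - start,
                })
                return out
            return inner
        return wrap

    # Hypothetical workflow steps.
    @traced("extract_fields")
    def extract_fields(form):
        return {"name": form.strip()}

    @traced("compliance_check")
    def compliance_check(fields):
        return "restricted" not in fields["name"]

    compliance_check(extract_fields("  Acme Corp  "))
    for record in trace_log:
        print(record["step"], record["seconds"], record["output"])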
And is this kind of tactile knowledge mostly around LLMs and their limitations, or more around the product framing of things, that skill of taking a job and breaking it down? Is that something we're all still getting accustomed to?
I think it's all of the above, actually. I feel like over the last couple of years, AI tool companies have created an amazing set of AI tools, and this includes tools like LangGraph, but also ideas: how do you think about RAG, how do you think about building chatbots, the many different ways of approaching memory, how do you build evals, how do you build guardrails. There's this wide, sprawling array of really exciting tools. One picture I often have in my head is: if all you have are purple Lego bricks, you can't build that much interesting stuff. I think of these tools as being akin to Lego bricks, and the more tools you have, it's as if you don't have just purple Lego bricks, but a red one and a black one and a yellow one and a green one. As you get more differently colored and shaped Lego bricks, you can very quickly assemble them into really cool things. So I think of a lot of these tools, like the ones I was rattling off, as different types of Lego bricks. When you're trying to build something, sometimes you need that one squiggly, weird-shaped Lego brick, and some people know it, can plug it in, and just get the job done. But if you've never built evals of a certain type, you could end up spending, whatever, three extra months doing something where someone who's done it before could say, oh, we should just build evals this way, use an LLM as a judge, and go through that process to get it done much faster. So one of the unfortunate things about AI is that it's not just one tool. When I'm coding, I use a whole bunch of different stuff, and I'm not a master of enough stuff myself, but I've learned enough tools to assemble them quickly. And I think having that practice with different tools also helps with much faster decision-making.

Oh, and one other thing: it also changes. For example, because LLMs have been getting longer and longer context windows, a lot of the best practices for RAG from a year and a half ago are much less relevant today. I remember Harrison was really early to a lot of these things, like the early LangChain RAG frameworks, recursive summarization and all that. As LLM context windows got longer, now we just dump a lot more stuff into the context. It's not that RAG has gone away, but the hyperparameter tuning has gotten way easier; there's a huge range of hyperparameters that work just fine. So as LLMs keep progressing, the instincts we built two years ago may or may not be relevant anymore today.
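A toy illustration of why long context windows loosen RAG hyperparameter tuning: once the window is large, a wide range of chunk-size and top-k settings fits comfortably. The numbers are illustrative assumptions, not recommendations:

    # With a ~4k-token context, chunk_size and top_k had to be tuned
    # carefully; with 100k+ tokens, many combinations simply fit.
    CONTEXT_WINDOW = 128_000          # tokens, typical of recent models
    PROMPT_AND_ANSWER_BUDGET = 8_000  # reserved for instructions + output

    def fits(chunk_size_tokens: int, top_k: int) -> bool:
        """Does retrieving top_k chunks of this size fit in context?"""
        retrieved = chunk_size_tokens * top_k
        return retrieved <= CONTEXT_WINDOW - PROMPT_AND_ANSWER_BUDGET

    for chunk in (500, 1000, 2000):
        for k in (5, 20, 60):
            print(f"chunk={chunk:5d} top_k={k:3d} fits={fits(chunk, k)}")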
You mentioned a lot of things that I want to talk about. So, okay: what are some of the Lego bricks that are maybe underrated right now, that you would recommend but people aren't talking about? Like evals: we had three people talk about evals, and I think that's top of people's minds. But what are some things most people maybe haven't thought of, or haven't heard of yet, that you would recommend they look into?
Good question. I don't know. Maybe this: even though people talk about evals, for some reason people don't do them.

Why don't you think they do it?

I think it's because people often think of writing evals as this huge thing that you have to do; I saw a post about this on a blog about writing evals. I think of evals instead as something I'm going to throw together really quickly, in twenty minutes. It's not that good, but it starts to complement my human eyeball evals. What often happens is, I'll build a system, and there's one problem where I keep getting regressions. I thought I made it work, then it breaks. I thought I made it work, then it breaks. Well, darn it, this is getting annoying. So then I code up a very simple eval, maybe with five input examples and some very simple LLM-as-judge, to check for just this one regression: did this one thing break? And I'm not swapping out human evals for automated evals; I'm still looking at the output myself. But when I change something, I can run this eval and take that one burden off, so I don't have to think about it. And then what happens is, just like the way we write English, once you have some slightly helpful but clearly very broken, imperfect eval, you start to think, you know what, I can improve my eval to make it better, and improve it again. Just as when we build a lot of applications, we build some very quick and dirty thing that doesn't work and incrementally make it better, for a lot of the evals I build, I build a really awful eval that barely helps, and then when I look at what it does, I go, you know what, this eval is broken, I can fix it, and I incrementally make it better. So that's one thing.
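A minimal sketch of the kind of quick, throwaway regression eval described above: a handful of examples and a very simple LLM-as-judge. The model name and the my_agent stub are stand-ins, not anything from the talk:

    from openai import OpenAI

    client = OpenAI()
    JUDGE_MODEL = "gpt-4o-mini"  # stand-in; any chat model works

    def my_agent(question: str) -> str:
        """Stand-in for the system under test; replace with your agent."""
        return "Our refund window is 30 days for unopened items."

    def judge(question: str, answer: str, criterion: str) -> bool:
        """Very simple LLM-as-judge: does the answer meet one criterion?"""
        verdict = client.chat.completions.create(
            model=JUDGE_MODEL,
            messages=[{
                "role": "user",
                "content": (f"Question: {question}\nAnswer: {answer}\n"
                            f"Does the answer satisfy: {criterion}? "
                            "Reply YES or NO."),
            }],
        ).choices[0].message.content
        return verdict.strip().upper().startswith("YES")

    # A few inputs aimed at the one regression that keeps recurring.
    cases = [
        ("What's the refund window?", "states a 30-day window"),
        ("Can I return opened items?", "says opened items are excluded"),
        # ... add the remaining examples for your own regression
    ]

    for question, criterion in cases:
        ok = judge(question, my_agent(question), criterion)
        print("PASS" if ok else "FAIL", "-", question)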
I'll also mention one thing that people have talked a lot about but that I think is still underrated: the voice stack. It's one of the things I'm actually very excited about. A lot of my friends are very excited about voice applications, and I see a bunch of large enterprises really excited about voice applications: very large enterprises, very large use cases. For some reason, while there are some developers in this community doing voice, the amount of developer attention on voice stack applications, there is some, it's not that people have ignored it, but it feels much smaller than the large-enterprise importance I see, and the applications coming down the pipe. And not all of this is the real-time voice API; it's not all speech-to-speech native audio-in, audio-out models. I find those models very hard to control. But when we use more of an agentic voice stack workflow, which we find much more controllable, boy, I've been working with a ton of teams on voice stack stuff, some of which hopefully will be announced in the near future, and I'm seeing a lot of very exciting things.

And then another thing, one that's maybe not underrated, but more businesses should do it: I think many of you have seen that developers who use AI assistance in their coding are so much faster than developers who don't. It's been interesting to see how many companies' CIOs and CTOs still have policies that don't let engineers use AI-assisted coding, maybe sometimes for good reasons, but I think we have to get past that, because frankly, my teams and I would just hate to ever have to code again without AI assistance. Some businesses still need to get through that.

And I think underrated is the idea that everyone should learn to code. One fun fact about AI Fund: everyone at AI Fund, including the person who runs our front desk, the receptionist, and my CFO, and the general counsel, everyone at AI Fund actually knows how to code. It's not that I want them to be software engineers; they're not. But in their respective job functions, many of them, by learning a little bit about how to code, are better able to tell a computer what they want it to do. And it's actually driving meaningful productivity improvements across all of these job functions that are not software engineering. So that's been exciting as well.
Talking about AI coding, what tools are you using for that personally?
So, we're working on some things that we've not yet announced.

Oh, exciting.

Yeah. So maybe... I do use Cursor, Windsurf, and some other things.

All right, we'll come back to that later.
Talking about voice: if people here want to get into voice and they're familiar with building agents with LLMs, how similar is it? Are there a lot of ideas that are transferable, or what's new? What will they have to learn?
Yeah. So it turns out there are a lot of applications where I think voice is important; it creates certain interactions that work much better. From an application perspective, a text input prompt is kind of intimidating for a lot of applications. If we go to a user and say, tell me what you think, here's a text box, write a bunch of text for me, that's actually very intimidating for users. And one of the issues is that people can use backspace, so people are just slower to respond via text. Whereas with voice, time rolls forward and you just have to keep talking. You can change your mind, you can say, oh, I changed my mind, forget that earlier thing, and our models are actually pretty good at dealing with that. So I find there are a lot of applications where the user friction to getting someone to use it is lower: we just say, tell me what you think, and they respond in voice.

In terms of voice, the one biggest difference in engineering requirements is latency, because if someone says something, you really want to respond in, I don't know, sub one second. Less than 500 milliseconds is great, but really, ideally, sub one second. And a lot of agentic workflows will run for many seconds. So when DeepLearning.AI worked with RealAvatar to build an avatar of me, this is on a web page, you can talk to an avatar of me if you want, our initial version had something like five to nine seconds of latency, and it's just a bad user experience: you say something, nine seconds of silence, then my avatar responds. So we wound up building things like what we call a pre-response. Just as, if you ask me a question, I might go, "Huh, that's interesting," or, "Let me think about that," we prompted an LLM to basically do that to hide the latency. And it actually seems to work great. And there are all these other little tricks as well.
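A minimal sketch of the pre-response idea: start the slow agentic pipeline, then immediately speak a short acknowledgment to hide the latency. The synthesize_and_play and run_agent functions are hypothetical stand-ins:

    import asyncio

    async def synthesize_and_play(text: str) -> None:
        """Hypothetical TTS stand-in: speak text to the caller."""
        print(f"[voice] {text}")

    async def run_agent(question: str) -> str:
        """Hypothetical stand-in for a multi-second agentic workflow."""
        await asyncio.sleep(5)  # simulate slow tool calls / reasoning
        return "Here's what I found..."

    async def answer(question: str) -> None:
        # Kick off the slow pipeline first...
        task = asyncio.create_task(run_agent(question))
        # ...then immediately cover the wait with a pre-response,
        # e.g. generated by a small, fast model given the question.
        await synthesize_and_play("Huh, that's an interesting question.")
        await synthesize_and_play(await task)

    asyncio.run(answer("What is an agentic workflow?"))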
It also turns out that if you're building a voice customer service chatbot, playing the background noise of a customer contact center instead of dead silence makes people much more accepting of the latency. So I find there are a lot of these things that are different from a pure text-based LLM. But in applications where a voice-based modality lets a user be comfortable and just start talking, I think it sometimes really reduces the user friction to getting some information out of them. I think when we talk, we don't feel like we need to deliver perfection as much as when we write, so it's somehow easier for people to just start blurting out their ideas and change their minds and go back and forth, and that lets us get the information from them that we need to help the user move forward.

So, huh, that's interesting.
Yeah. One of the new things that's out there, and you mentioned it briefly, is MCP. How are you seeing that transform how people are building apps, what types of apps they're building, or what's generally happening in the ecosystem?
Yeah, I think it's really exciting. Just this morning we released, with Anthropic, a short course on MCP. I had actually seen a lot of stuff on the internet about MCP that I thought was quite confusing, so when we got together with Anthropic, we said, let's create a really good short course on MCP that explains it clearly. I think MCP is fantastic. It filled a very clear market gap, and the fact that OpenAI adopted it also speaks, I think, to its importance.

I think the MCP standard will continue to evolve. Many of you know what MCP is: it makes it much easier for agents primarily, but frankly other types of software too, to plug into different types of data. When I'm using LLMs myself, or when I'm building applications, a lot of us spend so much time on the plumbing. For those of you from large enterprises especially: the AI models, especially reasoning models, are pretty darn intelligent; they can do a lot of stuff when given the right context. So I find that I and my team spend a lot of time working on the plumbing, on the data integrations, to get the right context into the LLM, which then often does something pretty sensible once it has the right input context. MCP, I think, is a fantastic way to standardize the interface to a lot of tools or API calls, as well as data sources.

It does feel a little bit like the Wild West, though. A lot of MCP servers you find on the internet do not work, and the authentication systems, even for the very large companies with MCP servers, are a little bit clunky; it's not clear whether the authentication token totally works or when it expires. There's a lot of that going on. I think the MCP protocol itself is also early. Right now, MCP gives you a long, flat list of the available resources; eventually I think we'll need some more hierarchical discovery. Imagine you want to build something: I don't know if there will ever be an MCP interface to LangGraph, but LangGraph has so many API calls that you just can't hand an agent a long list of everything under the sun to sort through. So I think we'll need some sort of hierarchical discovery mechanism. But MCP is a really fantastic first step, and I definitely encourage you to learn about it. It will probably make your life easier if you find good MCP server implementations to help with some of the data integrations. And I think this idea will be important: when you have n models, or n agents, and m data sources, it should not take n times m effort to do all the integrations; it should be n plus m. MCP will need to evolve, but it's a fantastic first step toward that type of data integration.
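A minimal sketch of the server side of that n-plus-m picture, assuming the FastMCP helper from the official Python mcp SDK; the compliance tool itself is a hypothetical example. Each data source exposes one MCP server, and any MCP-capable model or agent can then use it without bespoke glue code:

    from mcp.server.fastmcp import FastMCP

    # One MCP server per data source (the "+m" side); any MCP-capable
    # client (the "+n" side) connects via the standard protocol.
    mcp = FastMCP("compliance-db")

    @mcp.tool()
    def check_compliance(customer_name: str) -> str:
        """Hypothetical lookup against an internal compliance database."""
        restricted = {"Acme Corp"}  # stand-in for a real query
        return "restricted" if customer_name in restricted else "clear"

    if __name__ == "__main__":
        mcp.run()  # serves over stdio by default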
Another type of protocol that's seen less buzz than MCP is some of the agent-to-agent stuff. And I remember, when we were at a conference a year or so ago, I think you were talking about multi-agent systems, which this would kind of enable. So how do you see some of the multi-agent or agent-to-agent stuff evolving?
Yeah. So I think agentic AI is still so early that most of us, including me, struggle to even make our own code work. So making my agent work with someone else's agent feels like a two-miracle requirement. What I see is that when one team is building a multi-agent system, that often works: we build a bunch of agents, they work with each other, we understand the protocols, blah blah blah, that works. But right now, at least at this moment in time, and maybe I'm off, the number of examples I'm seeing where one team's agent, or collection of agents, successfully engages a totally different team's agent or collection of agents... I think we're a little bit early to that. I'm sure we'll get there, but I'm not personally seeing real, huge success stories of that yet. I'm not sure if y'all are seeing them.

No, I agree. I think it's super early. If MCP is early, I think the agent-to-agent stuff is even earlier.
Another thing that's kind of top of people's minds right now is vibe coding, and all of that. You touched on it a little earlier with how people are using these AI coding assistants, but how do you think about vibe coding? Is that a different skill than before? What purpose does it serve in the world?
Yeah. So I think many of us now code while barely looking at the code, and I think that's a fantastic thing to be doing. I do think it's unfortunate that it's called vibe coding, because it's misleading a lot of people into thinking you just go with the vibes: accept this, reject that. Frankly, when I'm coding for a day with vibe coding, or whatever, with AI coding assistants, I'm exhausted by the end of the day. It's a deeply intellectual exercise. So I think the name is unfortunate, but the phenomenon is real, it's been taking off, and it's great.

Over the last year, a few people have been advising others not to learn to code, on the basis that AI will automate coding. I think we'll look back on that as some of the worst career advice ever given, because over the last many decades, as coding became easier, more people started to code. It turns out, when we went from punch cards to keyboards and terminals, or, and I actually found some very old articles about this, when programming went from assembly language to, literally, COBOL, there were people arguing back then: we have COBOL, it's so easy, we don't need programmers anymore. And obviously, when coding became easier, more people learned to code. So with AI coding assistance, a lot more people should code. It turns out one of the most important skills of the future, for developers and non-developers alike, is the ability to tell a computer exactly what you want, so that it will do it for you. And understanding at some level, which all of you do, I know, how a computer works lets you prompt or instruct a computer much more precisely, which is why I still advise everyone to learn one programming language: learn Python or something.

And maybe some of you know this: I'm personally a much stronger Python developer than, say, JavaScript developer. But with AI-assisted coding, I now write a lot more JavaScript and TypeScript code than I ever used to. Even so, when debugging JavaScript code that something else wrote for me, that I didn't write with my own fingers, really understanding what the error cases are, what this means, has been really important for me to debug my JavaScript code.
So, if you don't like the name vibe coding, do you have a better name in mind?

Oh, that's a good question. I should think about that.

We'll get back to you on that. That's a good question.
One of the things you announced recently is a new fund for AI Fund. So, congrats on that.

Thank you.

For people in the audience who are maybe thinking of starting a startup, or looking into that, what advice would you have for them?
So, AI Fund is a venture studio. We build companies, and we exclusively invest in companies that we co-founded. Looking back on AI Fund's lessons learned, I would say the number one predictor of a startup's success is speed. I know we're in Silicon Valley, but I see a lot of people who have never yet seen the speed with which a skilled team can execute. If you've never seen it before, and I know many of you have seen it, it's just so much faster than anything that slower businesses know how to do.

And I think the number two predictor, also very important, is technical knowledge. It turns out, if we look at the skills needed to build a startup, there are things like how you market, how you sell, how you price: all of that is important, but that knowledge has been around for a while, so it's a little bit more widespread. The knowledge that's really rare is how the technology actually works, because the technology has been evolving so quickly. I have deep respect for the go-to-market people; pricing is hard, marketing is hard, positioning is hard. But that knowledge is more diffused, and the rarest resource is someone who really understands how the technology works. So at AI Fund, we really like working with deeply technical people who have good instincts, who understand "do this, don't do that"; that lets you go twice as fast. And a lot of the business knowledge is very important too, but it's usually easier to figure out.
All right, that's great advice for starting something. We're going to wrap this up and go to a break now, but before we do, please join me in giving Andrew a big hand. Thank you.