Andrew Ng: State of AI Agents | LangChain Interrupt
I'm really excited for this next section. We'll be doing a fireside chat with Andrew Ng, and Andrew probably doesn't need any introduction to most folks here; I'm guessing a lot of people have taken some of his classes on Coursera or DeepLearning.AI. But Andrew's also been a big part of the LangChain story. I met Andrew a little over two years ago at a conference, when we started talking about LangChain, and he graciously invited us to do a course on LangChain with DeepLearning.AI. I think it must have been the second or third one they ever did, and I know a lot of people here probably watched that course or got started on LangChain because of it. So Andrew has been a huge part of the LangChain journey, and I'm super excited to welcome him on stage for a fireside chat. Let's welcome Andrew in.
[Music] [Applause]

Thanks for being here.

By the way, Harrison was really kind. I think Harrison and his team have taught six short courses so far on DeepLearning.AI, and by our metrics, net promoter score and so on, Harrison's courses are among our most highly rated. So go take all of Harrison's courses. I think the recent LangGraph one had the clearest explanation I have personally seen of a bunch of agent concepts. They've definitely helped make our own courses and explanations better, so thank you for that as well.
You've obviously touched on and thought about so many things in this industry, but one of your takes that I cite a lot, and that people have probably heard me talk about, is the idea of talking about the agenticness of an application rather than whether something is an agent. And now that we're here at an agent conference, maybe we should rename it an agentic conference. Would you mind clarifying that take? I think it was almost a year and a half or two years ago that you said it, so I'm curious whether things have changed in your mind since then.
So I remember Harrison and I both spoke at a conference over a year ago, and at that time I think both of us were trying to convince other people that agents are a thing and we should pay attention to them. That was before, maybe midsummer last year, when a bunch of marketers got hold of the agentic term and started sticking that sticker on everything until it lost meaning. But to Harrison's question: about a year and a half ago, I saw that a lot of people were arguing about whether this is an agent or not, with different arguments about whether something is truly autonomous enough to count as an agent. I felt it was fine to have that argument, but that we would succeed better as a community if we just said there are degrees to which something is agentic. If you want to build an agentic system with a little bit of autonomy or a lot of autonomy, that's all fine; there's no need to spend time arguing about whether something is truly an agent. Let's just call all of these things agentic systems with different degrees of autonomy. I think that hopefully reduced the amount of time people wasted arguing about whether something is an agent, so we could just call them all agentic and get on with it. I think it's actually worked out.
Where on that spectrum, from a little autonomy to a lot of autonomy, do you see people building these days?
Yeah. So my team routinely uses LangGraph for our hardest problems, the ones with complex flows and so on. But I'm also seeing tons of business opportunities that are frankly fairly linear workflows, or linear with just occasional side branches. In a lot of businesses, there are opportunities where right now you have people looking at a form on a website, doing a web search, checking a database to see if there's a compliance issue, or whether this is someone we shouldn't sell certain stuff to, then taking something, copy-pasting it, maybe doing another web search, and pasting it into a different form. So in business processes there are actually a lot of fairly linear workflows, or linear with very small loops and occasional branches, where a branch usually connotes a failure, because you reject that item from the workflow.

So I see a lot of opportunity there. But one challenge I see businesses have is that it's still pretty difficult to look at some work being done in your business and figure out how to turn it into an agentic workflow. What is the granularity with which you should break the job down into micro tasks? And then, after you build your initial prototype, if it doesn't work well enough, which of these steps do you work on to improve performance? That whole bag of skills, how to look at a bunch of stuff people are doing, break it into sequential steps with a small number of branches, put evals in place, all of that skill set is still far too rare, I think. And then of course the much more complex agentic workflows, which I think you've heard a bunch about here, with very complex loops, are very valuable as well. But in terms of the sheer number of opportunities, there are a lot of simpler workflows that I think are still being built out.
Let's talk about some of those skills. You've been doing a lot of courses at DeepLearning.AI in pursuit of helping people build agents. What are some of the skills that you think agent builders all across the spectrum should master and get started with?
Boy, that's a good question, and I wish I knew the answer. I've actually been thinking a lot about this recently. I think a lot of the challenge is this: if you have a business process workflow, you often have people in compliance, legal, HR, whatever, doing these steps. How do you put in place the plumbing, either through a LangGraph-type integration, or we'll see if MCP helps with some of that too, to ingest the data? And then how do you prompt, or process, and do the multiple steps in order to build this end-to-end system? One thing I see a lot is the importance of putting in place the right eval framework, not only to understand the performance of the overall system, but to trace the individual steps, so you can hone in on the one step that is broken, the one prompt that is broken, and work on that. I find that a lot of teams probably wait longer than they should, just using human evals, where every time you change something, you sit there and look at a bunch of outputs yourself. I see most teams being slower than is ideal to put systematic evals in place.

But I find that having the right instincts for what to do next in a project is still really difficult. Teams that are still learning these skills will often go down blind alleys, where you spend a few months trying to improve one component, while a more experienced team will say, you know what, I don't think this can ever be made to work, so let's just find a different way around this problem. I wish I knew more efficient ways to convey this kind of almost tactile knowledge. Often you're there, you look at the output, you look at the trace, and you've got to make a decision, in minutes or hours, about what to do next. And that's still very difficult.
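A minimal sketch of the kind of per-step tracing alluded to above, so a broken step can be spotted quickly. The decorator and step names here are hypothetical; in practice a tracing tool such as LangSmith would typically handle this:

    import functools, time

    trace_log = []  # one record per executed step

    def traced(step_name):
        """Record each step's input, output, and latency."""
        def wrap(fn):
            @functools.wraps(fn)
            def inner(*args, **kwargs):
                start = time.time()
                out = fn(*args, **kwargs)
                trace_log.append({
                    "step": step_name,
                    "input": args,
                    "output": out,
                    "seconds": time.time() - start,
                })
                return out
            return inner
        return wrap

    # Hypothetical workflow steps.
    @traced("extract_fields")
    def extract_fields(form):
        return {"name": form.strip()}

    @traced("compliance_check")
    def compliance_check(fields):
        return "restricted" not in fields["name"]

    compliance_check(extract_fields("  Acme Corp  "))
    for record in trace_log:
        print(record["step"], record["seconds"], record["output"])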
And is this kind of tactile knowledge mostly around LLMs and their limitations, or more around the product framing of things, that skill of taking a job and breaking it down? Is that something we're all still getting accustomed to?
I think it's all of the above, actually. I feel like over the last couple of years, AI tool companies have created an amazing set of AI tools, and this includes tools like LangGraph, but also ideas: how do you think about RAG, how do you think about building chatbots, the many different ways of approaching memory, how do you build evals, how do you build guardrails. There's this wide, sprawling array of really exciting tools. One picture I often have in my head is: if all you have are purple Lego bricks, you can't build that much interesting stuff. I think of these tools as being akin to Lego bricks, and the more tools you have, it's as if you don't have just purple Lego bricks, but a red one and a black one and a yellow one and a green one. As you get more differently colored and shaped Lego bricks, you can very quickly assemble them into really cool things. So I think of a lot of these tools, like the ones I was rattling off, as different types of Lego bricks. When you're trying to build something, sometimes you need that one squiggly, weird-shaped Lego brick, and some people know it, can plug it in, and just get the job done. But if you've never built evals of a certain type, you could end up spending, whatever, three extra months doing something where someone who's done it before could say, oh, we should just build evals this way, use an LLM as a judge, and go through that process to get it done much faster. So one of the unfortunate things about AI is that it's not just one tool. When I'm coding, I use a whole bunch of different stuff, and I'm not a master of enough stuff myself, but I've learned enough tools to assemble them quickly. And I think having that practice with different tools also helps with much faster decision-making.

Oh, and one other thing: it also changes. For example, because LLMs have been getting longer and longer context windows, a lot of the best practices for RAG from a year and a half ago are much less relevant today. I remember Harrison was really early to a lot of these things, like the early LangChain RAG frameworks, recursive summarization and all that. As LLM context windows got longer, now we just dump a lot more stuff into the context. It's not that RAG has gone away, but the hyperparameter tuning has gotten way easier; there's a huge range of hyperparameters that work just fine. So as LLMs keep progressing, the instincts we built two years ago may or may not be relevant anymore today.
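A toy illustration of why long context windows loosen RAG hyperparameter tuning: once the window is large, a wide range of chunk-size and top-k settings fits comfortably. The numbers are illustrative assumptions, not recommendations:

    # With a ~4k-token context, chunk_size and top_k had to be tuned
    # carefully; with 100k+ tokens, many combinations simply fit.
    CONTEXT_WINDOW = 128_000          # tokens, typical of recent models
    PROMPT_AND_ANSWER_BUDGET = 8_000  # reserved for instructions + output

    def fits(chunk_size_tokens: int, top_k: int) -> bool:
        """Does retrieving top_k chunks of this size fit in context?"""
        retrieved = chunk_size_tokens * top_k
        return retrieved <= CONTEXT_WINDOW - PROMPT_AND_ANSWER_BUDGET

    for chunk in (500, 1000, 2000):
        for k in (5, 20, 60):
            print(f"chunk={chunk:5d} top_k={k:3d} fits={fits(chunk, k)}")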
You mentioned a lot of things that I want to talk about. So, okay: what are some of the Lego bricks that are maybe underrated right now, that you would recommend but people aren't talking about? Like evals: we had three people talk about evals, and I think that's top of people's minds. But what are some things most people maybe haven't thought of, or haven't heard of yet, that you would recommend they look into?
Good question. I don't know. Maybe this: even though people talk about evals, for some reason people don't do them.

Why don't you think they do it?

I think it's because people often think of writing evals as this huge thing that you have to do; I saw a post about this on a blog about writing evals. I think of evals instead as something I'm going to throw together really quickly, in twenty minutes. It's not that good, but it starts to complement my human eyeball evals. What often happens is, I'll build a system, and there's one problem where I keep getting regressions. I thought I made it work, then it breaks. I thought I made it work, then it breaks. Well, darn it, this is getting annoying. So then I code up a very simple eval, maybe with five input examples and some very simple LLM-as-judge, to check for just this one regression: did this one thing break? And I'm not swapping out human evals for automated evals; I'm still looking at the output myself. But when I change something, I can run this eval and take that one burden off, so I don't have to think about it. And then what happens is, just like the way we write English, once you have some slightly helpful but clearly very broken, imperfect eval, you start to think, you know what, I can improve my eval to make it better, and improve it again. Just as when we build a lot of applications, we build some very quick and dirty thing that doesn't work and incrementally make it better, for a lot of the evals I build, I build a really awful eval that barely helps, and then when I look at what it does, I go, you know what, this eval is broken, I can fix it, and I incrementally make it better. So that's one thing.
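A minimal sketch of the kind of quick, throwaway regression eval described above: a handful of examples and a very simple LLM-as-judge. The model name and the my_agent stub are stand-ins, not anything from the talk:

    from openai import OpenAI

    client = OpenAI()
    JUDGE_MODEL = "gpt-4o-mini"  # stand-in; any chat model works

    def my_agent(question: str) -> str:
        """Stand-in for the system under test; replace with your agent."""
        return "Our refund window is 30 days for unopened items."

    def judge(question: str, answer: str, criterion: str) -> bool:
        """Very simple LLM-as-judge: does the answer meet one criterion?"""
        verdict = client.chat.completions.create(
            model=JUDGE_MODEL,
            messages=[{
                "role": "user",
                "content": (f"Question: {question}\nAnswer: {answer}\n"
                            f"Does the answer satisfy: {criterion}? "
                            "Reply YES or NO."),
            }],
        ).choices[0].message.content
        return verdict.strip().upper().startswith("YES")

    # A few inputs aimed at the one regression that keeps recurring.
    cases = [
        ("What's the refund window?", "states a 30-day window"),
        ("Can I return opened items?", "says opened items are excluded"),
        # ... add the remaining examples for your own regression
    ]

    for question, criterion in cases:
        ok = judge(question, my_agent(question), criterion)
        print("PASS" if ok else "FAIL", "-", question)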
I'll also mention one thing that people have talked a lot about but that I think is still underrated: the voice stack. It's one of the things I'm actually very excited about. A lot of my friends are very excited about voice applications, and I see a bunch of large enterprises really excited about voice applications: very large enterprises, very large use cases. For some reason, while there are some developers in this community doing voice, the amount of developer attention on voice stack applications, there is some, it's not that people have ignored it, but it feels much smaller than the large-enterprise importance I see, and the applications coming down the pipe. And not all of this is the real-time voice API; it's not all speech-to-speech native audio-in, audio-out models. I find those models very hard to control. But when we use more of an agentic voice stack workflow, which we find much more controllable, boy, I've been working with a ton of teams on voice stack stuff, some of which hopefully will be announced in the near future, and I'm seeing a lot of very exciting things.

And then another thing, one that's maybe not underrated, but more businesses should do it: I think many of you have seen that developers who use AI assistance in their coding are so much faster than developers who don't. It's been interesting to see how many companies' CIOs and CTOs still have policies that don't let engineers use AI-assisted coding, maybe sometimes for good reasons, but I think we have to get past that, because frankly, my teams and I would just hate to ever have to code again without AI assistance. Some businesses still need to get through that.

And I think underrated is the idea that everyone should learn to code. One fun fact about AI Fund: everyone at AI Fund, including the person who runs our front desk, the receptionist, and my CFO, and the general counsel, everyone at AI Fund actually knows how to code. It's not that I want them to be software engineers; they're not. But in their respective job functions, many of them, by learning a little bit about how to code, are better able to tell a computer what they want it to do. And it's actually driving meaningful productivity improvements across all of these job functions that are not software engineering. So that's been exciting as well.
Talking about AI coding, what tools are you using for that personally?
So, we're working on some things that we've not yet announced.

Oh, exciting.

Yeah. So maybe... I do use Cursor, Windsurf, and some other things.

All right, we'll come back to that later.
Talking about voice: if people here want to get into voice and they're familiar with building agents with LLMs, how similar is it? Are there a lot of ideas that are transferable, or what's new? What will they have to learn?
Yeah. So it turns out there are a lot of applications where I think voice is important; it creates certain interactions that work much better. From an application perspective, a text input prompt is kind of intimidating for a lot of applications. If we go to a user and say, tell me what you think, here's a text box, write a bunch of text for me, that's actually very intimidating for users. And one of the issues is that people can use backspace, so people are just slower to respond via text. Whereas with voice, time rolls forward and you just have to keep talking. You can change your mind, you can say, oh, I changed my mind, forget that earlier thing, and our models are actually pretty good at dealing with that. So I find there are a lot of applications where the user friction to getting someone to use it is lower: we just say, tell me what you think, and they respond in voice.

In terms of voice, the one biggest difference in engineering requirements is latency, because if someone says something, you really want to respond in, I don't know, sub one second. Less than 500 milliseconds is great, but really, ideally, sub one second. And a lot of agentic workflows will run for many seconds. So when DeepLearning.AI worked with RealAvatar to build an avatar of me, this is on a web page, you can talk to an avatar of me if you want, our initial version had something like five to nine seconds of latency, and it's just a bad user experience: you say something, nine seconds of silence, then my avatar responds. So we wound up building things like what we call a pre-response. Just as, if you ask me a question, I might go, "Huh, that's interesting," or, "Let me think about that," we prompted an LLM to basically do that to hide the latency. And it actually seems to work great. And there are all these other little tricks as well.
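A minimal sketch of the pre-response idea: start the slow agentic pipeline, then immediately speak a short acknowledgment to hide the latency. The synthesize_and_play and run_agent functions are hypothetical stand-ins:

    import asyncio

    async def synthesize_and_play(text: str) -> None:
        """Hypothetical TTS stand-in: speak text to the caller."""
        print(f"[voice] {text}")

    async def run_agent(question: str) -> str:
        """Hypothetical stand-in for a multi-second agentic workflow."""
        await asyncio.sleep(5)  # simulate slow tool calls / reasoning
        return "Here's what I found..."

    async def answer(question: str) -> None:
        # Kick off the slow pipeline first...
        task = asyncio.create_task(run_agent(question))
        # ...then immediately cover the wait with a pre-response,
        # e.g. generated by a small, fast model given the question.
        await synthesize_and_play("Huh, that's an interesting question.")
        await synthesize_and_play(await task)

    asyncio.run(answer("What is an agentic workflow?"))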
It also turns out that if you're building a voice customer service chatbot, playing the background noise of a customer contact center instead of dead silence makes people much more accepting of the latency. So I find there are a lot of these things that are different from a pure text-based LLM. But in applications where a voice-based modality lets a user be comfortable and just start talking, I think it sometimes really reduces the user friction to getting some information out of them. I think when we talk, we don't feel like we need to deliver perfection as much as when we write, so it's somehow easier for people to just start blurting out their ideas and change their minds and go back and forth, and that lets us get the information from them that we need to help the user move forward.

So, huh, that's interesting.
Yeah. One of the new things that's out there, and you mentioned it briefly, is MCP. How are you seeing that transform how people are building apps, what types of apps they're building, or what's generally happening in the ecosystem?
Yeah, I think it's really exciting. Just this morning we released, with Anthropic, a short course on MCP. I had actually seen a lot of stuff on the internet about MCP that I thought was quite confusing, so when we got together with Anthropic, we said, let's create a really good short course on MCP that explains it clearly. I think MCP is fantastic. It filled a very clear market gap, and the fact that OpenAI adopted it also speaks, I think, to its importance.

I think the MCP standard will continue to evolve. Many of you know what MCP is: it makes it much easier for agents primarily, but frankly other types of software too, to plug into different types of data. When I'm using LLMs myself, or when I'm building applications, a lot of us spend so much time on the plumbing. For those of you from large enterprises especially: the AI models, especially reasoning models, are pretty darn intelligent; they can do a lot of stuff when given the right context. So I find that I and my team spend a lot of time working on the plumbing, on the data integrations, to get the right context into the LLM, which then often does something pretty sensible once it has the right input context. MCP, I think, is a fantastic way to standardize the interface to a lot of tools or API calls, as well as data sources.

It does feel a little bit like the Wild West, though. A lot of MCP servers you find on the internet do not work, and the authentication systems, even for the very large companies with MCP servers, are a little bit clunky; it's not clear whether the authentication token totally works or when it expires. There's a lot of that going on. I think the MCP protocol itself is also early. Right now, MCP gives you a long, flat list of the available resources; eventually I think we'll need some more hierarchical discovery. Imagine you want to build something: I don't know if there will ever be an MCP interface to LangGraph, but LangGraph has so many API calls that you just can't hand an agent a long list of everything under the sun to sort through. So I think we'll need some sort of hierarchical discovery mechanism. But MCP is a really fantastic first step, and I definitely encourage you to learn about it. It will probably make your life easier if you find good MCP server implementations to help with some of the data integrations. And I think this idea will be important: when you have n models, or n agents, and m data sources, it should not take n times m effort to do all the integrations; it should be n plus m. MCP will need to evolve, but it's a fantastic first step toward that type of data integration.
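A minimal sketch of the server side of that n-plus-m picture, assuming the FastMCP helper from the official Python mcp SDK; the compliance tool itself is a hypothetical example. Each data source exposes one MCP server, and any MCP-capable model or agent can then use it without bespoke glue code:

    from mcp.server.fastmcp import FastMCP

    # One MCP server per data source (the "+m" side); any MCP-capable
    # client (the "+n" side) connects via the standard protocol.
    mcp = FastMCP("compliance-db")

    @mcp.tool()
    def check_compliance(customer_name: str) -> str:
        """Hypothetical lookup against an internal compliance database."""
        restricted = {"Acme Corp"}  # stand-in for a real query
        return "restricted" if customer_name in restricted else "clear"

    if __name__ == "__main__":
        mcp.run()  # serves over stdio by default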
Another type of protocol that's seen less buzz than MCP is some of the agent-to-agent stuff. And I remember, when we were at a conference a year or so ago, I think you were talking about multi-agent systems, which this would kind of enable. So how do you see some of the multi-agent or agent-to-agent stuff evolving?
Yeah. So I think agentic AI is still so early that most of us, including me, struggle to even make our own code work. So making my agent work with someone else's agent feels like a two-miracle requirement. What I see is that when one team is building a multi-agent system, that often works: we build a bunch of agents, they work with each other, we understand the protocols, blah blah blah, that works. But right now, at least at this moment in time, and maybe I'm off, the number of examples I'm seeing where one team's agent, or collection of agents, successfully engages a totally different team's agent or collection of agents... I think we're a little bit early to that. I'm sure we'll get there, but I'm not personally seeing real, huge success stories of that yet. I'm not sure if y'all are seeing them.

No, I agree. I think it's super early. If MCP is early, I think the agent-to-agent stuff is even earlier.
Another thing that's kind of top of people's minds right now is vibe coding, and all of that. You touched on it a little earlier with how people are using these AI coding assistants, but how do you think about vibe coding? Is that a different skill than before? What purpose does it serve in the world?
Yeah. So I think many of us now code while barely looking at the code, and I think that's a fantastic thing to be doing. I do think it's unfortunate that it's called vibe coding, because it's misleading a lot of people into thinking you just go with the vibes: accept this, reject that. Frankly, when I'm coding for a day with vibe coding, or whatever, with AI coding assistants, I'm exhausted by the end of the day. It's a deeply intellectual exercise. So I think the name is unfortunate, but the phenomenon is real, it's been taking off, and it's great.

Over the last year, a few people have been advising others not to learn to code, on the basis that AI will automate coding. I think we'll look back on that as some of the worst career advice ever given, because over the last many decades, as coding became easier, more people started to code. It turns out, when we went from punch cards to keyboards and terminals, or, and I actually found some very old articles about this, when programming went from assembly language to, literally, COBOL, there were people arguing back then: we have COBOL, it's so easy, we don't need programmers anymore. And obviously, when coding became easier, more people learned to code. So with AI coding assistance, a lot more people should code. It turns out one of the most important skills of the future, for developers and non-developers alike, is the ability to tell a computer exactly what you want, so that it will do it for you. And understanding at some level, which all of you do, I know, how a computer works lets you prompt or instruct a computer much more precisely, which is why I still advise everyone to learn one programming language: learn Python or something.

And maybe some of you know this: I'm personally a much stronger Python developer than, say, JavaScript developer. But with AI-assisted coding, I now write a lot more JavaScript and TypeScript code than I ever used to. Even so, when debugging JavaScript code that something else wrote for me, that I didn't write with my own fingers, really understanding what the error cases are, what this means, has been really important for me to debug my JavaScript code.
So, if you don't like the name vibe coding, do you have a better name in mind?

Oh, that's a good question. I should think about that.

We'll get back to you on that. That's a good question.
One of the things you announced recently is a new fund for AI Fund. So, congrats on that.

Thank you.

For people in the audience who are maybe thinking of starting a startup, or looking into that, what advice would you have for them?
So, AI Fund is a venture studio. We build companies, and we exclusively invest in companies that we co-founded. Looking back on AI Fund's lessons learned, I would say the number one predictor of a startup's success is speed. I know we're in Silicon Valley, but I see a lot of people who have never yet seen the speed with which a skilled team can execute. If you've never seen it before, and I know many of you have seen it, it's just so much faster than anything that slower businesses know how to do.

And I think the number two predictor, also very important, is technical knowledge. It turns out, if we look at the skills needed to build a startup, there are things like how you market, how you sell, how you price: all of that is important, but that knowledge has been around for a while, so it's a little bit more widespread. The knowledge that's really rare is how the technology actually works, because the technology has been evolving so quickly. I have deep respect for the go-to-market people; pricing is hard, marketing is hard, positioning is hard. But that knowledge is more diffused, and the rarest resource is someone who really understands how the technology works. So at AI Fund, we really like working with deeply technical people who have good instincts, who understand "do this, don't do that"; that lets you go twice as fast. And a lot of the business knowledge is very important too, but it's usually easier to figure out.
All right, that's great advice for starting something. We're going to wrap this up and go to a break now, but before we do, please join me in giving Andrew a big hand. Thank you.