Why MCP really is a big deal | Model Context Protocol with Tim Berglund | Confluent Developer | YouTubeToText
Video Transcript
Model Context Protocol.
It really is a big deal,
but I think most people
are missing the point here.
Everybody's talking about
enhancing desktop applications
with agentic functionality.
But if you want to write
agentic AI applications
at work
like a professional
you're going to need a broader vision.
In order for me to give you that vision,
I'm going to need to explain to you how it works.
And, here's a hint:
comparing it to the USB-C
of AI applications
is probably not going to be helpful.
Didn't help me,
I don't think it's going to help you.
Now, let's start by remembering
how an LLM basically works
from an outside perspective, right?
You have a prompt
and you send that into an LLM.
Out of that LLM, you get a response.
Now, there are two problems here.
That response is just words.
And if words are what you want, you're doing fine.
But what if you want to do something?
That's what agentic AI is all about.
You want to cause effects out in the world.
The AI needs to be able to take those actions
or invoke what we call tools.
It also needs
more up-to-date information
or maybe just broader information
than what's available in that
core foundation model.
And that's great.
This guy is here as an API
out on the internet.
You might have wrapped that with the so-called
Retrieval Augmented Generation pattern.
Some people say RAG is old and busted
and yesterday's news.
In fact, in enterprise context,
you may well be using this pattern
and there's not a thing in the world wrong with that
to bring the data of the enterprise
into the context that the LLM can work with.
Whether RAG is there or not
doesn't really matter.
The fact is there are going to be other resources
in our world that we're going to need
to get into the prompt,
into the scope of what the model can deal with.
And these can be anything.
Files.
It could be binaries.
You could have a database out there.
You could have things happening in a Kafka topic.
That's even a pretty likely
source of resources.
So there's just this data out in the world
that the agent needs to be aware of.
These are two things that are just not going to be present
in the base foundation model.
Now, let's talk about
a little bit of architecture.
What we're doing is building an agent.
You could think of it as a microservice.
There's nothing particularly exotic about this.
But, in MCP terms, this is called
the host application.
And the host application uses
the MCP client library
to create an instance of a client in there.
Out here, we're going to create an MCP server.
This may be a server that already exists
that somebody else has built
that we want to take advantage of
to bring agentic functionality into our service,
or this could be a server that we ourselves are creating.
Inside the server, what do we have?
Well, we've got access to tools, resources, prompts,
capabilities that the server makes available
and even describes to the outside world.
So, this is a server process.
There's a URL, port, etc,
and a variety of well-known RESTful endpoints
described by the MCP specification
that are implemented by this server,
including this capabilities list
that tells the world,
tells the host application,
tells the client,
whether there are tools present,
what sort of resources might be available,
what prompts it has, etc, etc.
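A minimal sketch of reading that capabilities list, assuming the shape used in the MCP specification, where the list comes back in the reply to a JSON-RPC `initialize` call. The server name and the values in the sample reply are made up for illustration.

```python
import json

# Hypothetical reply to an "initialize" call. The field names follow the MCP
# specification; the server name and version are invented for this example.
INITIALIZE_RESULT = json.loads("""
{
  "protocolVersion": "2024-11-05",
  "serverInfo": {"name": "appointments-server", "version": "0.1.0"},
  "capabilities": {"tools": {}, "resources": {"subscribe": true}}
}
""")


def advertised_features(initialize_result):
    """Return the sorted feature groups the server says it supports."""
    capabilities = initialize_result.get("capabilities", {})
    known = ("tools", "resources", "prompts")
    return sorted(name for name in known if name in capabilities)


print(advertised_features(INITIALIZE_RESULT))
```

A host would typically use this answer to decide which follow-up listings (tools, resources, prompts) are worth requesting at all.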
The connection between these two things,
between client and server,
can be two things.
It can be, interestingly, standard IO.
So if this is a process
running locally on my laptop
and I've got some
LLM host application, like say,
Claude Desktop or something,
that's something that shows up in a lot of the examples,
they can just communicate
via pipes and standard IO.
We don't want that.
That's not kind of what we're interested in
in the model I'm trying to give you here.
So, we also have as an option
HTTP and Server Sent Events,
and the messages being exchanged here
are going to be in JSON RPC.
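To make the message format concrete, here is a small sketch of JSON-RPC 2.0 framing. The helper names `make_request` and `parse_response` are made up for this example; `resources/list` is a method name from the MCP specification, and transport (standard IO or HTTP) is deliberately left out.

```python
import itertools
import json
from typing import Optional

# Each request carries a unique id so the reply can be matched to it.
_ids = itertools.count(1)


def make_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def parse_response(raw: str) -> dict:
    """Return the result of a JSON-RPC 2.0 response, or raise on an error reply."""
    message = json.loads(raw)
    if "error" in message:
        error = message["error"]
        raise RuntimeError(f"JSON-RPC error {error['code']}: {error['message']}")
    return message["result"]


print(make_request("resources/list"))
```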
Now, I will not apologize
for those technology choices
because I didn't make them.
Yes, they have raised
the occasional eyebrow,
but this is what we've got.
There's a little bit of sort of protocol
for a client announcing itself to the server
and then establishing communications.
There are ways for servers
to send asynchronous notifications back to the client.
We'll come back to that in a minute.
So, a relatively rich setup here
for client and server to talk.
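In JSON-RPC 2.0, a notification is simply a message with a method and no id, so nobody is expected to reply to it. A rough sketch of how a client might sort incoming messages follows; the function name `classify` is an assumption for this example, and `notifications/resources/list_changed` is a notification name from the MCP specification.

```python
def classify(message: dict) -> str:
    """Sort a decoded JSON-RPC 2.0 message into request, notification, or response."""
    if "method" in message:
        # A method with an id expects a reply; without an id it is fire-and-forget.
        return "request" if "id" in message else "notification"
    if "result" in message or "error" in message:
        return "response"
    return "invalid"


incoming = {"jsonrpc": "2.0", "method": "notifications/resources/list_changed"}
print(classify(incoming))
```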
But what does it do?
Let's walk through an example.
I think that would be helpful.
Let's say we're building a service
for making
appointments.
Sort of generalized meet with somebody,
some group of people at some place,
and not necessarily
a conference room in the office,
but maybe we're getting coffee.
Maybe we're getting breakfast.
Maybe it's a romantic dinner with your spouse.
I've just described a number of tools and resources
that are necessary to make that happen.
Let's think about that.
I need to create a calendar invite.
I need some kind of calendar API integration.
I need to see at least when my calendar is free,
I might need to make assumptions about the counterparty.
Maybe I can get access to their calendar as well,
depending on what I've got permissions for.
That would be kind of cool.
Places I might meet.
I suppose knowing about the calendar
might better fit under Resource.
Tool would be making the appointment,
maybe making a reservation at a restaurant,
knowing what restaurants,
what coffee shops,
what breakfast joints are in the area.
These are resources that I wanna make available
to my agentic application.
I could just do all that stuff,
do the calendar integration,
go talk to Yelp
or whatever APIs I wanna do
and bake that into my agent,
but then it's locked there
and nobody else can get at that
unless they've got that code.
So the whole idea of MCP
is I'm putting those things in here.
Let's go through a workflow
of how this might work.
A prompt comes in from the user,
and that's the actual input
and it's something like,
"I wanna have coffee with Peter next week."
Okay, well, if you just ask the LLM, it's like,
"Who's Peter? Where's coffee? I can't help you with this."
But here
we can start to do better.
This application, the host, the client,
whatever you wanna call it, can say,
"What capabilities do you have?"
It knows the URL of this agent.
You've had to tell it and maybe very tactically
there's a properties file somewhere
with a little list of the URLs of servers
that are registered with the agent.
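As a purely illustrative sketch of that "properties file with a little list of URLs" idea: the key names, the `example.com` hosts, and the `parse_server_registry` helper below are all assumptions, not part of MCP, which does not prescribe how a host learns about its servers.

```python
# Hypothetical registry file contents: one "name=url" entry per line.
REGISTRY_TEXT = """
# MCP servers this agent may talk to
calendar=https://calendar.example.com/mcp
places=https://places.example.com/mcp
"""


def parse_server_registry(text: str) -> dict:
    """Parse simple name=url lines, skipping blank lines and # comments."""
    servers = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so URLs with query strings stay intact.
        name, _, url = line.partition("=")
        servers[name.strip()] = url.strip()
    return servers


print(parse_server_registry(REGISTRY_TEXT))
```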
And so it can
interrogate the capabilities and see,
"Oh, you have resources.
Okay.
Let me get a list of your resources"
which will include text descriptions of each resource.
And it's important
when writing the server,
when building a server
to make those good.
I can take my prompt.
I don't know, I'm just a poor little agentic application.
I don't know how to figure out from the input
whether I need any of those resources,
but I can ask my model.
I can say,
"You know what, on pass number one,
I'll say, here is what my user said.
Here is a list of resources:
resource one, resource two, resource three.
Do I need these?"
We are telling the LLM,
"I got this request.
I have things like this.
Do you think I should go get anything from them?"
We submit this as a prompt up to our LLM
and it tells us in return,
"Yes, you need resource two.
That resource two, that list of coffee shops in the area,
that looks super interesting.
Please give me that."
And so now my client says,
"Oh, resource two?
I know where that is.
I'll just go ask my MCP server
for the details
of resource two."
Maybe passing some parameters, maybe not.
And then I will get that text back
or whatever that data is.
I'll get that back and serialize it as text
or otherwise attach it to my next prompt.
Where I say, again,
"Here is my user prompt.
And now here is the resource data."
And I provide that data in detail
and then ask,
"What should I do as a result?"
That's how I get the model
to help me interpret the resources.
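The two passes described above can be sketched roughly as follows. The resource URIs, the prompt wording, and the helper names are assumptions for illustration; the call to the model and the fetch from the MCP server are left out, so only the prompt building and reply parsing are shown.

```python
# Hypothetical resource listing, shaped like entries from an MCP resources/list reply.
RESOURCES = [
    {"uri": "calendar://me/free-busy", "name": "My free/busy times",
     "description": "When the user is free next week."},
    {"uri": "places://nearby/coffee", "name": "Coffee shops nearby",
     "description": "Coffee shops within walking distance of the office."},
]


def build_selection_prompt(user_text, resources):
    """Pass one: show the model the user's request and the resource descriptions."""
    lines = ["The user said:", user_text, "", "Available resources:"]
    for number, resource in enumerate(resources, start=1):
        lines.append(f"{number}. {resource['name']} ({resource['uri']}): {resource['description']}")
    lines += ["", "Reply with the URIs you need, one per line, or NONE."]
    return "\n".join(lines)


def parse_selection(reply, resources):
    """Keep only the URIs in the model's reply that the server actually listed."""
    known = {resource["uri"] for resource in resources}
    return [line.strip() for line in reply.splitlines() if line.strip() in known]


def build_followup_prompt(user_text, fetched):
    """Pass two: attach the fetched resource contents and ask what to do."""
    parts = ["The user said:", user_text, ""]
    for uri, text in fetched.items():
        parts += [f"Contents of {uri}:", text, ""]
    parts.append("What should I do as a result?")
    return "\n".join(parts)


print(build_selection_prompt("I wanna have coffee with Peter next week.", RESOURCES))
```

Filtering the reply against the known URIs is a deliberate choice here: the model's answer is treated as a suggestion, and the client only fetches what the server really advertised.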
How do I interpret the tools?
Well, the good news is,
so this call is gonna go
back to that same LLM
and the APIs now
for the foundation models,
the biggies,
I can actually put the description of the tools
in the API call.
I don't even have to mess with the prompt or anything.
It's structured data that goes in there,
the name of the tool, the URL,
the schema of the parameters, all that.
And in the reply, that tells me
if there's a tool I should invoke, it'll say,
"Yes, invoke this tool,
pass these parameters."
I don't have to write any of the code
to parse any of the stuff out
because I don't know how to do that.
That's all very difficult stuff
that LLMs are wonderful at.
And those APIs will help me with that tool invocation.
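A sketch of what that structured tool description might look like. The `name`, `description`, and `inputSchema` fields follow the MCP specification's tool listing; the `create_appointment` tool itself and the generic target shape with a `parameters` key are assumptions, since each model vendor names these fields a little differently.

```python
# Hypothetical tool entry, shaped like an item from an MCP tools/list reply.
MCP_TOOL = {
    "name": "create_appointment",
    "description": "Create a calendar invite with one attendee at a given place and time.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "attendee": {"type": "string"},
            "start": {"type": "string", "description": "ISO 8601 start time"},
            "place": {"type": "string"},
        },
        "required": ["attendee", "start"],
    },
}


def to_model_tool(mcp_tool):
    """Map an MCP tool entry onto a generic tool definition for a model API.

    The "parameters" key is illustrative; check the vendor's documentation
    for the exact field name it expects for the JSON Schema.
    """
    return {
        "name": mcp_tool["name"],
        "description": mcp_tool["description"],
        "parameters": mcp_tool["inputSchema"],
    }


print(to_model_tool(MCP_TOOL)["name"])
```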
They won't call them, okay?
ChatGPT, Claude, Gemini,
they're not gonna go invoke some URL inside my network
and go do something.
You know, that's a little bit Skynet-y there, right?
But they're gonna tell me,
"I recommend you do this."
And now my client code gets to
make the decision,
maybe asking the user first, maybe not,
to go call that tool and cause the effect out in the world.
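One way the client-side decision could look, as a sketch under assumptions: the shape of the model's suggestion, the `plan_tool_call` helper, and the `confirm` callback are invented here. Only the `tools/call` method name and its `name`/`arguments` parameters come from the MCP specification.

```python
def plan_tool_call(suggestion, tools_by_name, confirm, request_id=1):
    """Turn a model's tool suggestion into a tools/call request, or None to skip it."""
    tool = tools_by_name.get(suggestion["name"])
    if tool is None:
        return None  # The model suggested a tool the server never listed.
    arguments = suggestion.get("arguments", {})
    required = tool["inputSchema"].get("required", [])
    if any(key not in arguments for key in required):
        return None  # Missing required parameters; do not call.
    if not confirm(tool["name"], arguments):
        return None  # The user (or a policy) declined.
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool["name"], "arguments": arguments},
    }


# Hypothetical tool listing and model suggestion for the coffee example.
TOOLS = {
    "create_appointment": {
        "name": "create_appointment",
        "inputSchema": {"type": "object", "required": ["attendee", "start"]},
    }
}
SUGGESTION = {
    "name": "create_appointment",
    "arguments": {"attendee": "Peter", "start": "2025-06-03T10:00:00"},
}

print(plan_tool_call(SUGGESTION, TOOLS, confirm=lambda name, args: True))
```

Passing `confirm` in as a function keeps the "maybe asking the user first, maybe not" choice outside the dispatch logic.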
So you can kind of see how this works.
So instead of just baking all this code in here,
we have this that is now pluggable and discoverable.
I don't need to know very much about what this tool does.
I just plug it in.
I just say, "Hey, you have this agent registered with you,
go find out about it, go through this process,
and you get its functionality."
They're also composable.
The server itself can be a client.
So, let's say I had some data source
that I knew was in Kafka out there,
and I don't wanna go write a bunch of extra Kafka code
to go do that in here.
Well, I can just then go use,
let's say the Confluent MCP server
and connect to that topic
or even do actually a bunch more stuff.
It's a pretty cool MCP server.
If Kafka and Confluent are a part of your life,
it's good stuff.
But if I just need to consume from a Kafka topic,
this server itself gets to be a client of another server.
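The composability idea can be sketched in memory, with no networking at all. Everything here is hypothetical: the class names, the `kafka://` URI, and the sample data are stand-ins, and this is not the Confluent MCP server's actual interface.

```python
class UpstreamClient:
    """Stand-in for an MCP client connected to some other server."""

    def __init__(self, resources):
        self._resources = resources

    def read_resource(self, uri):
        return self._resources[uri]


class AppointmentServer:
    """A server that answers from its own data first, then asks its upstream."""

    def __init__(self, local_resources, upstream):
        self._local = local_resources
        self._upstream = upstream

    def read_resource(self, uri):
        if uri in self._local:
            return self._local[uri]
        # Acting as a client here: delegate to the other server.
        return self._upstream.read_resource(uri)


upstream = UpstreamClient({"kafka://topics/office-events": "Peter is out on Tuesday"})
server = AppointmentServer({"places://nearby/coffee": "Bean There"}, upstream)
print(server.read_resource("kafka://topics/office-events"))
```

From the host's point of view nothing changes: it still talks to one server and never needs to know the data came from somewhere else.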
So I've got pluggability, discoverability,
composability,
huge benefits.
These are things that we want
in our code.
So, I hope you can see now
how this really is a big deal.
There's a broader vision here
than just enhancing a desktop application
with some way to help me write code locally.
This is really a gateway
to building true agentic AI
in the enterprise,
in a professional setting.
That is really cool stuff.
So check it out, get started, links below with great help.
And as always, let me know in the comments what you build.
Thanks.