YouTube Transcript:
The Four Types of Memory Every AI Agent Needs

Summary

Core Theme

AI agents require distinct types of memory—working, semantic, procedural, and episodic—to function effectively, moving beyond simple chatbots by incorporating persistent knowledge and learned experiences.

AI agents have different ways to remember stuff and each serves a different purpose.

So let's take a look at the four main types of AI agent memory

from some pretty foundational stuff to what I think are some quite interesting emerging areas.

And I think it's worth first considering how we do it.

How does human memory actually work?

We can think of human memory as having first of all short-term memory.

So that's the stuff that is active in the brain right now like what I'm saying at this very moment.

That's one type of memory but there's also a type of memory called factual knowledge.

So this is things like the company security policies that you remember or

it could be facts like Python is an interpreted language.

Then there are learned skills.

Like, I don't know, writing backwards on a sheet of glass, for example, which I am totally doing here.

There is absolutely no camera trickery involved.

And then there is personal experience.

Like the time I spent three hours debugging a Kubernetes cluster only to discover...

I was pointing at the wrong cluster the entire time.

Seriously, that was three hours of my time.

Anyway, anyway, it turns out that well-designed AI agents,

they also need these four types of memory that I've got here.

And there's actually a well-known framework for this.

And it's from a Princeton research team and they gave it the name of CoALA.

That's Cognitive Architectures for Language Agents, and CoALA maps out four distinct types of memory that agents need.

So let's walk through each one and see how they actually work in real agentic systems today.

So type one, that is working memory.

This is the agent's context window.

It's everything the agent can see right now, the current conversation,

if there's any system instructions, they'll be in there.

If there's any files or data that have been loaded into the prompt, that's where they'll be as well.

So it's really kind of the scratch pad.

And the analogy everybody uses for this is this is just like

RAM, random access memory.

It's fast and immediately accessible, but it's volatile.

When the session ends, it's gone.

And it's also limited in size.

I mean, the biggest context windows available today are pretty big.

I mean, it could be like one million tokens or even more than that,

but that still has a ceiling, and if you try to stuff too much in there, performance

is gonna degrade as the model starts losing track of things that are kind of buried in the middle of the context window.

So every agent has working memory, but then so does every chatbot; it's just the context window.
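
To make the RAM analogy concrete, here is a minimal sketch of working memory as a bounded buffer. Everything in it is an assumption for illustration: the `WorkingMemory` class, the word-count stand-in for a real tokenizer, and the policy of dropping the oldest turns first are not taken from any particular agent framework.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


class WorkingMemory:
    """Hypothetical context window with a fixed token budget."""

    def __init__(self, system_prompt: str, max_tokens: int):
        self.system_prompt = system_prompt
        self.max_tokens = max_tokens
        self.turns = deque()  # (role, text) pairs, oldest first

    def used_tokens(self) -> int:
        turn_tokens = sum(estimate_tokens(text) for _, text in self.turns)
        return estimate_tokens(self.system_prompt) + turn_tokens

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))
        # Assumed policy: evict the oldest turns until we fit the budget,
        # but always keep the most recent turn.
        while self.used_tokens() > self.max_tokens and len(self.turns) > 1:
            self.turns.popleft()

    def render(self) -> str:
        """Everything the model would 'see right now'."""
        lines = [f"system: {self.system_prompt}"]
        lines += [f"{role}: {text}" for role, text in self.turns]
        return "\n".join(lines)
```

Nothing here survives the end of the process, which is the volatility described above. Real systems count tokens with the model's own tokenizer and often summarize old turns rather than simply dropping them.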

So the question is...

What else do agentic systems need?

Well let me add to that list number two semantic memory and this is the agent's knowledge base,

so semantic memory stores facts and rules and conventions,

documentation and in the academic literature this often gets described in terms of

things like vector databases or as knowledge graphs, and yeah, those are real implementations, but,

in a lot of production agentic systems today, semantic memory is something much simpler than that.

It's just simply Markdown files, .md files.

So take Claude Code as an example of this.

So it has one of these Markdown files.

It's called CLAUDE.md and that sits in the root of a project.

And that file contains the project architecture, the coding conventions,

the build commands, what frameworks to use, and also what not to do.

And that file gets loaded into the context window at the start of every session.

So semantic memory tells the agent what it needs to know in general.

And without it, the agent is, well, it's kind of destined to make the same mistakes over and over again,

because it has no persistent knowledge to draw from.
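
The "load a Markdown file at session start" idea can be sketched in a few lines. This is only an illustration of the pattern, not how any specific tool implements it: the function names, the `# Project knowledge` heading, and the sample file contents are all assumptions.

```python
import tempfile
from pathlib import Path


def load_semantic_memory(project_root: Path, filename: str = "CLAUDE.md") -> str:
    """Read the project's knowledge file if it exists, else return an empty string."""
    path = project_root / filename
    return path.read_text(encoding="utf-8") if path.is_file() else ""


def build_system_prompt(base_instructions: str, project_root: Path) -> str:
    """Prepend persistent project knowledge to every new session's prompt."""
    knowledge = load_semantic_memory(project_root)
    if not knowledge:
        return base_instructions
    return f"{base_instructions}\n\n# Project knowledge\n{knowledge}"


def demo() -> str:
    """Create a throwaway project with a knowledge file and build a prompt from it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "CLAUDE.md").write_text("Run tests with pytest.\n", encoding="utf-8")
        return build_system_prompt("You are a coding agent.", root)
```

The point of the design is that the file lives on disk between sessions, so the knowledge persists even though the context window does not.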

Working memory, semantic memory, what else is there?

Well, number three, that is procedural memory.

Now procedural memory is how the agent knows how to do things.

And there's an open standard for this that's called Agent Skills.

And it uses a file format called SKILL.md.

A skill is just a folder with a markdown file that describes the skill and what that skill does

and some step-by-step instructions for how to perform that skill

and it could be anything from creating a PowerPoint presentation to running a structured code review.

Now skills use something called progressive disclosure so the agent doesn't load all of its skills into the context window

or I guess I should say into the working memory

at once because that can blow through the working memory budget if there are a lot of defined agent skills.

So instead the agent just sees a lightweight index, which is just the name and the description of each available skill.

So maybe that's a hundred tokens per skill.

Then when a task comes in that matches one of these skill descriptions,

the agent loads the full instructions, and if the skill references other stuff like other files or templates or scripts,

well, those only get pulled in when the agent actually needs them during execution.

So the agent advertises what skills it has,

it loads the instructions in when they're needed and then executes with the

additional resources pulled in as they're needed as well.

And all that is quite different from semantic memory where the knowledge is always present in context.
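
Progressive disclosure has three tiers: index, instructions, resources. Here is a rough sketch of that layering. To keep it runnable, skills are held in memory rather than read from skill folders on disk, and the `Skill` and `SkillRegistry` names are hypothetical. Deciding which skill matches a task is left to the model and is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str
    instructions: str  # full step-by-step body, loaded only on a match
    resources: dict[str, str] = field(default_factory=dict)  # templates, scripts, ...


class SkillRegistry:
    """Hypothetical registry illustrating the three disclosure tiers."""

    def __init__(self, skills: list[Skill]):
        self._skills = {skill.name: skill for skill in skills}

    def index(self) -> str:
        """Tier 1: the lightweight listing that is always in the context window."""
        return "\n".join(
            f"- {skill.name}: {skill.description}" for skill in self._skills.values()
        )

    def load_instructions(self, name: str) -> str:
        """Tier 2: pulled in only when a task matches the skill description."""
        return self._skills[name].instructions

    def load_resource(self, name: str, resource: str) -> str:
        """Tier 3: pulled in only when execution actually needs the file."""
        return self._skills[name].resources[resource]
```

Only the output of `index()` costs tokens up front, so the working memory budget grows with the number of skills in use rather than the number of skills defined.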

All right, number four is episodic memory.

Episodic memory is the agent's record of what happened in past interactions and past decisions and what it learned from them.

Now a naive implementation of this is just to save every conversation transcript

and then just search through them as you need to.

And that technically counts as episodic memory but often that's not very useful.

So what production systems actually do is a bit more distillation.

So as the agent works across sessions it kind of accumulates notes for itself, but it doesn't save everything.

It decides what's worth remembering based on whether that information would actually be useful in a future conversation.

So the result is distilled or compressed experience.

So things like last time we debugged the auth module, the issue was in the middleware layer.

That's something that's a lot more useful to remember than just a full transcript of a 45 minute debugging session.

And this is where memory starts to kind of genuinely look like learning because the agent is gonna get better over time.

But episodic memory is also the hardest type of these to get right because what do you delete?

When does information become obsolete?

If a user changes jobs, do you keep the old project memories around?

Or should we forget them?

Well, humans are actually pretty good at forgetting.

I do it all the time.

And as frustrating as that can be, it can be quite useful.

But for agents, forgetting is an engineering problem.
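
Here is a small sketch of distilled episodic memory with one possible forgetting rule. The `useful_later` flag stands in for the judgment the agent would make itself, and age-based expiry is just one assumed policy; the transcript does not prescribe a particular one. All names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Episode:
    note: str  # the distilled takeaway, not the full transcript
    topic: str
    recorded_at: datetime


class EpisodicMemory:
    """Hypothetical store of short notes the agent keeps across sessions."""

    def __init__(self, max_age: timedelta):
        self.max_age = max_age
        self.episodes: list[Episode] = []

    def remember(self, note: str, topic: str, useful_later: bool, now: datetime) -> None:
        # Only keep notes judged worth having in a future conversation.
        if useful_later:
            self.episodes.append(Episode(note, topic, now))

    def forget_stale(self, now: datetime) -> None:
        # Assumed forgetting policy: drop anything older than max_age.
        self.episodes = [
            episode for episode in self.episodes
            if now - episode.recorded_at <= self.max_age
        ]

    def recall(self, topic: str) -> list[str]:
        return [episode.note for episode in self.episodes if episode.topic == topic]
```

A pure age cutoff is easy to implement but blunt. The harder questions raised above, such as whether a job change makes old project memories obsolete, need signals beyond a timestamp.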

So four types of memory, working, semantic, procedural, episodic, but not every agent necessarily needs all four.

Let me give you an example.

So let's say, we're building a simple reflex agent.

So that's something like a thermostat or it's just like a basic routing bot.

Doesn't need all four.

It might only need access to working memory, the context window, and that's basically it.

Now, if we take something a little bit more complicated like a customer support agent,

but one that's still fairly simple and narrow, like an agent that resets passwords, for example.

That will still have access to the working memory, of course,

but it probably also needs access to procedural memory as well because it needs to recall the password reset skill.

But that might be it.

Whereas if we take a look at something like a coding agent, it probably needs access to all four,

so it certainly needs access to the working memory, the context window,

but it also needs the project knowledge it would get from semantic

and then the skill system from procedural and also the auto memory from episodic that learns across sessions.
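
The three examples above can be written down as a simple lookup. This is purely illustrative: the agent names and the `needs` helper are made up for this sketch, and the mapping just restates what the examples say.

```python
from enum import Flag, auto


class Memory(Flag):
    WORKING = auto()
    SEMANTIC = auto()
    PROCEDURAL = auto()
    EPISODIC = auto()


# Which memory types each example agent is described as needing.
AGENT_MEMORY = {
    "reflex_agent": Memory.WORKING,
    "password_reset_agent": Memory.WORKING | Memory.PROCEDURAL,
    "coding_agent": (
        Memory.WORKING | Memory.SEMANTIC | Memory.PROCEDURAL | Memory.EPISODIC
    ),
}


def needs(agent: str, memory: Memory) -> bool:
    """Return True if the named agent is configured with the given memory type."""
    return memory in AGENT_MEMORY[agent]
```

Treating memory types as optional flags keeps simple agents simple: each one only pays for the stores it actually uses.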

So memory is really what separates a chatbot

from an agent because a chatbot gives a response, but an agent can give a response shaped by persistent knowledge

and accumulated experience.

It remembers the project.

It remembers preferences and a good memory architecture also remembers the mistakes

so we're not destined to repeat them

which honestly would have been wonderful if an agent had told me about that Kubernetes cluster before hour three.

So four types of AI agent memory.

Which of these are you using in your own agentic workflows?

Thank you.

https://www.youtube.com/watch?v=UF8uR6Z6KLc
