AI agents require distinct types of memory—working, semantic, procedural, and episodic—to function effectively, moving beyond simple chatbots by incorporating persistent knowledge and learned experiences.
AI agents have different ways to remember stuff and each serves a different purpose.
So let's take a look at the four main types of AI agent memory
from some pretty foundational stuff to what I think are some quite interesting emerging areas.
And I think it's worth first considering how we do it.
How does human memory actually work?
We can think of human memory as having first of all short-term memory.
So that's the stuff that is active in the brain right now like what I'm saying at this very moment.
That's one type of memory but there's also a type of memory called factual knowledge.
So this is things like the company security policies that you remember or
it could be facts like Python is an interpreted language.
Then there are learned skills.
Like, I don't know, writing backwards on a sheet of glass, for example, which I am totally doing here.
There is absolutely no camera trickery involved.
And then there is personal experience.
Like the time I spent three hours debugging a Kubernetes cluster only to discover...
I was pointing at the wrong cluster the entire time.
Seriously, that was three hours of my time.
Anyway, it turns out that well-designed AI agents
also need these four types of memory that I've got here.
And there's actually a well-known framework for this.
And it's from a Princeton research team and they gave it the name of CoALA.
That's Cognitive Architectures for Language Agents, and CoALA maps out four distinct types of memory that agents need.
So let's walk through each one and see how they actually work in real agentic systems today.
So type one, that is working memory.
This is the agent's context window.
It's everything the agent can see right now: the current conversation,
any system instructions,
and any files or data that have been loaded into the prompt.
So it's really kind of the scratch pad.
And the analogy everybody uses for this is that it's just like
RAM, random access memory.
It's fast and immediately accessible, but it's volatile.
When the session ends, it's gone.
And it's also limited in size.
I mean, the biggest context windows available today are pretty big,
something like one million tokens or even more than that,
but that still has a ceiling. Try to stuff too much in there and performance
is gonna degrade as the model starts losing track of things that are buried in the middle of the context window.
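To make that ceiling concrete, here is a minimal sketch of one way an agent might keep its working memory under a token budget. The `Message` type, the `fit_to_budget` helper, and the word-count token estimate are all assumptions made for illustration; a real system would use the model's own tokenizer and may well use a smarter strategy than dropping the oldest turns.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(text.split())


def fit_to_budget(messages: list[Message], budget: int) -> list[Message]:
    """Keep system messages, then the most recent turns that still fit."""
    system = [m for m in messages if m.role == "system"]
    turns = [m for m in messages if m.role != "system"]
    used = sum(estimate_tokens(m.content) for m in system)
    kept: list[Message] = []
    for m in reversed(turns):  # walk from newest to oldest
        cost = estimate_tokens(m.content)
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return system + list(reversed(kept))


history = [
    Message("system", "You are a helpful coding agent."),
    Message("user", "first question about the build"),
    Message("assistant", "first answer about the build"),
    Message("user", "second question about the tests"),
]
window = fit_to_budget(history, budget=12)
```

With a budget of 12, only the system message and the latest user turn survive; the older turns fall out of working memory.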
So every agent has working memory, but then so does every chatbot. It's just the context window.
So the question is...
What else do agentic systems need?
Well, let me add number two to that list: semantic memory. This is the agent's knowledge base.
Semantic memory stores facts, rules, conventions, and documentation.
In the academic literature this often gets described in terms of
vector databases or knowledge graphs, and yeah, those are real implementations,
but in a lot of production agentic systems today, semantic memory is something much simpler than that.
It's simply Markdown files, .md files.
So take Claude Code as an example of this.
It has one of these Markdown files.
It's called CLAUDE.md and it sits in the root of a project.
And that file contains the project architecture, the coding conventions,
the build commands, what frameworks to use, and also what not to do.
And that file gets loaded into the context window at the start of every session.
So semantic memory tells the agent what it needs to know in general.
And without it, the agent is, well, it's kind of destined to make the same mistakes over and over again,
because it has no persistent knowledge to draw from.
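As a rough sketch of that idea, the snippet below reads a Markdown knowledge file from the project root and prepends it to the system prompt at session start. The `build_system_prompt` function and the "Project knowledge" heading are hypothetical names chosen for this example; this is not how Claude Code itself is implemented, just the general pattern.

```python
from pathlib import Path
import tempfile


def build_system_prompt(project_root: Path, base: str,
                        filename: str = "CLAUDE.md") -> str:
    """Prepend the project's knowledge file to the base instructions, if present."""
    memory_file = project_root / filename
    if not memory_file.is_file():
        return base  # no semantic memory available for this project
    knowledge = memory_file.read_text(encoding="utf-8").strip()
    return f"{base}\n\n# Project knowledge\n{knowledge}"


# Demo with a throwaway project directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "CLAUDE.md").write_text(
        "- Build with `make build`\n- Never commit directly to main\n",
        encoding="utf-8",
    )
    prompt = build_system_prompt(root, "You are a coding agent.")
```

Because the file is loaded on every session, anything written there is always in context, which is also why it pays to keep it short.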
Working memory, semantic memory, what else is there?
Well, number three, that is procedural memory.
Now procedural memory is how the agent knows how to do things.
And there's an open standard for this that's called Agent Skills.
And it uses a file format called SKILL.md.
A skill is just a folder with a markdown file that describes the skill and what that skill does
and some step-by-step instructions for how to perform that skill
and it could be anything from creating a PowerPoint presentation to running a structured code review.
Now skills use something called progressive disclosure so the agent doesn't load all of its skills into the context window
or I guess I should say into the working memory
at once because that can blow through the working memory budget if there are a lot of defined agent skills.
So instead the agent just sees a lightweight index, which is just the name and the description of each available skill.
So maybe that's a hundred tokens per skill.
Then when a task comes in that matches one of these skill descriptions,
the agent loads the full instructions. And if the skill references other stuff like other files or templates or scripts,
well, those only get pulled in when the agent actually needs them during execution.
So the agent advertises what skills it has,
it loads the instructions in when they're needed and then executes with the
additional resources pulled in as they're needed as well.
And all that is quite different from semantic memory where the knowledge is always present in context.
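Progressive disclosure can be sketched in a few lines. Everything here is illustrative: the `Skill` dataclass, `build_index`, and the word-overlap `match_skill` function are assumptions for this example. In a real agent the model itself decides which skill applies by reading the index, and the instructions would be read from each skill's Markdown file on disk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Skill:
    name: str
    description: str
    instructions: str  # full body; only loaded on demand


def build_index(skills: list[Skill]) -> str:
    # Only the name and description go into working memory up front.
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def match_skill(task: str, skills: list[Skill]) -> Optional[Skill]:
    # Toy matcher: pick the skill whose description shares the most words
    # with the task. A real agent would let the model choose.
    task_words = set(task.lower().split())
    best, best_score = None, 0
    for s in skills:
        score = len(task_words & set(s.description.lower().split()))
        if score > best_score:
            best, best_score = s, score
    return best


skills = [
    Skill("code-review", "run a structured code review", "1. Read the diff..."),
    Skill("slides", "create a PowerPoint presentation", "1. Pick a template..."),
]
index = build_index(skills)                       # always in context
chosen = match_skill("please review this code change", skills)
context = index + ("\n\n" + chosen.instructions if chosen else "")
```

The index stays small no matter how long each skill's instructions are; only the matched skill's body is added to the context.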
All right, number four is episodic memory.
Episodic memory is the agent's record of what happened in past interactions and past decisions and what it learned from them.
Now a naive implementation of this is just to save every conversation transcript
and then just search through them as you need to.
And that technically counts as episodic memory but often that's not very useful.
So what production systems actually do is a bit more distillation.
So as the agent works across sessions it kind of accumulates notes for itself, but it doesn't save everything.
It decides what's worth remembering based on whether that information would actually be useful in a future conversation.
So the result is distilled or compressed experience.
So things like last time we debugged the auth module, the issue was in the middleware layer.
That's something that's a lot more useful to remember than just a full transcript of a 45 minute debugging session.
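Here is a minimal sketch of distilled episodic memory, under the assumption that something upstream (typically the model) has already judged whether a lesson is worth keeping. The `Note` and `EpisodicMemory` classes and the `useful_later` flag are hypothetical names for this example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Note:
    topic: str
    lesson: str
    recorded: date


@dataclass
class EpisodicMemory:
    notes: list[Note] = field(default_factory=list)

    def remember(self, topic: str, lesson: str,
                 useful_later: bool, today: date) -> None:
        # Only keep what is judged worth recalling in a future session.
        if useful_later:
            self.notes.append(Note(topic, lesson, today))

    def recall(self, topic: str) -> list[str]:
        return [n.lesson for n in self.notes if n.topic == topic]


memory = EpisodicMemory()
memory.remember("auth", "The bug was in the middleware layer.",
                useful_later=True, today=date(2025, 1, 10))
memory.remember("auth", "User said good morning.",
                useful_later=False, today=date(2025, 1, 10))
lessons = memory.recall("auth")
```

Only the one-line lesson is stored, not the transcript of the session that produced it.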
And this is where memory starts to kind of genuinely look like learning because the agent is gonna get better over time.
But episodic memory is also the hardest type of these to get right because what do you delete?
When does information become obsolete?
If a user changes jobs, do you keep the old project memories around?
Or should we forget them?
Well, humans are actually pretty good at forgetting.
I do it all the time.
And as frustrating as that can be, it can be quite useful.
But for agents, forgetting is an engineering problem.
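One very simple forgetting policy is age-based expiry, sketched below. This is just one possible approach, not something prescribed here; the `prune` function and the 90-day cutoff are assumptions for illustration, and real systems may weigh relevance or explicit user signals instead.

```python
from datetime import date, timedelta


def prune(notes: list[tuple[date, str]], today: date,
          max_age_days: int) -> list[tuple[date, str]]:
    """Drop notes older than max_age_days; keep the rest in order."""
    cutoff = today - timedelta(days=max_age_days)
    return [(d, text) for d, text in notes if d >= cutoff]


notes = [
    (date(2024, 1, 5), "Old project used Jenkins."),
    (date(2025, 3, 1), "Auth bug was in the middleware layer."),
]
fresh = prune(notes, today=date(2025, 3, 15), max_age_days=90)
```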
So four types of memory, working, semantic, procedural, episodic, but not every agent necessarily needs all four.
Let me give you an example.
So let's say we're building a simple reflex agent,
something like a thermostat or a basic routing bot.
It doesn't need all four.
It might only need access to working memory, the context window, and that's basically it.
Now, if we take something a little bit more complicated like a customer support agent,
but one that's still fairly simple and narrow, like an agent that resets passwords, for example.
That will still have access to the working memory, of course,
but it probably also needs access to procedural memory as well because it needs to recall the password reset skill.
But that might be it.
Whereas if we take a look at something like a coding agent, it probably needs access to all four.
It certainly needs access to the working memory, the context window,
but it also needs the product knowledge it would get from semantic memory,
the skill system from procedural memory, and the auto memory from episodic memory that learns across sessions.
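Those three examples can be written down as a small lookup table. The `Memory` flag, the `AGENT_MEMORY` mapping, and the agent names are illustrative labels for this sketch, not part of any framework.

```python
from enum import Flag, auto


class Memory(Flag):
    WORKING = auto()
    SEMANTIC = auto()
    PROCEDURAL = auto()
    EPISODIC = auto()


# Which memory types each example agent relies on.
AGENT_MEMORY = {
    "reflex": Memory.WORKING,
    "password_reset": Memory.WORKING | Memory.PROCEDURAL,
    "coding": (Memory.WORKING | Memory.SEMANTIC
               | Memory.PROCEDURAL | Memory.EPISODIC),
}


def needs(agent: str, kind: Memory) -> bool:
    return kind in AGENT_MEMORY[agent]
```

For example, `needs("password_reset", Memory.PROCEDURAL)` is true, while `needs("reflex", Memory.SEMANTIC)` is false.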
So memory is really what separates a chatbot
from an agent, because a chatbot gives a response, but an agent can give a response shaped by persistent knowledge
and accumulated experience.
It remembers the project.
It remembers preferences, and a good memory architecture also remembers the mistakes
so we're not destined to repeat them,
which honestly would have been wonderful if an agent had told me about that Kubernetes cluster before hour three.
So four types of AI agent memory.
Which of these are you using in your own agentic workflows?
Thank you.