0:07 AI. AI. AI. AI. AI.
0:10 AI. You know, more agentic. Agentic
0:12 capabilities. An AI agent. Agents.
0:15 Agentic workflows. Agents. Agents.
0:19 Agent. Agent. Agent. Agent. Agentic.
0:20 All right. Most explanations of AI
0:23 agents are either too technical or too
0:26 basic. This video is meant for people
0:28 like myself. You have zero technical
0:30 background, but you use AI tools
0:33 regularly and you want to learn just
0:36 enough about AI agents to see how it
0:38 affects you. In this video, we'll follow
0:41 a simple one, two, three learning path
0:43 by building on concepts you already
0:46 understand like ChatGPT and then moving
0:49 on to AI workflows and then finally AI
0:52 agents. All the while using examples you
0:55 will actually encounter in real life.
0:56 And believe me when I tell you those
0:58 intimidating terms you see everywhere
1:02 like RAG or ReAct, they're a lot
1:04 simpler than you think. Let's get
1:05 started. Kicking things off at level
1:08 one, large language models. Popular AI
1:10 chatbots like ChatGPT, Google Gemini, and
1:14 Claude are applications built on top of
1:17 large language models, LLMs, and they're
1:19 fantastic at generating and editing
1:21 text. Here's a simple visualization.
1:24 You, the human, provide an input and
1:27 the LLM produces an output based on its
1:29 training data. For example, if I were to
1:31 ask ChatGPT to draft an email
1:33 requesting a coffee chat, my prompt is
1:36 the input and the resulting email that's
1:37 way more polite than I would ever be in
1:40 real life is the output. So far so good,
1:43 right? Simple stuff. But what if I asked
1:47 ChatGPT when my next coffee chat is?
1:49 Even without seeing the response, both
1:52 you and I know ChatGPT is gonna fail
1:53 because it doesn't know that
1:56 information. It doesn't have access to
1:58 my calendar. This highlights two key
2:00 traits of large language models. First,
2:02 despite being trained on vast amounts of
2:04 data, they have limited knowledge of
2:07 proprietary information like our
2:09 personal information or internal company
2:12 data. Second, LLMs are passive. They
2:14 wait for our prompt and then respond.
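That passive input-output relationship can be sketched in a few lines of Python. This is a toy stand-in, not a real API client: the `llm` function and its canned replies are invented for illustration.

```python
# Toy model of a large language model: it sits idle until it
# receives a prompt, then produces text based only on what it
# "knows" from training. Both replies here are invented.
def llm(prompt: str) -> str:
    if "coffee chat" in prompt and "when" in prompt.lower():
        # No access to my calendar, so it can only admit it doesn't know.
        return "I don't have access to your calendar."
    return "Hi! Would you be open to a short coffee chat next week?"

# The LLM is passive: nothing happens until we prompt it.
print(llm("Draft an email requesting a coffee chat"))
print(llm("When is my next coffee chat?"))
```

The second call fails for exactly the reason in the video: the model has no route to proprietary data like a personal calendar.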
2:17 Right? Keep these two traits in mind
2:19 moving forward. Moving to level two, AI
2:21 workflows. Let's build on our example.
2:25 What if I, a human, told the LLM, "Every
2:26 time I ask about a personal event,
2:29 perform a search query and fetch data
2:31 from my Google calendar before providing
2:33 a response." With this logic
2:35 implemented, the next time I ask, "When
2:38 is my coffee chat with Elon Husky?" I'll
2:40 get the correct answer because the LLM
2:42 will now first go into my Google
2:45 calendar to find that information. But
2:48 here's where it gets tricky. What if my
2:50 next follow-up question is, "What will
2:53 the weather be like that day?" The LLM
2:55 will now fail at answering the query
2:57 because the path we told the LLM to
3:00 follow is to always search my Google
3:02 calendar, which does not have
3:04 information about the weather. This is a
3:07 fundamental trait of AI workflows. They
3:10 can only follow predefined paths set by
3:12 humans. And if you want to get
3:15 technical, this path is also called the
3:17 control logic. Pushing my example
3:20 further, what if I added more steps into
3:22 the workflow by allowing the LLM to
3:24 access the weather via an API and then
3:26 just for fun use a text to audio model
3:28 to speak the answer. The weather
3:31 forecast for seeing Elon Husky is sunny
3:33 with a chance of being a good boy.
3:35 Here's the thing. No matter how many
3:39 steps we add, this is still just an AI
3:41 workflow. Even if there were hundreds or
3:44 thousands of steps, if a human is the
3:47 decision maker, there is no AI agent
3:49 involvement. Pro tip: retrieval
3:52 augmented generation, or RAG, is a fancy
3:54 term that's thrown around a lot. In
3:56 simple terms, RAG is a process that
3:58 helps AI models look things up before
4:00 they answer, like accessing my calendar
4:03 or the weather service. Essentially, RAG
4:06 is just a type of AI workflow. By the
4:07 way, I have a free AI toolkit that cuts
4:09 through the noise and helps you master
4:10 essential AI tools and workflows. I'll
4:12 leave a link to that down below. Here's
4:14 a real world example. Following Helena
4:17 Liu's amazing tutorial, I created a
4:19 simple AI workflow using make.com. Here
4:21 you can see that first I'm using Google
4:23 Sheets to do something. Specifically,
4:25 I'm compiling links to news articles in
4:28 a Google sheet. And this is that Google
4:31 sheet. Second, I'm using Perplexity to
4:34 summarize those news articles. Then
4:36 using a prompt that I wrote, I'm asking
4:38 Claude to draft a LinkedIn and an
4:42 Instagram post. Finally, I
4:44 can schedule this to run automatically
4:46 every day at 8 a.m. As you can see, this
4:49 is an AI workflow because it follows a
4:52 predefined path set by me. Step one, you
4:55 do this. Step two, you do this. Step
4:57 three, you do this. And finally,
4:59 remember to run daily at 8 am. One last
5:02 thing, if I test this workflow and I
5:05 don't like the final output of the
5:08 LinkedIn post, for example, as you can
5:10 see right here, it's not funny
5:11 enough and I'm naturally hilarious,
5:16 right? I'd have to manually go back and
5:20 rewrite the prompt for Claude. Okay? And
5:23 this trial and error iteration is
5:25 currently being done by me, a human. So
5:27 keep that in mind moving forward. All
5:29 right, level three, AI agents.
5:31 Continuing the make.com example, let's
5:33 break down what I've been doing so far
5:36 as the human decision maker. With the
5:37 goal of creating social media posts
5:39 based off of news articles, I need to do
5:43 two things. First, reason or think about
5:44 the best approach. I need to first
5:46 compile the news articles, then
5:48 summarize them, then write the final
5:51 posts. Second, take action using tools.
5:53 I need to find and link to those news
5:55 articles in Google Sheets. Use
5:58 Perplexity for real-time summarization
6:00 and then Claude for copywriting. So,
6:01 and this is the most important sentence
6:04 in this entire video. The one massive
6:06 change that has to happen in order for
6:09 this AI workflow to become an AI agent
6:13 is for me, the human decision maker, to
6:16 be replaced by an LLM. In other words,
6:19 the AI agent must reason. What's the
6:20 most efficient way to compile these news
6:22 articles? Should I copy and paste each
6:26 article into a Word document? No, it's
6:26 probably easier to compile links to
6:28 those articles and then use another tool
6:30 to fetch the data. Yes, that makes more
6:34 sense. The AI agent must act, aka do
6:37 things via tools. Should I use Microsoft
6:39 Word to compile links? No. Inserting
6:41 links directly into rows is way more
6:44 efficient. What about Excel? Hmm, the
6:45 user has already connected their Google
6:47 account with make.com, so Google Sheets
6:49 is a better option. Pro tip: because of
6:51 this, the most common configuration for
6:55 AI agents is the ReAct framework. All AI
6:59 agents must reason and act. So:
7:01 ReAct. Sounds simple once we break it
7:03 down, right? A third key trait of AI
7:06 agents is their ability to iterate.
7:08 Remember when I had to manually rewrite
7:10 the prompt to make the LinkedIn post
7:13 funnier? I, the human, probably need to
7:15 repeat this iterative process a few
7:17 times to get something I'm happy with,
7:19 right? An AI agent will be able to do
7:22 the same thing autonomously. In our
7:25 example, the AI agent would autonomously
7:28 add in another LLM to critique its own
7:30 output. Okay, I've drafted V1 of a
7:32 LinkedIn post. How do I make sure it's
7:34 good? Oh, I know. I'll add another step
7:36 where an LLM will critique the post based
7:38 on LinkedIn best practices. And let's
7:40 repeat this until the best practices
7:42 criteria are all met. And after a few
7:45 cycles of that, we have the final
7:47 output. That was a hypothetical example.
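The reason-act-iterate loop just described can be sketched like this. Everything here is a hypothetical stand-in: in a real agent, `draft_post`, `critique`, and `revise` would each be LLM calls, and the "best practices" check is invented so the loop itself stays visible.

```python
# Sketch of an agent-style iteration loop: draft, critique,
# revise, and repeat until the critique passes or we hit a cap.
# All three helpers are invented stand-ins for LLM calls.

def draft_post(topic: str) -> str:
    return f"V1 of a LinkedIn post about {topic}"

def critique(post: str) -> bool:
    # Pretend "LinkedIn best practices" check: pass at version 3.
    return "V3" in post

def revise(post: str) -> str:
    # Bump the version number to simulate an improved draft.
    version = int(post[1]) + 1
    return f"V{version}" + post[2:]

def agent(topic: str, max_iterations: int = 5) -> str:
    post = draft_post(topic)
    for _ in range(max_iterations):
        if critique(post):   # observe the interim result
            break            # criteria met: stop iterating
        post = revise(post)  # iterate autonomously
    return post

print(agent("AI agents"))
```

The key difference from the workflow: the human only supplies the goal, and the loop itself decides how many passes to make before the output is good enough.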
7:50 So let's move on to a real world AI
7:53 agent example. Andrew Ng is a preeminent
7:55 figure in AI and he created this demo
7:58 website that illustrates how an AI agent
8:00 works. I'll link the full video down
8:02 below, but when I search for a keyword
8:07 like "skier" and hit enter, the AI vision agent in
8:10 the background is first reasoning what a
8:12 skier looks like. A person on skis going
8:14 really fast in snow, for example, right?
8:18 I'm not sure. And then it's acting by
8:22 looking at clips in video footage,
8:24 trying to identify what it thinks a
8:29 skier is, indexing that clip, and then
8:32 returning that clip to us. Although this
8:34 might not feel impressive, remember that
8:36 an AI agent did all that instead of a
8:39 human reviewing the footage beforehand,
8:42 manually identifying the skier, and
8:45 adding tags like skier, mountain, ski,
8:47 snow. The programming is obviously a lot
8:49 more technical and complicated than what
8:51 we see in the front end, but that's the
8:53 point of this demo, right? The average
8:56 user like myself wants a simple app that
8:58 just works without me having to
9:00 understand what's going on in the back
9:02 end. Speaking of examples, I'm also
9:05 building my very own basic AI agent
9:07 using n8n. So, let me know in the
9:08 comments what type of AI agent you'd
9:11 like me to make a tutorial on next. To
9:12 wrap up, here's a simplified
9:14 visualization of the three levels we
9:17 covered today. Level one, we provide an
9:19 input and the LLM responds with an
9:22 output. Easy. Level two, for AI
9:24 workflows, we provide an input and tell
9:27 the LLM to follow a predefined path that
9:29 may involve retrieving information
9:31 from external tools. The key trait here
9:34 is that the human programs a path for the LLM
9:37 to follow. Level three, the AI agent
9:39 receives a goal and the LLM performs
9:41 reasoning to determine how best to
9:44 achieve the goal, takes action using
9:46 tools to produce an interim result,
9:48 observes that interim result, and
9:51 decides whether iterations are required,
9:53 and produces a final output that
9:56 achieves the initial goal. The key trait
9:58 here is that the LLM is a decision maker
10:00 in the workflow. If you found this
10:02 helpful, you might want to learn how to
10:04 build a prompts database in Notion. See
10:05 you in the next video.