0:04 welcome to the final video of this week
0:06 I'd like to share with you in this video
0:09 how LMS are starting to use tools and
0:12 then also discuss a Cutting Edge topic
0:14 of Agents which is where we let OMS try
0:17 to decide for themselves what action
0:20 they want to take next let's take a look
0:22 in the early example of a food order
0:25 taking chat bot we saw that if you were
0:28 to say sam Burger the bot May reply okay
0:31 is on the way in order for a chatbot to
0:33 enter the order and send it to you this
0:35 is what actually is happening behind the
0:38 scenes the LM can't just say Okay is on
0:40 its way because it needs to take some
0:43 action to actually send the burger to
0:48 you and so an LM might output this
0:52 response order burger for user 9876 to
0:55 sent to this address and then also say
0:58 the user message is to say okay is on
1:02 his way and L that's been fine-tuned to
1:05 Output text like this will be able to
1:07 generate an order which in this case
1:10 would trigger a software application
1:12 that passes the restaurant ordering
1:15 system a request to deliver a burger to
1:18 this user at that address and what is
1:21 shown to the user is not the full LM
1:23 output the full LM output is all four
1:25 lines of text here but only the final
1:29 line okay is on his way is what get sent
1:32 to the user as the response so this is
1:36 example of tool use by an LM where the
1:39 text the LM outputs can trigger calling
1:42 a software system to place a restaurant
1:45 order now placing an incorrect order can
1:49 be a costly mistake so perhaps a better
1:51 user interface would be before
1:54 finalizing the order to pop up a
1:56 verification dialogue to let the user
1:58 confirm yes or no if you've got the
2:00 order right before charging the credit
2:03 card and sending it to them and clearly
2:05 given that lm's outputs are not
2:08 completely reliable for any safety
2:10 critical or Mission critical action it
2:13 would be a good idea to let a user
2:15 confirm that that's the right action
2:17 before letting the L trigger some
2:21 potentially costly mistake by itself in
2:24 addition to tools for ticking actions
2:28 tools can also be used for reasoning for
2:31 example if you were to prompt an LM how
2:33 much would I have after8 years if I
2:35 deposit $100 in the bank account that
2:38 pays 5% interest an LM might generate an
2:41 answer like this which sounds plausible
2:43 but the number
2:47 $147.4 is not actually the right answer
2:49 it turns out LMS Having learned to
2:51 predict the next word or maybe even
2:53 instruction tuned are not great at
2:57 precise math and just as you I might use
2:59 a calculator to calculate the right
3:01 answer to a problem like this we can
3:04 also give the LM a calculator too to
3:07 help it get the right answer so rather
3:08 than having the L output the answer
3:13 directly if the LM were to Output this
3:15 after compounding and so on you would
3:20 have calculator 100 time 1.05 that's 5%
3:22 interest rate compounded to the power of
3:25 8 this can be interprets commands to
3:27 call an external calculator program to
3:29 explicitly compute the right answer
3:32 which turns out to be
3:36 $147 74 and plug that back into the text
3:39 to give the user the correct dollar
3:43 figure so by giving lm's the ability to
3:46 call Tools in his output we can
3:49 significantly extend the reasoning or
3:53 the action-taking capabilities of LMS to
3:56 use today is an important part of many
3:58 um applications and of course designers
4:00 of these applications s should be
4:03 careful to make sure that tools aren't
4:06 triggered in a way that causes harm or
4:09 causes IR reversible damage going Beyond
4:12 tools into a more experimental area AI
4:14 researchers have been examining agents
4:16 which go beyond triggering a tool to
4:18 carry out a single action but is
4:21 exploring whether Els can choose and
4:24 Carry Out complex sequences of actions
4:26 there's a lot of excitement and research
4:28 on agents but this is at The Cutting
4:30 Edge of AI research is is not yet mature
4:32 enough to count on for most important
4:33 applications but I want to share with
4:35 you what many in the AI Community are
4:39 excited about if you would ask an agent
4:41 that's built on top of an LM help me
4:43 research better Burger's top competitors
4:45 then an agent might use an LM as a
4:48 reasoning engine to figure out what are
4:51 the steps it needs to carry out to do
4:53 your task of researching better Burger's
4:57 competitors and this reasoning engine DM
4:59 might decide it needs to search for the
5:01 a list of the top competitors then visit
5:03 the website of each competitor and
5:05 finally for each competitor write a
5:07 summary based on the homepage content
5:10 and then perhaps by making a sequence of
5:12 calls to this reasoning engine it may
5:14 figure out that to search the top
5:17 competitors it has to trigger a tool to
5:19 call web search engine on the query
5:21 Better Burgers competitors and then
5:24 after that it may visit the websites of
5:26 some of the top competitors to download
5:29 their homepages and then additionally
5:33 call and El yet again to summarize the
5:35 text that they found on the website on
5:37 the internet there have been some nice
5:41 demos of Agents but this technology is
5:44 not really ready for prime time yet but
5:46 perhaps in the future as researchers
5:48 make it better and better it become more
5:50 useful and I think that would be the
5:53 exciting future if lm's as a reasoning
5:56 engine can help decide what's the
5:58 sequence of steps to take safely and
6:01 responsibly of course to help a user
6:02 carry out the
6:05 task thank you and congrats on making it
6:07 to the very end of week two with just
6:10 one more week to go in this course next
6:12 week we'll look at how gent of AI is
6:14 affecting companies including how you
6:16 might be able to come up with gent VII
6:19 use cases for your business as well as
6:22 look at how gent VII is affecting
6:25 society and his impact on jobs I look