0:04 the ability of systems like ChatGPT and
0:07 Bard to generate text seems almost
0:10 magical and they do represent a big step
0:13 forward for AI technology but how does
0:16 text generation actually work in this
0:18 video we'll take a look at what actually
0:21 underlies the generative AI technology
0:23 and this will hopefully help you
0:25 understand what you can use it for and
0:27 also when you might not want to count on
0:28 it let's take a
0:31 look let's start by looking at where
0:33 generative AI fits within the AI
0:36 landscape there's a lot of buzz and
0:39 excitement and also hype about AI and I
0:42 think a useful way to think of AI is as
0:45 a collection or as a set of tools one of
0:47 the most important tools in AI is
0:49 supervised learning which turns out to
0:51 be really good at labeling things don't
0:53 worry if you don't know what this means
0:54 we'll talk more about it on the next
0:57 slide and a second tool that started to
1:00 work really well only fairly recently is
1:03 generative AI if you study AI you may
1:05 recognize that there are other tools as
1:07 well such as things called unsupervised
1:09 learning and reinforcement
1:11 learning but for the purposes of this
1:14 course I'm going to touch briefly on
1:17 what is supervised learning and then
1:18 spend most of our time talking about
1:21 generative AI and these two supervised
1:24 learning and generative AI are the two
1:26 most important tools in AI today and for
1:29 most business use cases you should be
1:31 fine if you just don't worry about
1:34 tools other than these for now before
1:37 describing how generative AI works let
1:39 me briefly describe what is supervised
1:42 learning because it turns out generative AI
1:45 is built using supervised learning
1:47 supervised learning is a technology that
1:50 has made computers very good when given
1:53 an input which I'm going to call A to
1:55 generate a corresponding output which
1:58 I'm going to call B so let's look at a few
2:01 examples given an email supervised
2:04 learning can decide if that email is
2:07 spam or not so the input A is an email
2:10 and the output B is either zero or one
2:13 where zero is not spam and one is spam
2:16 and this is how spam filters work today
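To make the input A to output B idea concrete, here is a minimal sketch in Python. It is not how real spam filters are built; the four training emails, the word-counting approach, and the names `TRAINING_PAIRS`, `train`, and `predict` are all hypothetical and chosen only for illustration. The point is just that the system learns from labeled (A, B) pairs and then outputs a label B for a new input A.

```python
from collections import Counter

# Hypothetical toy training set: each pair is (input A, label B),
# where B = 1 means spam and B = 0 means not spam.
TRAINING_PAIRS = [
    ("win a free prize now", 1),
    ("free money click now", 1),
    ("meeting moved to monday", 0),
    ("lunch on monday with the team", 0),
]


def train(pairs):
    """Count how often each word appears in spam and in non-spam emails."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in pairs:
        counts[label].update(text.lower().split())
    return counts


def predict(counts, text):
    """Return 1 (spam) if the words were seen more often in spam, else 0."""
    words = text.lower().split()
    spam_score = sum(counts[1][w] for w in words)
    not_spam_score = sum(counts[0][w] for w in words)
    return 1 if spam_score > not_spam_score else 0


model = train(TRAINING_PAIRS)
print(predict(model, "free prize"))           # words seen only in spam examples
print(predict(model, "team meeting monday"))  # words seen only in non-spam examples
```

A real system would use far more data and a much larger model, but the shape is the same: labeled examples in, a function from A to B out.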
2:19 as a second example probably the most
2:21 lucrative application not the most
2:23 inspiring but lucrative for some
2:25 companies that I've worked on was online
2:28 advertising where given an ad and some
2:31 information about a user an AI system
2:34 can generate an output B corresponding to
2:36 whether or not you're likely to click on
2:38 that ad and by showing slightly more
2:41 relevant ads this drives significant
2:45 revenue for the online ad platforms in
2:46 self-driving cars and in driver
2:49 assistance systems supervised learning
2:51 is used to take as input a picture of
2:53 what's in front of your car and radar
2:55 info and label that with the position of
2:59 other cars given a medical x-ray it can
3:00 try to label that with a medical
3:03 diagnosis I've also done a lot of work
3:06 in manufacturing defect inspection where
3:07 you can have a system take a picture of
3:09 a phone as it rolls off the assembly
3:11 line and check if the phone has any
3:13 scratches or other defects or in speech
3:16 recognition the input A would be a piece
3:18 of audio and we would label that with a
3:22 text transcript or as a final example if
3:24 you run a restaurant or some other
3:26 business where occasionally you have
3:28 reviews written about your business or
3:30 your products supervised learning can
3:34 read those reviews and label each one as
3:36 having either a positive or a negative
3:38 sentiment and this is useful for
3:41 reputation monitoring of the business so
3:43 it turns out the decade of around 2010
3:46 to 2020 was a decade of large scale
3:49 supervised learning and I want to touch
3:50 on this briefly because it turns out
3:53 this laid the foundation for modern
3:55 generative AI but what we found starting
3:58 around 2010 was that for a lot of
4:02 applications we had a lot of data but
4:04 even as we fed it more data its
4:06 performance wasn't getting that much
4:08 better if we were training small AI
4:11 models this means for example if you are
4:13 building a speech recognition system
4:16 even as your AI listened to tens of
4:18 thousands or hundreds of thousands of
4:20 hours of data that's a lot of data it
4:22 didn't get that much more accurate
4:25 compared to a system that listened to only
4:28 a smaller amount of audio data but what
4:30 more and more researchers started to
4:32 realize through this period is if you
4:34 were to train a very large AI model
4:37 meaning an AI model on very fast very
4:40 powerful computers with a lot of memory
4:42 then performance as you fed it more and
4:44 more data would just keep on getting
4:47 better and better in fact years ago when
4:49 I started and led the Google Brain team
4:51 the primary mission that I set for the
4:54 Google Brain team in the early days was
4:55 I said let's just build really really
4:58 large AI models and feed them a lot of
5:00 data and fortunately that recipe worked
5:02 and ended up driving a lot of AI
5:04 progress at Google large scale
5:07 supervised learning remains important
5:11 today but this idea of very large models
5:15 for labeling things is how we got to
5:18 generative AI today let's look at how
5:22 generative AI generates text using a
5:25 technology called large language models
5:27 here's one way that large language
5:30 models which we abbreviate LLMs can
5:34 generate text given an input like I love
5:37 eating this is called a prompt and an LLM
5:41 can then complete this sentence with
5:44 maybe bagels with cream cheese or if you
5:46 run it a second time it might say my
5:48 mother's meatloaf or if you run it the
5:50 third time maybe it'll say out with
5:54 friends so how does an LLM a large
5:57 language model generate this output it
6:01 turns out that LLMs are built by using
6:03 supervised learning that's a technology
6:06 to input A and output a label B it uses
6:08 supervised learning to repeatedly
6:10 predict what is the next word for
6:13 example if an AI system has read on the
6:15 internet a sentence like my favorite
6:19 food is a bagel with cream cheese then
6:22 this one sentence will be turned into a
6:25 lot of data points for it to try to
6:28 learn to predict the next word specifically
6:32 given this sentence we now have one data
6:34 point that says given the phrase my
6:37 favorite food is a what do you think is
6:39 the next word in this case the right
6:41 answer is bagel and also given my
6:43 favorite food is a bagel what do you
6:47 think is the next word the answer is with and so
6:50 on so this one sentence is turned into
6:53 multiple inputs A and outputs B for it
6:56 to try to learn from where the LLM is
6:59 learning given a few words to predict
7:01 what the next word that comes after is
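The way one sentence becomes many (A, B) training examples can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `next_word_pairs` is hypothetical, and splitting on spaces stands in for whatever text processing a real LLM training pipeline uses.

```python
def next_word_pairs(sentence):
    """Turn one sentence into (input A, output B) next-word examples.

    A is the words seen so far, B is the single word that follows.
    Splitting on whitespace is a simplification for illustration.
    """
    words = sentence.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]


pairs = next_word_pairs("my favorite food is a bagel with cream cheese")
for context, target in pairs:
    print(f"A: {context!r} -> B: {target!r}")
```

Among the pairs this produces are the two from the example: the input "my favorite food is a" with the output "bagel", and the input "my favorite food is a bagel" with the output "with".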
7:04 when you train a very large AI system on
7:08 a lot of data a lot of data for LLMs means
7:10 hundreds of billions of words and in
7:13 some cases more than a trillion words
7:15 then you get a large language model like
7:19 ChatGPT that given a prompt is very
7:21 good at generating some additional words
7:24 in response to that prompt but now I'm
7:27 omitting some technical details
7:30 specifically next week we'll talk about a
7:34 process that makes LLMs not just predict
7:35 the next word but actually learn to
7:39 follow instructions and also be safe in
7:42 what they output but at the heart of LLMs
7:44 is this technology that's learned from a
7:47 lot of data to predict what is the next
7:50 word so that's how large language models
7:51 work they're trained to repeatedly
7:54 predict the next word and it turns out
7:56 that many people perhaps including you
7:58 are already finding these models useful
8:01 for day-to-day activities at work to help
8:03 with writing to find basic information
8:06 or to be a thought partner to help think
8:08 things through let's take a look at some