0:04 when using an LLM to build software
0:06 applications you find that there are a
0:08 lot of different LLMs out there some big
0:10 ones some small ones some open source
0:12 some closed source how do you choose
0:14 from all of these different options in
0:16 this video let's take a look at some
0:19 guidelines one way to estimate how
0:21 capable an LLM is is to look at the
0:25 model size loosely if we look at models
0:27 that are say in the 1 billion parameter
0:30 range we'll find that they're often good
0:32 at pattern matching and will have some
0:34 basic knowledge of the world so if what
0:36 you want is to classify restaurant
0:39 reviews for sentiment I think a 1
0:41 billion parameter model would probably
0:43 be able to do just fine in terms of that
0:46 type of pattern matching with basic
0:49 knowledge about food types of words as
0:53 you go to a 10 billion parameter model
0:55 you find that the models have greater
0:57 world knowledge they just know more
0:59 esoteric facts about the world and the
1:01 models also get better at following
1:04 basic instructions so if you want to
1:07 build a food order chatbot a 10 billion
1:10 parameter model might be okay especially
1:12 if you were to fine tune it to become
1:14 better at the types of specific
1:17 instructions you want it to follow and
1:20 then the very large models say 100
1:23 billion plus parameters will tend to
1:25 have very rich world knowledge they'll
1:28 know a lot of things about physics and
1:30 philosophy and history and science and
1:33 so on and they'll be better as well at
1:35 complex reasoning this is why if you're
1:38 building a food order chatbot maybe you
1:40 don't need the chatbot to know so much
1:42 about history and philosophy and all of
1:44 these other things under the sun some of
1:45 these models might be cheap enough to
1:48 deploy that it might be okay to use a
1:50 huge model even for a food order chatbot
1:53 but where I would definitely tend to
1:56 use these larger models would be tasks
1:58 that involve deep knowledge or complex
2:01 reasoning so for example if I'm looking
2:03 for a brainstorming partner to help me
2:07 think through ideas I'll often use one
2:09 of the larger models one of the things
2:11 you've heard me say earlier though is
2:14 that development using LLMs is often a
2:17 highly empirical meaning experimental
2:20 process so it's hard to know in advance
2:22 exactly what the performance of a given
2:25 LLM will be and while I'm sharing some
2:28 general guidelines here in practice it
2:29 might be worth just trying a few
2:30 different models and testing
2:33 them and based on the results you see
2:35 from testing a few options then pick
2:37 what actually seems to work best for
2:40 your application another decision you
2:42 might have to make is whether to use a
2:45 closed source or an open source model
2:47 closed source models are usually
2:49 accessible via a cloud programming
2:52 interface and I find that many of them
2:55 are pretty easy to build into
2:57 applications you just have to write a
2:59 few lines of code like we saw earlier
3:02 this week to incorporate them into
3:04 software applications many of the
3:06 largest and most powerful models today
3:08 are also available only via cloud
3:10 programming interfaces as closed
3:13 source models and they're also
3:16 relatively inexpensive to run because
3:19 the large companies hosting these models
3:21 will often have put a lot of work into
3:24 serving up these API calls inexpensively
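As a rough illustration of the "few lines of code" integration described above, here is a minimal sketch of the restaurant review example. It assumes the model is reached through a single function that takes a prompt and returns text. `fake_cloud_model` is a hypothetical stand-in, not any vendor's actual API; in a real application its body would be the call to your provider's client library.

```python
from typing import Callable

# Any function that maps a prompt string to a reply string counts as a "model" here.
ModelFn = Callable[[str], str]


def fake_cloud_model(prompt: str) -> str:
    """Hypothetical stand-in for a hosted LLM (assumption, not a real API).

    A real implementation would send the prompt to the vendor's cloud
    programming interface and return the generated text.
    """
    return "positive" if "great" in prompt.lower() else "negative"


def classify_review(review: str, model: ModelFn) -> str:
    """Build the prompt, call whichever model was passed in, normalize the reply."""
    prompt = (
        "Classify the sentiment of this restaurant review "
        "as positive or negative:\n" + review
    )
    return model(prompt).strip().lower()


if __name__ == "__main__":
    print(classify_review("The pasta was great!", fake_cloud_model))
    print(classify_review("The soup was cold.", fake_cloud_model))
```

Because the application only depends on the `ModelFn` shape, a different model can be passed in without touching `classify_review`.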
3:26 a downside is that if you develop using
3:29 these closed source models there is some
3:31 risk of vendor lock-in today the
3:33 switching cost from one LLM to a
3:36 different one is not very high but there
3:38 is some cost to retesting all your
3:40 prompts to see if they work on a
3:42 different LLM if you do switch vendors in
3:44 comparison there are also many open-source
3:47 models that are available now one
3:49 advantage of using an open source model is
3:52 you have full control over the model you
3:54 know you always have access to that
3:57 model and don't have to worry about what
4:00 if the company providing it were to
4:02 retire or deprecate the model that you
4:04 had built on top of you can also often
4:07 run these models on your own device so
4:09 if you want to run it on premises or on
4:12 prem that is on your own servers or on a
4:15 PC or a laptop or mobile device then
4:17 open source models may give you a good
4:19 starting point to do that and using an
4:21 open source model might also let you
4:24 build an application in a way that
4:27 retains full control over data privacy
4:29 and data access for example I was
4:32 recently working on an application using
4:34 electronic health records and because of
4:37 patient privacy we just could not upload
4:40 the patient records to a cloud provider
4:43 and so for that project my team used an
4:45 open-source model that we ran on our own
4:48 computers because we had to do that to
4:51 guarantee privacy of the patient data so
4:54 to summarize this week we talked about
4:57 software applications built using LLMs we
5:00 saw the life cycle of a generative AI project as
5:02 well as techniques like RAG and
5:04 fine-tuning that can make your LLM more
5:07 capable and lastly in this video we
5:08 talked about how to choose an
5:11 appropriate model to build on there are also
5:13 a couple of optional videos after this
5:16 one one that goes a bit deeper into the
5:18 technology that enables LLMs to not just
5:20 predict the next word found on the
5:22 internet but actually follow your
5:26 instructions and do so in a safe way and
5:28 the other optional video talks about
5:31 some frontier cutting-edge technology
5:34 that can use LLMs to automatically decide what
5:37 to do and also use tools along the way
5:39 so please feel free to check out those
5:43 videos if you wish and then in the next
5:45 and final week of this course we'll take
5:48 a look at how LLM technology is affecting
5:52 businesses and society for example how
5:54 can you identify LLM use cases that could
5:57 be useful for your company we'll take a
5:59 look at next week as well at a
6:02 systematic way to understand why jobs
6:06 are more or less affected by generative AI
6:08 and how both the individuals doing the
6:11 jobs as well as businesses employing
6:13 people doing those jobs may navigate the
6:16 changes that generative AI is bringing to
6:18 work I look forward to seeing you next week
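
The advice earlier in this video to try a few different models, test them, and pick what works best can be sketched as a small comparison harness. This is only an illustration under stated assumptions: `small_model` and `large_model` are hypothetical keyword-based stand-ins for real LLM calls, and `EXAMPLES` is a made-up labeled test set, not data from the course.

```python
from typing import Callable, Dict, List, Tuple

ModelFn = Callable[[str], str]          # prompt or text in, predicted label out
Example = Tuple[str, str]               # (review text, expected label)


def accuracy(model: ModelFn, examples: List[Example]) -> float:
    """Fraction of examples where the model's normalized output matches the label."""
    if not examples:
        return 0.0
    correct = sum(1 for text, label in examples if model(text).strip().lower() == label)
    return correct / len(examples)


def pick_best(candidates: Dict[str, ModelFn],
              examples: List[Example]) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the same test set and return the top scorer."""
    scores = {name: accuracy(fn, examples) for name, fn in candidates.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Hypothetical stand-ins for real models (assumptions for illustration only).
def small_model(text: str) -> str:
    return "positive" if "good" in text.lower() else "negative"


def large_model(text: str) -> str:
    positive_words = ("good", "great", "delicious")
    return "positive" if any(w in text.lower() for w in positive_words) else "negative"


# Made-up labeled test set for the restaurant review task.
EXAMPLES: List[Example] = [
    ("The noodles were good", "positive"),
    ("Delicious dumplings and great service", "positive"),
    ("Cold soup and slow service", "negative"),
    ("Great value for lunch", "positive"),
]

if __name__ == "__main__":
    best, scores = pick_best({"small": small_model, "large": large_model}, EXAMPLES)
    print(best, scores)
```

The same harness can be rerun unchanged when you switch vendors or models, which is the prompt retesting cost mentioned in the discussion of vendor lock-in.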