Choosing the right Large Language Model (LLM) for software applications involves considering model size, capabilities, and whether to opt for open-source or closed-source solutions, with practical experimentation being crucial for optimal selection.
When using an LLM to build a software application, you'll find that there are a lot of different LLMs out there: some big, some small, some open source, some closed source. How do you choose from all of these different options? In this video, let's take a look at some guidelines.
One way to estimate how capable an LLM is is to look at the model size. Loosely, if we look at models in, say, the 1 billion parameter range, we'll find that they're often good at pattern matching and have some basic knowledge of the world. So if what you want is to classify restaurant reviews for sentiment, I think a 1 billion parameter model would probably do just fine at that type of pattern matching, with basic knowledge about food-related words.
As you go to a 10 billion parameter model, you find that the models have greater world knowledge; they just know more esoteric facts about the world. The models also get better at following basic instructions. So if you want to build a food order chatbot, a 10 billion parameter model might be okay, especially if you were to fine-tune it to become better at the types of specific instructions you want it to follow.
Then the very large models, say 100 billion plus parameters, will tend to have very rich world knowledge. They'll know a lot of things about physics and philosophy and history and science and so on, and they'll be better as well at complex reasoning. This is why, if you're building a food order chatbot, maybe you don't need the chatbot to know so much about history and philosophy and all of these other things under the sun. Some of these models might be cheap enough to deploy that it might be okay to use a huge model even for a food order chatbot. But where I would definitely tend to use these larger models is for tasks that involve deep knowledge or complex reasoning. For example, if I'm looking for a brainstorming partner to help me think through ideas, I'll often use one of the larger models.
One of the things you've heard me say earlier, though, is that development using LLMs is often a highly empirical, meaning experimental, process. It's hard to know in advance exactly what the performance of a given LLM will be. So while I'm sharing some general guidelines here, in practice it might be worth just trying a few different models, testing them, and then, based on the results you see, picking what actually seems to work best for your application.
Another decision you might have to make is whether to use a closed-source or an open-source model. Closed-source models are usually accessible via a cloud programming interface, and I find that many of them are pretty easy to build into applications. You just have to write a few lines of code, like we saw earlier this week, to incorporate them into software applications. Many of the largest and most powerful models today are also available only via cloud programming interfaces, as closed-source models. They're also relatively inexpensive to run, because the large companies hosting these models will often have put a lot of work into serving these API calls inexpensively.
A downside is that if you develop using these closed-source models, there is some risk of vendor lock-in. Today the switching cost from one LLM to a different one is not very high, but there is some cost to retesting all your prompts to see if they work on a different LLM if you do switch vendors.
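The advice to try a few models, and the cost of retesting prompts when switching vendors, can both be pictured as a small comparison harness. This is only a minimal sketch under stated assumptions: each "model" is a plain Python function from prompt to response, and the two keyword-based stand-ins (`stub_model_a`, `stub_model_b`), the prompt template, and the four-review test set are all hypothetical, made up to keep the example runnable. In a real project each function would wrap a vendor API or a locally hosted model.

```python
from typing import Callable, Dict, List, Tuple

# A "model" here is any function that maps a prompt to a response string.
Model = Callable[[str], str]

# Hypothetical stand-ins for real models (assumption: not real LLMs).
def stub_model_a(prompt: str) -> str:
    return "positive" if "great" in prompt.lower() else "negative"

def stub_model_b(prompt: str) -> str:
    text = prompt.lower()
    return "positive" if ("great" in text or "delicious" in text) else "negative"

# Hypothetical prompt; the same prompt is sent to every candidate model.
PROMPT_TEMPLATE = (
    "Classify the sentiment of this restaurant review "
    "as positive or negative: {review}"
)

# Hypothetical labeled examples: (review text, expected label).
TEST_SET: List[Tuple[str, str]] = [
    ("The pasta was great!", "positive"),
    ("Delicious tacos, friendly staff.", "positive"),
    ("Cold food and slow service.", "negative"),
    ("Never coming back.", "negative"),
]

def accuracy(model: Model, examples: List[Tuple[str, str]]) -> float:
    """Fraction of examples where the model's answer matches the label."""
    correct = 0
    for review, expected in examples:
        answer = model(PROMPT_TEMPLATE.format(review=review)).strip().lower()
        if answer == expected:
            correct += 1
    return correct / len(examples)

def compare_models(
    models: Dict[str, Model], examples: List[Tuple[str, str]]
) -> Dict[str, float]:
    """Run every candidate model on the same examples and collect scores."""
    return {name: accuracy(model, examples) for name, model in models.items()}

if __name__ == "__main__":
    scores = compare_models(
        {"model_a": stub_model_a, "model_b": stub_model_b}, TEST_SET
    )
    # Print candidates from best to worst.
    for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {score:.0%}")
```

Keeping the test set and prompt separate from the model functions means that trying a new vendor only requires adding one more entry to the dictionary and rerunning the comparison.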
In comparison, there are also many open-source models available now. One advantage of using an open-source model is that you have full control over the model. You know you always have access to that model, and you don't have to worry about what happens if the company providing it were to retire or deprecate the model that you had built on top of. You can also often run these models on your own device. So if you want to run it on premises, or on-prem, that is, on your own servers, or on a PC, a laptop, or a mobile device, then open-source models may give you a good starting point. Using an open-source model might also let you build an application in a way that retains full control over data privacy and data access. For example, I was recently working on an application using electronic health records, and because of patient privacy we just could not upload the patient records to a cloud provider. So for that project, my team used an open-source model that we ran on our own computers, because we had to do that to guarantee the privacy of the patient data.
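The guidelines in this video could be collected into a rough rule-of-thumb helper like the one below. It is only a sketch of the heuristics as stated here, not a definitive selection procedure: the `Requirements` fields, the function name `suggest_starting_point`, and the wording of the returned suggestions are all assumptions made for illustration, and any suggestion it returns would still need to be checked by actually testing a few models.

```python
from dataclasses import dataclass

@dataclass
class Requirements:
    """Hypothetical description of what an application needs from a model."""
    needs_instruction_following: bool = False
    needs_deep_knowledge_or_reasoning: bool = False
    data_must_stay_on_premises: bool = False

def suggest_starting_point(req: Requirements) -> dict:
    """Map requirements to a rough starting point, following the video's guidelines."""
    # Size: larger models for richer knowledge and more complex reasoning.
    if req.needs_deep_knowledge_or_reasoning:
        size = "100B+ parameters"
    elif req.needs_instruction_following:
        size = "~10B parameters (consider fine-tuning)"
    else:
        size = "~1B parameters"

    # Hosting: strict privacy constraints point toward a self-hosted open model.
    if req.data_must_stay_on_premises:
        hosting = "open-source model run on your own hardware"
    else:
        hosting = "either; a closed-source cloud API is often quick to try"

    return {"size": size, "hosting": hosting}

if __name__ == "__main__":
    # Sentiment classification of restaurant reviews: simple pattern matching.
    print(suggest_starting_point(Requirements()))
    # Application over patient records that cannot leave your own machines.
    print(suggest_starting_point(
        Requirements(needs_instruction_following=True,
                     data_must_stay_on_premises=True)
    ))
```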
To summarize, this week we talked about software applications built using LLMs. We saw the life cycle of a generative AI project, as well as techniques like RAG and fine-tuning that can make your LLM more capable. And lastly, in this video, we talked about how to choose an appropriate model to build on.
There are also a couple of optional videos after this one. One goes a bit deeper into the technology that enables LLMs to not just predict the next word found on the internet, but actually follow your instructions and do so in a safe way. The other optional video talks about some frontier, cutting-edge technology that can use LLMs to automatically decide what to do and also use tools along the way. So please feel free to check out those videos if you wish.
Then, in the next and final week of this course, we'll take a look at how LLM technology is affecting businesses and society. For example, how can you identify LLM use cases that could be useful for your company? We'll also take a look next week at a systematic way to understand why jobs are more or less affected by generative AI, and how both the individuals doing the jobs and the businesses employing people doing those jobs may navigate the changes that generative AI is bringing to work. I look forward to seeing you next week.