This content demystifies the cost of using Large Language Models (LLMs) in software applications by providing intuitive examples and breaking down pricing structures based on token usage.
In this video, I'd like to walk through with you some quick examples to build intuition about how much using large language models in a software application actually costs. Let's take a look.

These are some example prices for prompting and getting responses from different large language models that are available to developers, that is, if you call these large language models in your code. OpenAI's GPT-3.5 charges $0.002 per 1,000 tokens; that's 0.2 cents per 1,000 tokens. GPT-4 costs quite a bit more, 6 cents per 1,000 tokens, and Google's PaLM 2 and Amazon's Titan Lite are also pretty inexpensive. What I'm showing here are the costs of generating different numbers of tokens. Technically, these large language models charge for the length of the prompt as well, but the prompt tokens, sometimes called the input tokens, are almost always cheaper than the output tokens, so let's just focus on the cost of the output tokens for now.
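To make the per-token pricing concrete, here is a minimal Python sketch that turns a per-1,000-token price into a cost for a given number of output tokens. The two prices are the ones quoted above for GPT-3.5 and GPT-4; the function name `output_cost_usd`, the dictionary name, and the model keys are made up for illustration, and real prices change over time.

```python
# Example output-token prices quoted in the video, in US dollars per 1,000 tokens.
# These are illustrative figures, not current list prices.
PRICE_PER_1K_OUTPUT_TOKENS_USD = {
    "gpt-3.5": 0.002,  # 0.2 cents per 1,000 tokens
    "gpt-4": 0.06,     # 6 cents per 1,000 tokens
}


def output_cost_usd(model: str, output_tokens: int) -> float:
    """Cost in dollars of generating `output_tokens` tokens (prompt cost ignored)."""
    price_per_1k = PRICE_PER_1K_OUTPUT_TOKENS_USD[model]
    return price_per_1k * output_tokens / 1000


if __name__ == "__main__":
    for model in PRICE_PER_1K_OUTPUT_TOKENS_USD:
        cost = output_cost_usd(model, 1000)
        print(f"{model}: ${cost:.3f} per 1,000 output tokens")
```

Like the discussion above, this sketch ignores the cost of the prompt and only prices the generated tokens.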
You may be wondering, what is a token? It turns out that a token is loosely either a word or a subpart of a word, because that's how large language models process text. So common words like "the" or "example" would be counted as a single token when a large language model processes them. Or my name, Andrew, is a relatively common name, and so that's also a single token.
But a less common word like "translate" might be split by a large language model into two subparts of words, "tran" and "slate", and so having it generate "translate" will cost you two output tokens, unlike the more common words, which will cost you only one token. Or "programming", it turns out, might be split by an LLM into "program" and "ming" and also cost two tokens. And a less frequent word like "tonkotsu" might be split into four tokens, with "ton" and "k" and "ots" and "u". But averaged over large collections of text documents, roughly each token is about three quarters of a word. So if you were to generate 300 words, that would cost you about 400 tokens. Don't worry about it if the math doesn't totally make sense, but the intuition I hope you take away from this is that the number of tokens is loosely equal to the number of words, but a little bit bigger. It turns out to be roughly 33% more than the number of words. And on the next slide, we'll do this calculation assuming a cost of 0.2 cents per 1,000 tokens, but of course, if you were to use different LLM options, the cost may be higher or lower.
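The rule of thumb above, about three quarters of a word per token, can be sketched as a one-line conversion in Python. This is only an average-based estimate under that assumption, not a real tokenizer; `WORDS_PER_TOKEN` and `estimate_tokens` are hypothetical names chosen for this example.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb: one token is about 3/4 of a word on average


def estimate_tokens(word_count: int) -> int:
    """Roughly estimate the token count for a given number of words."""
    return round(word_count / WORDS_PER_TOKEN)


if __name__ == "__main__":
    # The example from the video: 300 words is about 400 tokens.
    print(estimate_tokens(300))
```

An actual tokenizer can give noticeably different counts for short or unusual text, since rare words are split into more pieces than common ones.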
Imagine that you're building an LLM application for your own team, maybe to generate text that's useful for them to read. Let's estimate how much it would cost to generate enough text to keep someone on your team occupied for an hour. A typical adult reading speed might be about 250 words per minute, so to keep someone occupied for an hour, you need to generate 60 × 250 words, which is 15,000 words that the LLM has to output. But we need to prompt the LLM as well to get it to generate this output. So if we assume that the length of the prompt is comparable to the length of the output, that might add another 15,000 words. That is, we need to prompt it in total with 15,000 words' worth of input and then also generate 15,000 words of output to keep someone occupied for an hour. Of course, this is a very crude assumption, but probably good enough for the purposes of building intuition. So in total, we need to pay for 30,000 words. And as we saw on the previous slide, because each token corresponds to roughly 3/4 of a word, 30,000 words corresponds to roughly 40,000 tokens. And if the cost is 0.2 cents per 1,000, or 1K, tokens, then generating 40,000 tokens costs 40 times that: 0.2 × 40, which is equal to 8 cents. So if your software
application uses a cloud-hosted service by OpenAI or Azure or Google or AWS or others, that's maybe 8 cents to keep someone busy for an hour. I know I made a lot of assumptions in this calculation, but this seems, you know, decently inexpensive. In the United States, minimum wage in many places is maybe around $10 to $15 an hour, so paying an additional 8 cents per hour of someone reading intensely seems like a small incremental cost, especially if it helps them be more productive. Of course, if you have a free product that a million users are using, then 8 cents times a million with no associated revenue can get expensive. But I find that for many applications, using an LLM turns out to be cheaper than most people think. So I hope this gives some useful intuition for the cost of LLMs.
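The back-of-envelope estimate from this video can be written out as a short Python sketch. It uses the same crude assumptions: 250 words per minute, a prompt as long as the output, about 3/4 of a word per token, and 0.2 cents per 1,000 tokens applied to input and output alike. The function and parameter names are hypothetical, picked only for this illustration.

```python
def hourly_reading_cost_cents(
    words_per_minute: int = 250,
    prompt_to_output_ratio: float = 1.0,
    words_per_token: float = 0.75,
    cents_per_1k_tokens: float = 0.2,
) -> float:
    """Estimate the cost, in cents, of keeping one reader occupied for an hour."""
    output_words = 60 * words_per_minute                  # 15,000 words by default
    prompt_words = output_words * prompt_to_output_ratio  # another 15,000 words
    total_tokens = (output_words + prompt_words) / words_per_token  # about 40,000
    return total_tokens / 1000 * cents_per_1k_tokens


if __name__ == "__main__":
    per_user = hourly_reading_cost_cents()
    print(f"One reader-hour: about {per_user:.0f} cents")
    # The free-product scenario: a million users, one hour each.
    print(f"A million reader-hours: about ${per_user * 1_000_000 / 100:,.0f}")
```

Changing any one default, for example a pricier model or a much longer prompt, shows how quickly the estimate moves away from 8 cents.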
Let's go on to the next video. We'll learn about some more advanced technologies that you can use to make your LLMs even more powerful. I'll see you in