0:03 in this video I'd like to walk through
0:06 with you some quick examples to build
0:08 intuition about how much using large
0:09 language models in the software
0:12 application actually costs let's take a
0:15 look these are some example prices for
0:18 prompting and getting responses from
0:20 different large language models that are
0:23 available to developers that is if
0:25 you call these large language models in
0:28 your code so OpenAI's GPT-3.5 charges
0:32 $0.002 per 1,000 tokens that's 0.2 cents
0:35 per 1,000 tokens GPT-4 costs quite a bit
0:37 more 6 cents per 1,000 tokens and
0:39 Google's PaLM 2 and Amazon's Titan
0:42 Lite are also pretty inexpensive what
0:44 I'm showing here are the costs of
0:47 generating different numbers of tokens
0:49 technically these large language models
0:52 charge for the length of the prompt as
0:54 well but the length of the prompt
0:56 sometimes called the input tokens is
0:58 almost always cheaper than the cost of
1:00 the output tokens so let's just focus
1:03 on the cost of the output tokens for now
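
A minimal sketch of the per-token pricing just described, assuming only the two example output prices quoted above ($0.002 and $0.06 per 1,000 tokens). The dictionary keys and the function name `output_cost_usd` are hypothetical labels chosen for illustration, not API identifiers, and the prices are a snapshot that will likely have changed.

```python
# Example output-token prices quoted in the video, in US dollars per 1,000 tokens.
# Assumption: these are illustrative snapshot values, not current list prices.
PRICE_PER_1K_OUTPUT_TOKENS = {
    "gpt-3.5": 0.002,  # 0.2 cents per 1,000 tokens
    "gpt-4": 0.06,     # 6 cents per 1,000 tokens
}


def output_cost_usd(num_output_tokens: int, model: str) -> float:
    """Estimate the dollar cost of generating num_output_tokens tokens.

    Input (prompt) tokens are ignored here, matching the simplification
    made in the video at this point.
    """
    return num_output_tokens / 1000 * PRICE_PER_1K_OUTPUT_TOKENS[model]


if __name__ == "__main__":
    for model in PRICE_PER_1K_OUTPUT_TOKENS:
        cost = output_cost_usd(1000, model)
        print(f"{model}: ${cost:.3f} per 1,000 output tokens")
```

Keeping the price table separate from the function makes it easy to swap in whatever prices a provider currently lists.
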
1:05 you may be wondering what is a token it
1:09 turns out that a token is loosely either
1:12 a word or a subpart of a word because
1:14 that's how large language models process
1:18 text so common words like the or example
1:20 would be counted as a single token when
1:22 a large language model processes it or
1:24 my name Andrew is a relatively common
1:27 name and so that's also a single token
1:31 but a less common word like translate
1:33 might be split by a large language model
1:36 into two subparts of words tran and
1:39 slate and so having it generate
1:41 translate will cost you two output
1:44 tokens unlike the more common words
1:46 which will cost you only one token or
1:49 programming it turns out might be split by
1:52 an LLM into program and ming and also cost
1:55 two tokens and a less frequent word like
1:58 tonkotsu might be split into four tokens
2:01 with ton and k and ots and u but averaged
2:04 over large collections of text documents
2:07 roughly each token is about three
2:11 quarters of a word so if you were to generate
2:19 300 words that would cost you about
2:22 400 tokens don't worry about it if the
2:24 math doesn't totally make sense but the
2:26 intuition I hope you take away from this
2:29 is that the number of tokens is loosely equal
2:31 to the number of words but a little bit
2:35 bigger it turns out to be roughly 33%
2:37 more than the number of words and on the
2:39 next slide we'll do this calculation
2:43 assuming a cost
2:46 of 0.2 cents per 1,000 tokens but of course
2:49 if you were to use different LLM options
2:52 the cost may be higher or lower so
2:54 imagine that you're building an LLM
2:56 application for your own team maybe to
2:59 generate text that's useful for them to read
3:00 let's estimate how much it would
3:02 cost to generate enough text to keep
3:05 someone on your team occupied for an
3:07 hour so a typical adult reading speed might
3:10 be about 250 words per minute so to keep
3:12 someone occupied for an hour you need to
3:17 generate 60 × 250
3:20 words which is
3:24 15,000 words that the LLM has to output but
3:26 we need to prompt the LLM as well to get
3:29 it to generate this output so if we
3:31 assume that the length of the prompt is
3:33 comparable to the length of the output
3:37 that might add another 15,000
3:41 words that is if we need to prompt it in
3:44 total for 15,000 words worth of inputs
3:47 and then also generate 15,000 words of
3:49 output to keep someone occupied for an
3:53 hour of course this is a very crude
3:55 assumption but probably good enough for
3:58 the purposes of building intuition so in
4:02 total we need to pay for 30,000
4:05 words and as we saw on the previous
4:10 slide because each token corresponds
4:13 to roughly
4:16 3/4 of a word 30,000 words corresponds to about
4:28 40,000 tokens and if the cost
4:31 is $0.002
4:36 per 1,000 or
4:41 1K tokens then generating 40,000 tokens
4:46 costs 40 times that $0.002 × 40 which is
4:51 equal to 8 cents so if your software
4:53 application uses a cloud hosted LLM
4:56 service by OpenAI or Azure or Google or
4:59 AWS or others that's maybe 8 cents to
5:01 keep someone busy for an hour I
5:03 know I made a lot of assumptions in this
5:05 calculation but this seems you know decently
5:09 inexpensive in the United States minimum
5:12 wage for many places is maybe around $10
5:15 to $15 an hour so paying an additional 8
5:17 cents per hour of someone reading
5:20 intensely seems like a small incremental
5:23 cost especially if it helps
5:24 them be more
5:27 productive of course if you have a free
5:29 product that a million users are using
5:31 then 8 cents times a million with no
5:34 associated revenue can get expensive but
5:36 I find that for many applications using
5:39 an LLM turns out to be cheaper than most
5:42 people think so I hope this gives some
5:45 useful intuition for the cost of LLMs
5:47 let's go on to the next video where we'll
5:49 learn about some more advanced
5:51 technologies you can use to make your
5:54 LLMs even more powerful I'll see you in