YouTube Transcript:
W2 4 Cost intuition


Summary

Core Theme

This content demystifies the cost of using Large Language Models (LLMs) in software applications by providing intuitive examples and breaking down pricing structures based on token usage.


In this video, I'd like to walk through with you some quick examples to build intuition about how much using large language models in a software application actually costs. Let's take a look.

These are some example prices for prompting and getting responses from different large language models that are available to developers, that is, if you call these large language models in your code. OpenAI's GPT-3.5 charges $0.002 per 1,000 tokens; that's 0.2 cents per 1,000 tokens. GPT-4 costs quite a bit more, 6 cents per 1,000 tokens, and Google's PaLM 2 and Amazon's Titan Lite are also pretty inexpensive. What I'm showing here are the costs of generating different numbers of tokens.

Technically, these large language models charge for the length of the prompt as well, but the length of the prompt, sometimes called the input tokens, is almost always cheaper than the cost of the output tokens, so let's just focus on the cost of the output tokens for now.
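As a rough sketch of the per-1,000-token pricing described above, the cost of a response is just the price multiplied by the number of output tokens divided by 1,000. The dictionary keys and the function name here are illustrative labels, not official model identifiers, and only the two prices actually quoted in the video are included.

```python
# Output-token prices quoted in the video, in US dollars per 1,000 tokens.
# The keys are illustrative labels, not official API model names.
PRICE_USD_PER_1K_OUTPUT_TOKENS = {
    "gpt-3.5": 0.002,  # 0.2 cents per 1,000 tokens
    "gpt-4": 0.06,     # 6 cents per 1,000 tokens
}


def output_cost_usd(model: str, output_tokens: int) -> float:
    """Return the cost in dollars of generating `output_tokens` tokens."""
    return PRICE_USD_PER_1K_OUTPUT_TOKENS[model] * output_tokens / 1000


for model in PRICE_USD_PER_1K_OUTPUT_TOKENS:
    cost = output_cost_usd(model, 10_000)
    print(f"{model}: ${cost:.2f} for 10,000 output tokens")
```

Real prices change over time and input tokens are billed too, so treat these numbers as a snapshot for building intuition.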

You may be wondering, what is a token? It turns out that a token is, loosely, either a word or a subpart of a word, because that's how large language models process text. So common words like "the" or "example" would be counted as a single token when a large language model processes them. Or my name, Andrew, is a relatively common name, and so that's also a single token. But a less common word like "translate" might be split by a large language model into two subparts of words, "tran" and "slate", and so having it generate "translate" will cost you two output tokens, unlike the more common words, which will cost you only one token. Or "programming", it turns out, might be split by an LLM into "program" and "ming" and also cost two tokens. And a less frequent word like "tonkotsu" might be split into four tokens: "ton", "k", "ots", and "u".

But averaged over large collections of text documents, each token is roughly three quarters of a word. So if you were to generate 300 words, that would cost you about 400 tokens. Don't worry about it if the math doesn't totally make sense, but the intuition I hope you take away from this is that the number of tokens is loosely equal to the number of words, but a little bit bigger. It turns out to be roughly 33% more than the number of words. And on the next slide, we'll do this calculation assuming a cost of 0.2 cents per 1,000 tokens, but of course, if you were to use different LLM options, the cost may be higher or lower.
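The three-quarters rule of thumb above can be turned into a tiny estimator. This is only a sketch of the average ratio from the video; the function name is made up for illustration, and a real tokenizer will give different counts for any particular text.

```python
# Rule of thumb from the video: on average, one token is about 3/4 of a word.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count for a text of `word_count` words."""
    return round(word_count / WORDS_PER_TOKEN)


print(estimate_tokens(300))  # 400, i.e. about 33% more tokens than words
```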

Imagine that you're building an LLM application for your own team, maybe to generate text that is useful for them to read. Let's estimate how much it would cost to generate enough text to keep someone on your team occupied for an hour. A typical adult reading speed might be about 250 words per minute, so to keep someone occupied for an hour, you need to generate 60 × 250 words, which is 15,000 words that the LLM has to output. But we need to prompt the LLM as well to get it to generate this output, so if we assume that the length of the prompt is comparable to the length of the output, that might add another 15,000 words. That is, we need to prompt it in total with 15,000 words' worth of inputs and then also generate 15,000 words of output to keep someone occupied for an hour. Of course, this is a very crude assumption, but probably good enough for the purposes of building intuition.

So in total, we need to pay for 30,000 words, and as we saw on the previous slide, because each token corresponds to roughly 3/4 of a word, 30,000 words corresponds to roughly 40,000 tokens. And if the cost is 0.2 cents per 1,000 (or 1K) tokens, then generating 40,000 tokens costs 40 times that: 0.2 × 40, which is equal to 8 cents. So if your software application uses a cloud-hosted service by OpenAI or Azure or Google or AWS or others, that's maybe 8 cents to keep someone busy for an hour.

I know I made a lot of assumptions in this calculation, but this seems decently inexpensive. In the United States, minimum wage for many places is maybe around $10 to $15 an hour, so paying an additional 8 cents per hour of someone reading intensely seems like a small incremental cost, especially if it helps them be more productive. Of course, if you have a free product that a million users are using, then 8 cents times a million with no associated revenue can get expensive. But I find that for many applications, using an LLM turns out to be cheaper than most people think. So I hope this gives some useful intuition for the cost of LLMs.
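The back-of-the-envelope estimate above can be written out as a short script. It is a sketch under the same crude assumptions as the video (250 words per minute, a prompt as long as the output, 3/4 of a word per token, and a flat 0.2 cents per 1,000 tokens for input and output alike); the function and parameter names are chosen here for illustration.

```python
WORDS_PER_MINUTE = 250           # typical adult reading speed assumed in the video
WORDS_PER_TOKEN = 0.75           # one token is roughly 3/4 of a word
PRICE_CENTS_PER_1K_TOKENS = 0.2  # flat price assumed for both input and output


def cost_per_reading_hour_cents(prompt_to_output_ratio: float = 1.0) -> float:
    """Estimate the LLM cost, in cents, of one hour of generated reading."""
    output_words = 60 * WORDS_PER_MINUTE                         # 15,000 words
    total_words = output_words * (1 + prompt_to_output_ratio)    # 30,000 words
    total_tokens = total_words / WORDS_PER_TOKEN                 # 40,000 tokens
    return total_tokens / 1000 * PRICE_CENTS_PER_1K_TOKENS


hourly_cents = cost_per_reading_hour_cents()
print(f"About {hourly_cents:.0f} cents per reading hour")

# Assuming each of a million users of a free product reads for one hour:
million_user_dollars = hourly_cents * 1_000_000 / 100
print(f"About ${million_user_dollars:,.0f} for a million such hours")
```

Changing `prompt_to_output_ratio` or the price constant shows how sensitive the 8-cent figure is to each assumption.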

Let's go on to the next video, where we'll learn about some more advanced technologies that you can use to make your LLMs even more powerful. I'll see you in the next video.
