YouTube transcript:
W2 6 Fine tuning

Summary

Core Theme

Fine-tuning is an alternative technique to RAG for enhancing Large Language Models (LLMs) by enabling them to absorb larger contexts, adopt specific styles, or gain domain-specific knowledge, often with more precision than prompt-based methods.

Whereas RAG gives you one way to give additional information to a large language model, there's another technique called fine-tuning, which is another way to give it more information. In particular, if you have context that is bigger than what can fit into the input context window length of the LLM, then fine-tuning gives you another way to get an LLM to absorb this information. Fine-tuning also turns out to be useful for getting the LLM to output text in a certain, given style. But its actual implementation is a bit harder than RAG.
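
As a rough illustration of the "bigger than the context window" condition, here is a minimal sketch that estimates whether a body of text would fit into a prompt. The ratio of about 4 tokens per 3 English words is an assumed rule of thumb, and the window size and function names are hypothetical; a real tokenizer would give different counts.

```python
# Assumed rule of thumb for English text: about 4 tokens for every 3 words.
# Real tokenizers differ, so treat the result as a rough estimate only.
TOKENS_PER_WORD = 4 / 3


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text` from its whitespace-separated words."""
    return round(len(text.split()) * TOKENS_PER_WORD)


def fits_in_context(text: str, context_window_tokens: int) -> bool:
    """Return True if the estimated token count fits in the context window."""
    return estimate_tokens(text) <= context_window_tokens


if __name__ == "__main__":
    # Hypothetical corpus of 30,000 words and a hypothetical 8,000-token window.
    corpus = "word " * 30_000
    if fits_in_context(corpus, context_window_tokens=8_000):
        print("Fits in one prompt.")
    else:
        print("Does not fit in one prompt; fine-tuning is one way to absorb it.")
```

If the text does not fit, that is the situation the lecture describes, where fine-tuning becomes another option for getting the model to absorb the information.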

Let's take a look. Let's say you have an LLM trained the way that we described previously, with sentences found on the internet like "My favorite food is a bagel with cream cheese." Then it may have learned from hundreds of billions of words, or maybe more than a trillion words, to predict the next word like this. An LLM like this will have learned to generate text that sounds like what's on the internet, and this process of training a large language model on a lot of data is often called pre-training.

Let's say I want to modify the LLM to have a relentlessly positive and optimistic attitude about everything. There's a technique called fine-tuning that we can use to cause the LLM to do a little bit more learning to change its outputs to be, in this example, much more positive and optimistic. To fine-tune the LLM, we would come up with a set of sentences, a set of texts, that takes on a positive, optimistic attitude, such as "What a wonderful chocolate cake" or "The novel was thrilling." Given text like this, you can then create an additional data set. Using "What a wonderful chocolate cake", you would have: given "What", the next word it will try to predict is "a"; given "What a", the next word is "wonderful"; then "What a wonderful", "chocolate"; and so on.
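
The way one sentence turns into several next-word training examples can be sketched in a few lines. This is a simplified illustration that splits on whitespace; real LLMs predict tokens rather than whole words, and the function name is made up for this example.

```python
def next_word_pairs(sentence: str) -> list[tuple[str, str]]:
    """Turn a sentence into (context so far, next word) training examples."""
    words = sentence.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]


for context, target in next_word_pairs("What a wonderful chocolate cake"):
    print(f"{context!r} -> {target!r}")
# 'What' -> 'a'
# 'What a' -> 'wonderful'
# 'What a wonderful' -> 'chocolate'
# 'What a wonderful chocolate' -> 'cake'
```

Fine-tuning repeats this same next-word prediction task, only on the new, much smaller set of texts.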

It turns out that if you take an LLM that has been pre-trained on hundreds of billions of words and fine-tune it on just an additional, say, 10,000 words or more (it could be 100,000 words if you have more data, or even a million words if you have even more data), fine-tuning on this relatively modest-sized data set can shift the output of your LLM to take on this positive, optimistic attitude. Now, maybe shifting an LLM to have a relentlessly positive attitude isn't that helpful an application, but fine-tuning is used in many real applications.

One class of applications where fine-tuning is useful is when the task isn't easy to define in a prompt. For example, if you want to use an LLM to summarize customer service calls, a generic LLM may look at a call like this and summarize it to say, "The customer tells the agent about a problem with a monitor." But if you run a customer call center, you might want it to generate specifics about what the conversation was about: it was about the MK 4127 KX, reported broken by customer 542, and so on. Suppose you have a large language model that's learned from hundreds of billions of words on the internet, so it's learned a lot of general knowledge. If you additionally fine-tune it on a data set of maybe just hundreds of carefully handwritten, human-expert summaries in this specific style, then that would shift the LLM's ability to write summaries in the style that you want. The specific style of summary is actually not that easy to define in a text prompt. Maybe you could do it, but fine-tuning would just be a very precise way to tell the LLM what summaries you want.
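
A data set of expert-written summaries like the one described here is often stored as one JSON record per line (JSONL). The sketch below shows one possible layout. The field names "prompt" and "completion" are assumptions made for illustration, since each fine-tuning service or library defines its own schema, and the transcript text is a placeholder.

```python
import json

# Hypothetical field names; real fine-tuning services each define their own schema.
PROMPT_FIELD = "prompt"
COMPLETION_FIELD = "completion"


def to_jsonl(examples: list[tuple[str, str]]) -> str:
    """Serialize (call transcript, expert summary) pairs, one JSON object per line."""
    lines = []
    for transcript, summary in examples:
        record = {
            PROMPT_FIELD: "Summarize this customer service call:\n" + transcript,
            COMPLETION_FIELD: summary,
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)


# Placeholder data: the transcript is elided, the summary echoes the lecture's example.
examples = [
    ("<transcript of call 1>", "MK 4127 KX reported broken by customer 542."),
]
print(to_jsonl(examples))
```

With a few hundred such pairs, the summaries themselves show the model the style you want, which is the point the lecture makes about precision.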

Another example of when a task isn't easy to define in a prompt is if you want to mimic a specific writing or speaking style. Tommy Nelson, who's been working with me on this course, actually tried, kind of just for fun, to get an LLM to sound like me. But it turns out that the way most individuals sound is not that easy to describe in a prompt. I mean, how would you give someone clear instructions to sound like me? So if you were to prompt a general-purpose LLM and ask it to sound like me, you get text like this, which I don't think sounds that much like me. But if you were to take a lot of transcripts of the way I actually talk and fine-tune an LLM to really sound like me by learning from my actual words, then asking it to write something that sounds like me results in text like this, which, I don't know, sounds more like how I would talk. Because mimicking a specific writing or speaking style is very difficult to do via prompting, since it's just difficult to describe a specific person's style by writing text instructions, fine-tuning turns out to be a more effective way to get an LLM to speak in a certain style. And if you're building an artificial character, maybe a cartoon character, fine-tuning could also be a way to get an LLM to speak in a certain style.

Other than tasks that aren't easy to define in a prompt, a second broad class of applications of fine-tuning is to help the LLM gain a domain of knowledge. For example, if you want an LLM to be able to read and process medical notes, this is what a medical note written about a patient by a doctor might look like, and this is really not normal English. "Pt" is patient; "c/o" is complaining of; "SOB" is shortness of breath; "DOE" is dyspnea on exertion; "PE" is the results of the physical examination; and so on. The treatment is to follow up with the primary care physician, a stat chest X-ray, and continuing treatment as needed on oxygen. But this is really not normal English, and if you were to take an LLM trained on normal English, it wouldn't be very good at processing text like this. So if you were to fine-tune an LLM on a collection of medical records, then the LLM could get much better at absorbing this body of knowledge about what medical notes sound like, and you could then use that to build other applications on top of it to better understand medical records.

Or legal documents. Here's a piece of legalese, kind of written by lawyers for lawyers, that's really difficult for non-lawyers to read: "Licensor grants to Licensee, per section 2(a)(3), a non-exclusive right," and so on and so on, "within 15 days hereof." I don't know about you, but I do not use the word "hereof" in my ordinary day-to-day speech. But this is what legal documents sound like, and if you want your LLM to gain a body of knowledge about how to read and understand legal documents, then taking an LLM and fine-tuning it on legal documents would help it gain that body of knowledge. And similarly for financial documents: fine-tuning an LLM on a large set of financial documents would help it better gain that body of knowledge about finance and make it better at applications involving processing documents that look like this.

Finally, another reason to fine-tune an LLM is to get a smaller model to perform a task that may previously have required a larger model. We'll discuss later this week some of the pros and cons of choosing a larger versus a smaller model. For some applications that need a lot of knowledge or complex reasoning, you might use a relatively large model, say with over 100 billion parameters. But such a model may have relatively high latency, meaning that after you prompt it, you might need to wait a while to get back a response. And if you were deploying this on your own computers, it could be quite costly. Even though we said in the earlier video that these models aren't that expensive, maybe you want it to be even cheaper. That's because a 100-billion-parameter model may take specialized computers, such as a GPU server or other really fast computers, to run. You'd probably have a hard time running such a large model on a normal laptop or PC, and certainly not on a smartphone today. But if you can get your application to work on a much smaller model, say 1 billion parameters, then that's the range of model size that would run much more easily on a laptop or a PC or on a mobile phone.
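
A quick back-of-the-envelope calculation suggests why model size matters for where a model can run. The sketch below counts only the memory needed to hold the weights, assuming 2 bytes per parameter (16-bit weights). That storage format is an assumption for illustration and is not stated in the lecture; actual memory use also depends on the runtime.

```python
def weight_memory_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


for name, params in [("100B model", 100_000_000_000), ("1B model", 1_000_000_000)]:
    print(f"{name}: about {weight_memory_gb(params):.0f} GB of weights")
# 100B model: about 200 GB of weights
# 1B model: about 2 GB of weights
```

Under this assumption the larger model needs far more memory than a typical laptop or phone has, while the smaller one is within reach of consumer devices.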

For example, if what you want is to classify restaurant reviews as positive or negative sentiment, this is a simple enough task that you probably don't need a 100- or 200-billion-parameter model to run it. Maybe a 1-billion-parameter model would be just fine, maybe even smaller, frankly. But these smaller models aren't as good as the really large models, which is where fine-tuning helps: if you take a small model and fine-tune it on a data set like the one shown here (not just three examples, but maybe a few hundred or maybe a thousand examples, if you have that much data), then you can get a small model, say a billion parameters, to do really well on a task like this.
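
Before fine-tuning a small model on labeled reviews, it is common practice to hold some examples back to check how well the fine-tuned model does. The sketch below shows one simple way to do that split. The reviews are invented placeholders (the lecture's slide is not part of this transcript), and the function name and the 20% fraction are assumptions.

```python
import random

# Invented placeholder reviews with sentiment labels, for illustration only.
REVIEWS = [
    ("The pasta was delicious.", "positive"),
    ("Service was slow and the food was cold.", "negative"),
    ("Great atmosphere and friendly staff.", "positive"),
    ("I would not come back.", "negative"),
    ("Best dessert I have had in years.", "positive"),
]


def split_examples(examples, validation_fraction=0.2, seed=0):
    """Shuffle reproducibly, then split into (training, validation) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


train, validation = split_examples(REVIEWS)
print(len(train), "training examples,", len(validation), "validation example(s)")
```

In practice the list would hold the few hundred to a thousand examples mentioned above rather than five.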

To summarize, fine-tuning gives you another technique, in addition to RAG, to help improve the capabilities of an LLM. You might use it for tasks that are hard to specify in a prompt, such as if you want it to output text in a certain style; or if you want the LLM to gain a body of knowledge, such as about medical notes; or if you want to get a smaller, cheaper-to-run LLM to do a task that might otherwise have required a larger LLM.

It turns out that RAG and fine-tuning are both relatively cheap to implement. RAG is just modifications of your prompt, and with fine-tuning you might be able to get started with tens of dollars or maybe low hundreds of dollars, depending on how much data you want to fine-tune on. There's another technique, pre-training your own model, that turns out to be very expensive, and today almost no one other than reasonably large companies, usually tech companies, is attempting this. But for completeness, let's take a look at that in the next video.
