YouTube transcript:
W1 10 Generative ai application Image generation

Summary

Core Theme

Generative AI image generation primarily utilizes diffusion models, a supervised learning technique that learns to progressively denoise images from pure noise, guided by text prompts, to create novel visuals.

Thanks for sticking with me for this final optional video on image generation. So far this week, we've focused most of our attention on text generation, and text generation is what a lot of users are using and is having the biggest impact of all the different tools of generative AI. But part of the excitement of generative AI is also image generation, and there are also starting to be some models that can generate either text or images. These are sometimes called multimodal models, because they can operate in multiple modalities, text or images.

What I'd like to do in this video is share with you how image generation works. Let's take a look. With just a prompt, you can use generative AI to generate a beautiful picture of a person that never existed, or a picture of a futuristic scene, or a picture of a cool robot like this. How does this technology work? Image generation today is mostly done via a method called a diffusion model. Diffusion models have learned from huge numbers of images found on the internet or elsewhere, and it turns out that at the heart of a diffusion model is supervised learning.

Here's what it does. Let's say the algorithm finds a picture on the internet of an apple like this, and it wants to learn from pictures like this, and hundreds of millions of others, how to generate images. The first step is to take this image and gradually add more and more noise to it, so that you go from this nice picture of an apple, to a noisier one, to an even noisier one, to finally a picture that just looks like pure noise, where all the pixels are just chosen at random and it doesn't look at all like an apple.
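The noising step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular diffusion model is implemented: the "image" is a short list of pixel intensities, the helper names `add_noise` and `noising_sequence` are hypothetical, and the noise is a simple linear blend with uniform random values (production systems typically use Gaussian noise with a tuned schedule). Each noise level is also drawn independently from the clean image here, rather than accumulated step by step.

```python
import random

def add_noise(image, noise_level, rng):
    """Blend each pixel with a random value.

    noise_level = 0.0 returns the original pixels; 1.0 returns pure noise.
    """
    return [(1 - noise_level) * p + noise_level * rng.random() for p in image]

def noising_sequence(image, num_steps, rng):
    """Return num_steps + 1 images, from the clean image to pure noise."""
    return [add_noise(image, step / num_steps, rng) for step in range(num_steps + 1)]

rng = random.Random(0)
# Toy stand-in for the apple picture: six pixel intensities in [0, 1].
apple = [0.9, 0.1, 0.1, 0.8, 0.2, 0.1]

# Three noising steps give four images, matching the four pictures on the slide.
sequence = noising_sequence(apple, num_steps=3, rng=rng)
for step, img in enumerate(sequence):
    print(step, [round(p, 2) for p in img])
```

The first printed row is the clean image and the last row no longer depends on the apple at all, which mirrors the progression from a clean picture to pure noise.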

The diffusion model then uses pictures like these as data to learn, using supervised learning, to take as input a noisy image and to output a slightly less noisy image. Specifically, it would create a dataset where the first data point says: if it's given the second input image, what we want the supervised learning algorithm to do is learn to output a cleaner version of this apple. Here's another data point: given this third image, an even noisier image, we would like the algorithm to learn to output a slightly less noisy version like this. And finally, given an image of pure noise like this fourth image, we would like it to learn to output a slightly less noisy picture, here one that just suggests the presence of an apple.
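The dataset construction just described, where each noisy image is paired with the slightly cleaner image before it, might look roughly like this. It is a sketch under the same toy assumptions as the description (a list of pixel values standing in for an image, uniform noise blended linearly); `add_noise` and `make_training_pairs` are hypothetical helper names, and no model is actually trained here.

```python
import random

def add_noise(image, noise_level, rng):
    """Blend each pixel with a random value; 0.0 keeps the image, 1.0 is pure noise."""
    return [(1 - noise_level) * p + noise_level * rng.random() for p in image]

def make_training_pairs(image, num_steps, rng):
    """Build supervised (input, target) pairs from one clean image.

    The input is the image at one noise level; the target is the image
    one level cleaner, so the model learns to remove a little noise.
    """
    levels = [add_noise(image, step / num_steps, rng) for step in range(num_steps + 1)]
    return [(levels[i + 1], levels[i]) for i in range(num_steps)]

rng = random.Random(0)
apple = [0.9, 0.1, 0.1, 0.8, 0.2, 0.1]  # toy stand-in for the apple picture

pairs = make_training_pairs(apple, num_steps=3, rng=rng)
for noisy_input, cleaner_target in pairs:
    print("input :", [round(p, 2) for p in noisy_input])
    print("target:", [round(p, 2) for p in cleaner_target])
```

With four images there are three pairs: (second image, clean apple), (third image, second image), and (pure noise, third image), the same three data points walked through above.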

After training on maybe hundreds of millions of images via a process like this, when you want to apply it to generate a new image, this is how you would run it. You would start off with a pure noise image, so start by taking a picture where every single pixel in the picture is just chosen completely at random. We then feed this picture to the supervised learning algorithm that we trained on the previous slide. When we feed in pure noise, it has learned to remove a little bit of noise from this picture, and you may end up with a picture like this, which suggests some sort of fruit in the middle, but we're not quite sure what it is yet. Given the second picture, we again feed it to the model, and it then takes away a little bit more noise, and now it looks like we can see a noisy picture of a watermelon. And then if you apply this one more time, we end up with this fourth image, which looks like a pretty nice picture of a watermelon.

I'm illustrating this process using four steps of adding noise on the previous slide and four steps of removing noise on this slide, but in practice maybe about 100 steps would be more typical for a diffusion model. So this algorithm will work for generating pictures completely at random.
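The generation loop described above (start from pure noise, then repeatedly feed the model's output back in) can be sketched as follows. Everything model-related here is an assumption for illustration: `toy_denoiser` is a hypothetical stand-in for a trained network that simply nudges pixels toward a fixed `TARGET`, which plays the role of whatever the model learned in training. A real diffusion model would be a neural network and would not have a single stored answer.

```python
import random

# Hypothetical: pretend training taught the model to produce this 4-pixel "image".
TARGET = [0.2, 0.8, 0.3, 0.7]

def toy_denoiser(noisy_image):
    """Stand-in for a trained model: return a slightly less noisy image
    by moving each pixel halfway toward TARGET."""
    return [p + 0.5 * (t - p) for p, t in zip(noisy_image, TARGET)]

def generate(denoiser, num_pixels, num_steps, rng):
    """Start from pure noise and apply the denoiser num_steps times."""
    image = [rng.random() for _ in range(num_pixels)]  # every pixel chosen at random
    for _ in range(num_steps):
        image = denoiser(image)  # the output becomes the next input
    return image

rng = random.Random(0)
# About 100 steps, the figure mentioned as typical in practice.
result = generate(toy_denoiser, num_pixels=4, num_steps=100, rng=rng)
print([round(p, 3) for p in result])
```

The point of the sketch is the control flow: one model, applied many times, with each pass removing a little more noise.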

But we want to be able to control the image it generates by specifying a prompt to tell it what we want it to generate. Let me describe a modification of the algorithm that lets you add text, or add a prompt, to tell it what you want it to generate. In this training data, we are given pictures like this apple as well as a description, or prompt, that could have generated this apple. So here I have a text description saying "this is a red apple". Then we will, same as before, add noise to this picture until we get this fourth image, which is pure noise.

But we're going to change how we build the learning algorithm. Rather than inputting the slightly noisy picture and expecting it to generate a clean picture, we instead have the input A to the supervised learning algorithm be this noisy picture as well as the text caption, or the prompt, that could have generated this picture, namely "red apple". Given this input, we now want the algorithm to output this clean picture of an apple. Similarly, we'll generate additional data points for the algorithm using the other noisy images, where each time, given a noisy image and the text prompt "red apple", we want the algorithm to learn to generate a less noisy picture of a red apple.

So, having learned from a very large dataset, when you want to apply the algorithm to generating, say, a green banana, this is what you do. Same as before, we start off with an image of pure noise, so every single pixel is chosen completely at random. If you wanted to generate a green banana, you input to the supervised learning algorithm that picture of pure noise together with the prompt "green banana". Now that it knows you want a green banana, hopefully the algorithm will output a picture that maybe looks like this. You can't see the banana that clearly, but maybe there's a suggestion of some sort of greenish fruit in the middle. This is the first step of image generation.

The next step is we then take this image on the right, which was the output B, and now feed that in as the input A, again with the prompt "green banana", to get it to generate a slightly less noisy picture. Now we see clearly that it looks like a green banana, but a pretty noisy one. We do this one more time, and it finally removes most of the noise, until we end up with that picture of a pretty nice green banana.

So that's how diffusion models work for generating images, and at the heart of this really magical process of generating beautiful images is, again, supervised learning. Thanks for sticking with me for this optional video, and I look forward to seeing you next week, where we'll dive much more into applications being built using generative AI. I'll see you in the next video.
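As a recap of the prompt-conditioned procedure from this video, here is a small sketch of both halves: building training examples whose input A is a noisy image plus its caption, and generating by repeatedly feeding output B back in as input A with the same prompt. All names are hypothetical, and `LEARNED` is an assumed lookup table standing in for what a trained network would have absorbed from a very large dataset; a real model generalizes to unseen prompts rather than looking them up.

```python
import random

def add_noise(image, noise_level, rng):
    """Blend each pixel with a random value; 0.0 keeps the image, 1.0 is pure noise."""
    return [(1 - noise_level) * p + noise_level * rng.random() for p in image]

def make_conditioned_pairs(image, prompt, num_steps, rng):
    """Input A = (noisy image, prompt); output B = the image one level cleaner."""
    levels = [add_noise(image, s / num_steps, rng) for s in range(num_steps + 1)]
    return [((levels[i + 1], prompt), levels[i]) for i in range(num_steps)]

# Hypothetical stand-in for a trained model's knowledge: prompt -> 3-pixel "image".
LEARNED = {"red apple": [0.9, 0.1, 0.1], "green banana": [0.2, 0.8, 0.1]}

def toy_conditional_denoiser(noisy_image, prompt):
    """Stand-in for the trained model: move pixels halfway toward the prompt's image."""
    target = LEARNED[prompt]
    return [p + 0.5 * (t - p) for p, t in zip(noisy_image, target)]

def generate(prompt, num_pixels, num_steps, rng):
    """Start from pure noise; pass the same prompt at every denoising step."""
    image = [rng.random() for _ in range(num_pixels)]
    for _ in range(num_steps):
        image = toy_conditional_denoiser(image, prompt)  # output B becomes input A
    return image

rng = random.Random(0)
pairs = make_conditioned_pairs(LEARNED["red apple"], "red apple", num_steps=3, rng=rng)
banana = generate("green banana", num_pixels=3, num_steps=100, rng=rng)
print(len(pairs), [round(p, 3) for p in banana])
```

The only change from the unconditioned version is that the prompt travels alongside the noisy image, both in every training example and at every generation step.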
