YouTube transcript:
ResNet (actually) explained in under 10 minutes


Summary

Core Theme

This content explains the concept of residual networks (ResNets) and their fundamental contribution to enabling the training of much deeper neural networks by addressing the vanishing gradient problem and signal degradation.


I want you to imagine approximating a function parameterized by a deep neural network. In this example we're going to take an upsampled low-resolution input image and pass it through each of the network layers. We want the network to output the input image, but now in high resolution, a task commonly known as super-resolution.

Unfortunately, in practice, after training our network on high- and low-resolution image pairs, somehow our network is spitting out images that are even worse than our input. After putting all your effort into a beautifully deep architecture, you are horrified to see that instead of going down, your training loss shoots endlessly upwards. Your classmates and colleagues can't help but laugh. This seems counterintuitive, because the model now has more parameters. Now, how can we address this and get your loss going in the right direction?

[Applause]

This problem partly comes down to the fact that we have an input signal that is being lost the deeper and deeper we go into our network, as the signal passes through each of the non-linear functions at each layer. Look at what can happen to a training signal even after being passed through a single ReLU function, the most popular activation function for neural networks.
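To make the loss of signal concrete, here is a minimal sketch in plain Python. The one-dimensional `signal` and `other_signal` lists are made-up values for illustration, not data from the video. It shows that ReLU zeroes every negative entry, so two different inputs can become identical after a single activation.

```python
def relu(values):
    """Apply ReLU element-wise: negative entries become zero."""
    return [max(0.0, v) for v in values]

# Hypothetical one-dimensional "signal" with positive and negative parts.
signal = [-2.0, -0.5, 0.0, 1.5, 3.0]
activated = relu(signal)
print(activated)  # [0.0, 0.0, 0.0, 1.5, 3.0]

# A second input that differs only in its negative entries.
other_signal = [-7.0, -0.1, 0.0, 1.5, 3.0]

# After ReLU the two inputs are indistinguishable: that part of the
# signal cannot be recovered by any later layer.
print(relu(other_signal) == activated)  # True
```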

Essentially, you are asking the network to do two things. One is to retain the input signal, and the second is to find out what needs to be added to the input image to transform it from a low- to a high-resolution image.

Instead, let's look at the problem from a different angle. Let's first subtract the low-resolution image from the high-resolution image. This gives us what is known as a residual image, or the difference between the two images. Now let's rearrange this equation to get our intended output on the right-hand side: the low-resolution image plus the residual equals the high-resolution image. Now, given we already have the low-resolution image at training time, let's just get our network to learn the only bit we actually care about: the residual. Framing the problem in this way makes the network's life easier, as it doesn't need to retain the entire input signal.
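The rearrangement described above can be sketched with tiny made-up images. The 2x2 integer pixel grids below are purely illustrative assumptions. The residual is the high-resolution image minus the low-resolution one, and adding it back onto the low-resolution image recovers the target.

```python
def subtract(a, b):
    """Element-wise a - b for two equally sized 2D grids."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def add(a, b):
    """Element-wise a + b for two equally sized 2D grids."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# Hypothetical 2x2 grayscale images (integer pixel intensities).
low_res = [[100, 120], [140, 160]]   # upsampled low-resolution input
high_res = [[104, 118], [150, 160]]  # high-resolution target

# residual = high_res - low_res: the only part the network has to learn.
residual = subtract(high_res, low_res)
print(residual)  # [[4, -2], [10, 0]]

# Rearranged: low_res + residual = high_res.
reconstructed = add(low_res, residual)
print(reconstructed == high_res)  # True
```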

This was the same intuition that inspired the authors of the 2015 paper "Deep Residual Learning for Image Recognition". This paper is now considered seminal in relation to deep learning, with over 130,000 citations. It is rare to run into a model architecture in deep learning today that doesn't utilize the contributions from this paper in some fashion.

In the previous example I gave you an easy and intuitive introduction to residuals. Let's have another look at the layers of a neural network. I chose to present residual connections to you using the example of super-resolution, as it can be visualized very easily. By simply adding the input onto the output, we can instead learn the mapping to the residual image, as you can see here.

However, the approach I've shown you so far has two major problems when generalizing to other tasks. The first problem arises when the inputs and outputs of a task don't share the same dimensionality. For example, in image classification you take an image input and map it to a single class label. How would you meaningfully add the inputs and outputs in this scenario?

The second problem is how the input signal is propagated throughout the network. Let's consider the midpoint of our super-resolution network. At this point, no matter what our input or output is, it is still easy for the network to lose the training signal. This signal is an important piece of information that would be useful for the network to have access to.

In order to remedy both of these problems, we can add what are known as residual connections all the way along our network. This not only boosts input signals all the way along the network, but also makes it easier to sum inputs and outputs, as feature dimensionality is adjusted on the go. We can now also view the network as a series of residual blocks instead of a series of independent layers.
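Viewing the network as a chain of residual blocks can be sketched as follows. This is a toy illustration, not the architecture from the paper. The `layers` are hypothetical stand-in functions on short integer lists, and every block keeps the feature size fixed, so it does not show dimensionality being adjusted along the way.

```python
def residual_block(x, layer):
    """Return x + layer(x): the block's input is added back onto its output."""
    fx = layer(x)
    return [xi + fi for xi, fi in zip(x, fx)]

def run_network(x, layers):
    """Pass x through a series of residual blocks, one per layer."""
    for layer in layers:
        x = residual_block(x, layer)
    return x

# Hypothetical toy layers standing in for learned transformations.
layers = [
    lambda v: [2 * e for e in v],  # block 1 computes x + 2x
    lambda v: [e - 1 for e in v],  # block 2 computes x + (x - 1)
]

output = run_network([1, 2, 3], layers)
print(output)  # [5, 11, 17]
```

Because each block adds its input back on, the signal entering a block is always carried forward to the next one, whatever the layer inside the block does to it.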

Most importantly, the network now has the option to not fully utilize all the blocks, since it is easy for each block to output the identity function and take no penalty in relation to the loss function. This opens the doors to training extremely deep networks.
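One way to see why the identity is easy for a residual block is to compare it with a plain block when the inner layer outputs zeros. In this sketch the single-weight "layer" is a deliberately simplified, hypothetical stand-in for a learned layer.

```python
def make_scaled_layer(weight):
    """Hypothetical one-parameter layer: multiplies every feature by weight."""
    def layer(x):
        return [weight * v for v in x]
    return layer

def plain_block(x, layer):
    """A block without a skip connection: output is just layer(x)."""
    return layer(x)

def residual_block(x, layer):
    """A block with a skip connection: output is x + layer(x)."""
    return [xi + fi for xi, fi in zip(x, layer(x))]

x = [1, -2, 3]
zero_layer = make_scaled_layer(0)  # all weights driven to zero

# With zero weights the residual block passes its input through unchanged...
print(residual_block(x, zero_layer) == x)        # True
# ...whereas the plain block wipes the signal out.
print(plain_block(x, zero_layer) == [0, 0, 0])   # True
```

Under this simplification, a residual block that is not needed only has to push its weights towards zero, while a plain block would have to learn an exact identity mapping to avoid damaging the signal.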

Now let's have a deeper look at the main idea I introduced here.

So what exactly was the ResNet block they proposed in the original paper? Let's go through it step by step. Firstly, we pass our inputs through a 3x3 convolutional layer with a stride of one and a padding of one. These parameters mean that our output features will have the same spatial dimensionality as our input.
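The claim about output size can be checked with the standard convolution size formula, sketched here for one spatial dimension and assuming a dilation of one. The input size of 32 is an arbitrary example value.

```python
def conv_output_size(input_size, kernel_size, stride, padding):
    """Output length of a convolution along one spatial dimension.

    Uses floor((input + 2 * padding - kernel) / stride) + 1 and assumes
    a dilation of one.
    """
    return (input_size + 2 * padding - kernel_size) // stride + 1

# 3x3 kernel, stride 1, padding 1: the size is preserved.
print(conv_output_size(32, kernel_size=3, stride=1, padding=1))  # 32

# The same kernel with stride 2 halves the size instead.
print(conv_output_size(32, kernel_size=3, stride=2, padding=1))  # 16
```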

We then apply batch norm to renormalize these features and pass them through an activation function such as ReLU. We then pass the features through a second convolutional layer, exactly the same as the first, and again followed by a batch norm.
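As a rough illustration of what "renormalize" means here, the sketch below normalizes a single channel of made-up feature values to roughly zero mean and unit variance. It is a simplification: it uses only the current batch statistics and omits the running averages a real batch norm layer keeps for inference. The `gamma` and `beta` arguments stand for the learnable scale and shift.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel using the statistics of the current batch."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

# Hypothetical feature values for one channel across a batch.
features = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(features)

new_mean = sum(normalized) / len(normalized)
new_var = sum((v - new_mean) ** 2 for v in normalized) / len(normalized)
print(abs(new_mean) < 1e-9)       # True: mean is (numerically) zero
print(abs(new_var - 1.0) < 1e-4)  # True: variance is close to one
```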

At this stage we just have a normal vanilla neural network, so let's now add a residual connection. We can do this by simply adding the block's inputs onto the current set of features. We do this element-wise, as our inputs and features share the same dimensionality. Remember, this is only because we have carefully chosen our convolutional parameters. However, for tasks such as image classification we do actually want to reduce our dimensionality throughout the network. More on that in a moment.

Finally, we pass our features through a final activation function. Now, that is essentially it. It really is quite a simple idea. Now let's have a quick look at the official PyTorch code implementation for a ResNet block's forward pass and consolidate what we've just learned. We start with an input tensor x and save a copy of this as our identity, which we can use later. We then pass our input through a set of convolutional, batch norm and activation layers, downsampling the features if required.
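The order of operations described for the block can be summarized in a framework-free sketch. This is not the PyTorch implementation itself. The layers are passed in as plain callables, and the `double` and `passthrough` stand-ins below are hypothetical placeholders chosen only so the example runs without any library.

```python
def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]

def residual_block_forward(x, conv1, bn1, conv2, bn2, downsample=None):
    """Sketch of a residual block's forward pass using stand-in layers."""
    identity = x                  # keep the block's input for later
    out = conv1(x)                # first convolution
    out = bn1(out)                # batch norm
    out = relu(out)               # activation
    out = conv2(out)              # second convolution
    out = bn2(out)                # batch norm
    if downsample is not None:
        # Adjust the saved input so it matches the main path's shape.
        identity = downsample(x)
    out = [o + i for o, i in zip(out, identity)]  # residual connection
    return relu(out)              # final activation

# Hypothetical stand-in layers; real ones would be learned.
double = lambda v: [2.0 * e for e in v]
passthrough = lambda v: list(v)

x = [1.0, -3.0]
result = residual_block_forward(x, double, passthrough, double, passthrough)
print(result)  # [5.0, 0.0]
```

The `downsample` argument is only needed when the main path changes the feature shape, since the element-wise addition requires both operands to match.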


https://www.youtube.com/watch?v=UF8uR6Z6KLc
