0:00 okay so this lecture is going to be
0:02 about back propagation in CNN
0:04 architecture before we get started with
0:07 this topic there are two things that I
0:09 want to point out here first is that
0:11 since we are learning the concept of
0:13 back propagation in CNN or since we are
0:16 going through the topic of CNN then I am
0:19 just assuming that you are already aware
0:21 of the concept of back propagation in
0:24 ANNs and machine learning so I am
0:26 assuming that you have a basic idea
0:28 around how the parameter training
0:30 happens in a machine learning algorithm
0:32 or in an architecture like artificial
0:35 neural network and you have a decent
0:37 understanding around gradient descent as
0:40 well so in case you are not comfortable
0:42 around these things then you can
0:44 consider checking out the back
0:45 propagation lectures from the deep
0:47 learning and machine learning playlist
0:48 and then you can come back to this
0:51 particular video so that it makes more
0:53 sense to you and the second thing that I
0:55 want to point out here is that this back
0:58 propagation thing in CNN is something
1:00 that you will not have to be bothered
1:03 about when you are working in a CNN
1:05 project, or I should say, when you are
1:07 using a convolutional neural network
1:09 architecture your code will handle
1:11 everything over here but it is just that
1:13 being a good data scientist you should
1:15 have some clarity about what is actually
1:17 happening behind the code execution and
1:20 this is why for your general
1:21 understanding and from the interview
1:23 perspective as well it is important that
1:26 you should have some idea around back
1:28 propagation in CNN architecture as well
1:31 this should not look like a black box to
1:33 you okay so keeping these two things in
1:35 the mind let's go ahead and proceed with
1:37 the topic
1:45 so if you have been following in
1:48 sequence then so far going by the
1:51 previous lectures of CNN you should be
1:53 already comfortable around convolutional
1:56 layer and you should have clarity about
1:58 what feature maps are; we also discussed
2:01 around pooling different types of
2:03 pooling like max pooling, min pooling,
2:05 average pooling Etc and exactly how
2:08 after applying the pooling method the
2:10 feature map gets reduced in size then
2:13 we apply another convolutional layer and
2:16 then we apply another pooling if
2:17 required you can add or remove the
2:19 number of layers for convolution
2:21 operation and pooling both, so there is
2:24 no concrete rule that you should
2:26 have at least a certain number of
2:27 convolutional or pooling layers, so it is
2:30 completely your choice we also
2:31 understood what the meaning of
2:34 the flattened layer is, where basically what
2:36 we do is we take each and every row of
2:38 pixels that we have within the image and
2:41 then we stack them on top of each other
2:45 in order to create the input column this
2:48 is the thing which happens in flattened
2:50 layer and then we pass this information
2:51 to the fully connected layer which is
2:54 the ANN attached to the end of this CNN
2:57 architecture and by the end you get the
3:00 classes with probability values in order
3:02 to do the classification of the object
3:05 within the image. So within the last three to
3:07 four lectures we have discussed
3:09 all these topics we have also discussed
3:12 about the concept of padding and strides
3:15 as well, so if your understanding
3:16 of all these things is absolutely
3:19 clear then we can go ahead and discuss
3:22 about back propagation now so first of
3:24 all I will quickly erase all this to
3:27 make it more clear and in order to
3:29 understand the back propagation in CNN I
3:32 am going to create a simpler version of
3:35 the CNN architecture so let's say that
3:38 we have an image here and the size of
3:41 the image is six by six and then we have
3:45 a filter or kernel matrix, so it is a
3:48 filter or you can call it a kernel, and this
3:51 kernel Matrix has a size of three by
3:54 three so basically we are going to use
3:56 this filter to convolve over the top of
3:59 the image this is something that you
4:01 should be anyways comfortable with going
4:03 by the previous lectures as we have
4:05 already discussed this and after doing
4:07 the convolutional operation the next
4:10 thing that you will get will be a
4:11 feature map, and since the image is in a
4:14 size of 6x6 and the filter we have in a
4:17 size of three by three going by the
4:20 formula that we have discussed earlier,
4:22 which was (M - N + 1) x (M - N + 1),
4:26 where M represents the size of the
4:29 M x M image and N represents the size
4:32 of the N x N kernel matrix, this
4:36 feature map will be of size 4 by 4,
4:38 and since over here we are considering
4:41 only one filter, you can also call it
4:44 a size of 4 by 4 by 1.
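To make the arithmetic concrete, here is a minimal NumPy sketch of a stride-1, no-padding convolution (strictly speaking a cross-correlation, which is what CNN libraries compute); the image and filter values are dummy placeholders, not the lecture's numbers:

```python
import numpy as np

def convolve2d_valid(image, kernel, bias=0.0):
    """Slide the kernel over the image with stride 1 and no padding."""
    M, _ = image.shape
    N, _ = kernel.shape
    out = M - N + 1                                # output size from the formula above
    feature_map = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            patch = image[i:i + N, j:j + N]
            feature_map[i, j] = np.sum(patch * kernel) + bias
    return feature_map

image = np.random.randn(6, 6)                      # a dummy 6x6 "image"
kernel = np.random.randn(3, 3)                     # a dummy 3x3 filter
print(convolve2d_valid(image, kernel).shape)       # (4, 4)
```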
4:49 let's assume that over here instead of
4:52 having only one filter you have two more
4:55 filters like this okay of the same size
4:58 in that case over here you will have two
5:01 more feature Maps like this and the size
5:03 of the feature map will become four by
5:05 four by three. But anyway, since for
5:09 simplicity purposes we are considering
5:11 only one filter, we are going to have
5:13 only one feature map so I am erasing all
5:16 this after achieving this feature map
5:18 the next thing that we will do is we
5:20 will apply ReLU, which is a simple
5:22 operation of converting all the negative
5:25 values into zero, nothing else, so
5:27 obviously this will not change the
5:29 size of the feature map and the size
5:32 remains four by four, and as you already
5:34 may guess, the next step will be
5:36 applying the pooling operation, and after
5:39 applying the pooling operation, as you
5:41 know, the size of the feature map will
5:43 be reduced, so it shrinks to two by two.
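As a quick sketch (again with dummy values, not the lecture's numbers), the ReLU and 2x2 max-pooling steps can be expressed in NumPy like this, assuming a 4x4 feature map coming out of the convolution:

```python
import numpy as np

feature_map = np.random.randn(4, 4)          # dummy 4x4 feature map from the conv layer

# ReLU: replace every negative value with zero, shape stays 4x4
relu_out = np.maximum(feature_map, 0)

# 2x2 max pooling with stride 2: keep the largest value in each 2x2 window
pooled = relu_out.reshape(2, 2, 2, 2).max(axis=(1, 3))

print(relu_out.shape, pooled.shape)          # (4, 4) (2, 2)
```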
5:46 so basically we are taking this image
5:48 then we are applying the filter to do
5:51 the convolutional operation in order to
5:53 achieve this feature map, a feature map
5:55 that will have the features from this
5:58 input image, then we apply the ReLU
6:00 operation on top of this feature map to
6:02 achieve this one and then we apply the
6:05 pooling operation to get a feature map
6:07 which is reduced in size so basically in
6:10 this architecture we have only one
6:12 convolutional layer and only one pooling
6:15 layer, and since this feature map is
6:18 reduced in size and it is in a size of
6:20 2x2 which means it is basically an image
6:24 of extremely low resolution that has
6:26 only 4 pixels, so in practice obviously
6:30 you will not work with images of this
6:32 size, but try to understand, I am
6:34 keeping this entire thing simple just to
6:36 give you a better explanation nothing
6:38 else and the next step will be to create
6:40 the flattened layer so basically what we
6:43 need to do is we will take this image or
6:46 the feature map, and first we will take
6:48 the first row of pixels and place it like
6:51 this, so this will have two values; then
6:53 we will take the second row of pixels and
6:55 we will place it at the bottom of this, and
6:58 this will also have two values and this
7:01 entire thing becomes the input of the
7:03 fully connected neural network towards
7:06 the end of this CNN architecture so I
7:09 will write it as ANN and again please do
7:12 not consider this as a single neuron
7:14 this circle at the end is representing
7:17 an entire neural network so basically
7:21 this ANN thing that I have drawn towards
7:24 the end is representing a neural
7:27 network that will have four neurons or
7:30 perceptrons in the input layer because
7:32 we have four values in the input over
7:35 here, then you will have a number of hidden
7:37 layers, you can choose that, and finally
7:39 we will have one output layer and this
7:42 particular ANN thing is representing
7:44 this whole architecture, so I tried to
7:47 explain this to you so that you don't
7:49 have questions in your mind that in the
7:51 input layer if we have four values then
7:54 how come we have only one perceptron
7:56 over here okay so this is actually not a
7:58 perceptron this is representing an
8:00 entire artificial neural network
8:02 architecture and now I am going to make
8:05 the screen a bit more clean by erasing
8:08 the unnecessary stuff right so this is a
8:12 very basic CNN architecture that we are
8:14 considering in order to understand the
8:17 back propagation operation so going by
8:19 your previous understanding of back
8:21 propagation in ANN or in any other
8:24 machine learning algorithm where you
8:26 will have trainable parameters we
8:28 already know that the basic idea of back
8:31 propagation is nothing but propagating
8:34 backwards in your architecture in order
8:36 to adjust the value of your trainable
8:39 parameters so obviously over here we are
8:42 able to understand that how the forward
8:45 propagation is happening we take the
8:47 image and we take this filter or kernel
8:50 Matrix to do the convolutional operation
8:52 that gives us this feature map then we
8:55 apply the ReLU function to discard all the
8:58 negative values and replace them with
9:00 zeros here then we apply this pooling
9:03 operation in order to reduce the size of
9:06 the feature map and then we apply the
9:08 flattening method in order to have the
9:11 values as an input layer for the NN
9:13 architecture, and once this ANN
9:15 architecture gives you an output, let's
9:18 consider that as y hat over there
9:20 obviously we will use a loss
9:22 function in order to check how
9:25 close we are compared to the
9:27 actual value, and as a loss
9:30 function you can either go with binary
9:32 cross entropy in case you have only two
9:34 classes in your target column let's say
9:37 you are doing a classification between
9:38 cat and dog, or you can also use softmax with
9:42 categorical cross-entropy in case you have more than two classes,
9:45 and this will be the entire idea of
9:48 forward propagation starting from the
9:51 image towards the end of the
9:52 architecture so as a next step back
9:55 propagation will happen in order to
9:57 adjust the values of all the trainable
10:00 parameters and before we understand
10:02 about that let's try to understand first
10:05 that how many trainable parameters we
10:07 actually have within this particular
10:09 architecture so the parameters for which
10:12 we need to adjust the values in order to
10:15 reduce the loss are first of all these
10:18 all the values within this filter so 3
10:20 by 3 which means nine values that you
10:23 need to adjust here and plus this filter
10:26 Matrix will also have a bias value a
10:29 scalar value okay so I am taking that as
10:32 well so 9 plus 1 additionally let's
10:35 check that where else we have trainable
10:37 parameters in this architecture so after
10:40 flattening this particular matrix or
10:43 feature map, when we acquire our input
10:46 values so for example we have four
10:48 values over here which is being passed
10:51 to the fully connected neural network or
10:53 the artificial neural network
10:54 architecture then obviously each and
10:57 every value will have a dedicated
10:59 weight assigned to it, so it will have
11:01 W1 W2 W3 and W4 so we have four weights
11:07 over here plus let's also consider a
11:10 bias term over here as well so let's
11:12 take it as B2 and here we will take it
11:16 as B1, so we have five trainable
11:19 parameters towards the end as well, so it
11:21 will be 10 plus 5, so in total we have 15
11:26 trainable parameters whose
11:28 values need to be adjusted in order to
11:31 reduce the loss that we have.
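If you want to sanity-check this count of 15, here is a minimal Keras sketch of the same toy architecture (the library choice is my own assumption; the lecture does not use any code), where model.summary() should report 15 trainable parameters:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Toy CNN mirroring the lecture: a 6x6x1 image, one 3x3 filter,
# 2x2 max pooling, flatten, and a single sigmoid output unit.
model = keras.Sequential([
    keras.Input(shape=(6, 6, 1)),
    layers.Conv2D(1, (3, 3), activation="relu"),   # 3*3*1 weights + 1 bias = 10 params
    layers.MaxPooling2D((2, 2)),                   # no trainable params
    layers.Flatten(),                              # 2x2x1 -> 4 values, no params
    layers.Dense(1, activation="sigmoid"),         # 4 weights + 1 bias = 5 params
])

model.summary()   # trainable params should come out to 15
```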
11:34 And now that we have understood how many
11:36 trainable parameters we have, let's try
11:38 to understand the backward propagation
11:41 operation over here so basically I want
11:43 to keep this lecture in such a way that
11:45 this should be the only video that you
11:48 will ever need in order to understand
11:50 the back propagation in CNN so
11:53 definitely this is going to be a bit
11:55 lengthy so please have some patience and
11:57 stick to the end so before we move ahead
11:59 what I want to do is I want to
12:01 generalize this entire architecture okay
12:04 because over here it is looking very
12:06 complex and this might seem a bit
12:08 difficult if I try to explain
12:10 back propagation to you on this particular
12:13 figure so let me place it over here on
12:16 the top and let's continue from here so
12:18 let's say that you have an image X and
12:21 you apply the convolutional operation on
12:23 top of this so let this be the symbol of
12:26 the convolutional operation which is
12:28 happening with the help of a kernel
12:31 Matrix or filter that has some weights
12:33 and that has a bias as well and after
12:37 doing this convolutional operation you
12:39 get a featured map and let's call that
12:41 X1 what is the next step then obviously
12:44 you apply the ReLU operation on top of
12:47 this, and let's say after applying this
12:49 ReLU operation you are getting R1, and
12:52 then the next step would be applying Max
12:54 pooling on top of this although you can
12:56 use any type of pooling but for the time
12:58 being let's consider we are using Max
13:00 pooling and then let's say you get P1
13:02 the feature map is now being called P1
13:05 what will be the next step then
13:07 obviously you will apply flattening and
13:10 this will give you the input layer let's
13:12 call that F this F will be passed on to
13:16 your artificial neural network
13:17 architecture or you can say the fully
13:20 connected neural network, and let's say
13:22 that will give you an equation s, which
13:24 will then be passed on to an activation
13:27 function like sigmoid that finally gives
13:30 you the output y hat.
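Putting this generalized pipeline together, here is a minimal sketch of the forward pass X -> conv -> ReLU -> max pool -> flatten -> fully connected -> sigmoid -> binary cross-entropy loss; the use of SciPy and every value and helper name here is an illustrative assumption on my part, not the lecture's code:

```python
import numpy as np
from scipy.signal import correlate2d   # assumption: SciPy available, used for brevity

def forward(X, W1, b1, W2, b2, y_true):
    X1 = correlate2d(X, W1, mode="valid") + b1        # convolution: 6x6 -> 4x4
    R1 = np.maximum(X1, 0)                            # ReLU
    P1 = R1.reshape(2, 2, 2, 2).max(axis=(1, 3))      # 2x2 max pooling: 4x4 -> 2x2
    F = P1.flatten()                                  # flatten: 2x2 -> 4 values
    s = np.dot(W2, F) + b2                            # fully connected part (one output unit)
    y_hat = 1.0 / (1.0 + np.exp(-s))                  # sigmoid
    loss = -(y_true * np.log(y_hat) + (1 - y_true) * np.log(1 - y_hat))  # binary cross-entropy
    return y_hat, loss

X = np.random.randn(6, 6)      # dummy image
W1 = np.random.randn(3, 3)     # 9 filter weights
b1 = 0.1                       # filter bias
W2 = np.random.randn(4)        # 4 dense weights
b2 = 0.0                       # dense bias
print(forward(X, W1, b1, W2, b2, y_true=1.0))
```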
13:33 So let's do one thing first: let's shrink the size of
13:36 this simple architecture a bit so that
13:38 it can fit well on the screen. All right,
13:40 so now this output value y hat can be used
13:43 in order to calculate the loss to check
13:46 that how well we are doing the
13:47 prediction now the way we have the
13:49 trainable parameters over here we also
13:52 have the trainable parameters over here
13:54 as well so if you remember what we
13:56 discussed on the top that when we are
13:58 passing on the input values to the fully
14:01 connected neural network here as well we
14:03 have four parameters and one bias value
14:06 right so I'm talking about that one so
14:08 let's call it as W2 and bias 2. so
14:12 remember that this W1 has 9 values
14:16 because it is a filter of three by three
14:18 size plus one bias value so in total
14:21 there are 10 trainable parameters and
14:24 over here we have four weights plus one
14:27 bias so these are the parameters for
14:30 which we need to adjust the values
14:31 during back propagation in order to
14:34 reduce this loss value or you can say in
14:38 order to do more accurate prediction
14:40 since the better you predict the Lesser
14:43 loss you will have right so basically we
14:46 want to update the values of these
14:49 parameters right and if you remember by
14:51 the previous lectures of deep learning
14:53 or machine learning we use the formula
14:56 of this gradient descent algorithm in
14:59 order to update these values so what
15:00 does the formula say let's say we want
15:02 to update the value for this parameter
15:05 W1, okay, so the new value for this weight
15:10 will be equal to the old value of W1
15:14 minus a small learning rate multiplied
15:18 by dL/dW_old, that is, W1_new = W1_old - learning_rate * (dL/dW1_old). Now what does this mean?
15:23 It simply means: how much does the value of
15:26 loss change when we bring a very
15:30 small change to the value of W_old?
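As a tiny illustration of that update rule (the gradient value below is a made-up placeholder; in reality it comes out of the backward pass), the weight simply moves a small step against its gradient:

```python
# A minimal sketch of the gradient descent update for a single weight.
learning_rate = 0.01
w1_old = 0.5
dL_dw1 = -2.3                      # pretend dL/dW1_old came out of back propagation

w1_new = w1_old - learning_rate * dL_dw1
print(w1_new)                      # 0.523: the weight moves against the gradient
```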
15:33 obviously at this point of time I am
15:34 assuming that you are anyways
15:37 comfortable with the extremely basic
15:39 calculus or the idea of derivatives so I
15:42 am not going to focus so much on that
15:44 and in exactly the same way we also
15:46 update the value of the bias as well
15:48 just like this okay so basically we are
15:52 looking for these new values okay and
15:54 let's see what we already have we have
15:56 the old values that we initiated
15:59 randomly okay that we have actually used
16:02 in order to do the forward propagation in
16:04 order to achieve this output and to
16:07 calculate the loss as well we know the
16:09 value of this learning rate because this
16:11 is something that we have decided or
16:13 chosen for the algorithm, so that is also
16:15 within our control all we need to find
16:18 is the value for this the derivative of
16:20 loss with respect to the previous value
16:22 or the old value of the weights so the
16:25 idea of back propagation is nothing but
16:27 calculating this particular thing okay
16:29 so I will erase these things first and
16:33 let me bring this entire portion to the
16:35 side over here, okay. So the next
16:38 thing that we are going to do is this
16:40 entire CNN architecture okay I want you
16:43 to assume it in two parts first will be
16:46 this the convolutional part and second
16:49 will be this where we have the fully
16:51 connected neural network and within the
16:54 previous lectures of this particular
16:56 deep learning playlist we have discussed
16:58 a lot about artificial neural networks
17:01 and we have also discussed about how the
17:03 weight adjustment or the values of the
17:06 weights are being adjusted in the
17:08 artificial neural network during the
17:10 back propagation so anyways you will
17:12 have a decent Clarity around that if you
17:14 are following along with this playlist
17:15 the challenging part over here to
17:17 understand is that how the back
17:20 propagation happens in the flattening
17:22 layer in the pooling layer also for the
17:24 ReLU operation and most importantly
17:26 within the convolutional layer how
17:29 exactly it is happening over here so
17:31 obviously this part is going to be the
17:33 Crux of this lecture but still I will
17:35 go ahead and quickly cover
17:37 how the weight adjustment is happening
17:39 here so I'm talking about these weights
17:42 actually okay so basically what we want
17:44 we want to understand that how the loss
17:48 will be changed with respect to a change
17:51 with W2 okay or over here you can also
17:55 say B2, but let's consider W2 for the
17:57 time being, and try to understand that W2 or
18:01 the weights for the W2 parameter is not
18:04 directly involved in order to calculate
18:07 the loss because using the values of W2
18:10 or B2 or this entire weights these five
18:13 weights first we are calculating this
18:15 equation s which is then being passed on
18:18 to the sigmoid function in order to
18:20 calculate the output which is y using
18:23 the value of y we are then calculating
18:25 the loss so now we will have to
18:27 propagate backwards in order to
18:29 understand how we calculate dL by
18:32 dW2, so let's try to understand that. So
18:36 obviously this is a point where we are
18:38 going to talk about chain rule so what
18:40 does the idea of chain rule say over
18:42 here first we will check that how the
18:44 value of loss Will Change by bringing a
18:47 small change in the value of y hat so
18:49 that will be dL/dy_hat. The next step
18:55 will be how the value of y_hat is
18:58 being changed by bringing a change in s,
19:01 so that will be dy_hat/ds, and then
19:06 finally we will calculate how the
19:09 value of s will be changed by changing
19:13 the values of W2 or B2, let's say; anyways
19:16 both will be calculated in the same way,
19:18 but for the explanation purpose I am
19:20 considering W2 for the time being, so
19:23 this will be ds/dW2, and applying
19:28 the method of the chain rule, these terms
19:30 cancel out, and this is how we calculate
19:32 dL/dW2 = dL/dy_hat * dy_hat/ds * ds/dW2.
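To make this chain rule concrete, here is a small numeric sketch (my own illustration, not the lecture's) for a single sigmoid output with binary cross-entropy; the input values and label are arbitrary:

```python
import numpy as np

# Dummy forward pass for the fully connected part: s = W2 . F + b2, y_hat = sigmoid(s)
F = np.array([0.3, -0.1, 0.7, 0.2])    # flattened input (4 values)
W2 = np.array([0.5, -0.4, 0.1, 0.9])
b2 = 0.05
y = 1.0                                 # true label

s = np.dot(W2, F) + b2
y_hat = 1.0 / (1.0 + np.exp(-s))
loss = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))   # binary cross-entropy

# Chain rule: dL/dW2 = dL/dy_hat * dy_hat/ds * ds/dW2
dL_dyhat = -(y / y_hat) + (1 - y) / (1 - y_hat)
dyhat_ds = y_hat * (1 - y_hat)
ds_dW2 = F
dL_dW2 = dL_dyhat * dyhat_ds * ds_dW2

print(dL_dW2)           # gradient for each of the four weights
# Well-known shortcut: for sigmoid + BCE this simplifies to (y_hat - y) * F
print((y_hat - y) * F)  # same numbers
```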
19:38 By finding this particular derivative, what
19:41 we basically try to check is what happens when we are
19:43 moving or changing the value of W2 in a
19:46 certain direction so let's say if we are
19:49 increasing using the value of W2 and
19:51 that is leading to the increasing value
19:53 of loss as well then obviously the
19:56 gradient descent formula will not update
19:58 the value of W2 by increasing it instead
20:02 it will go for the other approach by
20:04 trying to decrease the value of W2 and
20:07 it will check if the value of loss is
20:09 decreasing in that way this is how the
20:11 weight updation works anyways this is
20:13 something that we have already learned
20:15 in a lot of details when we were
20:17 discussing back propagation in ANN, but
20:19 still I thought of giving you this much
20:21 explanation for a revision purpose maybe
20:23 so that you should be feeling
20:25 comfortable within this lecture so
20:27 hopefully we are clear about the weight
20:30 updation part for this ANN part of the
20:33 CNN architecture and we are able to
20:35 understand that how these weights are
20:37 being updated or adjusted by doing the
20:40 back propagation by calculating
20:42 different derivatives and applying chain
20:44 rule in order to achieve the optimum
20:47 weights for these parameters and now we
20:50 are going to understand step by step the
20:52 idea of back propagation within the
20:54 convolutional part of this CNN
20:56 architecture since we are propagating
20:58 backwards from the end and we have
21:01 understood how we are calculating or
21:04 updating these values which means we
21:06 have already understood that how to
21:08 calculate this particular term right now
21:11 let's talk about the updation of these
21:13 weights so again for the explanation
21:15 purpose I am considering only W1 to show
21:18 you how the weight updation happens and
21:20 the same method or the same trick will
21:22 be applied for B1 as well okay so
21:24 basically we want to understand that how
21:27 the loss is being changed with respect
21:29 to W1 okay so obviously coming back from
21:33 the point of loss we are already here at
21:36 this point where we have calculated or
21:38 we have understood how DL by dw2 is
21:42 being calculated now let's try to flow
21:44 backwards from this point okay so we
21:47 will further apply the chain rule, and we
21:49 need to calculate the derivative of W2
21:52 with respect to, so from this point we
21:56 will now capture this one, so P1. The next
21:59 step will be to calculate the derivative
22:01 of P1 with respect to
22:04 R1, so this one, multiplied by, let me
22:08 erase this one, okay, we will calculate
22:10 the derivative of R1 with respect to
22:14 X1, and then finally
22:17 dX1 by dW1. Again, going by the chain
22:22 rule this will cancel out this this will
22:25 cancel out this and so on and we will
22:28 finally end up calculating that how loss
22:32 is changing with respect to bringing a
22:35 small change in W1, which is these
22:37 weights, W1 or B1 you can say, because
22:40 obviously you can understand the way we
22:42 are trying to update the value of W1
22:44 using this chain rule method the value
22:47 for B1 as well will be updated in
22:50 the similar way so let me make it
22:52 cleaner again since this part is all
22:55 clear for us we understand very well
22:57 how the weight updation happens for
23:00 the ANN architecture. Let's talk about
23:03 this part so how do we calculate the
23:05 derivative for the flattening layer so
23:07 let's understand about back propagation
23:10 in flattened layer so previously if you
23:13 will check what we were doing exactly in
23:15 the flattening layer, after applying the
23:17 pooling operation the size of the
23:19 feature map will be reduced from 4x4 to
23:23 2 by 2 and basically we were taking this
23:26 2 by 2 Matrix and we were flattening it
23:29 in order to have the input column or the
23:32 input values and during back propagation
23:34 we need to understand that how do we go
23:36 back from this step to this step right
23:40 so let's understand that so after
23:42 applying the pooling operation we had
23:45 this Matrix of size 2x2 that we were
23:48 calling P, or let me erase it from
23:51 here and write it here, P, and then what
23:54 we were basically doing is we were
23:56 applying the flattening method in order
23:59 to create this array that will have four
24:01 values to pass it on as an input to the
24:05 fully connected neural network and when
24:07 doing back propagation we do exactly the
24:10 opposite of this: we take this input
24:12 column, okay, these four values which were
24:16 previously in a size of four by one, that
24:18 means four rows and one column, and we
24:22 restore it back to the previous shape
24:24 of two by two like this. That's it, that's
24:28 all happens when we are doing back
24:30 propagation in flattening layer so let's
24:32 go to the previous explanation so
24:34 previously we knew how we calculate
24:37 this P1 by applying max pooling, and we
24:40 flatten it in order to have the input
24:42 values, but during back propagation we
24:45 restore the input values back to the
24:47 previous shape of two by two, or whatever
24:49 the shape was previously, so this is what
24:52 happens in the step of dW2 by dP1.
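In code, this forward flatten and backward "un-flatten" is nothing more than a reshape; here is a minimal NumPy sketch with made-up gradient values:

```python
import numpy as np

P1 = np.array([[0.8, 0.2],
               [0.5, 0.9]])          # dummy 2x2 pooled feature map

# Forward: flatten 2x2 -> 4 values for the fully connected layer
F = P1.flatten()                      # shape (4,)

# Backward: the gradient arriving at F is simply reshaped back to 2x2.
# The gradient values here are hypothetical stand-ins for dL/dF.
dL_dF = np.array([0.1, -0.3, 0.05, 0.2])
dL_dP1 = dL_dF.reshape(P1.shape)      # shape (2, 2), no parameters involved

print(F.shape, dL_dP1.shape)          # (4,) (2, 2)
```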
24:56 now let's move further in the discussion
24:58 of back propagation so starting from the
25:00 loss we came here to here we updated the
25:04 weights and we also understood that how
25:06 back propagation happens in the
25:08 flattening layer now it is time that we
25:10 understand that how back propagation is
25:13 happening in the pooling layer so let's
25:16 discuss over that now so just like the
25:18 flattened layer again we do not have any
25:20 trainable parameters here, so we are not
25:23 going to use any conventional back
25:24 propagation; instead we will again do
25:27 exactly the opposite of what we have done
25:30 previously during pooling or max pooling
25:33 operation so if you remember the max
25:35 pooling operation in this way let's say
25:37 if you have a structure like this okay
25:41 let me draw it completely then I will
25:43 explain so let's say that you have an
25:45 image like this that has a size of four
25:48 by four and in total it has 16 pixels
25:51 then when you are applying Max pooling
25:54 basically what you are doing is you take
25:56 a window of two by two okay this could
25:59 be of any size but let's say two by two
26:01 and you start moving this window on top
26:04 of the image and for each and every
26:06 position let's say initially the window
26:09 is over here then specifically when you
26:13 are applying max pooling you take the
26:13 maximum value out okay so from the first
26:16 position you will take four again you
26:19 move the pooling window to the right and
26:21 you take out eight; similarly you place
26:24 it over here this time and you take out
26:27 12 and finally in the last position you
26:30 will be taking out the highest value
26:31 which is 16 okay this is more or less
26:36 the entire idea of Max pooling but when
26:39 you are doing back propagation on the
26:42 pooling layer you do exactly the
26:44 opposite of this particular operation so
26:46 let me tell you what exactly happens
26:48 here you take this Matrix okay and then
26:52 you move it backwards to the previous
26:54 shape and it is really important that
26:57 while doing this back propagation we
26:59 should have a note that from which
27:01 particular index we have taken the
27:03 maximum value like these ones okay
27:06 because when we are restoring the
27:08 information backwards in the previous
27:10 shape only that particular position from
27:13 where the maximum element has been taken
27:15 out only that position will be non-zero
27:18 and rest of the part we will keep it as
27:21 0 but let me tell you one thing again
27:23 that none of these things will be done
27:25 by you manually okay we have already
27:28 discussed this particular thing at the
27:30 beginning of this lecture that the
27:32 entire operation of back propagation
27:34 will be handled by your code only it is
27:38 just that you should have some
27:39 understanding that what is actually
27:41 happening behind the code execution so
27:44 that a typical CNN architecture should
27:47 not look like a black box to you it is
27:49 really important from the interview
27:50 perspective and this is how the back
27:53 propagation happens in the pooling layer
27:55 obviously there are no trainable
27:57 parameters all we are doing is we are
28:00 doing the reverse of the previous
28:01 operation so initially it was a four by
28:04 four matrix, and by applying the max
28:06 pooling operation it got converted to a
28:09 two by two Matrix during the back
28:11 propagation we take this 2 by 2 Matrix
28:13 and we store it back to the previous
28:15 shape in this way, keeping the positions
28:18 from where the highest elements were
28:20 taken as non-zero. That's it.
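Here is a minimal NumPy sketch of that idea: the forward max pool remembers where each maximum came from, and the backward pass scatters the incoming gradient back to exactly those positions, leaving everything else zero. The helper names and gradient values are my own placeholders, not the lecture's:

```python
import numpy as np

def maxpool_forward(x, size=2):
    """2x2 max pooling with stride 2, remembering where each max came from."""
    h, w = x.shape
    out = np.zeros((h // size, w // size))
    max_pos = {}                                  # (output index) -> (input index of the max)
    for i in range(0, h, size):
        for j in range(0, w, size):
            window = x[i:i + size, j:j + size]
            a, b = np.unravel_index(np.argmax(window), window.shape)
            out[i // size, j // size] = window[a, b]
            max_pos[(i // size, j // size)] = (i + a, j + b)
    return out, max_pos

def maxpool_backward(d_out, max_pos, input_shape):
    """Route each incoming gradient back to the position of the max; the rest stays 0."""
    dx = np.zeros(input_shape)
    for (oi, oj), (ii, ij) in max_pos.items():
        dx[ii, ij] = d_out[oi, oj]
    return dx

x = np.arange(1, 17, dtype=float).reshape(4, 4)   # a dummy 4x4 feature map
pooled, pos = maxpool_forward(x)
d_pooled = np.ones_like(pooled)                   # hypothetical upstream gradient
print(maxpool_backward(d_pooled, pos, x.shape))   # non-zero only at the max positions
```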
28:24 With that being said, we have now also
28:26 understood how this particular part
28:29 is being calculated so now we need to
28:32 understand the next step where we will
28:34 understand how the ReLU operation is
28:37 being considered during back propagation,
28:39 so this one is going to be the
28:41 simplest thing to understand, okay. See
28:43 what is happening within this ReLU
28:45 operation: all we are doing is, let's
28:47 assume over here we have a two by two
28:49 Matrix okay something that looks like
28:51 this let's say 2 minus 3 minus 1 and 6
28:56 then all we are doing is after applying
28:59 ReLU on this particular matrix, there
29:01 will be no change in the shape of this
29:03 Matrix it is just that all the positive
29:05 values will remain same and the negative
29:08 values will be converted to zero that's
29:10 the basic thing that the ReLU operation is
29:13 doing for you; this is the first thing.
29:14 Secondly, in order to calculate this
29:18 dL/dW1, which means in order to
29:21 understand how the loss is changing with
29:25 respect to bringing a small change in W1
29:28 or B1, let's say, we calculated or we are
29:31 trying to apply this particular chain
29:33 rule right and we have understood that
29:36 how the calculation is happening till
29:39 this point so far okay so when you are
29:43 propagating back from the point of loss
29:45 to the point of ReLU, which means from
29:48 this particular point to this particular
29:50 point after propagating back towards
29:53 over here the output that you will have
29:56 that also is going to be a two by two
29:58 Matrix and let's say this particular two
30:00 by two Matrix looks like this let's say
30:03 1 3
30:05 minus 6 and 2 okay so basically over
30:10 here what you are trying to do is you
30:13 are trying to calculate dr1 by dx1 which
30:17 means the derivative of these values all
30:21 the values within this particular Matrix
30:23 right so you will try to calculate D1
30:27 with respect to
30:29 dw1 because we are checking for W1 right
30:32 and similarly we will try to calculate
30:34 the derivative for other elements as
30:36 well so the idea is going to be very
30:39 simple if you are doing the derivative
30:42 for a positive value like this this or
30:46 this you will always get one obviously
30:49 you know that going by the basics of
30:51 calculus otherwise you will get zero
30:54 that's it that's the whole idea behind
30:56 it okay so if anyone asks you in an
30:59 interview that during back propagation
31:02 in a CNN architecture how does the back
31:04 propagation happens in the ReLU layer,
31:07 then you simply say that within this
31:10 particular step exactly here where we
31:12 are calculating dR1 by dX1, we are trying
31:16 to differentiate the values of R1 right
31:19 R1 is nothing but a 2 by 2 Matrix that
31:22 has four values and we are trying to
31:24 calculate the derivative of those four
31:26 values only and for those four values if
31:30 the values are positive, then obviously
31:32 the derivative will be 1, otherwise the
31:35 derivative will be zero. That's it.
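That masking behaviour is easy to see in a quick NumPy sketch (the matrices and gradient values are illustrative, not the lecture's):

```python
import numpy as np

X1 = np.array([[ 2.0, -3.0],
               [-1.0,  6.0]])        # dummy pre-ReLU feature map
R1 = np.maximum(X1, 0)               # forward ReLU: [[2, 0], [0, 6]]

# Backward: dR1/dX1 is 1 where the input was positive, 0 otherwise,
# so the upstream gradient is simply masked. The gradient here is hypothetical.
dL_dR1 = np.array([[0.4, -0.2],
                   [0.7,  0.1]])
dL_dX1 = dL_dR1 * (X1 > 0)           # gradient survives only where X1 was positive

print(dL_dX1)
```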
31:38 And finally, since we are done till this
31:39 point in the back propagation journey,
31:41 now the last step is to calculate this
31:44 dX1 by dW1.
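Before the intuition, here is a minimal NumPy sketch (my own, not the lecture's) of how this last piece is usually computed in code: the gradient of the loss with respect to each filter weight is the sum, over all feature-map positions, of the upstream gradient times the input pixel that weight touched there:

```python
import numpy as np

def conv_kernel_grad(image, dL_dX1, kernel_size):
    """Gradient of the loss w.r.t. the conv kernel for a stride-1 'valid' convolution.
    dL_dX1 is the gradient flowing back into the feature map X1 (hypothetical here)."""
    N = kernel_size
    out = dL_dX1.shape[0]
    dL_dW = np.zeros((N, N))
    for a in range(N):
        for b in range(N):
            # weight W[a, b] touched image[i + a, j + b] at every output position (i, j)
            dL_dW[a, b] = np.sum(dL_dX1 * image[a:a + out, b:b + out])
    return dL_dW

image = np.random.randn(6, 6)
dL_dX1 = np.random.randn(4, 4)        # pretend this came back through pooling and ReLU
print(conv_kernel_grad(image, dL_dX1, kernel_size=3))   # one gradient entry per filter weight
```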
31:50 So basically, for this particular filter, okay, where we are
31:52 using a three by three filter over here,
31:54 let's say for simplicity purposes that
31:57 we are using a filter of 2 by 2, okay,
32:00 that looks, let's say, something like this,
32:02 simply like 0, 0, 1, and 1. Okay, so now
32:07 during back propagation we will try to
32:09 update the values of this particular
32:11 filter into any random number okay
32:13 although that will not be random the
32:16 values will be updated in such a way
32:18 that the loss should be minimum but the
32:20 idea is to have the values in such a way
32:24 that these values within the filter when
32:27 applied on the image during the
32:30 convolutional operation this should
32:32 derive some meaningful features that's
32:34 it so I'll try to explain what I mean
32:36 here let's consider a very simple use
32:38 case let's say that you have an image
32:40 okay in the image you have a round
32:43 object like this okay and then you have
32:46 a two by two filter that you are using
32:48 size 2x2 uh let me write it outside okay
32:53 2 by 2 and this will have random values
32:56 like X1 let's say X2 uh X3 or let's say
33:00 X4 then what's the idea of using this
33:03 particular filter we will convert the
33:05 filter on top of the image first at this
33:07 position next over here straight by
33:10 stride then over here then over here
33:12 right so after convolving the filter all
33:16 over the image we should be able to grab
33:18 some meaningful features or edges right
33:21 like some primitive features like this
33:24 like this this this and then towards the
33:28 end of the convolutional layer finally
33:30 we will be able to identify the object
33:32 like this okay but let's say let's say
33:36 the values of your filter are kept
33:39 randomly in such a way that it will
33:41 detect horizontal edges like this or
33:43 let's say vertical edges like this or
33:45 let's say slant edges like this, then you
33:48 can think for yourself that combining this kind
33:51 of edges you will never be able to form
33:53 a circle that you have on the image so
33:56 when you finally flow towards the end
33:59 of the CNN architecture, over here the
34:02 loss will be very high when you calculate
34:04 the loss using binary cross entropy or
34:07 let's say soft Max the loss will be very
34:09 high okay because the edges that you are
34:12 trying to capture with the help of this
34:14 particular filter those edges are not
34:17 really that meaningful in order to
34:19 identify the object within the image so
34:21 the back propagation happens exactly the
34:24 way we have learned, and we adjust these
34:29 values in such a way that this time we
34:31 will try to capture another type of
34:33 edges, maybe edges that instead of being
34:37 horizontal or vertical are edges that
34:39 may look like this: curved edges
34:42 like this, primitive edges, okay, like this;
34:45 in that case, combining edges like this,
34:47 and obviously let's say if we are
34:50 talking about detecting four edges okay
34:53 first like this second this third this
34:56 and fourth being like this okay in that
34:58 case you will need to have four filters
35:01 so you will have four filters of size
35:04 two by two and you will apply them one by
35:06 one for the convolutional operation on
35:08 top of your image and by doing the back
35:10 propagation again and again the
35:13 algorithm finally succeeds to update
35:15 these values in such a way that they
35:18 will start capturing edges like this
35:20 okay this is the whole idea of doing the
35:24 back propagation in CNN finally you will
35:27 have these values let me make it a bit
35:29 cleaner this is looking very untidy so
35:32 in a CNN model when you are doing the
35:34 back propagation in order to reduce this
35:36 loss okay and we discussed all of this
35:39 how do we calculate this step then this
35:41 step then how do we back propagate on
35:43 top of flattening layer Max pooling
35:45 layer value layer and finally after
35:47 coming to this point the algorithm will
35:49 start updating these weights and biases
35:52 the way it was updating these weights
35:54 and biases okay finally in the end when
35:58 we have achieved the best
36:00 values for these weights and biases
36:03 which is nothing but the best values for
36:05 the filter that we are using okay if we
36:08 are able to do that in fact we will not
36:10 be doing that the algorithm will do that
36:12 for us but finally we reached to the
36:15 point where we get the best values for
36:17 the filter and then the filter will
36:19 start capturing the meaningful edges or
36:21 features from the image and then finally
36:24 if you pass an image of a Tweety Bird
36:27 then since these filters have the best
36:30 values in order to capture the edges or
36:33 features like the eyes of Tweety Bird or
36:36 its eyebrows or head or other features
36:38 it will pass the information towards the
36:41 end of the architecture so that it can
36:44 confidently say that it is an image of a
36:46 Tweety Bird, not of Goofy or Donald Duck, so
36:50 I hope this lecture was able to
36:52 provide you some understanding a very
36:55 decent understanding around back
36:57 propagation in CNN in case you are
36:59 learning this topic for the very first
37:02 time, if CNN is completely new to you, in
37:05 that case you may need to go through
37:08 this particular video maybe a couple
37:09 more times, but please do not hesitate
37:12 doing that because initially I also
37:14 struggled a lot to understand this
37:16 particular topic and congratulations if
37:18 you have already understood it try to
37:20 share it with your friends or peers who
37:22 may find it useful and thank you very
37:24 much for watching till the end we have
37:26 learned a lot within this deep learning
37:28 playlist although it is becoming very
37:29 difficult to make time from my full-time
37:31 job to create tutorials like this but I
37:34 will try my best to create more
37:36 tutorials on recurrent neural network in
37:40 order to continue this deep learning
37:42 playlist so please subscribe to the
37:43 channel if you're new here and hopefully
37:45 I will see you in the next lecture
37:52 thank you