Backpropagation In CNN Model | Deep Learning Playlist
Okay, so this lecture is going to be about backpropagation in a CNN architecture. Before we get started, there are two things I want to point out. First, since we are learning backpropagation in the context of CNNs, I am assuming you are already aware of backpropagation in ANNs and in machine learning in general: you have a basic idea of how parameter training happens in a machine learning algorithm or in an architecture like an artificial neural network, and you have a decent understanding of gradient descent as well. If you are not comfortable with these things, consider checking out the backpropagation lectures from the deep learning and machine learning playlists and then come back to this video so that it makes more sense to you. The second thing is that backpropagation in a CNN is not something you will have to handle yourself when you are working on a CNN project, or when you are using a convolutional neural network architecture; your code will handle everything here. But as a good data scientist you should have some clarity about what is actually happening behind the code execution, which is why, both for your general understanding and from an interview perspective, it is important to have some idea of backpropagation in a CNN architecture; it should not look like a black box to you. Okay, keeping these two things in mind, let's go ahead and proceed with the topic.
If you have been following along in sequence, then going by the previous CNN lectures you should already be comfortable with the convolutional layer and have clarity about what feature maps are. We also discussed pooling, the different types of pooling like max pooling, min pooling, average pooling, etc., and exactly how the feature map gets reduced in size after applying a pooling method. Then we apply another convolutional layer, and another pooling layer if required; you can add or remove convolution and pooling layers, and there is no concrete rule that you must have at least some number of convolutional or pooling layers, so it is completely your choice. We also understood the meaning of the flatten layer, where basically we take each row of pixels in the final feature map and stack them on top of each other to create the input column. That is what happens in the flatten layer, and then we pass this information to the fully connected layer, which is the ANN attached to the end of the CNN architecture, and at the end you get the classes with probability values in order to classify the object within the image. So within the last three to four lectures we have discussed all these topics, and we have also discussed the concepts of padding and strides. If your understanding of all these things is absolutely clear, then we can go ahead and discuss backpropagation now.
First of all, I will quickly erase all of this to make things clearer, and in order to understand backpropagation in a CNN I am going to create a simpler version of the CNN architecture. Let's say we have an image here of size 6×6, and then we have a filter, or kernel matrix, of size 3×3. We are going to use this filter to convolve over the image; this is something you should already be comfortable with from the previous lectures. After doing the convolution operation, the next thing you get is a feature map. Since the image has size 6×6 and the filter has size 3×3, then going by the formula we discussed earlier, (M − N + 1) × (M − N + 1), where M is the size of the M×M image and N is the size of the N×N kernel matrix, this feature map will have size 4×4. Since we are considering only one filter here, you can also call it 4×4×1. If, instead of only one filter, you had two more filters of the same size, you would get two more feature maps like this, and the size of the stacked feature maps would become 4×4×3. But for simplicity we are considering only one filter, so we are going to have only one feature map, and I am erasing all of this.
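To make that size formula concrete, here is a small NumPy sketch (my own illustration, not code from the video) that slides a 3×3 kernel over a 6×6 image with stride 1 and no padding, giving a (6 − 3 + 1) × (6 − 3 + 1) = 4×4 feature map:

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Valid cross-correlation: slide the kernel over the image, no padding, stride 1."""
    M, _ = image.shape
    N, _ = kernel.shape
    out = np.zeros((M - N + 1, M - N + 1))
    for i in range(M - N + 1):
        for j in range(M - N + 1):
            out[i, j] = np.sum(image[i:i + N, j:j + N] * kernel)
    return out

image = np.random.rand(6, 6)        # 6x6 input image
kernel = np.random.rand(3, 3)       # 3x3 filter
feature_map = convolve2d_valid(image, kernel)
print(feature_map.shape)            # (4, 4), matching (M - N + 1) x (M - N + 1)
```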
After obtaining this feature map, the next thing we do is apply ReLU, which is a simple operation that converts all the negative values into zero and nothing else. Obviously this makes no change to the size of the feature map, so the size remains 4×4. As you may already guess, the next step is to apply the pooling operation, and after applying pooling, as you know, the size of the feature map is reduced, so it shrinks to 2×2. So basically we take the image, apply the filter to do the convolution operation and obtain a feature map that holds the features from the input image, then we apply the ReLU operation on top of that feature map, and then we apply the pooling operation to get a feature map that is reduced in size. In this architecture we have only one convolutional layer and only one pooling layer. Since this final feature map is reduced to 2×2, it is basically an image of extremely low resolution with only 4 pixels; in practice you will obviously not work with images of this size, but try to understand that I am keeping this whole thing simple just to give you a better explanation, nothing else.
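As a rough sketch of these two steps (again just an illustration, not from the video): ReLU is an element-wise clamp at zero, and 2×2 max pooling keeps the largest value in each non-overlapping 2×2 window, which is what shrinks the 4×4 map down to 2×2.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: negative values become 0, positives are unchanged."""
    return np.maximum(x, 0)

def maxpool2x2(x):
    """Non-overlapping 2x2 max pooling; halves each spatial dimension."""
    h, w = x.shape
    out = np.zeros((h // 2, w // 2))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            out[i // 2, j // 2] = np.max(x[i:i + 2, j:j + 2])
    return out

feature_map = np.array([[ 1., -2.,  3., -4.],
                        [ 5.,  6., -7.,  8.],
                        [-1.,  2.,  3.,  4.],
                        [ 5., -6.,  7.,  8.]])
pooled = maxpool2x2(relu(feature_map))
print(pooled)          # [[6. 8.], [5. 8.]]
print(pooled.shape)    # (2, 2)
```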
The next step is to create the flatten layer. Basically, we take this 2×2 feature map, take the first row of pixels and place it like this, which gives two values; then we take the second row of pixels and place it below, which gives two more values. This whole thing becomes the input of the fully connected neural network at the end of the CNN architecture, so I will write it as ANN. Please do not treat this as a single neuron: the circle at the end represents an entire neural network. This ANN that I have drawn at the end represents a neural network with four neurons, or perceptrons, in the input layer, because we have four values in the input here; then you will have some number of hidden layers, which you can choose, and finally one output layer. That whole architecture is what this ANN symbol represents. I am explaining this so that you don't have the question in your mind that if we have four values in the input layer, how come there is only one perceptron here: it is not a perceptron, it represents an entire artificial neural network architecture. Now I am going to make the screen a bit cleaner by erasing the unnecessary stuff.
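The flattening itself is nothing more than a reshape; a tiny sketch, assuming the 2×2 pooled feature map holds the example values from above:

```python
import numpy as np

pooled = np.array([[6., 8.],
                   [5., 8.]])          # 2x2 feature map after pooling
flat_input = pooled.reshape(-1, 1)     # shape (4, 1): rows stacked into one input column
print(flat_input)
```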
So this is a very basic CNN architecture that we are considering in order to understand the backpropagation operation. Going by your previous understanding of backpropagation in an ANN, or in any other machine learning algorithm with trainable parameters, we already know that the basic idea of backpropagation is nothing but propagating backwards through the architecture in order to adjust the values of the trainable parameters. We can clearly see how the forward propagation happens here: we take the image and the filter (kernel matrix) to do the convolution operation, which gives us the feature map; then we apply the ReLU function to discard all the negative values and replace them with zeros; then we apply the pooling operation to reduce the size of the feature map; and then we apply flattening to get the values that serve as the input layer for the ANN architecture. Once this ANN gives you an output, let's call it ŷ, we obviously use a loss function to check how close we are to the actual target value. As a loss function you can go with binary cross-entropy if you have only two classes in your target column, say a classification between cat and dog, or you can use a softmax output with categorical cross-entropy if you have more than two classes. That is the entire idea of forward propagation, starting from the image and going to the end of the architecture.
As the next step, backpropagation will happen in order to adjust the values of all the trainable parameters. Before we get into that, let's first figure out how many trainable parameters we actually have within this particular architecture. The parameters whose values we need to adjust in order to reduce the loss are, first of all, all the values within this filter, so 3×3, which means nine values to adjust, plus the filter matrix will also have a bias, a scalar value, so that makes 9 + 1. Additionally, let's check where else we have trainable parameters in this architecture. After flattening this feature map we get our input values, four values in this example, which are passed to the fully connected neural network, and each value will have a dedicated weight assigned to it: W1, W2, W3 and W4. So we have four weights there, plus a bias term as well; let's call that one B2, and the bias of the filter B1. So we have five trainable parameters at the end as well, and 10 plus 5 means in total we have 15 trainable parameters whose values need to be adjusted in order to reduce the loss.
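Just to double-check that arithmetic with a quick sketch: one 3×3 filter plus its bias gives 10 parameters, and the tiny fully connected part with 4 weights and 1 bias gives 5 more, for 15 in total.

```python
filter_params = 3 * 3 + 1    # 9 kernel weights + 1 bias (B1) = 10
fc_params = 4 * 1 + 1        # 4 weights (one per flattened input) + 1 bias (B2) = 5
total = filter_params + fc_params
print(total)                 # 15 trainable parameters
```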
Now that we have understood how many trainable parameters we have, let's try to understand the backward propagation operation. Basically, I want to make this lecture in such a way that it is the only video you will ever need in order to understand backpropagation in CNNs, so it is definitely going to be a bit lengthy; please have some patience and stick to the end. Before we move ahead, I want to generalize this entire architecture, because as drawn it looks quite complex and it might be difficult to explain backpropagation on this particular figure. So let me place it at the top and let's continue from here.
Let's say you have an image X and you apply the convolution operation on top of it; let this be the symbol of the convolution operation, which happens with the help of a kernel matrix, or filter, that has some weights and a bias as well. After doing this convolution operation you get a feature map; let's call it X1. The next step is obviously to apply the ReLU operation on top of it, and let's say that after applying ReLU you get R1. Then the next step is to apply max pooling on top of that; you could use any type of pooling, but for the time being let's say we are using max pooling, and the resulting feature map is now called P1. The next step is obviously flattening, and that gives you the input layer; let's call it F. This F is passed on to the artificial neural network architecture, the fully connected neural network, and let's say that gives you a value s, which is then passed to an activation function like sigmoid that finally gives you the output ŷ. Let's do one thing first: let's shrink this simple architecture a bit so that it fits well on the screen.
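Putting this generalized notation together, the forward pass can be sketched roughly like this (an illustration only; it assumes the convolve2d_valid, relu and maxpool2x2 helpers from the earlier sketches are in scope, with W1/B1 the filter weights and bias and W2/B2 the fully connected weights and bias from the board):

```python
import numpy as np

# Assumes convolve2d_valid, relu and maxpool2x2 from the earlier sketches are in scope.
def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def forward(X, W1, B1, W2, B2):
    X1 = convolve2d_valid(X, W1) + B1   # convolution layer (B1 is a single scalar)
    R1 = relu(X1)                       # ReLU: zero out negatives
    P1 = maxpool2x2(R1)                 # 2x2 max pooling
    F  = P1.reshape(-1, 1)              # flatten to a (4, 1) input column
    s  = W2 @ F + B2                    # fully connected layer, W2 has shape (1, 4)
    y_hat = sigmoid(s)                  # output probability
    return y_hat
```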
Now this output value ŷ can be used to calculate the loss, to check how well we are predicting. Just as we have trainable parameters in the convolutional part, we also have trainable parameters here at the end: if you remember what we discussed above, when we pass the input values to the fully connected neural network we have four weights and one bias value, so let's call them W2 and B2. And remember that W1 has 9 values, because it is a 3×3 filter, plus one bias value, so in total there are 10 trainable parameters there, and here we have four weights plus one bias. These are the parameters whose values we need to adjust during backpropagation in order to reduce the loss value, or in other words to make more accurate predictions, since the better you predict, the smaller the loss will be. So basically we want to update the values of these parameters.
If you remember from the previous deep learning and machine learning lectures, we use the gradient descent algorithm to update these values. What does the formula say? Let's say we want to update the value of the parameter W1: the new value of this weight will be equal to the old value minus a small learning rate multiplied by dL/dW_old, i.e. W1_new = W1_old − η × (∂L/∂W1_old). What does this mean? It simply means: how does the value of the loss change when we bring a very small change to the value of W_old? At this point I am assuming you are comfortable with the very basics of calculus, the idea of derivatives, so I am not going to focus too much on that. In exactly the same way we also update the value of the bias.
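In code form that update rule is a single line per parameter; a sketch with placeholder gradient values, since we have not computed the real gradient yet:

```python
import numpy as np

lr = 0.01                          # learning rate, chosen by us
W1_old = np.random.rand(3, 3)      # current filter weights
dL_dW1 = np.random.rand(3, 3)      # gradient of the loss w.r.t. W1 (placeholder values here)
W1_new = W1_old - lr * dL_dW1      # gradient descent update
# The bias follows exactly the same rule: B1_new = B1_old - lr * dL_dB1
```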
So basically we are looking for these new values. Let's see what we already have: we have the old values, which we initialized randomly and actually used to do the forward propagation, to produce this output and calculate the loss, and we know the value of the learning rate, because that is something we decided or chose for the algorithm, so it is within our control. All we need to find is the derivative of the loss with respect to the previous, old value of the weights. The idea of backpropagation is nothing but calculating this particular quantity. So I will erase these things first and move this entire portion to the side over here.
The next thing we are going to do is think of this entire CNN architecture in two parts: the first is the convolutional part, and the second is the part with the fully connected neural network. Within the previous lectures of this deep learning playlist we have discussed artificial neural networks a lot, including how the values of the weights are adjusted in an artificial neural network during backpropagation, so you will have decent clarity on that if you are following along with the playlist. The challenging part to understand here is how backpropagation happens in the flatten layer, in the pooling layer, for the ReLU operation, and most importantly within the convolutional layer. That part is going to be the crux of this lecture, but still, I will quickly cover how the weight adjustment happens here in the fully connected part; I am talking about these weights.
Basically, what we want to understand is how the loss changes with respect to a change in W2 (or B2, but let's consider W2 for the time being). Try to understand that W2 is not directly involved in calculating the loss: using the values of W2 and B2, these five weights, we first calculate the quantity s, which is then passed to the sigmoid function to calculate the output ŷ, and using the value of ŷ we then calculate the loss. So we have to propagate backwards in order to work out how to calculate dL/dW2. Let's try to understand that.
This is the point where we talk about the chain rule. What does the chain rule say here? First we check how the value of the loss changes when we bring a small change to ŷ, which is dL/dŷ. The next step is how ŷ changes when we bring a change in s, which is dŷ/ds. And finally we calculate how s changes when we change W2 (or B2; both are calculated the same way, but for the explanation I am considering W2), which is ds/dW2. Applying the chain rule, the dŷ terms cancel and the ds terms cancel, and this is how we calculate dL/dW2 = (dL/dŷ) · (dŷ/ds) · (ds/dW2). By finding this derivative, what we are basically checking is what happens to the loss when we move the value of W2 in a certain direction. Say increasing the value of W2 leads to an increasing loss as well; then the gradient descent formula will not update W2 by increasing it. Instead it will go the other way, decreasing the value of W2, and check whether the loss decreases that way.
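As a concrete sketch of that chain (my own illustration, assuming the sigmoid output and binary cross-entropy loss mentioned earlier): for that particular combination the first two factors, dL/dŷ · dŷ/ds, collapse to ŷ − y, so dL/dW2 is just (ŷ − y) times the flattened input F.

```python
import numpy as np

# Toy values: F is the flattened (4, 1) input, W2 the (1, 4) weights, B2 the bias.
F  = np.array([[6.], [8.], [5.], [8.]])
W2 = np.random.rand(1, 4)
B2 = 0.1
y  = 1.0                                 # true label

s     = W2 @ F + B2                      # pre-activation of the output neuron
y_hat = 1.0 / (1.0 + np.exp(-s))         # sigmoid output

# Chain rule: dL/dW2 = dL/dy_hat * dy_hat/ds * ds/dW2.
# For sigmoid + binary cross-entropy the first two factors simplify to (y_hat - y).
dL_ds  = y_hat - y                       # shape (1, 1)
dL_dW2 = dL_ds @ F.T                     # shape (1, 4), one gradient per weight
dL_dB2 = dL_ds                           # gradient for the bias B2
```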
This is how the weight update works. Anyway, this is something we have already learned in a lot of detail when we discussed backpropagation in ANNs, but I thought of giving this much explanation as a quick revision so that you feel comfortable within this lecture. So hopefully we are clear about the weight update for the ANN part of the CNN architecture, and we understand how these weights are updated or adjusted during backpropagation by calculating the different derivatives and applying the chain rule in order to reach the optimum values for these parameters. Now we are going to understand, step by step, the idea of backpropagation within the convolutional part of this CNN architecture. Since we are propagating backwards from the end, and we have understood how these values at the end are calculated and updated, which means we have already understood how to calculate this particular term, let's now talk about updating these weights in the convolutional part. Again, for the explanation I am considering only W1 to show you how the weight update happens, and the same method, the same trick, applies to B1 as well.
So basically we want to understand how the loss changes with respect to W1. Coming back from the point of the loss, we are already at the point where we have understood how dL/dW2 is calculated; now let's keep flowing backwards from there. We apply the chain rule further: from s we need the derivative of s with respect to the flattened input F, then the derivative of F with respect to P1, then the derivative of P1 with respect to R1, then the derivative of R1 with respect to X1, and finally the derivative of X1 with respect to W1. So dL/dW1 = (dL/dŷ) · (dŷ/ds) · (ds/dF) · (dF/dP1) · (dP1/dR1) · (dR1/dX1) · (dX1/dW1). Again, going by the chain rule, the intermediate terms cancel out one after another, and we finally end up with how the loss changes when we bring a small change to W1, which is these filter weights W1 (or B1, because, as you can see, the way we update W1 using this chain rule, the value of B1 will be updated in a similar way).
Let me make the board cleaner again. Since this part is all clear for us, and we understand very well how the weight update happens for the ANN part, let's talk about this part: how do we handle the derivative for the flattening layer? So let's understand backpropagation in the flatten layer. Previously, if you check what exactly we were doing in the flatten layer: after applying the pooling operation the size of the feature map is reduced from 4×4 to 2×2, and we take this 2×2 matrix and flatten it to get the input column, the input values. During backpropagation we need to understand how to go back from that step to the previous one. So after applying the pooling operation we had this 2×2 matrix that we were calling P (let me erase it from here and write it here as P), and what we were doing was applying the flattening method to create the array with four values, to pass it as the input to the fully connected neural network. When doing backpropagation we do the exact opposite of this: we take this input column, these four values which were in the shape 4×1 (four rows and one column), and we restore it back to the previous shape of 2×2. That's it; that's all that happens when we do backpropagation through the flatten layer. Going back to the earlier notation: previously we knew how we compute P1 by applying max pooling and then flatten it to get the input values, and during backpropagation we simply restore the values back to the previous 2×2 shape (or whatever the shape was before flattening). This is what happens in the dF/dP1 step of the chain.
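So in code, the "backward pass" of the flatten layer is nothing more than reshaping the incoming gradient back to the shape of the pooled feature map; a tiny sketch:

```python
import numpy as np

dL_dF = np.array([[0.2], [-0.1], [0.4], [0.05]])   # gradient arriving at the flattened input, shape (4, 1)
dL_dP1 = dL_dF.reshape(2, 2)                       # restored to the 2x2 shape of the pooled feature map
print(dL_dP1)
```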
Now let's move further in the discussion of backpropagation. Starting from the loss, we came back through the fully connected part, we updated those weights, and we also understood how backpropagation happens in the flattening layer; now it is time to understand how backpropagation happens in the pooling layer, so let's discuss that. Just like the flatten layer, we again do not have any trainable parameters here, so we are not going to use any conventional backpropagation; instead we will again do exactly the opposite of what we did during the pooling, or max pooling, operation. If you remember, the max pooling operation works like this: let's say you have an image like this, of size 4×4, so in total it has 16 pixels. When you apply max pooling, you take a window of, say, 2×2 (it could be any size, but let's say 2×2) and start moving this window over the image, and at each position, specifically for max pooling, you take the maximum value out. So from the first position you take out 4, then you move the pooling window to the right and take out 8, then you place it over here and take out 12, and finally in the last position you take out the highest value, which is 16. This is more or less the entire idea of max pooling.
But when you are doing backpropagation through the pooling layer, you do exactly the opposite of this operation. Let me tell you what exactly happens: you take this small matrix and move it back to the previous shape, and it is really important that while doing this we keep a note of which particular index the maximum value was taken from in each window, because when we restore the information back to the previous shape, only the position from which the maximum element was taken will be non-zero, and the rest of the positions we keep as 0. And let me tell you once again that none of this will be done by you manually; we already discussed this at the beginning of the lecture, that the entire backpropagation operation is handled by your code. It is just that you should have some understanding of what is actually happening behind the code execution, so that a typical CNN architecture does not look like a black box to you; it is really important from an interview perspective. So this is how backpropagation happens in the pooling layer: obviously there are no trainable parameters, and all we are doing is the reverse of the forward operation. Initially it was a 4×4 matrix, and by applying the max pooling operation it got converted to a 2×2 matrix; during backpropagation we take this 2×2 matrix and restore it to the previous shape, keeping the positions from which the highest elements were taken as non-zero and everything else as zero. That's it.
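A minimal sketch of that routing step (an illustration, not code from the video): during the forward pass we note which position held the maximum in each 2×2 window, and during the backward pass the incoming gradient is written to exactly those positions, with zeros everywhere else.

```python
import numpy as np

def maxpool2x2_backward(x, dL_dout):
    """Route each incoming gradient value to the position that held the max in its 2x2 window."""
    dL_dx = np.zeros_like(x)
    h, w = x.shape
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            window = x[i:i + 2, j:j + 2]
            r, c = np.unravel_index(np.argmax(window), window.shape)   # index of the max
            dL_dx[i + r, j + c] = dL_dout[i // 2, j // 2]
    return dL_dx

x = np.array([[ 1.,  4.,  2.,  8.],
              [ 3.,  2.,  6.,  1.],
              [ 9.,  5.,  3.,  7.],
              [12.,  2., 16.,  4.]])        # input to the pooling layer
dL_dout = np.array([[0.5, -0.2],
                    [0.1,  0.3]])           # gradient arriving at the 2x2 pooled output
print(maxpool2x2_backward(x, dL_dout))
```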
With that said, we have now also understood how this particular part is calculated, so the next step is to understand how the ReLU operation is handled during backpropagation. This one is going to be the simplest thing to understand. See what is happening within the ReLU operation: let's assume we have a 2×2 matrix, something that looks like this, say 2, −3, −1 and 6. Then all that happens after applying ReLU to this matrix is that there is no change in its shape; it is just that all the positive values remain the same and the negative values get converted to zero. That is the basic thing the ReLU operation does for you; that is the first point.
Secondly, in order to calculate dL/dW1, which means in order to understand how the loss changes with respect to a small change in W1 (or B1, for that matter), we are applying the chain rule we wrote above, and we have understood how the calculation happens up to this point so far. When you propagate back from the point of the loss to the point of the ReLU, that is, from this particular point to this particular point, the gradient arriving here is also going to be a 2×2 matrix; let's say it looks like this: 1, 3, −6 and 2. What you are trying to compute here is dR1/dX1, which means the derivative of the ReLU output with respect to each value in the matrix X1 (remember, we are ultimately chasing W1), and similarly we compute the derivative for each of the other elements as well. The idea is very simple: wherever the input value was positive, the derivative of ReLU is 1 (you know that from the basics of calculus); otherwise it is 0. That is the whole idea behind it. So if anyone asks you in an interview how backpropagation happens through the ReLU layer in a CNN architecture, you simply say that in the step where we calculate dR1/dX1 we are differentiating the ReLU output R1, which is nothing but a 2×2 matrix with four values; for those four values, wherever the input was positive the derivative is 1, otherwise the derivative is 0. That's it.
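In code, the ReLU backward step is therefore just an element-wise mask; a small sketch using the same example values: pass the incoming gradient through where the pre-activation X1 was positive, and zero it out elsewhere.

```python
import numpy as np

X1 = np.array([[ 2., -3.],
               [-1.,  6.]])          # pre-activation values before ReLU
dL_dR1 = np.array([[ 1.,  3.],
                   [-6.,  2.]])      # gradient arriving from the layers above

dL_dX1 = dL_dR1 * (X1 > 0)           # derivative of ReLU is 1 for positives, 0 otherwise
print(dL_dX1)                        # [[1. 0.], [0. 2.]]
```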
Finally, since we are done up to this point in the backpropagation journey, the last step is to calculate dX1/dW1, that is, the part for the filter itself. We were using a 3×3 filter here, but for simplicity let's say we are using a 2×2 filter that looks something like 0, 0, 1 and 1. During backpropagation we update the values of this filter; the new values are not random, they are updated in such a way that the loss becomes smaller, and the idea is that these values within the filter, when applied to the image during the convolution operation, should derive some meaningful features. That's it.
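For the dX1/dW1 term itself, here is a minimal sketch of one standard way to compute the filter gradient (my own illustration, assuming the valid, stride-1 cross-correlation convention from the earlier sketches): each filter weight sees a shifted patch of the input, so dL/dW1 works out to the valid cross-correlation of the input image with the gradient map dL/dX1, and dL/dB1 is just the sum of that gradient map.

```python
import numpy as np

def conv_filter_gradient(X, dL_dX1, kernel_size):
    """dL/dW1 for a valid, stride-1 convolution: correlate the input with the upstream gradient."""
    N = kernel_size
    dL_dW1 = np.zeros((N, N))
    out_h, out_w = dL_dX1.shape
    for p in range(N):
        for q in range(N):
            dL_dW1[p, q] = np.sum(X[p:p + out_h, q:q + out_w] * dL_dX1)
    return dL_dW1

X = np.random.rand(6, 6)           # input image
dL_dX1 = np.random.rand(4, 4)      # gradient of the loss w.r.t. the 4x4 feature map
dL_dW1 = conv_filter_gradient(X, dL_dX1, kernel_size=3)
dL_dB1 = np.sum(dL_dX1)            # bias gradient: the bias is added to every output position
print(dL_dW1.shape)                # (3, 3), same shape as the filter
```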
Let me try to explain what I mean here with a very simple use case. Let's say you have an image, and in the image there is a round object like this, and you have a 2×2 filter (size 2×2, let me write it outside) with random values, say X1, X2, X3 and X4. What is the idea of using this particular filter? We convolve the filter over the image: first at this position, then here, stride by stride, then here, then here. After convolving the filter all over the image we should be able to grab some meaningful features, or edges, some primitive features like this, like this and this, and then towards the end of the convolutional layer we are finally able to identify the object. But suppose the values of your filter happen to be such that it detects horizontal edges like this, or vertical edges like this, or slanted edges like this; you can imagine that by combining those kinds of edges you will never be able to form the circle that is in the image. So when you finally flow towards the end of the CNN architecture, the loss will be very high when you calculate it using binary cross-entropy (or softmax with categorical cross-entropy), because the edges you are capturing with this particular filter are not really meaningful for identifying the object within the image.
So backpropagation happens exactly the way we have learned, and it adjusts these values in such a way that next time we try to capture another type of edge, maybe edges that, instead of being horizontal or vertical, look curved, primitive curved edges like this; combining edges like that can actually form the shape. And if we are talking about detecting four kinds of curves, say the first like this, the second like this, the third like this and the fourth like this, then you will need four filters of size 2×2, and you will apply them one by one in the convolution operation on top of your image. By doing backpropagation again and again, the algorithm finally succeeds in updating these values in such a way that they start capturing edges like these. That is the whole idea of doing backpropagation in a CNN: finally you end up with these values. Let me make it a bit cleaner, this is looking very untidy.
So in a CNN model, when you are doing backpropagation in order to reduce this loss, we discussed all of this: how we calculate this step, then this step, how we backpropagate through the flattening layer, the max pooling layer and the ReLU layer, and finally, after coming to this point, the algorithm starts updating these weights and biases the same way it was updating the weights and biases at the end. Finally, when we have achieved the best values for these weights and biases, which is nothing but the best values for the filter we are using (and in fact we will not be doing that ourselves, the algorithm will do it for us), we reach the point where we get the best values for the filter, and the filter starts capturing the meaningful edges or features from the image. Then, if you pass in an image of Tweety Bird, since these filters have the best values for capturing edges and features like Tweety's eyes, eyebrows, head and other features, they pass that information towards the end of the architecture, so that it can confidently say it is an image of Tweety Bird and not of Goofy or Donald Duck.
I hope this lecture was able to give you some understanding, a very decent understanding, of backpropagation in CNNs. If you are learning this topic for the very first time, if CNNs are completely new to you, you may need to go through this particular video a couple more times, but please do not hesitate to do that, because initially I also struggled a lot to understand this particular topic. Congratulations if you have already understood it; try to share it with friends or peers who may find it useful, and thank you very much for watching till the end. We have learned a lot within this deep learning playlist. Although it is becoming very difficult to make time from my full-time job to create tutorials like this, I will try my best to create more tutorials on recurrent neural networks in order to continue this deep learning playlist. So please subscribe to the channel if you're new here, and hopefully I will see you in the next lecture. Thank you.