YouTube Transcript:
Convolutional Neural Networks (CNNs) explained
[Music]
In this video, we'll be discussing convolutional neural networks. A convolutional neural network, also known as a CNN or ConvNet, is an artificial neural network that has so far been most popularly used for analyzing images. Although image analysis has been the most widespread use of CNNs, they can also be used for other data analysis or classification problems as well. Most generally, we can think of a CNN as an artificial neural network that has some type of specialization for being able to pick out or detect patterns and make sense of them.
This pattern detection is what makes CNNs so useful for image analysis. So if a CNN is just some form of an artificial neural network, what differentiates it from just a standard multi-layer perceptron, or MLP? Well, a CNN has hidden layers called convolutional layers, and these layers are precisely what makes a CNN, well, a CNN. Now, CNNs can and usually do have other, non-convolutional layers as well, but the basis of a CNN is the convolutional layers.

All right, so what do these convolutional layers do? Just like any other layer, a convolutional layer receives input, then transforms the input in some way, and then outputs the transformed input to the next layer. With a convolutional layer, this transformation is a convolution operation. We'll come back to this operation in a bit. For now, let's look at a high-level idea of what convolutional layers are doing.

As mentioned earlier, convolutional neural networks are able to detect patterns in images. More precisely, the convolutional layers are able to detect patterns. Well, actually, let's be a little more precise than that. With each convolutional layer, we need to specify the number of filters the layer should have. We'll speak technically about what a filter is in just a few moments, but for now, understand that these filters are actually what detect the patterns.

Now, when I say that the filters are able to detect patterns, what precisely do I mean by patterns? Well, think about how much may be going on in any single image: multiple edges, shapes, textures, objects, etc.
So one type of, quote, pattern that a filter could detect could be edges in images, so this filter would be called an edge detector, for example. Some filters may detect corners, some may detect circles, others squares. Now, these simple and kind of geometric filters are what we'd see at the start of our network.
The deeper our network goes, the more sophisticated these filters become. So in later layers, rather than edges and simple shapes, our filters may be able to detect specific objects like eyes, ears, hair or fur, feathers, scales, and beaks even. And in even deeper layers, the filters are able to detect even more sophisticated objects like full dogs, cats, lizards, and birds.

To understand what's actually happening here with these convolutional layers and their respective filters, let's look at an example. So say we have a convolutional neural network that's accepting images of handwritten digits, like from the MNIST data set, and our network is classifying them into their respective categories of whether the image is of a 1, 2, 3, etc. Let's now assume that the first hidden layer in our model is a convolutional layer. As mentioned earlier, when adding a convolutional layer to a model, we also have to specify how many filters we want the layer to have.

A filter can technically just be thought of as a relatively small matrix for which we decide the number of rows and number of columns that this matrix has, and the values within the matrix are initialized with random numbers. So for this first convolutional layer in this example of ours, we're going to specify that we want the layer to contain one filter of size 3x3.

Now, when this convolutional layer receives input, the filter will slide over each 3x3 set of pixels from the input itself until it's slid over every 3x3 block of pixels from the entire image. This sliding is actually referred to as convolving, so really we should say that the filter is going to convolve across each 3x3 block of pixels from the input.

To actually illustrate this, I'm going to use an example that Jeremy Howard used in one of his lectures for fast.ai. His example really gave me a lot of insight behind what was going on within a convolutional layer, so I'd like to share that with you all too. I've also linked to his lecture in the description of this video.
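The sliding described above can be sketched in a few lines of plain Python. This is only an illustration, not code from the video: the helper names `make_random_filter` and `sliding_blocks` are hypothetical, the filter is assumed to move one pixel at a time with no padding, and the 28x28 input is the standard MNIST image size, which the video does not state explicitly.

```python
import random


def make_random_filter(rows, cols, seed=None):
    """Build a rows x cols filter whose values start out as random numbers."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def sliding_blocks(image, size=3):
    """Yield (row, col, block) for every size x size block of the image.

    Assumption: the filter moves one pixel at a time and never hangs
    over the edge of the image (no padding).
    """
    height, width = len(image), len(image[0])
    for r in range(height - size + 1):
        for c in range(width - size + 1):
            block = [line[c:c + size] for line in image[r:r + size]]
            yield r, c, block


# A blank 28 x 28 image standing in for an MNIST digit.
image = [[0] * 28 for _ in range(28)]
positions = list(sliding_blocks(image, size=3))

print(len(positions))     # 676, i.e. 26 * 26 positions for the filter
print(positions[0][:2])   # (0, 0): the first block sits in the top-left corner
print(positions[-1][:2])  # (25, 25): the last block ends at the bottom-right corner
```

Under these assumptions, a 3x3 filter visits (28 - 3 + 1) x (28 - 3 + 1) = 676 blocks, which is why the output of the layer comes out slightly smaller than the input.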
So here we have our matrix representation of an image of a 7 from the MNIST data set. The values in this matrix are the individual pixels from the image. All right, so this is our input, and this input will be passed to a convolutional layer. As just discussed, we've specified this layer to only have one filter, and this filter is going to convolve across each 3x3 block of pixels from the input.

So here's our 3x3 filter of random numbers here. When the filter first lands on the first 3x3 block of pixels, the dot product of the filter itself with the 3x3 block of pixels from the input will be computed and stored. This will occur for each 3x3 set of pixels that the filter convolves. So look, we would just take the dot product of the filter here with this first 3x3 block, and then we'd store it over here. Now we slide to the next 3x3 block, take the dot product, and then store the value here.

If we look at the formula for each of these cells, we can see that it is indeed just the dot product of the filter with each 3x3 section of pixels from the input. So here we have this first value is the dot product of this input with this filter, and then if I click on another random value over here, we can see that this value is the dot product of the filter with this input.

So after this filter has convolved the entire input, we'll be left with a new representation of our input, which is going to be made up of the entire matrix of those stored dot products we got from the filter. This matrix of dot products is going to be the output of this layer, and is represented here. This is what will then be passed to the next layer as input, and this same process that we just went through with the filter will happen to this new output with the next layer's filters.

Now, this was just a very simple illustration, but as mentioned, we can think of these filters as pattern detectors. So we can't really observe any specific pattern that was picked out by our filter in the example we just looked at in Excel, but let's show our original image of the 7 here, and now let's say we have four 3x3 filters for our convolutional layer, and these filters are filled with the values you see here. These values can be represented visually as these filters, where the minus ones correspond to black, ones correspond to white, and zeros correspond to gray.
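The dot-product-and-store procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the spreadsheet from the video: "dot product" is taken to mean multiplying matching entries of the filter and the block and summing the results, the filter moves one pixel at a time with no padding, and the tiny 5x5 image and the two filters built from -1, 0, and 1 are made-up stand-ins, since the transcript does not give the actual values shown on screen.

```python
def dot_product(block, kernel):
    """Multiply matching entries of two equally sized matrices and sum them."""
    return sum(
        b * k
        for block_row, kernel_row in zip(block, kernel)
        for b, k in zip(block_row, kernel_row)
    )


def convolve(image, kernel):
    """Slide a square kernel over the image and store one dot product per block."""
    size = len(kernel)
    out_height = len(image) - size + 1
    out_width = len(image[0]) - size + 1
    output = []
    for r in range(out_height):
        out_row = []
        for c in range(out_width):
            block = [line[c:c + size] for line in image[r:r + size]]
            out_row.append(dot_product(block, kernel))
        output.append(out_row)
    return output


# Hypothetical 5 x 5 image: a bright horizontal bar (1s) on a dark background (0s).
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]

# Hypothetical filters: -1 would be drawn as black, 1 as white, 0 as gray.
top_edge = [
    [-1, -1, -1],
    [1, 1, 1],
    [0, 0, 0],
]
bottom_edge = [
    [0, 0, 0],
    [1, 1, 1],
    [-1, -1, -1],
]

print(convolve(image, top_edge))     # [[0, 0, 0], [3, 3, 3], [0, 0, 0]]
print(convolve(image, bottom_edge))  # [[-3, -3, -3], [0, 0, 0], [3, 3, 3]]
```

In this sketch, the largest values in each output line up with the edge that the filter responds to: the top of the bar for `top_edge` and the bottom of the bar for `bottom_edge`. The returned matrix is what would be handed to the next layer as its input.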
So if we convolve our original image of a 7 with each of these four filters individually, this is what the output would look like for each filter. We can see that all four of these filters are detecting edges, and in the output, the brightest pixels can be interpreted as what the filter has detected. So this first one, we can see, detects top horizontal edges of the 7, and that's indicated by the brightest pixels here. The second detects the left vertical edges, again being displayed with the brightest pixels. The third detects bottom horizontal edges, and the fourth detects right vertical edges.

Now, these filters are really basic and just detect edges. These are filters we may see towards the start of our network. More complex filters would be located deeper in the network and would gradually be able to detect more sophisticated patterns like the ones shown here. We can see the shapes that the filters on the left detected from the images on the right. This one here detects circles, and this one at the bottom is detecting corners. And as we go even further into our layers, the filters are able to detect much more complex patterns, like these dog faces being interpreted in this filter, or even the bird legs detected in this one.

All right, so now if you're interested in seeing how to work with CNNs in code, then check out the CNN and fine-tuning videos in my Keras deep learning playlist. So I hope that you now have a basic understanding of convolutional neural networks and how these networks are made up of convolutional layers, which themselves are made up of filters. And I hope you found this video helpful. If you did, please like the video, subscribe, suggest, and comment, and thanks for watching.
[Music]