This content provides a comprehensive overview of machine learning, its relationship to artificial intelligence and deep learning, its types (supervised, unsupervised, reinforcement), algorithms, and practical applications, along with an introduction to essential tools like Jupyter Notebook and statistical concepts. It aims to equip learners with the foundational knowledge and skills required for a career in machine learning.
I'm sure you all agree that machine learning is one
of the hottest trends in today's market. Gartner predicts
that by 2022 at least 40%
of new application development projects in the market
will require machine learning co-developers
on their team.
It's expected that these projects will generate revenue
of around 3.9 trillion dollars.
Isn't that huge? So, looking at the huge
upcoming demand for machine learning around the world,
we at Edureka have come up with
and designed a well-structured machine learning full course
for you guys.
But before we actually drill down over there,
let me just introduce myself.
Hello all I am Atul from Edureka.
And today I'll be guiding you
through this entire machine learning course.
Well, this course has been designed in a way
that you get the most out of it.
So we'll slowly and gradually start
with the beginner level and then move towards the advanced topics.
So without delaying any further,
let's start with the agenda of today's session.
The machine learning course has been segregated
into six different modules. We'll start our first module
with an introduction to machine learning. Here
we'll discuss things
like what exactly machine learning is,
how it differs from artificial intelligence and deep learning,
what its various types and applications are,
and finally we'll end the first module
with a basic demo in Python.
Okay, our second module focuses on statistics
and probability. Here we'll cover things
like descriptive statistics, inferential statistics, probability
theory and so on.
Our third module is supervised learning.
Well, supervised learning is one type of machine learning
which focuses mainly
on regression and classification types of problems.
It deals with labeled data sets, and the algorithms
which are a part of it are linear regression,
logistic regression, Naive Bayes, random forest, decision tree
and so on.
Our fourth module is on unsupervised learning.
Well, this module focuses mainly on dealing
with unlabeled data sets,
and the algorithms which are a part
of it are the k-means algorithm
and the Apriori algorithm. As a part of the fifth module
we have reinforcement learning. Here
we are going to discuss reinforcement learning
in depth and also
the Q-learning algorithm. Finally, in the end,
it's all about making you industry ready.
Okay.
So here we are going to discuss three different projects
which are based on supervised learning,
unsupervised learning
and reinforcement learning. Finally, in the end,
I'll tell you about some of the skills
that you need to become a machine learning engineer,
okay, and I'll also discuss some
of the important questions
that are asked in a machine learning interview. Fine,
with this we come to the end of this agenda.
Before you move ahead,
don't forget to subscribe to Edureka and press
the bell icon to never miss any update from us.
Hello everyone.
This is Atul from Edureka,
and welcome to today's session on what is machine learning.
As you know,
we are living in a world of humans
and machines. Humans have been evolving
and learning from past experience for millions
of years. On the other hand, the era of machines
and robots has just begun. In today's world,
these machines, or robots,
need to be programmed
before they actually follow your instructions.
But what if machines started to learn
on their own? This is
where machine learning comes
into the picture. Machine learning is at the core
of many futuristic technological advancements in our world.
And today you can see various examples
or implementations of machine learning around us,
such as Tesla's self-driving car, Apple's Siri, the Sophia
AI robot and many more.
So what exactly is machine learning?
Well, machine learning is a subfield
of artificial intelligence
that focuses on the design of systems
that can learn from, and make decisions
and predictions based on, experience,
which is data in the case of machines. Machine learning
enables computers to act
and make data-driven decisions rather than
being explicitly programmed
to carry out a certain task. These programs
are designed to learn
and improve over time
when exposed to new data.
Let's move on and discuss one
of the biggest confusions of people in the world.
They think that all three of them,
AI, machine learning and deep learning, are the same.
Well, they are wrong.
Let me clarify things
for you. Artificial intelligence is a broader concept
of machines being able to carry out tasks in a smarter way.
It covers anything which enables a computer to behave
like a human. Think of the famous Turing test to determine
whether a computer is capable of thinking
like a human being or not.
If you are talking to Siri on your phone
and you get an answer, you're already very close to it.
So this was about artificial intelligence. Now coming
to the machine learning part:
as I already said, machine learning is a subset
or a current application of AI. It is based on the idea
that we should be able to give machines access
to data and let them learn for themselves.
It's a subset of artificial intelligence
that deals
with the extraction of patterns from data sets.
This means that the machine can not only find the rules
for optimal behavior,
but can also adapt to changes in the world. Many
of the algorithms involved have been known
for decades, centuries even; thanks to advances
in computer science and parallel computing,
they can now scale up to massive data volumes.
So this was about the machine learning part. Now coming over
to deep learning: deep learning is a subset of machine learning
where similar machine learning
algorithms are used to train deep neural networks,
so as to achieve better accuracy in those cases
where the former was not performing up to the mark, right?
I hope now you understand that machine learning, AI
and deep learning are all three different.
Okay, moving on ahead.
Let's see in general how machine learning works.
One of the approaches is
where the machine learning algorithm is trained
using a labeled or unlabeled training data
set to produce a model.
New input data is introduced to the machine learning algorithm,
and it makes predictions based on the model.
The prediction is evaluated for accuracy,
and if the accuracy is acceptable, the machine
learning algorithm is deployed.
If the accuracy is not acceptable,
the machine learning algorithm is trained again
and again with an augmented training data set.
This was just a high-level example,
as there are many more factors and other steps involved in it.
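Just to make that loop concrete, here is a minimal sketch of it in Python. The data set and the model chosen here are only illustrative, and scikit-learn is assumed to be installed; it is not the exact code used later in the course.

```python
# Minimal sketch of the train -> predict -> evaluate -> (re)train loop described above.
# scikit-learn is assumed to be installed; the data set and model are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                                  # labeled training data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=200).fit(X_train, y_train)     # training produces a model
predictions = model.predict(X_test)                                # new input data -> predictions
print("accuracy:", accuracy_score(y_test, predictions))            # evaluate; retrain if too low
```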
Now, let's move on and subcategorize machine
learning into three different types: supervised learning,
unsupervised learning and reinforcement
learning. Let's see what each of them is, how they work,
and how each
of them is used in the fields of banking, healthcare, retail
and other domains.
Don't worry.
I'll make sure
that I use enough examples and implementations of all three
of them to give you a proper understanding.
So starting with supervised learning.
What is it?
Let's see a mathematical definition
of supervised learning. Supervised learning is
where you have input variables X and an output variable Y,
and you use an algorithm to learn the mapping function
from the input to the output,
that is, Y = f(X).
The goal is to approximate the mapping function
so well that whenever you have new input data
X, you can predict the output variable,
that is, Y, for that data.
In case this was confusing for you,
let me simplify the definition of supervised learning.
We can rephrase the understanding
of the mathematical definition as a machine learning method
where each instance of a training data set is composed
of different input attributes
and an expected output. The input attributes
of a training data set can be any kind of data: they can be
the pixels of an image,
the value of a database row,
or even an audio frequency histogram. For each input instance,
an expected output value is
associated; that value can be discrete, representing a category,
or a real, continuous value. In either case,
the algorithm learns the input pattern
that generates the expected output.
Once the algorithm is trained,
it can be used to predict the correct output
of a never-seen input.
You can see an image on your screen.
In this image you can
see that we are feeding raw input, an image of an apple,
to the algorithm. As a part of the algorithm,
we have a supervisor who keeps on correcting
the machine, or who keeps on training the machine.
It keeps on telling it that yes, it is an apple,
or no, it is not an apple, and things like that.
So this process keeps
on repeating until we get a final trained model.
Once the model is ready,
it can easily predict the correct output
of a never-seen input. In this slide
you can see
that we are giving an image of a green apple to the machine,
and the machine can easily identify it, saying yes,
it is an apple, and giving the correct result.
Let me make things clearer to you.
Let's discuss another example.
In this slide,
the image shows an example
of a supervised learning process used to produce a model
which is capable of recognizing the ducks in an image.
The training data set is composed of labeled pictures
of ducks and non-ducks.
The result of the supervised learning process is
a predictive model
which is capable of associating a label, duck
or not duck, to a new image presented to the model.
Now once trained,
the resulting predictive model can be deployed
to a production environment,
a mobile app, for example. Once deployed, it is ready to recognize
new pictures.
Now you might be wondering why this category
of machine learning is named supervised learning.
Well, it is called supervised learning
because the process of an algorithm learning
from the training data set can be thought
of as a teacher supervising the learning process.
We know the correct answers;
the algorithm iteratively makes predictions
on the training data
and is corrected by the teacher. The learning stops
when the algorithm achieves an acceptable level of performance.
Now, let's move on and see some
of the popular supervised learning algorithms.
We have linear regression, random forest
and support vector machines.
These are just for your information;
we will discuss these algorithms
in our next video.
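Just for a feel of how these algorithms are used in code, here is a tiny, purely illustrative sketch (scikit-learn assumed; the toy data is made up) showing that all three share the same fit-and-predict pattern:

```python
# Illustrative only: the three algorithms named above, used through the common
# fit/predict interface of scikit-learn. The toy data below is made up.
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X = [[0, 0], [1, 1], [2, 2], [3, 3]]       # input attributes
y_class = [0, 0, 1, 1]                     # discrete labels (classification)
y_value = [0.0, 1.0, 2.0, 3.0]             # continuous targets (regression)

print(RandomForestClassifier(random_state=0).fit(X, y_class).predict([[1.5, 1.5]]))
print(SVC().fit(X, y_class).predict([[1.5, 1.5]]))
print(LinearRegression().fit(X, y_value).predict([[1.5, 1.5]]))
```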
Now, let's see some of the popular use cases
of supervised learning.
So we have Cortana or any other speech
assistant in your mobile phone: it trains using your voice,
and once trained it starts working based on that training.
This is an application of supervised learning. Suppose
you say, OK Google, call Sam,
or you say, Hey Siri, call Sam; you get an answer to it,
an action is performed,
and automatically a call goes to Sam.
So these are just examples
of supervised learning. Next comes the weather app.
Based on some prior knowledge,
like when it is sunny the temperature is higher,
and when it is cloudy the humidity is higher, it
predicts the weather parameters for a given time.
So this is also an example of supervised learning,
as we are feeding the data to the machine and telling it
that whenever it is sunny,
the temperature should be higher, and whenever it is cloudy,
the humidity should be higher.
So it's an example of supervised learning.
Another example is biometric attendance,
where you train the machine, and after a couple of inputs
of your biometric identity, be it your thumb, your iris
or anything else,
once trained the machine can validate your future input
and can identify you. Next comes the banking sector.
In the banking sector,
supervised learning is used to predict the creditworthiness
of a credit card holder
by building a machine learning model that looks
for faulty attributes, by providing it
with data on delinquent
and non-delinquent customers.
Next comes the healthcare sector. In the healthcare sector,
it is used to predict patient readmission rates
by building a regression model,
by providing data
on the patients' treatment administration and readmissions
to show the variables
that best correlate with readmission.
Next comes the retail sector. In the retail sector,
it is used to analyze the products
that a customer buys together.
It does this by building a supervised model
to identify frequent itemsets
and association rules from the transactional data.
Now, let's learn about the next category
of machine learning, the unsupervised part. Mathematically,
unsupervised learning is
where you only
have input data X and no corresponding output variable.
The goal of unsupervised learning is to model
the underlying structure
or distribution of the data
in order to learn more about the data.
So let me rephrase this for you in simple terms:
in the unsupervised learning approach, the data instances
of a training data set do not have
an expected output associated
with them. Instead, the unsupervised
learning algorithm detects patterns based
on the innate characteristics
of the input data. An example of a machine learning task
that applies unsupervised learning is clustering.
In this task, similar data instances are grouped together
in order to identify clusters of data. In this slide,
you can see that initially we have different varieties
of fruits as input.
Now this set of fruits is given as input X to the model.
Once the model is trained using an
unsupervised learning algorithm,
the model will create clusters on the basis of its training.
It will group the similar fruits and make clusters of them.
Let me make things clearer to you.
Let's take another example.
In this slide, the image below shows an example
of an unsupervised learning process. The algorithm processes
an unlabeled training data set,
and based on the characteristics,
it groups the pictures
into three different clusters of data. Despite the ability
to group similar data into clusters,
the algorithm is not capable of adding labels to the groups.
The algorithm only knows which data instances are similar,
but it cannot identify the meaning of these groups.
So now you might be wondering why this category
of machine learning is named unsupervised learning.
Well, it is called
unsupervised learning because, unlike supervised learning,
here there are no correct answers
and there is no teacher. Algorithms are left
on their own to discover
and present the interesting structure in the data.
Let's move on and see some
of the popular unsupervised learning algorithms.
Here we have k-means, the Apriori algorithm
and hierarchical clustering.
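To give you a feel for one of these, here is a small k-means sketch (scikit-learn assumed; the 2-D points are invented just to show unlabeled data being grouped into clusters):

```python
# A minimal k-means clustering sketch; the points are invented for illustration.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [0, 2],       # one natural group of points
              [10, 2], [10, 4], [11, 0]])   # another natural group of points

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)                       # cluster index assigned to each point
print(kmeans.predict([[0, 0], [12, 3]]))    # assign new, never-seen points to clusters
```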
Now let's move on and see some of the examples
of unsupervised learning. Suppose a friend invites you to his party
where you meet total strangers.
Now, you will classify them using unsupervised learning,
as you don't have any prior knowledge about them,
and this classification can be done on the basis
of gender, age group, dressing, education qualification,
or whatever way you might like. Now, why
is this learning different from supervised learning?
Since you didn't use any past or prior knowledge
about the people, you kept on classifying them on the go
as they kept on coming.
Yeah, this category of people belongs to this group,
that category of people belongs to that group, and so on.
Okay, let's see one more example.
Let's suppose you have never seen a football match before,
and by chance you watch a video on the internet.
Now, you can easily classify the players on the basis
of different criteria,
like players wearing the same kind of jersey are
in one class, players wearing a different kind
of jersey are in a different class,
or you can classify them on the basis
of their playing style, like this guy is an attacker,
so he's in one class,
that guy is a defender, so he's in another class,
or you can classify them
whatever way you observe the things.
So this was also an example of unsupervised learning.
Let's move on and see
how unsupervised learning is used in the sectors
of banking, healthcare and retail.
So starting with the banking sector:
in the banking sector it is used to segment customers
by behavioral characteristics, by surveying prospects
and customers to develop multiple segments
using clustering. In the healthcare sector,
it is used to categorize MRI data as normal or abnormal
images. It uses deep learning techniques to build a model
that learns from different features of images to recognize
different patterns.
Next is the retail sector. In the retail sector,
it is used to recommend products to customers
based on their past purchases.
It does this by building a collaborative filtering model
based on their past purchases.
I assume you guys
now have a proper idea of what unsupervised learning means.
If you have even the slightest doubt,
don't hesitate to add your doubt to the comment section.
So let's discuss the third
and the last type of machine learning,
that is, reinforcement learning.
So what is reinforcement learning?
Well, reinforcement learning
is a type of machine learning algorithm
which allows software agents
and machines to automatically determine the ideal behavior
within a specific context to maximize their performance.
Reinforcement learning is about the interaction
between two elements,
the environment and the learning agent.
The learning agent leverages two mechanisms, namely exploration
and exploitation. When the learning agent acts on a trial-
and-error basis,
it is termed exploration,
and when it acts based on the knowledge gained
from the environment,
it is referred to as exploitation.
Now, this environment rewards the agent for correct actions,
which is the reinforcement signal. Leveraging the rewards
obtained, the agent
improves its environment knowledge to select
the next action. In this image,
you can see that the machine is confused
whether it is an apple or it's not an apple.
Then the machine is trained using reinforcement learning.
If it makes a correct decision,
it gets reward points for it,
and in case of a wrong one it gets a penalty.
Once the training is done,
now
the machine can easily identify which one of them is an apple.
Let's see an example here.
We can see that we have an agent
who has to judge from the environment to find out
which of the two is a duck. The first task
it does is observe the environment. Next,
it selects some action using some policy.
It seems that the machine has made a wrong decision
by choosing a bunny as a duck,
so the machine will get a penalty for it,
for example minus 50 points for a wrong answer. Now
the machine will update its policy,
and this will continue
till the machine gets an optimal policy.
From the next time, the machine will know that a bunny is not a duck.
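To connect this to the Q-learning algorithm mentioned in the agenda, here is a toy sketch of the reward-and-penalty update the agent performs; the states, actions and reward values are made up purely for illustration:

```python
# Toy Q-learning update: the agent's table of knowledge is nudged by rewards and penalties.
# States, actions and the reward numbers below are invented for illustration.
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))        # the agent's knowledge of the environment
alpha, gamma = 0.1, 0.9                    # learning rate and discount factor

def update(state, action, reward, next_state):
    # Move Q(state, action) toward the reward plus the discounted best future value.
    best_next = np.max(Q[next_state])
    Q[state, action] += alpha * (reward + gamma * best_next - Q[state, action])

update(state=0, action=1, reward=-50, next_state=1)   # a penalty for a wrong choice
update(state=0, action=0, reward=+10, next_state=2)   # a reward for a correct choice
print(Q)
```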
Let's see some of the use cases of reinforcement learning,
but before that, let's see
how Pavlov trained his dog using reinforcement learning,
or how he applied
the reinforcement method to train his dog.
Pavlov divided the training
into four stages. Initially, Pavlov gave meat to his dog,
and in response to the meat the dog started salivating. Next,
what he did was create a sound
with a bell; to this the dog did not respond at all.
In the third part, he tried to condition the dog
by ringing the bell
and then giving him the food. Seeing the food,
the dog started salivating. Eventually, a situation came
when the dog started salivating just after hearing the bell, even
if the food was not given to him, as the dog was reinforced
to expect that whenever the master rang the bell, he
would get the food.
Now let's move on and see
how reinforcement learning is applied in the fields
of banking, healthcare and retail.
So starting with the banking sector:
in the banking sector, reinforcement learning is used to create
a "next best offer" model
for a call center by building a predictive model
that learns over time
as users accept or reject offers made by the sales staff. Now,
in the healthcare sector it is used to allocate scarce
resources to handle different types of ER
cases by building a Markov decision process
that learns treatment strategies for each type of ER case. Next,
and last, comes the retail sector.
So let's see
how reinforcement learning is applied to the retail sector.
In the retail sector,
it can be used to reduce excess stock
with dynamic pricing, by building a dynamic pricing model
that adjusts the price based
on customer response to the offers.
I hope by now you have attained some understanding of
what machine learning is and you are ready to move ahead.
Welcome to today's topic of discussion on AI
versus machine learning versus deep learning.
These are terms which have confused a lot
of people, and if you too are one among them,
let me resolve it for you.
Well, artificial intelligence is a broader umbrella
under which machine learning
and deep learning come. You can also see in the diagram
that even deep learning
is a subset of machine learning, so you can say
that deep learning is a subset of machine learning,
which in turn is a subset of AI.
So let's move on and understand
how exactly they differ from each other.
So let's start with artificial intelligence.
The term artificial intelligence
was first coined in the year 1956.
The concept is pretty old,
but it has gained its popularity recently.
But why? Well,
the reason is that earlier we had a very small amount of data;
the data we had was not enough to predict the correct result.
But now there's a tremendous increase
in the amount of data. Statistics
suggest that by 2020 the accumulated volume
of data will increase
from 4.4 zettabytes to roughly around 44 zettabytes,
or 44 trillion gigabytes,
of data. Along with such an enormous amount of data,
now we have more advanced algorithms
and high-end computing power and storage
that can deal with such large amounts of data. As a result,
it is expected
that 70% of enterprises will implement AI
over the next 12 months,
which is up from 40 percent in 2016 and 51 percent in 2017.
Just for your understanding,
what is AI? Well,
it's nothing but a technique
that enables machines to act like humans
by replicating their behavior and nature. With AI,
it is possible
for machines to learn from experience.
The machines adjust their responses based
on new input, thereby
performing human-like tasks. Artificial intelligence can be
trained to accomplish
specific tasks by processing large amounts of data
and recognizing patterns in them.
You can consider
that building artificial intelligence is like building
a church: the first church took generations to finish,
so most of the workers who were working on it never saw
the final outcome. Those working
on it took pride in their craft, building bricks
and chiseling stone
that was going to be placed into the great structure.
So as AI researchers,
we should think of ourselves as humble brick makers whose job
is just to study
how to build components, for example planners
or learning algorithms, or anything
that someday, someone, somewhere will integrate
into intelligent systems. Some of the examples
of artificial intelligence from our day-to-day life
are Apple's Siri, chess-playing computers, Tesla's self-driving car
and many more. These examples are based on deep learning
and natural language processing.
Well, this was about what AI is and how it gained its hype.
So moving on ahead,
let's discuss machine learning and see what it is
and why it was even introduced. Well,
machine learning came into existence in the late 80s
and the early 90s.
But what were the issues people faced
which made machine learning come into existence? Let
us discuss them one by one. In the field of statistics,
the problem was
how to efficiently train large complex models. In the fields
of computer science and artificial intelligence,
the problem was how to train more robust versions
of AI systems, while in the case of neuroscience,
the problem faced by researchers was
how to design operational models of the brain.
So these were some of the issues
which had the largest influence and led to the existence
of machine learning.
Now, machine learning shifted its focus
from the symbolic approaches
it had inherited from AI and moved
towards the methods and models
it had borrowed from statistics and probability theory.
So let's proceed and see
what exactly machine learning is.
Well, machine learning is a subset of AI
which enables the computer to act
and make data-driven decisions to carry out a certain task.
These programs or algorithms are designed in a way
that they can learn and improve over time
when exposed to new data.
Let's see an example of machine learning.
Let's say you want to create a system
which tells the expected weight of a person based on their height.
The first thing you do is collect the data.
Let's say this is
how your data looks. Now, each point
on the graph represents one data point. To start
with, we can draw a simple line to predict the weight based
on the height, for example a simple line W = H - 100,
where W is weight in kg and H is height
in centimeters. This line can help us to make predictions.
Our main goal is to reduce the difference
between the estimated value and the actual value.
So in order to achieve it,
we try to draw a straight line that fits through all
these different points and minimizes the error.
So our main goal is to minimize the error
and make it as small as possible. Decreasing the error,
or the difference between the actual value and the estimated
value, increases the performance of the model. Further,
the more data points
we collect, the better
our model will become. We can also improve our model
by adding more variables
and creating different prediction lines for them.
Once the line is created,
then the next time we feed new data,
for example the height of a person, to the model,
it will easily make the prediction for you and tell you
what the predicted weight could be.
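As a rough sketch of this example in code (the height and weight numbers below are invented, and scikit-learn is assumed), fitting the line and predicting for a new height looks like this:

```python
# Sketch of the height -> weight example; the data values are invented for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

height_cm = np.array([[150], [160], [170], [180], [190]])
weight_kg = np.array([52, 58, 69, 79, 91])

model = LinearRegression().fit(height_cm, weight_kg)   # fits the line that minimizes the error
print(model.coef_, model.intercept_)                    # learned slope and intercept
print(model.predict([[175]]))                           # predicted weight for a new height
```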
I hope you got a clear understanding
of machine learning.
So moving on ahead,
let's learn about deep learning. Now, what is deep learning?
You can consider a deep learning model as a rocket engine,
and its fuel is the huge amount of data
that we feed to these algorithms. The concept
of deep learning is not new,
but recently its hype has increased,
and deep learning is getting more attention.
This field is a particular kind of machine learning
that is inspired by
the functionality of our brain cells, called neurons,
which led to the concept of the artificial neural network.
It simply takes
the data connections between all the artificial neurons
and adjusts them according to the data pattern.
More neurons are added
as the size of the data gets larger. It automatically performs
feature learning at multiple levels of abstraction,
thereby allowing a system
to learn complex function mappings without depending
on any specific algorithm.
You know what, no one actually knows what happens
inside a neural network and why it works so well,
so currently you can call it a black box.
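Even though we treat it as a black box, training a small neural network is only a few lines. Here is a minimal sketch using scikit-learn's MLPClassifier as a stand-in for a deeper network (the data set and layer sizes are just illustrative choices):

```python
# A small feed-forward neural network, standing in for a deep network; purely illustrative.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)                      # 8x8 pixel images of handwritten digits
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300, random_state=0)
net.fit(X_train, y_train)                                # the connections adjust to the data pattern
print("test accuracy:", net.score(X_test, y_test))
```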
Let us discuss some examples of deep learning
and understand it in a better way.
Let me start with a simple example
and explain how things work at a conceptual level.
Let us try and understand
how you would recognize a square among other shapes.
The first thing you do is check
whether there are four lines associated with the figure
or not: a simple concept, right?
If yes, we further check
if they are connected and closed; again, if yes,
we finally check whether they are perpendicular
and all the sides are equal, correct?
If everything is fulfilled,
yes, it is a square.
Well, it is nothing but a nested hierarchy of concepts.
What we did here: we took a complex task
of identifying a square
in this case and broke it into simpler tasks.
Now, deep learning also does the same thing,
but at a larger scale.
Let's take the example of a machine which recognizes
animals. The task of the machine is to recognize
whether a given image is of a cat or a dog.
What if we were asked to resolve the same issue using the concept
of machine learning? What would we do first?
We would define features such as
checking whether the animal has whiskers or not, checking
whether the animal has pointed ears
or not, or whether its tail is straight or curved. In short,
we would define the facial features and let
the system identify which features are more important
in classifying a particular animal. Now,
when it comes to deep learning, it takes this one step ahead:
deep learning automatically finds out the features
which are most important for classification, compared
to machine learning,
where we had to manually give out those features. By now,
I guess you have understood that AI is the bigger picture,
and machine learning and deep learning are its parts.
So let's move on
and focus our discussion on machine learning
and deep learning. The easiest way to understand the difference
between machine learning and deep learning is to know
that deep learning is machine learning; more specifically,
it is the next evolution of machine learning.
Let's take a few important parameters
and compare machine learning with deep learning.
So starting with data dependencies:
the most important difference between deep learning
and machine learning is their performance as the volume
of data grows. From the below graph,
you can see
that when the size of the data is small, a deep learning algorithm
doesn't perform that well.
But why? Well,
this is because a deep learning algorithm needs
a large amount of data to understand it perfectly;
on the other hand, a machine learning algorithm can easily
work with a smaller data set.
Next come the hardware dependencies. Deep learning
algorithms are heavily dependent on high-end machines,
while machine learning algorithms can work
on low-end machines as well.
This is because the requirements
of deep learning algorithms include GPUs,
which are an integral part
of their working. Deep learning algorithms require GPUs
as they do a large
number of matrix multiplication operations,
and these operations
can only be efficiently optimized using a GPU,
as it is built for this purpose.
Our third parameter
will be feature engineering. Well, feature engineering is the process
of putting domain knowledge into creating features to reduce the complexity
of the data
and make patterns more visible to learning algorithms.
This process is difficult and expensive in terms of time
and expertise. In the case of machine learning,
most of the features need to be identified by an expert
and then hand-coded as per the domain
and the data type.
For example, the features
can be pixel values, shapes, textures, position, orientation
or anything. The performance
of most machine learning algorithms depends
on how accurately the features are identified and extracted,
whereas in the case
of deep learning algorithms, they try to learn high-level features
from the data.
This is a very distinctive part of deep learning,
which makes it way ahead
of traditional machine learning. Deep learning reduces the task
of developing a new feature extractor for every problem;
for example, in the case
of a CNN algorithm, it first tries to learn the low-level features
of the image, such as edges and lines,
then it proceeds to parts of faces of people,
and then finally to the high-level representation
of the face.
I hope that things are getting clearer to you.
So let's move on ahead and see the next parameter.
Our next parameter is the problem-solving approach.
When we are solving a problem using a traditional machine
learning algorithm,
it is generally recommended
that we first break down the problem
into different sub-parts, solve them individually,
and then finally combine them to get the desired result.
This is how a machine learning algorithm handles the problem.
On the other hand, a deep learning algorithm
solves the problem end to end.
Let's take an example
to understand this. Suppose you have a task
of multiple object detection,
and your task is to identify
what the object is and where it is present in the image.
So let's see and compare
how you would tackle this issue using the concepts
of machine learning
and deep learning. Starting with machine learning:
in a typical machine learning approach,
you would first divide the problem into two steps,
first object detection and then object recognition.
First of all,
you would use a bounding box detection algorithm,
like GrabCut for example, to scan through the image
and find out all the possible objects.
Now, once the objects are detected, you would use an
object recognition algorithm,
like SVM with HOG, to recognize the relevant objects.
Now, finally,
when you combine the results, you would be able to identify
what the object is and where it is present
in the image. On the other hand, in the deep learning approach,
you would do the process end to end. For example,
in a YOLO net,
which is a type of deep learning algorithm, you would pass
an image, and it would give out the location along with the name
of the object.
Now, let's move on to our fifth comparison parameter,
execution time.
Usually a deep learning algorithm takes a long time
to train. This is
because there are so
many parameters in a deep learning algorithm
that training takes longer
than usual; the training might even last for two weeks
or more than that
if you are training completely from scratch,
whereas in the case of machine learning,
it takes relatively much less time to train, ranging
from a few seconds
to a few hours.
Now,
the execution time is completely reversed
when it comes to the testing of data. During testing,
the deep learning algorithm takes much less time to run,
whereas if you compare it with a KNN algorithm,
which is a type of machine learning algorithm, the test
time increases as the size of the data increases. Last
but not least, we have interpretability as
a factor for comparison of machine learning
and deep learning.
This factor is the main reason why deep learning is still
thought about ten times before anyone
uses it in the industry.
Let's take an example. Suppose
we use deep learning to give
automated scoring to essays. The performance it gives
in scoring is quite excellent and is near
to human performance,
but there's an issue with it:
it does not reveal why it
has given that score. Indeed, mathematically
it is possible to find out
which nodes of a deep neural network were activated,
but we don't know
what the neurons were supposed to model
and what these layers of neurons were doing collectively,
so we fail to interpret the result.
On the other hand, a machine learning algorithm
like a decision tree gives us a crisp rule for why it chose
what it chose,
so it is particularly easy to interpret the reasoning
behind it. Therefore, algorithms like decision trees
and linear or logistic
regression are primarily used in industry for interpretability.
Let me summarize things
for you: machine learning uses algorithms to parse
the data, learn from the data,
and make informed decisions based on what it has learned.
In contrast, deep learning structures algorithms in layers to create
an artificial neural network
that can learn
and make intelligent decisions on its own. Finally,
deep learning is a subfield of machine learning;
while both fall under the broad category
of artificial intelligence, deep learning is usually
what's behind the most
human-like artificial intelligence.
Now, in the early days, scientists used to have a lab notebook
to note progress, results
and conclusions. Jupyter is a modern-day tool
that allows data scientists
to record the complete analysis process, much
in the same way other scientists use a lab notebook.
Now, the Jupyter product was originally developed as a part
of the IPython project. The IPython
project was used to provide interactive online access
to Python. Over time, it became useful to interact
with other data analysis tools, such as R, in the same manner.
With this split from Python,
the tool grew into its current manifestation of Jupyter.
Now, IPython is still an active tool
that's available for use.
The name Jupyter itself is derived from the combination
of Julia, Python
and R. While Jupyter runs code
in many programming languages, Python is a requirement
for installing the Jupyter Notebook itself. Now,
to download Jupyter Notebook,
there are a few ways. On the official website,
it is strongly recommended to install Python and Jupyter
using the Anaconda distribution,
which includes Python, the notebook,
and other commonly used packages
for scientific computing as well as data science,
although one can also
do so using the pip installation method. Personally,
what I would suggest is downloading Anaconda
Navigator, which is
a desktop graphical user interface included in Anaconda.
Now, this allows you to launch applications
and easily manage conda packages, environments
and channels without the need to use command-line commands.
So all you need to do is go to the Anaconda website,
and inside you go
to Anaconda Navigator.
So as you can see here,
we have the conda installation instructions which you're going
to use to install it on your particular PC,
so you can use these installers.
Once you download Anaconda Navigator,
it looks something like this.
As you can see here,
we have JupyterLab and Jupyter Notebook, you have the Qt Console,
which is an IPython console,
we have Spyder, which is somewhat similar to RStudio
in terms of Python, again
we have RStudio, we have Orange 3,
we have Glueviz, and we have VS Code.
Our focus today will be on the Jupyter Notebook itself.
Now, when you launch the Navigator,
you can see there are many options available
for launching Python as well
as R instances. Now, by definition, a Jupyter
notebook is fundamentally
a JSON file with a number of annotations.
Now, it has three main parts,
which are the metadata, the notebook format, and the list
of cells. Now, you
should get yourself acquainted with the environment:
the Jupyter user interface has a number of components,
so it's important to know
which components you should be using
on a daily basis and to get acquainted with them.
So as you can see here,
our focus today will be on the Jupyter notebook,
so let me just launch the Jupyter notebook.
Now, what it does is create an online Python instance
for you to use over the web.
So let's launch it now.
As you can see, we have the Jupyter logo at the top left,
as expected, and this acts as a button to go
to your home page: whenever you click
on it you get back to your particular home page,
that is, the dashboard. Now, there are three tabs displayed,
which are Files, Running and Clusters.
Now, what we'll do is understand all
of these three and see
what the importance
of these three tabs is. The Files tab shows the list
of the current files in the directory.
So as you can see, we have so many files here.
Now the Running tab
presents another screen of the currently running processes
and notebooks. Now, the drop-down lists for the terminals
and notebooks are populated with their
running numbers.
So as you can see inside,
we do not have any running terminals,
nor are there any running notebooks as of now,
and the Clusters tab
presents another screen to display the list
of clusters available. In the top right corner of the screen,
there are three buttons, which are the Upload, New
and Refresh buttons.
Let me go back so you can see here:
we have the Upload, New and Refresh buttons.
Now, the Upload button is used to add files
to the notebook space, and you may also just drag and drop,
as you would when handling files.
Similarly, you can drag
and drop notebooks into specific folders as well.
Now, the New menu at the top presents
a further menu of Text File, Folder, Terminal
and Python 3.
Now, the Text File option is used to add a text file
to the current directory. Jupyter will open a new browser window
for you, running a new text editor.
Now, the text entered is automatically saved
and will be displayed in your notebook's Files display.
Now the Folder option,
what it does is create a new folder
with the name Untitled Folder, and remember, all the file
and folder names are editable.
Now the Terminal option allows you to start
an IPython session.
The notebook options available will be activated
when additional notebooks are available in your environment.
The Python 3 option is used to begin a Python 3 session
interactively in your notebook.
The interface looks like the following screenshot.
Now, what you have is full file editing capabilities
for your script, including saving it as a new file.
You also have a complete IDE
for your Python script. Now we come to the Refresh button.
The Refresh button is used to update the display.
It's not really necessary, as the display is reactive
to any changes in the underlying file structure.
Next to the Files tab item,
there is a check box, a drop-
down menu and a home button, as you can see here:
we have the checkbox, the drop-down menu
and the home button.
Now, the checkbox is used to toggle all the checkboxes
in the item list.
So as you can see, you can select all of these and either move
or delete all of the files selected,
or what you can do is select all
and deselect some of the files
as you wish. Now, the drop-down menu presents a list
of choices available,
which are Folders, All Notebooks, Running
and Files. The Folders section
will select all the folders in the display
and present a count of the folders in the small box.
So as you can see here,
we have 18 folders. Now, the All Notebooks section
will change the count to the number of notebooks
and provide you with options,
so you can see here
it has selected all the given notebooks,
and you get the option to either duplicate the current notebook,
move it, view it, edit it or delete it.
Now, the Running section will select any running scripts;
as you can see here,
we have zero running scripts,
and it updates the count to the number selected.
Now, the Files section will select all the files
in the notebook display and update the count accordingly.
So if you select the files here,
there are seven files; as you can see here,
we have seven files: some datasets, CSV files
and text files. Now, the home button
brings you back to the home screen of the notebook.
So all you need to do is click on the Jupyter
Notebook logo, and
it will bring you back to the Jupyter Notebook dashboard.
Now, as you can see, on the left-hand side
of every item is a checkbox, an icon and the item's name.
The checkbox is used to build a set of files to operate
upon, and the icon is indicative of the type of the item.
In this case,
all of the items here are folders; coming down,
we have the running notebooks,
and finally we have certain files, which are the text files
and the CSV files. Now, a typical workflow
of any Jupyter
notebook is to, first of all, create a notebook
for your project or your data analysis,
add your analysis steps, coding
and output, and surround your analysis with organization
and presentation markdown to communicate
an entire story. Now, interactive notebooks
that include widgets and display modules
will then be used by others, who modify parameters
and the data to note the effects of the changes. Now,
if we talk about security, Jupyter notebooks are created
in order to be shared with other users, in many cases
over the internet.
However, a Jupyter notebook can execute arbitrary code
and generate arbitrary output.
This can be a problem
if malicious aspects have been placed
in the notebook. Now, the default security mechanisms for Jupyter
notebooks include raw HTML,
which is always sanitized and checked for malicious coding.
Another aspect is that you cannot run external JavaScript.
Now, the cell contents,
especially the HTML and the JavaScript, are not trusted:
they require user validation
to continue,
and the output from any cell is not trusted. All other HTML
or JavaScript is never trusted,
and clearing the output will cause the notebook
to become trusted
when saved. Now, notebooks can also use a security digest
to ensure the correct user is modifying the contents.
For that, what you need to know is that a digest
takes into account
the entire contents of the notebook and a secret
which is only known by the notebook creator,
and this combination ensures
that malicious coding is not going to be added
to the notebook.
So you can add a security digest to a notebook
using the approach I have shown here:
under the Jupyter profile
you have selected, you
place the secret in the security notebook secret file.
So what you can do is replace the notebook
secret with your own secret,
and that will act as a key for the particular notebook.
So what you need to do is share that particular key
with all your colleagues,
or whoever you want to share that particular notebook
with, and in that case
it keeps the notebook
secured and away from other malicious coders.
Another aspect of Jupyter is configuration.
You can configure some of the display parameters used
in presenting notebooks.
Now, these are configurable due to the use of a product known
as CodeMirror to present and modify the notebook.
So what is CodeMirror, basically? It is a JavaScript-
based editor for use
within web pages and notebooks.
So if you look up CodeMirror,
as you can see here,
CodeMirror is a versatile text editor implemented
in JavaScript for the browser.
So what it does is
allow you to configure the display options for Jupyter.
So now let's execute some Python code
and understand the notebook in a better way. Jupyter
does not interact
with your scripts so much as it executes your script
and requests the result.
I think this is how Jupyter notebooks
have been extended to other languages besides Python:
it just takes
a script, runs it against a particular language engine,
and records the output from the engine, all
the while not really knowing what kind
of script is being executed. Now, the new window shows
an empty cell
for you to enter the Python code. Now,
what you need to do is, under New, select Python 3, and
it will open a new notebook.
Now, this notebook is Untitled,
so let's give the new work area a name, "Python code".
So as you can see, we have renamed this particular notebook.
Now, the save status should be shown next to the title;
as you can see, it says "Last
Checkpoint: a few days ago (unsaved changes)".
The autosave option is always on, and
with an accurate name
we can find this
particular notebook very easily
from the notebook home page.
So if you select your browser's Home tab
and refresh, you will find
this new notebook name displayed here again.
So if you just go to the notebook home,
as you can see,
we have the "Python code" entry, and under Running
also you have the "Python code" entry here.
So let's get back to that particular page,
or the notebook. One thing to note here is
that it has
a notebook item icon rather than a folder icon,
and the automatically assigned extension,
as you can see here, is .ipynb, the IPython notebook. Since
the item is open in a browser in a Jupyter environment,
it is marked as running, and there
is a file by that name in this directory as well.
So if you go to your directory,
let me go and check it.
So as you can see, if you go into the user folder, you
can see we have the in-class projects,
and the "Python code" notebook, like the others, automatically
has that particular IPython notebook file created
in our working environment
and in the local disk space also.
So if you open the .ipynb file in a text editor,
you will see the basic contents of a Jupyter file; as you can see,
if I open it,
the cells are empty.
Nothing is there, so let's type in some code.
For example, I'm going to put in name equals "Edureka".
Next, what I'm going to do is provide subscribers
equals 700, and to run this particular cell,
what you need to do is click on the Run icon,
and you will see here we have [1],
so this is the first cell to be executed. In the second cell,
we enter Python code
that references the variables from the first cell.
So as you can see here,
we print the name, the string " has ", and the subscribers.
So let me just run this particular cell.
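For reference, the two cells described here amount to something like the following (the values are taken from the narration; your own name and number would of course differ):

```python
# Cell 1
name = "Edureka"
subscribers = 700

# Cell 2 - references the variables defined in the first cell
print(name + " has " + str(subscribers) + "K YouTube subscribers")
```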
So as you can see here, note
that we now have an output
that says Edureka has 700K YouTube subscribers,
and it has since crossed 700K. Now, to know more about Jupyter
and other technologies,
what you can do is subscribe to our channel and get
updates on the latest trending technologies.
So note that Jupyter color-codes
your Python just as a decent editor would,
and we have bracketed numbers
to the left of each code block, as you can see here.
If we execute a cell,
the results are displayed inline. Now, it's interesting
that Jupyter keeps
the output last generated in the saved version of the file,
and it saves checkpoints.
Now, if we were to rerun the cells using Run
or Run All, the output would be regenerated
and saved by autosave. Now,
the cell number is incremented, and as you can see,
if I rerun this, you see the cell number change
from one to three,
and if I rerun this, the cell number will change from 2 to 4.
So what Jupyter does is keep a track of the latest version
of each cell. Similarly,
if you were to close the browser tab and refresh the display
in the Home tab,
you will find the new item we created,
which is the "Python code" notebook, saved and autosaved;
as you can see here, in the bracket it says autosaved.
So if we close this and click the home button,
you can see here
we have "Python code".
So as you can see, if we click that, it opens the same notebook,
and the previously
displayed items will always be there, showing the output
that we generated in the last run. Now
that we have seen how Python works
in Jupyter, including the underlying encoding,
let's see how a pandas data set works in Jupyter.
So let me create another new Python notebook.
What I'm going to do is name this one "pandas".
From here,
what we will do is read in a large data set
and compute some standard statistics of the data.
What we are interested in is seeing
how to use pandas in Jupyter,
how well the script performs,
and what information is stored in the metadata,
especially if it's a large data set.
Our Python script accesses the iris data set
that's built into one of the Python packages.
All we are looking to do is read in a slightly large number
of items and calculate some basic operations
on the data set.
So first of all,
what we need to do is,
from sklearn, import the datasets module. sklearn
is scikit-learn, and it is another Python library.
It contains a lot of data sets for machine learning
and all the algorithms
which are present for machine learning,
along with those data sets.
So what we're going to do now is pull in the IRS data.
What we're going to do is Iris underscore data set equals
and the load on the screen now that should do and I'm sorry,
it's data set start lower.
So so as you can see here,
the number here is considered three now
because in the second drawer
and we encountered an error it was data set.
He's not data set.
So so what we're going
to do is grab the first two corner of the data.
So let's pretend x equals.
If you press the tab, it automatically detects
what you're going to write as Todd datasets dot data.
And what we're going to do is take the first two rows comma
not to run it from your keyboard.
All you need to do is press shift + enter.
So next what we're going to do is calculate
some basic statistics.
So what we're going to do is X underscore.
Count equals x I'm going to use the length function and said
that we're going to use x dot flat similarly.
We going to see X-Men
and X Max and the Min our display our results.
What we're going to do is you just play the results now, so
as you can see the counter 300 the minimum value is 3.8 m/s.
And what is 0.4
and the mean is five point eight four three three three.
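Put together, the cells narrated above amount to roughly the following; this is a reconstruction, so the exact slicing of the data is an assumption:

```python
# Reconstruction of the cells described above; the column slice is an assumption.
from sklearn import datasets

iris_dataset = datasets.load_iris()
x = iris_dataset.data[:, :2]        # first two columns of the measurements

x_count = x.size                    # number of values (the video uses len(x.flat))
x_min = x.min()
x_max = x.max()
x_mean = x.mean()
print(x_count, x_min, x_max, x_mean)
```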
So let me connect this to real life
and tell you all the things
which you can easily do using the concepts of machine learning.
So you can easily get answers to questions
like which type of house lies in this segment,
or what is the market value of this house, or is this
email spam or not spam?
Is there any fraud?
Well, these are some of the questions you could ask
the machine,
but to get answers to these you need some algorithm;
the machine needs to be trained on the basis of some algorithm.
Okay, but how will you decide which algorithm
to choose and when?
Okay.
So the best option for us is to explore them one by one.
So the first is the classification algorithm,
where a category is predicted using the data.
If you have a question
like, is this person a male
or a female, or is this email spam or not
spam, then these categories of question fall
under the classification algorithm. Classification is
a supervised learning approach
in which the computer program learns from the input
given to it
and then uses
this learning to classify new observations. Some examples
of classification problems
are speech recognition, handwriting
recognition, biometric identification, document
classification, etc.
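As a tiny illustration of what a classifier looks like in code (the single feature and the labels below are invented; scikit-learn assumed):

```python
# A minimal classification sketch: logistic regression choosing between two categories.
# The feature values and labels are invented purely for illustration.
from sklearn.linear_model import LogisticRegression

X = [[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]]   # one made-up input feature
y = [0, 0, 0, 1, 1, 1]                            # the known category for each example

clf = LogisticRegression().fit(X, y)
print(clf.predict([[2.5], [8.5]]))                # categories predicted for new inputs
```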
So next is the anomaly detection algorithm,
where you identify unusual data points.
So what is anomaly detection?
Well, it's a technique
that is used to identify unusual patterns
that do not conform to expected behavior,
or what you can call outliers.
It has many applications
in business, like intrusion detection,
such as identifying strange patterns in network traffic
that could signal a hack, or system health monitoring,
that is, spotting a deadly tumor in an MRI scan,
or you can even use it
for fraud detection in credit card transactions,
or to deal with fault detection in operating environments.
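One common way to do this in code is with an isolation forest; here is a brief sketch (the transaction amounts are invented, and this is just one of several possible techniques):

```python
# A brief anomaly-detection sketch using an Isolation Forest; the amounts are invented.
from sklearn.ensemble import IsolationForest

amounts = [[25], [30], [27], [22], [26], [29], [24], [5000]]   # one obvious outlier
detector = IsolationForest(random_state=0).fit(amounts)
print(detector.predict(amounts))    # 1 = normal, -1 = flagged as an anomaly
```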
So next comes the clustering algorithm.
You can use the clustering algorithm to group the data
based on some similar conditions.
Now you can get answers to which type of houses lie
in this segment, or what type of customer buys this product.
Clustering is the task of dividing the population
or data points into a number of groups such
that the data points in the same group are more
similar to other data points in the same group than to those
in the other groups. In simple words,
the aim is to segregate groups with similar traits
and assign them into clusters.
Let's understand this
with an example. Suppose you are the head of a rental store
and you wish to understand the preferences of your customers
to scale up your business.
So is it possible
for you to look at the details of each customer and design
a unique business strategy for each of them?
Definitely not, right?
But what you can do is cluster all your customers into,
say, 10 different groups based on their purchasing habits,
and use a separate strategy
for customers in each of these ten different groups.
And this is what we call clustering.
Next, we have the regression algorithm, where a data
value itself is predicted. Questions
you may ask of this type of model are like, what is
the market value of this house,
or is it going to rain tomorrow or not?
So regression is one of the most important and broadly
used machine learning and statistics tools.
It allows you to make predictions from data by learning
the relationship between the features of your data
and some observed, continuous-valued response. Regression
is used in a massive number of applications;
you know, stock price prediction can be done
using regression. Now that
you know about the different machine learning algorithms,
how will you decide which algorithm to choose
and when? So let's cover this part using a demo.
So in this demo part,
what we will do is create six different machine learning models
and pick the best model and build confidence such
that it has the most reliable accuracy.
For our demo part we will be using the iris data set.
This data set is quite famous
and is considered one of the best small projects to start
with. You can consider
this as a "hello world" data set for machine learning.
So this data set consists
of 150 observations of iris flowers.
There are four columns of measurements of the flowers in centimeters,
the fifth column being the species
of the flower observed. All the observed flowers belong
to one of three species: Iris setosa, Iris virginica
and Iris versicolor.
Well, this is a good project
because it is so well understood.
The attributes are numeric,
so you have to figure out how to load and handle the data.
It is a classification problem,
thereby allowing you to practice
with perhaps an easier type of supervised learning algorithm.
It has only four attributes and 150 rows,
meaning it is very small
and can easily fit into memory, and all
of the numeric attributes are in the same unit
and the same scale, meaning you do not require any special scaling
or transformation to get started.
So let's start coding, and as I told earlier, for the demo I'll be using Anaconda with Python 3 installed on it. So when you install Anaconda, this is how your Navigator will look. So this is the home page of my Anaconda Navigator. On this, I'll be using the Jupyter Notebook, which is a web-based interactive computing notebook environment, which will help me write and execute my Python code. So let's hit the launch button and launch our Jupyter Notebook. So as you can see, my Jupyter Notebook is starting on localhost:8890.
Okay, so there's my Jupyter Notebook. What I'll do here is select New and then Python 3 notebook. That's my environment where I can write and execute all my Python code.
So let's start by checking the versions of the libraries. In order to make this video short, more interactive and more informative, I've already written the set of code, so let me just copy and paste it, and then I'll explain it to you one by one. So let's start by checking the versions of the Python libraries. Okay, so there is the code; let's just copy it and paste it.
Okay, first let me summarize things for you. What we are doing here is just checking the versions of the different libraries, starting with Python: we'll first check what version of Python we are working on, then we'll check what versions of SciPy, NumPy, matplotlib, pandas and scikit-learn we are using. Okay.
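For reference, a minimal sketch of such a version-check cell might look like this (the exact script used in the video isn't shown on screen, so treat this as an assumption of its contents):

```python
# Check the versions of the main libraries
import sys
import scipy
import numpy
import matplotlib
import pandas
import sklearn

print('Python: {}'.format(sys.version))
print('scipy: {}'.format(scipy.__version__))
print('numpy: {}'.format(numpy.__version__))
print('matplotlib: {}'.format(matplotlib.__version__))
print('pandas: {}'.format(pandas.__version__))
print('sklearn: {}'.format(sklearn.__version__))
```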
So let's execute the Run button and see what versions of the libraries we are using. So we are working on Python 3.6.4, SciPy 1.0, NumPy 1.14, matplotlib 2.1.2, pandas 0.22 and scikit-learn version 0.19. Okay. So these are the versions which I'm using. Ideally your versions should be more recent or they should match, but don't worry if you are a few versions behind, as the APIs do not change so quickly; everything in this tutorial will very likely still work for you.
Okay, but in case you are getting an error, stop and try to fix that error. In case you are unable to find the solution for the error, feel free to reach out to Edureka, even after this class. Let me tell you this: if you are not able to run the script properly, you will not be able to complete this tutorial. Okay, so whenever you get a doubt, reach out to Edureka and just resolve it. Now, if everything is working smoothly, then now is the time to load the data set.
So as I said, I'll be using the Iris flower data set for this tutorial, but before loading the data set, let's import all the modules, functions and objects which we are going to use in this tutorial. Again, I've already written the set of code, so let's just copy and paste them and load all the libraries. So these are the various libraries which we'll be using in our tutorial. Everything should work fine without an error; if you get an error, just stop, because you need to work on your SciPy environment before you continue any further. So I guess everything should work fine; let's hit the Run button and see.
Okay, it worked.
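The exact import cell isn't shown on screen, but based on what is used later in the demo, it presumably looks something like this sketch:

```python
# Load the libraries used throughout this tutorial
import pandas
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt
from sklearn import model_selection
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
```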
So let's now move ahead and load the data. We can load the data directly from the UCI Machine Learning Repository. First of all, let me tell you, we are using pandas to load the data. Okay. So let's say my URL is this: this is my URL for the UCI Machine Learning Repository, from where I will be downloading the data set. Okay.
Now, what I'll do is specify the name of each column when loading the data. This will help me later to explore the data. Okay, so I'll just copy and paste it down. Okay, so I'm defining a variable names, which consists of the various column names, including sepal length, sepal width, petal length, petal width and class. So these are just the names of the columns from the data set. Okay.
Now let's define the data set. So dataset equals pandas dot read_csv, and inside that we are passing the URL and names equals names. As I already said, we'll be using pandas to load the data. Alright, so we are using pandas read_csv, which means we are reading a CSV file, and inside that we specify where that CSV is coming from, which is the URL. So there's my URL. Okay, and names equals names just specifies the names of the various columns in that particular CSV file. Okay. So let's move forward and execute it.
So now our data set is loaded. In case you have some network issues, just go ahead and download the Iris data file into your working directory and load it using the same method, but make sure that you change the URL to the local file name, or else you might get an error.
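Putting those pieces together, the loading cell presumably looks roughly like this (the UCI URL below is the usual location of the Iris CSV, stated here as an assumption since it is only read out loud in the video):

```python
# Load the Iris data set from the UCI Machine Learning Repository
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=names)
```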
Okay.
Yeah, our data set is loaded. So let's move ahead and check our data set. Let's see how many columns and rows we have in our data set. Okay. So let's print the number of rows and columns in our data set; that's dataset dot shape. What this will do is give you the total number of rows and the total number of columns, or you can say the total number of instances and attributes in your data set, fine. So for print dataset dot shape we are getting 150 and 5. So 150 is the total number of rows in your data set and 5 is the total number of columns, fine.
So, moving on ahead, what if I want to see a sample of the data set? Okay. So let me just print the first few instances of the data set. Okay, so print dataset dot head; what I want is the first 30 instances, fine. This will give me the first 30 rows of my data set. Okay. So when I hit the Run button, what I am getting is the first 30 results, okay, 0 to 29. So this is how my sample data set looks: sepal length, sepal width, petal length, petal width and the class, okay.
So this is how our data set looks. Now, let's move on and look at the summary of each attribute. What if I want to find out the count, the mean, the minimum and the maximum values, and some other percentiles as well? So what should I do then? For that, print dataset dot describe. What did it give? Let's see. So you can see that all the numbers are on the same scale, in a similar range between 0 and 8 centimeters, right? The mean value, the standard deviation, the minimum value, the 25th percentile, 50th percentile, 75th percentile and the maximum value all lie in the range between 0 and 8 centimeters. Okay. So what we just did is take a summary of each attribute.
Now, let's look at the number of instances that belong to each class. So for that, what we'll do is print the data set grouped by class, and I want the size of each class, fine, and let's hit the Run. Okay. So what I want to do is print out the data set, but I want it by class, so group by class; and now I want the size of each class, so group by class dot size, and hit the Run. So you can see that I have 50 instances of Iris setosa, 50 instances of Iris versicolor and 50 instances of Iris virginica. Okay, all of data type int64, fine.
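Taken together, these quick checks presumably correspond to cells like the following sketch, assuming the column names defined above:

```python
# Dimensions of the data set: (rows, columns)
print(dataset.shape)

# Peek at the first 30 rows
print(dataset.head(30))

# Statistical summary of each attribute
print(dataset.describe())

# Number of instances that belong to each class
print(dataset.groupby('class').size())
```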
So now we have a basic idea of our data. Now, let's move ahead and create some visualizations for it.
So for this we are going to create two different types of plots: first would be the univariate plots, and next would be the multivariate plots. We'll be creating univariate plots to better understand each attribute, and then we'll be creating the multivariate plots to better understand the relationships between the different attributes. Okay. So we start with some univariate plots, that is, plots of each individual variable. Given that the input variables are numeric, we can create box and whisker plots for them. Okay.
So let's move ahead and create a box and whisker plot. So dataset dot plot; what kind do I want? It's a box. Okay, and do I need subplots? Yeah, I need subplots, so subplots equals True. What type of layout do I want? My layout structure is 2 by 2. Next, do I want to share my x and y coordinates? No, I don't want to share them, so sharex equals False and sharey equals False as well. Okay. So we have our dataset dot plot, kind equals box, subplots True, layout 2 by 2, and then what do I want to do? I want to see it, so plot dot show whatever I created. Okay, execute it.
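In code, that box-and-whisker cell presumably looks like this sketch:

```python
# Box and whisker plots, one per input variable
dataset.plot(kind='box', subplots=True, layout=(2, 2), sharex=False, sharey=False)
plt.show()
```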
That just gives us a much clearer idea about the distribution of the input attributes. Now, what if instead of giving the layout as 2 by 2 I had given it as 4 by 4? What would that result in? Just see: everything would be printed in just one single row.
Hold on guys, Aria has a doubt. He's asking why we're using the sharex and sharey values: what are these, and why have we assigned False values to them? Okay, Aria, in order to resolve this query, I need to show you what will happen if I give True values to them. Okay, so be with me: sharex equals True and sharey equals True as well. So let's see what result we'll get. You're getting it: the x and y coordinates are just shared among all four visualizations, right? So you can see that the sepal length and sepal width have y values ranging from 0.0 to 7.5, which are being shared among both the visualizations, and so is it with the petal length; it has shared values between 0.0 and 7.5. Okay, so that is why I don't want to share the values of x and y; it's just giving us a cluttered visualization. So Aria, why am I doing this? I'm just doing it because I don't want my x and y coordinates to be shared among any visualization. Okay, that is why my sharex and sharey values are False. Okay, let's execute it.
So this is a much clearer visualization, which gives a clear idea about the distribution of the input attributes. Now, if you want, you can also create a histogram of each input variable to get a clearer idea of the distribution. So let's create a histogram for it: dataset dot hist. Okay, and I would need to see it, so plot dot show. Let's see. So there's my histogram, and it seems that we have two input variables that have a Gaussian distribution. This is useful to note, as we can use algorithms that can exploit this assumption.
Okay.
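The histogram cell is presumably just:

```python
# Histogram of each input variable
dataset.hist()
plt.show()
```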
So next come the multivariate plots. Now that we have created the univariate plots to understand each attribute, let's move on and look at the multivariate plots and see the interactions between the different variables. So first, let's look at the scatter plots of all the attributes; this can be helpful to spot structured relationships between input variables. Okay. So let's create a scatter matrix. For creating a scatter plot we need scatter_matrix, and we need to pass our data set into it. Okay, and then I want to see it, so plot dot show. So this is how my scatter matrix looks. Note the diagonal grouping of some pairs of attributes, right? This suggests a high correlation and a predictable relationship. All right.
This was our multivariate plot.
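That cell, in sketch form:

```python
# Scatter plot matrix to spot pairwise relationships between attributes
scatter_matrix(dataset)
plt.show()
```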
Now, let's move on and evaluate some algorithms. It's time to create some models of the data and estimate their accuracy on the basis of unseen data. Okay. So now we know all about our data set, right? We know how many instances and attributes there are in our data set, and we know the summary of each attribute, so I guess we have seen enough about our data set. Now, let's move on, create some models, and estimate their accuracy based on unseen data. Okay.
So for that, first of all, let's create a validation data set. How will we create a validation data set? For creating a validation data set, what we are going to do is split our data set into two parts. Okay. So the very first thing we'll do is create a validation data set. Why do we even need a validation data set? We need a validation data set to know that the model we created is any good. Later, we'll use statistical methods to estimate the accuracy of the models that we create on unseen data, and we also want a more concrete estimate of the accuracy of the best model on unseen data by evaluating it on actual unseen data. Okay, confused? Let me simplify this for you. What we'll do is split the loaded data into two parts: the first 80 percent of the data will be used to train our models, and the remaining 20% we will hold back as the validation data set, which we'll use to verify our trained model. Okay, fine.
So let's define an array. This is my array; it will consist of all the values from the data set, so array equals dataset dot values. Okay, next I'll define a variable X, which will consist of the first four columns from the array, and a variable Y, which will consist of the class column. So first of all, we define a variable X that consists of all the rows of the array and the columns from 0 up to 4. Okay, so these are the columns which we will include in the X variable. And for the Y variable, I'll define it as the class, or the output; what I need is just the fourth column, that is, my class column, so I'll take all the rows and just the fourth column. Okay, now I'll define my validation size: validation_size, which I'll define as 0.20, and I'll also use a seed; I define seed equals 6. This seed sets the starting value used in generating random numbers. Okay, I'll tell you what the importance of that is later on. Okay. So let me define a few variables such as X_train, X_test, Y_train and Y_test. Okay, so what we want to do is use model_selection. But before doing that, what we have to do is split our data set into two parts. Okay, so model_selection dot train_test_split; what we want to split are the values of X and Y, okay, my test size equals validation_size, which is 0.20, correct, and my random state is equal to seed. So what the seed is doing is helping me keep the same randomness in the training and testing data sets, fine. So let's execute it and see what our result is.
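As a sketch, that split cell presumably looks something like this (the variable names follow what is described above):

```python
# Split the data set: 80% for training, 20% held back for validation
array = dataset.values
X = array[:, 0:4]          # the four measurement columns
Y = array[:, 4]            # the class column
validation_size = 0.20
seed = 6
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(
    X, Y, test_size=validation_size, random_state=seed)
```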
It's executed. Next, we'll create a test harness. For this we'll use 10-fold cross-validation to estimate accuracy. What this will do is split our data set into 10 parts, train on nine parts and test on one part, and this will repeat for all combinations of train and test splits. Okay. So for that, let's define again my seed, which was six, already defined, and scoring equals accuracy, fine. So we are using the metric of accuracy to evaluate the models. What is this? This is the ratio of the number of correctly predicted instances divided by the total number of instances in the data set, multiplied by 100 to give a percentage; for example, 98% accurate or 99% accurate, things like that. Okay, so we'll be using the scoring variable when we build and evaluate each model in the next step.
The next part is building the models. Till now we don't know which algorithm would be good for this problem or what configuration to use, so let's evaluate six different algorithms. I'll be using logistic regression, linear discriminant analysis, k-nearest neighbors, classification and regression trees, naive Bayes and support vector machines. These algorithms I'm using are a good mixture of simple linear and non-linear algorithms: the simple linear ones include logistic regression and linear discriminant analysis, and the non-linear part includes the KNN algorithm, the CART algorithm, naive Bayes and support vector machines. Okay. We reset the random number seed before each run to ensure that the evaluation of each algorithm is performed using exactly the same data splits; it ensures the results are directly comparable. Okay, so let me just copy and paste it. Okay. So what we're doing here is building six different types of models: logistic regression, linear discriminant analysis, k-nearest neighbors, decision tree, Gaussian naive Bayes and the support vector machine.
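In code, the model-building and evaluation cells presumably look roughly like this sketch (the exact list and loop are read out rather than shown, so this follows the description above):

```python
# Build and evaluate six different models with 10-fold cross-validation
scoring = 'accuracy'
models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC()))

results = []
names = []
for name, model in models:
    # Same seed for every model so each is evaluated on the same splits
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
    cv_results = model_selection.cross_val_score(model, X_train, Y_train,
                                                 cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    print('{}: {} ({})'.format(name, cv_results.mean(), cv_results.std()))
```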
Okay, next what we'll do is evaluate each model in turn. Okay. So what is this? We have six different models and an accuracy estimation for each one of them; now we need to compare the models to each other and select the most accurate of them all. So running the script, we see the following results on the screen. What is it? It is just the accuracy score using the different algorithms: when we are using logistic regression, what is the accuracy; when we are using linear discriminant analysis, what is the accuracy; and so on. Okay. So from the output, it seems that the LDA algorithm was the most accurate model that we tested. Now we want to get an idea of the accuracy of the model on our validation set, or the testing data set. This will give us an independent final check on the accuracy of the best model. It is always valuable to keep a testing data set, just in case you made an error such as overfitting to the training data set, or a data leak; both will result in an overly optimistic result.
Okay, you can run the LDA model directly on the validation set and summarize the results as a final accuracy score, a confusion matrix and a classification report.
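That final check presumably looks like this sketch, assuming the LDA model came out on top as described:

```python
# Make predictions on the held-back validation set with the best model (LDA here)
lda = LinearDiscriminantAnalysis()
lda.fit(X_train, Y_train)
predictions = lda.predict(X_test)

print(accuracy_score(Y_test, predictions))
print(confusion_matrix(Y_test, predictions))
print(classification_report(Y_test, predictions))
```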
Statistics and probability are essential because these disciplines form the basic foundation of all machine learning algorithms, deep learning, artificial intelligence and data science. In fact, mathematics and probability are behind everything around us, from shapes, patterns and colors to the count of petals in a flower; mathematics is embedded in each and every aspect of our lives.
So I'm going to go ahead and discuss the agenda for today with you all. We're going to begin the session by understanding what data is. After that, we'll move on and look at the different categories of data, like quantitative and qualitative data. Then we'll discuss what exactly statistics is, the basic terminologies in statistics, and a couple of sampling techniques. Once we're done with that, we'll discuss the different types of statistics, which involve descriptive and inferential statistics. Then, in the next section, we will mainly be focusing on descriptive statistics; here we'll understand the different measures of center, measures of spread, Information Gain and entropy, and we'll also understand all of these measures with the help of a use case.
And finally, we'll discuss what exactly a confusion Matrix is.
Once we've covered the entire descriptive statistics module, we'll discuss the probability module. Here we'll understand what exactly probability is and the different terminologies in probability. We will also study the different probability distributions, then we'll discuss the types of probability, which include marginal probability, joint probability and conditional probability. Then we move on and discuss a use case wherein we will see examples that show us how the different types of probability work, and to better understand Bayes' theorem, we'll look at a small example. Also, I forgot to mention that at the end of the descriptive statistics module we'll be running a small demo in the R language. So for those of you who don't know much about R, I'll be explaining every line in depth, but if you want to have a more in-depth understanding of R, I'll leave a couple of blogs and a couple of videos in the description box; you all can definitely check out that content.
Now, after we've completed the probability module, we'll discuss the inferential statistics module. We'll start this module by understanding what point estimation is, we'll discuss what a confidence interval is and how you can estimate the confidence interval, and we'll also discuss margin of error; we'll understand all of these concepts by looking at a small use case. We finally end the inferential statistics module by looking at what hypothesis testing is. Hypothesis testing is a very important part of inferential statistics, so we'll end the session by looking at a use case that discusses how hypothesis testing works, and to sum everything up, we'll look at a demo that explains how inferential statistics works.
Right?
So guys, there's a lot to cover today.
So let's move ahead and take a look at our first topic
which is what is data.
Now, this is quite a simple question. If I ask any of you what data is, you'll say that it's a set of numbers or some sort of documents that are stored on my computer. Now, data is actually everything. All right, look around you: there is data everywhere. Each click on your phone generates more data than you know, and this generated data provides insights for analysis and helps us make better business decisions. This is why data is so important. To give you a formal definition, data refers to facts and statistics collected together for reference or analysis.
All right.
This is the definition of data in terms
of statistics and probability.
So as we know data can be collected it
can be measured and analyzed
it can be visualized by using statistical models
and graphs now data is divided into two major subcategories.
Alright, so first we have qualitative data
and quantitative data.
These are the two different types of data. Under qualitative data we have nominal and ordinal data, and under quantitative data we have discrete and continuous data.
Now, let's focus on qualitative data.
Now this type of data deals with characteristics and descriptors
that can't be easily measured
but can be observed subjectively
now qualitative data is further divided
into nominal and ordinal data.
So nominal data is any sort of data that doesn't have any order or ranking. Okay. An example of nominal data is gender. Now, there is no ranking in gender; there's only male, female or other, right? There is no one, two, three, four or any sort of ordering in gender. Race is another example of nominal data.
Now ordinal data is basically an ordered series of information.
Okay, let's say that you went to a restaurant.
Okay.
Your information is stored in the form of customer ID.
All right.
So basically you are represented with a customer ID.
Now, you would have rated their service as either good or average. All right, that's what ordinal data is,
and similarly they'll have a record of other customers
who visit the restaurant along with their ratings.
All right.
So any data which has some sort of sequence
or some sort of order to it is known as ordinal data.
All right, so guys,
this is pretty simple to understand now,
let's move on and look at quantitative data.
So quantitative data basically deals with numbers. Okay, you can understand that by the word quantitative itself: quantitative is basically quantity, right? So it deals with numbers; it deals with anything that you can measure objectively, right?
So there are two types
of quantitative data there is discrete and continuous data
now discrete data is also known as categorical data
and it can hold a finite number of possible values.
Now, the number of students in a class is a finite Number.
All right, you can't have infinite number
of students in a class.
Let's say in your fifth grade.
There were a hundred students in your class.
All right, there weren't infinite number but there was
a definite finite number of students in your class.
Okay, that's discrete data.
Next.
We have continuous data.
Now this type of data can hold infinite number
of possible values.
Okay.
So when I say the weight of a person is an example of continuous data, what I mean to say is that my weight can be 50 kgs, or it can be 50.1 kgs, or it can be 50.001 kgs, or 50.0001, or 50.023 and so on, right? There are an infinite number of possible values, right?
So this is what I mean by continuous data.
All right.
This is the difference between discrete and continuous data.
And also I would like to mention a few other things over here.
Now, there are a couple of types of variables as well.
We have a discrete variable
and we have a continuous variable discrete variable
is also known as a categorical variable
or and it can hold values of different categories.
Let's say that you have a variable called message
and there are two types of values that this variable
can hold let's say
that your message can either be a Spam message
or a non spam message.
Okay, that's when you call a variable as discrete
or categorical variable.
All right, because it can hold values
that represent different categories of data. Now, continuous variables are basically variables that can store an infinite number of values. So the weight of a person can be denoted as a continuous variable. All right, let's say there is a variable called weight, and it can store an infinite number of possible values; that's why we'll call it a continuous variable. So guys, basically a variable is anything that can store a value, right? So if you associate any sort of data with a variable, then it will become either a discrete variable or a continuous variable.
There are also dependent and independent types of variables. Now, we won't discuss all of that in depth because that's pretty understandable. I'm sure all of you know what an independent variable and a dependent variable are, right? A dependent variable is any variable whose value depends on some other independent variable. So guys, that much knowledge I expect all of you to have, all right.
So now, let's move on and look at our next topic, which is what is statistics. Now, coming to the formal definition of statistics: statistics is an area of applied mathematics which is concerned with data collection, analysis, interpretation and presentation. Now, usually when I speak about statistics, people think statistics is all about analysis, but statistics has other parts to it: data collection is also part of statistics, as are data interpretation and presentation. All of this comes into statistics. All right, we are going to use statistical methods to visualize data, to collect data and to interpret data. Alright, so this area of mathematics deals with understanding how data can be used to solve complex problems.
Okay.
Now I'll give you a couple of examples
that can be solved by using statistics.
Okay, let's say
that your company has created a new drug
that may cure cancer.
How would you conduct a test to confirm the drug's effectiveness? Now, even though this sounds like a biology problem, it can be solved with statistics. All right, you will have to create a test which can confirm the effectiveness of the drug. All right, this is a common problem that can be solved using statistics.
Let me give you another example you
and a friend are at a baseball game and out of the blue.
He offers you a bet
that neither team will hit a home run in that game.
Should you take the bet? All right, here you'd discuss the probability of whether you'll win or lose. All right, this is another problem that comes under statistics.
Let's look at another example.
The latest sales data has just come in
and your boss wants you to prepare a report
for management on places
where the company could improve its business.
What should you look for, and what should you not look for? Now, this problem involves a lot of data analysis. You'll have to look at the different variables that are causing your business to go down, or you'll have to look at the few variables that are increasing performance and thus growing your business.
Alright, so this involves a lot of data analysis
and the basic idea
behind data analysis is to use statistical techniques
in order to figure out the relationship
between different variables
or different components in your business.
Okay.
So now let's move on and look at our next topic, which is the basic terminologies in statistics.
Now before you dive deep into statistics, it is important
that you understand the basic terminologies
used in statistics.
The two most important terminologies in statistics
are population and Sample.
So throughout the statistics course, or throughout any problem that you're trying to solve with statistics, you will come across these two words, which are population and sample. Now, a population is a collection or a set of individuals or objects or events whose properties are to be analyzed.
Okay.
So basically you can refer to population as a subject
that you're trying to analyze now a sample is just
like the word suggests.
It's a subset of the population.
So you have to make sure that you choose the sample
in such a way
that it represents the entire population.
All right.
It shouldn't focus on one part of the population; instead, it should represent the entire population. That's how your sample should be chosen. So a well-chosen sample will contain most of the information about a particular population parameter.
Now, you must be wondering how can one choose a sample
that best represents the entire population now
sampling is a statistical method
that deals with the selection of individual observations
within a population.
So sampling is performed
in order to infer statistical knowledge about a population.
All right, if you want to understand
the different statistics of a population
like the mean, the median, the mode, the standard deviation or the variance of a population, then you're going to perform sampling.
All right,
because it's not reasonable for you to study a large population
and find out the mean median and everything else.
So why is sampling performed, you might ask? What is the point of sampling? Can't we just study the entire population? Now guys,
think of a scenario
where in you're asked to perform a survey
about the eating habits of teenagers in the US.
So at present there are over 42 million teens in the US
and this number is growing
as we are speaking right now, correct.
Is it possible to survey each of these 42 million individuals
about their health?
Is it possible?
Well, it might be possible, but this would take forever to do. Now, obviously, it's not reasonable to go around knocking on each door and asking what your teenage son eats and all of that, right?
This is not very reasonable.
That's why sampling is used. It's a method wherein a sample of the population is studied in order to draw inferences about the entire population. So it's basically a shortcut to studying the entire population: instead of taking the entire population and finding out all the answers, you're just going to take a part of the population that represents the entire population, and you're going to perform all your statistical analysis, your inferential statistics, on that small sample. All right, and that sample basically represents the entire population.
All right, so I'm sure I've made clear to you all what a sample is and what a population is. Now,
There are two main types of sampling techniques
that are discussed today.
We have probability sampling and non-probability sampling. Now, in this video we'll only be focusing on probability sampling techniques, because non-probability sampling is not within the scope of this video. All right, we'll only discuss the probability part, because we're focusing on statistics and probability, correct?
Now again under probability sampling.
We have three different types.
We have random sampling, systematic sampling and stratified sampling. All right, and just to mention the different types of non-probability sampling, we have snowball, quota, judgment and convenience sampling.
All right now guys in this session.
I'll only be focusing on probability.
So let's move on
and look at the different types of probability sampling.
So what is Probability sampling.
It is a sampling technique
in which samples from a large population
are chosen by using the theory of probability.
All right, so there are three types
of probability sampling.
All right first we have the random sampling now
in this method each member
of the population has an equal chance
of being selected in the sample.
All right.
So each and every individual or each and every object
in the population has an equal chance of being a part of the sample.
That's what random sampling is all about.
Okay, you are randomly going to select any individual
or any object.
So this way, each individual has an equal chance of being selected.
Correct?
Next.
We have systematic sampling now
in systematic sampling every nth record is chosen
from the population to be a part of the sample.
All right.
Now refer to this image that I've shown over here: out of these six groups, every second group is chosen as a sample. Okay. So every second record is chosen here, and this is how systematic sampling works.
Okay, you're randomly selecting the nth record
and you're going to add that to your sample.
Next.
We have stratified sampling.
Now in this type
of technique a stratum is used to form samples
from a large population.
So what is a stratum? A stratum is basically a subset of the population that shares at least one common characteristic. So let's say that your population has a mix of both males and females; you can create two stratums out of this, where one will have only the male subset and the other will have the female subset. All right, this is what a stratum is: it is basically a subset of the population that shares at least one common characteristic.
All right, in our example it is gender. So after you've created the stratums, you're going to use random sampling on the stratums and you're going to choose a final sample. So random sampling, meaning that all of the individuals in each of the stratums will have an equal chance of being selected in the sample, correct?
So Guys, these were the three different types
of sampling techniques.
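As a rough illustration in Python, here is a sketch of the three techniques using a hypothetical customer table made up purely for demonstration (pandas and scikit-learn offer convenient helpers for this):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical population of 1,000 customers with a gender column (illustrative only)
population = pd.DataFrame({
    'customer_id': range(1000),
    'gender': ['male', 'female'] * 500,
})

# Random sampling: every member has an equal chance of being picked
random_sample = population.sample(n=100, random_state=0)

# Systematic sampling: pick every 10th record
systematic_sample = population.iloc[::10]

# Stratified sampling: sample within each stratum (gender) so proportions are preserved
stratified_sample, _ = train_test_split(
    population, train_size=100, stratify=population['gender'], random_state=0)
```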
Now, let's move on and look at our next topic
which is the different types of Statistics.
So after this,
we'll be looking at the more advanced concepts of statistics. Right, so far we discussed the basics of statistics, which are basically what statistics is, the different sampling techniques and the terminologies in statistics. All right. Now we'll look at the different types of statistics.
So there are two major types of Statistics
descriptive statistics
and inferential statistics in today's session.
We will be discussing both of these types
of Statistics in depth.
All right, we'll also be looking at a demo, which I'll be running in the R language, in order to make you understand what exactly descriptive and inferential statistics are. So guys, don't worry if you don't have much knowledge; I'm explaining everything from a basic level.
All right, so guys descriptive statistics is a method
which is used to describe and understand the features
of specific data set by giving a short summary of the data.
Okay, so it is mainly
focused upon the characteristics of data.
It also provides a graphical summary of the data. Now, in order to make you understand what descriptive statistics is, let's suppose that you want to gift all your classmates a t-shirt, so you need to study the average shirt size of a student in the classroom. If you were to use descriptive statistics to study the average shirt size of students in your classroom, then what you would do is record the shirt size of all the students in the class, and then you would find out the maximum, minimum and average shirt size of the class.
Okay.
So coming to inferential statistics: inferential statistics makes inferences and predictions about a population based on a sample of data taken from the population.
Okay.
So in simple words,
it generalizes a large data set and it applies probability
to draw a conclusion.
Okay.
So it allows you to infer data parameters
based on a statistical model by using sample data.
So if we consider the same example of finding the average shirt size of students in a class, in inferential statistics you will take a sample set of the class, which is basically a few people from the entire class. All right, you would have grouped the class into large, medium and small. All right, in this method you basically build a statistical model and extend it to the entire population in the class.
So guys, there was a brief understanding of descriptive
and inferential statistics.
So that's the difference between descriptive and inferential statistics. Now, in the next section, we will go in depth into descriptive statistics. All right, so let's discuss more about descriptive statistics.
So like I mentioned
earlier descriptive statistics is a method
that is used to describe and understand the features
of a specific data set by giving short summaries about the sample
and measures of the data.
There are two important measures in descriptive statistics.
We have measure of central tendency,
which is also known as measure
of center and we have measures of variability.
This is also known as measures of spread.
So measures of center include mean, median and mode. Now, what are measures of center? Measures of center are statistical measures that represent the summary of a data set. Okay, the three main measures of center are mean, median and mode. Coming to measures of variability, or measures of spread, we have range, interquartile range, variance and standard deviation. All right. So now let's discuss each of these measures in a little more depth, starting with the measures of center.
Now, I'm sure all of you know what the mean is: the mean is basically the measure of the average of all the values in a sample. Okay, so it's basically the average of all the values in a sample. How do you measure the mean? I hope all of you know how the mean is measured: if there are 10 numbers and you want to find the mean of these 10 numbers, all you have to do is add up all the 10 numbers and divide by n, where n represents the number of samples in your data set. All right, since we have 10 numbers, we're going to divide by 10. All right, this will give us the average, or the mean. So to better understand the measures
of central tendency.
Let's look at an example.
Now the data set over here is basically the cars data set
and it contains a few variables.
All right, it has something known as cars; it has mileage per gallon, cylinder type, displacement, horsepower and rear axle ratio. All right, all of these measures are related to cars.
Okay.
So what you're going to do is you're going
to use descriptive analysis
and you're going to analyze each of the variables
in the sample data set
for the mean standard deviation median mode and so on.
So let's say that you want to find out the mean, or the average, horsepower of the cars among the population of cars. Like I mentioned earlier, what you'll do is check the average of all the values. So in this case, we will take the sum of the horsepower of each car and we'll divide that by the total number of cars.
Okay, that's exactly
what I've done here in the calculation part.
So this hundred and ten basically
represents the horsepower for the first car.
Alright, similarly.
I've just added up all the values of horsepower
for each of the cars
and I've divided it by 8; now, 8 is basically the number of cars in our data set. All right, so 103.625 is what our mean, or average horsepower, is. All right.
Now, let's understand what median is with an example?
Okay.
So to define the median: the measure of the central value of the sample set is called the median.
All right, you can see that it is a middle value.
So if we want to find out the center value
of the mileage per gallon among the population
of cars first,
what we'll do is arrange the MPG values in ascending or descending order and choose the middle value. Right, in this case,
since we have eight values, right?
We have eight values which is an even entry.
So whenever you have even number of data points
or samples in your data set,
then you're going to take the average
of the two middle values.
If we had nine values over here.
We can easily figure out the middle value
and you know choose that as a median.
But since they're even number of values we're going
to take the average of the two middle values.
All right, so,
22.8 and 23 are my two middle values, and I'm taking the mean of those two, and hence I get 22.9, which is my median.
All right.
Lastly let's look at how mode is calculated.
So what is mode the value
that is most recurrent
in the sample set is known as mode or basically the value
that occurs most often.
Okay, that is known as mode.
So let's say
that we want to find out the most common type of cylinder among the population of cars, all we have to do is check the value which is repeated the most number of times. Here,
We can see that the cylinders come in two types.
We have cylinder of Type 4 and cylinder of type 6, right?
So take a look at the data set.
You can see that the most recurring value is 6 right.
Counting them up, we have one, two, three, four, five: five cars with 6-type cylinders, and we have one, two, three: three cars with 4-type cylinders. So basically we have three 4-type cylinders and five 6-type cylinders.
All right.
So our mode is going to be 6, since 6 is more recurrent than 4. So guys,
those were the measures
of the center or the measures of central tendency.
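Although the demo later in this module is in R, here is a quick Python sketch of these three measures, using a small made-up horsepower list just to illustrate the mechanics:

```python
import statistics

# Hypothetical horsepower values for a handful of cars (illustrative only)
horsepower = [110, 110, 93, 110, 175, 105, 245, 62]

print(statistics.mean(horsepower))    # average of all values
print(statistics.median(horsepower))  # middle value of the sorted list
print(statistics.mode(horsepower))    # most frequently occurring value
```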
Now, let's move on and look at the measures of the spread.
All right.
Now, what is a measure of spread? A measure of spread, sometimes also called a measure of dispersion, is used to describe the variability in a sample or population.
Okay, you can think of it as some sort
of deviation in the sample.
All right.
So you measure this
with the help of the different measure of spreads.
We have range interquartile range variance
and standard deviation.
Now range is pretty self-explanatory, right?
It is a measure of how spread apart the values in a data set are. The range can be calculated as shown in this formula: you're basically going to subtract the minimum value from the maximum value in your data set. That's how you calculate the range of the data.
Alright, next we have the interquartile range. So before we discuss the interquartile range, let's understand what a quartile is, right? So quartiles basically tell us about the spread of a data set by breaking the data set into different quarters. Okay, just like how the median breaks the data into two parts, the quartiles will break it into different quarters. So to better understand how the quartiles and the interquartile range are calculated, let's look at a small example.
Now, this data set basically represents the marks of a hundred students, ordered from the lowest to the highest scores, right? So the quartiles lie in the following ranges: the first quartile, which is also known as Q1, lies between the 25th and 26th observations. All right. So if you look at this, I've highlighted the 25th and the 26th observations. The way you can calculate Q1, or the first quartile, is by taking the average of these two values. Alright, since both the values are 45, when you add them up and divide them by two you'll still get 45. Now, the second quartile, or Q2, is between the 50th and the 51st observations, so you're going to take the average of 58 and 59 and you will get a value of 58.5; this is my second quartile. The third quartile, Q3, is between the 75th and the 76th observations. Here again we'll take the average of the two values, which are the 75th value and the 76th value, right, and you'll get a value of 71.
All right, so guys, this is exactly how you calculate the different quartiles. Now, let's look at what the interquartile range is. So the IQR, or interquartile range, is a measure of variability based on dividing a data set into quartiles. Now, the interquartile range is calculated by subtracting Q1 from Q3, so your IQR is Q3 minus Q1. All right. Now, this is how each of the quartiles is found, and each quartile represents a quarter, which is 25%. All right.
So guys, I hope all of you are clear on the interquartile range and what quartiles are.
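A small Python sketch of the same idea, with a made-up list of marks (note that numpy's percentile interpolation may differ slightly from the averaging-by-hand method described above):

```python
import numpy as np

# Hypothetical exam marks (illustrative only)
marks = np.array([35, 40, 45, 45, 50, 55, 58, 59, 62, 65, 70, 71, 75, 80, 90])

q1 = np.percentile(marks, 25)   # first quartile
q2 = np.percentile(marks, 50)   # second quartile (the median)
q3 = np.percentile(marks, 75)   # third quartile
iqr = q3 - q1                   # interquartile range

print(q1, q2, q3, iqr)
```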
Now, let's look at variance. Variance is basically a measure that shows how much a random variable differs from its expected value. Okay. It's basically the variability in any variable. Now, variance
can be calculated by using this formula. Right here, x basically represents any data point in your data set, n is the total number of data points in your data set, and x bar is basically the mean of the data points. All right, this is how you calculate variance. Variance basically computes the squares of the deviations; okay, that's why it says s squared there. Now, let's look at what deviation is. Deviation is just the difference between each element and the mean.
Okay, so it can be calculated by using this simple formula, where x i basically represents a data point and mu is the mean of the population. All right, this is exactly how you calculate the deviation. Now, population variance and sample variance are very specific to whether you're calculating the variance of your population data set or of your sample data set; that is the only difference between population and sample variance. So the formula for population variance is pretty self-explanatory: x is basically each data point, mu is the mean of the population, and n is the number of samples in your data set. All right.
Now, let's look at sample variance. Sample variance is the average of the squared differences from the mean. All right, here x i is any data point or any sample in your data set, and x bar is the mean of your sample. All right, it's not the mean of your population, it's the mean of your sample. And if you notice, the n here is a smaller n: it is the number of data points in your sample. And this is basically the difference between sample and population variance. I hope that is clear. Coming to standard deviation: standard deviation is the measure of the dispersion of a set of data from its mean.
All right, so it's basically the deviation from your mean; that's what standard deviation is. Now, to better understand how the measures of spread are calculated, let's look at a small use case. So let's say that Daenerys has 20 dragons; they have the numbers 9, 2, 5, 4 and so on, as shown on the screen, and what you have to do is work out the standard deviation. All right, in order to calculate the standard deviation, you need to know the mean, right?
So first you're going to find out the mean of your sample set. How do you calculate the mean? You add all the numbers and divide by the total number of samples in your data set, so you get a value of 7 here. Then you calculate the RHS of your standard deviation formula. All right, so from each data point you're going to subtract the mean and you're going to square that. All right. So when you do that, you will get the following result: you'll basically get 4, 25, 4, 9 and so on. So finally you will just find the mean of the squared differences, and your standard deviation will come out to 2.983 once you take the square root.
So guys, this is pretty simple.
It's a simple mathematic technique.
All you have to do is you have to substitute the values
in the formula.
All right.
I hope this was clear to all of you.
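The same calculation in Python: the full list of 20 dragon numbers is only partly visible in the transcript, so the list below is a stand-in whose first values match those read out and whose mean happens to be 7, purely to show the mechanics.

```python
import math

# Dragon numbers: the first few match those read out; the rest are assumed for illustration
values = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]

mean = sum(values) / len(values)                       # mean = 7
squared_diffs = [(x - mean) ** 2 for x in values]      # 4, 25, 4, 9, ...
variance = sum(squared_diffs) / len(values)            # population variance
std_dev = math.sqrt(variance)                          # about 2.983

print(mean, variance, std_dev)
```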
Now let's move on
and discuss the next topic which is Information Gain
and entropy now.
This is one of my favorite topics in statistics.
It's very interesting and this topic is mainly involved
in machine learning algorithms,
like decision trees and random forest.
All right, it's very important
for you to know
how Information Gain and entropy really work and why they are
so essential in building machine learning models.
We'll focus on the statistics part of Information Gain and entropy, and after that we'll discuss a use case and see how Information Gain and entropy are used in decision trees.
So for those of you
who don't know what a decision tree is it is
basically a machine learning algorithm.
You don't have to know anything about this.
I'll explain everything in depth.
So don't worry.
Now.
Let's look at what exactly entropy and Information Gain are. So entropy is basically the measure of any sort of uncertainty that is present in the data.
All right, so it can be measured by using this formula. So here, S is the set of all instances in the data set, or all the data items in the data set, N is the number of different classes in your data set, and p i is the event probability. Now, this might seem a little confusing to you all, but when we get to the use case you'll understand all of these terms even better.
All right, coming to Information Gain: as the word suggests, Information Gain indicates how much information a particular feature or a particular variable gives us about the final outcome. Okay, it can be measured by using this formula. So again, here H(S) is the entropy of the whole data set S, Sj is the number of instances with the j-th value of an attribute A, S is the total number of instances in the data set, V is the set of distinct values of an attribute A, H(Sj) is the entropy of the subset of instances, and H(A, S) is the entropy of attribute A. Even though this seems confusing,
I'll clear out the confusion.
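For reference, the formulas being read out here appear to be the standard definitions, which can be written as:

```latex
H(S) = \sum_{i=1}^{N} -p_i \log_2 p_i

IG(A, S) = H(S) - \sum_{v \in V} \frac{|S_v|}{|S|}\, H(S_v)
```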
All right, let's discuss a small problem statement
where we will understand
how Information Gain
and entropy is used to study the significance of a model.
So like I said Information Gain
and entropy are very important statistical measures
that let us understand
the significance of a predictive model.
Okay to get a more clear understanding.
Let's look at a use case.
All right now suppose we are given a problem statement.
All right, the statement is that you have to predict
whether a match can be played
or Not by studying the weather conditions.
So the predictor variables here are Outlook, Humidity and Wind; Day is also a predictor variable. The target variable is basically Play. All right, the target variable is the variable that you're trying to predict. Okay. Now, the value of the target variable will decide whether or not a game can be played. All right, so that's why Play has two values, no and yes: no,
meaning that the weather conditions are not good.
And therefore you cannot play the game.
Yes, meaning that the weather conditions are good and suitable
for you to play the game.
Alright, so that was a problem statement.
I hope the problem statement is clear to all of you now
to solve such a problem.
We make use of something known as decision trees.
So guys think of an inverted tree
and each branch of the tree denotes some decision.
All right, each branch is known as a branch node,
and at each branch node,
you're going to take a decision in such a manner
that you will get an outcome at the end of the branch.
All right.
Now, this figure here basically shows that out of 14 observations, 9 observations result in a yes, meaning that out of 14 days, the match can be played only on nine days. Alright, so here, if you see, on day 1, day 2, day 8, day 9 and day 11, the Outlook has been sunny. All right, so basically we try to split the data set depending on the Outlook: when the Outlook is sunny, this is our data set, and this is what we have when the Outlook is overcast and when the Outlook is rain.
All right, so
when it is sunny we have two yeses and three nos. Okay, when the Outlook is overcast, we have all four as yeses, meaning that on the four days when the Outlook was overcast, we can play the game. All right. Now, when it comes to rain, we have three yeses and two nos.
All right.
So if you notice here,
the decision is being made by choosing the Outlook variable
as the root node.
Okay.
So the root node is
basically the topmost node in a decision tree.
Now, what we've done here is we've created a decision tree
that starts with the Outlook node.
All right, then you're splitting the decision tree further
depending on other parameters like Sunny overcast and rain.
All right, now, as we know, Outlook has three values: sunny, overcast and rain. So let me explain this
in a more in-depth manner.
Okay.
So what you're doing here is you're making
the decision Tree by choosing the Outlook variable
at the root node.
The root node is basically the topmost node
in a decision tree.
Now the Outlook node has three branches coming out from it,
which are sunny, overcast and rain. So basically Outlook can have three values: either it can be sunny, it can be overcast, or it can be rainy. Okay, now these three values are assigned to the immediate branch nodes, and for each of these values the possibility of Play equals yes is calculated.
So the sunny
and the rain branches will give you an impure output.
Meaning that there is a mix of yes and no, right? There are two yeses and three nos here, and there are three yeses and two nos over here, but when it comes to the overcast value, it results in a hundred percent pure subset. All right, this shows that the overcast value will result in a definite and certain output. This is exactly what entropy is used to measure. All right, it calculates the impurity or the uncertainty. Alright, so the lesser the uncertainty, or the entropy, of a variable, the more significant that variable is. So when it comes to overcast, there's literally no impurity in the data set; it is a hundred percent pure subset, right? So we want variables like these in order to build a model.
All right, now, we don't always get lucky, and we don't always find variables that will result in pure subsets. That's why we have the measure entropy. So the lesser the entropy of a particular variable, the more significant that variable will be. So in a decision tree, the root node is assigned the best attribute, so that the decision tree can predict the most precise outcome, meaning that at the root node you should have the most significant variable. All right, that's why we've chosen Outlook. And now some of you might ask me, why haven't you chosen overcast? Okay, overcast is not a variable.
It is a value of the Outlook variable.
All right.
That's why we've chosen outlook here because it has
a hundred percent pure subset which is overcast.
All right.
Now, the question in your head is: how do I decide which variable or attribute best splits the data? Now, right now I looked at the data and I told you that, you know, here we have a hundred percent pure subset; but what if it's a more complex problem and you're not able to understand which variable will best split the data? So guys, when it comes to decision trees, Information Gain and entropy will help you understand which variable will best split the data set, or which variable you have to assign to the root node, because whichever variable is assigned to the root node will best split the data set, and it has to be the most significant variable. All right. So the way we can do this is by using Information Gain and entropy.
So from the total of the 14 instances that we saw, nine of them said yes and 5 of the instances said no, meaning that you cannot play on those particular days. All right. So how do you calculate the entropy? This is the formula; you just substitute the values into it. So when you substitute the values in the formula, you will get a value of 0.940. All right. This is the entropy, or the uncertainty, of the data present in the sample.
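As a quick check of that number, here is a small Python sketch of the entropy calculation:

```python
import math

def entropy(counts):
    """Entropy of a class distribution given the counts per class."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# 9 'yes' days and 5 'no' days out of 14
print(entropy([9, 5]))  # about 0.940
```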
Now, in order to ensure that we choose the best variable for the root node, let us look at all the possible combinations that you can use at the root node. Okay, so these are all the possible combinations: you can either have Outlook, or you can have Windy, Humidity or Temperature. Okay, these are the four variables, and you can have any one of these variables as your root node.
of these variables as your root node.
But how do you select
which variable best fits the root node?
That's what we are going to see by using
Information Gain and entropy.
So guys now the task at hand is to find the information gain
for each of these attributes.
All right.
So for Outlook, for Windy, for Humidity and for Temperature, we're going to find out the Information Gain. Right, now a point to remember is that the variable that results in the highest Information Gain must be chosen, because it will give us the most precise output information. All right. We'll calculate the Information Gain for the attribute Windy first. Here we have six instances of true and eight instances of false. Okay. So when you substitute all the values in the formula, you will get a value of 0.048. So we get a value of 0.048.
Now.
This is a very low value for Information Gain.
All right, so the information
that you're going to get from Windy attribute is pretty low.
So let's calculate the information gain
of attribute Outlook.
All right, so from the total of 14 instances, we have five instances which say Sunny, four instances which are Overcast and five instances which are Rainy. All right, for Sunny we have three yeses and two nos, for Overcast we have all the four as yes, and for Rainy we have three yeses and two nos.
Okay.
So when you calculate the information gain of the Outlook variable, you will get a value of 0.247. Now compare this to the information gain of the Windy attribute; this value is actually pretty good. Right, we have 0.247, which is a pretty good value
for Information Gain.
Now, let's look at the information gain
of attribute humidity now over here.
We have seven instances with say hi and seven instances
with say normal.
Right, and under the high branch node we have three instances which say yes, and the rest, four instances, say no. Similarly, under the normal branch we have six instances which say yes and one instance which says no.
All right.
So when you calculate the information gain
for the humidity variable,
you're going to get a value of 0.151.
Now.
This is also a pretty decent value,
but when you compare it to the Information Gain of the attribute Outlook, it is less. Right, now
Let's look at the information gain of attribute temperature.
All right, so the temperature attribute can hold three values. So basically the temperature attribute can hold hot, mild and cool. Okay, under hot we have two instances which say yes and two instances which say no; under mild we have four instances of yes and two instances of no; and under cool we have three instances of yes
and one instance of no.
All right.
When you calculate
the information gain for this attribute,
you will get a value of 0.029,
which is again very less.
So what you can summarize from here is, if we look at the information gain for each of these variables, we'll see that for Outlook we have the maximum gain. All right, we have 0.247,
which is the highest Information Gain value
and you must always choose a variable with the highest
Information Gain to split the data at the root node.
So that's why we assign The Outlook variable
at the root node.
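To make these numbers concrete, here is a minimal Python sketch, not the course's own demo, that reproduces the entropy and information gain values quoted above. The helper names entropy and information_gain are just illustrative, and the Windy split counts (3 yes / 3 no for true, 6 yes / 2 no for false) are assumed from the classic play-tennis data.

```python
# Minimal sketch: entropy and information gain for the play-tennis style counts.
import math

def entropy(counts):
    """Entropy of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, splits):
    """Information gain = parent entropy minus the weighted entropy of the splits."""
    total = sum(parent_counts)
    weighted = sum(sum(s) / total * entropy(s) for s in splits)
    return entropy(parent_counts) - weighted

# 9 yes / 5 no in the full data set
print(round(entropy([9, 5]), 3))                                      # ~0.940

# Outlook splits the data into Sunny, Overcast and Rainy subsets
print(round(information_gain([9, 5], [[3, 2], [4, 0], [3, 2]]), 3))   # ~0.247

# Windy splits the data into true and false subsets (assumed counts)
print(round(information_gain([9, 5], [[3, 3], [6, 2]]), 3))           # ~0.048
```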
All right, so guys.
I hope this use case was clear. If any of you have doubts,
Please keep commenting those doubts now,
let's move on and look at what exactly a confusion matrix is. The confusion matrix is the last topic for descriptive statistics. Right, after this
I'll be running a short demo where I'll be showing you
how you can calculate mean, median, mode, standard deviation, variance and all of those values by using R. Okay.
So let's talk about confusion Matrix now guys.
What is the confusion Matrix now don't get confused.
This is not any complex topic now confusion.
Matrix is a matrix
that is often used to describe the performance of a model.
All right, and this is specifically used
for classification models
or a classifier
and what it does is it will calculate the accuracy
or it will calculate the performance of your classifier
by comparing your actual results and Your predicted results.
All right.
So this is what it looks like: true positive, true negative and all of that. Now this is a little confusing. I'll get back to what exactly true positive, true negative and all of this stands for. For now,
Let's look at an example and let's try and understand what
exactly confusion Matrix is.
So guys, I have made sure
that I put examples after each and every topic
because it's important you
understand the Practical part of Statistics.
All right statistics has literally nothing to do
with Theory you need to understand how Calculations
are done in statistics.
Okay.
So here what I've done is now let's look at a small use case.
Okay, let's consider that you're given data about 165 patients, out of which 105 patients have a disease and the remaining 60 patients don't have a disease.
Okay.
So what you're going to do is you will build a classifier
that predicts by using
these hundred and sixty five observations.
You'll feed all
of these 165 observations to your classifier
and it will predict the output every time
a new patients detail is fed to the classifier right now
out of these 165 cases.
Let's say that the classifier predicted yes 110 times and no 55 times.
Alright, so yes basically stands for yes.
The person has a disease and no stands for know.
The person does not have a disease.
All right, that's pretty self-explanatory.
But yeah, so it predicted 110 times that the patient has a disease and 55 times
that know the patient doesn't have a disease.
However in reality only hundred and five patients
in the sample have the disease and 60 patients
who do not have the disease, right?
So how do you calculate the accuracy of your model?
You basically build the confusion Matrix?
All right.
This is how the matrix looks. n basically denotes the total number of observations that you have, which is 165 in our case; actual denotes the actual values in the data set and predicted denotes the values predicted by the classifier.
So the actual value is no here
and the predicted value is no here.
So your classifier was correctly able
to classify 50 cases as no.
All right, since both of these are no, it was correctly able to classify those 50, but 10 of these cases it incorrectly classified, meaning that your actual value here is no but your classifier predicted it as yes. All right, that's why this 10 is over here. Similarly,
it wrongly predicted
that five patients do not have diseases
whereas they actually did have diseases
and it correctly predicted hundred patients,
which have the disease.
All right.
I know this is a little bit confusing.
But if you look at these values: No/No 50, meaning that it correctly predicted 50 values; No/Yes means that it wrongly predicted yes for values where it was supposed to predict no.
All right.
Now what exactly is this true positive, true negative and all of that?
I'll tell you what exactly it is.
So true positives are the cases in which we predicted a yes and they actually do have the disease; in our matrix that is the 100. True negatives are the cases where we predicted no and they really don't have the disease, which is the 50, meaning that this is correct. A false positive is where we predicted yes but they do not actually have the disease; that's the 10 over here, and this is also known as a type 1 error. A false negative is where we predicted no but they actually do have the disease; that's the 5, also known as a type 2 error. So guys, basically true positives and true negatives are the correct classifications.
All right.
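Here is a small illustrative sketch of this patient example, assuming scikit-learn is available; the label vectors are made up so that they reproduce the counts quoted above (TP = 100, TN = 50, FP = 10, FN = 5).

```python
# Build the confusion matrix and accuracy for the 165-patient example.
from sklearn.metrics import confusion_matrix, accuracy_score

# 1 = has the disease, 0 = does not
actual    = [1] * 105 + [0] * 60                          # 105 sick, 60 healthy
predicted = [1] * 100 + [0] * 5 + [1] * 10 + [0] * 50     # TP=100, FN=5, FP=10, TN=50

tn, fp, fn, tp = confusion_matrix(actual, predicted).ravel()
print(tn, fp, fn, tp)                        # 50 10 5 100
print(round(accuracy_score(actual, predicted), 3))   # (100 + 50) / 165 ≈ 0.909
```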
So this was confusion Matrix
and I hope this concept is clear again guys.
If you have doubts,
please comment your doubt in the comment section.
So guys, that was the entire descriptive statistics module, and now we will discuss probability.
Okay.
So before we understand what exactly probability is,
let me clear out a very common misconception people
often tend to ask me this question.
What is the relationship between statistics and probability?
So probability and statistics are related fields.
All right.
So probability is a mathematical method used
for statistical analysis.
Therefore we can say that probability and statistics are interconnected branches of mathematics that deal with analyzing the relative frequency of events. So they're very interconnected fields; probability makes use of statistics and statistics makes use of probability. All right, they're very interconnected fields.
So that is the relationship between statistics
and probability.
Now, let's understand what exactly is probability.
So probability is the measure of how likely an event is to occur. To be more precise, it is the ratio of desired outcomes to the total outcomes.
Now, the probabilities of all outcomes always sum up to 1; the probability will always sum up to 1 and a probability cannot go beyond one. Okay. So your probability can be 0, or it can be 1, or it can be in the form of decimals like 0.52 or 0.55, or it can be 0.5, 0.7, 0.9. But its value will always stay between the range 0 and 1.
Okay, a famous example of probability is the rolling a dice example. So when you roll a dice you get six possible outcomes, right? You get the one, two, three, four, five, six faces of a dice. Now each possibility only has one outcome.
So what is the probability that on rolling a dice you will get 3? The probability is 1 by 6, right, because there's only one face which has the number 3 on it out of six faces. There's only one face which has the number three.
So the probability of getting 3
when you roll a dice is 1 by 6 similarly,
if you want to find the probability of getting
a number 5 again,
the probability is going to be 1 by 6.
All right, so all of this will sum up to 1.
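Just to check that arithmetic, here is a tiny Python sketch (not part of the course material) that computes the dice probabilities and verifies they sum to 1.

```python
# Verify the dice probabilities: P(3) = 1/6 and all outcomes sum to 1.
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]
p_three = Fraction(sum(1 for o in outcomes if o == 3), len(outcomes))
print(p_three)                                             # 1/6
print(sum(Fraction(1, len(outcomes)) for _ in outcomes))   # 1
```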
All right, so guys this is exactly what probability is.
It's a very simple concept; we all learnt it from 8th standard onwards. Right, now
Let's understand the different terminologies
that are related to probability.
Now the three terminologies that you often come across
when We talk about probability.
We have something known as the random experiment.
Okay, it's basically an experiment or a process
for which the outcomes cannot be predicted with certainty.
All right.
That's why you use probability.
You're going to use probability in order to predict the outcome
with some sort
of certainty sample space is the entire possible set
of outcomes of a random experiment an event is
one or more outcomes of an experiment.
So if you consider the example of rolling a dice.
Now.
Let's say that you want to find out the probability of getting a two when you roll the dice.
Okay.
So finding this probability is the random experiment
the sample space is basically your entire possibility.
Okay.
So one two, three, four,
five six phases are there and out of that you need
to find the probability of getting a 2, right.
So all the possible outcomes
will basically represent your sample space.
Okay.
So 1 to 6 are all your possible outcomes; this represents your sample space. An event is one or more outcomes of an experiment. So in this case my event is to get a two when I roll a dice, right? So my event is the probability of getting a two when I roll a dice.
So guys, this is basically what random experiment sample space
and event really means alright.
Now, let's discuss the different types of events.
There are two types of events that you should know about: there are disjoint and non-disjoint events. Disjoint events are events that do not have any common outcome.
For example,
if you draw a single card from a deck of cards,
it cannot be a king and a queen correct.
It can either be king or it can be Queen.
Now a non disjoint events are events
that have common outcomes.
For example, a student can get hundred marks
in statistics and hundred marks in probability.
All right, and also the outcome of a ball delivered can be a no ball and it can be a six, right? So this is what non-disjoint events are.
These are very simple to understand right now.
Let's move on and look at the different types
of probability distribution.
All right, I'll be discussing
the three main probability distribution functions.
I'll be talking about probability density
function normal distribution and Central limit theorem.
Okay probability density function also known
as PDF is concerned
with the relative likelihood for a continuous random variable.
To take on a given value.
All right.
So the PDF gives the probability
of a variable that lies between the range A and B.
So basically what you're trying to do is you're going to try
and find the probability of a continuous random variable
over a specified range.
Okay.
Now this graph denotes the PDF of a continuous variable.
Now, this graph is also known as the bell curve right?
It's famously called the bell curve because of
its shape, and there are three important properties that you need to know about a probability density function.
Now the graph of a PDF will be continuous over a range.
This is because you're finding the probability
that a continuous variable lies between the ranges A and B,
right the second property is
that the area bounded by the curve of a density function
and the x-axis is equal to 1 basically the area
below the curve is equal to 1 all right,
because it denotes probability; again, the probability cannot range more than one, it has to be between 0 and 1. Property number three is that the probability
that our random variable assumes a value between A
and B is equal to the area
under the PDF bounded by A and B. Okay.
Now what this means is
that the probability value is denoted by the area
of the graph.
All right, so whatever value that you get here, which is basically this area, is the probability that a random variable will lie between the range A and B.
All right, so I hope all of you have understood the probability density function;
it's basically the probability of finding the value
of a continuous random variable between the range A and B.
All right.
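As a hedged sketch of property three, here is a short Python example (assuming SciPy is available, and using the standard normal with a = -1 and b = 1 purely as an illustration): the probability that the variable lies between a and b equals the area under the PDF, which matches the difference of the CDF values.

```python
# Area under a normal PDF between a and b equals P(a <= X <= b).
from scipy.stats import norm
from scipy.integrate import quad

a, b = -1.0, 1.0
area, _ = quad(norm.pdf, a, b)                # integrate the PDF from a to b
print(round(area, 4))                         # ~0.6827
print(round(norm.cdf(b) - norm.cdf(a), 4))    # same value via the CDF
```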
Now, let's look at our next distribution,
which is normal distribution now normal distribution,
which is also known as
the gaussian distribution is a probability distribution
that denotes the symmetric property
of the mean right meaning
that the idea behind this function is
that The data near the mean
occurs more frequently than the data away from the mean.
So what it means to say is
that the data around the mean represents the entire data set.
Okay.
So if you just take a sample of data
around the mean it can represent the entire data set now similar
to the probability density function the normal distribution
appears as a bell curve.
All right.
Now when it comes to normal distribution,
there are two important factors.
All right, we have the mean of the population.
And the standard deviation.
Okay, so the mean determines the location of the center of the graph, right, and the standard deviation determines the height and spread of the graph.
Okay.
So if the standard deviation is large the curve is going
to look something like this.
All right, it'll be short and wide and
if the standard deviation is small the curve
is tall and narrow.
All right.
So this was it about normal distribution.
Now, let's look at the central limit theorem.
Now the central limit theorem states
that the sampling distribution
of the mean of any independent random variable will be normal
or nearly normal
if the sample size is large enough now,
that's a little confusing.
Okay.
Let me break it down for you now in simple terms
if we had a large population
and we divided it into many samples.
Then the mean of all the samples
from the population will be almost equal
to the mean of the entire population right meaning
that each of the sample is normally distributed.
Right.
So if you compare the mean of each of the sample,
it will almost be equal to the mean of the population.
Right?
So this graph basically gives a clearer understanding of the central limit theorem. Here you can see each sample, and the mean
of each sample is almost along the same line, right?
Okay.
So this is exactly
what the central limit theorem States now the accuracy
or the resemblance
to the normal distribution depends on two main factors.
Right.
So the first is the number of sample points
that you consider.
All right,
and the second is a shape of the underlying population.
Now the shape obviously depends on the standard deviation
and the mean of a sample, correct.
So guys the central limit theorem basically states
that each sample will be normally distributed
in such a way
that the mean of each sample will coincide with the mean
of the actual population.
All right in short terms.
That's what central limit theorem States.
Alright, and this holds true only for a large data set. For a small data set there are more deviations when compared to a large data set, and this is because of the scaling factor, right? The smallest deviation in a small data set will change the value very drastically, but in a large data set a small deviation will not matter at all.
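Here is a small simulation sketch of the central limit theorem described above. The population is deliberately non-normal (an exponential distribution) and the sample size of 50 is only an assumed example; the mean of the sample means still lands close to the population mean.

```python
# Central limit theorem: sample means cluster around the population mean.
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=2.0, size=100_000)    # skewed, non-normal population

sample_means = [rng.choice(population, size=50).mean() for _ in range(2_000)]

print(round(population.mean(), 3))        # population mean, ~2.0
print(round(np.mean(sample_means), 3))    # mean of the sample means, close to it
# A histogram of sample_means would look approximately bell shaped.
```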
Now, let's move on and look at our next topic
which is the different types of probability.
Now, this is an important topic
because most of your problems can be solved by understanding
which type of probability should I use to solve this problem, right?
So we have three important types of probability.
We have marginal joint and conditional probability.
So let's discuss each
of these now the probability of an event occurring unconditioned
on any other event is known as marginal probability
or unconditional probability.
So let's say that you want to find the probability
that a card drawn is a heart.
All right.
So if you want to find the probability
that a card drawn is a heart, the probability will be 13 by 52, since there are 52 cards in a deck and there are 13 hearts in a deck of cards. Right, and there are 52 cards in a total deck. So your marginal probability will be 13 by 52.
That's about marginal probability.
Now, let's understand.
What is joint probability.
Now joint probability
is a measure of two events happening at the same time.
Okay.
Let's say that the two events are A and B.
So the probability of event A and B occurring is the intersection of A and B.
So for example,
if you want to find the probability
that a card is a four and a red that would be joint probability.
All right, because you're finding a card
that is 4 and the card has to be red in color.
So for the answer,
this will be 2 by 52, because we have one 4 in hearts and we have one 4 in diamonds, correct? So both of these are red in color; therefore our probability is 2 by 52, and if you reduce it further down it is 1 by 26, right?
So this is what joint probability is all
about moving on.
Let's look at what exactly conditional probability is.
So if the probability
of an event or an outcome is based on the occurrence
of a previous event or an outcome,
then you call it as a conditional probability.
Okay.
So the conditional probability of an event B is the probability
that the event will occur given
that an event a has already occurred, right?
So if A and B are dependent events, then the expression for conditional probability is given by this: P(B|A) = P(A and B) / P(A). Now this first term on the left-hand side, which is P(B|A), is basically the probability of event B occurring given that event A has already occurred. All right. So like I said, if A and B are dependent events, then this is the expression, but if A and B are independent events, then the expression for conditional probability is simply P(B|A) = P(B), right? So guys, P(A) and P(B) are obviously the probability of A and the probability of B.
Let's move on now in order
to understand conditional probability joint probability
and marginal probability.
Let's look at a small use case.
Okay, now basically we're going to take a data set which examines the salary package and training undergone by candidates.
Okay.
Now in this there are 60 candidates without training and 45 candidates who have enrolled for Edureka's training.
Right.
Now the task here is you have to assess the training
with a salary package.
Okay, let's look at this in a little more depth.
So in total,
we have 105 candidates, out of which 60 of them have not enrolled for Edureka's training and 45 of them have enrolled for Edureka's training. All right, this is a small survey that was conducted and this is the rating of the package or the salary that they got, right?
So if you read through the data,
you can understand there were five candidates without Edureka training who got a very poor salary package. Okay. Similarly, there are 30 candidates with Edureka training
who got a good package, right?
So guys basically you're comparing the salary package
of a person depending on
whether or not they've enrolled for Edureka training, right?
This is our data set.
Now, let's look at our problem statement: find the probability that a candidate has undergone Edureka's training. Quite simple. Which type of probability is this? This is marginal probability.
Right?
So the probability that a candidate has undergone Edureka's training is obviously 45 divided by 105, since 45 is the number of candidates with Edureka training and 105 is the total number of candidates. So you get a value of approximately 0.42. All right, that's the probability of a candidate that has undergone Edureka's training. Next question:
find the probability that a candidate has attended Edureka's training and also has a good package.
Now.
This is obviously a joint probability problem, right?
So how do you calculate this now?
Since our table is quite well formatted, we can directly find that the people who have gotten a good package along with Edureka training are 30, right? So out of 105 people, 30 people have Edureka training and a good package, right? They're specifically asking for people with Edureka training. Remember that, right? The question is: find the probability that a candidate has attended Edureka's training and also has a good package.
All right, so we need to consider two factors, that is, a candidate who has attended Edureka's training and who has a good package. So clearly that number is 30; 30 divided by the total number of candidates, which is 105, right?
So here you get the answer clearly next.
We have find the probability
that a candidate has a good package given
that he has not undergone training.
Okay.
Now this is clearly conditional probability, because here you're defining a condition. You're saying that you want to find the probability of a candidate who has a good package given that he has not undergone any training, right?
All right.
So the number of people who have not undergone training are 60, and out of that five of them have got a good package. So that's why this is 5 by 60 and not 5 by 105, because here they have clearly mentioned "has a good package given that he has not undergone training". So you have to only consider people who have not undergone training, right? So only five people who have not undergone training have gotten a good package, right? So 5 divided by 60, you get a probability of around 0.08,
which is pretty low, right?
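Here is a minimal sketch of the three probabilities from this use case, using only the counts quoted in the example (105 total, 45 trained, 30 trained with a good package, 60 untrained, 5 untrained with a good package); the variable names are just for illustration.

```python
# Marginal, joint and conditional probability from the training/salary table.
total = 105
trained = 45
trained_and_good = 30
untrained = 60
untrained_and_good = 5

p_trained = trained / total                              # marginal, ~0.4286
p_trained_and_good = trained_and_good / total            # joint, ~0.2857
p_good_given_untrained = untrained_and_good / untrained  # conditional, ~0.0833

print(round(p_trained, 2), round(p_trained_and_good, 2), round(p_good_given_untrained, 2))
```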
Okay.
So this was all
about the different types of probability now,
let's move on and look at our last topic in probability, which is Bayes theorem. Now guys, Bayes theorem is a very important concept when it comes to statistics and probability. It is majorly used in the Naive Bayes algorithm. For those of you who aren't aware, Naive Bayes is a supervised learning classification algorithm and it is mainly used in Gmail spam filtering, right? A lot of you might have noticed that if you open up Gmail, you'll see that you have a folder called spam, right? All right, that is carried out through machine learning and the algorithm used there is Naive Bayes, right?
So now let's discuss what exactly the Bayes theorem is and what it denotes. The Bayes theorem is used to show the relation between one conditional probability and its inverse. All right. Basically, it's nothing but the probability of an event occurring based on prior knowledge of conditions that might be related to the same event. Okay. So mathematically the Bayes theorem is represented like this, right?
As shown in this equation, the term on the left-hand side is what is known as the posterior; it is referred to as the posterior, which means the probability of occurrence of A given an event B, right? The second term is referred to as the likelihood ratio, and this measures the probability of occurrence of B given an event A. Now P of A is also known as the prior, which refers to the actual probability distribution of A, and P of B is again the probability of B, right?
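As a hedged, generic sketch of that relation, P(A|B) = P(B|A) * P(A) / P(B), here is a tiny Python example; the numbers (a 90% likelihood, a 1% prior and a 10% positive rate) are made up purely for illustration and are not from the course.

```python
# Bayes theorem: posterior from likelihood, prior and evidence.
def bayes(p_b_given_a, p_a, p_b):
    """Return the posterior P(A|B)."""
    return p_b_given_a * p_a / p_b

posterior = bayes(p_b_given_a=0.9, p_a=0.01, p_b=0.10)
print(round(posterior, 3))   # 0.09
```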
That is the Bayes theorem, and in order to better understand the Bayes theorem, let's look at a small example.
Let's say that we have three bowls: we have bowl A, bowl B and bowl C. Okay, bowl A contains two blue balls and four red balls, bowl B contains eight blue balls and four red balls, and bowl C contains one blue ball and three red balls. Now if we draw one ball from each bowl, what is the probability of drawing a blue ball from bowl A, if we know that we drew exactly a total of two blue balls, right?
If you didn't understand the question,
please read it I shall pause for a second or two.
Right.
So I hope all of you have understood the question.
Okay.
Now what I'm going to do is I'm going to draw
a blueprint for you
and tell you how exactly to solve the problem.
But I want you all to give me the solution
to this problem, right?
I'll draw a blueprint.
I'll tell you what exactly the steps are
but I want you to come up with a solution
on your own right the formula is also given to you.
Everything is given to you.
All you have to do is come up with the final answer.
Right?
Let's look at how you can solve this problem.
So first of all,
what we will do is, let's consider A. All right, let A be the event of picking a blue ball from bag A, and let X be the event of picking exactly two blue balls,
right because these are the two events
that we need to calculate the probability of now
there are two probabilities that you need to consider here.
One is the event of picking a blue ball from bag a
and the other is the event of picking exactly two blue balls.
Okay.
So these two are represented by a and X respectively
and so what we want is the probability of occurrence
of event a given X,
which means that given
that we're picking exactly two blue balls.
What is the probability
that we are picking a blue ball from bag?
So by the definition of conditional probability,
this is exactly what our equation will look like.
Correct.
This is basically the occurrence of event A given event X, and this is the probability of A and X, and this is the probability of X alone, correct?
What we need to do is we need to find these two probabilities
which is probability of a and X occurring together
and probability of X. Okay.
This is the entire solution.
So how do you find the probability of X? This you can do in three ways: either blue from A and blue from B, or blue from A and blue from C, or blue from B and blue from C. Now, first is to find the probability of X; X basically represents the event of picking exactly two blue balls.
Right.
So these are the three ways in which it is possible.
So you'll pick one blue ball from bowl A and one from bowl B; in the second case, you can pick one from A and another blue ball from C; in the third case, you can pick a blue ball from bag B and a blue ball from bag C.
Right?
These are the three ways in which it is possible.
So you need to find the probability of each of these. Step two is that you need to find the probability of A and X occurring together. This is the sum of terms one and two. Okay, this is because in both of these events you're picking a blue ball from bag A, correct?
So go ahead, find out this probability and let me know your answer in the comment section.
All right.
We'll see if you get the answer right?
I gave you the entire solution to this.
All you have to do is substitute the value right?
If you want a second or two,
I'm going to pause on the screen so that you can go through this
in a more clearer way right?
Remember that you need to calculate two probabilities. The first probability that you need to calculate is the event of picking a blue ball from bag A given that you're picking exactly two blue balls. Okay, the second probability you need to calculate is the event of picking exactly two blue balls.
All right.
These are the two probabilities.
You need to calculate so remember that and this
is the solution.
All right, so guys,
make sure you mention your answers
in the comment section for now.
Let's move on and get to our next topic, which is inferential statistics.
So guys, we just completed the probability module right now.
We will discuss inferential statistics,
which is the second type of Statistics.
We discussed descriptive statistics earlier.
All right.
So like I mentioned earlier inferential statistics also
known as statistical inference is a branch of Statistics
that deals with forming inferences and predictions
about a population based on a sample of data.
Taken from the population.
All right, and the question you should ask is
how does one form inferences or predictions on a sample?
The answer is you use Point estimation?
Okay.
Now you must be wondering
what is point estimation. Point estimation is concerned
which serves as an approximate value
or the best estimate of an unknown population parameter.
That's a little confusing.
Let me break it down for you. For example, in order to calculate the mean of a huge population,
and then we find the sample mean
right the sample mean is then used to estimate
the population mean this is basically Point estimate,
you're estimating the value of one of the parameters
of the population, right?
Basically the main
you're trying to estimate the value of the mean.
This is what point estimation is the two main terms
in point estimation.
There's something known as the estimator and something known as the estimate. The estimator is a function of the sample that is used to find out the estimate.
Alright in this example.
It's basically the sample mean right so a function
that calculates the sample mean is known as the estimator
and the realized value
of the estimator is the estimate right?
So I hope Point estimation is clear.
Now, how do you find the estimates?
There are four common ways in which you can do this.
The first one is the method of moments. Here, what you do is you form an equation
in the sample data set
and then you analyze the similar equation
in the population data set
as well like the population mean population variance and so on.
So in simple terms,
what you're doing is you're taking down some known facts
about the population
and you're extending those ideas to the sample.
Alright, once you do that,
you can analyze the sample and estimate more
essential or more complex values right next.
We have maximum likelihood.
This method basically uses a model to estimate a value.
All right.
Now a maximum likelihood is majorly based on probability.
So there's a lot of probability involved in this method next.
We have the Bayes estimator; this works by minimizing
the errors or the average risk.
Okay, the base estimator has a lot to do
with the Bayes theorem.
All right, let's not get into the depth
of these estimation methods.
Finally.
We have the best unbiased estimators. In this method, there are several unbiased estimators that can be used to approximate a parameter.
Okay.
So Guys these were a couple of methods
that are used to find the estimate
but the most well-known method to find the estimate is known as
the interval estimation.
Okay.
This is one
of the most important estimation methods right?
This is where confidence interval also comes
into the picture right apart from interval estimation.
We also have something known as margin of error.
So I'll be discussing all of this.
In the upcoming slides.
So first let's understand.
What is interval estimate?
Okay, an interval or range of values,
which are used to estimate a population parameter is known as
an interval estimation, right?
That's very understandable.
Basically what this is trying to say is you're going to estimate
the value of a parameter.
Let's say you're trying to find the mean of a population.
What you're going to do is you're going to build a range
and your value will lie in that range or in that interval.
Alright, so this way your output is going to be more accurate
because you've not predicted a point estimation instead.
You have estimated an interval
within which your value might occur, right?
Okay.
Now this image clearly shows
how Point estimate and interval estimate or different
so guys interval estimate is obviously more accurate
because you are not just focusing on a particular value
or a particular point
in order to predict the probability instead.
You're saying that the value might be
within this range between the lower confidence limit
and the upper confidence limit.
All right, this is denotes the range or the interval.
Okay, if you're still confused about interval estimation,
let me give you a small example
if I stated that I will take 30 minutes to reach the theater.
This is known as Point estimation.
Okay, but if I stated
that I will take between 45 minutes
to an hour to reach the theater.
This is an example of interval estimation.
All right.
I hope it's clear.
Now now interval estimation gives rise to two important
statistical terminologies one is known as confidence interval
and the other is known as margin of error.
All right.
So there's it's important
that you pay attention
to both of these terminologies confidence interval is one
of the most significant measures
that are used to check how accurate a machine learning model is.
All right.
So what is confidence interval confidence interval is
the measure of your confidence
that the interval estimated contains
the population parameter or the population mean
or any of those parameters right now statisticians
use confidence interval to describe the amount
of uncertainty associated
with the sample estimate of a population parameter now guys,
this is a lot of definition.
Let me just make you understand confidence interval
with a small example.
Okay.
Let's say that you perform
a survey and you survey a group of cat owners to see how many cans of cat food they purchase in one year.
Okay, you test
your statistics at the 99 percent confidence level
and you get a confidence interval of (100, 200). This means that you think that the cat owners buy between 100 to 200 cans in a year, and also, since the confidence level is 99%, it shows that you're very confident that the results are correct.
Okay.
I hope all of you are clear with that.
Alright, so your confidence interval here will be 100 to 200 and your confidence level will be 99%. Right?
That's the difference between confidence interval
and confidence level So within your confidence interval
your value is going to lie and your confidence level will show
how confident you are about your estimation, right?
I hope that was clear.
Let's look at margin of error.
Now, margin of error for a given level of confidence is the greatest possible distance between the point estimate and the value of the parameter that it is estimating. You can say that it is a deviation from the actual point estimate, right.
Now.
The margin of error can be calculated using this formula: E = z_c × s / √n. Now z_c here denotes the critical value for the chosen confidence level, and this is multiplied by the standard deviation divided by the root of the sample size. All right, n is basically the sample size. Now,
let's understand how you can estimate
the confidence intervals.
So guys the level of confidence
which is denoted by C is the probability
that the interval estimate contains a population parameter.
Let's say that you're trying to estimate the mean.
All right.
So the level of confidence is the probability
that the interval estimate contains
the population parameter.
So this interval between minus Z and z
or the area beneath this curve is nothing but the probability
that the interval estimate contains a population parameter.
All right.
It should basically contain the value
that you are predicting right.
Now.
These are known as critical values.
This is basically your lower limit
and your higher limit confidence level.
Also, there's something known as the Z score. Now, this score can be calculated by using the standard normal table. All right, if you look it up anywhere on Google you'll find the z-score table or the standard normal table. To understand how this is done,
Let's look at a small example.
Okay, let's say that the level of confidence is 90%. This means that you are 90% confident that the interval contains the population mean.
Okay, so the remaining 10% which is out of hundred percent.
The remaining 10% is equally distributed
on these tail regions.
Okay, so you have 0.05 here and 0.05 over here, right?
So on either side of C you will distribute the leftover percentage. Now these Z scores are calculated from the table as I mentioned before. All right, 1.645 is calculated from the standard normal table.
Okay, so guys how you estimate the level of confidence?
So to sum it up.
Let me tell you the steps that are involved in constructing
a confidence interval first.
You would start by identifying a sample statistic.
Okay.
This is the statistic
that you will use to estimate a population parameter.
This can be anything like the mean
of the sample next you will select a confidence level
now the confidence level describes the uncertainty
of a Sampling method right
after that you'll find something known as the margin
of error right?
We discussed margin of error earlier.
So you find this based on the equation
that I explained in the previous slide,
then you'll finally specify the confidence interval.
All right.
Now, let's look at a problem statement
to better understand this concept a random sample
of 32 textbook prices is taken from a local College Bookstore.
The mean of the sample is so-and-so and the sample standard deviation is given. Use a 95% confidence level and find the margin of error for the mean price of all textbooks in the bookstore.
Okay.
Now, this is a very straightforward question.
If you want you can read the question again.
All you have to do is you have to just substitute the values
into the equation.
All right, so guys,
we know the formula for margin of error you take the Z score
from the table.
After that we have deviation Madrid's 23.4 for right
and that's standard deviation and n stands for the number
of samples here.
The number of samples is 32 basically 32 textbooks.
So approximately your margin of error is going to be
around 8.1 to this is a pretty simple question.
All right.
I hope all of you understood this now
that you know,
the idea behind confidence interval.
Let's move ahead to one
of the most important topics in statistical inference,
which is hypothesis testing, right?
So basically, statisticians use hypothesis testing to formally check whether the hypothesis is accepted or rejected.
Okay, hypothesis testing is an inferential statistical technique
used to determine
whether there is enough evidence in a data sample to infer
that a certain condition holds true for an entire population.
So to understand
the characteristics of a general population,
we take a random sample,
and we analyze the properties of the sample right we test.
Whether or not the identified conclusion represents
the population accurately
and finally we interpret the results now
whether or not to accept the hypothesis depends
upon the percentage value that we get from the hypothesis.
Okay, so to better understand this,
let's look at a small example before that.
There are a few steps that are followed
in hypothesis testing you begin by stating the null
and the alternative hypothesis.
All right.
I'll tell you what exactly these terms are
and then you formulate an analysis plan. Right after that you analyze the sample data
and finally you can interpret the results
right. Now, to understand the entire hypothesis testing, let's look at a good example. Okay, now consider four boys: Nick, John, Bob and Harry. These boys were caught bunking a class
and clean the classroom as a punishment, right?
So what John did is he decided that the four of them would take turns to clean their classroom. He came up with a plan of writing each of their names on chits and putting them in a bowl. Now every day they had to pick up a name from the bowl and that person had to clean the class, right?
That sounds pretty fair enough. Now it has been three days and everybody's name has come up except John's. Assuming that this event is completely random and free of bias, what is the probability of John not cheating, right? Or what is the probability that he's not actually cheating? This can be solved by using hypothesis testing.
Okay.
So we'll Begin by calculating the probability of John
not being picked for a day.
Alright, so we're going to assume
that the event is free of bias.
So we need to find out the probability
of John not cheating right first we will find the probability
that John is not picked for a day, right?
We get 3 out of 4, which is basically 75%. 75% is fairly high. So if John is not picked for three days in a row, the probability will drop down to approximately 42%. Okay, so three days in a row means that the probability drops down to 42 percent. Now, let's consider a situation where John is not picked for 12 days in a row; the probability drops down to 3.2 percent.
Okay.
So the probability of John cheating becomes fairly high, right?
So in order
for statisticians to come to a conclusion,
they Define what is known as a threshold value.
Right considering the above situation
if the threshold value is set to 5 percent.
It would indicate
that if the probability lies below 5% then John is cheating
his way out of detention.
But if the probability is above the threshold value, then John is just lucky and his name isn't getting picked.
So the probability
and hypothesis testing give rise to two important components
of hypothesis testing,
which is null hypothesis and alternative hypothesis.
The null hypothesis is basically approving the assumption; the alternate hypothesis is when your result disproves the assumption, right? Therefore in our example, if the probability of an event occurring is less than 5%, which it is, then the event is biased; hence it proves the alternate hypothesis.
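Here is a short sketch of the detention example, just the arithmetic from above: the probability of John's name never being picked over n fair draws, compared against the 5% threshold.

```python
# Probability that John is never picked over several fair draws.
p_not_picked_one_day = 3 / 4

for days in (1, 3, 12):
    p = p_not_picked_one_day ** days
    verdict = "below threshold -> looks biased" if p < 0.05 else "plausible by luck"
    print(days, round(p, 3), verdict)
# 1  0.75   plausible by luck
# 3  0.422  plausible by luck
# 12 0.032  below threshold -> looks biased
```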
So guys with this we come to the end of this session.
Let's go ahead and understand what exactly supervised learning is. So supervised learning is
where you have the input variable X
and the output variable Y and use an algorithm
to learn the mapping function from the input to the output
as I mentioned earlier
with the example of face detection.
So it is called supervised learning
because the process of an algorithm learning
from the training data set can be thought
of as a teacher supervising the learning process.
So if we have a look at the supervised learning steps, or what we would rather say, the workflow.
So the model is used as you can see here.
We have the historic data.
Then again we have the random sampling. We split the data into the training data set and the testing data set. Using the training data set, with the help of machine learning, which is supervised machine learning, we create a statistical model. Then after we have a model which is generated with the help of the training data set, what we do is use the testing data set for prediction and testing.
What we do is get the output
and finally we have the model validation outcome.
That was the training and testing.
So if we have a look at the prediction part of any particular supervised learning algorithm, the model is used for predicting the outcome of a new data set. So whenever the performance of the model degrades, or if there are any performance issues, the model is retrained with the help of the new data.
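Here is a minimal sketch of that train/test workflow, assuming scikit-learn and its built-in iris toy dataset; it is only an illustration of the steps described, not the course's exact demo.

```python
# Supervised learning workflow: split, train, predict, validate.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Random sampling: split the historic data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Build the statistical model on the training data
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Use the testing data for prediction, then validate the outcome
predictions = model.predict(X_test)
print(round(accuracy_score(y_test, predictions), 3))
```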
Now when we talk about supervised learning, there's not just one but quite a few algorithms here. So we have linear regression, logistic regression, decision tree; we have random forest; we have Naive Bayes classifiers.
So linear regression is used to estimate real values.
For example, the cost of houses.
The number of calls the total sales based
on the continuous variables.
So that is what linear regression is.
Now when we talk about logistic regression,
which is used to estimate discrete values, for example,
which are binary values like 0 and 1 yes,
or no true.
False based on the given set of independent variables.
So for example,
when you are talking about something like the chances
of winning or if you talk
about winning which can be either true or false
if will it rain today with it can be the yes or no,
so it cannot be
like when the output of a particular algorithm
or the particular question is either.
Yes.
No or Banner e then
only we use a large stick regression the next
we have decision trees.
So now these are used for classification problems it work.
X for both categorical and continuous
dependent variables and
if we talk about random forest, so random forest is an ensemble of decision trees; it gives better prediction accuracy than a decision tree.
So that is another type of supervised learning algorithm.
And finally we have the Naive Bayes classifier. So it is a classification technique based on the Bayes theorem with an assumption of independence between predictors.
Linear regression is one of the easiest algorithms in machine learning.
It is a statistical model
that attempts to show the relationship
between two variables with a linear equation.
But before we drill down
to linear regression algorithm in depth,
I'll give you a quick overview of today's agenda.
So we'll start a session
with a quick overview of what is regression
as linear regression is one of a type
of regression algorithm.
Once we learn about regression,
its use case the various types of it next.
We'll learn about the algorithm from scratch, where I'll teach you its mathematical implementation first,
then we'll drill down to the coding part
and Implement linear regression using python
In today's session we will deal with the linear regression algorithm using the least square method, check its goodness of fit,
or how close the data is
to the fitted regression line using the R square method.
And then finally
what we will do is optimize it using the gradient descent method. In the last part, in the coding session,
I'll teach you to implement linear regression using Python
and the coding session would be divided into two parts. The first part would consist
of linear regression using python from scratch
where you will use the mathematical algorithm
that you have learned in this session.
And in the next part of the coding session
will be using scikit-learn for direct implementation
of linear regression.
So let's begin our session with what is regression.
Well regression analysis is
a form of predictive modeling technique
which investigates the relationship between a dependent
and independent variable. A regression analysis involves graphing a line over a set of data points that most closely fits the overall shape of the data. Regression shows the changes in a dependent variable on the y-axis relative to the changes in the explanatory variable on the x-axis. Fine.
Now you would ask what are the uses of regression?
Well, there are three major uses of regression analysis, the first being determining the strength of predictors. The regression might be used to identify the strength of the effect that the independent variables have on the dependent variable. All right, so you can ask questions
like what is the strength of the relationship between sales and marketing spending, or what is the relationship between age and income. Second is forecasting an effect; in this, the regression can be used to forecast effects or the impact of changes. That is, the regression analysis helps us to understand how much the dependent variable changes with a change in one or more independent variables. Fine. For example, you can ask a question like how much additional sales income will I get for each thousand dollars spent on marketing. Third is trend forecasting; in this the regression analysis predicts trends and future values.
The regression analysis can be used to get point estimates; in this you can ask questions like what will be the price of Bitcoin in the next six months, right?
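As a hedged example of that regression idea, here is a tiny sketch that fits a straight line to marketing spend versus sales using scikit-learn; the numbers are invented purely for illustration and are not from the course.

```python
# Fit a simple linear regression: sales as a function of marketing spend.
import numpy as np
from sklearn.linear_model import LinearRegression

marketing_spend = np.array([[1], [2], [3], [4], [5]])   # in thousands of dollars
sales = np.array([2.1, 4.3, 6.2, 8.1, 9.9])             # in thousands of units

model = LinearRegression().fit(marketing_spend, sales)
print(round(model.coef_[0], 2))           # additional sales per extra $1k of marketing
print(round(model.predict([[6]])[0], 2))  # forecast for a $6k spend
```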
So next topic is linear versus logistic regression by now.
I hope that you know, what a regression is.
So let's move on and understand its type.
So there are various kinds of regression like linear and much more. Now, all this math might seem intimidating at first if you have been away from it for a while; machine learning is much more math intensive than something like front-end development. Just like any other skill, getting better at math is a matter of focused practice. The next skill
in our list is the neural network architectures.
We need machine learning for tasks that are too complex for humans to code directly, that is, tasks that are so complex that coding them directly is impractical. Now neural networks are a class
of models within the general machine learning literature
or neural networks are a specific set of algorithms
that have revolutionized machine learning.
They're inspired by biological neural networks,
and the current so-called deep neural networks
have proven to work quite well.
Well, neural networks are themselves general function approximators, which is why they can be applied to almost any machine learning problem that is about learning a complex mapping from the input to the output space.
Of course, there are still good reason for the surge
in the popularity of neural networks,
but neural networks have been by far the most accurate way of
approaching many problems like translation speech recognition
and image classification now coming to our next point
which is the natural language processing now
since it combines computer science and linguistics, there are a bunch of libraries like NLTK and Gensim, and techniques such as sentiment analysis and summarization that are unique to NLP. Now audio and video processing has a frequent overlap with natural language processing.
However, natural language processing can be applied
to non-audio data like text, while voice and audio analysis involves extracting useful information from the audio signals themselves. Being well-versed in math will get you far in this one, and you should also be familiar with concepts such as the fast Fourier transform.
Now, these were the technical skills
that are required
to become a successful machine learning engineer.
So next I'm going to discuss some of the non-technical skills
or the soft skills,
which are required to become a machine-learning engineer.
So first of all, we have the industry knowledge.
Now the most successful machine learning projects out there are going to be those that address real pain points. Whichever industry you are working for, you should know how that industry works and what will be beneficial for the business.
If a machine learning engineer does not have business acumen and the know-how of the elements that make up a successful business model, then all those technical skills cannot be channeled productively;
you won't be able to discern the problems
and potential challenges that need solving
for the business to sustain
and grow you won't really be able to help
your organization explore new business opportunities.
So this is a must-have skill now next we
have effective communication.
You'll need to explain the machine learning
Concepts to the people with little to no expertise
in the field chances are you'll need to work
with a team of Engineers as well as many other teams.
So communication is going to make all of this much easier. Companies searching for a strong machine learning engineer are looking for someone who can clearly and fluently translate their technical findings to a non-technical team such as the marketing or sales department. And next on our list,
we have rapid prototyping. So iterating on ideas as quickly as possible is mandatory for finding one that works. In machine learning this applies to everything from picking the right model to working on projects such as A/B testing. Rapid prototyping is a group of techniques used to quickly fabricate a scale model of a physical part or assembly using three-dimensional computer-aided design, which is CAD. So last
but not the least we have the final skill
and that is to keep updated.
You must stay up to date with any upcoming changes. Every month new neural network models come out that outperform the previous architecture.
It also means being aware
of the news regarding the development of the tools, the changelogs, the conferences and much more. You need to know about the theories and algorithms.
Now this you can achieve
by reading the research papers blogs the conference's videos.
And also you need to focus on the online community, which changes very quickly. So expect and cultivate this change. Now, this is not all; here we have certain bonus skills,
which will give you an edge over other competitors
or the other persons
who are applying
for a machine-learning engineer position on the bonus point.
We have physics.
Now, you might be in a situation where you would like to apply machine learning techniques to a system that will interact with the real world; having some knowledge of physics will take you far. Next we have reinforcement learning.
So reinforcement learning has been a driver behind many of the most exciting developments in the deep learning and AI community, from AlphaGo Zero to OpenAI's Dota 2 bot. This will be critical to understand
if you want to go into robotics self-driving cars
or other AI related areas.
And finally we have
computer vision out of all the disciplines out there.
There are by far
the most resources available for learning computer vision.
This field appears to have the lowest barriers to entry
but of course this
likely means you will face slightly more competition.
So having a good knowledge of computer vision and how it works will give you an edge over other competitors. Now,
I hope you got acquainted with all the skills
which are required to become a successful
machine learning engineer.
As you know,
we are living in the world of humans and machines. In today's world, these machines or the robots have to be programmed before they start following your instructions. But what if the machines started learning on their own from their experience, worked like us, felt like us, and did things more accurately than us?
Well, this is where a machine learning engineer comes into the picture, to make sure everything is working according to the procedures and the guidelines.
So in my opinion, machine learning is one of the most recent technologies there is. You probably use it dozens of times every day without even knowing it.
So before we indulge
into the different roles the salary Trends
and what should be there on the resume
of a machine learning engineer
while applying for a job.
Let's understand
who exactly a machine learning engineer is. So machine learning engineers are sophisticated programmers
who develop machines and systems
that can learn
and apply knowledge without specific Direction artificial
intelligence is the goal of a machine-learning engineer.
They are computer programmers
but their focus goes beyond specifically
programming machines to perform specific tasks.
They create programs
that will enable machines to take actions
without being specifically directed to perform those tasks.
Now if we have a look at the job trends
of machine learning in general.
So as you can see in Seattle itself,
we have 2,000 jobs in New York.
We have 1100 San Francisco.
We have 1100 in Bengaluru India,
we have 1100 and then we have Sunnyvale,
California, where we also have a good number of jobs,
so as you can see the number of jobs in the market is too much
and probably with the emergence of machine learning
and artificial intelligence.
This number is just going to get higher now.
If you have a look at the job opening salary-wise percentage,
so you can see for the $90,000 per annum bracket.
We have 32.7 percentage and that's the maximum.
So be assured
that if you get a job as a machine-learning engineer,
you'll probably get around 90 thousand bucks a year.
That's safe to say.
Now for the $110,000 per year bracket we have 25%; for $120,000 we have almost 20 percent; then we have $130,000, which are the senior machine learning engineers, and that's 13.67%. And finally, we have the most senior machine learning engineers, or the data scientists here, who have a salary of $140,000 per annum, and the percentage for that one is really low.
So as you can see, there is a great opportunity for people who are trying to go into the machine learning field and get started with it.
So let's have a look at the machine learning engineer salary. So the average salary in the US is around $111,490, and the average salary in India is around 7 lakh 19 thousand 646 rupees. That's a very good average salary for any particular profession.
So moving forward, if we have a look at the salary of an entry-level machine learning engineer, the salary ranges from $76,000 or $77,000 up to around $151,000 per annum. That's a huge salary.
And if you talk about the bonus here,
we have like
three thousand dollars to twenty five thousand dollars depending
on the work YouTube and the project you are working on.
Let's talk about the profit sharing now.
So it's around two thousand dollars
to fifty thousand dollars.
Now this again depends
upon the project you are working the company you are working
for and the percentage
that Give to the in general or the developer
for that particular project.
Now, the total pay comes around seventy six thousand dollars
or seventy-five thousand dollars
two hundred and sixty two thousand dollars
and this is just for the entry level machine learning engineer.
Just imagine if you become an experience machine
learning engineer your salary is going to go through the roof.
So now that we have understood who exactly a machine learning engineer is, the various salary trends, the job trends in the market, and how it's rising, let's understand what skills it takes to become a machine learning engineer.
So first of all, we have programming languages. Programming languages are a big deal when it comes to machine learning, because you don't just need proficiency in one language: you might require proficiency in Python, Java, R, or C++, because you might be working in a Hadoop environment where you require Java programming to do the MapReduce coding; sometimes R is great for visualization purposes; and Python, as you know, is another favorite language when it comes to machine learning.
Now, the next skill that a particular individual needs is calculus and statistics. A lot of machine learning algorithms are mostly maths and statistics, and a lot of that is required, majorly the matrix multiplication and so on, so a good understanding of calculus as well as statistics is required.
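To make the matrix multiplication point concrete, here is a minimal sketch, assuming NumPy is available; the array values are invented purely for illustration and are not from the course.

```python
import numpy as np

# A toy "dataset": 4 samples with 3 features each (made-up values).
X = np.array([[1.0, 2.0, 3.0],
              [0.5, 1.5, 2.5],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])

# A weight matrix mapping 3 input features to 2 outputs,
# the kind of operation behind a single linear layer.
W = np.array([[0.1, 0.4],
              [0.2, 0.5],
              [0.3, 0.6]])

# Matrix multiplication: (4 x 3) @ (3 x 2) -> (4 x 2) outputs.
Y = X @ W
print(Y.shape)  # (4, 2)
print(Y)
```

This is the same shape bookkeeping that shows up again and again in regression and neural network code.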
Next we have signal processing. Advanced signal processing is something that will give you an upper edge over other machine learning engineers if you are applying for a job anywhere.
The next skill we have is applied maths. As I mentioned earlier, many of the machine learning algorithms here are purely mathematical formulas, so a good understanding of maths and how the algorithms work will take you far ahead.
Next on our list, we have neural networks. Neural networks are something that has been emerging quite popularly in recent years, and due to their efficiency and the extent to which they can work and get results as quickly as possible, neural networks are a must for a machine learning engineer.
Now moving forward, we have language processing. A lot of the time, machine learning engineers have to deal with text data, voice data, as well as video data, and processing any kind of language, audio, or video is something a machine learning engineer has to do on a daily basis, so one needs to be proficient in this area as well. Now, these are only some of the skills which are absolutely necessary, I would say, for any machine learning engineer. So let's now discuss the job description,
or the roles and responsibilities, of a particular machine learning engineer. Depending on their level of expertise, machine learning engineers may have to study and transform data science prototypes. They need to design machine learning systems. They also need to research and implement appropriate machine learning algorithms and tools, as that's a very important part of the job. They need to develop new machine learning applications according to the industry requirements. They select the appropriate data sets and the data representation methods, because if there is a slight deviation in the data set or the data representation, that's going to affect the model a lot. They need to run machine learning tests and experiments, and they need to perform statistical analysis and fine-tuning using the test results.
So sometimes people ask what exactly is the difference between a data analyst and a machine learning engineer. Well, statistical analysis is just a small part of a machine learning engineer's job, whereas it is a major part, or probably covers a large part, of a data analyst's job.
Machine learning engineers might need to train and retrain the systems whenever necessary, and they also need to extend the existing machine learning libraries and frameworks to their full potential so that they can make the models work superbly. And finally, they need to keep abreast of developments in the field; needless to say, any machine learning engineer, or any particular individual, has to stay updated on the technologies coming into the market, because every now and then a new technology arises which will overthrow the older one.
So you need to be up to date. Now, coming to the resume part of a machine learning engineer: any machine learning engineer's resume should consist of a clear career objective, the skills which that particular individual possesses, the educational qualifications, certain certifications, the past experience if you are an experienced machine learning engineer, and the projects which you have worked on, and that's it.
So let's have a look at the various elements that are required in a machine learning engineer's resume. First of all, you need to have a clear career objective; here you need not stretch it too much, and keep it as precise as possible. Next we have the skills required, and these skills can be technical as well as non-technical, so let's have a look at the various technical and non-technical skills out here.
Starting with the technical skills: first of all, we have programming languages such as R, Java, Python, and C++. The first and foremost requirement is to have a good grip on a programming language, preferably Python, as it is easy to learn and its applications are wider than any other language. It is important to have a good understanding of topics like data structures, memory management, and classes. Although Python is a very good language, it alone cannot help you, so you will probably have to learn all these languages like C++, R, Python, and Java, and also work on MapReduce at some point of time.
Next on our list, we have calculus, linear algebra, and statistics. You'll need to be intimately familiar with matrices, vectors, and matrix multiplication. Statistics is going to come up a lot, so at least make sure you are familiar with the Gaussian distribution, means, standard deviations, and much more. You also need a firm understanding of probability and stats to understand machine learning models.
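As a small illustration of those statistics basics, here is a hedged sketch, assuming NumPy is available, that samples from a Gaussian distribution and checks its mean and standard deviation; the parameters are arbitrary example values, not figures from the course.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Draw 10,000 samples from a Gaussian (normal) distribution
# with mean 5.0 and standard deviation 2.0 (illustrative values).
samples = rng.normal(loc=5.0, scale=2.0, size=10_000)

print("sample mean:", samples.mean())  # should be close to 5.0
print("sample std :", samples.std())   # should be close to 2.0

# Empirical rule check: roughly 68% of samples fall within one standard deviation.
within_one_std = np.mean(np.abs(samples - 5.0) < 2.0)
print("fraction within 1 std:", within_one_std)
```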
The next, as I mentioned earlier, is signal processing techniques. Feature extraction is one of the most important parts of machine learning, and different types of problems need different solutions, so you may be able to utilize really cool advanced signal processing algorithms such as wavelets, shearlets, curvelets, and bandlets. Try to learn about time-frequency analysis and apply it to your problems, as it gives you an upper edge over other machine learning engineers, so just go for it.
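Wavelets, shearlets, curvelets, and bandlets need specialized libraries, but as a simpler, hedged sketch of frequency-domain feature extraction, here is a plain-NumPy Fourier example; the signal and its frequencies are invented purely for illustration.

```python
import numpy as np

# Build a toy signal: two sine waves (5 Hz and 50 Hz) sampled at 1 kHz for 1 second.
fs = 1000                      # sampling rate in Hz
t = np.arange(0, 1.0, 1 / fs)  # time axis
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)

# FFT-based feature extraction: magnitude spectrum of the signal.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

# The two dominant peaks should sit near 5 Hz and 50 Hz.
top = freqs[np.argsort(spectrum)[-2:]]
print("dominant frequencies:", sorted(top))  # approximately [5.0, 50.0]
```

The same idea, turning a raw waveform into a handful of informative numbers, is what feature extraction for audio or sensor data usually comes down to.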
so just go for the next we
have mathematics and a lot of machine learning techniques out.
There are just fancy types of function approximation
having a firm understanding of algorithm Theory and knowing
how the algorithm works is really necessary
and understanding subjects
like gradient descent convex optimization
quadratic programming
and partial differentiation will help a lot the neural networks
as I was talking earlier.
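To make the gradient descent idea concrete, here is a minimal sketch in plain NumPy that fits a one-variable linear model on made-up data by repeatedly stepping against the gradient of the squared error; it is an illustration under those assumptions, not the exact method used in any particular course demo.

```python
import numpy as np

# Made-up data roughly following y = 3x + 1 with a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 3 * x + 1 + 0.05 * rng.standard_normal(50)

w, b = 0.0, 0.0   # parameters to learn
lr = 0.5          # learning rate (step size)

for _ in range(500):
    y_pred = w * x + b
    error = y_pred - y
    # Partial derivatives of the mean squared error with respect to w and b.
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # Gradient descent step: move against the gradient.
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # should end up close to 3 and 1
```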
Then come the neural networks I was talking about earlier. We need machine learning for tasks that are too complex for humans to code directly, that is, tasks so complex that it is impractical to program them by hand. Neural networks are a class of models within the general machine learning literature; they are a specific set of algorithms that have revolutionized machine learning. Deep neural networks have proven to work quite well, and neural networks are themselves general function approximators, which is why they can be applied to almost any machine learning problem out there, and they help a lot with learning a complex mapping from the input to the output space.
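To illustrate the "general function approximator" point, here is a hedged sketch, assuming scikit-learn is installed, where a small feed-forward network learns an invented nonlinear mapping from input to output; the target function and settings are example choices only.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Invented regression task: learn y = sin(3x) from samples on [-2, 2].
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(500, 1))
y = np.sin(3 * X).ravel()

# A small feed-forward neural network: one hidden layer of 50 tanh units.
net = MLPRegressor(hidden_layer_sizes=(50,), activation="tanh",
                   max_iter=5000, random_state=0)
net.fit(X, y)

# The learned mapping should roughly approximate sin(3x) on new inputs.
X_test = np.array([[0.0], [0.5], [1.0]])
print(net.predict(X_test))         # predictions from the network
print(np.sin(3 * X_test).ravel())  # ground truth for comparison
```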
Now, next we have language processing. Since natural language processing combines two of the major areas of work, namely linguistics and computer science, chances are at some point you are going to work with either text, audio, or video.
So it's necessary to have command over libraries like gensim and NLTK, and techniques like word2vec, sentiment analysis, and text summarization. Voice and audio analysis involves extracting useful information from the audio signals themselves; being very well versed in maths and concepts like the Fourier transform will get you far in this one.
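As a small, hedged illustration of the word2vec technique mentioned above, here is what training word embeddings with gensim (4.x API) might look like; the toy corpus is made up and far too small to produce meaningful vectors.

```python
from gensim.models import Word2Vec

# A tiny made-up corpus; real applications need millions of tokens.
corpus = [
    "machine learning engineers build systems that learn from data",
    "deep learning models learn complex mappings from input to output",
    "natural language processing combines linguistics and computer science",
]
sentences = [line.lower().split() for line in corpus]

# Train word2vec embeddings: each word becomes a 50-dimensional vector.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=100)

print(model.wv["learning"].shape)                 # (50,)
print(model.wv.most_similar("learning", topn=3))  # nearest words by cosine similarity
```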
These were the technical skills that are required, but be assured that there are a lot of non-technical skills also required to land a good job in the machine learning industry.
So first of all, you need to have industry knowledge. The most successful machine learning projects out there are going to be those that address real pain points, don't you agree? So whichever industry you are working for, you should know how that industry works and what will be beneficial for the industry. If a machine learning engineer does not have business acumen and the know-how of the elements that make up a successful business model, all those technical skills cannot be channeled productively; you won't be able to discern the problems and the potential challenges that need solving for the business to sustain and grow.
Next on our list, we have effective communication, and this is one of the most important parts of any job requirement. You'll need to explain machine learning concepts to people with little to no expertise in the field, and chances are you will need to work with a team of engineers as well as many other teams like marketing and sales, so communication is going to make all of this much easier. Companies searching for a strong machine learning engineer are looking for someone who can clearly and fluently translate technical findings to a non-technical team.
Rapid prototyping is another skill which is very much required for any machine learning engineer. Iterating on ideas as quickly as possible is mandatory for finding the one that works; in machine learning this applies to everything from picking the right model to working on projects such as A/B testing and much more. Traditionally, rapid prototyping refers to a group of techniques used to quickly fabricate a scale model of a physical part or assembly using three-dimensional computer-aided design, that is, CAD data. Now, coming to the final skill that is required for any machine learning engineer: keeping yourself updated.
So you must stay up to date with any upcoming changes; every month new neural network models come out that outperform the previous architectures. It also means being aware of the news regarding the development of tools, theory, and algorithms through research papers, blogs, conference videos, and much more.
Another part of any machine learning engineer's resume is the educational qualification. A bachelor's or master's degree in computer science, IT, economics, statistics, or even mathematics can help you land a job in machine learning; plus, if you are an experienced machine learning engineer, some standard industry certifications will help you a lot when landing a good job in machine learning.
And finally, coming to the professional experience: you need to have experience in computer science, statistics, or data analysis if you are switching from another profession into a machine learning engineer role, and if you have previous experience in machine learning, that is very good as well. Finally, if we talk about the projects, you need not just any project that you have worked on; you need to have worked on machine-learning-related projects that involve a certain level of AI and working with neural networks to a certain degree to land a good job as a machine learning engineer.
Now, if you have a look at the companies hiring machine learning engineers, every other company is looking for machine learning engineers who can modify the existing models into something that does not need much maintenance and can sustain itself, so basically, working on artificial intelligence and new algorithms that can work on their own is what every company desires. We have Amazon and Facebook; we have tech giants like Microsoft and IBM; in the gaming, GPU, and graphics industry we have Nvidia; in the banking industry we have JPMorgan Chase; and again we have LinkedIn and also Walmart. All of these companies require machine learning engineers at some point of time.
So be assured that if you are looking for a machine learning engineer post, every other company, be it a big-shot company or even a new startup, is looking for machine learning engineers, so be assured you will get a job. Now, with this we come to the end of this video. I hope you've got a good understanding of who exactly a machine learning engineer is, the various job trends, the salary trends, what skills are required to become a machine learning engineer, and, once you become a machine learning engineer, what the roles and responsibilities are and what appears on the resume, the job description, or the job application of any machine learning engineer. I also hope you got to know how to prepare your resume in the correct format and what to keep there: the career objectives, the skills, technical and non-technical, previous experience, educational qualifications, and certain projects which are related to it.
So that's it, guys. Edureka, as you know, provides a Machine Learning Engineer Master's Program that is aligned in such a way that it will get you acquainted with all the skills required to become a machine learning engineer, and that too in the correct format.