YouTube Transcript:
Data Analyst Bootcamp for Beginners (SQL, Tableau, Power BI, Python, Excel, Pandas, Projects, more)
Video Transcript
you are in the right place to learn to
become a data analyst in this massive
boot camp Alex the analyst will cover
all the core topics that data analysts
need to know and along the way you'll
build plenty of projects to gain
hands-on experience hello everybody my
name is Alex Freberg better known as
Alex the Analyst on YouTube and in this
video you're going to be taking my
entire data analyst boot camp this boot
camp is comprised of videos that I've
made over the past 3 years and they
cover a lot of different topics like SQL
Excel Power BI Tableau and Python
throughout the boot camp there are a lot
of Hands-On guided projects that will
really help you learn these skills well
and speaking of projects there's an
entire Part near the end where you can
build a free portfolio website where you
can put all of your projects on so that
hiring managers and recruiters can go
and look at all these projects that
you've built if you wanted to go even
more in depth into the skills that we
learn in this boot camp I have a data
analytics learning platform called
Analyst Builder Analyst Builder was
designed specifically for data analysts
so all of the courses and all the
content are just for you and it has a
coding section where you can learn and
practice for technical interviews and
lastly before we jump into the boot camp
I want to give a huge shout out to
freeCodeCamp for putting this all together
I've personally learned a ton from
freeCodeCamp and so I'm really honored that my
boot camp is going to be here for you
guys to learn and I really hope you
enjoy it what's going on everybody it is
2023 and in this video I'm going to help
you become a data analyst
we're going to start at the very
beginning assuming you haven't started
this process at all of becoming a data
analyst if you already have you can kind
of identify where you are in
this process and then go from there now
before we dive into everything I want to
warn you I will be mentioning my own
channel a lot in this video I have
videos and playlists on just about every
single topic that we're going to be
talking about today I'll have all the
links to those videos in the description
so you can dive into those topics more
in depth so I hope that's okay and it's
all completely free I've been building
this out for the past 3 years and
honestly you can probably get 90% of the
way to learning everything you need for
data analytics just on my channel so now
that I've warned you let's jump into
number one and that is learn the data
analyst skills now there are literally a
hundred different things that you can
learn for data analytics you can learn
things like Alteryx or a cloud platform
or different programming languages but
there are some core skills that I
recommend you start out with before kind
of branching into some of those other
skills the number one skill that I
always recommend people start with is
SQL SQL is just one of those fundamental
skills I think everybody should learn
even if you don't use SQL you'll use
some variation of SQL if your company
has a large enough data set SQL is used
to actually query and retrieve data from
a database so if your company collects
data which every company does they're
going to put it somewhere to store it's
usually stored in a database and SQL is
how you get that data from the database
I think SQL is also fairly easy to learn
which makes it really good when you're
just starting out I have several
playlists dedicated to SQL starting from
beginner all the way to Advanced and you
can learn all of that for free one other
reason why I think you should learn SQL
first is that a lot of companies
interview or have a technical interview
during the interview process on SQL
that's something that really caught me
off guard when I was first starting out
because I thought it was going to be
more behavioral I didn't even know what
a technical interview was so knowing SQL
actually became a really important part
of interviewing and getting a job as a
data analyst the second skill that I
would learn is a business intelligence
tool like Tableau or Power BI now there
are a ton of different BI tools I can
literally name 10 off the top of my head
that I've used throughout my career but
what I will say is that learning
something like Tableau or Power BI is
pretty transferable to almost all those
other BI tools they're all fairly
similar in how they do things and how
they display the data you most
likely won't have a technical interview
asking you about Tableau or Power BI like
to build something for them that usually
does not happen but the combination of
SQL where you can query your data and
then taking that data to build something
that is a really really great
combination to learn right away I have
entire series on both Tableau and
Power BI with projects on my channel the
third skill that I would learn is Excel
now most people have used Excel they
know what Excel is and how it's used but
it can be used a little bit differently
for a data analyst for example
in Excel a lot of people haven't cleaned
data in Excel or built charts and graphs
using Excel and those are things that
data analysts would probably do Excel is
also just a fundamental skill that every
company is going to expect you to know
so I have an entire playlist dedicated
to Excel to actually walk you through
how to use it for data analysis the
fourth skill that I recommend you learn
is python now a lot of people will have
python higher up on their list they only
use Python they don't use SQL or a bi
tool they just do everything in Python
now python is a fantastic tool you can
use it to manipulate your data to create
data visualizations and a ton more like
web scraping and regular expressions and
a hundred different other things but it
can be kind of hard to learn it took me
a long time to really learn the basics
very well that's really the only reason
why it is farther back I feel like SQL
and a bi tool are really easy to learn
and really pack a big punch whereas
python can be quite tough to learn in my
experience and you may not use it as
often as you would something like SQL or
a bi tool if you're interested in
learning Python I have an entire
series dedicated to python as well as
projects that you can build again I
warned you there's going to be a lot of
self-promotion in this video I have
videos on just about every single one of
these topics the fifth and the last
skill that I recommend you learn and
this is the only one that I don't have a
series on yet I will make those is
learning a cloud platform like AWS
Google Cloud platform or Azure there's
no denying that these platforms have
had a huge impact on how we use data
as a whole in the data analyst industry
they can be kind of tough to learn
though if you aren't using it Hands-On
in an actual job I think that learning a
cloud platform is already something that
most people should start working towards
because in the future it's only going to
become more prevalent now where can you
go and actually learn all of these
skills that you need to become a data
analyst well the number one place I'd
recommend of course is my channel I have
free tutorials on all these skills and a
lot of other topics and I think it's
just a really great place to start the
next place that I recommend you looking
at is Udemy I recommend Udemy especially
if you're just starting out because it's
pretty cheap you can buy an
entire SQL course for $10
or $15 and they have courses on every
single one of these skills and I just
recently made a video called DIY data
analyst curriculum using Udemy for
under $75 so you can create an entire
curriculum to learn all of these skills
for under $75 which is just amazing the
next place I'm going to recommend you
look is Coursera now Udemy is fantastic
they have really good instructors and
good courses but as a whole I find that
sometimes Coursera just has more
professional or better content Coursera
is a bit more expensive though you're
looking at $59 per month for all of
their courses or you can pay upfront an
annual fee of $399 so again it's just a
lot more expensive I moved to Coursera
once I started having a data analyst job
and had a bit more money but when I was
first starting out I just couldn't
afford it so I went to Udemy and it was
a really great place to start there are
also places like DataCamp and
Dataquest that kind of gamify learning and
they're more text based so all these
other platforms Udemy Coursera and me
they're all video based but if you like
reading DataCamp and Dataquest are a
lot more text based where you can learn it
by reading it and doing it after you
learn all of these skills the next thing
that I recommend you do is actually
build projects with those skills now
what is building a project actually mean
it means taking a skill and then
building something out of it that you
can then show a potential employer for
example if you went through and learn
Tableau you go and take a data set and
you could build a visualization and a
dashboard in tableau and that would be a
project with these projects you can
build something called a portfolio and I
usually call it a portfolio website a
portfolio website is a website that you
create where you store all of your
projects and then you can share that
with recruiters and hiring managers so
that they can see all of your work now
do you absolutely need a portfolio to
show employers no you don't but it does
help in two different ways the first
thing that it may do is actually help
you land the interview if you have a
link on your resume and they click on it
they may see your skills and see your
projects and be like man this person
really knows what they're doing this is
exactly what we need the second reason
that I recommend building projects is
because most likely during your
interview you're going to get asked
questions like how have you used SQL how
have you used Tableau and if you don't
have any experience in that you're just
going to say well you know I've taken
courses to learn it but with a project
you can be a lot more specific you'll be
able to say well I actually just built
out this project in Tableau I took the
data and cleaned it in Excel and then I
put it in Tableau and built out this
dashboard and here are the insights
that I found from this data set it's
just a much better answer and as a
hiring manager myself I can tell you
that it is definitely beneficial to
build out these projects The Next Step
that I recommend you take in becoming a
data analyst is building a data analyst
resume the resume to say the least is
extremely important it's what's going to
actually allow you to land an interview
to potentially get a job now if you were
like me when I was first starting out I
had a resume it just had nothing to do
with data analytics so how do you make a
data analyst resume if you don't have
any experience as a data analyst well
you are asking the perfect questions
because the very first things that we
talked about are what are going to go on
your resume those skills and those
projects if you have no experience or
degree like myself who has a
recreational therapy degree if you have
no background in this it can be really
daunting to kind of display that you
know what you're doing and that a
company should hire you so what I
usually recommend is right beneath your
contact at the top you put your skills
and your projects that you built out on
your resume things like work experience
and education should go on your resume
as well but just a little bit lower you
want them to see those things before
they see that your last work experience
was at Domino's and you have a degree in
Marine Biology it's just not relevant to
data analysis and if you put those
things at the top they're probably going
to rule you out right away the fourth
step to become a data analyst is
actually applying you have the skills
you have the projects you have the
resume now you're ready to start
applying for those data analyst jobs now
there are a lot of different
opinions on how you need to go about
applying for data analyst jobs but I'll
give you my take on it and this has been
the most successful for me in my career
the first thing I want to mention is
actually what I would not do which is
just blindly apply on Glassdoor Monster
ZipRecruiter and all these other
platforms to just any data analyst job
that you can find now I'm not against
this I think you should do that but I
don't think that's the only thing that
you should do because the chances of you
getting a call back or actually hearing
something back are extremely low to
really increase your chance of becoming
a data analyst I highly highly highly
recommend working with a recruiter a
recruiter is literally someone who is
there to help you find a job now when I
first started out I didn't understand
what a technical recruiter was at all I
was kind of nervous or scared to work
with them but it's actually pretty simple
a company has a position that they want
to fill and they don't want to spend
hours and hours and hours to find
someone to fill that position so they
hire a recruiter a recruiter is going to
go out and try to find someone to fill
that position AKA you and so if you go
in to talk to that recruiter and they
have a position that opens up they will
help you get that interview and then if
you get a job let's say for
$50,000 the company is going to pay that
recruiter let's say 10% of your salary
so they'll give them $5,000 so you don't
actually have anything to lose
using a recruiter you can reach out to
Recruiters in several ways and I've done
every variation but I'll tell you my
most successful way which was using
LinkedIn there are tens of thousands of
Recruiters on LinkedIn I made an entire
video of how you can reach out to
recruiters and what to say to recruiters on
LinkedIn to help you land a job so be
sure to check out that video when you
actually get to that point but you can
also just cold email and cold call these
recruiting companies but to me it's just
not as effective as reaching out
directly on LinkedIn and this is just a
bonus one the last thing that you need
to do is accept a job offer so on step
number four after you apply to those
jobs you do actually have to go in and
interview and then get a job offer which
you will accept I just thought I'd
mention that just in case that was not
super clear now that was a lot of stuff
let's talk about time frames to actually
complete all of these things now doing
all of these things from scratch is
going to take a while but let's break it
down by each step and see how long I
generally think it's going to take let's
start with step number one which is
actually learning the skills now just to
be up front this one probably is going
to take the longest for most people for
most people to learn all of these skills
it's going to take around 3 to four
months now if you don't learn a cloud
platform and python which are the last
ones that I recommend and you just focus
on SQL a BI tool and Excel I think you can do
that in under 3 months that is very
dependent though on how much time you
have to study that time frame is more
for someone who has several hours per
day maybe 3 hours at the end of the night
after you go to work that is someone who
has quite a bit of time to dedicate to
learning during their week of course
that time frame is going to take longer
if you don't have as much time to
dedicate to learning now let's look at
number two which was creating projects
and a portfolio of projects from my
experience when you're first starting
out it takes a lot longer to actually
create these projects it can take one
or two weeks per project I usually
recommend people doing three to five
projects in their portfolio before they
start applying and since they can take
anywhere from 1 to two weeks you're
looking at anywhere from 3 to 6 weeks
The Next Step was to create a data
analyst resume now in my opinion this
one should take the shortest out of
every single step here because you're
really just kind of reformatting a
resume or creating a resume you're just
adding skills you're adding your
projects and then kind of reformatting
it to make it look nice this should
hopefully take under a week but if you
use something like a professional
service where they help you build a resume
it could take one to two weeks the
last two steps which kind of go hand in hand
are step four and five which is actually
applying for jobs and then Landing a job
now this process can take as little as a
month or it can take as long as 6 months
or a year it really depends on how
you're applying where you're applying
and just the kind of luck that you're
having with actually Landing interviews
I've seen people who have never had any
experience land a job within a month of
starting to apply and it's incredible
it's amazing but it doesn't happen too
often you're usually looking at around 2
to 4 months on average to land your
first data analyst job if you put all of
those together and kind of average
everything out you're looking at around
6 months total for the entire process
now I don't want that to discourage you
okay 2023 is a long year you have a lot
of time and it doesn't have to take 6
months you could do it faster you could
do it in three months and just prove me
wrong but if you are really focused and
you are really driven to become a data
analyst this year I know that you can do
it now to maybe boost your spirits and
make you feel a little bit better I
didn't know any of these things when I
first started out I didn't have anyone
telling me kind of a plan on what to do
I had to go out and figure all these
things out by myself and it took me
almost a year to land my first real data
analyst job so with all that being said
I hope that this video is helpful I hope
you now have a path on how to become a
data analyst this year and that my
channel can be a big part of that so
thank you guys so much for watching I
really appreciate it if you like this
video be sure to like and subscribe
what's going on everybody my name is
Alex Freberg and in today's video we're
going to be starting our basics of SQL
series now in this series we're going to
be going over everything you need just
to get started and then in future videos
we're going to be going over some
intermediate Concepts and some more
advanced concepts and then in the final
series we're going to be going over some
portfolio projects in this video in
particular we're going to be downloading
SQL Server Management Studio we're going to be
creating our tables inserting data into
our tables and in future videos we're
going to actually learn how to query
those tables if you already have SQL
Server management Studio downloaded you
can skip ahead to where we actually
create the tables and insert the data
into the tables if you don't care about
that at all and you're just looking at a
query I would skip to the next video
where we actually start querying the data
that we inserted into those tables so to
download SQL Server management Studio we
actually have to download two things and
I have both links right here I'm going
to leave those in the description so that
you guys have those but this one is to
actually download SQL Server management
studio so let's go down here I actually
deleted it off my computer so I can walk
through this with you guys so we're
going to download that let's also go
over here this is actually a server so
we have to download a SQL server and if
you go down right here there's a free
version now I don't need the developer
version I'm just going to download the
express version it's actually smaller so
let's download that as
well now once this is done running we're
going to open it up and I'll show you
what to do next so it just finished
running let's click on
it all right so we need to install it
we're going to click yes and this is
going to take a little while so this
popped up I clicked install and it's
been running for the past couple minutes
apparently I was not recording so I
apologize for that but that's all I did
so now it's been installed I'm actually
going to pull it up right
here and let's open it
up now when it pulls up it's going to
ask you to connect to a server and
that's why we downloaded SQL Server Express
and there you go it's as easy as
that so now we have SQL Server
management Studio set up and we are good
to go so the first thing that we need to
do is actually create a database so
let's go over here to databases and
let's click new
database and let's just do SQL
tutorial keep it simple and if we click
that it's going to create our database
for us now when you open up the database
there's going to be a lot of stuff you
really do not need to know all this
really what we're going to be sticking
to is this tables right here uh as of
right now we do not have any tables so
we need to create tables now there's two
ways that you can do that you can click
right here and you can go to new and
create table we're not actually going to
do that we're going to create it using a
script or T-SQL so we're going to go
over here and do new query and we will
get started on actually creating uh the
two tables that we're going to be using
for all the stuff going forward all
right so let's get rid of me because you
really don't need to be seeing me
anymore let's get started by doing our
very first table which is going to be
our employee demographics table so let's
start off by saying create table and we
have to name it so let's do
employee demographics and enter down we
want to do an open parenthesis now we
need to specify what our column names
are going to be and what the data type
is for each column so let's start off
with employee ID and we want that to be
an integer so that'll be like 1 2 3 4 anything
numeric now we want to
do first name and let's make that
varchar(50) if you don't know what these data
types are that's okay that will
probably be covered in a different video
that's not really necessary for this
video let's do last name we'll also
make that varchar(50) let's do age make
that an integer and very last let's do
gender and we will make that varchar(50) as
well so now we have our very first
table let's run that and we'll see if it
works we'll go over here we'll refresh our
tables and there you go so we have our
very first table let's go up here let's
get rid of this one and now let's create
our second table so we're going to do
basically the exact same thing but we're
going to have a little bit different
information in it this is going to be
our employee salary table so let's do
it and enter and open parenthesis so now
we're going to do the same thing we're
going to do employee ID let's make that
an integer now we want the job title
because we want to know what they
do and this one is going to be varchar(50)
because we keep it pretty simple
and then for our very last one
we're going to do salary and that will
be integer as
well and I'll just do the closing parenthesis here so let's
create this
table let's see if it is there and there
we go so let's open up one of these
tables really
quick see what's in there see what it
looks like as you can see we do not have
any information in there uh when you
create a new table sometimes when you
open it up you're going to see this if
you want to get rid of that you just
need to do a I think it's called A Hard
refresh or something like that but you
can do control shift R let's see if it
works for me I just did it all right it
goes away so now it recognizes it as a
table so we're good there let's go back
here and let's get rid of all this we've
already created our tables now we want
to insert the data into our tables so
let's see what that looks like let's do
insert into and now we need to specify
what table we're inserting our data into
so let's start off with employee
demographics let's do
values so now we have to select what
values we're going to put into
this table
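A rough sketch of the two create table statements described above. It uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere, which is a substitution and not what the video uses. The identifier spellings (EmployeeDemographics, EmployeeSalary, EmployeeID, FirstName, LastName, JobTitle) are assumptions based on the spoken names, and SQLite accepts a length like VARCHAR(50) without enforcing it.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server database.
conn = sqlite3.connect(":memory:")

# Each column gets a name and a data type, as in the walkthrough.
# Table and column spellings are assumptions based on the spoken names.
conn.execute("""
    CREATE TABLE EmployeeDemographics (
        EmployeeID INT,
        FirstName  VARCHAR(50),
        LastName   VARCHAR(50),
        Age        INT,
        Gender     VARCHAR(50)
    )
""")
conn.execute("""
    CREATE TABLE EmployeeSalary (
        EmployeeID INT,
        JobTitle   VARCHAR(50),
        Salary     INT
    )
""")

# List the tables that now exist, similar to refreshing the Tables folder.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# PRAGMA table_info returns one row per column; index 1 is the column name.
demographics_columns = [
    row[1] for row in conn.execute("PRAGMA table_info(EmployeeDemographics)")
]

print(table_names)
print(demographics_columns)
```

Both tables start out empty, which matches what the walkthrough sees when it opens them right after creation.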
so now we're going to have to do the
employee ID so let's do
101 then we'll do first name so let's do
Jim last name
Halpert and then his age let's say he's
30 and he is a
male now just for fun let's execute that
let's go back to this table right here
and execute
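The insert just described can be sketched like this, again with Python's sqlite3 module standing in for SQL Server. The table and column spellings are assumptions, and so is the capitalization of 'Male'. The values are listed in the same order as the columns were declared, which is why no column list is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same shape as the demographics table from the walkthrough
# (identifier spellings are assumptions).
conn.execute("""
    CREATE TABLE EmployeeDemographics (
        EmployeeID INT,
        FirstName  VARCHAR(50),
        LastName   VARCHAR(50),
        Age        INT,
        Gender     VARCHAR(50)
    )
""")

# One row: employee ID, first name, last name, age, gender.
conn.execute(
    "INSERT INTO EmployeeDemographics VALUES (101, 'Jim', 'Halpert', 30, 'Male')"
)

# Read the table back to confirm the row went in.
inserted_rows = conn.execute("SELECT * FROM EmployeeDemographics").fetchall()
print(inserted_rows)
```

Text values go in single quotes and numbers do not, which is the usual place a first insert goes wrong.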
and as you can see all of our
information actually went in there so
now we have his employee ID his first
name his last name age and gender now we
need a lot more information uh for this
table in order to actually learn a lot
of the concepts of querying the table so
I'm actually going to go through and add
a ton more information I'm not going to
bore you through that but I will show
you the final product before I actually
hit execute so stick with me I'm
actually just going to cut to the end
where I insert all my stuff down on here
and then if you want that I'll probably
leave it in the description or maybe put
in my GitHub or something so you can
easily just go copy and paste that if
that's what you want to do so I'll see
you in a few
seconds all right so I have all my
values right here I'm actually going to
take this one out because I already did that
one but this is our additional
information let's insert that into our
table real quick and go back here and
take a look at it and there you go this
is going to be our core information that
we are querying off of
uh in future videos so that table is
completely finished let's go back here
we're going to get rid of this because
now we want to insert our information to
our other table so let's do insert into
and let's do
employee and now we're going to do
salary so let's do values to specify
that we're inserting values into
there and in this one we have employee
ID so again let's do the same ID that's
Jim then his job title and his salary is
$45,000 and let's execute that and you
can't see it but down here it says it's
done let's go to that
table and as you can see that is
inserted I'm going to do the exact same
thing as I did before I am going to fill
out all these and in a second it will be
done uh on your side and then again I
will leave it in the description or I'm
going to put it on my GitHub and you
guys can just copy and paste that if
that's what you want to do or you can
write it out whatever you want to do all
right just like before I'm going to get
rid of this first one that is Jim he is
already done now let's insert this
information it is finished let's go back
here and there we go now we have both of
our tables and we are good to go for
future videos so thank you so much for
sticking all the way through this one in
the next video we're going to actually
begin querying the table and learning
the select the from the where the group
by and the order by statement everything
is in these upcoming videos so stick
around and we will learn all of that
together thank you so much for joining
me if you like this type of content be
sure to subscribe below and I'll see you
in the next video what is going on
everybody my name is Alex Freberg and
in today's video we're going to be going
over the select and the from statement
so if you joined us for our last video
we went over creating our tables and
inserting data into those tables and so
we have this employee demographics table
and we also have this employee salary
table and today we're going to be
walking through the select statement in
the from statement on these tables so
here are some of the concepts that we're
going to be going over today let's just
get it started by doing select
everything and let's do this from the
employee demographics table so let's
execute this if we wanted to only show
the first names we can just do first
name and run that
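A minimal sketch of the two queries so far, select everything and select a single column, using Python's sqlite3 module in place of SQL Server. Only Jim's row comes from the walkthrough. The other rows and all identifier spellings are made-up assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EmployeeDemographics (
        EmployeeID INT,
        FirstName  VARCHAR(50),
        LastName   VARCHAR(50),
        Age        INT,
        Gender     VARCHAR(50)
    )
""")

# Only the first row is from the walkthrough; the rest is placeholder data.
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Jim", "Halpert", 30, "Male"),
        (102, "Ann", "Lee", 28, "Female"),
        (103, "Raj", "Patel", 35, "Male"),
    ],
)

# The star returns every column of every row.
all_columns = conn.execute("SELECT * FROM EmployeeDemographics").fetchall()

# Naming a column returns only that column.
first_names = conn.execute(
    "SELECT FirstName FROM EmployeeDemographics"
).fetchall()

print(all_columns)
print(first_names)
```

Each result row comes back as a tuple, so the single-column query returns one-element tuples rather than bare strings.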
and if we want first name and last name
we can just separate that by using a
comma and it will return those well if
we want to return all columns and all
rows then all we have to do is use this
star so that's what the star does now we
have nine rows of data here and if we
only wanted to return let's say the top
five we can easily do that and we can
just say top five of everything now the
reason this could be useful is say you
have a table that has millions of rows
in it and you only want a small sample
you can say select top 1,000 and when
you do that it will only select the top
1,000 rows now let's get everything back
in here really quick because we're going
to move on to this distinct feature so
when we use distinct we're actually
saying that we want the unique values in
a specific column so if we say distinct
and then let's do employee ID
everything should be returned so all
nine rows should be returned and that's
because every single one of these are
unique now let's try gender so there's
only going to be two results the male
and the female and that's because
there's only two distinct values in that
column now let's look at all of our data
again so now we want to look at count
now count is very simple all it's going to
do is show us all the non null
values in a column so let's look at last
name for example if we do count of last
name all that's going to give us is a
count of nine because we have nine last
names if for whatever reason somebody's
last name was left out and that was null
then it would have returned maybe eight
or seven depending on how many were
actually in there so if an entire column
was null it would return
zero and if you notice we are not given
a column name that's because this is
derived information based off the last
name so if we want to actually give this
a name so that that column does not say
no column name we can use this as right
here so once you put as you can actually
name it so since this is the count of
the last name we'll write last name
count keep it simple and if we execute
that as you can see we have last name
count right there so that's how you use
that as let's look at all of our data
again we want to look at some max min
and averages right now and the only
column here where it would be useful to
do it on is age but let's actually go
over and let's look at our salary table
and at our salary table we have some
really interesting salaries that I think
would be a little bit more useful for
this information so let's go over
to employee
salary all right and let's look at
this table really quick so we have our
salary now we want to look at the maximum
salary that is
in that column and that is going to be
$65,000 now let's say we wanted to know
what the minimum salary was let's
execute this and the person who makes
the least money is making
$36,000 now what's the average what is
the average salary for all employees
that's going to be about
$48,500 so super easy to use all of
these things they're extremely useful I
use them every single day so I know that
each of these are very very useful and
are definitely among the basics that you
have to know let's look at
everything really quick so we just
learned the select statement but
learning the from statement really
quick is also important up here this
actually shows us that we're already
hitting off the SQL tutorial database
but let's say we change it to master
when we try to run this it's going to
give us an error and that's because now
we're hitting off this database and this
database does not have this table in it
so in order to still
hit off that table while up here we're
actually hitting off a different database
we can change this information so in the
from statement you have to specify three
separate things the first thing that you
need to specify is the database so let's
say we want to hit off the SQL tutorial
database
now we want to select what table we're
going to do this is actually under the dbo schema so
let's put dbo there's a lot that
can go into that it's not worth
getting into now but dbo dot and let's do employee
salary when we execute this our
information comes up even though up here
we're still hitting off the master
database when we specify it right here
then we actually are choosing what
database and what table a hit off of and
so it does not matter what it is up here
so that's how you use the from statement
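
As a rough sketch of the two ideas in this video (aggregate functions in SELECT and a qualified table name in FROM), the queries can be tried from Python's built-in sqlite3 module. This assumes SQLite as a stand-in for SQL Server, so the schema name main replaces the Database.dbo.Table form shown in the video, and the salary rows are illustrative values chosen to agree with the maximum and minimum quoted above.

```python
import sqlite3

# Illustrative rows (an assumption), consistent with the figures quoted in the video.
salaries = [
    (1001, "Salesman", 45000), (1002, "Receptionist", 36000),
    (1003, "Salesman", 63000), (1004, "Accountant", 47000),
    (1005, "HR", 50000), (1006, "Regional Manager", 65000),
    (1007, "Supplier Relations", 41000), (1008, "Salesman", 48000),
    (1009, "Accountant", 42000),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeSalary (EmployeeID INT, JobTitle TEXT, Salary INT)")
conn.executemany("INSERT INTO EmployeeSalary VALUES (?, ?, ?)", salaries)

# Aggregate functions in the SELECT list.
max_salary, min_salary, avg_salary = conn.execute(
    "SELECT MAX(Salary), MIN(Salary), AVG(Salary) FROM EmployeeSalary"
).fetchone()

# Qualified table name: "main" is SQLite's name for the primary database,
# standing in here for the SQL Server form Database.dbo.Table.
row_count = conn.execute("SELECT COUNT(*) FROM main.EmployeeSalary").fetchone()[0]

print(max_salary, min_salary, round(avg_salary), row_count)
```

Qualifying the table name is what lets the query keep working no matter which database the connection currently points at.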
in the next video we're going to be
going over the WHERE statement and then
after that the GROUP BY and ORDER BY
statements and that will be the complete
basics of SQL tutorial and then we'll
start getting into a little bit more fun
stuff some more advanced concepts which
I think will be really exciting for
everybody to learn thank you guys so
much for joining me I really appreciate
it I hope this has been helpful if you like
this type of content subscribe below and
I'll see you in the next video thanks
and goodbye
what's going on everybody my name is
Alex freeberg and in this video we're
going to be going over the WHERE statement
in SQL in the very first video we
created our table inserted data into our
table in the second video we went over
the SELECT and the FROM statement and
now we are on to the WHERE statement now
what does the WHERE statement do it helps
limit the amount of data and specify
what data you want returned we have
quite a few concepts that we're going to
be covering today let's just start out
with something really easy let's
do where first name
equals Jim really simple so we're
selecting everything where our first
name equals Jim and this is our output
so really really simple now let's try
where it does not equal this right here
says does not equal Jim and let's
execute that and as you can see we have
everybody except Jim Halpert in there so
now let's look at the greater than or
less than so in this table I think the
one that we're going to look at is age
so let's look at age and let's do where
it's greater than
30 and when we execute that we're going
to get everyone who is over the age of
30 now as you can see we're not
including people who are 30 years old if
we want to include people who actually
are 30 years old we're going to add the
equal sign right there so we should be
seeing people who are now 30 so before
Pam and Jim were not in there and now
they are if we do the exact same thing
let's do less than 32
here's everyone that's going to be
included but if we want to include the
people who are 32 years old then we are just
going to add that equal sign and now the
people who are 32 years old like Toby
and Meredith are now
included if we want to go even further
we want people who are less than or
equal to
32 and who are male we can say and gender
equals
male so now we have two things that
we are specifying that we need we need
somebody whose age is less than or equal to 32 and
we need their gender to be male so let's
execute that and we have four people who
meet that criteria so that's what the
AND statement does if we write OR then
only one of these criteria has to be
correct in order for it to be met so if
we hit execute now we're saying anybody
whose age is under or equal to 32 or
their gender equals male so if we look
down here Michael Scott is actually 35
years old so he's over 32 but since he
is male he is now included let's get rid
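
The comparison operators and the AND / OR combinations described so far can be sketched as follows. This is a minimal example that assumes SQLite (through Python's sqlite3 module) in place of SQL Server; the rows are illustrative and only chosen to agree with the names, ages and counts mentioned in the video, and first_names is a hypothetical helper written just for this demo.

```python
import sqlite3

# Illustrative rows (an assumption), consistent with the names and ages in the video.
people = [
    (1001, "Jim", "Halpert", 30, "Male"), (1002, "Pam", "Beasley", 30, "Female"),
    (1003, "Dwight", "Schrute", 29, "Male"), (1004, "Angela", "Martin", 31, "Female"),
    (1005, "Toby", "Flenderson", 32, "Male"), (1006, "Michael", "Scott", 35, "Male"),
    (1007, "Meredith", "Palmer", 32, "Female"), (1008, "Stanley", "Hudson", 38, "Male"),
    (1009, "Kevin", "Malone", 31, "Male"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics "
             "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)")
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)", people)

def first_names(where_clause):
    """Return first names matching a WHERE clause (demo only: fixed strings, no user input)."""
    sql = f"SELECT FirstName FROM EmployeeDemographics WHERE {where_clause}"
    return sorted(row[0] for row in conn.execute(sql))

not_jim = first_names("FirstName <> 'Jim'")                     # does not equal
over_30 = first_names("Age > 30")                               # excludes the 30-year-olds
at_least_30 = first_names("Age >= 30")                          # includes them
young_and_male = first_names("Age <= 32 AND Gender = 'Male'")   # both must hold
young_or_male = first_names("Age <= 32 OR Gender = 'Male'")     # either may hold

print(young_and_male)
print(young_or_male)
```

With AND a row has to pass every condition, while with OR passing any one of them is enough, which is why Michael appears only in the second list.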
of everything really quick I want to
look at LIKE really quick so let's
execute just that and if you
highlight just that and hit execute then it
will only run what you have
highlighted so now let's look at this
whole table now when you're using LIKE
you typically are doing this for
sometimes numerical but most of the time
you're using it for text
information so if we're looking at this
right here if I'm looking at last names
and let's say I want everybody whose
last name starts with S you can't really
do that with anything else so I'm going
to say where it's LIKE and then I'm
going to say S and after that I'm going
to put a percent sign that's actually
called a wildcard and if I close that
off what this is saying is I want
every last name
where it starts with an S so let's
run this really quick now we have two
people whose last names start with S now
if I put a wildcard at the beginning we
are now saying where there's an S
anywhere in anybody's name so let's
execute this and see what we get so now
even if the S is like Flenderson towards
the end it still counts so you can
specify multiple things in here as well
so let's say I want it to start with S
that would return Schrute and Scott but now
I want something that also has an o in
it so it has an S at the beginning
and then somewhere in there there's an o
now let's execute that and there's only
one person that meets that criteria so
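
The three wildcard patterns described above can be sketched like this. It assumes SQLite via Python's sqlite3 module rather than SQL Server (SQLite's LIKE ignores case for ASCII letters by default), the last names are an illustrative list based on the names mentioned in the video, and like is a hypothetical helper for the demo.

```python
import sqlite3

# Illustrative last names (an assumption), based on the names mentioned in the video.
last_names = ["Halpert", "Beasley", "Schrute", "Martin", "Flenderson",
              "Scott", "Palmer", "Hudson", "Malone"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics (LastName TEXT)")
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?)",
                 [(name,) for name in last_names])

def like(pattern):
    """Return last names matching a LIKE pattern, sorted for stable output."""
    sql = "SELECT LastName FROM EmployeeDemographics WHERE LastName LIKE ?"
    return sorted(row[0] for row in conn.execute(sql, (pattern,)))

starts_with_s = like("S%")   # S at the beginning, anything after it
contains_s = like("%S%")     # S anywhere in the name
s_then_o = like("S%o%")      # starts with S and has an o somewhere later

print(starts_with_s, contains_s, s_then_o)
```

Each percent sign stands for any run of characters, including none, so the pieces of the pattern only have to appear in the order they are written.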
you can do that for multiple things you
can even say o t t
and let's execute that and he's still
going to be returned and if we put c at
the back it's not going to be returned
because it follows it in order so it isn't
s o t t c the c would actually need to
go over here so now we have s c o t t
and although there's a bunch of wildcards
in here it is going to return
Scott so that is a little
hint at how you can use LIKE there is a
little bit more that goes into it you
can use it for numerics there's a lot
of things that you can use this for but
this is just the basics how you can use
it today how you get started on using
LIKE in a nutshell that is how you use
LIKE and as I said before you can use
LIKE with numerical data as well but for
demonstration purposes I wanted to use
text data let's get rid of this really
quick um let's look at our entire
table and I wanted to show you how to
use null and not null I can't really
show you how to use null because I do
not have any null Fields I could easily
update this table and make a null but that's
in a future video where it's a little
bit more advanced where you can start
altering your data but just for purposes
of showing you what null and not null is
let's do where first
name is
null and if we see that is not going to
return anything but if we say is not
null it's going to return everything
because nothing in here is null nothing
in this first name column is null so
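
A small sketch of the two checks, assuming SQLite through Python's sqlite3 module and three illustrative rows in which every first name is filled in, as in the video's table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics (EmployeeID INT, FirstName TEXT)")
# Illustrative rows (an assumption); no FirstName is missing.
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?, ?)",
                 [(1001, "Jim"), (1002, "Pam"), (1003, "Dwight")])

null_rows = conn.execute(
    "SELECT * FROM EmployeeDemographics WHERE FirstName IS NULL").fetchall()
not_null_rows = conn.execute(
    "SELECT * FROM EmployeeDemographics WHERE FirstName IS NOT NULL").fetchall()

print(len(null_rows), len(not_null_rows))
```

Because nothing in the column is null, the first query comes back empty and the second returns every row.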
that's how you use it um there are a lot
of use cases where you actually will use
null and not null that will be in future
videos probably in the project section
or the portfolio section we weren't able
to show really how to use this super
well but just as a demonstration that's
really all it does it looks at the whole
column and whether it is null or not
null that's really all it's used for
this is actually super useful and you
can use it in a ton of situations but
again for demonstration purposes that's
really all it does so let's get rid of
this let's look at IN really quick so IN
is kind of like the equal statement but
it's multiple equal statements so let's
say we want to say where first name equals
Jim and then we were like wait we also
want to include Michael
Scott so then we would have to write or
first name equals and then we
would do Michael and then etc etc for
anybody that we wanted to include but if
we said IN we could do an open
parenthesis and then we can say
Jim we can say
Michael and we can say as many people as
we want going down the road just
separating it by commas and if we hit
execute everything would be returned so
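
To see that IN is shorthand for a chain of equality checks joined with OR, here is a minimal sketch. It assumes SQLite via Python's sqlite3 module and a handful of illustrative rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics (EmployeeID INT, FirstName TEXT)")
# Illustrative rows (an assumption).
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?, ?)",
                 [(1001, "Jim"), (1002, "Pam"), (1003, "Dwight"), (1006, "Michael")])

# The long form: one equality check per name, joined with OR.
with_or = conn.execute(
    "SELECT FirstName FROM EmployeeDemographics "
    "WHERE FirstName = 'Jim' OR FirstName = 'Michael' "
    "ORDER BY EmployeeID").fetchall()

# The condensed form: a comma-separated list inside IN (...).
with_in = conn.execute(
    "SELECT FirstName FROM EmployeeDemographics "
    "WHERE FirstName IN ('Jim', 'Michael') "
    "ORDER BY EmployeeID").fetchall()

print(with_or == with_in)
```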
it really is just a condensed way to say
equal for multiple
things so that is the WHERE statement I
think the WHERE statement can get
extremely complex but this really is
highlighting the basics so if you can
learn all of these concepts you will
absolutely have the basics down and will
be set to go over some more intermediate
and more advanced things with the WHERE
statement later on in the next video
we're going to be going over the GROUP
BY and the ORDER BY and then we are
done with the SQL basics and then you
can practice and work your way up into
my intermediate level videos which are
going to be coming out very shortly
after these videos thank you guys so
much for joining me if you like this
tutorial series be sure to subscribe
below and I'll see you in the next
video what's going on everybody my name is Alex
freeberg and in today's video we're
going to be going over the group by and
the order by statements in previous
videos we created tables we went over the
SELECT the FROM and the WHERE and now we
are at the very end of our SQL basic
series if you stayed with us for the
whole time hopefully you have learned a
lot and learned the basics of SQL in
future videos we're going to be going
over intermediate and even more advanced
concepts and even going through
portfolio projects that you can use to
put on your resume if you like this type
of content be sure to subscribe below
but let's get into it for today the
group by statement is similar to
distinct in the select statement in that
it's going to show the unique values in
a column the difference is if we say
distinct
gender what's going to be returned is
the very first unique value of female
and the very first unique value of
male but if we say
gender and we say GROUP BY
gender it's only going to return two
values but in these two values we
actually have all the males rolled up
into this one row and all the females
rolled up into this one row now let me
further show you what that means if I
say count of
gender now you can see that this whole
time there were six males in this one
row and there were three females in this
one row so with a distinct it really is
only showing us what value is in there
that's unique but with the GROUP BY it's
showing us what the unique value is but
it's also rolling them all up into one
row that we can use for other
things now real quick I want to be able
to see both of these at the same time so
let's just put this up here and let's
run this so we can actually see both now
let's add age to this statement down
here or this
query and let's only run this one and I
want to show you what happens and why it
happens we're now looking at gender age
and then the count of gender so if we
look down here we only have one male who
is 29 we have one female
that's age 30 and so on and so forth so
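
The difference between DISTINCT and GROUP BY, and the effect of grouping on two columns, can be sketched as below. This assumes SQLite through Python's sqlite3 module; the gender and age pairs are illustrative, picked to agree with the counts quoted in the video (six males, three females, no two people sharing both gender and age).

```python
import sqlite3

# Illustrative (Gender, Age) pairs (an assumption), consistent with the counts in the video.
people = [("Male", 30), ("Female", 30), ("Male", 29), ("Female", 31), ("Male", 32),
          ("Male", 35), ("Female", 32), ("Male", 38), ("Male", 31)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics (Gender TEXT, Age INT)")
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?, ?)", people)

# DISTINCT only lists the unique values.
distinct_genders = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT Gender FROM EmployeeDemographics"))

# GROUP BY rolls each value up into one row, so an aggregate can run per group.
by_gender = dict(conn.execute(
    "SELECT Gender, COUNT(Gender) FROM EmployeeDemographics GROUP BY Gender"))

# Grouping by two columns counts rows per (Gender, Age) combination.
by_gender_age = conn.execute(
    "SELECT Gender, Age, COUNT(Gender) FROM EmployeeDemographics "
    "GROUP BY Gender, Age").fetchall()

print(distinct_genders, by_gender, len(by_gender_age))
```

Every non-aggregated column in the SELECT list also appears in GROUP BY, while the COUNT column does not, since it is computed per group rather than read from the table.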
none of these people are both the same
gender and the same age if for example
we had two or three people who were male
and who were 30 years old then we would
have a two or a three over here so this
count is actually being counted at each
row that's being returned so for our
data that we have today this isn't a
fantastic example because it really split it
out there weren't any that were the same but as
you can see you can put multiple columns
as long as you put multiple down here
now why did we not have to put this
count gender down here in this group by
that's because this count gender is
actually a derived field or derived
column it's derived based off the gender
column so it's technically not a real
column that's in the table it's one that
we're creating that's fictional uh per
se so the age and the gender are actual
fields or actual columns that are in our
table they have to be down here and like
I said before it's the comparison to
that distinct in the select statement
because we're looking at the distinct of
gender and age so we're saying distinct
across multiple columns both gender and
age now as we had it before we were only
looking at gender it's going to roll all
of those up into just male and female
but if we want to add more we can easily
add more in this group by statement we
can still do things like where age is
greater than
31 we can still do those things so let's
execute this and our numbers are going
to change now we're doing it based off
gender and we're looking at the count of
people whose age is greater than 31
which is smaller than before now let's
look at ORDER BY I'll do it down here
really quick for demonstration but I am
eventually going to come up here and use
it because I think it'll be a little bit
better to completely round out this
query down here let me give this a name
let's do count of gender and then let's
come down here and let's order
by let's order by count
gender and when we run that it's going
to do 1 3 and that's because as a
default SQL has an ascending feature
which is going to be smallest to largest
going down if we want to change that we
can change it to descending that's going
to be largest to smallest so now we have
3 1 and if we want to do it based off
gender and we do it descending now we
have Z to
A and so that's going to be male female
and if we get rid of that it's going to
do the default
ascending and let's see what that brings
female male now for what we're trying to
do let's look at this large table so I
think it's going to be a little bit more
descriptive or a little bit better
visually let's do order by and let's do
age let's run this and it's going to
order smallest to
largest if we do
descending it's going to do largest to
smallest now you don't only have to do
just one thing you can do multiple
columns so if I wanted to do age and
then gender I can do that as well so
let's do
gender and let's run that so now we have
the age but under the age we also have
it ordered by gender and that's in
ascending order so a b c d e f so females
first so it's going to be female first
and then it's going to be male and again
female and male now we don't have to
just let it be ascending for each one if
I wanted to do it reverse in this column
I can do descending now let's run that
and when we have 30 now male is first
and female second and if I wanted to do
that over here I can do descending and
now we have them both descending so it's
going to go top to bottom and we have 32
it's going to be male 32 female so you
can specify lots of different things in
here and we don't actually have to use
column names we could just use numbers
so if I wanted to do 1 2 3 4 5 I could
but let's try to replicate the exact
same thing as before this would be column 1
2 3 4 so let's do 4
descending and then let's do
5 descending and if we execute that
it's going to give us the exact same
result as if we'd actually put in the
column name and I do use this a lot
oftentimes I don't use the column name
if it's a small table I'll just use
the number so in my actual queries I do
this a lot where I just use the number
instead of the column name so that is
the GROUP BY and the ORDER BY statement
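
The ORDER BY variations from this video can be sketched as follows: ordering an aliased count, ordering by two columns in descending order, and using column positions instead of names. This assumes SQLite via Python's sqlite3 module, and the rows are illustrative values chosen to agree with the names and ages mentioned in the video.

```python
import sqlite3

# Illustrative rows (an assumption), consistent with the names and ages in the video.
people = [
    (1001, "Jim", "Halpert", 30, "Male"), (1002, "Pam", "Beasley", 30, "Female"),
    (1003, "Dwight", "Schrute", 29, "Male"), (1004, "Angela", "Martin", 31, "Female"),
    (1005, "Toby", "Flenderson", 32, "Male"), (1006, "Michael", "Scott", 35, "Male"),
    (1007, "Meredith", "Palmer", 32, "Female"), (1008, "Stanley", "Hudson", 38, "Male"),
    (1009, "Kevin", "Malone", 31, "Male"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics "
             "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)")
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)", people)

# Order grouped counts by the alias; ascending is the default.
counts = conn.execute(
    "SELECT Gender, COUNT(Gender) AS CountGender FROM EmployeeDemographics "
    "WHERE Age > 31 GROUP BY Gender ORDER BY CountGender").fetchall()

# Two sort keys, each with its own direction.
by_name = conn.execute(
    "SELECT * FROM EmployeeDemographics ORDER BY Age DESC, Gender DESC").fetchall()

# The same ordering written with column positions (4 = Age, 5 = Gender).
by_position = conn.execute(
    "SELECT * FROM EmployeeDemographics ORDER BY 4 DESC, 5 DESC").fetchall()

print(counts)
print([row[1] for row in by_name])
```

Positions are convenient for quick queries on small tables, though they silently change meaning if the SELECT list is reordered.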
and if you have walked through my
previous videos you should be completely
done with the basics of SQL so
congratulations the next thing to do is
really just practice the basics because
the basics are what you're going to be
using day in day out and so what I would
recommend is create a few more tables
query those tables try to think of use
cases and what you would actually want
to know from that information after that
I would move on to my intermediate
videos if those are already out and then
I would move on to my Advanced videos
those are going to go over some more
challenging topics but things that would
be very useful for anybody to know in my
next video I'm going to be going over
intermediate SQL topics things like
joins and subqueries and a ton more so
if I already have posted those be sure
to go check those out on my page and if
I haven't I hope to have those up soon
thank you guys so much for
watching I really appreciate it if you
learned anything in this basics of
SQL series be sure to subscribe below
and I'll see you in the next video
what's going on everybody my name is
Alex freeberg and today we're going to
be starting our intermediate SQL series
if you joined us for our last series we
walked through the basics of SQL which
is everything you needed just to get
started and in this series we're going
to be walking through some intermediate
Concepts to really take your skills up
to the next level now today we're going
to be walking through joins but let me
show you what you can expect from the
entire series for this intermediate
course
so we're going to be walking through joins
today and then in future videos we're
walking through unions case statements
updating and deleting data partition by
data types aliasing creating views
having versus the group by statement the
get date function primary key versus
foreign key and then we're going to have
an advanced course and this is not set
in stone yet but these are some of the
things that I think I will be going
through or walking through we're going
through CTEs sys tables or system tables
subqueries temp tables string functions
regular expressions stored procedures and
then importing and exporting data so
with all that being said let's get into
it all right now let's get rid of me
because we do not need to be seeing me
for the rest of the series at the very
top here are some of the things that
we're going to be going through today
which are inner joins and then outer
joins and in the outer joins we have a
few different styles or a few different
types of outer joins now a join is a way
to combine multiple tables
into a single
output for now we're going to be using
the employee demographics and the
employee salary table so let's get a
look at both of these tables and see
what's in them in our employee
demographics table we have employee ID
first name last name age and gender and
then down here in our employee salary
table we have employee ID job title and
salary if you notice they have a similar
column and that's going to be the
employee ID now when you're doing a join
you have to do this based off a similar
column and typically you want it to be a
unique field so we're going to be using
the employee ID from both tables to join
these tables together to create one
output so let's get rid of this real
quick and let's start building our query
to join these two tables
together so the first thing we're going
to do is an inner join so let's do
select
everything and let's do it from SQL
tutorial
dot dbo dot employee
demographics and let's do join we can
also say inner join but join by default
is going to be an
inner join and we're going to do SQL tutorial
dot dbo dot employee
salary now we have to join them together
which is what we talked about earlier
and we're going to be doing that based
off the employee ID so for that we have
to say on and then we're going to
say employee
demographics dot employee ID is equal to
employee
salary dot employee ID so let's run this
real quick and take a look at the
output and let me pull this up real
quick
so what we are looking at is actually
both tables combined we have the
employee ID first name last name age
gender and then here's the salary
employee ID job title salary now an inner
join is really only going to show
everything that is the same so in both
tables there are employee IDs of
1001 all the way down to
1009 but if you notice there is data
that is missing real quick let's go down
to this graphic and let's look at this
inner join an inner join is going to
show everything that is common or
overlapping between table a and table B
so what we are looking at here is
exactly that we're only looking at the
things that are similar based off this
employee ID in both tables now let's
change this join to a full outer
join and let's run this and see what we
get now if you notice the output is very
different so let's take a look at it and
see why it's so different if you notice
everything down till here is the exact
same so employees 1001 down to 1009 are
exactly the same but once we get down to
row 10 it starts to get very different
now we are joining these tables based
off the employee ID so for example right
here Ryan Howard has an employee ID of
1011 but as you can see in this table for
salaries there is no 1011 employee ID so
it has nothing to link it to so because
of that it fills in everything as null
because it has nothing to match on this
table and vice versa in the employee
salary table there's a person in here
that's a Salesman and there's no
employee ID at all which means all this
information is going to be null and we
can see that in this diagram right here
so this is the full outer join right
here and what it is saying is we are
going to show everything from table a
and table B regardless of if it has a
match based on what we were joining them
on so even if table a has an employee ID
but there's no employee ID in table B
we're still going to show it and vice
versa so now let's look at a left outer
join a left outer join is going to take
the left table and say we want
everything from the left table and
everything that's overlapping but
if it's only in the right table we do
not want it now what is the left and the
right table the left table is going to
be our first table that we use our right
table is going to be the second table
that we use so we're going to look at
everything in the employee demographics
table regardless of whether or not it
has a match on the employee ID in the
employee salary table so this is what
that looks like so as you can see this
is our entire table for employee
demographics and down here we have three
that have information in the employee
demographics table but have absolutely
no information in any of the employee
salary table because there's nothing to
match it on so this 1011 is not in this
table this 1013 is not in this table and
this one does not even have an employee
ID so we're not going to have a match at
all and if we change that to the
right you'll see the exact opposite it's
going to show us everything in the
employee salary table so now we have all
of our information right here from the
employee salary table and if it doesn't
match in this table it's just going to
give nulls so down here we have 1010 and
obviously there's not going to be
anything associated with that because
there's no 1010 in the employee
demographics table and for this one we
have a Salesman with no employee ID and
since there's no employee ID to tie it
to this demographics table we're going
to have nothing and we can see that in
the diagram right here so for the left
outer join we're looking at everything
in table a which is our demographics
table and in our right outer join we're
looking at everything in table B which
is our salary table now let's pull this
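
The join types walked through above can be approximated with a small sketch. It assumes SQLite via Python's sqlite3 module and a cut-down pair of tables with illustrative rows; because older SQLite versions lack RIGHT and FULL OUTER JOIN, the right outer join is imitated here by swapping the table order in a LEFT JOIN, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics (EmployeeID INT, FirstName TEXT, LastName TEXT)")
conn.execute("CREATE TABLE EmployeeSalary (EmployeeID INT, JobTitle TEXT, Salary INT)")

# Illustrative rows (an assumption): two IDs match, the rest exist in only one table.
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?, ?, ?)", [
    (1001, "Jim", "Halpert"), (1002, "Pam", "Beasley"),
    (1011, "Ryan", "Howard"),     # no salary row for this ID
    (None, "Holly", "Flax"),      # no employee ID at all
])
conn.executemany("INSERT INTO EmployeeSalary VALUES (?, ?, ?)", [
    (1001, "Salesman", 45000), (1002, "Receptionist", 36000),
    (1010, None, 47000),          # no demographics row for this ID
    (None, "Salesman", 43000),    # no employee ID at all
])

ON = "ON d.EmployeeID = s.EmployeeID"

# Inner join: only IDs present in both tables.
inner = conn.execute(
    f"SELECT d.FirstName, s.Salary FROM EmployeeDemographics d "
    f"JOIN EmployeeSalary s {ON}").fetchall()

# Left outer join: every demographics row, with NULL where no salary matches.
left = conn.execute(
    f"SELECT d.FirstName, s.Salary FROM EmployeeDemographics d "
    f"LEFT JOIN EmployeeSalary s {ON}").fetchall()

# Swapping the table order makes EmployeeSalary the left table, which yields
# the rows a RIGHT OUTER JOIN would give in the original order.
right_equivalent = conn.execute(
    f"SELECT d.FirstName, s.Salary FROM EmployeeSalary s "
    f"LEFT JOIN EmployeeDemographics d {ON}").fetchall()

print(len(inner), len(left), len(right_equivalent))
```

Rows with a missing employee ID never match anything, because a comparison against NULL is not true, so they only show up in the outer joins.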
down a little bit so so far we've only
been using the select star so we've been
selecting everything and I only did that
just for demonstration purposes but you
most likely would not be doing this when
you actually use these joins what you're
probably going to want to do is Select
exactly what columns you want in your
output so for example let's do employee
ID let's do first
name last
name and let's do job
title and let's do
salary and let's try to run that really
quick and as you can see it is not going
to work now why is that not working it's
not working because we have two Fields
one in each of these tables and we have
to specify what employee ID we want
because that is going to drastically
change what our output is so we have an
employee ID in this table and in this
table which one do we want to use so for
this demonstration let's use employee
demographics dot employee ID and let's
actually just do an inner join because
it's easier for the
output now let's run this and see what
we get so as you can see we now have the
employee ID first name last name job
title and salary now we're doing this
with an inner join based off the employee ID
from the employee demographics table but
if we use the employee salary table it
should give us the exact same output and
that's because we're using an inner join and an
inner join is only going to show us
everything that overlaps between both
tables but now let's try a right outer
join and let's run this now we're using
this employee ID from our employee
salary table and since we're doing a
right outer join we're going to get all
the information from our employee salary
table and it does not have to be in our
left table which is our employee
demographics table so if you look at the
information down here this 1010 is in the
employee salary table but it's in this
position because that's what we're
looking at in our select statement and
then over here we have our salary and
since we have information right here
which is in our employee salary table
but there is no employee ID our
employee ID is null now let's change
this to look at the employee
demographics employee ID and execute it
as you can see that 1010 is gone now we
just have this information right down
here and we didn't have the employee ID
for either of these so it's going to
show it regardless and that's again
because we have a right outer join and
that's why we have no employee ID down
here now let's do a left
join and it's basically going to do the
opposite of what we just looked at now
we're looking at everything from our
left table regardless of if it's in our
right table and so our left table is our
employee demographics table and we are
looking at our employee demographics ID
so with the employee demographics ID
it's going to show us the first name and
the last name which is everything in our
left table our employee demographics
table and for these IDs or lack of
IDs it's just going to give us nulls in
all of these places if I change it right
up here to the employee salary employee
ID and I execute it because we're
showing everything from our left table
which is our employee demographics table
we are still going to see our names but
since we're using the employee ID from
our right table now we're just going to
have blanks in this information and this
information now let's look at a use case
for these joins let's say Robert
California is pressuring Michael Scott
to meet his quarterly quota and Michael
Scott is almost there he needs like a
thousand more dollars and he comes up
with the genius idea to deduct pay from
the highest paid employee at his Branch
besides himself so how does he go about
doing this and identifying the person
that makes the most money well of course
he's going to come to SQL first so we
actually want to look at a
full outer join real quick
and let's just look at
everything so here's what we have we
have the employee ID first name last
name age gender employee ID job title
and salary now what information do we
need to know to get the information that
Michael Scott needs well we need the
employee ID we want the first name and
last name so let's write all that real
quick so employee ID we need first name
we
need last name and then we're also going
to need the
salary because we need to know who is the
highest paid
employee so now let's do an inner join
because we really only want to look at
the employee IDs where we know what
their name is and their salary is and
let's do this based off the employee
demographics table it really doesn't matter
for an inner join but let's do that real
quick so let's look at this so we have
our employee ID we have our first name
our last name and our salary and we want
to do it where it's not Michael Scott
and that's because Michael doesn't want
to take away his own money he wants to
take away his employees' money so let's
do
where first name does not equal Michael
and he knows that he's the only one that
is named Michael so now we have our
list and let's do order
by and let's do
salary and let's execute
this and let's do descending so that we
can get it at the very
top and this is tough news for
Dwight Schrute because it looks like he is
the highest paid employee besides
Michael and so it looks like he is going
to get a cut in his pay this quarter so
that Michael can meet his quota so
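
Michael's query can be sketched end to end as follows. This assumes SQLite via Python's sqlite3 module, and the names and salary figures are illustrative values chosen to agree with the outcome described in the video.

```python
import sqlite3

# Illustrative rows (an assumption), consistent with the result described in the video.
people = [(1001, "Jim", "Halpert"), (1002, "Pam", "Beasley"), (1003, "Dwight", "Schrute"),
          (1004, "Angela", "Martin"), (1005, "Toby", "Flenderson"), (1006, "Michael", "Scott"),
          (1007, "Meredith", "Palmer"), (1008, "Stanley", "Hudson"), (1009, "Kevin", "Malone")]
salaries = [(1001, 45000), (1002, 36000), (1003, 63000), (1004, 47000), (1005, 50000),
            (1006, 65000), (1007, 41000), (1008, 48000), (1009, 42000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics (EmployeeID INT, FirstName TEXT, LastName TEXT)")
conn.execute("CREATE TABLE EmployeeSalary (EmployeeID INT, Salary INT)")
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?, ?, ?)", people)
conn.executemany("INSERT INTO EmployeeSalary VALUES (?, ?)", salaries)

# Inner join on EmployeeID, exclude Michael, highest salary first.
rows = conn.execute(
    "SELECT d.EmployeeID, d.FirstName, d.LastName, s.Salary "
    "FROM EmployeeDemographics d "
    "JOIN EmployeeSalary s ON d.EmployeeID = s.EmployeeID "
    "WHERE d.FirstName <> 'Michael' "
    "ORDER BY s.Salary DESC").fetchall()

highest_paid = rows[0]
print(highest_paid)
```

The EmployeeID column is prefixed with a table alias because it exists in both tables and would otherwise be ambiguous.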
that's just one use case let's look at
one more use case let's start out by
getting rid of this and looking at
everything
again so for our next use case Kevin
Malone who is an accountant thinks that
he may have made a mistake when looking
at the average salary for our salesmen
now Angela Martin is very good at SQL
and so what she is going to do is she
wants to go in and calculate the average
salary for our salesmen so let's try to
get that information so all we're going
to need is the job title and the salary
so let's come up here and let's get job
title and let's get
salary and let's look at
this and now we only want to look at
where the job title is equal to
salesman now the very last thing we want
to do is we want to say we want the
average of salary now since we're going
to need to do a group by we're going to
have to get rid of this
salary and just take job title right
down here and do group by job title so
we're going to have job title and then
the average
salary and there you go we have the
salesman and the average salary is
52,000 so Angela now knows to go back
and fix what Kevin made a mistake on so
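
Angela's calculation can be sketched like this, assuming SQLite via Python's sqlite3 module and a cut-down set of illustrative rows whose salesman salaries average to the figure quoted above. The salary row without an employee ID is included to show that the inner join leaves it out of the average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics (EmployeeID INT, FirstName TEXT)")
conn.execute("CREATE TABLE EmployeeSalary (EmployeeID INT, JobTitle TEXT, Salary INT)")

# Illustrative rows (an assumption), consistent with the average quoted in the video.
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?, ?)", [
    (1001, "Jim"), (1003, "Dwight"), (1008, "Stanley"), (1009, "Kevin"),
])
conn.executemany("INSERT INTO EmployeeSalary VALUES (?, ?, ?)", [
    (1001, "Salesman", 45000), (1003, "Salesman", 63000), (1008, "Salesman", 48000),
    (1009, "Accountant", 42000),
    (None, "Salesman", 43000),   # no employee ID, so the inner join drops it
])

# Filter to salesmen, then average the salary per job title.
result = conn.execute(
    "SELECT s.JobTitle, AVG(s.Salary) "
    "FROM EmployeeDemographics d "
    "JOIN EmployeeSalary s ON d.EmployeeID = s.EmployeeID "
    "WHERE s.JobTitle = 'Salesman' "
    "GROUP BY s.JobTitle").fetchall()

print(result)
```

The plain Salary column has to leave the SELECT list once AVG is used, because only grouped columns and aggregates can appear alongside GROUP BY.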
and fix what Kevin made a mistake on so that's how you use joins I will includ
that's how you use joins I will includ include this image in the description so
include this image in the description so you can go and look that up yourself if
you can go and look that up yourself if you are curious and want to look at that
you are curious and want to look at that that really helped me out when I was
that really helped me out when I was first getting started to kind of
first getting started to kind of conceptualize and understand what kind
conceptualize and understand what kind of data I was pulling based on what join
of data I was pulling based on what join I was using so I hope that was useful to
I was using so I hope that was useful to you as well in the very next video we're
you as well in the very next video we're going to be looking at the union so if
going to be looking at the union so if that is posted be sure to check that out
that is posted be sure to check that out next thank you guys so much for joining
next thank you guys so much for joining me I really appreciate it if you like
me I really appreciate it if you like this type of content or got anything out
this type of content or got anything out of it today be sure to smash the like
of it today be sure to smash the like button smash the Subscribe button and
button smash the Subscribe button and I'll see see in the next video what's
what's going on everybody my name is
Alex Freberg and in today's video we're
going to be looking at unions now in the
very last video we walked through joins
and I thought it was appropriate to look
at unions next because unions and joins
are somewhat similar or closely related
and that's because in both instances
they're combining two tables to create
one output now what's the difference the
difference is that a join combines both
tables based off a common column and in
the last video that was the employee ID
so in both tables we had an employee ID
and when you're selecting your data you
have to choose either to only select one
employee ID or you can choose both
employee IDs but they're in separate
columns and with a union you're actually
able to select all the data from both
tables and put it into one output where
all the data is in each column and not
separated out and you don't have to
choose which table you're choosing it
from now that may not have made 100%
sense but let's look at it real quick in
stages
so let's go down here and let's actually
join these tables together and see what
we get now the two tables that we're
looking at are employee demographics and
warehouse employee demographics so over
here we have our employee demographics
information and then over here or
actually down here we have our warehouse
employee demographics now right now I'm
doing a full outer join so we're looking
at all the data and if we were to pull
this into an Excel spreadsheet we could
just copy this and paste it over here
and we would be good to go and that's
because we have all the same columns
first name last name age gender first
name last name age gender but if we
tried to combine this in a query where
we have this information right here it
wouldn't work we cannot get it in the
same column and that's where a union
comes into play
so let's go back up here and let's
actually run both of these now as you
can see they have the exact same columns
and that makes it super easy for what
we're about to do all we're going to do
is between these two queries which are
completely separate right now all we're
going to do is write Union so let's run
just this now because of the Union you
can look down here and the information
that used to be in the other table which
was in separate columns is now added
down below in the exact same order now
Darryl Philbin was actually in both
tables and the reason he isn't showing
up multiple times is because this Union
is actually taking out and removing the
duplicates kind of like a distinct
statement now there's actually another
thing called Union all and if we do
Union all it is going to show us all of
the information regardless if it is a
duplicate or not so let's run that real
quick and they are both there but let's
order by and let's do employee ID so now
let's run it and as you can see right
here these are exact duplicates and so
the union got rid of it because they
were the exact same but the union all
kept it in because it is showing just
the data as is
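To make the difference concrete, here is a small sketch of Union versus Union all. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for the SQL Server setup used in the video, and the rows and the function name `combine` are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EmployeeDemographics (
        EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Age INTEGER, Gender TEXT);
    CREATE TABLE WareHouseEmployeeDemographics (
        EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Age INTEGER, Gender TEXT);
    INSERT INTO EmployeeDemographics VALUES
        (1001, 'Jim', 'Halpert', 30, 'Male'),
        (1013, 'Darryl', 'Philbin', 35, 'Male');
    INSERT INTO WareHouseEmployeeDemographics VALUES
        (1013, 'Darryl', 'Philbin', 35, 'Male'),
        (1050, 'Roy', 'Anderson', 31, 'Male');
""")

def combine(operator):
    """Stack both tables with the given set operator, sorted by EmployeeID."""
    query = f"""
        SELECT * FROM EmployeeDemographics
        {operator}
        SELECT * FROM WareHouseEmployeeDemographics
        ORDER BY EmployeeID
    """
    return conn.execute(query).fetchall()

union_rows = combine("UNION")          # exact duplicates removed, like DISTINCT
union_all_rows = combine("UNION ALL")  # every row kept as is

print(len(union_rows), "rows with UNION")
print(len(union_all_rows), "rows with UNION ALL")
```

The only row that differs between the two results is the one that appears identically in both tables.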
now let's get rid of this Union all
because the only reason why it works so
well is because those two tables were
the exact same they were employee ID
first name last name age gender so
they're basically the same tables just
with different information so it made it
really easy but we have another table
employee salary and let's look at these
two tables so these two tables are
obviously very different they hold
different information now we would still
be able to combine them so let's do
employee ID first name and let's do age
now down here on the employee salary
table we will do employee ID job title
and salary now let's use a union really
quick and run this one and it is still
going to work now why does this work
well first off the reason it's working
is because these data types are the
exact same or at least similar so text
and text age which is an integer salary
which is an integer it has the same
amount of columns so three and three so
we have employee ID first name and age
and it's taking that from the first
select statement and it's still using a
union to take the data from the second
select statement so it's still inserting
this information now this is not what
you want to do because right here we
have first name and it's salesman
salesman and then our age we have 30
45,000 and 45,000 is obviously not an
age so you want to be careful when
you're using a union to combine two
separate tables and make sure that the
data you're selecting is the same
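A rough sketch of that pitfall, again assuming SQLite in place of SQL Server and a single invented employee: the union lines columns up by position, so the output keeps the first select's column names even when the second select supplies unrelated data. Note that SQLite is more forgiving about mixing data types than SQL Server, so this only illustrates the matching-by-position part.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; one invented employee per table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, Age INTEGER);
    CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
    INSERT INTO EmployeeDemographics VALUES (1001, 'Jim', 30);
    INSERT INTO EmployeeSalary VALUES (1001, 'Salesman', 45000);
""")

# Three columns in each select, matched by position rather than by meaning.
cursor = conn.execute("""
    SELECT EmployeeID, FirstName, Age FROM EmployeeDemographics
    UNION
    SELECT EmployeeID, JobTitle, Salary FROM EmployeeSalary
    ORDER BY 1, 3
""")
column_names = [column[0] for column in cursor.description]
mixed_rows = cursor.fetchall()

print(column_names)
for row in mixed_rows:
    print(row)  # the job title lands under FirstName, the salary under Age
```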
in the very next video we're going to be
walking through case statements thank
you guys so much for joining me I really
appreciate it if you like this type of
content be sure to subscribe below and
I'll see you in the next video
what is going on everybody my name is
Alex Freberg and today we're going to be
walking through case statements in SQL a
case statement allows you to specify a
condition and then it also allows you to
specify what you want returned when that
condition is met so we're going to be
using this employee demographics table
that we're looking at right here we're
going to walk through the syntax of how
to create a case statement and then
we're going to actually go into some use
cases at the end so let's start off by
specifying what columns we want let's
say we want the first name we want the
last name and we want the age now let's
just get that information now for our
case statement we're going to be using
this age column so we actually want the
age to be in there so let's specify
where age is not null and run that so
now we have a pretty good look at it and
let's just order by age just to clean it
up a little bit
so now let's start building our case
statement so we're going to say case and
then we want to say when now we need to
specify what condition we want to look
for so let's do when age is greater than
30 then then what do we want to be
returned so we want to return that they
are old else so that means anything that
is not over the age of 30 we want to
return young and then you need to
specify that you're done with the case
statement and so you will write end at
the very bottom so this is our first
case statement let's run it and see what
we get so as you can see a new column
was created and if the person is over
the age of 30 so 31 and up they are
given old and if they're not over the
age of 30 they are given young now we
can do as many when and then statements
as we want so if we want to we can also
do when the age is between 27 and 30
then we want to return young and anyone
else we're going to call a baby so now
we have Ryan Howard as the baby anyone
between 27 and 30 they're considered
young and anyone over the age of 30 is
old
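Here is a minimal sketch of that case statement with its two when branches and the else. It assumes SQLite as a stand-in for SQL Server; the sample rows and the column alias `AgeGroup` are illustrative, not taken from the video.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; rows and the AgeGroup alias are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics (FirstName TEXT, LastName TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?)",
    [("Stanley", "Hudson", 38), ("Jim", "Halpert", 30),
     ("Ryan", "Howard", 26), ("Holly", "Flax", None)],
)

labelled = conn.execute("""
    SELECT FirstName, LastName, Age,
           CASE
               WHEN Age > 30 THEN 'Old'
               WHEN Age BETWEEN 27 AND 30 THEN 'Young'
               ELSE 'Baby'
           END AS AgeGroup
    FROM EmployeeDemographics
    WHERE Age IS NOT NULL   -- skip rows that have no age to label
    ORDER BY Age
""").fetchall()

for row in labelled:
    print(row)
```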
now something to note is that the very
first condition that is met is going to
be returned so if there are multiple
conditions that meet the criteria only
the very first one is going to be
returned and let's demonstrate that real
quick so if the age equals 38 then
return Stanley because that is Stanley
and let's execute this real quick so
right here I'm specifying that if it's
38 it should return Stanley but he is
right here and it still says old and
that's because this condition was
already met now if we were to put this
right here it should work correctly and
let's try it out so now because this
condition is met first it is going to
return Stanley down here
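The ordering rule can be checked with a quick sketch like the one below. It assumes SQLite in place of SQL Server, and the helper name `label_for_stanley` is made up for this example.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; label_for_stanley is a made-up helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeDemographics (FirstName TEXT, LastName TEXT, Age INTEGER)")
conn.execute("INSERT INTO EmployeeDemographics VALUES ('Stanley', 'Hudson', 38)")

def label_for_stanley(case_expression):
    """Evaluate a CASE expression against Stanley's row and return the label."""
    query = f"SELECT {case_expression} FROM EmployeeDemographics WHERE FirstName = 'Stanley'"
    return conn.execute(query).fetchone()[0]

# The broad condition comes first, so the Age = 38 branch is never reached.
specific_last = label_for_stanley("""
    CASE WHEN Age > 30 THEN 'Old'
         WHEN Age = 38 THEN 'Stanley'
         ELSE 'Baby' END""")

# The specific condition comes first, so it wins.
specific_first = label_for_stanley("""
    CASE WHEN Age = 38 THEN 'Stanley'
         WHEN Age > 30 THEN 'Old'
         ELSE 'Baby' END""")

print(specific_last, specific_first)
```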
so now let's get into our first use case
let's start off by copying this and then
commenting it out I only did that
because I don't want to rewrite it
because I'm lazy let's get rid of that
and let's look at this real quick we are
going to join on another table that we
have really fast that's going to be SQL
tutorial if you watched my other videos
then you know this table and we're going
to do that on employee
demographics.employee ID is equal to
employee salary.employee ID okay so
let's just look at everything in these
tables really quick now we are going to
be focusing on the job title and the
salary column but we want their first
name and last name as well so let's
start building that out let's do first
name last name job title and salary and
let's look at this really quick
so now we have our employees and here is
the situation we had a fantastic year
this year selling paper and corporate
has allowed Michael Scott to give out a
yearly raise to every single employee
but not every employee is going to get
the same raise because our salesmen are
genuinely the people who made us our
money and they're going to get the
biggest raises while other people really
aren't going to get that big of a raise
so now let's go through and create a
case statement to calculate what their
salary will be after they get their
raise so let's start off by saying case
and when and we want it to say when job
title is equal to salesman so when they
are a salesman what do we want to happen
so this is where the calculation occurs
so we're going to take their salary and
then we're going to add their salary
times how much their raise is going to
be so the salesmen did really really
well and we want to give them a 10%
raise this year now when their job title
is equal to accountant then we'll take
their salary we will give them let's
give them a 5% raise still very generous
there we go and when the job title is
equal to HR then it's going to be the
salary plus the salary times and then
we're going to do 01 all right and else
we are just going to do salary plus
salary oops let's do parentheses times
and let's just give everyone else a 3%
raise and then we'll write end
now let's take a look at our results so
here's what we have so far we have our
first name our last name our job title
and our salary that is our current
salary and then we're going to have our
salary after we get our raise so I'm
going to actually write that up here so
let's do as salary after raise and let's
execute that so let's look at these
raises really quick so we have 45,000
and since he is a salesman he gets a 10%
raise which is a raise of $4,500 so
45,000 plus $4,500 is $49,500 and as you
can see down here we have HR who is
making $50,000 and now he is making
$5,000 5 so everybody got a raise
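As a sketch of the raise calculation, the query below joins the two tables and computes the new salary with a case statement. It assumes SQLite in place of SQL Server and uses three invented rows. The 10%, 5% and 3% rates come from the walkthrough; the HR rate is not clear in the transcript, so the 0.01 used here is only a placeholder assumption, and no HR row is included in the sample data.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the three employees are sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, LastName TEXT);
    CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
    INSERT INTO EmployeeDemographics VALUES
        (1001, 'Jim', 'Halpert'), (1004, 'Angela', 'Martin'), (1006, 'Michael', 'Scott');
    INSERT INTO EmployeeSalary VALUES
        (1001, 'Salesman', 45000), (1004, 'Accountant', 47000),
        (1006, 'Regional Manager', 65000);
""")

raises = conn.execute("""
    SELECT FirstName, LastName, JobTitle, Salary,
           CASE
               WHEN JobTitle = 'Salesman'   THEN Salary + (Salary * 0.10)
               WHEN JobTitle = 'Accountant' THEN Salary + (Salary * 0.05)
               WHEN JobTitle = 'HR'         THEN Salary + (Salary * 0.01)  -- placeholder rate (assumed)
               ELSE Salary + (Salary * 0.03)
           END AS SalaryAfterRaise
    FROM EmployeeDemographics
    JOIN EmployeeSalary
        ON EmployeeDemographics.EmployeeID = EmployeeSalary.EmployeeID
    ORDER BY EmployeeDemographics.EmployeeID
""").fetchall()

for row in raises:
    print(row)
```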
so that is our case statement I hope
that was helpful I find myself using the
case statement a lot when I'm wanting to
categorize things or label things and
that's kind of what we did in the first
example and you can even do calculations
like we did in this use case so I hope
that was helpful thank you guys so much
for watching I really appreciate it if
you learned anything from this video be
sure to like and subscribe below and
I'll see you in the next video
what is going on everybody my name is
Alex Freberg and today we're going to be
looking at the having clause now the
having clause I feel is a little bit
unappreciated in the SQL community I
feel like it doesn't get a lot of love
and so today I want to describe how to
use it and what it's used for so before
we use the having clause I want to set
up our query here we want to use an
aggregate function and the group by
statement and then I will show you how
to use this having clause so let's look
at the job title and let's look at the
count of job titles and then down here
we need to do group by job title and
let's execute this and here are our job
titles and here's the count of how many
people have those job titles
so now let's say we want to look at all
the jobs that have more than one person
in that specific job so let's do where
the count of job title is greater than
one and let's run that and as you can
see we're going to get this message
right here now let's read it an
aggregate may not appear in the where
clause unless it is in a subquery
contained in a having clause or a select
list and the column being aggregated is
an outer reference what that is
basically saying is we cannot use this
aggregate function in the where
statement we need to use a having clause
so let's get rid of this and let's say
having the count of job title greater
than one I did the same thing again and
let's execute this and we're still going
to get an error now why are we getting
that error the reason is because this
having statement is completely dependent
on the group by statement because we are
performing this after it has been
aggregated so this having statement
actually needs to go after the group by
statement because we can't look at the
aggregated information before it's
actually aggregated in that group by
statement so now let's run this and it
worked perfectly so now we only have the
jobs that have more than one employee
for that job title
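A small sketch of the same experiment: the aggregate is rejected in the where clause and accepted in a having clause placed after the group by. It assumes SQLite in place of SQL Server, so the error text printed will differ from the SQL Server message read out above, and the rows are sample data.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER)")
conn.executemany("INSERT INTO EmployeeSalary VALUES (?, ?, ?)", [
    (1001, "Salesman", 45000), (1003, "Salesman", 63000), (1008, "Salesman", 48000),
    (1004, "Accountant", 47000), (1009, "Accountant", 42000),
    (1005, "HR", 50000), (1006, "Regional Manager", 65000),
])

# Filtering on an aggregate in WHERE is rejected by the database.
try:
    conn.execute("""
        SELECT JobTitle, COUNT(JobTitle) FROM EmployeeSalary
        WHERE COUNT(JobTitle) > 1
        GROUP BY JobTitle
    """)
    where_rejected = False
except sqlite3.OperationalError as error:
    where_rejected = True
    print("WHERE with an aggregate failed:", error)

# HAVING runs after GROUP BY, so the aggregated count is available to filter on.
shared_titles = conn.execute("""
    SELECT JobTitle, COUNT(JobTitle) FROM EmployeeSalary
    GROUP BY JobTitle
    HAVING COUNT(JobTitle) > 1
    ORDER BY JobTitle
""").fetchall()

print(shared_titles)
```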
so now let's look at one more example
let's do the average let's say salary
and let's get rid of this having clause
real quick just to look at this
information let's do order by and we'll
do average salary so let's look at this
and we have 36,000 to 65,000 so in the
middle we got 44,500 so let's use this
having statement and let's say the
average of salary where it is greater
than 45,000 and we actually need to put
this right here right after the group by
and before the order by so let's run
this and see what we get and it worked
perfectly so now we're looking at the
job titles that have an average salary
of over $45,000
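The clause order described here (group by, then having, then order by) might look like the sketch below. It assumes SQLite in place of SQL Server, with sample salaries chosen so the averages run from 36,000 to 65,000 as in the walkthrough.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; salaries are illustrative sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER)")
conn.executemany("INSERT INTO EmployeeSalary VALUES (?, ?, ?)", [
    (1001, "Salesman", 45000), (1002, "Receptionist", 36000), (1003, "Salesman", 63000),
    (1004, "Accountant", 47000), (1005, "HR", 50000), (1006, "Regional Manager", 65000),
    (1007, "Supplier Relations", 41000), (1008, "Salesman", 48000), (1009, "Accountant", 42000),
])

# HAVING sits after GROUP BY and before ORDER BY.
well_paid = conn.execute("""
    SELECT JobTitle, AVG(Salary)
    FROM EmployeeSalary
    GROUP BY JobTitle
    HAVING AVG(Salary) > 45000
    ORDER BY AVG(Salary)
""").fetchall()

for job_title, average_salary in well_paid:
    print(job_title, average_salary)
```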
so there you go that is the having
clause definitely one that is good to
know and is very useful in specific
situations thank you guys so much for
watching I really appreciate it if you
like this video or learned anything
today be sure to subscribe below and
I'll see you in the next video
what is going on everybody my name is
Alex Freberg and today we're going to be
looking at updating and deleting data in
a table now what's the difference
between inserting data into a table and
updating data insert into is going to
create a new row in your table while
updating is going to alter a
pre-existing row while deleting is going
to specify what rows you want to remove
from your table
so let's get going with the updating so
down here Holly Flax does not have an
employee ID age or gender now we want to
update this table to give her that
information so let's do update now we
need to specify what table we are going
to be hitting off of so let's do
SQLTutorial.dbo.EmployeeDemographics so
now we're going to use something called
set and set is going to specify what
column and what value you actually want
to insert into that cell so let's set
her employee ID equal to and it's going
to be 1012 and we have to specify which
one to do this to because if we ran just
this it is going to set every single
employee ID to 1012 because we haven't
specified that we only want Holly Flax's
row to be updated so now we have to
specify where first name is equal to
Holly and last name is equal to Flax so
now let's run this and see what we get
so one row has been affected let's see
what we got and there we go as you can
see the employee ID was updated exactly
how we specified it right here
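A minimal sketch of an update limited by a where clause, assuming SQLite in place of SQL Server (so the table name is not prefixed with a database and schema) and two invented rows:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the two rows are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EmployeeDemographics (
        EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Age INTEGER, Gender TEXT)
""")
conn.executemany("INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)", [
    (1001, "Jim", "Halpert", 30, "Male"),
    (None, "Holly", "Flax", None, None),   # the row with the missing values
])

# Without the WHERE clause every row would get EmployeeID 1012.
cursor = conn.execute("""
    UPDATE EmployeeDemographics
    SET EmployeeID = 1012
    WHERE FirstName = 'Holly' AND LastName = 'Flax'
""")
rows_affected = cursor.rowcount

ids_by_name = dict(conn.execute(
    "SELECT FirstName, EmployeeID FROM EmployeeDemographics").fetchall())
print(rows_affected, ids_by_name)
```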
so we also want to update age and gender
and let's do that in the same query so
let's set the age equal to 31 and
instead of using and we actually need to
use a comma so let's say age equal to 31
comma gender is going to be equal to
female and let's run this and see what
we get there you go now let's look at
our table and as you can see it was
updated to 31 and female so very easy
very easy to specify what you want
oftentimes tables like this will have a
unique key like employee ID is our
unique key in this table so I could
easily just say where the employee ID is
equal to and then you know 1012 so it's
an easy way to specify what employee
you're trying to update
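Setting several columns in one update uses commas between the assignments, and the unique key makes the where clause short. A sketch under the same assumptions (SQLite standing in for SQL Server, one invented row):

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; one sample row with missing values.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EmployeeDemographics (
        EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Age INTEGER, Gender TEXT)
""")
conn.execute("INSERT INTO EmployeeDemographics VALUES (1012, 'Holly', 'Flax', NULL, NULL)")

# Column assignments are separated by commas, not AND; the key picks the row.
conn.execute("""
    UPDATE EmployeeDemographics
    SET Age = 31, Gender = 'Female'
    WHERE EmployeeID = 1012
""")

holly = conn.execute(
    "SELECT * FROM EmployeeDemographics WHERE EmployeeID = 1012").fetchone()
print(holly)
```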
so now let's look at the delete
statement the delete statement is going
to remove an entire row from our table
so let's do delete and we actually need
to say from and we have to specify what
table we want to be removing this
information from so let's do
SQLTutorial.dbo.EmployeeDemographics and
now we need to specify what row we want
to remove so let's do where employee ID
is equal to and let's choose a
completely random employee ID 1005 so
let's run this and see what happens so
one row is affected let's look at our
table and as you can see 1005 is now
gone now you have to be very careful
when you use the delete statement
because once you run it you cannot get
that data back there's no way to reverse
a delete statement so if I had gotten
rid of this where statement and I ran
this it would delete everything from the
entire table and you could not get that
data back
could not get that data back so a little trick that I use before I actually run a
trick that I use before I actually run a delete statement is I make it a select
delete statement is I make it a select statement because you're going to select
statement because you're going to select everything where the employee ID is
everything where the employee ID is equal to let's just do
equal to let's just do 1,4 and now when you run this you are
1,4 and now when you run this you are going to see exactly what you will be
going to see exactly what you will be deleting and now we know that Angela
deleting and now we know that Angela Martin that entire row is going to be
Martin that entire row is going to be gone if I hadn't done that and I just
gone if I hadn't done that and I just went like this and I wrote delete and I
went like this and I wrote delete and I only had this running I would not know
only had this running I would not know that this information is going to be the
that this information is going to be the only one that's gone maybe I made a
only one that's gone maybe I made a mistake down here maybe I accidentally
mistake down here maybe I accidentally put something in there that wasn't
put something in there that wasn't supposed to be in there and now I'm
supposed to be in there and now I'm deleting much more than I thought I was
deleting much more than I thought I was actually going to
actually going to delete so using the select statement can
delete so using the select statement can be a very good Safeguard against
be a very good Safeguard against accidentally deleting data that you do
accidentally deleting data that you do not want to delete so that is update and
not want to delete so that is update and delete thank you guys so much for
delete thank you guys so much for watching I really appreciate it if you
watching I really appreciate it if you like this video be sure to subscribe
like this video be sure to subscribe below and I'll see you in the next video
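The SELECT-before-DELETE safeguard from this video can be sketched roughly as follows. This is an illustration using Python's built-in sqlite3 module instead of SQL Server; the sample rows and the "exactly one row" check are assumptions added for the example, not something shown in the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics ("
    "EmployeeID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, "
    "Age INTEGER, Gender TEXT)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1003, "Dwight", "Schrute", 29, "Male"),
        (1004, "Angela", "Martin", 31, "Female"),
        (1005, "Toby", "Flenderson", 32, "Male"),
    ],
)

# Write the filter once so the preview and the DELETE cannot drift apart.
where_clause = "WHERE EmployeeID = ?"
params = (1004,)

# Step 1: run the filter as a SELECT to see exactly which rows would be removed.
preview = conn.execute(
    "SELECT * FROM EmployeeDemographics " + where_clause, params
).fetchall()
print("Would delete:", preview)

# Step 2: run the DELETE only if the preview matched expectations (one row here).
if len(preview) == 1:
    conn.execute("DELETE FROM EmployeeDemographics " + where_clause, params)

remaining_ids = [
    row[0]
    for row in conn.execute(
        "SELECT EmployeeID FROM EmployeeDemographics ORDER BY EmployeeID"
    )
]
print("Remaining IDs:", remaining_ids)
```

Reusing the same WHERE clause for both statements is the point of the trick: whatever the SELECT shows is what the DELETE will remove.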

What's going on, everybody? My name is Alex Freberg, and today we're going to be talking about
aliasing. Now, all aliasing really is is temporarily changing the column name or the table name
in your script, and it's not really going to impact your output at all. Aliasing is really used
for the readability of your script, so that if you hand this off to somebody, or somebody comes
behind you and starts working on this, they can more easily understand it. It may not sound
super useful, especially for small scripts like what we have on the screen, but when you start
getting to larger scripts where you have six, seven, or eight joins and you're selecting 10
different column names, it actually is very useful and very important. So let's get into how
that actually works, and then I'll have an example later of how we can use aliasing with a
little bit of a larger query. So in this table, let's
select first name and execute. What we want to do is just write AS, and let's do Fname. All
that's going to do is rename this column from FirstName, which it was originally named, to
Fname. Now, you can use AS, but you can also just get rid of that and do it exactly how I have
it, and it's still going to work perfectly. You can either use the AS or not use it. I typically
don't; I just put a space in between the actual column and the alias.

Now let's look at an example of how this might actually be useful. We have a first name and a
last name in this table, so what we're going to do is actually combine those. Let's do plus, and
let's add a space in there, and let's do a plus, and let's do last name. So this is going to
take the first name, add a space, and then do the last name, and we're going to do that AS, and
let's do FullName. And let's execute this. So now we have a column called FullName, which is our
alias. We've combined the first name and the last name columns into one single column, and we've
renamed it FullName. If we had not used this alias at all, it would have just said this, which
is no column name at all. We don't typically want that. When we have an output, we want to give
this column a name so that somebody who's actually looking at the script, or who's looking at
the output of the script, actually understands what is contained within this column. So for
that, we're just going to keep it as FullName.
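Here is a small sketch of the column aliases described above, run through Python's built-in sqlite3 module. Two things are assumptions made to keep it runnable: the sample rows, and the `||` concatenation operator, which SQLite uses where the SQL Server query in the video uses `+`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics ("
    "EmployeeID INTEGER, FirstName TEXT, LastName TEXT)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?)",
    [(1001, "Jim", "Halpert"), (1002, "Pam", "Beasley")],
)


def column_names(cursor):
    """Return the output column names of an executed SELECT."""
    return [col[0] for col in cursor.description]


# The alias works with or without the AS keyword.
names_with_as = column_names(
    conn.execute("SELECT FirstName AS Fname FROM EmployeeDemographics")
)
names_without_as = column_names(
    conn.execute("SELECT FirstName Fname FROM EmployeeDemographics")
)

# Combine two columns and give the result a readable name.
full_name_cursor = conn.execute(
    "SELECT FirstName || ' ' || LastName AS FullName "
    "FROM EmployeeDemographics ORDER BY EmployeeID"
)
full_name_columns = column_names(full_name_cursor)
full_name_rows = full_name_cursor.fetchall()

print(names_with_as, names_without_as, full_name_columns, full_name_rows)
```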
Another time that you're often going to use aliasing in the SELECT statement is when you're
using aggregate functions. In this table we have age, so let's pull that up really quick. So we
have age right here, and let's actually just do the average age. When we execute this, we're
going to get no column name and 31. So what we want to do is give it AvgAge, and when we do
that, we now have a column name. And again, you want to have a column name in case someone comes
up behind you and is reading the script, so that they understand what this column is being used
for. Now that we've looked at
aliasing column names, let's look at aliasing table names. It basically is the exact same thing.
We're just going to write AS, and let's do Demo for demographics. And let's do Demo dot, and
it's going to give us all of our options, and we'll do EmployeeID. So when you alias a table
name, when you are selecting in the SELECT statement you actually need to preface your column
name with the table name or the table alias, dot, and then EmployeeID. This is extremely
important to do, especially when you have a lot of joins, or you're selecting a lot of columns
when you have several joins, because it can get very, very messy quick.

So let's actually join this to EmployeeSalary, and let's do that on Demo.EmployeeID is equal to
Sal.EmployeeID. So now let's do Demo.EmployeeID, comma, Sal dot, and let's do Salary. Looking
at the script now, it is very clean and very easy to understand, and that is what's so important
with aliasing. If, for example, we took this off, every time we wanted to reference this table
we would have to put the entire table name. Putting the entire table name is correct; it just
is very cumbersome and does not look clean at all. So using something like Demo as an alias
makes it a lot more easily readable and a lot more manageable when you're looking at it, when
you have a very long script. Let's look at this
query where we're joining together three separate tables, and after each table we have an alias.
For EmployeeDemographics we have a, for EmployeeSalary we have b, and for
WareHouseEmployeeDemographics we have c. Now, unfortunately, I have seen a lot of scripts that
look exactly like this, and this is what you do not want to do. You do not want to use your
aliasing to just write an a, a b, or a c. That is very frowned upon when writing queries,
because it really doesn't give any context about what the table you're referencing is. It gets
really confusing as this query continues to grow, and as you add more columns to your SELECT
statement it becomes more difficult to understand where those columns are coming from. So when
I'm reading that, I say, "SELECT a.EmployeeID. Okay, what's a? a is EmployeeDemographics." So
you really do not want to do that.
Let's look at an example of what it should look like. For EmployeeDemographics, instead of
having an alias of a, I used Demo for demographics. For EmployeeSalary I used Sal, and for
WareHouseEmployeeDemographics I used Ware. Now, this is not perfect by any means, but in the
SELECT statement, if you're just glancing at it, you can easily understand which columns are
coming from which tables. So when I look at EmployeeID, I know that's coming from
EmployeeDemographics, because I have Demo as the alias. It's a lot easier to understand, and
when you hand this query off to somebody, it is going to be a lot easier for them to read
through it and understand where those columns and those table names are coming from, and so they
will appreciate that in the long run.
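A sketch of the descriptive table aliases discussed above, using Python's built-in sqlite3 module as a stand-in for SQL Server. The column lists, sample rows, the LEFT JOINs, and the exact columns selected are assumptions for illustration; only the idea of aliasing the three tables as Demo, Sal, and Ware comes from the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, Age INTEGER);
    CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
    CREATE TABLE WareHouseEmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, Age INTEGER);

    -- Hypothetical sample rows.
    INSERT INTO EmployeeDemographics VALUES (1001, 'Jim', 30), (1002, 'Pam', 30);
    INSERT INTO EmployeeSalary VALUES (1001, 'Salesman', 45000), (1002, 'Receptionist', 36000);
    INSERT INTO WareHouseEmployeeDemographics VALUES (1001, 'Jim', 30);
    """
)

# Each alias hints at its table, so every column's origin is clear at a glance.
query = """
    SELECT Demo.EmployeeID,
           Demo.FirstName,
           Sal.JobTitle,
           Ware.Age AS WarehouseAge
    FROM EmployeeDemographics AS Demo
    LEFT JOIN EmployeeSalary AS Sal
           ON Demo.EmployeeID = Sal.EmployeeID
    LEFT JOIN WareHouseEmployeeDemographics AS Ware
           ON Demo.EmployeeID = Ware.EmployeeID
    ORDER BY Demo.EmployeeID
"""
joined_rows = conn.execute(query).fetchall()
print(joined_rows)
```

Swapping Demo, Sal, and Ware for a, b, and c would return the same rows; the difference is only in how quickly a reader can trace each column back to its table.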
That is all I got; that is aliasing. Again, not a super tough subject, but a really important
one to understand, especially as you start working in teams. As you start creating more and more
complex queries, you want them to be more organized and more easily readable. It may not come
into play with those really simple queries, but again, as you build out those more complex
queries, this becomes very useful. I really hope you enjoyed this video. If you did, be sure to
comment and subscribe below. Thank you so much for watching, and I'll see you in the next video.

What's going on, everybody? Welcome back to another intermediate SQL tutorial. Today we're going
to be covering PARTITION BY. Now, PARTITION BY is often compared to the GROUP BY statement. The
GROUP BY statement is a little bit different: it is going to reduce the number of rows in our
output by actually rolling them up and then calculating the sums or averages for each group,
whereas PARTITION BY actually divides the result set into partitions and changes how the window
function is calculated. So PARTITION BY doesn't actually reduce the number of rows returned in
our output. Let's get started by looking at the actual syntax of how to use PARTITION BY, and
then we'll compare it to the GROUP BY statement later, just to see the differences between the
two. We're going to be using
these two tables on our left over here, so I'm going to pull those up really quick. Let's run
this and look at these two tables side by, well, one underneath the other, really quick. What
we're going to be using to demonstrate PARTITION BY is this Gender column as well as this Salary
column, so we just need to join these two tables together on the employee ID, and then we'll go
from there. Now, I'm not going to bore you with that. I'm going to skip ahead, and we'll
actually look at how to use this PARTITION BY. So I've joined these two tables together, and
this is our output, but we don't want every single column. I'm going to start selecting some of
these columns, and then we'll start using this PARTITION BY and see what the output looks like
after that. All right, so let's go right up here.
Let's choose the first name, let's do the last name, we'll do Gender, and let's do Salary. Now
we want to identify how many male and female employees we actually have, so we're going to say
COUNT of Gender, and this is going to be OVER, and now we're going to do our PARTITION BY, and
we're going to partition that by the Gender, AS TotalGender. Now, I'm going to come back to why
we did each part, but I want to see the output first, and then we'll come back to why we wrote
it this way. So let's just do this really quick.

It's going to be a little bit different from what you typically would expect in a GROUP BY
statement. The GROUP BY is going to roll everything up, and you typically wouldn't have
something like a first name and last name in a GROUP BY statement, because it would be very hard
to roll all those things up into those individual columns and reduce the number of rows that are
in your output. In our output we can see Pam Beasley: she's female, she makes $36,000 as a
salary, and there are three women in total in this EmployeeDemographics table. Our TotalGender
column over here is where we use the PARTITION BY. If we used a GROUP BY statement to get this
kind of information, all we would be able to do is say SELECT Gender, COUNT of Gender, and then
GROUP BY the Gender down below, underneath the join. Because we're using the PARTITION BY, we're
able to isolate just the one column that we want to perform our aggregate function on. So we're
able to add things like the first name and last name columns, even though we aren't trying to
include them in any PARTITION BY or GROUP BY statement, yet we're still able to add the
aggregate function to each individual row while still maintaining those other columns.
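The contrast between the two approaches can be sketched like this, using Python's built-in sqlite3 module in place of SQL Server. It assumes a SQLite build of 3.25 or newer (needed for window functions), and the sample rows and trimmed-down column lists are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EmployeeDemographics (
        EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Gender TEXT);
    CREATE TABLE EmployeeSalary (EmployeeID INTEGER, Salary INTEGER);

    -- Hypothetical sample rows: two women, three men.
    INSERT INTO EmployeeDemographics VALUES
        (1001, 'Jim', 'Halpert', 'Male'),
        (1002, 'Pam', 'Beasley', 'Female'),
        (1003, 'Dwight', 'Schrute', 'Male'),
        (1004, 'Angela', 'Martin', 'Female'),
        (1006, 'Michael', 'Scott', 'Male');
    INSERT INTO EmployeeSalary VALUES
        (1001, 45000), (1002, 36000), (1003, 63000), (1004, 47000), (1006, 65000);
    """
)

# PARTITION BY: every employee row is kept, and each row carries its gender's count.
window_rows = conn.execute(
    """
    SELECT Demo.FirstName, Demo.LastName, Demo.Gender, Sal.Salary,
           COUNT(Demo.Gender) OVER (PARTITION BY Demo.Gender) AS TotalGender
    FROM EmployeeDemographics AS Demo
    JOIN EmployeeSalary AS Sal ON Demo.EmployeeID = Sal.EmployeeID
    ORDER BY Demo.EmployeeID
    """
).fetchall()

# GROUP BY: rows are rolled up, leaving one row per gender.
grouped_rows = conn.execute(
    """
    SELECT Demo.Gender, COUNT(Demo.Gender) AS TotalGender
    FROM EmployeeDemographics AS Demo
    JOIN EmployeeSalary AS Sal ON Demo.EmployeeID = Sal.EmployeeID
    GROUP BY Demo.Gender
    ORDER BY Demo.Gender
    """
).fetchall()

print(window_rows)
print(grouped_rows)
```

With this sample data the window query returns five rows and the grouped query returns two, which is the difference the video is pointing at.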
Let's take this entire query and basically just transform it into a GROUP BY statement, and
we'll see what that looks like and what the difference is. All I'm going to do is get rid of all
this, copy all of this, and say GROUP BY. I'm doing that because we have to use all these
columns in our GROUP BY statement. So let's execute this, and as you can tell, we are not able
to see the output for the aggregate function that we were hoping for. If we wanted to get the
same output that we had before, where we're showing three for females and six for males, what
we'd have to do is get rid of the first and last name and the salary, and do the same thing in
the GROUP BY statement. So let me get rid of these really quick and run this. What the
PARTITION BY is doing is basically taking this query right here and sticking it on one line in
the SELECT statement. I hope now you can see how valuable the PARTITION BY can be if used
correctly. Thank you guys so much for watching. I really appreciate it. If you like this video,
be sure to like and subscribe below, and I'll see you in the next video.

What's going on, everybody?
Welcome back to another SQL tutorial. Today we're going to be talking about CTEs. A CTE is a
common table expression, and it's a named temporary result set which is used to manipulate
complex subquery data. Now, this only exists within the scope of the statement that we are about
to write. Once we cancel out of this query, it's like it never existed. A CTE is also only
created in memory rather than in a tempdb file like a temp table would be. But in general, a CTE
acts very much like a subquery, so if you know how to do subqueries, you should be able to pick
up on CTEs fairly easily.
So let's get started writing our very first CTE. We're going to come down here and say WITH, and
we're going to write CTE_Employee, and we're going to say AS, and this is where everything's
going to start. Now, CTEs are sometimes called WITH queries. I've never personally used that
name, but I've seen it called that online, and that's because it uses this WITH statement right
at the very beginning. So now we have WITH CTE_Employee AS, then we have an open parenthesis,
and now we have to construct our SELECT statement. This is kind of where we build out our
quote-unquote subquery. I'm going to take a SELECT statement that I actually used in a previous
video, where we were using the PARTITION BY, put that in there, and walk us through what that
does and how we're going to use it. So I'm going to paste this down right here, and I'm actually
going to go like this just to make it look a little nicer, and then I'm going to close the
parenthesis at the end.

So now we have our CTE in place, and as you can see, it is basically just a SELECT statement
within the WITH CTE_Employee AS. What this is going to do is take the first name, last name,
gender, and salary, and then it's going to take this aggregate function with the PARTITION BY,
and this aggregate function with the PARTITION BY, and it's going to place it where we can now
query off of this data. So it's basically putting it in a temporary place where we can then go
and grab that data. All we're going to do at the very bottom is say SELECT everything, and we
can do that FROM CTE_Employee. So let's run this entire thing
and see what we get. As you can see, with this SELECT everything FROM CTE_Employee, we are
selecting everything from this SELECT statement. This feels a lot like we're actually querying
off of a temp table, but it actually acts a lot more like a subquery. Now, we don't have to
select everything. We can just do first name, and let's do average salary, and when we run this
we'll just get those two columns. We don't have to go through and actually write this out each
time; it's just in this CTE for us. So it does all the heavy lifting within the CTE, and then we
can just query off of what we want.

Now, something to note is that the CTE is not stored anywhere; it's not stored in some temp
database somewhere. If I try to run just this by itself, it is not going to work. So let's try
that out really quick, and we should get an error. That's because each time we run this query,
it is actually creating the CTE again. It's not being saved anywhere, so each time we run it, we
have to run it with the entire CTE. Another thing to note is that you actually have to put the
SELECT statement right after the CTE. If I try to go down here and say SELECT everything FROM,
let's do CTE_Employee, it doesn't actually work. It's not going to come up at all, and that's
because it is only going to work with the SELECT statement directly after the actual CTE that
you've created.
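The CTE walkthrough above can be sketched as follows, again with Python's built-in sqlite3 module standing in for SQL Server (SQLite 3.25 or newer is assumed for the window functions). The sample rows and the TotalGender and AvgSalary column names are assumptions for illustration; the error text shown by SQL Server will differ from SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EmployeeDemographics (
        EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Gender TEXT);
    CREATE TABLE EmployeeSalary (EmployeeID INTEGER, Salary INTEGER);

    -- Hypothetical sample rows.
    INSERT INTO EmployeeDemographics VALUES
        (1001, 'Jim', 'Halpert', 'Male'),
        (1002, 'Pam', 'Beasley', 'Female'),
        (1003, 'Dwight', 'Schrute', 'Male'),
        (1004, 'Angela', 'Martin', 'Female');
    INSERT INTO EmployeeSalary VALUES
        (1001, 45000), (1002, 36000), (1003, 63000), (1004, 47000);
    """
)

# The CTE does the heavy lifting; the SELECT directly after it picks two columns.
cte_rows = conn.execute(
    """
    WITH CTE_Employee AS (
        SELECT Demo.FirstName, Demo.LastName, Demo.Gender, Sal.Salary,
               COUNT(Demo.Gender) OVER (PARTITION BY Demo.Gender) AS TotalGender,
               AVG(Sal.Salary)    OVER (PARTITION BY Demo.Gender) AS AvgSalary
        FROM EmployeeDemographics AS Demo
        JOIN EmployeeSalary AS Sal ON Demo.EmployeeID = Sal.EmployeeID
    )
    SELECT FirstName, AvgSalary
    FROM CTE_Employee
    ORDER BY FirstName
    """
).fetchall()
print(cte_rows)

# The CTE existed only for the statement above, so a later query cannot see it.
try:
    conn.execute("SELECT * FROM CTE_Employee")
    cte_error = None
except sqlite3.OperationalError as exc:
    cte_error = str(exc)
print(cte_error)
```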
I hope this was helpful and I hope that you understand how to use a CTE a little
bit better again you don't have to go super complicated with the select
statement within your CTE it can be very very simple I just wanted to demonstrate
that you can use aggregate functions within your CTE and then just query off
of those without having to do the aggregate function again which I find is
very very useful again thank you for watching if you like this video be sure
to like and subscribe below and I'll see you in the next video
what's going on everybody welcome back to another SQL
tutorial today we are looking at temp tables and if you can guess it based off
of the name they're kind of like temporary tables and we create them very
much the same way we're going to do create table it's just a little bit
different and you can hit off of this temp table multiple times which you
cannot do with something like a CTE or a subquery where you can only use it one
time or with a subquery you need to write it multiple times within a query
and so these temp tables are extremely useful I'm going to kind of talk about
how you can use them as we're going throughout this video but let's get
started right away with actually creating one looking at it inserting
some data and kind of showing you how temp tables work and what we can
do with them
so we're going to start off with create table
much like a regular table is created the only difference is we're going to do
this pound sign and then we're going to do
temp underscore employee so literally the only
difference between a regular table and a temp table is this right here at the
very beginning this pound sign so let's just start by doing employee ID
we'll make that an integer we'll do job title
and we'll make that a varchar 100 and then we'll do
salary and let's make that an integer and so now we have our temp
table let's go ahead and create it so now we have our temp table created
and so we can look at it really quick so let's select
everything from and we'll do temp employee so let's take a look it's
completely empty
and we can insert data very much the same way we'd insert
data into a regular table so let's start doing that let's do insert
into and we'll do temp employee and we'll do
values and let's just do something really quick because I'm going to get to
a little bit more interesting stuff in a second
oops so we'll make this person HR that's their job title then for
salary we'll give them 45,000 and close it off so let's run
this and let's select everything again and see what's in there perfect so we
were able to insert data into this temp table and again we don't have to
create this every single time or we don't have to run this every
single time we need to hit off of it like we did with a CTE if you watched my
previous video with this one we can just run it and it sits there and so again
it feels very much like a real table and I'm going to get to a little bit of the
nuances and the differences between a regular table and a temp table
in a second
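The create, insert, and select steps above can be sketched as follows. This is a rough equivalent rather than the exact script from the video: SQLite (via Python's sqlite3) stands in for SQL Server, so `CREATE TEMP TABLE temp_employee` plays the role of `CREATE TABLE #temp_employee`, and the employee ID 1001 is an assumed value because the transcript does not capture it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQL Server spelling described in the video: CREATE TABLE #temp_employee (...)
# SQLite spelling used in this sketch:        CREATE TEMP TABLE temp_employee (...)
conn.execute("""
    CREATE TEMP TABLE temp_employee (
        employee_id INTEGER,
        job_title   VARCHAR(100),
        salary      INTEGER
    )
""")

# Right after creation the table exists but holds no rows.
empty_rows = conn.execute("SELECT * FROM temp_employee").fetchall()

# Insert one row by value, the same way as for a regular table.
conn.execute("INSERT INTO temp_employee VALUES (1001, 'HR', 45000)")

# Unlike a CTE, the temp table stays around for the rest of the session,
# so separate queries can keep reading it without recreating it.
first_read = conn.execute("SELECT * FROM temp_employee").fetchall()
second_read = conn.execute("SELECT job_title, salary FROM temp_employee").fetchall()
```

The two reads at the end are separate statements, which is exactly what a CTE would not allow.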
but let's really quickly we want more data in there you don't
have to just do it value by value we can also just do it
where we select all of the data from a specific table and insert that
into a temp table and that is really you know how I do it most of the
time most of the time I'm not inserting values I am taking a large
table and taking a subset of that and then sticking it into a temp table so
let's look at this really quick
and run that so now we took all of the data from employee salary and then we
just stuck it into this table
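Filling a temp table from a query instead of typed values might look like the sketch below. It again uses SQLite through Python's sqlite3 as a stand-in for SQL Server, and the `employee_salary` rows are invented sample data, not the data used in the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the (potentially huge) permanent table; rows are made up.
    CREATE TABLE employee_salary (employee_id INTEGER, job_title TEXT, salary INTEGER);
    INSERT INTO employee_salary VALUES
        (1001, 'Salesman', 45000),
        (1002, 'Receptionist', 36000),
        (1003, 'Salesman', 63000),
        (1004, 'Accountant', 47000);

    CREATE TEMP TABLE temp_employee (employee_id INTEGER, job_title TEXT, salary INTEGER);

    -- Copy rows with a SELECT instead of typing VALUES one row at a time.
    -- Adding a WHERE clause here would keep only a subset of the big table.
    INSERT INTO temp_employee
    SELECT employee_id, job_title, salary
    FROM employee_salary;
""")

copied = conn.execute("SELECT COUNT(*) FROM temp_employee").fetchone()[0]
salesmen = conn.execute(
    "SELECT employee_id FROM temp_employee "
    "WHERE job_title = 'Salesman' ORDER BY employee_id"
).fetchall()
```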
and really quickly this is one of the big uses of a
temp table let's say for example that this employee salary table
had a billion rows or just an extremely large number and we were
trying to hit a somewhat complex query off of it where we're
using joins and we're using maybe some window functions or different
things you know it would take a very long time to hit off of this but what we
can do is we could insert that data into this temp table and then we can hit off
the temp table and it already has that subsection of data that
we're wanting to use for all of our later queries so really quickly that's
kind of a use case for that
so let's go down here we're going to
kind of create another one and this one's going to be a little bit more
advanced a little bit of how I would actually use a temp table above was just
kind of showing the basic syntax how you kind of put data into it you know kind
of how it's used now I'm going to show you kind of how I would actually use it
so let's do create table let's do
temp employee 2 and then let's do open parentheses and
we'll do job title and we'll make that a varchar
50 and then we can do employees per job we'll make that an
integer now we need average age make that an integer and the very last one
will be average salary I'll make that an integer as well and let's run this
so we have our second table now we want to insert data into this one so we're
just going to do insert into and we'll do temp employee 2 and
for this one I'm going to take a query that we used in a previous video and so
I'm just going to copy and paste that to save time and then we'll keep on
moving from there all right so I'm just going to paste that in we will run this
and really all it's doing is from these tables it's taking the job title
we're getting a count on the job title average age average salary and that is
it so let's see if that worked which it looks like it did but you know let's
actually take a look at the data
and so now we have this subsection of data from this join
above and what this is going to do is whenever we want to run this we don't
have to run it on these two tables and create the join and then do the
calculations which takes time what it's going to do is it's going to take
these exact values and place them into this temporary table and if we want to
run further calculations on these values we can easily do that in a fraction of
the time instead of having to run this every single time which will take up so
much processing power and it will reduce your runtime dramatically when
you're placing this data in this temp table and hitting off of that instead of
all these joins and everything above
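The second temp table, filled once from a join with aggregates and then queried on its own, can be sketched like this. The pasted query itself is not in the transcript, so the join below is a reconstruction under assumptions: the table names, column names, and rows are hypothetical, and SQLite through Python's sqlite3 stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source tables with made-up rows.
    CREATE TABLE employee_demographics (employee_id INTEGER, first_name TEXT, age INTEGER);
    CREATE TABLE employee_salary (employee_id INTEGER, job_title TEXT, salary INTEGER);
    INSERT INTO employee_demographics VALUES
        (1001, 'Jim', 30), (1002, 'Pam', 30), (1003, 'Dwight', 28), (1004, 'Angela', 32);
    INSERT INTO employee_salary VALUES
        (1001, 'Salesman', 45000), (1002, 'Receptionist', 36000),
        (1003, 'Salesman', 63000), (1004, 'Accountant', 47000);

    CREATE TEMP TABLE temp_employee2 (
        job_title         VARCHAR(50),
        employees_per_job INTEGER,
        avg_age           INTEGER,
        avg_salary        INTEGER
    );

    -- The join and the aggregates run once, here.
    INSERT INTO temp_employee2
    SELECT sal.job_title, COUNT(sal.job_title), AVG(dem.age), AVG(sal.salary)
    FROM employee_demographics AS dem
    JOIN employee_salary AS sal ON dem.employee_id = sal.employee_id
    GROUP BY sal.job_title;
""")

# Later queries read the small summary table and never repeat the join.
summary = conn.execute(
    "SELECT job_title, employees_per_job, avg_age, avg_salary "
    "FROM temp_employee2 ORDER BY job_title"
).fetchall()
top_paid = conn.execute(
    "SELECT job_title FROM temp_employee2 ORDER BY avg_salary DESC LIMIT 1"
).fetchone()[0]
```

With four rows the saving is invisible, but the shape is the same when the source tables are large: pay for the join once, then run follow-up calculations against the summary.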
a lot of times these temp tables are
used in stored procedures now if you haven't learned about stored procedures
or used stored procedures at all you know that's okay I still want to show you
something that might be useful although this is used a ton in stored
procedures so for example let's say we have a stored procedure set up we run the
stored procedure and we get an output and you know we for whatever reason want to
run it again and when we run it again we get this error and you know this temp
table lives somewhere it doesn't live in the actual database
but it lives somewhere and so when we run it again we get an error because
there's already a temp table created one trick or one little tip that I would
give is doing something like this saying drop table oops I don't know why I did
so many spaces drop table if exists and we'll do temp employee
2 just like that now what this is going to do is when you're running that stored
procedure over and over and over again you're getting an error or whatever for
whatever reason you need to run it multiple times every time that you run
it it's going to encounter this and so if that already exists it is going to
delete that table and then allow you to create it again and this is just a
really good thing to do so now if you see down below I can run this time and
time and time again and it is going to work every single time because it is
checking to see if that exists and if it does it deletes it and then I can create it
again and so that is just a helpful tip if you're going to try to use this I
highly recommend adding that to your query just to make sure things run
smoothly
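The drop table if exists tip can be sketched as a small re-runnable script. This is an illustration under assumptions: SQLite via Python's sqlite3 stands in for SQL Server, the helper `build_temp_table` is a hypothetical name invented for the example, and the column list is shortened.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def build_temp_table(connection, drop_first):
    """Create temp_employee2, optionally dropping any earlier copy first."""
    if drop_first:
        # SQL Server spelling from the video: DROP TABLE IF EXISTS #temp_employee2
        connection.execute("DROP TABLE IF EXISTS temp_employee2")
    connection.execute(
        "CREATE TEMP TABLE temp_employee2 (job_title VARCHAR(50), avg_salary INTEGER)"
    )

# Without the guard, the second run fails because the table already exists.
build_temp_table(conn, drop_first=False)
try:
    build_temp_table(conn, drop_first=False)
    second_run_failed = False
except sqlite3.OperationalError:
    second_run_failed = True

# With the guard, the same script can run again and again.
for _ in range(3):
    build_temp_table(conn, drop_first=True)

tables_left = conn.execute(
    "SELECT COUNT(*) FROM sqlite_temp_master WHERE name = 'temp_employee2'"
).fetchone()[0]
```

Putting the drop at the very top of the script means every run starts from a clean slate, whether or not an earlier run left the table behind.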
I know there is a lot more that can go into temp tables a lot more of
the technical aspects or the DBA stuff obviously I just want to teach you
how to use it and what you might use it for and how to actually write it out but
you know there are a lot more things that you can do research on about
processing speed and storage but unless you are something like a DBA you
probably don't need to worry about those things and so if you are a DBA I do
recommend looking into those things making sure you understand how that
works how this data is stored so that when people use them or you are using
them you know what's going on in the background but for getting up and
running with temp tables I hope that this was helpful thank you guys so much
for watching I really appreciate it if you like this video be sure to like and
subscribe below and I'll see you in the next video
what's going on everybody welcome back to another SQL tutorial today we're
going to be looking at string functions some of the things that we're going to
be looking at are things like trim replace substring and upper and lower
we're going to create a new table insert a little bit of bad data into it and
then we're going to be using that to work on our string functions today so I
already have this set up right here I'm going to put this in the GitHub so that
you can just download this you don't have to you know type this out manually
so go look in the description if you just want to get that off the
GitHub and download that and copy and paste it save you a little bit of time
but let's go ahead and run this really quick and as you can see in this table
we have our data right here give me one second so in this employee errors
table basically what we have actually let me pull this back up basically what
we have is in this first one we have here we go we have some basically
blank spaces on the right side the second one some blank spaces on the left
side we also have Jimbo which is an error because his name is Jim and
Halbert because his name is actually Halpert and then for Toby for
whatever reason that o is capitalized and then Michael got in here and
added this extra part so we're going to have to figure out a way to take that
out when we're doing our query and that'll come in a little bit later I
think in the substring section so let's get into it right away
let's start using our left trim and right trim we're going to kind of go
through each one pretty quickly hopefully I'm not trying to make
this a super long video because we got a lot of things to get through in this one
video so I'm going to go through the trim right trim and left trim let's look
at the employee ID because that's the one where we have some blank spaces on
the right and the left side the left side obviously you're
going to see that one much easier but let's start walking through this so
let's do select employee ID and before we get any
further let me just get the employee errors on here
so that we can see everything as it comes up so we're just going to do trim
and then type in the column that we want to take these blank spaces out of
that's what the trim does the trim gets rid of blank spaces on either the front
or the back or the left and the right side so on both sides that's what trim
does and we'll say as ID trim so let's run this one really quick and as you can
see this is our regular employee ID and so you know you can't visually see it as
easily on this first one but there are blank spaces after this 101 and we got
rid of those and then there were blank spaces before the 102 and we got rid of
those now I'm just going to copy this two times because it's basically the
exact same thing but I'm going to show you them all at the same time so
it's the exact same thing except ltrim and rtrim and let's take a look
at all these at the same time and let me pull it up let me
see if I can get these all in here okay in the trim it got rid of both the left
and the right side so all of these were fixed in the employee ID for the left
trim we're only going to be getting rid of this one this one still
has blank spaces on it and when we do the right trim we're only going to get
rid of the stuff on the right side so this one doesn't change because this is
on the left hand side where the blank spaces are so this one was fixed again
not super visual so you can't really see it but that one is fixed
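Since the trimmed spaces are hard to see on screen, a small sketch that returns the raw strings may help. It runs on SQLite through Python's sqlite3 rather than SQL Server, and the rows are a reconstruction of the employee errors table from the description above, so treat the exact IDs and names as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- IDs are stored as text so the stray spaces survive; rows are illustrative.
    CREATE TABLE employee_errors (
        employee_id VARCHAR(50),
        first_name  VARCHAR(50),
        last_name   VARCHAR(50)
    );
    INSERT INTO employee_errors VALUES
        ('1001  ', 'Jimbo', 'Halbert'),
        ('  1002', 'Pamela', 'Beasely'),
        ('1005', 'TOby', 'Flenderson - Fired');
""")

rows = conn.execute("""
    SELECT employee_id,
           TRIM(employee_id)  AS id_trim,   -- strips both sides
           LTRIM(employee_id) AS id_ltrim,  -- strips the left side only
           RTRIM(employee_id) AS id_rtrim   -- strips the right side only
    FROM employee_errors
    ORDER BY rowid
""").fetchall()
```

Printing `rows` with `repr` makes the surviving spaces visible: the first ID keeps its trailing spaces under LTRIM, and the second keeps its leading spaces under RTRIM.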
let's move
on to the next part which is using replace so for this one we're going to
be looking at the last name so let's go back up really quick to the employee
errors as you can tell the last name is the biggest one where we kind of want
to take something out because we don't want that dash fired still
in there we're going to replace that and so let's look at how to do that let
me just copy this real quick and get rid of
this top part so we're going to do the last name so let's just start off
with our last name just as a baseline so we can see what it looks
like before and then we'll do replace and all we're going to specify is the
column that we want to do the replacing in we're going to specify the
value that we want to replace so in this it's going to be dash fire oops got a
little aggressive on that one dash fired and we're going to indicate what
we want to replace it with now I'm just going to replace it with blank
and we can say as last name fixed so let's see what this looks like
really quick and it looks like it worked so in
this last name it originally had Flenderson dash fired and when we
took that dash fired and replaced it with basically nothing it
then fixed it and so now it looks correct
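The replace step can be sketched as below, again on SQLite through Python's sqlite3 with assumed sample rows. One detail worth noticing: if the stored value is `'Flenderson - Fired'` and only `'- Fired'` is replaced, a trailing space is left behind, so this sketch also shows a variant wrapped in RTRIM. That wrapper is an addition for illustration, not something done in the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Illustrative rows; only the last one carries the unwanted suffix.
    CREATE TABLE employee_errors (last_name VARCHAR(50));
    INSERT INTO employee_errors VALUES
        ('Halbert'), ('Beasely'), ('Flenderson - Fired');
""")

rows = conn.execute("""
    SELECT last_name,
           REPLACE(last_name, '- Fired', '')        AS last_name_fixed,
           RTRIM(REPLACE(last_name, '- Fired', '')) AS last_name_fixed_trimmed
    FROM employee_errors
    ORDER BY rowid
""").fetchall()
```

REPLACE takes the column, the text to look for, and the text to put in its place; rows that do not contain the search text come back unchanged.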
all right let's move on to the
next one I think this one might be the longest one to write but that is
the substring and let me take this real
quick trying to save us some time so substring is
very very unique you can specify in either a number or a
string the place that you want to start and then you can also
specify how many characters you want to go out and it pulls that in
so just as a really quick example and then I'm going to show you kind of a use
case for this one that I think is pretty cool
that maybe you'd find useful
so I'm going to do first name and then I'm just going to do one comma three so
it's going to take the first name it's going to start at the very
first letter or number and it's going to go forward three spaces or three
spots so let's just take a look at what that looks like so for our table it's
going to take Jim Pam and TOb for Toby and so it's only going to
take the first three because you're starting at number one now what if we
started at three so we do three comma three it's going to go to the
third digit or third letter and then it's going to go forward three so
you kind of get a sense of how this works
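The start position and length arguments can be checked with a short sketch. SQLite spells the function `SUBSTR` (used here through Python's sqlite3), while the video uses SQL Server's `SUBSTRING`; both take the string, a 1-based start position, and a length. The first names are assumed sample values matching the ones mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_errors (first_name VARCHAR(50));
    INSERT INTO employee_errors VALUES ('Jimbo'), ('Pamela'), ('TOby');
""")

rows = conn.execute("""
    SELECT first_name,
           SUBSTR(first_name, 1, 3) AS from_first,  -- start at character 1, take 3
           SUBSTR(first_name, 3, 3) AS from_third   -- start at character 3, take 3
    FROM employee_errors
    ORDER BY rowid
""").fetchall()
```

When fewer characters remain than were asked for, the function just returns what is left, which is why the last name in the list comes back shorter when starting at position three.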
now I'm going to show you
something that I think is very interesting that I think you guys will
also find interesting let me fix that because I just messed it up so if you've ever
heard of something called fuzzy matching now if you don't know what fuzzy
matching is I'll give you an example let's say in one table my name is Alex
and in another table my name is Alexander if we tried to join those two
together based off of my name they will not join because one is Alex and one is
Alexander they're not an exact match but if I take the
substring and start at position one and move forward four characters it's going
to take Alex from both and then it will match them together and say that they
are the same so you know it may not be perfect that's why it's called a
fuzzy match because it can work for a large majority of the time but it's not
going to work every single time
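The Alex and Alexander example can be sketched as a join on the first four characters. The tables `table_a` and `table_b` and the extra names are hypothetical, added to show both a false match and a missed match; SQLite's `SUBSTR` through Python's sqlite3 stands in for SQL Server's `SUBSTRING`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical tables that spell the same people differently.
    CREATE TABLE table_a (first_name TEXT);
    CREATE TABLE table_b (first_name TEXT);
    INSERT INTO table_a VALUES ('Alex'), ('Sam');
    INSERT INTO table_b VALUES ('Alexander'), ('Samantha'), ('Alexis');
""")

# An exact join finds nothing because no two names are identical.
exact = conn.execute("""
    SELECT a.first_name, b.first_name
    FROM table_a AS a
    JOIN table_b AS b ON a.first_name = b.first_name
""").fetchall()

# Joining on the first four characters pairs Alex with Alexander.
fuzzy = conn.execute("""
    SELECT a.first_name, b.first_name
    FROM table_a AS a
    JOIN table_b AS b
      ON SUBSTR(a.first_name, 1, 4) = SUBSTR(b.first_name, 1, 4)
    ORDER BY a.first_name, b.first_name
""").fetchall()
```

The result also shows why this is only a fuzzy match: Alex is paired with Alexis as well, and Sam never meets Samantha because the four-character prefixes differ.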
going to work every single time and so I want to show you how we can use this
want to show you how we can use this here um really quick I need to join this
to the demographics table so I'm
going to do that really
quick bear with me for just one
second let's try to make this at least
look somewhat good so what I'm going to
do is I'm going to start off by
tying it to the first name so
let's say err.FirstName
is equal to the demographics table first
name okay and in the select I'm just
going to do FirstName for
err and then dem.FirstName so
let's see what comes up when we do it
like this so the only one that is going
to work is Toby and that's because even
though it has a capital O it's still
going to take it but you know we want
to get all of them to match and we can
do that but it's going to be a little
bit of a different way that maybe isn't
perfect but that's why they call it
fuzzy matching so we're going to use
SUBSTRING on this so I'm
going to say SUBSTRING and we're going
to go one three so starting at the first
position and going forward three characters
and we're going to do the exact same
thing on the other side of the join
so SUBSTRING again with one and
three and we are actually going
to take this up here as well
and we're just going to go like that
so let's run this really
quickly and as you can see it is now
going to match all of them and you can
do this on a lot of different things
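The fuzzy join described above can be sketched roughly as follows. This is a minimal T-SQL sketch, not the exact query from the video: the table name EmployeeErrors, the column name FirstName and the output aliases are assumptions, and only the aliases err and dem come from the spoken walkthrough.

```sql
-- Assumed schema: EmployeeErrors(FirstName, ...) holds the messy names,
-- EmployeeDemographics(FirstName, ...) holds the clean ones.
SELECT err.FirstName                  AS ErrFirstName,
       SUBSTRING(err.FirstName, 1, 3) AS ErrFirstThree,
       dem.FirstName                  AS DemFirstName,
       SUBSTRING(dem.FirstName, 1, 3) AS DemFirstThree
FROM EmployeeErrors AS err
JOIN EmployeeDemographics AS dem
    -- Compare only the first three characters of each name,
    -- so 'Jimbo' and 'Jim' both reduce to 'Jim' and match.
    ON SUBSTRING(err.FirstName, 1, 3) = SUBSTRING(dem.FirstName, 1, 3);
```

Selecting both the original value and its three-character substring makes it easy to see which rows only matched because of the truncation.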
typically when I'm doing a fuzzy match like this I'm not just going to do it on
a first name right because
there can be a ton of people named Jim
and real quick let me actually show you
what the originals looked like just
to make sure I get the point
across so that is going to be the first
name columns all right so real quick
let's actually look at this so in one table it
originally was Jimbo Pamela and Toby
and in the other one it was Jim Pam and Toby and so
when we just took the first three characters
because it was Jimbo it then becomes Jim
it was Pamela it becomes Pam and now it
matches and so that's kind
of the example that we're going
for like I was saying I typically will
not just match on a first name because
there's going to be a ton of people
named Alex or Jim or you know
Henry or whatever you're going to do
this on many different things so if I'm
trying to do a fuzzy match on a person I'd
do it on their gender to make sure that
their gender is the same and I
probably wouldn't need to use a
SUBSTRING for that but just to
give you a little bit more
information I'd also do it on the last
name so I'd use that SUBSTRING
again and I would probably do it on the
age
and the date of birth okay so
if you fuzzy match
on the first name and the last name and
then the gender the age and the date of
birth are all the same then you can
typically get a very high accuracy in
matching people across
tables
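A multi-column version of the fuzzy match described above might look like the sketch below. It is only an illustration under assumptions: the EmployeeErrors table and its LastName, Gender and Age columns are hypothetical here, and the date of birth condition is left as a comment because no such column is shown in the walkthrough.

```sql
-- Assumed schema: both tables carry FirstName, LastName, Gender and Age.
SELECT err.FirstName, err.LastName,
       dem.FirstName, dem.LastName
FROM EmployeeErrors AS err
JOIN EmployeeDemographics AS dem
    -- Fuzzy conditions: compare only the first three characters.
    ON  SUBSTRING(err.FirstName, 1, 3) = SUBSTRING(dem.FirstName, 1, 3)
    AND SUBSTRING(err.LastName, 1, 3)  = SUBSTRING(dem.LastName, 1, 3)
    -- Exact conditions: these narrow down people who share a name prefix.
    AND err.Gender = dem.Gender
    AND err.Age    = dem.Age;
    -- A date of birth column, if one exists, would be one more exact condition.
```

Each extra exact condition lowers the chance that two different people with similar names are matched by accident.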
this is an example if you don't have like an employee ID which is what we do
have but say for example we were not
given that then this is a way to match
them using substrings let's move on to
UPPER and LOWER all UPPER and LOWER are
going to do is basically take all the
characters in the text and make them
either uppercase or lowercase so it's
very
self-explanatory let me copy this up
here and we will get going on this
one let's just look at the first name
and specifically we're going to be
looking at Toby right here so let's do
first name and then
let's do LOWER and all we have to do
is put in the column that we want to
convert so this is our original first name
and it then takes every single
character that is in here
and makes it lowercase
that's all it does and it is the
exact opposite when we do UPPER so we
can now take a look at this one and
now everything's going to be capitalized
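As a small sketch of the two functions just described, assuming the names live in a FirstName column of a table called EmployeeErrors (both names are assumptions, as are the output aliases):

```sql
-- LOWER and UPPER each take one string expression and change only its case.
SELECT FirstName,
       LOWER(FirstName) AS FirstNameLower,
       UPPER(FirstName) AS FirstNameUpper
FROM EmployeeErrors;
```

Keeping the original column next to the converted ones makes the effect visible on a mixed-case value such as the Toby row mentioned above.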
so there is a lot that you can do with these string functions and this is not
all the string functions that there are
there are a lot more but I would say
that these are the more popular more
useful ones that I typically use on a
regular basis and so I hope that this
has been helpful I hope that you learned
something from this if you did be sure
to like and subscribe below I have a lot
more videos coming out with tutorials on
everything from SQL Python Tableau and
Excel
thank you so much for joining me I
appreciate it and I will see you in the
next video
[Music]
what's going on everybody welcome
back to another SQL tutorial today we are talking about stored procedures now
what is a stored procedure a stored
procedure is a group of SQL statements
that has been created and then stored in
the database a stored procedure can
accept input parameters and we will be
looking at that today and that means
that a single stored procedure can be
used over the network by several
different users and we can all be
using different input data a stored
procedure can also reduce network
traffic and improve performance and
lastly if we modify that stored procedure
everyone who uses that stored procedure
in the future will also get that update
let's start writing out the stored
procedure so we can look at the syntax
we'll start off very simple and then in the next one we'll get a little bit more
complicated so the very first thing that
you need to write is CREATE and then
PROCEDURE and after that you're going to
name it so let's just call this one
test and all you're going to say is AS
and then you're going to write your
query and so let's just do select
everything from employee
demographics and that is it we have
created our very first stored procedure
of course this is super super simple but
let's execute this really quick and take
a look at
it so it says that the commands
completed successfully let's go over to
our SQL tutorial database we're going to go over
to
programmability stored procedures and it
is not showing up there what we need to
do is we need to refresh our stored
procedures we're just going to go right
here we're going to click refresh and
then there is our stored procedure now
how do you actually use the stored
procedure that we just created so let's
go right down here and let's say EXEC which
means execute and then all we're going
to say is test and we're going to
run
this and there we go so all we put in
this stored procedure was a select
statement and so when we actually
ran the stored procedure it returned
the results of our select statement
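Written out, the steps above could look like this minimal T-SQL sketch. The procedure name Test comes from the walkthrough; the table identifier EmployeeDemographics is an assumed spelling of the employee demographics table.

```sql
-- Create a stored procedure that wraps a single SELECT.
CREATE PROCEDURE Test
AS
SELECT *
FROM EmployeeDemographics;
GO  -- batch separator used by SSMS and sqlcmd; it ends the procedure body

-- Run the stored procedure; it returns the result of the SELECT inside it.
EXEC Test;
```

EXEC is the short form of EXECUTE, so either keyword works here.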
now let's go down
here and we're going to make it a little
bit more complicated we're going to do
the exact same thing and say CREATE
PROCEDURE
and let's call this
Temp_Employee so if you remember from
a previous video we worked on temp
tables where we created our temp table and
then inserted data into it we are
going to add that to this stored procedure
so we can see the difference between a
simple query and a little bit more
complicated query so I'm going to say AS
and then I'm going to paste that in
here now what this is doing is I'm
creating a table and then right down
here I'm inserting into that table now if I
create this stored procedure and then
execute it nothing is actually going to
be returned it will insert the data into
that temp table but since I don't have a
select statement in this
procedure nothing will be returned so
let's write
select everything and we'll just do
from and this is the temp employee table
right here and so now let's
create our stored
procedure so that created successfully
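A procedure of the shape described above might look like the following sketch. The procedure name Temp_Employee follows the walkthrough, but the temp table definition and the aggregate query are assumptions made up for illustration, since the pasted code is not spelled out in the transcript; EmployeeDemographics, EmployeeSalary and their columns are likewise assumed names.

```sql
CREATE PROCEDURE Temp_Employee
AS
-- Hypothetical temp table; the real column list is not given in the transcript.
CREATE TABLE #temp_employee (
    JobTitle        varchar(100),
    EmployeesPerJob int,
    AvgAge          int,
    AvgSalary       int
);

-- Fill the temp table with one summary row per job title.
INSERT INTO #temp_employee
SELECT sal.JobTitle,
       COUNT(sal.JobTitle),
       AVG(dem.Age),
       AVG(sal.Salary)
FROM EmployeeDemographics AS dem
JOIN EmployeeSalary AS sal
    ON dem.EmployeeID = sal.EmployeeID
GROUP BY sal.JobTitle;

-- Without this final SELECT the procedure would run but return nothing.
SELECT *
FROM #temp_employee;
GO

EXEC Temp_Employee;
```

A temp table created inside a stored procedure is dropped automatically when the procedure finishes, so running the procedure again does not collide with an earlier copy.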
let's refresh over here and let's execute this so let's
just go down right
here and say EXEC and it's going to
be Temp_Employee
and now we will execute
this and there is our output now really
quick let's go into Temp_Employee and we
actually want to change this stored
procedure so we're going to go over to modify so when we modify it a few things
are going to show up on your screen the
first thing that you're going to see is
it says USE SQL Tutorial so it's just
specifying the database the next two
things you may not be as familiar with
it's SET ANSI_NULLS and then SET QUOTED_IDENTIFIER
if you don't know what these
are it's not super important the first
one just controls how comparisons with
nulls are handled when you're using the WHERE
clause and then the quoted identifier setting
just controls how double quotes are treated in
the actual query itself again not super
important but they have those
automatically turned on let's go down a
little bit further and we're going to
look at the ALTER PROCEDURE so we
created our stored procedure but now we
want to alter it so this is the ALTER
PROCEDURE and we are going to add a
parameter to this so what the parameter
is going to allow us to do is when we're
actually executing the stored procedure
we can specify an input into that stored
procedure so that we get a specific
result back and I'm going to show you
what I mean by that in just a second but
let's actually add our input and we're
going to say @JobTitle
and we need to specify the data
type that that is going to be so let's
just say
nvarchar(100) I know below it says varchar(100)
but that's not extremely important so
this is going to be our input so we need
to go down here and say
WHERE JobTitle is equal to @JobTitle
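The altered procedure described above could be sketched like this. The parameter name @JobTitle and its nvarchar(100) type follow the walkthrough; the temp table and the query body are the same hypothetical ones assumed for illustration, not the code shown on screen.

```sql
ALTER PROCEDURE Temp_Employee
    @JobTitle nvarchar(100)   -- input supplied by the caller
AS
CREATE TABLE #temp_employee (
    JobTitle        varchar(100),
    EmployeesPerJob int,
    AvgAge          int,
    AvgSalary       int
);

INSERT INTO #temp_employee
SELECT sal.JobTitle,
       COUNT(sal.JobTitle),
       AVG(dem.Age),
       AVG(sal.Salary)
FROM EmployeeDemographics AS dem
JOIN EmployeeSalary AS sal
    ON dem.EmployeeID = sal.EmployeeID
WHERE sal.JobTitle = @JobTitle   -- only the requested job title
GROUP BY sal.JobTitle;

SELECT *
FROM #temp_employee;
GO

-- The parameter has no default, so it must be passed when calling the procedure.
EXEC Temp_Employee @JobTitle = 'Salesman';
```

Because @JobTitle has no default value, calling the procedure without it fails with an error saying the parameter was not supplied.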
so when we actually are executing this and we say the job title is equal to
let's say accountant this @JobTitle is going to
become accountant and it's going to give
us our results based off of it being an
accountant so let's go over here and we
are going to run this EXEC Temp_Employee
which we just modified and when
we run it we're going to get an error
because it is now expecting us to
include our parameter of @JobTitle so
what we need to do is we need to say
@JobTitle and let's say it's equal to
Salesman now let's try running this one
and see what we get and so there is our
output if we go back here I just wanted
to show you really quick we do not have
to put this @JobTitle right here you can
put this anywhere in the query and use
it however you want that's how
parameters work and that's why
parameters are so useful and you can use
multiple parameters for one stored
procedure so you don't have to just
limit yourself to one or none you can
put as many as you really like so I hope
that this video is helpful and that you
understand stored procedures just a
little bit better thank you guys so much for watching I really appreciate it if
you like this video be sure to like and
subscribe below and I'll see you in the
next video
[Music]
what's going on everybody welcome back
to another SQL tutorial today we are
going to be talking about subqueries now subqueries are often called inner
queries or nested queries and they're
basically a query within a query a
subquery is used to return data that
will be used in the main query or the
outer query as a condition to specify
the data that we want retrieved you can
use subqueries almost anywhere you can
use them in the select part of a query the
from and the where you can also use them in
insert update and delete statements but
in today's tutorial we're only going to
be looking at the select the from and the
where statements and you should get a
pretty good idea of how to use them in
those other statements all right now I'm
going to paste on screen basically what
we're going to be walking through today
but really quick let's just take a look
at the table that we're actually going to be
working in and that is going to be
the employee salary table and I just
want to show you the data that we're
going to be working with before we
actually get into it so we have an
employee ID we have a job title and then
we have a salary so really quick I'm
going to show you what it looks like to
have a subquery in the select statement so let's go down here really
quick and what we're going to try to do
is kind of do something like a window
function but without actually having to
use the window function and so we're
going to do this with a subquery so I'm
going to select and really quick
actually let me copy this so we're going
to do employee
ID there we go we're going to do salary
and now we can start building our
subquery so we need to do an open
parenthesis and I'm just going to copy
this really quick because we're going to
be doing it off of that same table so we're
going to say select and then I'll paste
that and close it as well but what we
want to select is the average of
salary now what this is going to do is
it is literally going to run this and
let's run this really quick it is going
to run this and it is going to show that
the average salary for all the employees
is in the $47,000 range
so we are looking at the average
salary for every employee so when we run this it is going to give us the employee
ID the salary and then in the very last
column it is going to show the average salary
across all employees now it doesn't have a
column name so
let's give it one let's say AS all
average salary and we'll run that one
more time just to make it look a little
prettier you can also do this with
PARTITION BY I'm going to
just really quickly write this out it
should take no time at all and then I'm
going to show you why we can't do this
without the subquery why you aren't able
to do this with a GROUP BY so really
quickly let me copy this I'm going to
put it right down here and we're going
to say average of salary and we can
get rid of all this and we can say OVER
and we're not going to partition it by
anything
but let's run both these at the same
time and you'll see that they're the
exact same outputs and so it's just a
different way of doing it in this
example but it really is just to show a
comparison of how you might be able to
use a subquery in the select statement
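The two equivalent queries compared above can be sketched as follows. The identifiers EmployeeSalary, EmployeeID and Salary are assumed spellings of the table and columns named in the walkthrough, and AllAvgSalary is an assumed one-word form of the "all average salary" alias.

```sql
-- Version 1: scalar subquery in the SELECT list.
-- The inner query returns one value, repeated on every row.
SELECT EmployeeID,
       Salary,
       (SELECT AVG(Salary) FROM EmployeeSalary) AS AllAvgSalary
FROM EmployeeSalary;

-- Version 2: window function with an empty OVER (),
-- which averages over the whole result set without collapsing rows.
SELECT EmployeeID,
       Salary,
       AVG(Salary) OVER () AS AllAvgSalary
FROM EmployeeSalary;
```

Both statements return one row per employee with the same overall average in the last column.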
now you might be wondering why GROUP BY
does not work for this really quickly
I'm going to write this out let's
get rid of that and we'll say GROUP BY
and we'll do employee
ID and we also have to do salary and
then we'll say ORDER
BY one two so let's run
this and as you can see since we have to
use the GROUP BY it groups by both the
employee ID and the salary and so we're
not going to be able to get that all
average salary that we're looking for
that we can get with the PARTITION BY and
also the subquery in the select
statement
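For contrast, the GROUP BY attempt described above might look like this sketch, using the same assumed identifiers (EmployeeSalary, EmployeeID, Salary, AllAvgSalary):

```sql
-- Every non-aggregated column in the SELECT list has to appear in GROUP BY,
-- so the average is computed per (EmployeeID, Salary) group.
SELECT EmployeeID,
       Salary,
       AVG(Salary) AS AllAvgSalary
FROM EmployeeSalary
GROUP BY EmployeeID, Salary
ORDER BY 1, 2;   -- order by the first, then the second output column
```

Assuming each employee ID appears once, every group holds a single row, so the computed average simply equals that row's own salary rather than the average across all employees.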
now I'm going to show you the
subquery in the from statement so let's
just get rid of that really quick and
let's say select everything
from and we're going to do an open
parenthesis here and here is where we're
going to write our subquery so if you
have watched previous videos where I've
done tutorials on CTEs or
on temp tables this is one that is
very much like those except I think a
little bit less efficient when I'm doing
something where I'm creating a table and
then querying off of it which is what
we're about to do I much prefer a CTE or
a temp table in my experience subqueries tend to be a
little bit slower than a temp table
or a CTE and I tend to use temp tables a lot
more because you can reuse them over and
over whereas a subquery you cannot you
have to write it out each time so really
quickly I'm going to show you how it's
done although I don't really recommend
using this method really quickly let's go up here and let's steal this
go up here and let's steal this partition bu really quick this will be
partition bu really quick this will be our
our subquery uh and let's paste this in here
subquery uh and let's paste this in here I'm going make this look a little nicer
I'm going make this look a little nicer just so you can visualize it a little
just so you can visualize it a little bit
bit easier um so really quick what this is
easier um so really quick what this is going to do is it is first going to run
going to do is it is first going to run this and create this table again much
this and create this table again much like a temp table or a CTE so let's
like a temp table or a CTE so let's execute this really quick it's going to
execute this really quick it's going to create this table and then it's going to
create this table and then it's going to allow us to query off of it so I can
allow us to query off of it so I can actually say um and let me give kind of
actually say um and let me give kind of kind of an alias to this a. employee ID
kind of an alias to this a. employee ID and then let's say all average salary so
and then let's say all average salary so now I can take um columns from this
now I can take um columns from this inner query if I want to and just select
inner query if I want to and just select those or I can select everything and
those or I can select everything and return that entire table again I much
return that entire table again I much prefer a temp table or a CTE for this
prefer a temp table or a CTE for this type of situation but as an example I
type of situation but as an example I just wanted to show you how it works now
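A minimal sketch of the subquery-in-FROM pattern described above. It is not the exact query from the video: it assumes Python's built-in sqlite3 module (with SQLite 3.25 or newer for window functions) instead of SQL Server, and the table layout, column names, and sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; names and numbers are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO EmployeeSalary VALUES (?, ?, ?)",
    [
        (1001, "Salesman", 45000),
        (1002, "Receptionist", 36000),
        (1003, "Salesman", 63000),
        (1004, "Accountant", 47000),
    ],
)

# The inner query runs first and acts like a temporary table aliased as "a".
# The outer query then selects whichever of its columns it wants.
from_subquery = """
SELECT a.EmployeeID, a.AllAvgSalary
FROM (
    SELECT EmployeeID, Salary, AVG(Salary) OVER () AS AllAvgSalary
    FROM EmployeeSalary
) AS a
ORDER BY a.EmployeeID
"""
from_rows = conn.execute(from_subquery).fetchall()
for row in from_rows:
    print(row)
```

Every row carries the same overall average because the empty OVER () clause treats the whole table as one window.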
Now let's go down to the subquery in the WHERE statement. Really quick, I'll
just steal this query so I don't have to rewrite everything, and let's get rid
of this and add back the job title. All right, so let's look at this. We have
the table that we've been using: our employee ID, job title, and salary. For
this example we only want to return employees if they're over the age of 30,
and as you can see, in this table there is no age column; that is in the
employee demographics table. Now, if we wanted, we could join to that table and
get that information, or we could use a subquery, and so for this example we
are going to be using a subquery. Let's go right down here and say WHERE
employee ID is IN, and we'll do an open parenthesis, and this is where we are
going to build out the subquery. Just for visual purposes, I'm going to go right
here and say SELECT everything FROM employee demographics, and close the
parenthesis.
So we're going to try to select something in this subquery that will then
identify the employee IDs that are over the age of 30. Really quickly, let's
take a look at this table. Right now we have the entire table selected, so we
have the employee ID, first name, last name, age, and gender. In this subquery
the only thing that should be returned is the employee ID, and in fact, in a
subquery used with IN like this you can only have one column selected. So I
can't select everything; I have to specify one column. That's a little bit
different from how we did it in the FROM statement, where we were basically able
to select the entire table and then, in the outer SELECT statement, specify what
columns we wanted. In the WHERE statement we can't do that. So we want to return
the employee ID, and we also want to say WHERE the age is greater than 30. Let's
run this really quick and see if it works. As you can see in the results, these
are the employees who are over the age of 30. Now, if you wanted to display the
age as a column in this output, you would have to join to that table and then
put that column in the SELECT statement. But in a lot of situations you won't
actually want or need to do that, and so a subquery can be a really good option
in these scenarios.
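Here is a small sketch of the subquery-in-WHERE idea, again assuming Python's built-in sqlite3 module rather than SQL Server. The two tables are cut-down, hypothetical versions of the employee salary and employee demographics tables, and the rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and rows, for illustration only.
conn.executescript(
    """
    CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
    CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, Age INTEGER);
    INSERT INTO EmployeeSalary VALUES
        (1001, 'Salesman', 45000), (1002, 'Receptionist', 36000),
        (1003, 'Salesman', 63000), (1004, 'Accountant', 47000);
    INSERT INTO EmployeeDemographics VALUES
        (1001, 'A', 30), (1002, 'B', 34), (1003, 'C', 29), (1004, 'D', 31);
    """
)

# The subquery returns exactly one column (EmployeeID) for people over 30,
# and the outer query keeps only the salary rows whose ID is in that list.
where_subquery = """
SELECT EmployeeID, JobTitle, Salary
FROM EmployeeSalary
WHERE EmployeeID IN (
    SELECT EmployeeID FROM EmployeeDemographics WHERE Age > 30
)
ORDER BY EmployeeID
"""
over_30 = conn.execute(where_subquery).fetchall()
print(over_30)

# Selecting every column inside IN (...) is rejected, because a single value
# on the left cannot be compared against a multi-column result.
try:
    conn.execute(
        "SELECT EmployeeID FROM EmployeeSalary "
        "WHERE EmployeeID IN (SELECT * FROM EmployeeDemographics)"
    ).fetchall()
    select_star_rejected = False
except sqlite3.Error:
    select_star_rejected = True
print("SELECT * inside IN rejected:", select_star_rejected)
```

Age never appears in the output, which matches the point above: the subquery only filters, and showing the age would take a join.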
With that being said, this is the last video in the advanced SQL tutorials. I
hope that this series has been helpful and that you learned something along the
way. Thank you so much for joining me; I really appreciate it. If you like this
video, be sure to like and subscribe below, and I'll see you in the next video.
[Music]
What is going on, everybody? Welcome back to another video. Today we are
starting our data analyst portfolio project series. Now, before we jump into our
first project, I wanted to talk with you for just a second so that we're all on
the same page. First thing is that there are going to be four projects. The
first one is going to be SQL: we're doing a lot of data exploration, and we'll
be setting up a lot of our data to visualize it in Tableau. Tableau is going to
be our second project. In our third project we're going back to SQL, but we're
going to be doing a lot more of the ETL process, so a lot more of the data
cleaning. I did that one as the third project because I think it's going to be a
little bit more advanced than this first project. I tried to make it as beginner
friendly as possible, so even if you are a complete beginner, as long as you've
walked through the tutorials that I have made on my channel, you should be
pretty good. And then the fourth and final project will be with Python: we'll be
using a lot of pandas, doing a little bit of data cleaning, and then doing
visualizations as well.
As I said just a second ago, I'm trying to make this as beginner friendly as I
possibly can. The whole point of the series is that if you are trying to apply
for a data analyst job, by the end of the series you should have an entire
portfolio, or at least a really good start at a portfolio, to show a potential
employer. I give you full permission to copy every script and every query line
for line, if that is what you want to do, and create your own portfolio. I am
totally fine with that. But, and I'm sure I'll say this throughout the video, I
encourage you to try to think of your own queries, your own insights, and your
own things that you can do to make this portfolio project unique. With that
being said, I'm super excited to get started on this with you guys, so let's
jump over to my screen and get started on our very first project.
All right, so now that we are on my screen, we are going to get started on this
project. We're going to download the data set, format it just a little bit in
Excel, and then get into SQL, where we will start querying it. I will say that I
think this is going to be a very long video. I'm hoping to keep it under an hour
and a half, and I may separate this into two videos depending on how long it
runs. I will do my best to keep it short, but we have a lot to get through. My
goal is to do basically no cuts in this, because I want to walk you through each
step of the process so that you understand everything that's going on and you
don't get lost at some point. I think this is probably the best way to do it;
we'll see.
The very first thing we're going to do is download our data set. As we're
looking at this, there's an option right here to download the data set. I don't
recommend that one. You can use it, but it just won't give you all the
information that I personally want, which is to go back to the very beginning.
If you go down right here to the very first graph, you can actually push this
back and then download it, and what this will do is go back to, I think,
January 1st of 2020. So let's open this one up.
When we get in here, we're going to reformat it just a little bit; it's nothing
too complicated, I hope. I'm just going to double-click here. Actually, let me
go up here and add a filter, just in case we want to filter anything. What we
have here is a ton of information on COVID, I mean just a ton, and it goes back
to early 2020; I believe it does go back to the first of 2020. So, a really
brief introduction to what kind of data is in here: we have total cases, new
cases, total deaths, and new deaths, and we use those quite a bit in the queries
that are coming up. If we go way over here, we have total vaccinations and
people vaccinated, and then a little bit farther over we have population. That's
the main stuff we're going to be working with today. As you can see, there are
so many other things in here, and if you want to go back and do more with this,
I highly recommend it. There's such unique data in here about smokers and
diabetes and all this other stuff that I did not do a deep dive in. I could
spend a month just looking at this data set and getting really interesting stuff
from it, but I'm not going to do that; I wanted this to take less than two
months to complete.
What we're going to do is go back over here, take this population column, click
on it, and press Ctrl+X, and that's going to cut it. We're going to go back to
the very beginning, place it right here, right-click, and say Insert Cut Cells.
Now, why are we doing this? Because I've already done this entire project, and
if you don't do this, you're going to have to do a join with every single query
you do. If you want to do that, keep it there and just change your queries for
that. I did it like this because I wanted to show joins later on; I wanted to
keep it kind of simple at the beginning and then work my way to slightly more
advanced things, which you will see. It gets semi-advanced, but not too much, I
promise. Just stick with me.
Let's go back over here. We're going to go to column AA, and then we're going to
press Ctrl+Shift+Right Arrow. That's going to select everything over here, and
we're going to literally delete it. This is going to be our first table, so
everything you see over here is our first table, and we're going to save that.
So let's Save As; I'm just going to keep it in my downloads, and let's call it
COVID deaths, so that has our death information. The next one is going to
include our vaccination information, which is what we're going to join on, and
we're going to do that later. So let's hit Ctrl+Z; that's going to bring it
back. Now let's select column Z and go all the way back to column E, and we're
going to do the same thing: we're going to delete this. It looks like there's no
data, but I promise there is later on. For the vaccinations, like total
vaccinations, if we go down you can see that it starts at the very end of
February in 2021; that's because vaccinations didn't come out until recently.
Now let's save this file, and we're going to Save As; instead of COVID deaths,
we'll do COVID vaccinations. All right, now let's save that.
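The same split can be sketched in code rather than by hand in Excel. This is only an illustration under stated assumptions: it uses Python's standard csv module, the column names are guesses at how the sheet is laid out, and the two sample rows are invented. The idea is the one described above: both outputs keep the identifying columns, the deaths file also gets population plus the case and death columns, and the vaccinations file gets the vaccination columns.

```python
import csv
import io

# Invented sample in the assumed layout; not real figures.
RAW = """iso_code,continent,location,date,total_cases,new_cases,total_deaths,new_deaths,total_vaccinations,people_vaccinated,population
XXA,Examplia,Alpha,2021-02-27,100,5,2,0,,,1000
XXA,Examplia,Alpha,2021-02-28,104,4,2,0,10,10,1000
"""

# Assumed column groupings (hypothetical names).
KEY_COLS = ["iso_code", "continent", "location", "date"]
DEATH_COLS = KEY_COLS + [
    "population", "total_cases", "new_cases", "total_deaths", "new_deaths",
]
VACC_COLS = KEY_COLS + ["total_vaccinations", "people_vaccinated"]


def project(rows, columns):
    """Keep only the requested columns, in the requested order."""
    return [{col: row[col] for col in columns} for row in rows]


rows = list(csv.DictReader(io.StringIO(RAW)))
deaths = project(rows, DEATH_COLS)
vaccinations = project(rows, VACC_COLS)

print(list(deaths[0].keys()))
print(list(vaccinations[0].keys()))
```

Because population sits only in the deaths output, queries that need only cases, deaths, and population can run against that one table, and the join is needed only once vaccination data comes in.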
So now we have the two Excel files that we want, and we need to get them into
SQL. We're going to go over to SQL and create a PortfolioProject database. I've
already done this; all you have to do is right-click, click New Database, type
in PortfolioProject, and then click OK, and it will create your database for
you. If you open up the tables, it should be empty, and that's where we're going
to put these two Excel files. Now, I had a ton of trouble actually importing
these Excel files. I tried everything and eventually went down a rabbit hole of
how to get them in. I don't know if it's me or what, but I could not figure out
how to do it. If you go to PortfolioProject, hit Tasks, and hit Import Data,
that may work for you. It did not work for me; it just kept giving me errors.
So what I would recommend you do right off the bat, just to make sure that we're
doing the same thing (and you can do it that other way if you want), is this: I
went over here to Start (again, I'm on Windows), went down to Microsoft SQL
Server 2019, and clicked Import and Export. It looks the same, but from all the
research I did, it has to do with 32-bit versus 64-bit. When you do it this way,
it uses the 64-bit version and is able to import the data; when you do it the
other way, it was using the 32-bit version and gives you an error. I don't
understand it, so don't ask me; I went down a huge rabbit hole, but this one
works. So let's go over here, and this is going to be our data source. Where is
the data coming from? It's an Excel file.
So let's do that: let's browse and go over to my downloads. I thought I saved it
in downloads. Maybe it's because it's an Excel workbook. What was I saving
before? Oh, that's a CSV. Okay, something important to note is that we're
importing an Excel workbook and not a CSV, or you're going to get the same
error. I'm just doing it live and making myself look stupid. So we're going to
save it, but instead of a CSV we're going to save it as an Excel workbook. Let's
save that. Now we have to go back to how it was right here, the same way, and
we're going to do File, Save As; this is now COVID deaths, and save it as a
workbook. Now we have them. Let's go back: now we have our COVID deaths and our
COVID vaccinations. Let's do our deaths first. Let me get back right here so it
looks kind of more normal.
So we have our Excel file, our COVID deaths. Let's click Next, and now we have
to say where we're going to place it: what's our destination? We're going to
click over here and go down to SQL Server Native Client 11.0. I want to point
out something that I messed up, and it took me like 45 minutes to figure out; it
was the stupidest mistake. It's going to auto-populate a server name, and I
never checked to confirm that this was my server name, so I couldn't figure out
why I wasn't able to insert this into my PortfolioProject database. That's
because mine has 01 on the end. I created two different servers intentionally,
and for whatever reason I forgot that, so all I have to do is add 01 over here.
Just make sure yours matches your own server name.
Click PortfolioProject and click Next. Yes, we want to copy the data. The next
screen should auto-populate; if it gives you multiple options, you can always
check the one that you think is the right one, and it should be the first one.
We'll click Next and then Finish; make sure it says Run immediately, then click
Finish and Finish again. Now, while this is running: there should be around
89,000 rows. That's how it was about a week ago when I started; maybe a little
more now because there are extra days. So there's going to be a good-sized
amount of data. We're about to do a lot of different things. We're going to
start at the very basics of just querying the table, super simple, and then
we're going to go into things like joins, CTEs, temp tables, and creating views.
The whole purpose of what we're about to do is not to keep it too simple; I want
to showcase to a potential employer that you can do more advanced things.
Because I have already done this entire project on my own, I can see we've
probably got 15 to 20 queries here. You don't have to do all of them. I'm going
to walk through all of them and you can choose which ones you want; it is quite
a few, so just know that.
So there's 85,000 right here; that's fantastic. The table won't show up
immediately; you need to refresh, and there we go. So that's our COVID
vaccinations. Let's get rid of this so we just have COVID vaccinations. I
thought that was our COVID deaths one, but maybe I'm wrong. Let's do the exact
same thing down here: we will import and say Next, go down to Excel, and browse.
Now we want to do the COVID deaths; apparently last time we did the
vaccinations. Actually, you know what, I bet what it did was take this right
here as COVID vaccinations, but that was the deaths one as it was saved. So
forget that; let's go right here and do the COVID vaccinations. It just has the
same sheet name, so sorry for the confusion. The destination is going to be the
exact same place: SQL Server Native Client; let's add that 01, click Refresh,
choose PortfolioProject, then Next and Next. Like I said before, if it does
this, just click the first one. It's going to be COVID vaccinations; it did that
for the COVID deaths because I made the mistake earlier. I hope when you're
watching this you aren't super confused. The whole point, in a nutshell: make
two Excel files, one COVID deaths and one COVID vaccinations, upload them, and
then rename them. So we have the same number of rows. Let's refresh this. This
one is actually the COVID vaccinations, and this one is COVID deaths. I'm
telling you, this stuff confuses me sometimes, to be honest.
But we're going to query this really quick to make sure we're actually doing
what we're supposed to be doing. So let's do SELECT everything FROM
PortfolioProject, and you can do .dbo. or you can do two dots (..); I tend to
just do that because it's easier. Let's look at this one and make sure it's the
right table. We have total cases and new cases; perfect. And let's ORDER BY 3,
4, just to make sure that we have everything we're looking for. So this looks
right; this looks like our Excel file. Let's copy this, go down here, change it
to COVID vaccinations, and run this one to make sure the second one came in
correctly as well.
well so perfect so we have our two tables this is fantastic news um and now
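The sanity check described here can be sketched roughly as follows. This is a minimal sketch, not the exact script from the video: it uses Python's built-in sqlite3 with an in-memory database as a stand-in for SQL Server, the table names CovidDeaths and CovidVaccinations and the column layout are assumptions, and every sample row is made up. It also assumes location and date are the third and fourth columns, which is what ordering by 3, 4 relies on.

```python
import sqlite3

# In-memory SQLite is a stand-in for the SQL Server database from the video.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CovidDeaths (
        iso_code TEXT, continent TEXT, location TEXT, date TEXT,
        total_cases INTEGER, new_cases INTEGER
    );
    CREATE TABLE CovidVaccinations (
        iso_code TEXT, continent TEXT, location TEXT, date TEXT,
        new_vaccinations INTEGER  -- placeholder column, not from the video
    );
    -- Made-up sample rows, inserted out of order on purpose.
    INSERT INTO CovidDeaths VALUES
        ('ALB', 'Europe', 'Albania', '2020-03-09', 2, 2),
        ('AFG', 'Asia', 'Afghanistan', '2020-02-25', 1, 0),
        ('AFG', 'Asia', 'Afghanistan', '2020-02-24', 1, 1);
    INSERT INTO CovidVaccinations VALUES
        ('ALB', 'Europe', 'Albania', '2020-03-09', NULL),
        ('AFG', 'Asia', 'Afghanistan', '2020-02-24', NULL);
""")

# Same check for both tables: select everything, ordered by the third and
# fourth columns (assumed here to be location, then date).
ordered = {}
for table in ("CovidDeaths", "CovidVaccinations"):
    ordered[table] = conn.execute(
        f"SELECT * FROM {table} ORDER BY 3, 4"
    ).fetchall()
    print(table, len(ordered[table]), "rows; first:", ordered[table][0])
```

In SQL Server the table would be addressed with the database prefix mentioned above (the dbo or dot dot form); SQLite has no equivalent, so the sketch uses the bare table names.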
And now we can get going. We can keep this one; I'm going to comment it
out in case we want to come back to it. Really quick again: right here I
have another laptop. I have already done this whole project, so I'm just
using it as a guideline to know what I'm doing next, so that I don't waste
everyone's time. So really quickly, let's select the data that we are
going to be using. You don't have to use these comments. I will specify;
I'm going to say, hey, this comment is something I would keep in your
portfolio project. I'm going to add a bunch of extra stuff that is not
needed, just for your purpose, but when you are creating your portfolio
project you shouldn't be adding some of the things that I'm going to be
commenting on. So let's copy this really quick so that it knows what we're
doing. So let's select the location, the date, the total cases, the new
cases, the total deaths, and then population. Now, where we're at, I'm
going to turn off my camera because it's going to start getting in the
way, to be honest. I don't want it to interfere with your ability to see
what we're doing on screen. So it's been great seeing you guys; I'm going
to turn this off and we will continue from here. All right, that should be
turned off, so let's keep running. So this is what we're doing. Let's keep
this going, because I don't like things not being organized. So we have
our location; oh no, we want to do one, two. We want to do it based off
the location and the date; it makes everything easier, I promise you. So
the first one's obviously Afghanistan, here's our date, and we have our
total cases, our new cases, total deaths, and population.
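As a rough illustration of this step, the sketch below selects the six columns and orders by the first two (location, then date). It assumes snake_case column names such as total_cases and new_cases, uses SQLite in place of SQL Server, and the sample rows are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "location TEXT, date TEXT, total_cases INTEGER, new_cases INTEGER, "
    "total_deaths INTEGER, population INTEGER)"
)
# Made-up rows, deliberately inserted out of order.
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Albania", "2020-03-09", 2, 2, None, 2877800),
        ("Afghanistan", "2020-02-25", 1, 0, None, 38928341),
        ("Afghanistan", "2020-02-24", 1, 1, None, 38928341),
    ],
)

# ORDER BY 1, 2 sorts on the first and second selected columns,
# which here are location and date.
rows = conn.execute("""
    SELECT location, date, total_cases, new_cases, total_deaths, population
    FROM CovidDeaths
    ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```

Ordering by position keeps the query short while exploring; naming the columns explicitly in ORDER BY would read more clearly in a finished portfolio script.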
So really quick, I'm just going to scroll down just a second. They started
having the total deaths about a month after they got their first case, it
looks like, and then it just ramps up a lot. We're going to be diving into
all these numbers, what they mean, and how you can do some really simple
calculations on them. But really quickly, we're just going to do a super
simple calculation, and one that we do multiple times for different
things. So let's go right down here and let's say we're going to be
looking at the total cases versus total deaths. So how many cases are
there in this country, and then how many deaths do they have for their
entire cases? Let's say they have a thousand people who've been diagnosed
and they had 10 people who died: what's the percentage of people who died
who had it? So let's go right down here, and I'm just going to copy this
really quick. This is just going to make our life easier; I think you
should do the same as well. So we have location, date, total cases, and
we're going to get rid of our new cases. We don't need that one in this
query right here, nor do you need this population. So let's work on our
calculation really quick; it should be super, super easy. Let me make sure
I'm still recording. Perfect. Oh man, we're almost 25 minutes in, or more,
because I have the intro. So now we want to know the percentage of people
who are dying who actually get infected, or who report being infected. So
we're going to do total_deaths, we'll go right down here, and we're going
to divide that by the total cases. And if we do this really quick, well,
let's go down to where there's actually numbers. So we have 34, we have
one; it's showing 0.029. If you ever try to get a percentage of something,
you have to multiply by 100, so let's do that really quick. All we have to
add is the asterisk sign, times 100. And while we're here, let's just add
the alias. Let's call this death percentage; that works for me. Let's take
a look at this; it'll be a little bit more accurate. So when there were 34
cases there was one death, and that gives us a 2.94% death rate.
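The calculation just described, deaths divided by cases, times 100, with an alias, might look something like this. It is a sketch under assumptions: SQLite stands in for SQL Server, the alias DeathPercentage is a guess at how "death percentage" was typed, and the dates are placeholders. The case and death counts are the ones read out in the narration. The `* 1.0` is there because SQLite does integer division on integer columns, which would otherwise give 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "location TEXT, date TEXT, total_cases INTEGER, total_deaths INTEGER)"
)
# Counts are the ones mentioned in the narration; dates are placeholders.
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [
        ("Afghanistan", "2020-03-23", 34, 1),
        ("Afghanistan", "2021-04-30", 59745, 2625),
    ],
)

# Multiply by 1.0 to force floating-point division, then by 100 to turn
# the ratio into a percentage.
rows = conn.execute("""
    SELECT location, date, total_cases, total_deaths,
           (total_deaths * 1.0 / total_cases) * 100 AS DeathPercentage
    FROM CovidDeaths
    ORDER BY 1, 2
""").fetchall()

for location, date, cases, deaths, pct in rows:
    print(location, date, cases, deaths, f"{pct:.2f}%")
```

With these inputs the two rows work out to roughly 2.94% (1 out of 34) and 4.39% (2,625 out of 59,745).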
And we can go down even further, and this is still all Afghanistan. Let's
go down to the very, very bottom. So as of today, or yesterday, there were
59,745 total cases in Afghanistan and there were 2,625 deaths, which is
about 4%. So you have a 4% chance, basically, right now of dying, if you
want to look at it like that: a 4% chance of dying if you get it and you
live in Afghanistan. You don't have to, but really quick, just to look at
it further, let's look at where the location, let's say like, real quick,
because I'm not 100% sure if it's "States". I think it's "United States".
But yeah, I live in the United States; if you don't, you can look at your
country. This is genuine, real reported data, so it's really interesting.
Right at the beginning, I don't know if it was the way we were reporting
or what, but we had really high percentage rates. As we go down we're
looking at 5%, 6%; this was the peak of it. This got really bad in the US.
I hope it gets better. I'm going to go to the end of this year: we're
sitting at around 2 to 3%. Yeah, it goes down to under 2%. So at the end
of the year we were looking at over 2 million people, no wait, 20 million
people who have been infected. That's a lot. 20 million people who have
had it, and 352,000 deaths by the end of the year. That's a lot. Let's
keep going. And at the very end we had over 32 million, and there's a lot
of deaths: 576,000. I verified this number; I googled it (Google knows
all), and it's pretty accurate. It's really sad; that's a lot of lives.
And that's 1.78%. So as of right now, if you were to get it today, a rough
estimate is around a one and three-fourths to 2% chance that you could die
from it. So, really interesting numbers. This is the kind of exploratory
stuff that we're going to be doing; we're going to get a lot more advanced
as we go on. But this shows the likelihood, and I'm going to write that:
"shows the likelihood" (I hope I'm spelling this right; if not, I
apologize) "of dying if you contract COVID in your country". Again, rough
estimates, but just glancing at the data, that's kind of what we're
looking at.
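Filtering to one country with LIKE, as described above, can be sketched like this. Again this is only an illustration: SQLite replaces SQL Server, the dates are placeholders, and the United States figures are rounded versions of the numbers mentioned in the narration (about 20 million cases and 352,000 deaths at year end, a little over 32 million cases and 576,000 deaths at the end of the data).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "location TEXT, date TEXT, total_cases INTEGER, total_deaths INTEGER)"
)
# Rounded, illustrative figures; dates are placeholders.
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [
        ("Afghanistan", "2021-04-30", 59745, 2625),
        ("United States", "2020-12-31", 20000000, 352000),
        ("United States", "2021-04-30", 32300000, 576000),
    ],
)

# '%states%' matches any location containing "states". SQLite's LIKE is
# case-insensitive for ASCII by default, so "United States" matches.
rows = conn.execute("""
    SELECT location, date, total_cases, total_deaths,
           (total_deaths * 1.0 / total_cases) * 100 AS DeathPercentage
    FROM CovidDeaths
    WHERE location LIKE '%states%'
    ORDER BY 1, 2
""").fetchall()

for location, date, cases, deaths, pct in rows:
    print(location, date, f"{pct:.2f}%")
```

The wildcard pattern is handy when you are not sure of the exact spelling in the data; once you know the value is "United States", an exact match would be stricter.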
Now we're going to look at, let's go down here, the total cases versus the
population. Again, we're going to do a lot of this percentage stuff. It's
pretty simple; that will only last for so long, I promise you. I'm going
to keep it on the States because I'm going to be looking at that one the
most, because obviously it's pretty relevant to me. So if you're in
another country, filter by your country; you'll be really interested in
the stats. I know I was really shocked by a lot of the things that we're
going to find today. So we're going to keep the location, keep the date,
keep the total cases, but let's change this to population. And then
instead of the total cases being here, we're going to put the total cases
there and then change this to population. So what is this going to do for
us? This is going to show us what percentage of the population has gotten
COVID. So: "shows what percentage of population got COVID". Some of these
things, again, are good to know. The one that I upload to GitHub will have
the notes that I recommend keeping. Again, not everything in here is what
you need to have in there; this is mostly just what I think you guys need
to see while we're actually typing this out. All right, so let's take a
look at this. Actually, I want to change this; I want to put this right
here, just because it's easier for me visually with the total cases right
here. So our population in the US is around 331 million. At the beginning,
when we had one case, it's like nothing. Let's keep scrolling and see
where we get to 1%. So 1%, that's about 3.3 million people, and that
happened in August of last year. So that's 1% of the population. Let's
keep going all the way down; again, we're just glancing at this. We're at
about 10%. Again, we're at that 32 million, so 10% of the population has
gotten it, gotten a test, and it's been confirmed. So, really interesting.
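Swapping the denominator to population, as described above, gives the share of the population with a confirmed case. The sketch below assumes a population of 331 million (the rounded figure from the narration), uses placeholder dates and rounded case counts, and picks PercentPopulationInfected as a plausible alias; none of these are taken verbatim from the video's script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "location TEXT, date TEXT, population INTEGER, total_cases INTEGER)"
)
# Rounded, illustrative figures; dates are placeholders.
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [
        ("United States", "2020-01-22", 331000000, 1),
        ("United States", "2020-08-09", 331000000, 3310000),
        ("United States", "2021-04-30", 331000000, 32300000),
    ],
)

# Same pattern as the death percentage, but cases over population.
rows = conn.execute("""
    SELECT location, date, population, total_cases,
           (total_cases * 1.0 / population) * 100 AS PercentPopulationInfected
    FROM CovidDeaths
    WHERE location LIKE '%states%'
    ORDER BY 1, 2
""").fetchall()

for location, date, population, cases, pct in rows:
    print(date, cases, f"{pct:.4f}%")
```

With these inputs the middle row lands at 1% and the last row at just under 10%, which matches the rough figures read off the screen.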
We'll come back to that one, I'm sure, in the future. We might use this
one as a visualization. Again, I'm only looking at the United States right
now, but think about it in terms of how we're going to visualize this in
the future, because a lot of what we're doing we're going to visualize in
the future in Tableau. I have Tableau even open right here; you can see I
have a map. I threw this together in like two seconds. We have the
location, and so this is our future; this is what you need to be
envisioning when you're looking at this data. So we have Afghanistan, and
let's just scroll through: Belarus and Bolivia and Bulgaria and Cambodia,
every single country that is reporting. So we're just looking at the
States, but remember, all of these are going to be used; just something to
remember. I
want to know, and I'm really curious, what countries have the highest
infection rates compared to the population. So we're just looking at our
population up here. How are we going to do this? Let me write it out
really quick: "looking at countries with highest infection rate compared
to population". So that's what this script, or this query, is going to do.
I'm going to copy this. So we're going to keep the location. We are not
going to keep the date; this is not going to be date specific, this is
just going to be overall. And then we're going to look at the max of the
total cases, so we only want to look at the highest. When we were looking
at the US we had 32 million; we don't want to look at every single value
of the total cases, we only look at the very highest one. So we'll look at
the max of total cases, and right here we'll give it an alias, at least
something to recognize it. I guess we can say highest infection count;
that's the highest infection count per country, so per location. And then,
since we don't have max of total cases here, if we just kept total cases
here it would give us the same one that we were looking at in the above
query. What we need to do is look at the max of this, so we're going to
put max and just add parentheses there. And this isn't the death
percentage anymore; I forgot to change it in this last one. This is
percent of population infected. So let's change that for both of these,
because I don't want you to get confused when you're looking at the column
headers later. So we'll look at the percent of population infected. Let's
run this and see what we get: "...is not contained in either the
aggregate..." Oh, I need to add a GROUP BY, of course. So let's add GROUP
BY, and we need to group by both the population and the location. Let's
try that really quick and see if this works. Awesome. Well, we ordered on
location and population, but I really want to look at the highest. Let's
just see really quick; look at some of these numbers. We've got like 1%,
4%, 10%. Okay, so what we want to do is order on this percent of
population infected, so let's go ahead and do that, and let's do that
descending, so the descending gets the highest number first. My goodness,
17%. So what percentage of your population has gotten COVID that's been
reported, we can see that now. The very first one has a small population,
so it doesn't surprise me. But if you look right down here, that's that 32
million that we were talking about; that's that max of total cases, which
is the highest number of our infection count. So we're right up there on
the list. Let's look for other large countries. It's us, there's Israel,
there's Belgium, Portugal, France. So we're up almost to about 10% in a
lot of these countries. Some of us, including the United States, have
really high percentage rates; we just did not keep it under control, and a
large amount of the population has gotten it. That's what this one shows.
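The per-country version of the query (MAX, GROUP BY, then ORDER BY descending) can be sketched as below. The aliases HighestInfectionCount and PercentPopulationInfected are guesses at the spoken names, "Exampleland" is a hypothetical small country, and all figures are made up or rounded. One difference worth flagging: SQL Server raises the "not contained in either an aggregate function or the GROUP BY clause" error when GROUP BY is missing, whereas SQLite is more permissive, so this sketch cannot reproduce that error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "location TEXT, date TEXT, population INTEGER, total_cases INTEGER)"
)
# Two dates per country so MAX has something to pick from.
# "Exampleland" is hypothetical; all numbers are illustrative.
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [
        ("Afghanistan", "2020-12-31", 38900000, 52000),
        ("Afghanistan", "2021-04-30", 38900000, 59745),
        ("Exampleland", "2020-12-31", 77000, 5000),
        ("Exampleland", "2021-04-30", 77000, 13000),
        ("United States", "2020-12-31", 331000000, 20000000),
        ("United States", "2021-04-30", 331000000, 32300000),
    ],
)

# One row per (location, population): the highest case count seen and the
# highest share of the population infected, biggest share first.
rows = conn.execute("""
    SELECT location, population,
           MAX(total_cases) AS HighestInfectionCount,
           MAX(total_cases * 1.0 / population) * 100
               AS PercentPopulationInfected
    FROM CovidDeaths
    GROUP BY location, population
    ORDER BY PercentPopulationInfected DESC
""").fetchall()

for location, population, highest, pct in rows:
    print(location, highest, f"{pct:.2f}%")
```

The small hypothetical country sorts to the top even though its raw case count is tiny, which is the same effect noted above for the first row of the real results.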
Now let's look at the sad side of things. We were just looking at how many
people were infected; let's look at how many people actually died. So
let's comment, and we'll say this is "showing the countries with the
highest death count per population". (Am I spelling that right? Yeah.)
Now, how are we going to do this? Let's copy this off the bat, but I don't
know if we're going to do it the exact same way, because we just need
location and not much else, honestly. So let's get rid of all this stuff.
But we're looking at the highest death count, so like we did up here with
the max of total cases, we're going to do max and then total_deaths, I
hope it's like this, and then we'll do as total death count, and we'll
order that by the total death count. I don't need this, I think. Yeah, I
need the GROUP BY because there's an aggregate function. Let's try this
really quick. Okay, so if you're getting this, there's a simple slash
confusing explanation. Let's go into our COVID deaths columns and look at
total deaths, which is right here. It's an nvarchar(255). It's an issue
with the data type; it just has to do with how the data type is read when
you use this aggregate function. We need to convert it, or cast it is what
we're actually going to do. We need to cast this as an integer so that
it's read as a numeric. Why, I cannot 100% give you a perfect explanation,
but this happens all the time. You just need to look at the data and
realize, oh, it's probably because of this data type, let's try something
else, and then it'll work. So let's cast this (casting I find is just
easier) as int. Boom, there you go. So now we're taking this nvarchar(255)
over here and converting it to an integer. Now let's run this, and let's
get rid of this just for visual purposes. Now we are much more accurate.
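One plausible way to see why the cast matters: when a numeric column is stored as text, MAX compares the values as strings, so "999" beats "2625" because "9" sorts after "2". The sketch below demonstrates that with a TEXT column in SQLite, as a stand-in for the nvarchar(255) column in SQL Server. The alias TotalDeathCount is a guess at the spoken name, and the "999" row is invented purely to make the effect visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# total_deaths is TEXT on purpose, mimicking the nvarchar(255) import.
conn.execute(
    "CREATE TABLE CovidDeaths (location TEXT, date TEXT, total_deaths TEXT)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?)",
    [
        ("Afghanistan", "2020-06-30", "999"),    # invented value
        ("Afghanistan", "2021-04-30", "2625"),
        ("United States", "2020-12-31", "352000"),
        ("United States", "2021-04-30", "576000"),
    ],
)

# Without a cast, MAX compares strings character by character.
text_max = dict(conn.execute("""
    SELECT location, MAX(total_deaths)
    FROM CovidDeaths
    GROUP BY location
""").fetchall())

# With the cast, MAX compares numbers, and the ordering is numeric too.
numeric_max = conn.execute("""
    SELECT location, MAX(CAST(total_deaths AS INTEGER)) AS TotalDeathCount
    FROM CovidDeaths
    GROUP BY location
    ORDER BY TotalDeathCount DESC
""").fetchall()

print("as text:   ", text_max)
print("as integer:", numeric_max)
```

Checking the column's data type first, as done above in the table's column list, is usually the quickest way to spot this kind of problem.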
accurate but we have a slight issue or we're we're now seeing a slight issue
we're we're now seeing a slight issue with our
with our data in our data in the location section
data in our data in the location section we have a few ones that really shouldn't
we have a few ones that really shouldn't be there ones like world or
be there ones like world or Africa um or South America these are
Africa um or South America these are grouping entire
grouping entire continents so let's go back up to our um
continents so let's go back up to our um let's go back up here and let's do
let's go back up here and let's do actually let's pull it up really quick
actually let's pull it up really quick because this is just part of exploring
because this is just part of exploring the data and figuring it out so if we
the data and figuring it out so if we scroll down um we're going to f we're
scroll down um we're going to f we're going to see one like right where is it
going to see one like right where is it right here this this location is all of
right here this this location is all of Asia whereas in other ones the continent
Asia whereas in other ones the continent is Asia if I can pull one up real quick
is Asia if I can pull one up real quick so like right here the continent is Asia
so like right here the continent is Asia whereas before the location is Asia but
whereas before the location is Asia but if you also notice um the continent is
if you also notice um the continent is null here so what we need to do is say
null here so what we need to do is say um uh where
um uh where continent is not null because when it is
continent is not null because when it is null that means that this location is
null that means that this location is actually an entire continent and we
actually an entire continent and we don't want that um that may be helpful
don't want that um that may be helpful for us um later on but it is not helpful
for us um later on but it is not helpful now so now this right here will get rid
now so now this right here will get rid of that um and just knowing that
of that um and just knowing that figuring that out now we can add that to
figuring that out now we can add that to every every
every every script um and we can do you know you
script um and we can do you know you don't have to do this I'm just doing
don't have to do this I'm just doing this for you know visual purposes I'm
this for you know visual purposes I'm not going to do that for everyone um so
not going to do that for everyone um so let's say where continent is not null
And now let's look at this. Now you can see that the United States is number
one, and that is not the best thing to be number one in, but we have a death
count of 576,000. Again, I googled this earlier and these numbers are pretty
accurate; some of them are like a day or two behind. Give me a second, I'm
going to take some water. They're like a couple days behind, so this number is
actually higher, and as we continue to have more people die, unfortunately,
that number just continues to go up. So the data that you download may be a
lot higher. As of right now we've been breaking everything out by location.
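The cast and the continent filter from this step can be tried out in a small, self-contained sketch. It uses Python's built-in sqlite3 module instead of SQL Server, and the rows are invented toy values, not the real dataset; only the column names (continent, location, total_deaths) come from the walkthrough. Storing total_deaths as text is an assumption meant to mimic the nvarchar column.

```python
import sqlite3

# Toy stand-in for the CovidDeaths table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths (continent TEXT, location TEXT, total_deaths TEXT)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?)",
    [
        ("North America", "United States", "900"),
        ("North America", "United States", "1200"),
        ("North America", "Canada", "300"),
        (None, "World", "5000"),          # aggregate row: continent is NULL
        (None, "North America", "1500"),  # aggregate row: continent is NULL
    ],
)

# Without a cast the values are compared as text, so '900' beats '1200'.
text_max = conn.execute(
    "SELECT MAX(total_deaths) FROM CovidDeaths WHERE location = 'United States'"
).fetchone()[0]

# Casting to an integer gives the numeric maximum, and the WHERE clause
# drops rows such as World whose continent is NULL.
highest_death_counts = conn.execute(
    """
    SELECT location, MAX(CAST(total_deaths AS INTEGER)) AS TotalDeathCount
    FROM CovidDeaths
    WHERE continent IS NOT NULL
    GROUP BY location
    ORDER BY TotalDeathCount DESC
    """
).fetchall()

print(text_max)
print(highest_death_counts)
```

The text comparison is why the uncast result looked wrong: as strings, '900' sorts above '1200'. In SQL Server the cast would be written `CAST(total_deaths AS int)`.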
Right, really quickly, let's just do this by something we kind of saw earlier.
I'm just going to do this for breaking-it-up purposes, but I'm going to say,
in caps lock, "LET'S BREAK THINGS DOWN BY CONTINENT". How do you spell
continent? Jeez, is that even how you spell it? I don't even know. Let's keep
going. Now we can do continent right here, and we'll just copy and paste that.
Let's get that back up here, and now we can say "where continent is not null".
Let's see if that... yeah, okay. So now it's breaking it out by continents,
with North America, South America, Asia, Europe, Africa, Oceania. Is this
perfect? No, it's not perfect. North America looks like it's only including
the numbers from the United States and not Canada, so we have some small
issues in here. I don't think anyone's going to come in here and fact-check
us or check the data; they may, and then, I don't know, you might be screwed.
But for the purposes of hierarchy and that drill-down effect in Tableau, which
is something we are going to do, we want to start including this continent in
our queries so that we can drill down further into these things.
We can also do... just wait, I'm going to do "where ... is null". Actually,
let me see. Before, we were doing "where continent is not null", but let's do
location. I'm doing this on the fly; I haven't done this before. These
actually are the correct numbers, and I don't know why I didn't do this before
when I was actually creating this project, but now this is a wonderful,
beautiful thing. I believe these are the correct numbers. I could verify, but
I don't want to do that live because I might look stupid, but I think this is
accurate. Remember, before we were looking at the location, and it was
actually the countries themselves, and then there were ones where we did
"where continent is not null" to get rid of all the ones that were like World
and all those other things. Well, now I'm just filtering on those instead of
deleting them. Before, we were looking at everything but these; now we're only
looking at these, and these numbers look a lot more accurate. So with that
being said, I'm going to use this going forward in my script, so I'm going to
kind of change things up from what I originally had. Let me see, though,
because if that is the case it may screw up our drill-down effect, which is
highly unfortunate. I honestly might just revert back to it for the pure fact
that we want the visualizations to look correct. Just know that this is the
right way, and if you want to go back and do that, I highly encourage it. I
didn't figure that out my first time around, but I'm willing to admit when I'm
wrong.
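To make the difference between the two continent queries concrete, here is a toy comparison using Python's built-in sqlite3 module. The rows are invented, and the second query reflects my reading of the on-the-fly change (filter to rows where continent IS NULL and group by location); treat it as a sketch, not the exact script from the video.

```python
import sqlite3

# Invented toy rows: two countries per continent plus the pre-aggregated
# rows (continent is NULL) that the dataset stores in the location column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths (continent TEXT, location TEXT, total_deaths TEXT)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?)",
    [
        ("North America", "United States", "1200"),
        ("North America", "Canada", "300"),
        ("Europe", "France", "400"),
        ("Europe", "Italy", "500"),
        (None, "North America", "1500"),
        (None, "Europe", "900"),
        (None, "World", "2400"),
    ],
)

# Grouping by the continent column: MAX picks the largest single country,
# so North America shows only the United States figure.
by_continent_column = conn.execute(
    """
    SELECT continent, MAX(CAST(total_deaths AS INTEGER)) AS TotalDeathCount
    FROM CovidDeaths
    WHERE continent IS NOT NULL
    GROUP BY continent
    ORDER BY TotalDeathCount DESC
    """
).fetchall()

# Filtering to the aggregate rows instead: each location here already is a
# continent (or World), so the totals include every country.
by_aggregate_rows = conn.execute(
    """
    SELECT location, MAX(CAST(total_deaths AS INTEGER)) AS TotalDeathCount
    FROM CovidDeaths
    WHERE continent IS NULL
    GROUP BY location
    ORDER BY TotalDeathCount DESC
    """
).fetchall()

print(by_continent_column)
print(by_aggregate_rows)
```

Note that the second result still contains the World row, so a visualization built on it would probably need to exclude that location.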
Let me do a time check. We've run like 50 minutes or so. I think we're just
going to keep going all the way through; I don't think we're going to stop in
this project. The above queries were kind of what we were going for, nothing
crazy difficult, nothing crazy hard, and now we want to start breaking this
out by continent as well. I'm going to go back... is this correct? Let me
look. No, so "is not null". So we want to start doing some of the above
queries but adding that continent in there. You can even go back and add that
as well if you want to; that's totally fine. I'm going to do some more queries
down here, or at least one or two more, and then we're going to start getting
into some slightly more advanced things. We're going to start getting into
some temp tables, stuff like that, because we're eventually going to set these
up in views so that we have these views to use for Tableau later. And again,
it shows you know how to create a view, so that's important.
So we've done this first one. This next one is going to... let me go down one
more. This is showing the continents with the highest death count, so almost
the exact same as we did before, but now we're looking at the continents. We
can even go up and look at... just wait, we literally just did that, so that's
what this one is. I was actually looking at my notes wrong. Idiot. Okay,
perfect. Now we want to start looking at this from the viewpoint of "I'm going
to visualize this", so how do we do that? What do we want to look at? Let's
look at some global numbers. You can do as many of these as you want: for
anything up here, just add continent to it, and anywhere there's a GROUP BY,
just replace it with continent and you've got it. I don't want to go through
and do every single one of those, but that is the gist of what you might want
to do, especially if you want that drill-down effect. If you don't know what
that is, it's like clicking on North America, and when you bring up North
America it shows all the countries in North America, so Canada and the United
States. So it's a drill-down: you click on Africa and then there's all the
African countries. That's what drilling down does, and that's what you can do
when you have those layers: you have the continent, then you have the
location. We'll look at that when we actually get to Tableau, but I don't want
to spend all the time writing that out. What we now want to do is calculate
everything across the entire world. So let's do this. Let's say... let's just
say "global numbers"; easier than nothing. All right, let me really quickly
find... I think it's probably the first one, the death percentage. Let me see
if this is the one that we want.
Okay, let me see. All right, so let's take this one. I'm sorry that took me a
while to find. Again, I'm not cutting any of this stuff out; you've just got
to stick with me. If you're sticking with me this long, I know you care, and I
know you're not cutting away because I'm trying to figure things out on my
side. So let me get rid of this. This is the exact same script... well, let's
say "where", just so we can get the right numbers. So we are now going to look
at the global numbers. We're not going to include any location, any continent,
or anything like that, but we do want to make sure that we're only looking at
the countries and not at the World numbers plus all the countries, because
then the numbers would get astronomical. So let's try running this really
quick. Now, we really can't do this, because it's breaking everything out by
the date, since these total cases numbers are different, right? So really
quick, let's group by date, and now let's see what it looks like. It's going
to give us an error, obviously. That's because when we're looking at this
we're looking at multiple things, and we can't group by just the date. If we
want to group by something, which we need to do, we then need to start using
aggregate functions on everything else. So really quickly, let's do some
aggregate functions. I'm looking at my notes for just a second to see what I
did. Basically what we want to do, and what I think will make things easier...
I mean, I could try to do the sum of MAX(total_cases), but I don't think
that's possible. Let me comment this out really quick. Yeah, it's because
there's an aggregate function within an aggregate function, and we can't
really do that.
If we go back to the data, and we kind of looked at this earlier, there's one
called new_cases. Let's use this, because instead of doing MAX we can just do
a SUM on it, and that's going to give us the sum of all the new cases, which
adds up to the total cases. So if we do this, this will give us, on each day,
the total across the world, because we're filtering out rows like World and
the actual continents, but we're not filtering by location or continent or
anything. It's just by date, so we're looking at the sum of the new cases.
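The nested-aggregate problem and the new_cases workaround can be sketched with Python's built-in sqlite3 module. The rows and the names `Country A` and `Country B` are invented; SQLite happens to reject a nested aggregate as well, although its error text differs from SQL Server's.

```python
import sqlite3

# Invented toy rows: two countries over two days, plus World aggregate rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "continent TEXT, location TEXT, date TEXT, total_cases REAL, new_cases REAL)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?, ?)",
    [
        ("Asia", "Country A", "2020-01-23", 60.0, 60.0),
        ("Asia", "Country A", "2020-01-24", 210.0, 150.0),
        ("Europe", "Country B", "2020-01-23", 40.0, 40.0),
        ("Europe", "Country B", "2020-01-24", 90.0, 50.0),
        (None, "World", "2020-01-23", 100.0, 100.0),
        (None, "World", "2020-01-24", 300.0, 200.0),
    ],
)

# An aggregate placed directly inside another aggregate is rejected.
try:
    conn.execute(
        "SELECT date, SUM(MAX(total_cases)) FROM CovidDeaths GROUP BY date"
    )
    nested_aggregate_failed = False
except sqlite3.OperationalError as exc:
    nested_aggregate_failed = True
    print("nested aggregate rejected:", exc)

# Summing the per-day new_cases column needs only one aggregate per column.
daily_new_cases = conn.execute(
    """
    SELECT date, SUM(new_cases) AS world_new_cases
    FROM CovidDeaths
    WHERE continent IS NOT NULL
    GROUP BY date
    ORDER BY date
    """
).fetchall()

print(daily_new_cases)
```

Because the World rows are filtered out by `continent IS NOT NULL`, each day's sum counts every country exactly once.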
So now let's do the sum of new_deaths, and we can run that one. "Operand data
type nvarchar is invalid for sum operator." So going back, and this is
something I encountered a lot when I was doing this: new_cases is a float,
which is why it works in the sum, but new_deaths is an nvarchar. So what we
need to do again is cast that as an integer; it's just the easiest thing to
do. And now that one should work.
So let's get rid of... well, let's get rid of down to here. We're about to do
another one, and that's going to be our death percentage globally, across the
world. So we need to do the sum of new_deaths divided by the sum of new_cases,
times 100. Let's see what this gets us. Okay, of course we're getting the same
thing. Let me put this right here and see if this works. Invalid data... oh,
that's because this was new_cases; the new_deaths one is right here. Let's run
this, and now we are looking good. As you can see, the death percentage is
right here; we have 91. And let me give these... let me go back real quick and
just say "as total_cases" and "as total_deaths", and let's run that again.
Okay, so across the world these are our numbers. On that very first day that
cases were starting to be reported, there were 98 total cases and one total
death. That gives us a death percentage of about 1% across the world. As we
scroll down it gets lower and lower, and that's because we have a lot of
people who have gotten infected, which are the total cases. And again, that's
per day, right? So if we remove the date altogether, which we can do right
now, this will give us the total cases, which is, oh gosh, let me read this
through, 150 million, versus about 3.18 million deaths. So overall across the
world we are looking at a death percentage of a little over 2%. So,
interesting numbers. You can keep both of those queries separate if you'd
like; they might come in handy later.
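Here is a minimal sketch of the two global-numbers queries, one per day and one overall, using Python's built-in sqlite3 module. The rows and the names `Country A` and `Country B` are invented, and new_deaths is stored as text on purpose to mirror the nvarchar column. SQL Server raises an error when summing an nvarchar; SQLite is more lenient, but the cast is kept so the query matches the walkthrough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "continent TEXT, location TEXT, date TEXT, new_cases REAL, new_deaths TEXT)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?, ?)",
    [
        ("Asia", "Country A", "2020-01-23", 60.0, "1"),
        ("Europe", "Country B", "2020-01-23", 40.0, "0"),
        ("Asia", "Country A", "2020-01-24", 150.0, "3"),
        ("Europe", "Country B", "2020-01-24", 50.0, "1"),
        (None, "World", "2020-01-23", 100.0, "1"),  # excluded by the filter
        (None, "World", "2020-01-24", 200.0, "4"),  # excluded by the filter
    ],
)

# Per-day global numbers. new_cases is a float, so the division is a
# floating-point division rather than an integer one.
per_day = conn.execute(
    """
    SELECT date,
           SUM(new_cases) AS total_cases,
           SUM(CAST(new_deaths AS INTEGER)) AS total_deaths,
           SUM(CAST(new_deaths AS INTEGER)) / SUM(new_cases) * 100
               AS DeathPercentage
    FROM CovidDeaths
    WHERE continent IS NOT NULL
    GROUP BY date
    ORDER BY date
    """
).fetchall()

# Dropping the date column and the GROUP BY collapses everything to one row.
overall = conn.execute(
    """
    SELECT SUM(new_cases) AS total_cases,
           SUM(CAST(new_deaths AS INTEGER)) AS total_deaths,
           SUM(CAST(new_deaths AS INTEGER)) / SUM(new_cases) * 100
               AS DeathPercentage
    FROM CovidDeaths
    WHERE continent IS NOT NULL
    """
).fetchone()

for row in per_day:
    print(row)
print(overall)
```

With these toy rows the per-day query returns one row per date, and the overall query returns a single row of 300 cases and 5 deaths.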
But let's do this. So we have... give me one second to check my notes again,
because I just want to make sure I'm doing this right. All right, so again, we
have a whole other table that we haven't used yet: it's this CovidVaccinations
table. Just to refresh your memory, let's look at the CovidVaccinations table
from PortfolioProject. Let's jog our memory on what we've got here. So we have
these tests, and we have vaccinations over here, which is what we're actually
going to be using. Excuse me. So let's join these two tables together.
Actually, let's just do this whole thing: "from", let's do CovidDeaths, and
here's how we're going to join it. We're going to say "join", and... oops,
wait, that is wrong... "join", and we're going to say "on". So what are we
going to join them on? We're going to join them on two things: on location,
because that's much more specific than the continent, and on date. Let's call
this one dea and this one vac, a little alias for these so that we don't have
to type out the entire table name each time. So let's do dea.location is equal
to vac.location, and dea.date is equal to vac.date. Let's just see what we get
really quick. So we'll have all of these things. Let's look at Grenada, 07/17;
let's go all the way over here, and it should have Grenada, 07/17. So just
making sure that they were joined correctly.
For this query, what we're going to do is look at the total population, and
let's write that right here: "Looking at Total Population vs Vaccinations".
So, what is the total amount of people in the world that have been vaccinated?
That is what we're going to do in this query right here. So let's do
dea.continent, dea.location, dea.date. Again, these are going to be the same
in either table, but we have to specify. For example, if we do population...
oh, actually that's a terrible example, because population is only in one. Let
me go back real quick. Let's say I only write date: that's going to give me an
error because there's a date column in both of them. In fact we joined on
them, so we know there's date in both of them, so it's going to give us an
error. We just have to specify which table we want to pull it from. So we're
going to do dea, and dea.population just to keep it consistent. And now we're
going to add the next one, vac.new_vaccinations. Really quick, let's just look
at this. And let me get my ORDER BY, because I want it to be organized. Let's
do 1, 2, 3. I don't like it when it's not organized; it bothers me. Oh no, I
also need to add "where dea.continent is not null". There we go. Perfect. Now
let's run this; this should look much better. There we go. In fact, if we want
to look at Afghanistan like we have normally been doing in previous ones, we
do 2, 3. So there's our population, here's our new vaccinations.
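The join described above, and the ambiguous-column error that comes from leaving off the table alias, can be sketched as follows with Python's built-in sqlite3 module. The rows are invented, the table layouts are reduced to the handful of columns mentioned in the walkthrough, and SQLite's error wording differs from SQL Server's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "continent TEXT, location TEXT, date TEXT, population INTEGER)"
)
conn.execute(
    "CREATE TABLE CovidVaccinations ("
    "continent TEXT, location TEXT, date TEXT, new_vaccinations TEXT)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [
        ("North America", "Canada", "2020-12-15", 2000),
        ("Asia", "Afghanistan", "2021-02-22", 1000),
        ("Asia", "Afghanistan", "2021-02-23", 1000),
        (None, "World", "2021-02-22", 3000),
    ],
)
conn.executemany(
    "INSERT INTO CovidVaccinations VALUES (?, ?, ?, ?)",
    [
        ("North America", "Canada", "2020-12-15", "5"),
        ("Asia", "Afghanistan", "2021-02-22", None),
        ("Asia", "Afghanistan", "2021-02-23", "10"),
        (None, "World", "2021-02-22", "15"),
    ],
)

# Both tables have a date column, so an unqualified "date" is ambiguous.
try:
    conn.execute(
        """
        SELECT date
        FROM CovidDeaths dea
        JOIN CovidVaccinations vac
            ON dea.location = vac.location AND dea.date = vac.date
        """
    )
    ambiguous_column_failed = False
except sqlite3.OperationalError as exc:
    ambiguous_column_failed = True
    print("unqualified column rejected:", exc)

# Qualifying every column with its alias resolves the ambiguity.
# ORDER BY 2, 3 sorts by the second and third selected columns.
population_vs_vaccinations = conn.execute(
    """
    SELECT dea.continent, dea.location, dea.date, dea.population,
           vac.new_vaccinations
    FROM CovidDeaths dea
    JOIN CovidVaccinations vac
        ON dea.location = vac.location
        AND dea.date = vac.date
    WHERE dea.continent IS NOT NULL
    ORDER BY 2, 3
    """
).fetchall()

for row in population_vs_vaccinations:
    print(row)
```

Joining on both location and date matters here: joining on location alone would pair every Afghanistan row in one table with every Afghanistan row in the other.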
Now let's see. We're going to go down, and let's see... they have vaccinations
starting on 2/18. If we go even further down, let's just go to... who's this,
Canada? Oh yeah, Canada would be a good one to look at. They started doing
vaccinations right here, so 12/15. I mean, they started very early, and their
numbers only increased. And this is per day, right? So this is 288,000 in one
day, so those are really high numbers, but this is the number of new
vaccinations. There is a column called total_vaccinations in this table, but
we're going to do something just for display purposes. Again, this whole
portfolio project is to show potential employers that you know how to do
certain things, so I want to set up opportunities to do that. We're not going
to use total_vaccinations; we're going to use new_vaccinations, which is new
vaccinations per day. So we want to do kind of a rolling count out here. As
this number increases (let me go back to the beginning: 718, 2,300, 4,179), we
want it to add up over here. It's a pretty cool thing. Once you see it you'll
be like, "Oh, that's pretty easy," but we're going to be using PARTITION BY,
we're going to be using a window function, so it's really good to showcase, I
think. So we need to do the sum, because we're going to be adding these
together: the sum of new_vaccinations. Then let's do OVER, and we're going to
say PARTITION BY. We need to partition by the location first and foremost. If
we do it by continent the numbers are going to be completely off. We need to
do it by location, and also partly the date, but you'll see that in just a
second. Why location? Because every time it gets to a new location we want the
count to start over. We don't want this aggregate function to just keep
running and running; it'll ruin all of our numbers. We only want the partition
on the location so that it runs only through Canada, and then when it gets to
the next country it doesn't keep going.
doesn't keep going um and if we only did that by the way let's look at what this
that by the way let's look at what this looks like uh okay real quick I need to
OK, real quick, I need to cast this as an integer like we've been doing in the
past. I also want to show you another one you can do: CONVERT. I think it's the
column, comma, integer... or is it integer, comma? Let me try integer, comma. I
think it's that way, actually. You can do it this way as well; that is up to
you. Either one is totally fine. If you want to use both, that's even better,
because then it kind of shows you can do both, but they basically do the exact
same thing.
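The two conversion styles mentioned here can be sketched side by side. This is a minimal sketch that assumes SQL Server's T-SQL dialect; the literal '718' is only a stand-in for a text-typed new_vaccinations value and is not taken from the real table definition.

```sql
-- CAST and CONVERT perform the same conversion; only the syntax differs.
-- With CONVERT, the target data type comes first and the value second.
SELECT
    CAST('718' AS int)  AS via_cast,
    CONVERT(int, '718') AS via_convert;
-- Both columns come back as the integer 718.
```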
So let's go down and see what's happening here. It goes down to Albania, and
since we're partitioning on location, Albania's total amount of vaccinations is
347,000. I know that going into it because it has it on every single stinking
row. But down here the new vaccinations started to add up, right? We didn't do
that. We only partitioned on location, so it did the sum of all the new
vaccinations by that location.
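What partitioning on location alone does can be sketched with a few rows. This is a hedged illustration in T-SQL, not the walkthrough's actual query: the inline sample_vax rows, the vax_date column name, and the Canada values are made up, and only the Albania daily figures echo numbers read out in the video.

```sql
-- Illustrative rows only; dates and the Canada figures are invented.
WITH sample_vax AS (
    SELECT location, CAST(vax_date AS date) AS vax_date, new_vaccinations
    FROM (VALUES
        ('Albania', '20210113', '60'),
        ('Albania', '20210114', '78'),
        ('Albania', '20210115', '42'),
        ('Canada',  '20210113', '100'),
        ('Canada',  '20210114', '200')
    ) AS v (location, vax_date, new_vaccinations)
)
SELECT
    location,
    vax_date,
    new_vaccinations,
    -- No ORDER BY inside OVER: each row receives its location's grand total.
    SUM(CAST(new_vaccinations AS int))
        OVER (PARTITION BY location) AS location_total
FROM sample_vax
ORDER BY location, vax_date;
-- Every Albania row shows 180 and every Canada row shows 300.
```

The total repeats on every row of a location, which is the "same number on every single row" effect described above.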
So what we need to do is go over here and say ORDER BY, and we need to order it
by both the location (dea.location) and the date. That is very important. The
date is what's going to separate it out, and you'll see in just a second what I
mean. So now let's run this and go back down to Albania. Here's Albania; let's
go to our first one. Here's what we have: we have 60, and it gives us 60. Then
we add 78, so 60 + 78 = 138. Then 60 + 78 + 42 = 180. Then 60 + 78 + 42 + 61 =
241. So you get the point: it adds up every single consecutive one. And when
there are nulls or zeros, it's not going to add anything; it's just going to
keep it going. And then you can see it's a rolling count.
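The rolling count walked through above can be reproduced on a tiny sample. This is a sketch under assumptions: the inline rows, the dates, the trailing NULL row, and the vax_date column name are illustrative, while the daily values 60, 78, 42, 61 are the ones read out for Albania.

```sql
-- Illustrative rows only; the dates, the NULL row, and Canada are invented.
WITH sample_vax AS (
    SELECT location, CAST(vax_date AS date) AS vax_date, new_vaccinations
    FROM (VALUES
        ('Albania', '20210113', '60'),
        ('Albania', '20210114', '78'),
        ('Albania', '20210115', '42'),
        ('Albania', '20210116', '61'),
        ('Albania', '20210117', NULL),
        ('Canada',  '20210113', '100')
    ) AS v (location, vax_date, new_vaccinations)
)
SELECT
    location,
    vax_date,
    new_vaccinations,
    -- Adding ORDER BY turns the per-location total into a running total.
    SUM(CONVERT(int, new_vaccinations))
        OVER (PARTITION BY location ORDER BY location, vax_date)
        AS RollingPeopleVaccinated
FROM sample_vax
ORDER BY location, vax_date;
-- Albania: 60, 138, 180, 241, 241 (the NULL row adds nothing).
-- Canada starts over at 100.
```

Ordering by location inside the window is redundant once the data is already partitioned by location, but it is kept here to mirror what is typed in the walkthrough.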
So we're going to name this; let's do AS RollingPeopleVaccinated. Let's call it
that. I think that's good. Now what we want to do is actually look at the total
population versus the vaccinations. And really what we want to do is use this
RollingPeopleVaccinated. We want to use the max number, because at the very
bottom is our max number: this is how many people in Albania are vaccinated. We
want to use that number and divide it by the population to know how many people
in that country are vaccinated. So we'll do RollingPeopleVaccinated divided by
population, times 100. And as you can see, we're getting an error: you can't
take a column that you just created and then use it in the next one. So what we
need to do is create either a CTE or a temp table. This is the time of the
show, of this tutorial, whatever you want to call it, where I'm going to give
you some options. You can do one, you can do both; there's no preference to me.
But we're going to take this, and at least for this first one we're going to
use a CTE. So we're going to say WITH, and let's call it PopvsVac, I don't
know, population versus vaccination. And then all we need to do is specify
basically the columns that we're going to input. So let's put AS and insert
that down here. For the columns we do continent (oh gosh, I'm so bad at
spelling continent), location, date, population, and then we'll have this
RollingPeopleVaccinated. That should be it. Let's see, we just need to close
this parenthesis. So this is our CTE; it should be working. Actually, that's
not true: I need an open parenthesis here, that's why it's giving me that
error. Let's see, I'm still getting an error, so let me see if I'm doing
something wrong. Parentheses there and there. I say WITH PopvsVac, then
continent, location, date, population... ah, I believe that is the issue. We
just need to add that last column, new_vaccinations. If the number of columns
in the CTE is different than the number of columns here, it's going to give you
an error, so you've got to make sure.
And then, just for right now, let's say select everything from PopvsVac; it'll
come up right away. So really quickly, let's run this and see what happens. The
ORDER BY clause can't be in there. I knew that, but whoops. Let's comment that
out, get that all the way up here, and run this. So now that query that we were
looking at before is in here, but now we can actually use it to perform further
calculations. So we'll just do everything, comma, and then
RollingPeopleVaccinated divided by population, times 100. I'm pretty sure this
is incorrect; give me a second. Invalid object... oh, it's because I have to
run it with the CTE. My bad. So let's look at this percentage really quick.
It's not wrong, and it's actually going to give us a rolling number, and this
may actually be what we want.
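The CTE pattern described above can be sketched as follows. This is a minimal, hedged T-SQL example rather than the exact query from the video: the sample_vax rows, the population figures, the vax_date column name, and the PercentVaccinated alias are assumptions, and the continent column is left out to keep it short.

```sql
-- Reusing an alias in the same SELECT list fails, for example:
--   SUM(...) OVER (...) AS RollingPeopleVaccinated,
--   (RollingPeopleVaccinated / population) * 100   -- invalid column name
-- Wrapping the query in a CTE makes the alias available one level up.
WITH sample_vax AS (   -- made-up stand-in for the real source tables
    SELECT location,
           CAST(vax_date AS date)    AS vax_date,
           CAST(population AS float) AS population,
           new_vaccinations
    FROM (VALUES
        ('Albania', '20210113', 1000, '60'),
        ('Albania', '20210114', 1000, '78'),
        ('Albania', '20210115', 1000, '42'),
        ('Canada',  '20210113', 5000, '100')
    ) AS v (location, vax_date, population, new_vaccinations)
),
-- The column list must have as many names as the inner SELECT has columns.
PopvsVac (location, vax_date, population, new_vaccinations,
          RollingPeopleVaccinated) AS (
    SELECT location, vax_date, population, new_vaccinations,
           SUM(CONVERT(int, new_vaccinations))
               OVER (PARTITION BY location ORDER BY location, vax_date)
    FROM sample_vax
    -- No ORDER BY here: SQL Server rejects it inside a CTE unless
    -- TOP, OFFSET, or FOR XML is also specified.
)
SELECT *,
       (RollingPeopleVaccinated / population) * 100 AS PercentVaccinated
FROM PopvsVac
ORDER BY location, vax_date;
```

The population is cast to float in this sketch so the division is not truncated to an integer. The WITH clause and the final SELECT have to be run together as one statement, which is the cause of the "invalid object" message mentioned above.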
So basically what it's doing is taking this column and doing it versus this
column. This number should only increase, because as this number increases,
this number will increase, because the population stays stagnant. Again, I'm
kind of looking at this as we go. So right now 12% of the population in Albania
is vaccinated. That's all we know. I don't think we need to go any further than
that. If you want to, you can look at the max one, but you'll have to get rid
of date and just keep the location, population, etc., because the date is going
to throw everything off. So if that's something you want to do, absolutely do
that.
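The optional "max" variant is not written out in the walkthrough, so the following is only one possible sketch of it. It assumes T-SQL, made-up sample rows, and hypothetical aliases (PeopleVaccinated, PercentVaccinated); the key point is that date is dropped and the rows are grouped by location and population.

```sql
WITH sample_vax AS (   -- made-up stand-in for the real source tables
    SELECT location,
           CAST(vax_date AS date)    AS vax_date,
           CAST(population AS float) AS population,
           new_vaccinations
    FROM (VALUES
        ('Albania', '20210113', 1000, '60'),
        ('Albania', '20210114', 1000, '78'),
        ('Canada',  '20210113', 5000, '100')
    ) AS v (location, vax_date, population, new_vaccinations)
),
PopvsVac AS (
    SELECT location, population,
           SUM(CONVERT(int, new_vaccinations))
               OVER (PARTITION BY location ORDER BY location, vax_date)
               AS RollingPeopleVaccinated
    FROM sample_vax
)
-- One row per location: the largest rolling value is the latest total.
SELECT location,
       population,
       MAX(RollingPeopleVaccinated) AS PeopleVaccinated,
       (MAX(RollingPeopleVaccinated) / population) * 100 AS PercentVaccinated
FROM PopvsVac
GROUP BY location, population
ORDER BY location;
```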
You can use a temp table here. We can look at how to do that really quickly, I
think, so that you guys know how to do that. Again, I recommend throwing in one
or two of these. Like, even up here, you can do different counts and then do
one for each. So let's do the temp table. All right, it's going to be a lot of
the same stuff. We're going to keep this, and this is going to be what we
insert. So let's say INSERT INTO, and we need to write where we're inserting it
into. Again, it's going to be basically the same, it's going to have the same
effect, but with a temp table. So we're going to do the temp table, and let's
call it PercentPopulationVaccinated.
And we need to specify our columns. So let's go down here and do basically the
exact same thing. So continent... I think I spelled that right. No, I didn't
spell that right. I almost did; I got really confident. And just so you know,
for these we have to specify the data type as well, because we're basically
creating a genuine table; it's just a temporary one. So let's do nvarchar(255).
We'll do location, and we'll do the same thing, nvarchar(255). We need to do
date, and we'll do that as datetime. We'll do population, and, I mean, there
are lots of different ones we can do, but we'll do numeric for this example.
There's new_vaccinations, and let's do that one as numeric; again, you can use
different things. And then we'll do RollingPeopleVaccinated. This can be
numeric as well.
And then we need to insert that into here. OK, so we're inserting the data, and
then down here we can actually select it. Let's take this and do it right here,
except we're going to be doing it from this right here. It hasn't been created
yet, but it will be created in just a second. OK, so these were the rows that
were affected, and then we got our actual output from this right here.
Now let's say you wanted to change something in here. You're like, "Oh, I don't
want to do it with this." Let me comment that out, and then let me run this and
create that table again. Oh no, we got an error. How can we get around this?
Very simple. You can do DROP TABLE IF EXISTS and then do this right here, and
when we run this it should give us our output. I highly recommend just adding
this, especially if you plan on making any alterations, so that when you run it
multiple times you don't have to go and delete the view or drop the temp table.
It's just built in, it's at the top, it's easy to maintain, and it looks good.
It's something that a lot of people do. And so if you have that at the top of
your query and somebody who wants to hire you looks at this, they're like, "Oh,
OK, that makes sense. I'm glad they included that. They know what they're
doing. This guy's smart. I should hire them."
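The temp table steps above, including the DROP TABLE IF EXISTS guard, can be sketched end to end. This is a hedged T-SQL example: the leading # (SQL Server's marker for a local temp table), the inline sample rows, and the PercentVaccinated alias are assumptions, while the column names and data types follow the ones spoken in the walkthrough. DROP TABLE IF EXISTS needs SQL Server 2016 or later.

```sql
-- Safe to rerun: drop the temp table first if an earlier run left it behind.
DROP TABLE IF EXISTS #PercentPopulationVaccinated;

CREATE TABLE #PercentPopulationVaccinated (
    Continent               nvarchar(255),
    Location                nvarchar(255),
    Date                    datetime,
    Population              numeric,
    New_vaccinations        numeric,
    RollingPeopleVaccinated numeric
);

-- Made-up rows stand in for the real source tables.
INSERT INTO #PercentPopulationVaccinated
SELECT
    v.continent, v.location, v.vax_date, v.population, v.new_vaccinations,
    SUM(CONVERT(int, v.new_vaccinations))
        OVER (PARTITION BY v.location ORDER BY v.location, v.vax_date)
FROM (VALUES
    ('Europe',        'Albania', '20210113', 1000, '60'),
    ('Europe',        'Albania', '20210114', 1000, '78'),
    ('North America', 'Canada',  '20210113', 5000, '100')
) AS v (continent, location, vax_date, population, new_vaccinations);

-- The stored rolling column can now be used in further calculations.
SELECT *,
       (RollingPeopleVaccinated / Population) * 100 AS PercentVaccinated
FROM #PercentPopulationVaccinated
ORDER BY Location, Date;
```

Without the first statement, running the script a second time in the same session fails because the temp table already exists.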
Now, I feel like I've shown you as much as I can show you with the limited data
that we've looked at. Again, I could have done this for six hours straight if I
had used all the data; there's just so much data. But let's create a view. I'm
only going to show you how to create one view, but I want you to go back and
create multiple views. If this is one that you want to look at, these global
numbers (let's look at this one really quick), if you want to look at this
number right here, toss it in a view. I mean, that one doesn't make sense to
toss in a view, but this one: toss these numbers in a view. We're going to look
at it in Tableau later, but for right now let's just create our view. So let's
just say: creating view to store data for later visualizations.
All right, so let's say CREATE VIEW, and I'm just going to keep the same thing,
like that. For views it's so easy. I can even take the ORDER BY, I believe;
we'll see if I'm correct. Actually, let's get rid of both of these things. So
it says CREATE VIEW PercentPopulationVaccinated. Let's see, am I doing anything
wrong here? Let me see... the ORDER BY clause. I was completely wrong; I was
wondering why I was getting that. Now let's try running it. OK, so it ran
successfully. Let's look at our views. It's not going to be in there; let's
refresh it. Hey, look, we got our very first view. We can open that up like a
table if we want to. I mean, it's gorgeous. If you want to get rid of that,
Ctrl+Shift+R, that's a refresh, and now it basically recognizes it. But let's
go back here for a second. We can now query off of that. It's a view now, so
it's permanent; you have to go in and actually delete it. It's not like a temp
table. This is now permanent, and this could be something that we now use for a
visualization later. So do some of these: look at some of the queries that
we've looked at and create a few of these views, and we will use them later.
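Creating a view like the one described can be sketched as below. This is a self-contained illustration under stated assumptions: dbo.VaxDemo and dbo.PercentPopulationVaccinatedDemo are hypothetical names, the rows are made up, and a single demo table stands in for the COVID tables the walkthrough queries. GO is a batch separator understood by tools such as SSMS and sqlcmd, not a T-SQL statement.

```sql
-- Hypothetical demo objects so the sketch runs on its own (SQL Server 2016+).
DROP VIEW IF EXISTS dbo.PercentPopulationVaccinatedDemo;
DROP TABLE IF EXISTS dbo.VaxDemo;

CREATE TABLE dbo.VaxDemo (
    location         nvarchar(255),
    vax_date         date,
    population       float,
    new_vaccinations nvarchar(255)
);

INSERT INTO dbo.VaxDemo (location, vax_date, population, new_vaccinations)
VALUES
    ('Albania', '20210113', 1000, '60'),
    ('Albania', '20210114', 1000, '78'),
    ('Canada',  '20210113', 5000, '100');
GO

-- CREATE VIEW must be the first statement in its batch, hence the GO above.
-- A view definition cannot end with ORDER BY unless TOP, OFFSET, or FOR XML
-- is also specified, so the ordering is applied when the view is queried.
CREATE VIEW dbo.PercentPopulationVaccinatedDemo AS
SELECT
    location, vax_date, population, new_vaccinations,
    SUM(CONVERT(int, new_vaccinations))
        OVER (PARTITION BY location ORDER BY location, vax_date)
        AS RollingPeopleVaccinated
FROM dbo.VaxDemo;
GO

-- The view persists until it is dropped and can be queried like a table.
SELECT *
FROM dbo.PercentPopulationVaccinatedDemo
ORDER BY location, vax_date;
```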
views um and we will use them later um normally in like a normal setting uh
um normally in like a normal setting uh if I was actually working I would put
if I was actually working I would put some of these in actual like I would
some of these in actual like I would call them like a work view or a work
call them like a work view or a work table or something set aside so that I
table or something set aside so that I can use them consistently um but I would
can use them consistently um but I would also set them aside so that I could
also set them aside so that I could connect Tableau to that view now we're
connect Tableau to that view now we're going to be using something called
going to be using something called Tableau public that'll be in the very
Tableau public that'll be in the very next tutorial unfortunately um
next tutorial unfortunately um let me see if I can show you I can't
let me see if I can show you I can't show you Tableau public does not connect
show you Tableau public does not connect to SQL databases um and that's because
to SQL databases um and that's because it's free and I totally get it you have
it's free and I totally get it you have to pay for the upgraded version but I am
to pay for the upgraded version but I am not a a billionaire okay I cannot afford
not a a billionaire okay I cannot afford uh the real version of Tableau I'm also
uh the real version of Tableau I'm also not like a student or or like something
not like a student or or like something where I can get it cheap so I'm not
where I can get it cheap so I'm not paying for that so we're going to use
paying for that so we're going to use Tableau public and and I recommend this
Tableau public and and I recommend this anyways because anybody can access it
anyways because anybody can access it it's it's free for anybody so we're
it's it's free for anybody so we're going to be using Tableau in the next
going to be using Tableau in the next one to actually visualize a lot of these
one to actually visualize a lot of these things I want to get at least five
things I want to get at least five visualizations we're going to create a
visualizations we're going to create a dashboard it's going to be a beautiful
dashboard it's going to be a beautiful beautiful thing all right so the very
All right, so the very last thing that we are going to do is actually save this
and then put it into GitHub, and I just want to show you how to do that. That's
where we're going to be storing our code, at least for now. So let's go up
here, click File, click Save As. I already have multiple versions of this;
let's just put B2. We're going to save that, so we have this saved. Now I'm
going to go over here to my GitHub. If you don't have an account, I highly
recommend getting an account so you can start putting your portfolio projects
in here. Of course, we're not going to put our Tableau one in here, but our SQL
ones and our Python ones you can put in here. Again, I'll talk a lot more about
how we actually want to display this in GitHub or other places. But what we're
going to do for this is create a new repository. Let's call this one portfolio
projects. Make it public. We'll create the repository; we'll do all that extra
stuff later.
So what we now want to do is upload an existing file. We'll click right there,
go to Choose Files, and click this latest one that we saved, and we'll open it.
We can always change the name of it later on, and you can add notes if you'd
like, but we'll commit that change, so we'll actually upload this file. But
let's look at it really quick. I'm going to go back and use the real one that
has the formatting and the notes that I wanted to add in there. But as you can
see, you can see all of the queries that we wrote, and this is fantastic. So if
somebody comes in here, we'll have more notes and better comments on what they
do and what the takeaway is for a hiring manager when they actually look at
this. So this is a really, really good place to start. Again, this may not be
your optimal place to put this; I'll give you a few different options in a
later video about how we can potentially improve upon this. I'm really looking
forward to getting more portfolio projects done so we can actually start
building a complete portfolio.
If you've stuck around all this way, I just want to say congratulations. I know
this was a long video, I know that it took a long time, but you stuck with me,
you put in the hard work, and that is fantastic. I really hope that it pays
off, and I hope that this has been helpful. Thank you for watching. We'll have
a lot more videos in the future on these portfolio projects, and I'm just
really, really looking forward to doing them, to be honest. So thank you for
sticking with me, thank you for watching; I really appreciate it. If you like
this video, be sure to like and subscribe below, and I will see you in the next
video.
What's going on, everybody? Welcome back to another video. Today we will be
heading back into SQL for our third portfolio project. Now, I am extremely
excited for this project in particular for a few reasons. One, we're getting
back into SQL, and I really like SQL. And two, we are finally focusing on data
cleaning. I have talked so much about why data cleaning is important, that you
really need to learn how to clean data, and that it's a big part of what a data
analyst does, but I haven't actually shown you how to do it yet. And so that is
what this whole project is going to be, and then at the end you'll get to add
it to your portfolio. So it's really a win-win.
portfolio so it's really a win-win now before we start I just want to say that
before we start I just want to say that I think it's going to be a little bit
I think it's going to be a little bit more advanced than our very first video
more advanced than our very first video in Sequel where we walk through data
in Sequel where we walk through data exploration if you see something that
exploration if you see something that you have never seen before I will do my
you have never seen before I will do my best to explain it while we're walking
best to explain it while we're walking through it but if you get confused or it
through it but if you get confused or it seems a little complicated please pause
seems a little complicated please pause it Google it do a little bit of research
it Google it do a little bit of research and then come back and I think that will
and then come back and I think that will be very helpful with that being said
be very helpful with that being said let's jump over to my screen and we'll
let's jump over to my screen and we'll get started on the project so we're
so we're going to start over here on GitHub and this is where I've actually
put the data set that we are going to be using so I will put this link in
the description we're going to go right over here to the Nashville housing
data for data cleaning all you have to do is click download and it's going
to download it and you can open it up if you want to we're not going to do
anything to this data at all but really quick I'm just going to show you
what it does look like and we'll of course look at this in SQL in just a
little bit but we have a unique ID a parcel ID we have this address a sale
date the price of the home so this is housing data if you didn't pick up on
that already who actually owns the home the owner address and then some
information about land value bedrooms bathrooms things like that again not
super important because we're going to be doing all of this in SQL
so let's actually get this data into SQL we're going to import it the exact
same way that we did in the very first video so we're going to come right
over here going to go all the way down to Microsoft SQL Server 2019 Import
and Export we'll click next our data source is like last time Microsoft
Excel and let's take a look and we'll take that first one this is the most
recent one I've downloaded but I just wanted to make sure so I downloaded it
a few times for the destination we're going to click SQL Server Native
Client 11.0 and this is my client or my server right here and I'm going to
go down here and I want to put it in this PortfolioProject database so you
know just configure this to what your server is again if you haven't done
this before and you've never set up SQL Server or a server to go on SQL
Server I will leave a link hopefully right here and also in the description
like I did for the first project so be sure to go through that video so
that you know how to download this and have everything we're going to copy
the data we're going to take Sheet1 we could have renamed Sheet1 to
something else but we didn't and then we're going to finish this and finish
and it should run successfully hopefully it's looking good perfect so we
have 56,477 rows so let's head over to SQL
all right let's go to our database PortfolioProject and here is our Sheet1
now I'm going to rename this let's rename it what is it Nashville let's just
do NashvilleHousing that's what I'm going to rename it as at least so when I
post these queries to the GitHub and you see them this is what they will be
so if you want to have them the exact same or be able to copy and paste them
you should do that as well so let's take a look really quick let's select
the top 1,000 but there's about 56,000 rows there's a lot of data in here
and a lot of things so I'm about to open up a saved file and we'll walk
through the exact things that we're going to be working on in just a little
bit but yeah this is what the data looks like in here there's lots of
columns lots of data so really excited about this
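The rename and the quick look described above are done through the SSMS interface in the video. A scripted equivalent might look like the sketch below. It assumes the import wizard created the table as `dbo.Sheet1$` inside a database called `PortfolioProject`; both names are assumptions, so check Object Explorer for what your import actually produced.

```sql
-- Assumption: the Excel import landed as dbo.Sheet1$ in PortfolioProject.
USE PortfolioProject;
GO

-- Rename the imported sheet so later queries can refer to NashvilleHousing.
EXEC sp_rename 'dbo.Sheet1$', 'NashvilleHousing';
GO

-- Quick look at the first rows, the same idea as "Select Top 1000 Rows" in SSMS.
SELECT TOP (1000) *
FROM dbo.NashvilleHousing;
```

`GO` is a batch separator understood by SSMS and sqlcmd rather than a T-SQL statement, so it needs to sit on its own line.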
let me pull this open really fast it's going to be this project walkthrough
and I'm going to show you this really quickly here are the things that
we're going to be walking through so we're going to standardize the date
format we're going to populate the property address data that's referring to
this right here if you notice there's the address and there's also the city
that it's in so we want to be able to separate that out and that is actually
right over here we're going to be doing the same thing to the owner address
except that has an address a city and the state which makes it a little bit
more complicated and so that one should be really cool to show you oh whoops
I messed up that's what this one is breaking out the address into individual
columns that's what we're going to do for that this one populating the
property address you know if you notice and we'll go into this in a little
bit there's actually some values in the property address that are blank but
I'm going to show you how you can actually populate that which is just a
cool trick that I've used a few times and it does work I think you'll find
that one interesting in the sold as vacant field we're going to be doing
some case statements if then then we're going to be removing duplicates and
then deleting unused columns so we have a lot to get through this could
potentially be the longest video and I'm okay with that because I love SQL
down here and I will say that in the very first video I said it was going
to be an ETL video and I fully intended on doing that but I ran into not
issues on my side but issues in the fact that the vast majority of people
who are going to be watching this are not going to be able to do what I did
to configure my server but I left it in here anyways the way I think of it
ETL is an automated process to extract the data from somewhere transform it
and then put it somewhere this was going to be the extraction method and I
was going to put it in a stored procedure so that you could run the stored
procedure run the job import the data it was going to be really cool but I
know that if I was having trouble with it me trying to explain it to you
and you being able to figure it out on your side was going to be very tough
I left this anyways because I was able to get it to work on my computer but
it is tough and it took a lot of research and I did this for a previous
server like a year or two ago and I remember it being crazy hard but I was
able to figure it out on my computer so if you want to try it out try it out
and look into the stuff so I'm going to leave this here this is just for if
you want to try it it's a little more advanced and so you don't have to
just import it and this will be a data cleaning project instead of an ETL
project but data cleaning is what 90% of it was going to be anyways
anyways let's go back up to the very top really quickly I have a whole
other laptop right here as I did in the first video I didn't show it to you
last time but I have all of my queries written out over here I'm going to
try to do this as quickly as possible we have a lot to get through now
before we start writing our queries I am going to turn off my camera so I do
not get in the way all right you should still be hearing my voice let's get
started let's just start with select everything and we'll do from and it is
PortfolioProject.dbo.NashvilleHousing so let's just get this pulled up on
screen awesome so this is exactly what we were looking at before
and the very first thing that we're going to be looking at is this sale
date now I wrote standardize sale date but I'm really just going to change
the sale date so let's copy this really quick and let's look at just
SaleDate and it has this time on the end and it serves absolutely no purpose
and it just annoys me I want to take that off and so right now it's a
datetime format but we're going to convert and we're going to do date and
we're going to take SaleDate and we're going to go like that and let's run
this really quick and this is what we want it to look like
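The preview described here can be sketched as a single query. The table and column names follow what is spoken in the video (`PortfolioProject.dbo.NashvilleHousing`, `SaleDate`); the alias `SaleDateOnly` is only an illustrative name.

```sql
-- Show the original datetime value next to its date-only conversion.
-- SaleDateOnly is a hypothetical alias used just for this preview.
SELECT SaleDate,
       CONVERT(date, SaleDate) AS SaleDateOnly
FROM PortfolioProject.dbo.NashvilleHousing;
```

Nothing is changed in the table at this point; the query only shows what the converted value would look like.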
all right so let's say update and we have PortfolioProject specified up
here so we can just say NashvilleHousing and we are going to set SaleDate
equal to and we're just going to copy this now I will say before we do this
I had some issues when I was initially doing it with whether or not it made
the update and I'm not sure why it wasn't doing it so yeah it's not doing it
right now you try it out on yours it may or may not be working I'm not
exactly sure why that is because I would say like 80% of the time it's
doing it 10 or 20% it's not I don't know why no logical explanation for that
but most of the time when I did it they would then be the same column
something we can do that I just thought of we can do alter can't even say
that word alter table and we can say I think it's new or it's add give me
one second yeah so add and we'll just do SaleDateConverted and let's make
that a date format just like this and then we can say like this and say
SaleDateConverted let's try this and see what happens so I'm going to add
this column and then I'm going to update this and it says rows affected
let's see what happened so let's write SaleDateConverted let's see if it
actually worked and it worked okay so we now have a column and maybe at the
end we'll remove that SaleDate column so that we just have that
SaleDateConverted but we know what that is you don't have to name it that
you can name it SaleDate2 or something like that
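The two approaches tried above can be sketched as follows. One likely explanation for the first update appearing to do nothing, assuming `SaleDate` was imported as a `datetime` column: an `UPDATE` cannot change a column's data type, so the date value is converted straight back to `datetime` when it is stored and the midnight time portion still shows. Adding a separate `date` column avoids that.

```sql
-- Attempt 1: runs without error, but if SaleDate is a datetime column the
-- stored values are converted back to datetime, so the time still displays.
UPDATE NashvilleHousing
SET SaleDate = CONVERT(date, SaleDate);

-- Attempt 2: add a new column with the date type...
ALTER TABLE NashvilleHousing
ADD SaleDateConverted date;
GO

-- ...then fill it from the original column.
UPDATE NashvilleHousing
SET SaleDateConverted = CONVERT(date, SaleDate);

-- Compare the two columns side by side.
SELECT SaleDate, SaleDateConverted
FROM NashvilleHousing;
```

The `GO` after `ALTER TABLE` matters in SSMS: a batch that references a column added earlier in the same batch typically fails to compile with an invalid column name error.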
cool well let's go down to the property address and let's get just a really
quick look at it let's copy this up here I hate rewriting this stuff so I'm
always copying and pasting but we're going to be working with the
PropertyAddress there we go so let's take a look at this really quick so
let's look at sorry I was looking at my notes we need to look at where the
property address is null so what you'll see really quick when we run this is
that there are null values why there are null values I really don't know I
really am not sure but let's look at everything where it's null so we have
this property address we have a sale date a price a legal reference there's
this parcel ID and there's this unique ID so we have a lot of information
and when you have something like an address you know the address isn't going
to change the address is the address the owner's address might change but
the property itself the address 99.9% of the time is not going to change so
you can say with almost certainty that this property address could be
populated if we had a reference point to base that off of
so really quickly let's look at just everything and we'll just order by
not property address let's do ParcelID and let's take a look at this so we
have to do a little bit of research on this but I'm going to show you
something really quick let's see if I can find an example in not too long
okay so here's an example here's the same ID so 015 and that's the exact
same address and we'll find this a lot of times and I looked through the
data and it is pretty much accurate when it does have it it is the exact
same address so the same parcel ID is going to have the same property
address so something that we can do is basically say if this parcel ID has
an address and this parcel ID does not have an address let's populate it
with this address that's already populated because we know these are going
to be the same that is basically what we are about to do and it's not super
complicated but let's get started writing it
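The two exploratory queries described above can be sketched like this, using the table and column names as spoken in the video:

```sql
-- Rows where the property address is missing.
SELECT *
FROM PortfolioProject.dbo.NashvilleHousing
WHERE PropertyAddress IS NULL;

-- All rows sorted by ParcelID, so rows sharing a parcel sit next to each
-- other and repeated addresses are easy to spot.
SELECT *
FROM PortfolioProject.dbo.NashvilleHousing
ORDER BY ParcelID;
```

Scrolling the second result is how the matching parcel and address pairs are found in the video.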
let's copy that down there one thing we are going to have to do with this
is a self-join so we have to join the table to itself to look at if this is
equal to this then this needs to be equal to this that kind of thing so real
quick let's just write that join part out and we'll go from there I don't
know why I sounded Canadian right there we'll go from there so we'll join on
this and we'll say on oh wait let's label them I'm going to do this in a
really lazy way I'm just going to do a and b a.ParcelID is equal to
b.ParcelID and let's see really quick so we need to find a way to
distinguish these the sale date could be the same one thing this unique ID
is unique so we need these to be different so let's use this and let's say
and a.UniqueID is not equal to b.UniqueID so all we have done here is we've
joined the same exact table to itself and we said where the parcel ID is the
same but it's not the same row right because this is a unique ID that means
these will never repeat themselves so we'll never get the same one so if
this is equal to this but these are different we want to then populate the
other one so let's do a.ParcelID and we'll say a.PropertyAddress b.ParcelID
comma b.PropertyAddress and let's take a look at this really quick and let
me see if this works where a.PropertyAddress is null and let's see what
comes up here
okay so this is perfect this is exactly what I wanted to see so we have
this parcel ID we have this parcel ID and here is our address and it's blank
in all 35 of these so we have an address for all of these but we're not
populating it so what we want to do is use this thing called ISNULL so with
ISNULL the first thing is what do we want to check to see if it's null so we
want to check a.PropertyAddress this whole thing now if it is null what do
we want to populate we want to put in there this b.PropertyAddress because
we want to take that property address and stick it in there so let's run
this really quick so this column is what is eventually going to be stuck
into this column so this is perfect it's literally saying when it's null
take this and put it there and so that's what this part is doing
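Put together, the self-join with `ISNULL` described above might look like the sketch below. Column names are written as they are spoken in the video; if the import left stray spaces in a header, bracket the exact name instead. The alias `FilledAddress` is hypothetical and only labels the preview column.

```sql
-- Pair each row that lacks an address (a) with another row (b) that has
-- the same ParcelID but is a different record.
SELECT a.ParcelID,
       a.PropertyAddress,
       b.ParcelID,
       b.PropertyAddress,
       -- ISNULL returns the first argument unless it is NULL,
       -- in which case it returns the second.
       ISNULL(a.PropertyAddress, b.PropertyAddress) AS FilledAddress
FROM PortfolioProject.dbo.NashvilleHousing AS a
JOIN PortfolioProject.dbo.NashvilleHousing AS b
    ON a.ParcelID = b.ParcelID
   AND a.UniqueID <> b.UniqueID
WHERE a.PropertyAddress IS NULL;
```

The last column previews the value that the update would write into `a.PropertyAddress`.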
so let's go in here and write our update so we want to update and let's
take this whole thing from here up and this will be the set oops so we're
going to set PropertyAddress okay we need to specify and just so you know
when you're doing joins in an update statement you're not going to say
NashvilleHousing okay that's going to give you an error you need to refer to
it by its alias so let's put a so now we're going to say PropertyAddress is
going to be equal to and now we're just going to copy this ISNULL and put it
right here and we only want to update let's see if it does take this so I
think this should be correct let's test it out really quick and we're going
to run this above query and see if it made that update all right so there
you go as you can see there are now none that have null in there otherwise
it'd be giving us an output right now so that one is fixed we can go back
and check it if you want to please go back and double check that but that is
what we did and it worked perfectly
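A sketch of the update described above, assuming the same join as the preview query and a `WHERE` filter that restricts the change to rows with a missing address. Note that the `UPDATE` targets the alias `a`, not the table name.

```sql
-- Fill missing addresses from another row with the same ParcelID.
UPDATE a
SET PropertyAddress = ISNULL(a.PropertyAddress, b.PropertyAddress)
FROM PortfolioProject.dbo.NashvilleHousing AS a
JOIN PortfolioProject.dbo.NashvilleHousing AS b
    ON a.ParcelID = b.ParcelID
   AND a.UniqueID <> b.UniqueID
WHERE a.PropertyAddress IS NULL;
```

Re-running the earlier self-join query afterwards should return no rows if every missing address had a populated match.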
so that's what that ISNULL does it checks to see if this is null and if it
is null it can populate it with a value you can also do like a string I mean
you can write you know 'No Address' if you wanted to do something like that
we don't want to do that we're going to keep it how it is
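For comparison, the string fallback mentioned above could be previewed like this. It is shown only to illustrate the second argument of `ISNULL`; the walkthrough does not use it.

```sql
-- Replace missing addresses with a fixed label instead of a looked-up value.
SELECT ParcelID,
       ISNULL(PropertyAddress, 'No Address') AS PropertyAddress
FROM PortfolioProject.dbo.NashvilleHousing;
```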
let's keep moving on we do not have unlimited time here I'm going to try to
keep this one under two hours stretching the rules because of my love of SQL
and that is the only reason and this I think is going to take a little
longer so let's take a look and let's copy this real quick and let's take a
look at what are we doing the property address and we can get rid of this as
well so if you notice we have two things here we have both the address and
then there's this comma after all of them and there is the city now you
maybe haven't looked into this but I have and there are no other commas
anywhere except for in between these things as a separator as a delimiter
if you've never heard that term a delimiter is something that separates
different columns or different values so for us the delimiter is a comma
and for this first one because we're going to be separating this one out
and then we're going to be doing the owner address for this one we're going
to be using something called a substring and we're also going to be using
something called a character index or CHARINDEX
charart index so let's start writing that out and let's do
select and let's say substring now the
substring that we want to take we of
course want to be looking at oops let me
um put this down here so it helps us out
a little
bit and I'll just do like that so
substring and of course we're looking at
property
address and we want to look at position
one so we're going to start at position
one now this next part
is something that you may have never
seen before um and if you
haven't that's totally okay um
the character index is going
to be searching
for the um it's going to basically be
searching for a specific value okay
that's all it's doing and you
can look into this a little bit more if
you want um so it's going to be
CHARINDEX that's how it's spelled and then
like an open parenthesis and we want to
specify what we're looking for so it can
be anything you can even do you know if
you wanted to things like um Tom or you
can do well you do it um like this
you can look for Tom or if you're
looking for a specific word like John
you can search that that's what this is
for um but we're going to do a comma
where are we looking that's what this
next one is so we're looking in property
address uh and then we're going to close
the
parenthesis and we'd also close it
again to complete off that substring and
we'll say as
address um and let's just take a look
really quick at
this so right now it's taking the it is
basically going it's looking at property
address it's going to the very first
value or starting at the first value and
then it's going until the comma now the
unfortunate thing is we're actually
getting this comma in this output and we
don't want that uh you don't want a
comma at the end of every address we can
change that um so we can say because
this is specifying a position if we just
look at this CHARINDEX which we can do
really
quick it is going to give us a number
it is saying at position 19 that is
where the comma is right so it's not
like it's taking it's not a value or
it's not a um it's not a string it's a
it's a number so we can say minus
one and if we do
that and now we run
it now that comma is gone because we're
looking back we're going to the comma
and then going back one from uh one
behind the comma so that's how you get
rid of that comma right there
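As a rough sketch of the query described so far, written in T-SQL for SQL Server: the sample row comes from an inline VALUES list so the block can run without the real table, and the address in it is made up. The column spelling PropertyAddress and the alias Demo are assumptions for illustration.

```sql
-- Hypothetical sample row; in the walkthrough this comes from the housing table.
SELECT
    PropertyAddress,
    -- CHARINDEX returns a number: the 1-based position of the first comma
    -- (12 for this sample row).
    CHARINDEX(',', PropertyAddress) AS CommaPosition,
    -- Start at position 1 and stop one character before the comma,
    -- which gives '123 MAIN ST' here.
    SUBSTRING(PropertyAddress, 1, CHARINDEX(',', PropertyAddress) - 1) AS Address
FROM (VALUES ('123 MAIN ST, NASHVILLE')) AS Demo(PropertyAddress);
```

One thing to keep in mind: if a row had no comma at all, CHARINDEX would return 0, the length would become -1, and SUBSTRING would raise an error, so this only works because every address in this column has the comma.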
the next one's a little bit more tricky
because we're not starting well it's not
super tricky but we're not starting at
that first position anymore so let's put
a comma then we have our substring now
where we want to start is at
where the comma is so instead of
position one we want it to be where that
character
index um I don't want it to look like
this this whole time is it like this
what am I doing it doesn't
matter let's just get rid of this and
see if that fixes
it what am I doing here oh it's just
because this is wrong um and we'll just
do comma parentheses that might fix it
ah doesn't matter okay I'm wasting time
I'm going to keep going we want to start
in this position okay um but we
actually don't want to start at minus
one we need to start at plus one because
we want to go to the actual comma itself
then once we get to the comma we want to
add one so if we just left
it the same again it would include the
comma at the beginning um then we need
to specify where it needs to go to where
does it need to finish now every single
thing is going to be different every
single address has a different length
but we can use that to our advantage in
this one and we can literally say the
length
of property address you guessed it right
and then we can close this off let's see
if that
works okay what's messing up so we have
property
substring property
address comma character
index and then we have specifying it in
the
comma um we have the property address
plus one okay we can't have that right
there I don't know why I had that
finally figured it out at the end um so
let's see what we're doing here let's
see if it worked it works perfect um and
again this was one that I'm guessing a
lot of people haven't used before so I
was trying to explain it a little bit
more than other ones um but if we take
that out we take out that plus one
you're going to see the comma at the
beginning right here so that's what that
is um so plus one and that's what we're
going to keep
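Putting the two pieces together, the finished select might look roughly like this in T-SQL. The sample address is made up, and the City alias and the PropertyAddress spelling are assumptions rather than something read off the screen.

```sql
-- Hypothetical sample row standing in for the housing table.
SELECT
    -- Everything before the comma.
    SUBSTRING(PropertyAddress, 1, CHARINDEX(',', PropertyAddress) - 1) AS Address,
    -- Start one past the comma. LEN(PropertyAddress) is longer than what is
    -- left, so SUBSTRING simply runs to the end of the string.
    SUBSTRING(PropertyAddress, CHARINDEX(',', PropertyAddress) + 1, LEN(PropertyAddress)) AS City
FROM (VALUES ('123 MAIN ST, NASHVILLE')) AS Demo(PropertyAddress);
```

Because a space follows the comma in this sample, the City value comes back with a leading space; wrapping that second SUBSTRING in LTRIM would be one way to strip it if that matters.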
now we can't separate two
values from one column without
creating two other columns so just like
we added this um table up here we're
just going to I mean I'm
just going to copy this down here really
quick we're going to create two new
columns and add that value in so we're
gonna add that we're going
to call this um let's call it because
it's property address let's do
property split um and this is
the
address and then we'll say this one this
next one is going to be property and
this is split
city and this isn't going to be a
date of course uh this is going to be let's
do nvarchar and let's make it 255 just in
case it's a large um just in case it is
a large string a large text so then we
can say um update
that um and now we need to insert um
what we did for it so this first one is
the address so we're going to say that
equals the address and we're going to
take this whole thing this whole
substring oops and copy that and that's
going to equal this um and then at the
end we'll look at it really quick
so first let's add this table I'm going
to do this one at a time really quick so
you can see it so it adds the
table now it adds the results and again
adds the table of city and sets that
city to that
substring and now let's take um let's
take this and just do
select everything from this and you
should see at the very end because when
you add it it goes to the end we should
have two new values and here we are so
property split address and property
split city um it's much more usable than
this I mean this would be a nightmare
not a nightmare it'd just be annoying to
use this column I mean now that it's
separated on the address and the city
it's so much more usable of data it
really really is
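The add-then-update steps above can be sketched like this in T-SQL. To keep it runnable on its own it uses a temp table called #HousingDemo with one made-up row, which is purely a stand-in for the real housing table; the column spellings PropertySplitAddress and PropertySplitCity are assumed from what is said in the video. GO is the batch separator that SSMS and sqlcmd understand, not a T-SQL statement.

```sql
-- Stand-in for the real table so the sketch can run by itself.
CREATE TABLE #HousingDemo (PropertyAddress NVARCHAR(255));
INSERT INTO #HousingDemo VALUES ('123 MAIN ST, NASHVILLE');
GO

-- Step 1: add the two new columns (new columns land at the end of the table).
ALTER TABLE #HousingDemo
ADD PropertySplitAddress NVARCHAR(255),
    PropertySplitCity    NVARCHAR(255);
GO

-- Step 2: fill them with the two substrings from the earlier select.
UPDATE #HousingDemo
SET PropertySplitAddress = SUBSTRING(PropertyAddress, 1, CHARINDEX(',', PropertyAddress) - 1),
    PropertySplitCity    = SUBSTRING(PropertyAddress, CHARINDEX(',', PropertyAddress) + 1, LEN(PropertyAddress));

-- Step 3: check the result.
SELECT * FROM #HousingDemo;
```

In the video each column is added and updated one at a time so you can watch it happen; doing both columns in one ALTER and one UPDATE as above is just a more compact variant of the same idea.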
the next thing we're
going to be looking at is this owner
address
now it was hard enough or it was tough
enough to do this um but I want to show
you maybe even a simpler way to do it
even though this is more complicated so
let's go down
here and let's get rid of
this so let's say um let's get this and
let's just say property oops no we're
doing owner address here we go
let's just take a look at this let's see
what we got so again what
we have in here is the address the city
and the state so what we need to do is
split all of those out um and again I
don't want to use substrings again that
was a pain I want to use um something a
little different something again that
you may have never seen it's called
PARSENAME um and PARSENAME is super
useful um especially for like delimited
stuff stuff that's delimited by a
specific value um so let me just show
you what it is and then we'll go from
there so we can say
PARSENAME um and we're going to
be doing this on the owner
address okay let me see let me see
yeah I mean it's because I don't have
this of course I do that all the time so
annoying so on the owner address um and
then let's do
one and let's just see what happens
uh nothing changed of course because
PARSENAME only is useful with periods
or that's what it looks for that's what
PARSENAME looks for and these are commas
so something we can just do is we can
replace those commas with uh instead
of a comma we replace it with a period
so super easy we're just going to do
owner address
comma um and we'll look for the comma in
there then we need to specify what we
need to change it to we'll change it to
a period and let's close
that and now let's run
it and it's taking Tennessee so
something odd about at least to me odd
about PARSENAME is that it kind of does
things backwards than what you would
expect it to do uh let's really quick
let's add the other things um you'll
get a kick out well you won't get
a kick out of this as much as I do
here's one two
three let's execute this and it
separates everything for us but it's
backwards so it's one two three you would
imagine it'd be one two three but no it's one
two three so all we need to do is go
three two
one and run
this and there we go so now we have it
broken out this is now our address this
is our city and this is our state so
super what I would consider super easy a
lot easier than the substring but I
didn't want to show you the easy one
first and then give you the hard one
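Here is roughly what that PARSENAME trick looks like in T-SQL. The sample owner address is made up, and the OwnerAddress spelling and the three aliases are assumptions for illustration.

```sql
-- Hypothetical sample row with address, city and state separated by commas.
SELECT
    -- PARSENAME splits on periods and counts parts from the right,
    -- so part 3 is the leftmost piece and part 1 is the rightmost.
    PARSENAME(REPLACE(OwnerAddress, ',', '.'), 3) AS OwnerSplitAddress,
    PARSENAME(REPLACE(OwnerAddress, ',', '.'), 2) AS OwnerSplitCity,
    PARSENAME(REPLACE(OwnerAddress, ',', '.'), 1) AS OwnerSplitState
FROM (VALUES ('123 MAIN ST, NASHVILLE, TN')) AS Demo(OwnerAddress);
```

PARSENAME was built for splitting object names like server.database.schema.object, so it handles at most four parts and returns NULL beyond that. An address that already contains a period would throw the part numbering off, so it is worth checking the data for that before leaning on this shortcut.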
so now we just need to add those columns
and then we need to add the values so
let's do
this uh let's make some room and I need
to get rid of one of these I think oh did
I do that right what did I
do I have my alter table update alter
table update what is this doing here
what is this I don't even know what this
is we'll just go like that so now we
have three perfect um so from Nashville
Housing we're going to say
this is the
owner oops owner split
address um actually let me just copy the
owner make it easier so we have owner
split address owner split
city
and let's do owner split and then
state oops and copy there owner split
city there we go owner split address
owner split address so I'm putting all
the sets equal to what we're about to
add to so now this first one this three
is the address we'll paste it there the
second one is the city so we'll put that
oh I see what happened here that's what
happened got to get rid of
that um I set the owner split city equal
to that middle one and then of course
the third one is the
state so let's go do
that and that should be done so let's do
it two at a
time oops owner split address what's
wrong with that oh I probably just got
to run this first let's try that
tried to get good too quick um you can
do this a much more efficient way I'm
just doing this for visual purposes I
would update all the tables first or add
all the um columns first I mean and then
do all the updating at the end that's
normally how I do it but um again for
visual purposes this is what we're
doing so let's go get this actually
let's get this bring this down
here um don't keep this in your final
queries it's a lot of extra selecting
everything you don't need to do that um
so here we go so owner split address
owner split city owner split state again
so much more usable than when it's all
in one column I mean it is 10 100 times
more useful data now um you know that
one to me that gets used a lot
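The "add all the columns first, then do all the updating" order mentioned above could look something like this in T-SQL. The temp table #OwnerDemo and its single made-up row are stand-ins for the real table, and the three OwnerSplit column spellings are assumed from what is said in the video.

```sql
-- Stand-in for the real table so the sketch can run by itself.
CREATE TABLE #OwnerDemo (OwnerAddress NVARCHAR(255));
INSERT INTO #OwnerDemo VALUES ('123 MAIN ST, NASHVILLE, TN');
GO

-- Add all three columns in one go.
ALTER TABLE #OwnerDemo
ADD OwnerSplitAddress NVARCHAR(255),
    OwnerSplitCity    NVARCHAR(255),
    OwnerSplitState   NVARCHAR(255);
GO

-- Then do all the updating at the end: part 3 is the address,
-- part 2 the city, part 1 the state.
UPDATE #OwnerDemo
SET OwnerSplitAddress = PARSENAME(REPLACE(OwnerAddress, ',', '.'), 3),
    OwnerSplitCity    = PARSENAME(REPLACE(OwnerAddress, ',', '.'), 2),
    OwnerSplitState   = PARSENAME(REPLACE(OwnerAddress, ',', '.'), 1);

SELECT * FROM #OwnerDemo;
```

The GO lines (a batch separator in SSMS and sqlcmd) matter here: the ALTER has to run before the UPDATE is compiled, which is most likely the "got to run this first" error that shows up in the video when both are executed together.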
let's keep it going I feel like we're making
fantastic time I don't even know I'm not
even keeping track of time time is not
even relative anymore be three hours and
I wouldn't care let's keep going um
let's take a look at this column right
here sold as vacant um right now has no
but let's look at let's do select
distinct oh gosh I hate when I do this I
do this all the time am I the only one I
don't think I'm the only one and we'll
do uh what is it sold as okay sold as
vacant let's do a distinct count or
distinct on
these so right now we have yes no n y
I'm guessing which is no and yes and
then no so let's look just
because I'm curious um let's look at a
count
of I don't want to do the let me just do
sold as vacant let me do a count of this
and we'll group
by uh sold as vacant okay let's run this
and see what we get oh gosh let me order
by okay here we go now we're
moving that's not what I wanted at all
order by two here's what I wanted okay
so at no we have 51,000 yes 4,000 almost
5,000 n and then just a few
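A small sketch of that count query in T-SQL. The handful of rows in the VALUES list are made up just to show the shape of the result; the SoldAsVacant spelling and the RowsWithValue alias are assumptions.

```sql
-- Hypothetical sample values standing in for the real column.
SELECT
    SoldAsVacant,
    COUNT(SoldAsVacant) AS RowsWithValue
FROM (VALUES ('No'), ('No'), ('No'), ('Yes'), ('Yes'), ('N'), ('Y')) AS Demo(SoldAsVacant)
GROUP BY SoldAsVacant
-- ORDER BY 2 sorts by the second column in the select list, the count.
ORDER BY 2;
```

Sorting by the count puts the rare spellings at the top and the common ones at the bottom, which makes it easy to see which form the column should be standardized to.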
so let's
change them to yes and no because
these are obviously the vastly more
populated ones um and we're just going
to do this through a case statement so
we're going to say oh yeah let me get
this ready before we start oh yeah I'm
ahead of the game now let's do select
and we'll do sold as vacant and then
we'll start our case
statement um yeah let's do right here so
we'll do case when sold as vacant is
equal to y all we want to do is say
then we want to make it
no oh
want to make it yes what am I doing geez
I'm losing it when and I'm just oops
oops oops oops ignore that pretend that
didn't
happen
when sold as vacant is equal to
n
then
no and then else we want to say if it's
already if it's not one of those values
it means it's already a yes or no so
we're just going to say just keep it as
sold as vacant and then we'll end it so
let's take a
look okay so let's scroll through here
and see if we get any that we can see oh
I just went by some didn't
I oh I just went by some I know I
did um let's see okay here we go so
here's an n it's now a no so this
sold as vacant this column the newly
uh the case statement right here is
changing it so the n is no so this
should work and this will be a
unique update statement um and I hope it
works unlike the first update statement
that we did that was a
travesty um let's do update Nashville
housing um and we'll
say
set sorry I'm talking faster than I'm
going set sold as vacant equal to and we
can just literally put in this case
statement um it's not but let's try
it okay now let's go look at this again
and see if it made the update there we
go the update statement worked oh
fantastic it's a beautiful
thing okay great I'm glad that one
worked I was worried for a second that
uh my update had broken in um in SQL
Server
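The case statement and the update built from it can be sketched like this in T-SQL. The temp table #SoldDemo and its four rows are made up so the block runs on its own; the SoldAsVacant spelling and the Cleaned alias are assumptions.

```sql
-- Stand-in for the real table with one row per spelling.
CREATE TABLE #SoldDemo (SoldAsVacant NVARCHAR(50));
INSERT INTO #SoldDemo VALUES ('Yes'), ('No'), ('Y'), ('N');

-- Preview first: old value next to what the CASE would turn it into.
SELECT
    SoldAsVacant,
    CASE WHEN SoldAsVacant = 'Y' THEN 'Yes'
         WHEN SoldAsVacant = 'N' THEN 'No'
         ELSE SoldAsVacant          -- already Yes or No, leave it alone
    END AS Cleaned
FROM #SoldDemo;

-- Then put the same CASE straight into the update.
UPDATE #SoldDemo
SET SoldAsVacant = CASE WHEN SoldAsVacant = 'Y' THEN 'Yes'
                        WHEN SoldAsVacant = 'N' THEN 'No'
                        ELSE SoldAsVacant
                   END;

-- Only Yes and No should be left.
SELECT DISTINCT SoldAsVacant FROM #SoldDemo;
```

Running the select before the update is the same habit shown in the video: look at what the expression produces, and only then let it overwrite the column.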
now we're going to do
something um these next two things is
we're going to remove the duplicates and
then we're going to get rid of unused
columns um this removing duplicates I got
to be honest I don't do it a ton in SQL
but I have done it um especially for
like
queries you know when I'm looking at
full tables I will write some sort of
temp table and like put the remove
duplicates in there I normally don't
delete actual data we are we're going to
do that um but it's not a standard
practice to delete data that's in um
that's in your database so just for
future purposes don't blame me if you
delete all the duplicates by
accident in your uh table at work so you
can do this a few different ways but the
way I'm going to show you is we're going
to write a
CTE and we're going to do some window
functions to find where there are
duplicate values okay so excuse me so
let's start writing out our CTE or
you know even we can write out the query
first then put it into a CTE that might
be a little bit better so let's do
select everything and oh my gosh I was
about to do it somebody's out there just
like waiting for me to make that mistake
again so we want to partition our
data um when you're removing
duplicates we're going to have duplicate
rows and we need to be able to have a
way to identify those rows right so you
can use things like rank order rank
um row number there are a few different
options we're going to be using row
number um and you know if you want to
look into how rank and uh like
dense rank and all those ones work
please do that so you know why we're
doing it um but we're using row number
because it's I think the simplest um
and it's going to do what we need
exactly so I'm going to get this over
here we'll say select everything because
we're selecting everything then we're
going to add this row number on here so
row number and we're going to do these
parentheses right here we're going to
say over and an open parenthesis now we
need to write our partition because
we're going to partition this data so
we're going to say um
partition by cool um now really quickly
while we're here we need to actually
know what we're partitioning on that's
helpful so let me write this so while
we're writing it we can see what we're
doing we need to partition it on things
that should be unique
um
to basically each row um
I guess for the sake of what we're doing
we're going to pretend this unique
ID isn't here um although you know you
could say I'm cheating it doesn't matter
but I'm going to say you know if things
like the parcel ID are the same if the
sale date is the same um the property
address is the same the sales price is
the same this legal reference which I'm
guessing is some type of legal document
saying it's like somebody's uh property
if all of those are the exact same then
to me that is the same data it's
unusable just for example I mean this
may I don't I mean this data is just
some random data set I found online
right so that's what we're going to be
going with that's what we're going to be
running with and pretend that lie that I
just told you is completely true
just told you is completely true so what we want to Partition by let's start with
we want to Partition by let's start with the
the parcel um can I is this not right here
parcel um can I is this not right here why is it saying this why is it not
why is it saying this why is it not giving
giving me okay doesn't even matter I'm just
me okay doesn't even matter I'm just going to say parcel
going to say parcel ID um we can
ID um we can say
say property we'll do a property address
property we'll do a property address stick with me we're getting somewhere
stick with me we're getting somewhere we'll do sale
we'll do sale price um what do we say sale date I mean
price um what do we say sale date I mean there shouldn't be two of this they
there shouldn't be two of this they didn't sell twice on the same day come
didn't sell twice on the same day come on and then legal
reference. Oh, I know why my autocomplete, which I love, isn't working: it's because
we're creating our own partition, so it's its own column, of course. It's late; as you
can see down here it's 11:15, but hey, this is an adrenaline rush for me.
Now we need to order it, and we want to order it on something that should be unique,
so we're going to order it on this UniqueID and see if that actually does what we want
it to do. ORDER BY, come on, and we'll say UniqueID. Perfect. We should be able to
close that off, and we're going to call this row_num; that just makes sense.
So now we have this; let's run it really quick and see what happens.
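The query being dictated here can be sketched as follows. This is a minimal illustration, not the video's actual script: the video runs in SQL Server, while this sketch uses Python's built-in `sqlite3` (SQLite 3.25 or newer is assumed for window functions). The table name `housing` and the three sample rows are hypothetical; only the column names come from the narration.

```python
import sqlite3

# Hypothetical miniature of the housing table: column names follow the video,
# the table name and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE housing (UniqueID INTEGER, ParcelID TEXT, PropertyAddress TEXT, "
    "SalePrice INTEGER, SaleDate TEXT, LegalReference TEXT)"
)
conn.executemany(
    "INSERT INTO housing VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "P-001", "1 Main St", 100000, "2016-01-05", "LR-1"),
        (2, "P-001", "1 Main St", 100000, "2016-01-05", "LR-1"),  # same sale, new UniqueID
        (3, "P-002", "2 Oak Ave", 250000, "2016-02-10", "LR-2"),
    ],
)

# Number the rows inside each group of identical sale details.
# The first row of a group gets 1; any repeat of the same details gets 2, 3, ...
query = """
SELECT UniqueID,
       ROW_NUMBER() OVER (
           PARTITION BY ParcelID, PropertyAddress, SalePrice, SaleDate, LegalReference
           ORDER BY UniqueID
       ) AS row_num
FROM housing
ORDER BY ParcelID, UniqueID
"""
row_nums = dict(conn.execute(query).fetchall())
for unique_id, row_num in row_nums.items():
    print(unique_id, row_num)
```

Any row whose `row_num` is greater than 1 repeats the sale details of an earlier row, which is what the rest of this step relies on.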
Maybe we should order this as well. Yeah, let's order this on ParcelID: ORDER BY ParcelID.
Let's just see what happens, because I think that should be pretty accurate.
Let's scroll down and see if we get any. This is all ones; maybe I should be doing it
on UniqueID, I don't know. Let's see if we get any hits. Okay, there's a two in there.
Let's look at this really quick, because I want to see it. Maybe I did something wrong,
I don't know; it is absolutely possible. Somebody play some Jeopardy music for me real
quick. Okay, so let's look at these two and see if I did something wrong.
Oops, I don't need to pull that up; I was doing some research when that CONVERT wasn't
working. Okay, so this one and this one: it's giving them different row numbers.
So let's look at the actual data; ignore the UniqueID and look at the data itself.
The sale date is the same, the sale price is the same, the legal reference is the same,
the owner is the same; I mean, literally every single thing in here is the same.
So this is a good example. In the query that we're about to write, that second one will
be deleted because we don't need it, and then there's only one. So it looks like this is
working as intended. I can also do WHERE row_num is greater than one; let's see, I don't
think it will work.
Actually, yeah, that's because it is in a window function; of course we can't do that,
what am I thinking? That's why we need to put it into a CTE. Oh, of course, it all comes
back to the CTE; those things are amazing. Let's call this RowNumCTE, and we'll say AS
and then an open parenthesis. I don't think we can have an ORDER BY in here, so let's
do it like this, and let's just do SELECT everything from RowNumCTE.
Again, if you haven't watched my CTE video or you've never used a CTE before, this is
now basically almost like a temp table: this query down here is querying off of this
table that we, quote unquote, created. So it looks like it's working.
All we're going to do is select everything from that, and we want to say WHERE row_num,
because that's now a column, is greater than one, and let's order that by, I don't know,
PropertyAddress. Let's see if that works and see what happens.
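The CTE step can be sketched like this. Again this is an illustration under assumptions rather than the video's own script: it runs on SQLite through Python's `sqlite3`, and the `housing` table and its rows are hypothetical. The point it shows is that the alias from the window function cannot be filtered in the same query's WHERE, but once the query is wrapped in a CTE, `row_num` behaves like an ordinary column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE housing (UniqueID INTEGER, ParcelID TEXT, PropertyAddress TEXT, "
    "SalePrice INTEGER, SaleDate TEXT, LegalReference TEXT)"
)
# Hypothetical rows: UniqueIDs 2 and 5 repeat the sale details of 1 and 4.
conn.executemany(
    "INSERT INTO housing VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "P-001", "1 Main St", 100000, "2016-01-05", "LR-1"),
        (2, "P-001", "1 Main St", 100000, "2016-01-05", "LR-1"),
        (3, "P-002", "2 Oak Ave", 250000, "2016-02-10", "LR-2"),
        (4, "P-003", "9 Elm Rd", 180000, "2016-03-15", "LR-3"),
        (5, "P-003", "9 Elm Rd", 180000, "2016-03-15", "LR-3"),
    ],
)

# Wrap the window-function query in a CTE, then filter on row_num outside it.
query = """
WITH RowNumCTE AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY ParcelID, PropertyAddress, SalePrice, SaleDate, LegalReference
               ORDER BY UniqueID
           ) AS row_num
    FROM housing
)
SELECT UniqueID, PropertyAddress, row_num
FROM RowNumCTE
WHERE row_num > 1
ORDER BY PropertyAddress
"""
duplicates = conn.execute(query).fetchall()
for row in duplicates:
    print(row)
```

Only the second copy of each repeated sale comes back, so the result is exactly the set of rows that would be removed.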
Okay, so all of these are duplicates. We have 104 of them, it looks like, so there's
not many. There's twos; any threes? No, no threes. So there are multiple rows that are
basically duplicates, and we want to delete them. Instead of saying SELECT everything
from RowNumCTE, we're just going to say DELETE. And I've got to get rid of that ORDER BY;
that doesn't work. Let's do this: there's 104, let's see if it worked.
So now let's go back and we'll say SELECT everything, and let's see if there are any
more duplicates in there. There are none; that is fantastic.
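A sketch of the delete step follows, with one swapped-in technique that should be stated plainly: in SQL Server, as in the video, the DELETE can target the CTE directly, whereas SQLite does not allow that, so this version deletes from the base table by `UniqueID` using the same CTE. The `housing` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE housing (UniqueID INTEGER, ParcelID TEXT, PropertyAddress TEXT, "
    "SalePrice INTEGER, SaleDate TEXT, LegalReference TEXT)"
)
conn.executemany(
    "INSERT INTO housing VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "P-001", "1 Main St", 100000, "2016-01-05", "LR-1"),
        (2, "P-001", "1 Main St", 100000, "2016-01-05", "LR-1"),  # duplicate of 1
        (3, "P-002", "2 Oak Ave", 250000, "2016-02-10", "LR-2"),
        (4, "P-003", "9 Elm Rd", 180000, "2016-03-15", "LR-3"),
        (5, "P-003", "9 Elm Rd", 180000, "2016-03-15", "LR-3"),  # duplicate of 4
    ],
)

rows_before = conn.execute("SELECT COUNT(*) FROM housing").fetchone()[0]

# SQLite variant: find the repeated rows in the CTE, delete them from the base table.
conn.execute(
    """
    WITH RowNumCTE AS (
        SELECT UniqueID,
               ROW_NUMBER() OVER (
                   PARTITION BY ParcelID, PropertyAddress, SalePrice, SaleDate, LegalReference
                   ORDER BY UniqueID
               ) AS row_num
        FROM housing
    )
    DELETE FROM housing
    WHERE UniqueID IN (SELECT UniqueID FROM RowNumCTE WHERE row_num > 1)
    """
)

remaining_ids = [row[0] for row in conn.execute("SELECT UniqueID FROM housing ORDER BY UniqueID")]
print(rows_before, "rows before;", len(remaining_ids), "rows after:", remaining_ids)
```

Re-running the duplicate check afterwards should return no rows, which mirrors the "there are none" check in the video.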
I'm biting my nails now to see if each one of these works, because that first one
didn't work. So yeah, it worked: we got rid of the duplicates, and that is fantastic.
Now it's smooth sailing from here, because we're just going to delete some unused
columns that we don't care about. This doesn't happen often. I would say it actually
happens more in views: I'm creating a view and I'm like, oh, I didn't mean to add that
column, let me just remove it because I don't need it. You don't usually do this to the
raw data that you import. Again, best practices: please don't do this to the raw data
that comes into your database, and talk to somebody before you do.
That's just my legal advice for the day; I'm not legally bound or legally held
responsible for any mistakes you make. So let's keep going. We're
literally just going to delete some columns. It could be any columns that we want, but
for example, we have these property split address and owner split address columns, with
city and state, and these are perfect and much more useful than this owner address,
because that is really unusable, to be honest. So we're going to delete those.
And maybe we'll also get rid of, I don't know, the land use? That might be useful.
This tax district, who cares about that. So it's going to be super easy.
We're just going to write ALTER TABLE, and then we're going to drop a column, and you
can do as many as you want. So we're going to say OwnerAddress, we're going to do
TaxDistrict, and let's also do the PropertyAddress.
All right, let's try this and see if it works. I'm
nervous. All right, so as you can see, the property address is gone, the owner address
is gone, the tax district is gone, and now we are left with this.
Now remember, the whole point of everything we were doing was to clean up the data.
And actually, now that we're here, we have this sale date as well, and we have the sale
date converted over here; I forgot, let's get rid of this. Oh, that was my dog Max,
excuse him. Let's get rid of that sale date that made me look like an idiot.
This is sweet revenge: SaleDate. Sweet, sweet revenge.
All right, and it is gone, so it's as easy as that.
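The column drop can be sketched as below, under stated assumptions. SQL Server accepts several columns in one `ALTER TABLE ... DROP COLUMN`; SQLite drops one column per statement and only supports `DROP COLUMN` from version 3.35, so the sketch loops and falls back to rebuilding the table on older versions. The `housing` table, the column spellings, and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE housing (UniqueID INTEGER, PropertyAddress TEXT, OwnerAddress TEXT, "
    "TaxDistrict TEXT, PropertySplitAddress TEXT)"
)
conn.execute(
    "INSERT INTO housing VALUES "
    "(1, '1 Main St, Nashville', '1 Main St, Nashville, TN', 'URBAN', '1 Main St')"
)

columns_to_drop = ["OwnerAddress", "TaxDistrict", "PropertyAddress"]

if sqlite3.sqlite_version_info >= (3, 35, 0):
    # SQLite takes one column per DROP COLUMN statement.
    for column in columns_to_drop:
        conn.execute(f"ALTER TABLE housing DROP COLUMN {column}")
else:
    # Older SQLite: rebuild the table with only the columns we want to keep.
    conn.execute(
        "CREATE TABLE housing_clean AS SELECT UniqueID, PropertySplitAddress FROM housing"
    )
    conn.execute("DROP TABLE housing")
    conn.execute("ALTER TABLE housing_clean RENAME TO housing")

# PRAGMA table_info returns one row per column; index 1 is the column name.
remaining_columns = [row[1] for row in conn.execute("PRAGMA table_info(housing)")]
print(remaining_columns)
```

As the narration warns, this kind of change is better made on a working copy or a view than on the raw imported table.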
Now remember, like I was saying before, the whole point of this project is to clean the
data and make it more usable. It may not have felt like that as we were going through;
we were cleaning it, but I may not have highlighted the purpose of it too much.
With all these other columns that we created, it's much more usable and much more
friendly. This is standardized now, and we did that through quite a few various methods.
So let's go back up to the top and recap what we did really quick.
Using this CONVERT, we tried to standardize the date format, or change the date format.
It may or may not have worked for you; it didn't work for me. We
populated this property address, and we did that before we broke it out.
If we had reversed it, if we had broken these addresses out into individual columns and
then populated this, we wouldn't have actually gotten any of that data in the split
columns, because we then went and deleted this PropertyAddress column.
So there was a reason it was in that order; don't mess that up, that's happened.
So we broke it out, and we did that using SUBSTRING and CHARINDEX, as well as PARSENAME
and REPLACE. Then we went through and changed the Ys and Ns to Yes and No using CASE
statements. Then we removed duplicates using ROW_NUMBER, a CTE, and the window
function's PARTITION BY. And then at the end we deleted a few useless columns that we
no longer want to see, because they are horrible and terrible. That is the entire
project. That was everything, and you did it, and I'm honestly super proud of you for
sticking around this long. This was not necessarily an easy project; we used quite a
few new things that I may not have talked about or shown you before.
This, to me, is just the beginning, right? This is just a glimpse into all the things
that you need to look for in order to clean data. I really do think this is a good
portfolio project, because it will show that you understand and know how to clean data,
although this is not an end-to-end project; that would take a long time and a lot more
exploratory analysis, looking into the data to figure out what we need to change.
But for all intents and purposes, this is a pretty good project for cleaning data,
and I hope that you learned something. I also hope that you
worked hard on this. If you want to make any improvements, please do that.
This is not perfect by any means; there are other things that you could change.
I'm not even going to try to guess, but you could do other things to this data and
create your own queries, your own data cleaning part of this.
And if you were able to get the ETL part of it done, do that; I think it'd be really
cool. Again, I was able to get it to work, but I don't think 90% of people out there
would be able to get it to work. Every computer is different and every server is
configured differently, so it would just be a huge pain. So I decided to cut that out,
and I'm sorry, but hopefully this will suffice. With that being said, this is
it. You made it all the way to the end. Again, I'm super proud; you guys are doing
fantastic. You guys are the ones putting in the hard work to build the portfolio for
your future job. It's not easy, but you're putting in the work, so kudos to you.
In our next video we're going to be going into Python for the very first time.
I'm really excited about that one, because I think the only Python video that I have up
right now is one where I was scraping data from Twitter, so this will be a nice change
of pace, a little bit different content than I normally put out. I'm really excited
about it and I hope you are as well. With that being said, I am done with the video;
I'm going to be stopping it soon. Thank you for joining me. If you liked this video, be
sure to subscribe, be sure to like this video, and leave a comment below telling me how
it changed your life, and I will see you in the next video. Goodbye.
[Music]
What's going on, everybody? Today we are starting our Excel tutorial series.
Now, there are so many things that you can do in Excel, so I don't know how long this
series is going to be; it could be 15 or even 20 videos. But what I do know is that I'm
going to be covering just about every single thing that I've used since I became a data
analyst, and I want to show you how to do it. So it won't just be the more concrete
things like pivot tables, charts, VLOOKUPs, things like that; it'll also be some of the
more nuanced things, like how to deal with missing data or dirty data and how to clean
that up within Excel. Those are things that you may not be able to do if somebody
wasn't showing you how, and so that's what I'm going to try to help you with, because I
know that is something you will need to learn how to do in Excel. Now, before we
get into it, I want to give a huge shout out to the sponsor of this Excel series, and
that is Udemy. I took so many Excel courses on Udemy when I was first starting out as a
data analyst, and there was this one course that I kept going back to over and over
again, because as I got into my job I realized there were so many things in that course
that I really needed to know but hadn't realized I needed to know.
I'm going to put the links to those courses in the description in case you want to take
them. Again, huge shout out to Udemy. Without further ado, let's jump on my screen and
get started with our very first Excel tutorial.
All right, so I'm going to go ahead and get rid of myself. We are going to be looking
at something absolutely pivotal in your data analytics career, and that is pivot
tables, and I think that's really appropriate. It is probably one of the most commonly
used things that data analysts use to convey information in Excel. It's super easy to
group things together and display information in a very easily understandable way,
especially for people who are not data analysts. I use this a lot for other managers or
for higher-ups who don't want to get into SQL or aren't super tech savvy in, like,
Python or Tableau; they just want it in Excel, and so I use it all the time for that
reason. We're going to be using this data set right here, bike store sales in Europe;
I will include the link in the description. We're not going to look at the columns just
yet; we're going to download it. I've already downloaded it a few times, but we are
going to go to our downloads, open it up, and open up this sales file right here, and
give it a
second. All right, perfect, and so here's what it looks like, at least on my screen.
I'm going to spread it out just a little bit, and really quickly let's take a glance at
this. We have a date, a day, a month, and a year, so some date information.
Then we have customer age, so how old was the customer; again, this is bike sales.
They have some demographic information: this is their age group, and we have the
gender, country, and state. Then the product category, the subcategory, and the actual
product that was purchased. And then we have things like how much these things cost and
the quantity that was ordered, so we have order quantity, unit cost, and unit price,
then the profit, cost, and revenue. Almost everything in here we can in some way put
into a pivot table. I'm not going to go through every single variation of that, but we
are going to be looking a lot at this revenue over here, because I think it's pretty
easy to show the value of a pivot table with, you know, currency or money.
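For readers who think in code, what a pivot table computes can be sketched in a few lines of plain Python. This is only an analogy for the idea, not something shown in the video: the `sales` rows below are invented, and the choice of country for rows and year for columns is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical (country, year, revenue) rows standing in for the bike sales sheet.
sales = [
    ("France", 2015, 950),
    ("France", 2016, 1200),
    ("Germany", 2015, 700),
    ("France", 2015, 50),
    ("Germany", 2016, 300),
]

# A pivot table groups rows by one field, spreads another across columns,
# and aggregates a value (here: sum of revenue) in each cell.
pivot = defaultdict(lambda: defaultdict(int))
for country, year, revenue in sales:
    pivot[country][year] += revenue

years = sorted({year for _, year, _ in sales})
print("Country", *years)
for country in sorted(pivot):
    print(country, *(pivot[country][year] for year in years))
```

Excel does the same grouping and summing interactively when fields are dragged into the Rows, Columns, and Values areas.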
So what we're going to do to get started is go up to Insert and click on it, and then
we are going to click on PivotTable. Now, really quick, there is a Recommended
PivotTables option, and if you click on that, what will come up is some recommendations
that Excel gives based on the data that you have. It can give you some ideas of what
you can do with pivot tables, and it's going to generate it for you. We're not going to
do that; we're going to build our own. So let's click on PivotTable, and it's going to
auto-select basically everything, and that's fantastic. But what if it doesn't come
like that? Say I just erase that. If it doesn't come like that, you can click right
here, then press Ctrl+Shift and then the right arrow and then the down arrow, and it's
going to select all of our data. And you have, right
all of our data um and you have right here a new worksheet or an existing
here a new worksheet or an existing worksheet we're going to create a new
worksheet we're going to create a new worksheet just tends to get too clogged
worksheet just tends to get too clogged up if we put it on the same worksheet
up if we put it on the same worksheet that already has a lot of data in it so
Right over here are the pivot table fields, and these are all of our columns that we just
looked at, and we're going to be able to select those and drag and drop. Now if you just took
the Tableau tutorial series that I just finished doing last week, then this is going to be
pretty familiar. You're going to start seeing, hopefully, some patterns about how the data is
displayed. And so we have our filters down here, we have columns, rows, values. All these
things we will be using. I'll show you how to use them today, as well as some additional
things. One thing that we want to start with for this demonstration is we're going to be
looking at these bottom ones right here, profit, cost, and revenue, and we're going to be
doing that per country and state, and we'll do some drill-downs, and I'll show you how those
work. So just to start out, we're going to take the country right here, and you'll see it
populate right over here. In fact, let me zoom in maybe once. Yeah, that should be fine. I
might zoom in again in just a little bit. So we have our country, and it's just like this,
very simple. Now I'm going to include the state. I'm going to drag this all the way and I'm
going to put it under. You can put it above or you can put it below. I'm going to put it
below, it definitely makes the most sense there. Now when you do that, it populates it in an
expanded way, but you can collapse this very easily. We're going to go right here, we're
going to right-click, we're going to go down to expand and collapse, and we're going to
collapse the entire field. And so now here are all of our countries as they were before.
Each of them has this plus sign to the left, and if you click on it, now we can see this
state that we added to these rows. And what this is going to do is, it's like a rollup or
it's like a grouping. And so if you have taken the SQL tutorial series and you've done things
with GROUP BY, this is very similar to that. And if you've done the Tableau tutorial series,
it's kind of like a drill-down. It's very similar, so you can drill into the information. So
we can put some values in here, and what that's going to do is create some context to what
we're grouping by. So just for visual purposes, let's add this revenue. So this is the bike
sales revenue, right, that's what we're looking at. So this is the sum of the revenue for
these bike sales per country. Now if we drop down right here, we can see that in Australia,
New South Wales had a little over 9 million, Queensland had 5 million, et cetera. So now we
can break it down.
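The rollup described here (country totals that expand into state totals) can be sketched outside Excel as a plain group-by. This is only a rough Python illustration under stated assumptions: the rows, the column names `country`, `state`, and `revenue`, and the helper `revenue_by_country_and_state` are all hypothetical and are not taken from the workbook in the video.

```python
from collections import defaultdict

# Hypothetical sample rows; the real workbook has many more rows and columns.
sales = [
    {"country": "Australia", "state": "New South Wales", "revenue": 1200},
    {"country": "Australia", "state": "Queensland", "revenue": 800},
    {"country": "Australia", "state": "New South Wales", "revenue": 500},
    {"country": "Canada", "state": "British Columbia", "revenue": 700},
]


def revenue_by_country_and_state(rows):
    """Sum revenue per (country, state), then roll the states up to country totals."""
    by_state = defaultdict(lambda: defaultdict(int))
    for row in rows:
        by_state[row["country"]][row["state"]] += row["revenue"]
    totals = {country: sum(states.values()) for country, states in by_state.items()}
    return by_state, totals


by_state, totals = revenue_by_country_and_state(sales)
for country in sorted(totals):
    print(country, totals[country])  # collapsed view: one row per country
    for state, amount in sorted(by_state[country].items()):
        print("   ", state, amount)  # expanded view: the drill-down into states
```

The outer loop plays the role of the collapsed pivot table, and the inner loop is what clicking the plus sign reveals.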
We don't just have to look at Australia. We can now drill down even further to the actual
state, is what they're calling it, the actual state within Australia. And so it's super
useful, and you can do that for every single one. And so we can look at Canada, we can look
at France, and we can really drill down into the revenue for each of these countries as well
as the states within them. Now over here, this is not the prettiest. It just says sum of
revenue and then it has some numbers, not the prettiest thing I've ever seen. Really quick,
we can highlight over these and we can go back to Home. You can do it in a couple different
ways. We can go to Home and we'll type currency. Now it has these two decimal places at the
end. You can get rid of those really easily by going like that. Already this looks quite a
bit better, just visually, especially if you're looking at it in dollars. You can change the
currency to different currencies if you want to do that. Now we don't just have to do the sum
of revenue, we can do a lot of different things. So let's go to the value field settings. We
can customize this name, so we can do, if I can spell, Revenue per Country. That's fine, it's
just a placeholder, trying to show you. But we don't have to just do that. We could do the
count, the average, the max, the min. We can do just about anything we want. But let's keep
it the sum right now.
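To make the summarize-by choices concrete, here is a small Python sketch, assuming a hypothetical list of revenue values for the rows that fall under one country. It only illustrates what each option computes and is not how Excel implements it.

```python
from statistics import mean

# Hypothetical revenue values for the rows that fall under one country.
revenue = [1200, 800, 500]

# Each entry mirrors one of the summary options mentioned above.
summaries = {
    "sum": sum(revenue),
    "count": len(revenue),
    "average": mean(revenue),
    "max": max(revenue),
    "min": min(revenue),
}

for name, value in summaries.items():
    print(name, value)
```

Switching the summary only changes which function is applied to the same group of rows; the grouping itself stays the same.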
If we want to, we can show this value as different things, so the percentage of column total,
percentage of row total. Let's do really quick, just for demonstration purposes, the
percentage of grand total. So when we do that, we can see that for Revenue per Country, the
United States has 32% just between these countries, and Australia has the next one. So it
might be kind of hard to glance at this really quickly to know who has the highest, but what
we can do is go right here and go to sort and do largest to smallest, and there we have the
United States on top. Now when you expand it right here, the states are not sorted largest to
smallest. You'd have to go in again, click sort, and do largest to smallest. And so now we
can see that California has the biggest percentage. They're pulling in 20% of that 32% of
revenue.
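The percentage of grand total and the largest-to-smallest sort can be sketched as follows. This is a minimal Python illustration: the totals are hypothetical round numbers chosen for readability (not the figures from the video), and `percent_of_grand_total` is an illustrative helper name.

```python
# Hypothetical revenue totals per country (not the figures from the video).
revenue_by_country = {
    "Canada": 150,
    "United States": 320,
    "France": 130,
    "Australia": 210,
    "United Kingdom": 190,
}


def percent_of_grand_total(totals):
    """Return (name, percent) pairs sorted from largest to smallest share."""
    grand_total = sum(totals.values())
    shares = {name: 100 * value / grand_total for name, value in totals.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


ranked = percent_of_grand_total(revenue_by_country)
for name, share in ranked:
    print(f"{name}: {share:.1f}%")
```

Each share is the item's total divided by the grand total, so the shares add up to 100. Sorting the states inside one country would be a second, separate sort, which matches having to click sort again after expanding.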
So I'm just going to press Control+Z a few times and get us back to where we just were. What
I want to do is show you a few different things pretty quickly. So we want to pull in this
profit and this cost. I'm going to pull in this cost next, and then I'm going to pull in this
profit. Again, I'm going to change the currency on this, and I'm not going to change the
names right now, but you absolutely can do that. Now the revenue is how much is actually
being sold, so for the United States it was 27 million. Now the cost is how much did it cost
to manufacture or store or distribute all of these products, so that was 16 million. And the
profit is actually how much money is being made at the end of the day, after all their costs,
after all their employee costs, after everything. The United States is still making $11
million. Now you might look at this and you might say, well, I can kind of glance at it and
know that this profit is correct based off these two numbers. But we can do a calculated
field. If you remember what calculated fields are, that's something from Tableau, basically
the exact same thing. And so we can create an additional column right here that is a
calculated field that can add and subtract these things to make sure that our numbers are
adding up correctly.
So let's do that really quickly. Let's go to pivot table analyze, we're going to go over to
fields, items and sets and go to calculated field. Now we can name this anything, and just
for demo purposes I'm going to say Calculated Field Demo. I'm sure yours will be different.
Now if you want to, you can go in here, and this is the formula. We haven't looked at
formulas yet, this is our first tutorial, but when we look at formulas, it's basically the
same thing as writing it inside of a cell, but here it gives us this open text box to do what
we want with it. Now what we're going to do is we're going to do revenue. I'm going to get
rid of this, I'm going to insert revenue, and so that's the very large number, and then we're
going to subtract our cost. We're going to insert that, and let's click OK. So this is our
Calculated Field Demo column that we just created, and as you can see, it matches our sum of
profit column exactly, and that's exactly what we want to see. We want to check to make sure
that these revenue and cost fields are generating the correct profit, and sometimes those are
off, and so it's really good to check those and have that additional column.
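The check performed by the calculated field is just revenue minus cost compared against the stored profit. Here is a minimal Python sketch of that idea, assuming hypothetical per-country sums and an illustrative helper `add_calculated_field`. One row is deliberately wrong to show what a mismatch looks like.

```python
# Hypothetical per-country sums, standing in for the pivot table's value columns.
pivot_rows = [
    {"country": "United States", "revenue": 270, "cost": 160, "profit": 110},
    {"country": "Australia", "revenue": 210, "cost": 140, "profit": 70},
    {"country": "Canada", "revenue": 80, "cost": 50, "profit": 25},  # deliberately off
]


def add_calculated_field(rows):
    """Add revenue - cost as a new column and flag rows where it differs from profit."""
    checked = []
    for row in rows:
        calculated = row["revenue"] - row["cost"]
        checked.append({**row, "calculated": calculated, "matches": calculated == row["profit"]})
    return checked


checked = add_calculated_field(pivot_rows)
mismatches = [row["country"] for row in checked if not row["matches"]]
print(mismatches)  # countries whose stored profit disagrees with revenue - cost
```

An empty list of mismatches corresponds to the calculated column matching the sum of profit column exactly.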
You probably wouldn't have this if you were going to submit this to somebody. Just so you
know, now that this is an actual column, you can't go here and do something like cut and
paste it over here, it won't let you do that. What it is now is an actual column, and so we
can go and remove that and we can add it back at any moment. So if we want to go back and add
that down here, we can do that because we've created that column. It's now permanently there
unless we go and delete all of that data. And so we can just click this check mark and it
will get rid of it for us.
All right, now the last thing that we have not used down here is the filters. Now the filters
area is exactly what it sounds like. It's going to allow you to filter on certain things, but
probably not things that you already have included in your pivot table. So if you add
something like the country down here, it's going to expand everything, and then if you go and
filter on it, it kind of breaks it down. That's really not what the filter is meant for. For
example, right up here we have customer gender. So let's take the customer gender and we'll
put it in the filters. Now we can see all of the revenue, all of the cost, all the profit,
and we can do that based off of the gender. So we can filter by a gender, not really having
to change anything about our pivot table. And so at a super quick glance we can see that the
profit from the males is 16.48%. So at a basic level, at a really quick glance, we can see
that the males are spending a little bit more than the females by about $700,000.
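A filter field narrows the rows before they are grouped, without changing the layout of the table. The sketch below shows that idea in Python under stated assumptions: the rows, the `gender` codes, and the helper `profit_by_country` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows; "gender" plays the role of the field dropped into the filters area.
sales = [
    {"country": "Australia", "gender": "M", "profit": 40},
    {"country": "Australia", "gender": "F", "profit": 35},
    {"country": "Canada", "gender": "M", "profit": 20},
    {"country": "Canada", "gender": "F", "profit": 25},
]


def profit_by_country(rows, gender=None):
    """Sum profit per country, optionally keeping only the rows for one gender."""
    totals = defaultdict(int)
    for row in rows:
        if gender is None or row["gender"] == gender:
            totals[row["country"]] += row["profit"]
    return dict(totals)


print(profit_by_country(sales))              # no filter applied
print(profit_by_country(sales, gender="M"))  # same layout, male customers only
print(profit_by_country(sales, gender="F"))  # same layout, female customers only
```

The rows and columns of the result stay the same in all three calls; only the set of records feeding the sums changes.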
Now let's go ahead and create one more pivot table. We are going to create a pivot table
right over here. Let's go back to sales right here. Again, Control+Shift, right, down, it's
going to select all of our data, and we'll click OK. So one thing that we're going to look at
is we're going to use some of this date information right here. So let's select our country
just like we did before, and what we want to do is see what year we were performing our best,
when were we doing our absolute best with our sales. So I'm going to select the year and put
that in our columns, and so now we have 2011 through 2016. And we want to look at our
revenue, so let's put our revenue right down here, and now we have all of our revenue. Now
let's again make this into a currency, just like that. And super quickly now we can get a
really quick glance at how Australia was doing each year, and we can see that there was a
huge uptick in 2013 and a huge uptick in 2015. It didn't happen for every single country. It
did go up for most countries, very slightly for some, but we can see on a large scale from
year to year what that's like.
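Putting country on the rows and year on the columns is a two-way grouping, often called a cross-tab. Here is a minimal Python sketch, assuming hypothetical rows and an illustrative helper `cross_tab`; the numbers are made up and do not reproduce the trend seen in the video.

```python
from collections import defaultdict

# Hypothetical rows; years become the columns and countries become the rows.
sales = [
    {"country": "Australia", "year": 2013, "revenue": 300},
    {"country": "Australia", "year": 2014, "revenue": 200},
    {"country": "Australia", "year": 2013, "revenue": 100},
    {"country": "Canada", "year": 2014, "revenue": 150},
]


def cross_tab(rows):
    """Build {country: {year: summed revenue}} plus the sorted list of years."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row["country"]][row["year"]] += row["revenue"]
    years = sorted({row["year"] for row in rows})
    return table, years


table, years = cross_tab(sales)
print("country".ljust(12), *years)
for country in sorted(table):
    # Country/year combinations with no rows are printed as 0 in this sketch.
    print(country.ljust(12), *(table[country].get(year, 0) for year in years))
```

Reading across one printed row gives the year-over-year view for a single country.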
And so within just a few minutes we're able to create some really useful pivot tables that
anybody could look at and understand. And that's really the biggest use of these pivot
tables, is that you can group these things together, show some information and data at a
broad, larger scale, and make it to where anybody who's looking at it can understand it. That
is why pivot tables are so useful. And so I hope that this video was helpful. I hope that I
was able to walk through it and help you better understand how pivot tables work and how you
can use them when you are working within Excel. Thank you guys so much for watching, I really
appreciate it. If you like this video, be sure to like and subscribe below, and I'll see you
in the next video.
[Music]
What's going on everybody, today we're going to be looking at formulas in Excel. Now I know
what you're thinking: there's absolutely no way that you're going to be able to show us every
single formula in Excel, and you're absolutely right. But I am going to show you some of my
favorites and the ones that I found the most useful, and then you can go ahead and practice
those and try those out. And if there are ones that you really want me to do and you think
that I missed, put it in the comments below and I will see those, and I'll try to make a list
of those and make another video on formulas and include all of those as well. And now before
we jump into the actual tutorial, I want to give a huge shout out to the sponsor of the
series, and that is Udemy. You guys already know, if you have watched any of my videos, that
I absolutely love Udemy. I mean honestly, they were the ones who got me started and were able
to give me affordable courses for me to get started as a data analyst. I learned SQL and
Excel and Python all through Udemy courses, and so if you are looking for a platform to take
a course, I absolutely recommend you look at Udemy. They have fantastic sales going on right
now, especially during the holiday season and this new year. And so if you're looking to take
a full-fledged Excel course, I have some of my favorites in the description below. And now
without further ado, let's jump onto my screen and get started with the tutorial.
All right, now before we start I want to say that this is not like every other tutorial that
I have created. This one is very streamlined, so I already know exactly what I'm going to do.
There's not going to be much messing around. I left little notes here and there, and I'm
going to try to get through it because there's a lot of them to get through. So all these
ones at the bottom, these are ones that I use a lot, that I think are useful. Again, if you
know other ones that you use a lot that you think I should be using, which I know there are
ones that I left out of here, put it in the comments. I'll see the ones that people are
liking and I will create more videos on them, because I know there are so many. I also will
save this Excel file on the GitHub so you can go and download it. It will be exactly what
you're looking at right now. I highly recommend trying these formulas out for yourself so you
can get a feel for how they work and how they're actually used, and you can mess around with
it yourself. So as you can see at the bottom, we're going to start with MAX and MIN, and then
we're going to go on to some, I think, a little bit more difficult things. And all these
things are super useful. I'll try to talk about how you can actually use it as we go through
it. Some are super self-explanatory, but some may not be.
self-explanatory but some may not be so this one I think is super
this one I think is super self-explanatory but again one that
self-explanatory but again one that you're going to use all the time um and
you're going to use all the time um and so what we can do is we can say equal
so what we can do is we can say equal and that's how you kind of start off
and that's how you kind of start off saying this is going to be a formula in
saying this is going to be a formula in this cell equal means uh I am now
this cell equal means uh I am now creating a formula and we're going to
creating a formula and we're going to say
say MX and I'll hit Tab and so it'll kind of
MX and I'll hit Tab and so it'll kind of populate it and right here if you've
populate it and right here if you've never seen a formula before it'll to
never seen a formula before it'll to give you what the inputs need to be so
give you what the inputs need to be so it's going to say Max of number one
it's going to say Max of number one number two etc etc what we're going to
number two etc etc what we're going to do is we're going to give a range so
do is we're going to give a range so we're going to go from here down to here
we're going to go from here down to here you don't have to close the parenthesis
you don't have to close the parenthesis but you can I'm going to and then you
but you can I'm going to and then you hit enter and so for this date it's
hit enter and so for this date it's going to give us the max date now these
going to give us the max date now these are um the start dates for these people
are um the start dates for these people right here and so if we just kind of
right here and so if we just kind of glance through here we can see that 2013
glance through here we can see that 2013 was the last year and this one is
was the last year and this one is actually the latest in that year and so
actually the latest in that year and so it gave us the correct one the Min is
MIN is going to do the exact opposite: it's going to give us the smallest. We'll
give it the same range, close the parenthesis, and it's going to say December 7th of
1995, and we can see that that is correct, so Michael Scott started in 1995, the
earliest of all the employees. You can do the exact same thing for really any of
these columns. We can see who's making the most money, or at least what the highest
salary is, so we'll do MAX and then the salary range. Whoops, what did I do? Did I
do the wrong range? No, I didn't do the wrong range, this column was just a date
column for whatever reason. Let me get rid of that. Then we can do equals MIN, and
again we'll do the salary. At a quick glance we can see that Pam Beasley is making
the least, and the 65,000 is Michael Scott's salary. So, super simple: it shows the
max, it shows the min, you can select a range, there you go.
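Written out, the formulas from this part look roughly like the sketch below. The cell ranges are assumptions for illustration only (start dates in H2:H10, salaries in G2:G10); in the video the ranges are selected with the mouse, so adjust them to wherever those columns sit in your own sheet.

```excel
=MAX(H2:H10)
=MIN(H2:H10)
=MAX(G2:G10)
=MIN(G2:G10)
```

The first pair returns the latest and earliest start date, the second pair the highest and lowest salary. If a salary result shows up looking like a date, the likely cause is the number format of the result cell, which is what happens in the video; switching that cell's format to General or Number should display it as a plain number.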
Let's move on to IF and IFS. IF is, I think, pretty straightforward: all you're
going to do is say if this, then that. IFS is a little bit different, in that you
can put multiple conditions, and as we're writing it I'll show you the conditions
that need to be met. All right, so we're going to click right here, say equals, do
IF, and hit Tab. We need a logical test, so we're going to give it a range and say
if it's equal to or greater than something, something like that. Then we say, if the
value is true, what is going to be the output, and if the value is false, what's
going to be the output. So let's do this right here: we'll do this age range, and if
they are greater than, let's say, 30, then we do a comma. If the value is true, what
should be the output? If they're greater than 30, we're going to call them old. Then
if it is false, so if they're not over 30, what should it say? We're going to say
young, and we'll close the parenthesis. And there you go: if they're over 30 they're
going to have old, and if they're 30 or younger they're going to have young. This is
something where you need to specify whether you want 30 and over, or over 30. We
chose over 30, so 30 is not included in that, so they're going to be young. We don't
actually need two of these, and that's pretty self-explanatory.
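As a minimal sketch of the IF from this part, assuming the age sits in cell D2 (the column letter is a guess, not something stated in the video):

```excel
=IF(D2>30, "Old", "Young")
=IF(D2>=30, "Old", "Young")
```

The first line is the "over 30" version used here, so someone who is exactly 30 gets Young. The second line is the "30 and over" alternative, where someone who is exactly 30 gets Old. The only difference is `>` versus `>=`.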
IFS is a little bit different, right? You can have multiple conditions, so let's
open that up real quick. So IFS, and now we have a logical test and the value if
that's true, then you can do logical test two and the value if that's true, so you
can have multiple, multiple things. Now, this one is a little bit different. Let me
get out of this. In IF, you had a value if true and a value if false. IFS does not
have that. IFS is going to give you different results for different specific
conditions, and you can't say what happens if this one's false; you're just going to
have multiple conditions.
So let's do equals and IFS, Tab, and we'll do our first logical test. Let's do: if
that equals Salesman, we're going to respond with Sales. So that's if the value is
true, that's what we want the output to be. Now we're going to go on to our logical
test two, and you're going to see this pattern, right? This is our conditional, or
logical test, and if this is true, this is what's going to be returned, so you'll
notice that's just a pretty simple pattern. We can just do random things, so we'll
do the same one: if that is equal to, say, HR, we can say Fire Immediately. And now
we're going to say, if it's equal to Regional Manager, we say Give Christmas Bonus,
and we'll close the parenthesis and let's see what we get. As you can see, there's
no default value for true or false. In IF, there was a logical test, and if it was
true there was a value and if it was false there was a value, so for every single
one you'll get a value. For this one that's not exactly going to happen: as you can
see, there are these #N/As. When that happens, it just means nothing met a
condition. We never said anything about Supplier Relations, we never said anything
about Accountants, but if it was part of that IFS statement then it got something.
And so that is how IFS works.
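Here is roughly what that IFS looks like typed out. The cell F2 for the job title is an assumption for illustration; the test values and outputs are the ones spoken in the video.

```excel
=IFS(F2="Salesman", "Sales", F2="HR", "Fire Immediately", F2="Regional Manager", "Give Christmas Bonus")
=IFS(F2="Salesman", "Sales", F2="HR", "Fire Immediately", F2="Regional Manager", "Give Christmas Bonus", TRUE, "Other")
```

The first line returns #N/A for any job title that matches none of the tests, which is the behavior shown here. The second line is not in the video: it adds a final test that is always TRUE, a common way to give IFS a catch-all value. The "Other" label is just a placeholder.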
Now let's move on to length. This one is exactly what it sounds like, but let me go
over some of the uses for it. I've used length for a lot of different things. MAX
and IFS you can use for almost anything, and length has a lot of different use cases
too. For one, I used to work with a lot of customer data or patient data that had
things like Social Security numbers, and if there were bad Social Security numbers
we didn't want to include them. So we'd take the length of that, and if a Social
Security number was, let's say, 10 or 11 numbers where it should only be nine, or
however many they are, I think it's nine, then we know that that Social Security
number is incorrect and we can discard it from our results. That's just an example,
right? So for this, oops, what did I do? I did Ctrl+Z to undo that, if you didn't
know how to do that. So we're going to do equals LEN, which is length, and again, if
you didn't see that, it returns the number of characters in a text string. Let's go
right here, go to their last name, and give it a range, so it's going to tell us how
many characters are in that string. For Halpert it's seven characters, for
Flenderson it's 10 characters, and we're able to see a length. Again, there are a
lot of different use cases for this. The Social Security number was one; another one
is phone numbers, right? If you look at the length of the phone numbers and there
are ones that are like 12 numbers long, those might not be accurate, and you need to
go look at them and see if you want to include them in your results or your output.
So that is how length is done.
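A small sketch of both uses. The cell references are assumptions: C2 stands for a last name, and K2 is a hypothetical cell holding an ID number stored as text with no dashes.

```excel
=LEN(C2)
=IF(LEN(K2)=9, "OK", "Check")
```

The first line is the plain character count, so a cell containing Halpert gives 7. The second line combines LEN with IF to flag any ID that is not exactly nine characters long, which is one way to do the kind of validity check described here; the "OK" and "Check" labels are placeholders.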
Let's move right over to LEFT and RIGHT. I might be going a little fast, but I'm
keeping it live, I'm keeping us on our feet, so let's keep going. LEFT and RIGHT
are kind of like substrings. If you've taken the SQL tutorial series that I've done,
substrings are where you can choose a certain part of the text string and extract
data from that, and you usually have to reference a certain number of characters.
That's the exact same thing here, except unfortunately there's no SUBSTRING. There's
SUBSTITUTE, but there's no SUBSTRING; LEFT and RIGHT are really the closest thing
that we have. So let's take a look real quick and see what we can do. We're going to
do LEFT, and it says it returns the specified number of characters from the start of
a text string. So we're starting from the very far left, and we need to choose our
text and then choose the number of characters that we're going to take. Let's go
over here and start simple; we'll get a little bit more advanced. This is our text
range, so these are the ones that we want to look at, and then how many characters
do we want to take? We'll just choose three as an example, and you can see that it
takes the first three characters from every single one. Now, you can also do this
with numbers; it doesn't just have to be names with actual words or letters. You can
do the exact same thing, so you can say RIGHT, and we're going to choose our string.
Let's do this one, where all of them start with 100, and we'll just say we want to
take the last one. This one is going to start from the very far right and go over
one character. Right here you can see this is our range, and I just chose one, so
starting from the very far right we go over one character and that's what we take.
And so that can definitely be useful.
Another one that you can do, and this is one that I have used so many times,
honestly countless times in my actual job: we're going to go from the right and
we're going to look at a date. Sometimes you have these date structures, month,
month, day, day, then year, or day, month, year, all these different ones, and
sometimes you just want to extract the month or the year or the day, something like
that. So we come in here, oops, I wanted to make that a range, and we want to
extract the year of the start dates. We're going to do that and then go over four,
so we take the first four characters from the right to give us the entire year.
Let's do that, and now we can see exactly the year. This can be just super, super
useful. This is, again, one that I've used a lot, so it's one you might want to
remember in case you're ever doing analysis on start and end dates, or anything with
date data. Again, one that I highly recommend remembering.
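The three examples from this part, sketched with assumed cell references: C2 for a last name, A2 for an employee ID that starts with 100, and H2 for a start date that is stored as text (for example 11/2/2001), not as a real Excel date.

```excel
=LEFT(C2, 3)
=RIGHT(A2, 1)
=RIGHT(H2, 4)
```

The first takes the first three characters of the name, the second takes the last character of the ID, and the third takes the last four characters of the date text, which is the year. LEFT and RIGHT always return text, even when the input is a number. Excel also has a MID function for taking characters from the middle of a string; the video doesn't use it, but it is the closest match to a SQL substring.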
Let's go over to date to text. I actually probably should have included that
before, because I actually used it in this one. If you notice right here, this is
text, so in the one we just did, that was text. You can't use RIGHT on start and end
dates when they're in a date format, and let me show you. So this is a date. Now if
I do equals RIGHT, like we just did, on the end date, and I'll do the whole range,
give me a second, and we'll do four, it's giving us completely random numbers. Why
is that? Because underneath the date format there are numbers. If I go right here
and make this General, it's going to show the numbers, and look, these are the first
four characters from the right. So it's doing what it's supposed to do, but it's not
doing what we actually want, and that's the issue. So how can we convert this? There
are a ton of different ways, but here's probably the quickest and easiest, besides
actually writing it out by hand. If you type it like 11-2-2001, Excel converts it to
a date format, but just so you know, you can also enter it as text, and then
11-2-2001 will stay a text string. As you can tell, these are a little bit
different, because this one is situated on the right and this one's on the left;
that's how you can tell the difference. Now, if you don't want to do it by hand,
completely manually, and waste hours of your time, you can do it in a very simple
way.
We're going to do TEXT; this is the exact formula that we're going to use. Let's
get rid of that one. There we go. So we do equals TEXT, and it says it converts a
value to text in a specific number format. For a date, we can choose a date format
and then it'll convert it to text for us, which saves so much time, I promise you.
Let's select all of these, just like we did, and then we need to tell it what the
format is. If we tell it something incorrect, it's going to give us a completely
terrible output or just give us an error altogether. This is a day-day, month-month,
year-year-year-year format, and that is what we're going to do, so we type
dd/mm/yyyy and close that up, and there you go. Now, because it's in a formula, what
we need to do is copy this and paste it right over here, and now you can see that it
is General; this is something that we can use as a string. Let's check it just to
make sure: we're going to do RIGHT, select all of them, and do four, and there you
go. Now it works; that is what we're looking for. Imagine doing that with millions
of rows, or let's say 10,000 rows: it's going to be a breeze, right? It's going to
take you a minute or two to do everything that you want to do, instead of having to
do a bunch of messy manual work to convert it to a string, which I promise you I've
done, and it just takes forever, it's terrible. So that is date to text, a super
helpful formula.
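Put together, the date-to-text steps look something like this. The cell I2 for an end date is an assumption, and so is the exact format string: the transcript only spells out day-day, month-month, then a four-digit year, so the slashes are a guess at what was typed.

```excel
=RIGHT(I2, 4)
=TEXT(I2, "dd/mm/yyyy")
=RIGHT(TEXT(I2, "dd/mm/yyyy"), 4)
=YEAR(I2)
```

On a real date, the first line returns the last four digits of the underlying serial number, which is why the results look random. The second line converts the date to a text string in the chosen layout, and the third does the conversion and the RIGHT in one step, so there is no need to copy and paste in between. The last line is not in the video: YEAR is a built-in function that returns the year of a real date directly, which may be simpler when the year is all you need.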
Let's go over to TRIM. Now, I purposefully messed up this column. Why did I mess it
up like this? Because when you're working with real data, you're going to get data
like this. It's messy, it's dirty, it just has random spaces at the end for no
reason. Sometimes you're going to be working with data that is input by a user, not
chosen from a drop-down option. So imagine somebody's typing this in, they
accidentally put a space or hit Enter or something, and then they submit it, and
this is how it's going to look in the database. And if the data engineer, or whoever
is working with the raw data, doesn't clean that up, then you're going to be working
with that dirty data. I guarantee you, if you're working as a data analyst, you're
going to see stuff like this, maybe not with a last name, but with all sorts of
data. So we're going to go right here and say equals TRIM, open parenthesis. This
says it removes all spaces from a text string except for single spaces between
words. So if it said Jim space Halpert, it won't take out the space in between,
because a single space between words is supposed to be there. We'll give it this
range, close that up, and there you go: now it is nice and clean, much more usable.
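A short sketch of TRIM, assuming the messy last name is in cell C2 (the cell reference is illustrative):

```excel
=TRIM(C2)
=LEN(C2)-LEN(TRIM(C2))
=TRIM(CLEAN(C2))
```

The first line is the formula from the video. The second is a quick check that is not in the video: it counts how many extra spaces TRIM removed, so 0 means the value was already clean. The third is also an addition: TRIM only removes ordinary space characters, so if a value contains a line break from someone hitting Enter, wrapping it in CLEAN first strips that kind of non-printing character.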
Now let's look at CONCATENATE, one that I have used just way, way too many times.
You'll see this example in a lot of demonstrations, for a good reason: a lot of
people use it for this. Let me tell you what CONCATENATE does real quick. Oops, I'm
totally messing up here. It joins two or more text strings into one string; it
basically joins things together. So let's do CONCATENATE, and we're going to join
the first and last name. Again, it's one that gets shown all the time, but that's
because it really is useful. So you select this, and then say, now I want to include
this, so we're concatenating this and this. Let's take a look: it says JimHalpert,
but it's all connected, and that's typically not how people write their names. So
what we can do is go back in here and do what my demonstration up here already tells
us to do, which is add another argument in here. If we add two quotation marks, we
can include anything between them: a dash, an exclamation point, or just a space. So
let's include a space really quick, and just like that it works perfectly, and so
now we have the full name.
Now, something that you could use it for is something like generating an email
address. This is something that you absolutely could do, and it's pretty simple.
I'm going to do it like this: oops, what did I do? I'm going to put a dot in
between, and then at the end I'm going to add, comma, quotation mark, @gmail.com.
And now I've created emails for all of these people. So that's just something that
you can do with this, and something that it absolutely is used for. You'll see that
demonstration almost everywhere, because honestly it gets used a lot by data
analysts, so it's a good one to know, understanding how concatenation works. Let's
go over to the next one.
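The CONCATENATE steps from this part, written out with assumed cell references (B2 for the first name, C2 for the last name):

```excel
=CONCATENATE(B2, C2)
=CONCATENATE(B2, " ", C2)
=CONCATENATE(B2, ".", C2, "@gmail.com")
=B2 & " " & C2
```

The first line gives the run-together JimHalpert, the second adds a space as its own quoted argument, and the third builds an address like Jim.Halpert@gmail.com. The last line is not in the video: the `&` operator joins text the same way and gives the same result as the second line.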
So we are going to do SUBSTITUTE. SUBSTITUTE is really interesting, and there are
different ways you can use it. I'm going to show it to you on these dates real
quick. Changing a date format, changing what it's supposed to look like, is
absolutely something that happens all the time. Sometimes you'll even get it like
this, where it's messy and in a different format: all the other ones have slashes,
where these ones have dashes. Let me actually go with the no-instance version real
quick, because that one makes the most sense. We'll do equals and say SUBSTITUTE,
and it says SUBSTITUTE replaces existing text with new text in a text string. If we
do an open parenthesis, it says we have the text, the old text, the new text, and
then which instance we are looking at, and I'll explain that in a little bit.
So the text that we're going to be looking at is this one right here, so let's take this range. The old text is this dash, so let's take the dash. And then what do we want to replace it with? We want to replace it with this slash right here. I think it's a forward slash; isn't that what it's called? It's called a forward slash, am I crazy? And we're not going to put an instance. Notice that that's in a bracket, which means it's optional, so we're going to do none of that. What it's going to do is fix this, so this one is now in the correct format that we want. That's fantastic; that's what we tried to accomplish given what we had. Now let's fix that. If we want to do the exact same thing, we can do SUBSTITUTE, open parenthesis, and we'll give the range. Let's say we want to change all of them to a different format, so instead of the forward slash (I'm going to keep calling it that if that's correct) we want to give it a dash. Then we close that, and now all of them are in this new format. So it's able to substitute a specific value for a new value, and if you don't include an instance, then it'll do it to every single one in there.
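As a rough Python sketch of SUBSTITUTE with no instance number, every occurrence of the old text is replaced. The cell reference in the comment and the sample dates are assumptions for illustration; `substitute_all` is a hypothetical name.

```python
# Excel equivalent (cell reference assumed): =SUBSTITUTE(A2, "-", "/")

def substitute_all(text, old_text, new_text):
    """Replace every occurrence of old_text, like SUBSTITUTE without instance_num."""
    return text.replace(old_text, new_text)


# Sample values only: one date uses dashes, the others use slashes.
dates = ["11/2/2001", "10-3-1999", "7/4/2000"]
fixed = [substitute_all(d, "-", "/") for d in dates]
print(fixed)
```

Swapping the last two arguments converts every date the other way, from slashes to dashes.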
So let's go over here, and we're going to actually use the instance num, and I'll show you what that does. Really quick, we'll do the exact same thing that we just did: we'll do the forward slash, and we want to replace it with this dash again, but we only want to do it on the first instance of that forward slash. As you can see, all the ones that were replaced are the very first instance, whereas the second instance, which is the second time it appears in this string, does not get touched. If we take this, put it right over here, and move it to two, it's kind of the opposite: the first one wasn't touched, the second one was. So we're choosing which instance, or which time it shows up in that string, and then it replaces it. If you do not choose an instance, it chooses all of them. This can be super useful if you want to do a bulk replace but you only want to do it on a specific column, and you just want to use a formula really quick. You can use this in a lot of different ways. So that's how you're able to do it with the first instance, the second instance, and if you don't include an instance at all.
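The instance behaviour can be sketched in Python as well. This is an approximation written for illustration, assuming non-overlapping matches counted from the left; `substitute` is a hypothetical helper, not Excel's implementation.

```python
# Excel equivalents (cell reference assumed):
#   =SUBSTITUTE(A2, "/", "-", 1)   first slash only
#   =SUBSTITUTE(A2, "/", "-", 2)   second slash only

def substitute(text, old_text, new_text, instance_num=None):
    """Replace all occurrences, or only the instance_num-th one (1-based)."""
    if not old_text:
        return text
    if instance_num is None:
        return text.replace(old_text, new_text)
    if instance_num < 1:
        raise ValueError("instance_num must be 1 or greater")
    pos = 0
    idx = -1
    for _ in range(instance_num):
        idx = text.find(old_text, pos)
        if idx == -1:
            return text  # fewer occurrences than requested: leave text unchanged
        pos = idx + len(old_text)
    return text[:idx] + new_text + text[pos:]


print(substitute("11/2/2001", "/", "-", 1))
print(substitute("11/2/2001", "/", "-", 2))
print(substitute("11/2/2001", "/", "-"))
```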
Let's go over to the SUM. This is one I think everyone knows how to use, but I want to show you two other ones as well. So let's go to the SUM, and we're just going to do equals SUM. If you don't know what this is, it just adds up all the numbers in a range; sum means add. So we're going to take this, and it's going to give us what all these salaries are together. Super, super simple; SUM is probably one of the most basic formulas that you can do. SUMIF is a little bit different. You can add an IF statement, which we learned right back here, and then add a value only if it meets a certain criteria. All right, so we're going to do equals SUMIF, and then you're going to need to give a range and criteria, and you can include a sum range if you would like. We're going to do the salary again, then a comma, and now here's our criteria: let's do if they have greater than 50,000 for their salary, and close our parenthesis. So now it's only going to add a salary if it is greater than 50,000. Now his is 50,000 exactly, so that won't count, but we have 63,000 and 65,000, which does equal 128,000. So it just takes a specific criteria, or an IF statement, and then it does the addition. Super useful, that one. So that is how you do a SUMIF.
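A short Python sketch of SUM and SUMIF follows. The salary list is sample data assumed for illustration, chosen so that the two salaries above 50,000 are the 63,000 and 65,000 mentioned in the walkthrough; the cell ranges in the comments are also assumptions.

```python
# Excel equivalents (ranges assumed):
#   =SUM(G2:G10)
#   =SUMIF(G2:G10, ">50000")

# Sample salaries, assumed for illustration.
salaries = [45000, 36000, 63000, 47000, 50000, 65000, 41000, 48000, 42000]

total = sum(salaries)                              # SUM: add everything
over_50k = sum(s for s in salaries if s > 50000)   # SUMIF: add only values that pass the test

print(total, over_50k)
```

Because the test is strictly greater than, the salary of exactly 50,000 is left out, just as in the spreadsheet.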
And SUMIFS is kind of the same thing as we did back here, where there's the IF and the IFS: the IFS is for when it meets multiple conditions. So let's take a look at that one. Let's do equals SUMIFS. Now the syntax for this one is going to be a little bit different; you'll see that in just a second. This adds the cells specified by a given set of conditions or criteria. So let's do an open parenthesis, and we give the sum range first, so let's do the same one as before. This is the area that's going to be added after all these IF statements are done, so we have to set that initially. Then we have our criteria range: okay, what criteria are we basing this off of? So let's put a comma, and we're going to base it off of this one, the gender, and we'll do comma, if that's female. And then we'll give another one: we can say if they're female and, let's say, they are greater than 30, and we'll close that up. It's going to give us 88,000. So female, female, there's one, two right here, so it's going to be this one and this one, and that equals 88,000. So that's how that works: you're able to incorporate several different conditions into the SUM formula. Again, I know this one's super simple, but you can use it in a much more complex way if you use the SUMIF and the SUMIFS.
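Here is a minimal Python sketch of the SUMIFS idea: sum one column, but only for rows that pass every condition. The rows, the assumption that "greater than 30" refers to an age column, and the ranges in the comment are all illustrative; `sumifs` is a hypothetical helper.

```python
# Excel equivalent (ranges assumed):
#   =SUMIFS(G2:G10, E2:E10, "Female", D2:D10, ">30")

# Sample rows, assumed for illustration.
employees = [
    {"gender": "Male", "age": 30, "salary": 45000},
    {"gender": "Female", "age": 30, "salary": 36000},
    {"gender": "Male", "age": 29, "salary": 63000},
    {"gender": "Female", "age": 31, "salary": 47000},
    {"gender": "Male", "age": 32, "salary": 50000},
    {"gender": "Male", "age": 35, "salary": 65000},
    {"gender": "Female", "age": 32, "salary": 41000},
    {"gender": "Male", "age": 38, "salary": 48000},
    {"gender": "Male", "age": 31, "salary": 42000},
]


def sumifs(rows, sum_field, conditions):
    """Add sum_field for every row where all conditions are true."""
    return sum(row[sum_field] for row in rows if all(test(row) for test in conditions))


female_over_30 = sumifs(
    employees,
    "salary",
    [lambda r: r["gender"] == "Female", lambda r: r["age"] > 30],
)
print(female_over_30)
```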
It's almost the exact same thing for this COUNT. I'm not going to go super in depth into this one; I'll just kind of show you, because COUNT and SUM are kind of on the same level of difficulty. They're both pretty beginner. This is just going to give you a count of how many cells with numbers in them are there. So let's give this range, and it's not going to add it, it's just going to give us a count. If we highlight them right here, this count down here is nine, and so it's going to give us that count. But we can do a count with conditions, exactly how we did it with the sum. If we do COUNTIF, we're going to give a range and a criteria, exact same as we did before. You can do this on basically any of these; for this demonstration it doesn't really matter. But we'll say if their salary is greater than 45,000, so this is going to give us how many people have a salary over 45,000, and that's five. So before, in the SUMIF, where we did 50,000, it adds everything together; the count is just going to count the number of cells that meet that criteria.
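The difference between counting and summing can be sketched in Python. The salaries are sample values assumed for illustration, and the ranges in the comments are assumptions too.

```python
# Excel equivalents (ranges assumed):
#   =COUNT(G2:G10)
#   =COUNTIF(G2:G10, ">45000")

# Sample salaries, assumed for illustration.
salaries = [45000, 36000, 63000, 47000, 50000, 65000, 41000, 48000, 42000]

# COUNT: how many cells hold numbers (bool is excluded on purpose here).
count_all = sum(
    1 for s in salaries if isinstance(s, (int, float)) and not isinstance(s, bool)
)

# COUNTIF: how many cells pass the test; nothing is added together.
count_over_45k = sum(1 for s in salaries if s > 45000)

print(count_all, count_over_45k)
```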
And again, COUNTIFS: we're going to have a criteria range, and then we will specify what IF statements we want to occur in order to count those cells. It can be any range, any of these, so we'll do the ID this time. Now for our criteria one, we can say we want their ID to be greater than 1005, and let's say we want them to be male. So they have an ID over a certain number, and then they are male. There's only three people that meet that criteria, and it'll be Michael, Stanley, and Kevin; those are our three people, and so it gives us a count. Very useful to give quick numbers like this, and something I genuinely use a lot. I know I've said that a lot during this tutorial, but that's because everything I'm showing you are things that I've used a lot, so I don't feel like I'm speaking out of turn here.
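A Python sketch of the COUNTIFS example: count rows where the ID is above 1005 and the gender is male. Only Michael, Stanley, and Kevin come from the walkthrough; the other rows, the ID values, and the ranges in the comment are assumed for illustration.

```python
# Excel equivalent (ranges assumed):
#   =COUNTIFS(A2:A10, ">1005", E2:E10, "Male")

# (id, first name, gender): sample rows, assumed for illustration.
rows = [
    (1001, "Jim", "Male"),
    (1002, "Pam", "Female"),
    (1003, "Dwight", "Male"),
    (1004, "Angela", "Female"),
    (1005, "Toby", "Male"),
    (1006, "Michael", "Male"),
    (1007, "Meredith", "Female"),
    (1008, "Stanley", "Male"),
    (1009, "Kevin", "Male"),
]

# Keep only the rows that satisfy both conditions, then count them.
matches = [name for emp_id, name, gender in rows if emp_id > 1005 and gender == "Male"]
print(len(matches), matches)
```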
Let's look at this one. This one has some specific use cases. Notice that this is text right now. If you do it when it is in a date format, it actually will not work. You can test it out yourself, but you've just got to trust me, it's not going to work. What this does is give you the number of days from this day to this day. So let's do DAYS, and we want to choose our end date first, so this is our end date. It's kind of backward from what you'd think: it goes end date to start date, where you'd think start date to end date. So you have to start with this one, and then we're going to choose the start date, and now it's going to tell us how many days it was from here to here. For this one it's 5,056.
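The DAYS calculation is a plain date subtraction, which can be sketched in Python. The two dates below are assumed sample values, not read from the video, and the cell references in the comment are assumptions; the `days` helper keeps the same end-date-first argument order.

```python
from datetime import date

# Excel equivalent (cell references assumed): =DAYS(K2, J2)
# Note the order: end date first, then start date.

def days(end_date, start_date):
    """Number of calendar days from start_date to end_date."""
    return (end_date - start_date).days


# Sample dates, assumed for illustration.
elapsed = days(date(2015, 9, 6), date(2001, 11, 2))
print(elapsed)
```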
So NETWORKDAYS is extremely similar, except it takes out weekends and holidays, so you can see how many working days, or network days, this person has actually worked between their start date and their end date, not including weekends and holidays. So let's do NETWORKDAYS, and we need our start date and our end date. You can also specify holidays if you'd like; the holidays argument is optional, and only the dates you list there get taken out in addition to weekends. So we're going to do the start date first. Again, this one's different: this one says start date, end date. Then we're going to give the end date. And if you notice, they are going to be different numbers. This one is dramatically lower because it's taking out weekends. So this is how many calendar days they've worked, and this is how many days they've actually been in the office and worked. And that is it.
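A simple Python sketch of the NETWORKDAYS idea: count the days from start to end, inclusive, skipping Saturdays, Sundays, and any holidays passed in. It assumes the start date is not after the end date, and `networkdays` is a hypothetical helper rather than Excel's own implementation; the cell references in the comment are assumed.

```python
from datetime import date, timedelta

# Excel equivalent (cell references assumed): =NETWORKDAYS(J2, K2)
# Order here is start date first, then end date, then optional holidays.

def networkdays(start_date, end_date, holidays=()):
    """Count weekdays from start_date to end_date inclusive, minus listed holidays."""
    holiday_set = set(holidays)
    count = 0
    current = start_date
    while current <= end_date:
        # weekday() is 0 for Monday through 6 for Sunday
        if current.weekday() < 5 and current not in holiday_set:
            count += 1
        current += timedelta(days=1)
    return count


# One full week starting on Monday 2024-01-01.
week = networkdays(date(2024, 1, 1), date(2024, 1, 7))
week_with_holiday = networkdays(date(2024, 1, 1), date(2024, 1, 7), [date(2024, 1, 1)])
print(week, week_with_holiday)
```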
Again, there are so many formulas, literally hundreds of formulas that you can use and that are out there for you to try out yourself. If there are specific ones that I did not cover in this video, please put them in the comments below so that I can show you how to do these things. I will say I've probably used a majority of the ones that you're going to put in the comments already, and if I haven't used one, I'll take a look at it, see if it's really useful, and show you that. So thank you guys so much for watching. I hope that this has been helpful. I feel like a lot of these things are not things that I learned before I started; almost all of these are ones that I learned while I was on the job. So I'm hoping that you can get ahead of the curve and learn these things before you actually start, so that when you get in there you're just killing it with the formulas, and people are like, "Whoa, this guy knows what he's doing in Excel, give him all the Excel work," and then you become the Excel guy and everyone loves you for it. So with that being said, thank you so much for watching. I really do hope this helped. If you like this video, be sure to like and subscribe below. I'll see you in the next video. [Music]
What's going on everybody, welcome back to another video. In this Excel tutorial we'll be looking at XLOOKUP. [Music] Now, if you don't already know what XLOOKUP is, it is a new feature in Excel to kind of replace VLOOKUP, or at least in my mind it is a much better option than VLOOKUP. So if you're someone who's used VLOOKUP a lot and you're trying to learn this new option, or if you've never used it before, this video will be super helpful, because I'll walk you through the options and what XLOOKUP can do, as well as the difference between XLOOKUP and VLOOKUP. But before we get into the tutorial, I want to give a huge shout out to today's sponsor, and that is Udemy. Udemy is the go-to place if you want a full-fledged course in Excel. I have three options of courses that I have taken on Udemy, so I'd highly recommend checking those out. They are having a huge sale on all their courses during this time, so if you are in the market for a course, I highly recommend checking out Udemy and getting one there. Now, without further ado, let's jump on my screen and start the tutorial.
All right, so let's get me off the screen, because we all know why we're here. I didn't include this in the formulas video last week because I knew this was going to be a large one, and a lot of people are going to want to know how to do this and what the difference between VLOOKUP and XLOOKUP is, so it has its own dedicated video. So let's get started. It is a formula, so we're going to come in here in this cell, we're going to hit equals, and then we're going to start typing XLOOKUP. Now I'm going to hit Tab in just a second, but let's read what this says. It says: searches a range or an array for a match and returns the corresponding item from a second range or array; by default an exact match is used. So that's really useful to know, and we'll talk a little bit more about that in just a second. Let's hit Tab, and it's going to complete it and tell us what our input values need to be. We're going to have our lookup value, our lookup array, our return array, and then some options: if not found, which is the output it gives us if your value isn't found, plus a match mode and a search mode. I'm going to show you how to use every single one of these things. As you can see at the very bottom, I've already set up all of the instructional content for this video, and so we'll get through all these different scenarios.
So let's just start really quickly with how to use it very simply, with the lookup value, lookup array, and return array. We're going to come in here and give it our lookup value. Now Toby Flenderson, right over here in A3, is going to be our lookup value, so that's who we're going to be searching for. Now we're going to hit comma, and we need to input our lookup array. An array is just a range, basically. This is where it's going to be searching for that value; this is where it searches for A3. So here's Toby Flenderson, and here's Toby Flenderson, so it will find it in this array right here. Then we're going to hit comma, and now we need to give it the return array: what it's going to return on that row when it finds it. We're going to return his email, to keep it really simple, and let's close the parenthesis. What it should do is take Toby Flenderson, search in this column, or in this array, and then return the email when it finds Toby Flenderson. Toby Flenderson is on row six, so it's going to find Toby Flenderson, come over here, and return the Toby Flenderson dundermifflincorporate.com email. That's what it should do; let's see what it actually does. Hit Enter, and it returns it. Now if we drag it down like this, it'll apply it to all of these names right here, and it works exactly how it's supposed to.
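The basic lookup can be sketched in Python as a search through one list that returns the item at the same position in a second list. This is a simplified illustration: the ranges in the comment, the extra names, and the example.com addresses are assumptions, `xlookup` is a hypothetical helper, and unlike Excel it compares text case-sensitively and returns `None` (rather than an error) when nothing matches and no fallback is given.

```python
# Excel equivalent (ranges assumed apart from A3):
#   =XLOOKUP(A3, I2:I10, P2:P10)

def xlookup(lookup_value, lookup_array, return_array, if_not_found=None):
    """Return the item from return_array on the row where lookup_array matches exactly."""
    for key, result in zip(lookup_array, return_array):
        if key == lookup_value:
            return result
    return if_not_found


# Sample data, assumed for illustration.
names = ["Jim Halpert", "Pam Beasley", "Toby Flenderson", "Michael Scott"]
emails = [
    "jim.halpert@example.com",
    "pam.beasley@example.com",
    "toby.flenderson@example.com",
    "michael.scott@example.com",
]

found = xlookup("Toby Flenderson", names, emails)
missing = xlookup("Nobody Here", names, emails, "Not found")
print(found, missing)
```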
Again, if you have never used VLOOKUP, you don't know how good you have it. VLOOKUP was extremely useful, but just a bit complicated, and I'll talk about that near the end of the video when we compare VLOOKUP to XLOOKUP. But just know that if you're using XLOOKUP for the first time and you're just getting into using Excel, you guys have it good, okay? So just know that.
Now let's go over here to XLOOKUP multiple rows, because you can return more than one output with XLOOKUP.
Let's go right in here, and we're going to write basically the exact same thing as we did before. So let's write XLOOKUP, we're going to use Toby Flenderson as our value, we're going to search here, and then we're going to do something a little bit different.
This time we want to include our end date and the email. So we're going to start here, go all the way down to the bottom of End Date, and then also include the Email. When we do that, the output will give us a column for end date and a column for email, so an output for both.
So let's hit enter, and now we can see that we have the end date here and the email here.
Now, there's one downside, or something that I'm not a huge fan of. First off, I love that you can do this, that's fantastic. But the columns have to be right next to each other, so you're only going to get the output exactly how it is in the columns.
Just for example, let's pull that down here: let's take this and put it right here. If, instead of O2 to P10, I included Age to Email, this whole range, and hit enter, it's all going to be included.
So that's one of the small downsides of returning multiple columns: it's going to use the columns exactly as they are. You can't really customize it within the formula, although you can move these columns around to how you want them. So that is something to note.
And again, you can pull this down and it'll be applied to all of those names.
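To make the "columns have to be next to each other" point concrete, here is a rough Python sketch. It is not how Excel implements XLOOKUP, just the same idea expressed as a slice. The table, the `lookup_columns` helper, and every age, date, and email in it are made-up placeholders, not the data from the video.

```python
# Placeholder table: the ages, end dates, and emails are invented for illustration.
COLUMNS = ["Full Name", "Age", "End Date", "Email"]
ROWS = [
    ["Toby Flenderson", "age-1", "end-date-1", "toby@example.com"],
    ["Michael Scott", "age-2", "end-date-2", "michael@example.com"],
]


def lookup_columns(value, lookup_col, first_col, last_col):
    """Return columns first_col..last_col of the first row whose lookup_col equals value."""
    key = COLUMNS.index(lookup_col)
    start = COLUMNS.index(first_col)
    stop = COLUMNS.index(last_col) + 1  # slices exclude the stop index
    for row in ROWS:
        if row[key] == value:
            return row[start:stop]
    return None


# End Date and Email sit side by side, so both come back together.
print(lookup_columns("Toby Flenderson", "Full Name", "End Date", "Email"))

# Widening the range to start at Age pulls in every column in between as well.
print(lookup_columns("Toby Flenderson", "Full Name", "Age", "Email"))
```

Because the return range is one contiguous block, there is no way in this sketch (or in the single formula) to ask for Age and Email while skipping End Date.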
Let's go over to XLOOKUP exact match. Let's open this up. We're going to do equals XLOOKUP as we've been doing, and we're actually going to be looking at both the if_not_found and the match_mode arguments on this tab right here.
So let's do what we've been doing before: we take the value that we're looking up, we take the array that we're looking in, and we're going to return the email. As you can see, this says Toby Flender and not Toby Flenderson.
So we're going to hit comma, and if it's not found you can return a value or a string of your choosing. For simple instructional purposes we're going to do "Not Found", and then we're going to close that off.
So let's do this: Toby Flender was not found, and so it returned Not Found. If Toby Flender was actually in this Full Name column, then it would have returned the email. And if along the way one of these other names was not part of it, then we would have had the Not Found there too.
All right, so let's go right up here. We're actually just going to copy this because I want to reuse it, and then we're going to go right here and hit a comma.
Now this is our match_mode option, and we have four different options that we can choose from. A 0 is an exact match, and that is the default, what we've been using. Then there's a -1, which is an exact match or next smaller item. Then there's a 1, which is an exact match or next larger item. And then there's a 2, which is a wildcard character match.
Now we're going to do that, and we're going to try this out, and it's not going to work. And not just because I forgot to put A4. It's failing because it's searching for Beasley, but if there's not a wildcard character already put in here, it doesn't recognize it. So we need to indicate where that wildcard needs to be.
We're going to open quotation marks, put an asterisk right here, close the quotation marks, and then add an ampersand right here. What that's going to say is that anything that comes before A4, anything that comes before Beasley, is okay. It doesn't matter what it is; as long as it has Beasley at the end, that is going to be okay.
So we have Pam, which comes before Beasley, and it's going to say, okay, I know that anything that comes before Beasley is all right. When we hit enter, it's now going to return the output that we are looking for.
We can include that on these as well. Now this one is Meredith, and Meredith is at the beginning, so we have Meredith Palmer. We can take this and put the asterisk at the end, put the ampersand right here, and now it'll work.
And the exact same thing for Kevin Malo right here, which should match Kevin Malone. It just didn't include the "ne" at the end, so it's still going to work if we include that asterisk at the end.
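Here is a small Python sketch of the wildcard idea, using the standard library's `fnmatch` module as a stand-in for Excel's wildcard matching. The two are not identical (Excel also has `?` and `~`, and `fnmatch` has bracket patterns), so treat this as an approximation. The `wildcard_lookup` helper and the emails are assumptions for illustration.

```python
from fnmatch import fnmatchcase

NAMES = ["Pam Beasley", "Meredith Palmer", "Kevin Malone"]
EMAILS = ["pam@example.com", "meredith@example.com", "kevin@example.com"]  # placeholders


def wildcard_lookup(pattern, lookup_array, return_array, if_not_found="Not Found"):
    """Return the first result whose lookup text matches a '*' wildcard pattern."""
    wanted = pattern.lower()  # lowercase both sides so the match ignores case
    for candidate, result in zip(lookup_array, return_array):
        if fnmatchcase(candidate.lower(), wanted):
            return result
    return if_not_found


# Without an asterisk the pattern has to equal the whole cell, so this fails.
print(wildcard_lookup("Beasley", NAMES, EMAILS))

# "*" & "Beasley": anything may come before Beasley.
print(wildcard_lookup("*" + "Beasley", NAMES, EMAILS))

# "Meredith" & "*" and "Kevin Malo" & "*": anything may come after.
print(wildcard_lookup("Meredith" + "*", NAMES, EMAILS))
print(wildcard_lookup("Kevin Malo" + "*", NAMES, EMAILS))
```

The string concatenation with `+` plays the role of the ampersand in the formula: it glues the asterisk onto whatever text is in the lookup cell.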
Now, I know I said we were looking at search order, but I'm actually going to give you one more exact match example first and then search order, because it's just kind of easier to show it over here.
So I'm going to do XLOOKUP, I'm going to look up this value, do a comma, and here's the range: this is the Start Date column that it's going to be looking in. And I want to return the full name.
Now, no value in here is 1/1/2000. But what we can do is a comma, and then another comma for the match mode, and choose exact match or next larger. I know this is in the exact match part, but it kind of relates to search order a little bit, in that it searches for the next largest value. That's what that number 1 represents: the next larger value.
So we have 1/1/2000, and if we look right here, the next value above 1/1/2000 is 1/5/2000, so it should return Angela Martin. Let's see if that works. And there it is.
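A rough Python sketch of "exact match or next larger" follows. It assumes the dates read out in the video are month/day/year, and it includes only the three rows whose start dates are mentioned; the `lookup_exact_or_next_larger` helper is a hypothetical name, not an Excel function.

```python
from datetime import date

# Only the start dates mentioned in the video, assuming month/day/year.
START_DATES = [date(2001, 5, 6), date(2000, 1, 5), date(2001, 5, 6)]
FULL_NAMES = ["Toby Flenderson", "Angela Martin", "Michael Scott"]


def lookup_exact_or_next_larger(value, lookup_array, return_array):
    """Return the result for an exact match, else for the smallest larger value."""
    best = None  # (candidate, result) for the closest larger value seen so far
    for candidate, result in zip(lookup_array, return_array):
        if candidate == value:
            return result
        if candidate > value and (best is None or candidate < best[0]):
            best = (candidate, result)
    return best[1] if best else None


# Nobody started on 1/1/2000, so the next larger start date wins.
print(lookup_exact_or_next_larger(date(2000, 1, 1), START_DATES, FULL_NAMES))
```

Note that the sketch scans every row, so the data does not have to be sorted for this to work.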
Now let's look at the actual search order. So let's do equals XLOOKUP. This is the value that we want to be searching for, we're going to be looking in this Start Date column, then comma, and we want to return the name.
Now let's get over to search mode. By default, the search mode performs a search starting at the first item, so at the very top going down: it searches from first to last. But you can reverse that and search from last to first. Or you can do a binary search, which relies on the lookup values being sorted in ascending or descending order.
We won't be able to show the binary search on ascending or descending here because our values are the same, but if we had different values and we were looking up using next largest, we would be able to show that. So I'm going to show you the search from first to last and last to first.
Let's put in the default, search from first to last. It starts at the very top, goes down, finds the first 5/6/2001, and returns Toby Flenderson.
Now if we go in here and put -1, that is going to search from last to first. It's going to start at the bottom and go to the top, and the first one that it finds is Michael Scott. So that's the first one starting from the bottom, Michael Scott right there.
So these two, the exact match and the search order, can kind of be combined. In this one right here we're using exact match or next larger, and you can include the binary search in this one as well.
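The first-to-last versus last-to-first behavior can be sketched in Python like this. It is only an illustration: the `xlookup` function is a hypothetical stand-in, the row order is assumed, and the binary search modes are left out.

```python
def xlookup(value, lookup_array, return_array, search_mode=1):
    """Exact-match lookup; search_mode 1 scans top-down, -1 scans bottom-up."""
    pairs = list(zip(lookup_array, return_array))
    if search_mode == -1:
        pairs.reverse()  # start at the bottom and work toward the top
    for candidate, result in pairs:
        if candidate == value:
            return result
    return None


# Two people share the 5/6/2001 start date; the row order here is assumed.
start_dates = ["5/6/2001", "1/5/2000", "5/6/2001"]
full_names = ["Toby Flenderson", "Angela Martin", "Michael Scott"]

print(xlookup("5/6/2001", start_dates, full_names))      # first to last
print(xlookup("5/6/2001", start_dates, full_names, -1))  # last to first
```

The search direction only matters when the lookup value appears more than once, which is why the duplicated start date makes a good demo.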
All right, now let's head over to XLOOKUP horizontal. I think we only have a few left. Yep: XLOOKUP horizontal, then we'll do XLOOKUP with SUM, and then I'm going to show you VLOOKUP at the end.
So let's go right here and say equals XLOOKUP. The value that we want to be searching for is February, so that's what we're looking for. Hit comma. Where do we want to search to find February? We want to search in these calendar months. Then we hit another comma, and now we're going to return from the paper row, so let's select paper.
We'll hit enter, and it found February and returned the paper value right here. We can do that for paper, printer, and manila folders, and it's going to give us the 310, the 40, and the 118 from February.
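A horizontal lookup is the same idea with the lookup array running across a row instead of down a column. Here is a minimal Python sketch under that reading. Only the February figures and March paper come from the video; everything the video does not state is left as `None` rather than guessed, and `horizontal_lookup` is a hypothetical helper name.

```python
MONTHS = ["January", "February", "March"]

# One list per product row, lined up with MONTHS. None marks values not given in the video.
SALES = {
    "Paper": [None, 310, 150],
    "Printer": [None, 40, None],
    "Manila Folders": [None, 118, None],
}


def horizontal_lookup(month, product):
    """Find the month's position in the header row, then read that position in the product row."""
    position = MONTHS.index(month)
    return SALES[product][position]


for product in SALES:
    print(product, horizontal_lookup("February", product))
```

The lookup array (the months) and the return array (one product row) just need to be the same length and orientation, which is the only real requirement in the formula too.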
Now let's go right over here to XLOOKUP with SUM. It's basically a carbon copy of this, so let's take this over here real quick and place it right there, because it's the exact same thing, except at the end I'm going to show you how to use SUM with XLOOKUP at the same time.
We're going to be using the SUM formula. So we're going to do SUM, and then within the SUM our first number is going to be an XLOOKUP and our next value is also going to be an XLOOKUP.
So let's do XLOOKUP, and now we're going to select our very first lookup value, so we're going to go to I1. Then we're going to search this range again, and we want whatever value goes with that from the paper row. Let's close that parenthesis.
Now we're going to do a colon and another XLOOKUP, and now let's do March. We're going to search for March, select our search range where we're searching for March, and we want the paper as well. Let's close that, and then we also need to close the SUM parenthesis.
So now we are basically adding this February and this March. It's going to be 310 plus 150, adding those two values, and it should be 460. Let's see if that is our output. And it is.
You can do this with a lot of things, not just SUM. You're able to use XLOOKUP within different formulas: if you're searching for a specific value in one cell and a specific value in another cell, you can add those together using XLOOKUP, which is honestly pretty great.
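One detail worth spelling out: the colon between the two XLOOKUPs builds a range that runs from the first result to the second, so SUM adds everything in between, endpoints included. February and March are next to each other here, which is why the total is just 310 plus 150. A hedged Python sketch of that behavior, with a hypothetical `sum_between` helper and `None` for the January figure the video does not give:

```python
MONTHS = ["January", "February", "March"]
PAPER = [None, 310, 150]  # January is not stated in the video, so it is left as None


def sum_between(start_key, end_key, keys, values):
    """Sum every value from the start_key position through the end_key position."""
    i, j = keys.index(start_key), keys.index(end_key)
    lo, hi = min(i, j), max(i, j)
    return sum(values[lo:hi + 1])  # +1 so the end position is included


print(sum_between("February", "March", MONTHS, PAPER))
```

If the two months were further apart, the months between them would be included in the total as well, which may or may not be what you want.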
So let's go over to VLOOKUP. I wanted to show you this because I wanted to show you where XLOOKUP came from and what we used to do (unless you are continuing to use VLOOKUP) and what we can do now. With XLOOKUP I just showed you kind of everything, but super quickly I'm going to show you how VLOOKUP works, so that you can understand how it used to be done and how XLOOKUP is used now.
So let's go in here, say equals, and do a VLOOKUP. We have a lookup value, so we're going to click this and hit comma just like we did before.
Now we're going to do a table array, and the table array is a little different in that you're searching an entire area. So let's do H2 all the way through O10. That's what our table array is going to be.
Then we do a comma, and now we have to give a column index number: which column are we going to return a value from? We want eight, because this is 1, 2, 3, 4, 5, 6, 7, 8. We want to return that email, and we're searching for the name right here in this very first column. So we have that comma and we're going to do eight.
Then in the range lookup you can do TRUE, which is an approximate match, or FALSE, which is an exact match, and we'll do FALSE. I don't know why it's not auto-completing, but there we go. Now we'll run it, and it's going to return it just as we had it.
A lot of people, I guess not everybody, but some people didn't like that you had to do those ranges, and that's the reason they created XLOOKUP.
Let's say we went in here and added another column, which happens to data. Now it gives completely different data. Let's say for whatever reason we added Address, so now we have these people's addresses. Well, now it's going to give us a different value: it's going to return the end dates, because now the eighth column is End Date and the ninth is Email.
So if you have a VLOOKUP that you use for a calculation or a table that you've created or different things in Excel, you then have to go through and manually change it. A lot of people didn't like that, because if you needed to change data or add an additional column, you'd have to go back and fix all of your VLOOKUPs. They wouldn't just automatically move with it, which is what happens with XLOOKUP.
And just to prove this, let's go back to the very first one, which is the XLOOKUP. Right now the email range is O2 through O10. We're just going to insert a column right here, and that will be our new column. We'll call it Address.
Notice that the result hasn't changed. Why is that? Because the range automatically changed for us to P2 through P10. When something was inserted here, it stuck with the original data, the original array that was selected.
So XLOOKUP does that work for you, and it makes it a little bit easier to automate things and create these processes in Excel without having to go fix them later, which you had to do with VLOOKUP.
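The difference comes down to "column number 8" versus "that specific column". Here is a reduced Python sketch of the same failure, using a four-column table instead of the eight columns in the video. The helper names, the dates, and the email are all placeholders, and looking a column up by its header name is only a stand-in for the way Excel adjusts the XLOOKUP range reference when a column is inserted.

```python
def vlookup(value, rows, col_index):
    """Match on the first column, return the col_index-th column (1-based)."""
    for row in rows:
        if row[0] == value:
            return row[col_index - 1]
    return None


def lookup_by_header(value, headers, rows, lookup_col, return_col):
    """Match and return by column name, so the column's position does not matter."""
    key, ret = headers.index(lookup_col), headers.index(return_col)
    for row in rows:
        if row[key] == value:
            return row[ret]
    return None


def insert_column(headers, rows, position, name, fill=""):
    """Return copies of headers and rows with a new column at the given position."""
    new_headers = headers[:position] + [name] + headers[position:]
    new_rows = [row[:position] + [fill] + row[position:] for row in rows]
    return new_headers, new_rows


# Reduced placeholder table: Email is column 4 here (column 8 in the video).
headers = ["Full Name", "Start Date", "End Date", "Email"]
rows = [["Toby Flenderson", "start-date-1", "end-date-1", "toby@example.com"]]

print(vlookup("Toby Flenderson", rows, 4))  # the email

headers2, rows2 = insert_column(headers, rows, 1, "Address")
print(vlookup("Toby Flenderson", rows2, 4))  # now the end date, silently wrong
print(lookup_by_header("Toby Flenderson", headers2, rows2, "Full Name", "Email"))
```

The hard-coded index is the fragile part: nothing errors out, the formula just starts returning a different column.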
So that is it for today. I hope that you know how to use XLOOKUP a little bit better now that you have watched this. If you enjoyed this video, be sure to like and subscribe below, and I will see you in the next video.
[Music]
What's going on everybody, welcome back to another Excel tutorial. Today we'll be looking at conditional formatting.
[Music]
Now if you've never heard of conditional formatting before, that's okay. I had never heard of it before I became a data analyst. Now that I've been using Excel a lot, of course, I use it quite a bit, and so I want to show you how to use it.
Conditional formatting is basically just a way to see patterns and trends in data. That's a super simple way of putting it, but it's very easy to use, and so hopefully I can show you how to use it really easily, including the things that I use the most and some of the things that I use it for, so that you can also know how to use conditional formatting.
Now before we jump into the tutorial, I want to give a huge shout out to the sponsor of this Excel series, and that is Udemy. You guys know by now that I absolutely love Udemy. I've been using them for years, I've taken literally hundreds of courses on Udemy, and I've learned so, so much. Especially when I was first starting out as a data analyst, I learned a lot through their Excel courses on Udemy.
So I have put the ones that I really like, that I have taken and enjoyed and think you would as well, in the description. If you want to take those, be sure to check those out. Again, huge shout out to Udemy for sponsoring the series.
Now without further ado, let's jump onto my screen and get started with the tutorial.
All right, so let's jump right into it. On this Home tab right here, if we go all the way over to the right, there is Conditional Formatting. The description that it gives us is "Easily spot trends and patterns in your data using bars, colors, and icons to visually highlight important values." That is exactly how I would have defined it. Really good job, Microsoft, exactly how I would have done it.
What you'll see right away is there's nothing too complex. We have some highlight cells rules, we have some top/bottom rules, data bars, color scales, and icon sets. Then at the bottom we can create a rule, we can clear rules, and we can manage our rules, so if you create a rule then you can manage it.
We're going to start with these icon sets, and I'm going to show you how to use those. We'll work our way to the top, and then I'll show you how to create some rules yourself and how that all works.
So let's start off with the icon sets. I'm going to go over here to sales. For this data we kind of have a trend or pattern that you can see over time, over the months.
So if we go right here, let's use that conditional formatting, let's use the icon sets, and right here we can use these directional ones. We have this kind of time series, each month showing us how much paper they're selling, and if we do this right here it's going to show us if a month is about average, below average, or above average. So at a really quick glance you can kind of see the pattern of this data set: it's mostly yellow and red, and there are only two months where it's going up significantly.
Now, we don't have to only do that for one row or one column. You can apply it to all of them. But as you can see, all of these are red. Why are they all red? It's because it's using the numbers across everything, so it's comparing these 24s, these 50s and 65s against these 450s and 750s, and so they're all going to be red.
But if we do it individually, row by row, if we select it just like this and then go to icon sets and apply it, it's going to be much more representative of the actual printers, not of all the numbers as a whole.
whole and you can do other things uh the arrows are ones that you'll probably see
arrows are ones that you'll probably see the most often that's the one I've used
the most often that's the one I've used if I ever do use them um but you can you
if I ever do use them um but you can you know do ones like this where they have
know do ones like this where they have you know kind of a trend upward or a
you know kind of a trend upward or a trend downward um and so there's just
trend downward um and so there's just several more arrows this one only gives
several more arrows this one only gives you three as you can see this one gives
you three as you can see this one gives you five um and you can do you know
you five um and you can do you know colors or shapes or or different
colors or shapes or or different indicators and all these different
indicators and all these different things um and honestly it's kind of
things um and honestly it's kind of whatever you want to use whatever makes
whatever you want to use whatever makes sense for your data but you know I've
sense for your data but you know I've really only ever seen like these colors
really only ever seen like these colors being used I've never really seen these
being used I've never really seen these flags or anything like that but again it
flags or anything like that but again it just depends on what industry you work
just depends on what industry you work in you might you might see that let's go
in you might you might see that let's go right over here to the demographics um
right over here to the demographics um and let's look at our color scales now
and let's look at our color scales now color scales are going to be the
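Before moving on, the icon set behavior described above (everything red on the whole table, a sensible mix row by row) can be sketched in Python. The sketch buckets each number by where it sits between the lowest and highest value in whatever range the rule covers. The 33% and 67% cut-offs, the function name `icon_for`, and the sample numbers are assumptions for illustration, not values read from the workbook in the video.

```python
def icon_for(value, scope):
    """Return 'down', 'flat' or 'up' for value, relative to the numbers in scope."""
    lo, hi = min(scope), max(scope)
    if hi == lo:
        return "flat"  # every number is the same, so there is no trend to show
    position = (value - lo) / (hi - lo)  # 0.0 at the lowest value, 1.0 at the highest
    if position >= 0.67:  # assumed upper cut-off
        return "up"
    if position >= 0.33:  # assumed lower cut-off
        return "flat"
    return "down"


# Hypothetical monthly numbers for one small product and one large product.
small_row = [24, 50, 65]
large_row = [450, 750, 600]

# Rule applied to the whole table: the small row is judged against the large one.
whole_table = small_row + large_row
print([icon_for(v, whole_table) for v in small_row])

# Rule applied to one row at a time: each row is judged only against itself.
print([icon_for(v, small_row) for v in small_row])
```

Under these assumptions the small row is all "down" when it shares a rule with the large row, and it only shows a spread of icons once it gets a rule of its own.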
color scales are going to be
probably the most obvious thing that and
data bars are going to be the most
obvious things in here if you go
right here and you look at this
color scale if it's high if it's among
the top ones it's green the lowest it's
red and you can change that to really
any colors you want any colors that they
offer you and it does exactly what
it says it's a color scale a gradient of
the colors from high to low or low to
high and so with any color that you pick you'll
be able to kind of see you know
what's good and what's not good that
really is color scales in a nutshell
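A color scale boils down to blending between two colors according to where a value sits between the lowest and the highest number. Here is a minimal Python sketch of that idea. The function name `color_scale`, the pure red and pure green endpoints, and the salary figures are assumptions for illustration; Excel's own presets use softer colors and can add a middle color.

```python
def color_scale(value, lo, hi, low_rgb=(255, 0, 0), high_rgb=(0, 255, 0)):
    """Blend from low_rgb to high_rgb according to where value sits between lo and hi."""
    # t is 0.0 for the lowest value and 1.0 for the highest.
    t = 0.5 if hi == lo else (value - lo) / (hi - lo)
    return tuple(round(a + (b - a) * t) for a, b in zip(low_rgb, high_rgb))


salaries = [36000, 45000, 65000]  # hypothetical values
lo, hi = min(salaries), max(salaries)
for salary in salaries:
    print(salary, color_scale(salary, lo, hi))
```

The lowest salary comes out pure red, the highest pure green, and everything else lands somewhere on the gradient between them.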
data bars are again super
straightforward it's going to be either
a gradient fill or a solid fill so let's
look at the gradient fill if we do a
blue gradient fill actually let's
get rid of our let's go over here
let's go to clear rules from selected
cells we haven't looked at that yet but
that's how you clear it let's go to data
bars and we'll use this blue gradient so
with this blue gradient you know this
one is or sorry this one is the highest
one so it's going to be completely
filled and this one is 36,000 almost
half of this I'm pretty close and so
it's almost half this one again you
know it's not used very often
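The data bar idea, where the highest value fills the cell and a value of about half fills about half, can be sketched as a simple proportion. This Python sketch assumes positive numbers and a bar that runs from zero up to the largest value. The function name `data_bar`, the 20-character width, and the 75,000 maximum are hypothetical; only the 36,000 figure comes from the video.

```python
def data_bar(value, max_value, width=20):
    """Draw a text bar whose filled part is proportional to value / max_value."""
    filled = round(width * value / max_value)
    return "#" * filled + "." * (width - filled)


highest = 75000  # hypothetical highest salary in the column
for salary in (75000, 36000):
    print(f"{salary:>6} {data_bar(salary, highest)}")
```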
you don't see these a lot to be honest
you just don't but if you do see it
that's how you use it that's how it can
be done again pretty easy as I just
showed a second ago if you want to clear
the rules you can clear from the
selected cells that's what we're doing
so I have column G selected and I'm
going to clear that if you
want to clear the rules for the entire
sheet you can do that as well so it
would affect every single column and row
we'll just do this for now so now let's
go look at the top bottom rules so
this is the top 10 items top 10% bottom
10 items bottom 10% above average and
below average and they're going to do
exactly what you think they are going to
do if you select above average it is
going to select or highlight the cells
that are above the average in column G
so let's look at the salaries that are
above average all right and so the
ones that are at the very top are
Michael Scott Toby Flenderson and
Dwight Schrute no shock there I
believe the average is somewhere around
like 48,500 or something so I think this one
is just below it and so all these
other ones are below average and that's
just because you know Michael Scott and
Dwight Schrute and Toby are kind of
bringing up that average quite a bit so
everyone else is going to fall beneath
that so at a super quick glance you're
able to just highlight the cells and
you're able to see who is above average
and you know you can do this in a lot of
different ways in Excel but this is just
a really simple fast way to do that
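The above average rule is just a comparison of each cell against the mean of the selected range. A minimal Python sketch of the same check follows. The names come from the video, but the salary figures are hypothetical, picked only so that the average lands near the roughly 48,500 mentioned above.

```python
from statistics import mean

# Hypothetical salaries; only the names are taken from the video.
staff = [
    ("Jim Halpert", 45000),
    ("Pam Beesly", 36000),
    ("Dwight Schrute", 63000),
    ("Angela Martin", 47000),
    ("Toby Flenderson", 50000),
    ("Michael Scott", 65000),
    ("Meredith Palmer", 41000),
    ("Stanley Hudson", 48000),
    ("Kevin Malone", 42000),
]


def above_average(rows):
    """Return the names whose salary is strictly above the mean salary."""
    average = mean(salary for _, salary in rows)
    return [name for name, salary in rows if salary > average]


print(above_average(staff))
```

A few high salaries pull the mean up, which is why most of the list ends up on the below average side.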
let's get rid of that real quick and
let's go back up here and now we can
oops let's go to top bottom rules and
now we can see the below average and
it's going to highlight all the other
ones and so it works exactly how you
think it is going to work and this is
the default way that it highlights these
cells so it highlights them in this kind of
see-through red and then it
highlights the actual text or the
characters in there red as well now I'm
not going to go through and show you
every single one of these top bottom
rules I think they're pretty
self-explanatory I just kind of wanted
to show you what happens when you do use
one of them it's going to highlight that
cell so let's go up here to the
highlight cells rules and honestly these
are the ones that I use by far the most
all these other ones combined I do
not use more than these highlight cells
rules and the one in here that I use
more than any other conditional
formatting rule is this duplicate values
so I'll start with that really quick and
I'll kind of show you a few of these
other ones but this duplicate values to
me is one of the most useful ones and
so let's kind of show you how that works
if we go to the start date you can see
that we have a duplicate value right
here and if we go over here to
conditional formatting highlight cells
rules and duplicate values it is going
to highlight the duplicate and
that says duplicate right here now we
can go through here and click on unique
and then it would highlight all the
ones that are not duplicates so you
can use it you know kind of in a similar
inverse way it's just
different but I use the duplicate almost
always another thing that you can do
is go over here and you can change the
color or you can even do a custom format
which I never do it's not
something I spend a lot of time doing I
typically just stick with this one so
you can do that and it's going to
highlight you know something that has
a duplicate value in there now why do I
use this so much well I work with a lot
of different types of data sets but one
thing that you'll find in almost all of
them is they have some type of ID and
they're going to have some type of
personal information whether that's a
social security number or an address or
you know a cell phone number or something
like that there is going to be data that
is going to identify that person now
I work a lot with pharmaceutical data a
lot with pharmacy data as well as
healthcare data so like names social
security numbers addresses phone numbers
all those things all that customer or
client information and oftentimes when I
get a new data set and I have it in
Excel or I convert it to Excel I will
start using these duplicates to try to
find issues with the data and I find
them all the time either there's an
employee ID or some type of customer ID
or client ID that has a duplicate in
there that should not be in there or
there's multiple social security numbers
or there's an issue in some other way
and I'm able to find those things and
spot those patterns using this
duplicates rule and I promise you I use this
one almost every single time I open a
new data set or I work with a new
client working with their data
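The same duplicate check on an ID column can be sketched in a few lines of Python. Like the Excel rule, it flags every occurrence of a repeated value rather than only the second one. The function name `duplicate_rows` and the employee IDs are hypothetical.

```python
from collections import Counter


def duplicate_rows(values):
    """Return the positions of every value that occurs more than once."""
    counts = Counter(values)
    return [index for index, value in enumerate(values) if counts[value] > 1]


employee_ids = [1001, 1002, 1003, 1002, 1004]  # hypothetical IDs, 1002 is repeated
print(duplicate_rows(employee_ids))
```

An empty result means the column has no repeats, which is usually what you want from an ID column.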
so I wanted to show you this one I
wanted to really impress upon you that
this one is a really really good
one to know and learn how to use it's
not complicated it's not hard it just
shows you you know
if there's a duplicate value but I
wanted you to know how I use it and how
often I use it so that you can you know
pick that up and put that in your tool
kit in your back pocket so that you can
use that later on if you
have a similar need or if you're trying
to do something similar to what I was
just talking about so that is how
duplicates work again super great it's
obviously not super useful when you're
only using 10 rows but when you have
you know 50,000 100,000 and there should
be zero duplicates in there and you
highlight it and then you come right
here use the
filter and we're going to filter and
we're going to sort by the color and it
allows you to sort by the color and you
have duplicates in there then that's a
problem and you identified a problem
super quickly and you know some of
those things they slip by because nobody
checks it and so that's something that
I often check and if you go here and you
sort by color and there isn't an option
to do this pink red color
that means there aren't any duplicates
and that's a really good thing most of the
time that's a really good thing so let's
go ahead and we're going to clear that
as well
as get rid of our conditional formatting
rules now another one that I use a lot
is this one right here which is the text
that contains honestly this one comes in
handy a lot especially when you're
looking for like a specific keyword in
my case a lot of times I was using
this when I was going through drug names
I am not a doctor I do not pretend to be
a doctor and so when I was looking for
lorazepam or something like that I
would just search for like loraz or
something not Lorax but loraz
you know I would just search for it
and then all the ones that contain that
would pop up I can bring them to the top
and I can see them and to me that's
super useful and I would do that
all the time and so in this case we're
looking at emails and let's say we
only wanted to pull all the ones that
are Gmail and so now we can go through
and we can you know click okay and
that's going to pop up or we want all
the ones that have Dunder oops Dunder
Mifflin and if we click on that all the
ones that are Dunder Mifflin come up or
have Dunder Mifflin in it and again we
can sort by or we can and so we
can sort by right here and we can bring
all those to the top and so super
useful and another use for it that
you may not think of is something like
if you know there's some incorrect
data in there this happens often with
phone numbers addresses start dates
or dates in general date formats
where you can go in here and you can say
text that contains and if you know you
put in a dash and it has it in
there then you know that that is
wrong now that is really all I wanted to
show you in the highlight cells rules
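The three uses of text that contains described above (a partial drug name, an email domain, a stray dash in a date) are all the same substring check. Here is a minimal Python sketch. The function name `text_contains` and all the sample values are hypothetical, and matching regardless of upper or lower case is an assumption about how the Excel rule behaves.

```python
def text_contains(cells, needle):
    """Return the cells that contain needle, ignoring upper and lower case."""
    needle = needle.lower()
    return [cell for cell in cells if needle in cell.lower()]


# Hypothetical sample data for the three uses mentioned above.
drugs = ["Lorazepam 1mg", "Ibuprofen 200mg", "LORAZEPAM 0.5mg"]
emails = ["jim.halpert@gmail.com", "pam.beesly@dundermifflin.com"]
start_dates = ["11/2/2001", "2003-07-04", "10/3/1999"]

print(text_contains(drugs, "loraz"))       # partial keyword
print(text_contains(emails, "gmail"))      # email provider
print(text_contains(start_dates, "-"))     # dates in an unexpected format
```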
the duplicate values and the text
contains are by far the ones that I use
the most all the other ones I have used
these ones not so much but in these
highlight cells rules I use you know
these two all the time sometimes I
use this between I don't really use
these other ones as much although I have
used them and so if you got nothing else
from this video I just wanted you to
know that these two are super useful and
if you haven't used them before to maybe
try them out and see if you can apply
them to your own data sets now we've
looked at all of these preset ones in
conditional formatting but you can also
do a new rule and so if we click on new
rule right here and we go down to use a
formula to determine which cells to
format we can add our own formula in
here that will then highlight exactly
what we want and so if there isn't a
preset rule that you like and it doesn't
have the option that you want you can do
almost any formula that you want from our
formulas video that we did a few weeks
ago and you can put it in here and then
you can format what you want the cell
to look like if it meets that criteria
so let's take this right over here
and before we start this formula I just
want you to note that you know I have
H11 highlighted that's going to come
into play in just a little bit but I
want you to be aware that H11 is the
cell that we have highlighted so what
we're going to do is we are going to
create our formula now if you've never
created a formula I highly recommend
watching my formulas tutorial because
that is going to show you how to do this
but all we're going to do is
we're going to do equals that's how you
start the how you actually create a
formula and we're going to give it this
range right here and so it's going to
take everything from G2 to G10 now these
dollar signs are super important if you
don't know how to use them or you don't
know what they do you're going to
mess up this formula a lot and so
what this dollar sign basically does is
it's basically hardcoding it in there it
is only going to look at G2 and is only
going to look at G10 or through G10
because of that colon and this can come
into play because if you have something
selected like the H11 it's going to mess
it up because now if you have H11
selected like we do you'll see this in a
second it's not going to be applied to
this and again I'll show you that in
just a minute but we don't want this
hardcoded in there okay but we do have
to select the proper range in a second
so we're going to get rid of this
we're going to get rid of the dollar
signs because we want it to be pretty fluid
and be able to be applied
basically anywhere we want let's go into
this
formula if it meets our criteria
let's give it a border
and we'll give it some
color we're going to say if this is
greater than
50,000 so let's hit okay and nothing
happened so let's go back and see why so
if we go to our manage rules you can see
that it says G2 to G10 is greater
than 50,000 but it is only being applied
to this H11 cell which really makes no
sense so if we had wanted to get it
done the first time we needed to have
basically selected that G2 to G10 right
away but we can do that now so let's
get rid of this and we're going to say
G2 to
G10 and that is hardcoded in there
that should be fine still but let's
see what it
does and so now every single thing
is highlighted and why is that that's
because when we changed it it also
changed the format of it because we
changed the cell that we were looking at
so we need to come back here and that's
why again you want to do this the right
way the first time we're going to come
back here we're going to give it this
range and we're going to get rid of
these dollar
signs and now we're going to hit okay
and so now it's being applied G2 to G10
and G2 to G10 and we'll keep it like
that and we'll apply it and now it works
properly so now everything that's above
50,000 is being highlighted again if
that was confusing it is confusing
it genuinely is and so if you wanted to
do this right the first time without
having to make a bunch of changes you'd
want to highlight these before you start
and then you want to go in and create
the rule we'll do this really quick just
to kind of show you what I'm talking
about we'll say equals we'll give it
this range
get rid of these real quick because
again I don't want this hardcoded in
there it will ruin our formula and then
we'll say greater than 30 and we'll
give this a nice green and so now if
they're over the age of 30 it will be
highlighted and we didn't have to go
back and change anything we didn't have
to go back and fix anything like we did
in the first one that was all for
demonstration purposes but again you
need to really be aware of that that is
something that I think almost
everybody's going to mess up at some
point if you don't already know about it
then you definitely are going to make
that mistake now if we come over here in this area uh we go to our manage rules
this area uh we go to our manage rules and not just the current selection but
and not just the current selection but this whole worksheet then you can see
this whole worksheet then you can see that we have these two formulas now you
that we have these two formulas now you can go in and edit any of these by
can go in and edit any of these by double clicking or clicking on it and
double clicking or clicking on it and then hitting edit rule you can also
then hitting edit rule you can also delete these rules or duplicate these
delete these rules or duplicate these rules um I just wanted to show you what
rules um I just wanted to show you what you are able to do with them but if we
you are able to do with them but if we uh go ahead and we get rid of this um so
uh go ahead and we get rid of this um so let's say we delete that rule and we hit
let's say we delete that rule and we hit apply uh you know the rule is going to
apply uh you know the rule is going to go away that's that I mean it's as
go away that's that I mean it's as simple as that so that is how you can
simple as that so that is how you can create your own rule I want to be again
create your own rule I want to be again very specific in the fact that that is a
very specific in the fact that that is a confusing piece and if you mess that up
confusing piece and if you mess that up you're going to be you know fixing a
you're going to be you know fixing a bunch of different stuff and not
bunch of different stuff and not understanding why your rule is not
understanding why your rule is not working properly it's just because it's
working properly it's just because it's confusing those dollar signs are are
confusing those dollar signs are are really important to watch out for and
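For readers who prefer code, the dollar-sign behavior described above can be sketched in Python. This is only an illustration under assumptions: the formula `=$C2>50000` is a hypothetical example and not the rule used in the video, and `shift_formula` is a made-up helper that mimics how a spreadsheet adjusts references when a rule is applied to other cells. It is not an Excel or library API.

```python
import re

# Matches an A1-style reference such as C2, $C2, C$2 or $C$2.
# Simplification: a function name ending in digits (e.g. LOG10) would be misread.
REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")

def col_to_num(col):
    """Convert a column label to a number (A -> 1, Z -> 26, AA -> 27)."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

def num_to_col(n):
    """Convert a column number back to a label (1 -> A, 27 -> AA)."""
    out = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        out = chr(ord("A") + rem) + out
    return out

def shift_formula(formula, d_rows, d_cols):
    """Shift each reference by the given offset, leaving $-locked parts alone."""
    def repl(match):
        col_lock, col, row_lock, row = match.groups()
        if not col_lock:
            col = num_to_col(col_to_num(col) + d_cols)
        if not row_lock:
            row = str(int(row) + d_rows)
        return f"{col_lock}{col}{row_lock}{row}"
    return REF.sub(repl, formula)

# Hypothetical rule, evaluated for a cell 3 rows down and 2 columns to the right.
print(shift_formula("=$C2>50000", 3, 2))   # column locked, row moves: =$C5>50000
print(shift_formula("=C2>50000", 3, 2))    # nothing locked: =E5>50000
print(shift_formula("=$C$2>50000", 3, 2))  # fully locked: =$C$2>50000
```

A dollar sign in front of the column letter or the row number pins that part of the reference, so a rule that should test the same column on every row usually locks the column and leaves the row free.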
that is all there is to it with
conditional formatting again conditional
formatting is um you know it's not
anything super confusing we've looked at
more complicated things but it's a
really really useful tool to use to look
at these patterns and trends super
quickly and to find um these outliers or
these specific values that you're
looking for very quickly and if you're
looking at just thousands and tens of
thousands or hundreds of thousands of
rows this is one of the fastest ways to
find these things without having to kind
of wait and filter and use these um
these filters right here because
again this can just take forever um and
so if you haven't or if you've never
worked with a ton of data and tried to
use this before it can take honestly
like 10 minutes for something simple
that you could do with conditional
formatting in like 10 seconds so
definitely something to mess with and
use when you are working with your own
data sets uh I hope this was helpful I
mean honestly I use this all the time so
you know I hope that somebody out there
can use this uh for their own work
that they're currently using thank you
guys so much for watching I really
appreciate it again huge shout out to
Udemy for sponsoring this Excel series
if you like this video be sure to like
and subscribe below I'll see you in the
next video
[Music]
what's going on everybody welcome
back to another Excel tutorial today we
will be looking at charts
[Music]
now if you have data in Excel and you
want to visually show that with bars or
graphs or anything like that you can do
that really simply and I'm going to show
you how to do that today and a lot of
people are a little bit intimidated
because they think it's a little bit
complicated but I promise you by the end
of this video you will know how to do it
like a pro it's not that difficult it's
just you need to know where to look
where to click and how to actually
filter through things to make sure that
you're visually showing the things that
you want to show but before we actually
jump into the tutorial I want to
give a huge shout out to the sponsor of
this Excel series and that is Udemy
you may not know this but I probably get
at least 15 to 50 companies every single
month reaching out to me wanting to
sponsor the channel and promote their
product and I turn down almost every
single one because I either don't know
their product or I don't believe in
their product and so I'm not going to
you know go and promote that on my
channel but Udemy is one that I have
consistently promoted over the past year
and that's because I truly believe in
their product I've been taking courses
off their platform for years and I've
honestly learned so much and I cannot
recommend them enough so if you want to
take a full-fledged Excel course I have
my recommendations in the description if
you want to check those out thank you
again to Udemy for sponsoring this Excel
series so without further ado let's jump
onto my screen and get started with the
tutorial
all right so let's jump right
into it right here we have the Dunder
Mifflin sales report and over here we
have all the products that they were
selling along with the months that they
were sold in and so in January they sold
450 reams of paper down here we have the
total items per month and so in
January they sold 898 units of uh
products or things that they sold at
the very end we have the year-end total
so this is the total amount of paper
that they sold throughout the year now
we're going to use this data right here
for all of our charts now you may not
have data exactly like this it can come
in lots of different flavors but you're
going to get the basic gist of how to
use charts how to edit it how to
customize it to fit what you need and
then we're going to kind of put it right
over here and kind of create its own
sheet where we can kind of visualize all
the things that we want to
show so let's jump right back over here
into sales and first thing we need to do
is kind of highlight the data that we're
going to be working with now I'm going
to start with everything but um you know
I'll show you along the way we don't
actually want everything but we can
filter that stuff out as we go so let's
go right here and we're going to insert
and we're going to go over to charts now
this is the chart section there's lots
of different types of charts um but the
first thing that we're going to be
looking at is right here this is a 2D
column or kind of like a bar chart and
we're just going to click right here and
we're going to pull this down
so now that we have this down here
there are a few things that I want to
show you before we actually really get
into it I kind of want to show you the
options that you have so if you go up
here we have different uh chart styles
and so if I hover over them you can see
that each one kind of looks a little bit
different and it really doesn't matter
uh it doesn't really change the data in
any way just how you visualize it and so
if that is important if that is
something that you um you want to stick
with a certain theme or a certain look
then go for that uh the other thing
that's really nice to have over here is
this switch row and column so right down
here you can see this purple and you can
see this red those are our rows and
columns and we can switch that right
here so if we go like this now instead
of the months being right here the
months are the colors and the actual
product is right here let's click it
again and it'll go back and so now we
have this kind of time series now we
have January through the year-end
total now this one is one that I think
is super helpful you know you can do
it down here as well if you go to this
filter um but both of these are super
helpful because you sometimes just want
to select all the data and then kind of
get in there and mess with it
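The switch row and column button described above is essentially a transpose of the selected table. A minimal Python sketch, under stated assumptions: the product and month names echo the video, only the 450 for paper in January comes from the video and every other number is a placeholder, and `switch_row_column` is a hypothetical helper rather than anything Excel exposes.

```python
# Hypothetical slice of the sales table: one series per product, one value per month.
# Only the 450 (paper, January) is taken from the video; the rest are placeholders.
months = ["Jan", "Feb", "Mar"]
sales = {
    "Paper": [450, 400, 420],
    "Manila Folders": [120, 110, 130],
}

def switch_row_column(labels, table):
    """Turn {series: value per label} into {label: value per series}."""
    series_names = list(table)
    switched = {
        label: [table[name][i] for name in series_names]
        for i, label in enumerate(labels)
    }
    return series_names, switched

# Before: products are the colored series and months sit on the axis.
# After: months are the colored series and products sit on the axis.
products_on_axis, by_month = switch_row_column(months, sales)
print(products_on_axis)
print(by_month)
```

Calling the helper a second time on the result gives back the original layout, which matches clicking the button again in the video.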
something that we want to get rid of is
this total items per month so we want to
remove that and then we also want to
remove this year-end total because both
of those are kind of the end result
they're not the actual data per month
or per product so we're going to get rid
of those and we're going to apply that
and as you can see just right off the
bat our data is changed dramatically uh
and that's because we aren't including
these large large numbers that
were kind of throwing off uh the
visualization for us
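To see why removing the totals changes the picture so much, here is a small Python sketch. It rests on assumptions: the label strings are guesses at how the totals row and column are named in the sheet, the numbers other than 450 are placeholders, and `drop_totals` is a hypothetical helper.

```python
# Labels assumed to match the totals row and column in the sales sheet.
TOTAL_ROW = "Total Items Per Month"
TOTAL_COLUMN = "Year End Total"

def drop_totals(columns, rows):
    """Return the columns and rows with the totals row and column removed."""
    keep = [i for i, name in enumerate(columns) if name != TOTAL_COLUMN]
    kept_columns = [columns[i] for i in keep]
    kept_rows = {
        product: [values[i] for i in keep]
        for product, values in rows.items()
        if product != TOTAL_ROW
    }
    return kept_columns, kept_rows

# Placeholder data; the totals are just the sums of the placeholder values.
columns = ["Jan", "Feb", TOTAL_COLUMN]
rows = {
    "Paper": [450, 400, 850],
    "Pens": [60, 70, 130],
    TOTAL_ROW: [510, 470, 980],
}

chart_columns, chart_rows = drop_totals(columns, rows)
largest_before = max(max(values) for values in rows.values())
largest_after = max(max(values) for values in chart_rows.values())
print(largest_before, largest_after)  # 980 450
```

The chart's value axis has to stretch to the largest number it is given, so once the totals are gone the monthly values fill far more of the chart.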
so this one right
here as is is already pretty good um
what we can do right here is we can
change this and we're just going to
say products sold per month
now what we can do if we want to
move it to another um to another sheet
is we can actually move the chart and we
can select where we want to move it we
can move it to chart sheet and we can do
that or something that I do um almost
99% of the time I just copy and I come
over here and I'm going to paste it and
so now we have this um this chart right
over here as well as back here and so I
typically tend to do that because now we
can still go over here and change this
one as much as we want so if we want to
go in here we can alter this one and it
won't affect the other one so we just
have basically two copies so we're going
to keep this one right here this is
going to be our first
visualization um and as I said it's
fairly straightforward if you've
ever done any types of charts or graphs
before um right here it's January
February March April May and if you
hover over these you can see that that's
the paper and if we just glance you
know the paper is their biggest product
by far and so that blue um which is
their paper is going to be the biggest
every single month so that makes perfect
sense
now what if we want to change up
uh the kind so what if we want to
change up the kind of visualization that
it offers us well we have a lot of
different options let's go right over
here to change chart type now this is
going to offer you just about everything
you could possibly imagine or want and
even things that you absolutely would
never ever want ever um and so I'm going
to show you some of the good ones and
I'm going to show you some just
absolutely insane ones that uh Excel
came up with which I could not
imagine a scenario that these are ever
used um but within these columns you can
do they're called clustered columns uh
these stacked columns so it would look just
like this those are often used as
well um and then we have ones that
they're just not used often
let's take a look at this one right
here I mean it's tough it's tough to
look at um but let's put it right
here this is basically the same thing
that we just had except visualized in a
different um we'll call it more unique
way uh and let's for the sake of it let's
put it over here um these two things
show the same information they show the
same data just one is shown well and one
is not shown well um I'm not a fan of
these 3D type of
visualizations I just don't like them
but maybe you do and you want to use
that that's fantastic let's go back um
something else that you'll probably use
a lot are things like these um these
line graphs okay so these are line
graphs and they're different types so
they're these stacked um 100% stacked
line lines with markers different
flavors for this type of line graph
and so you can go in here and take a
look again um not my favorite but they
have it as an option if you so choose
to do this um but I kind of I'm kind of
a simple guy um but I'm going to go in
here and it's pretty cluttered
um I want to kind of take the ones that
have the highest
sales or the highest total amount sold
so that would be paper manila folders
and three ring binders so let's go in
here we want to keep paper we want to
keep uh manila
folders and we want to keep three ring
binders and let's apply that and so now
it's a lot cleaner
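Picking the products with the highest total amount sold, as done by eye above, amounts to ranking each product by the sum of its monthly values. A hedged Python sketch: the product names come from the video, but every number except the 450 is a placeholder and `top_products` is a hypothetical helper.

```python
# Placeholder monthly values per product; only the 450 comes from the video.
sales = {
    "Paper": [450, 400],
    "Manila Folders": [120, 110],
    "Three Ring Binders": [90, 95],
    "Pens": [60, 70],
    "Staples": [30, 25],
}

def top_products(table, n=3):
    """Return the n product names with the largest total across all months."""
    totals = {name: sum(values) for name, values in table.items()}
    return sorted(totals, key=totals.get, reverse=True)[:n]

keep = top_products(sales)
filtered = {name: sales[name] for name in keep}
print(keep)  # ['Paper', 'Manila Folders', 'Three Ring Binders']
```

Charting only `filtered` gives the same effect as unticking the smaller products in the chart filter.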
and we're just going
to copy this and we're going to put it
over here and I'm just putting these all
over here for you because we'll look
at this at the end and just kind of see
different options and ways to do
things as we have gone through this
tutorial so let's go back here now
something else that we haven't looked at
is the actual colors and color schemes
that you can do so let's go right here
to these chart styles and we can go to
color now color is um something that
probably is quite overlooked um in
actual charts and graphs some terrible
colors like this or this um where
they're really close together especially
when you have a lot of them um for
example let's just pretend we put all of
them back really quickly it is near
impossible to distinguish these colors
um we wouldn't want that let's go
back to this color you know when you
have it like uh in some of these colors
at least it at least distinguishes them
so you can kind of see what you're
working with um but when you have
it in these monochromatic options
sometimes they're just impossible to
distinguish so be sure to choose the
right colors that you're using so that
if somebody who's never seen this data
before looks at it they can easily
distinguish uh the product and the month
that you are looking at but let's go
just back up here we'll choose this
default option um well let's choose this
one right here this one's nice although
there's lots of yellows and oranges
let's see this one this one's not bad
greens blues uh and like yellows so
that's nice um
other things that we want
to look at and there are these chart
elements right here other things that we
can add are things like data labels um
and right here it's super messy um but
if we went back and we got rid of some
of these things like the printer staples
highlighters pens and total we apply
that it's a little bit easier to
distinguish um
and that's you know something that you
may be interested in doing you can also
add this data table at the bottom which
is the actual columns and rows that you
have for this visualization right here
now let's expand this quite a bit I'm
going to make this extremely large if
you have something like this it actually
can be pretty nice um you know maybe we
get rid of these data labels but it can
be easy because you're putting it all in
one place you can also make this two
separate visualizations so you can have
one visualization just like this and
right underneath it you can have the
actual rows and columns but this option
allows you to put it all in one so let's
put this back down because that is way
too big and uh wait let's expand it a
little bit
now if you notice right here
we have our legend up top um it is
possible to actually change that you can
go right here and you can move this um
kind of wherever you want um but it's
not exactly easy to put based off how we
have it right here if we go into this
chart elements we go down to legend and
we hit this little arrow right here we
can select it on the right the top the
left and the bottom or we can just go to
more options uh which allows us to push
it anywhere but um let's say I want to
do it just like this I'm going to put it on
the right and I actually want to bring
it down right here and you know that's
just an option if you want to kind of
customize it a little further makes it a
little cleaner uh you can do that with
almost any of these things so if you
click on this oops if you click on this
you can move this anywhere as well so if
you want to move this over here on top
of it you can and make it look terrible
or you can move it uh right back over
here you know this is something that you
can move around uh you just kind of want
to make sure you're doing it the right
way so let's get this back where was
there we go
now before we go any further
let's copy that and put it right over
here with our other uh charts and graphs
and if you see over here on this side we
have this format chart area notice
I haven't shown you this at all yet
that is because I genuinely just don't
use this almost at all um there is some
good stuff in here um and I'm sure that
you know if you were someone who really
wants to go in there and super customize
it you can do that um but I honestly I
just never get in here and I never you
know change the glow or the shadows um
just not something I use and some of
these are only for this 3D
formatting which I never use and so I'm
not going to show you and walk through
these things again I really don't use
it and so if you want to go in there and
mess with it uh you know by all means go
for it it's just not something that I
want to take the time to show you
with that being said let's go back over
to this chart sheet that we have and it
was super super easy to get these um
charts and graphs and whatnot
there are lots of different options
again if we go back here and we go up
here to chart design and go to the
change chart type and again there are a
ton of different options like a pie
chart um like this it's you know
you can try to figure this out and use
these um but you know I wanted to show
you the ones that you'll probably use
the most which are these columns and
line charts and they all kind of are
similar in their own way this bar chart
is basically you know this column chart
just on its side and so they all have
their different flavor they all have
their different way of visualizing the
data but in essence they're using
the data in a similar way to
visualize it and represent the data
itself especially things like these box
and whisker plots or these waterfall
charts uh you know these are things that
usually require specific data to kind of
use uh and so I'm just using data
that you'll probably see the most of um
like this sales data so I hope that
this has given you a pretty good um you know
quick understanding of how to use these
how to customize them how to copy and
paste them over to a different sheet
to create some type of little uh chart
and visualization sheet that you can use
to show your employers and visualize
the data that you are working with thank
you guys so much for watching I really
appreciate it again huge shout out to
Udemy for sponsoring this Excel
series if you like this video be sure to
like and subscribe below and I'll see
you in the next
[Music]
what's going on everybody welcome back to the Excel tutorial Series today we'll
be looking at how to clean data in
Excel now knowing how to clean data in
Excel is actually extremely useful and
there are a ton of techniques to do this
I'm going to be showing you the ones
that I probably use the most and feel
are the most helpful to do the
bulk or the majority of the data
cleaning that you're going to do in
Excel like I said there are so many
different ways and very specific things
that you can do but I'm going to
highlight some of the bigger ones that I
find the most useful and some of you may
be thinking well I'll just do my data
cleaning in SQL or Python or when I get
it ready to put it in Tableau but
honestly a lot of the data cleaning at
least a lot of the big stuff I tend to
do in Excel if the data set is small
enough to fit in Excel and so I think
it's actually really useful to
know how to do this because you'll most
likely be doing it more than you think
now before we jump into the tutorial I
want to give a shout out to the sponsor
of this video and it's a brand new sponsor
it is Unlocked by Z by HP Unlocked is a
movie that's actually broken up into
four parts and each of them has a
unique data science challenge associated
with it now I'm going to read this next
part because it's extremely interesting
each challenge represents a different
topic so there's data visualization text
analysis audio signal processing and
computer vision and you can submit your
answers and your work on their website
for a chance to win one of 10 ZBook
Studio laptops or a free trip to the
Kaggle World Championships so I'll leave
a link in the description where you can
go watch the movie and then do the
challenges and then submit your answers
for a chance to win you should also go
check out their hackathon where you can
do these projects with other people just
like you who are trying to figure out
these answers and submit them to win as
well so go check that out thank you
again to the sponsor of this video
Unlocked by Z by HP now without further
Ado let's jump onto my screen and get started with the tutorial all right so
let's jump right into it I have this US
president data set I got the base data
set from Kaggle but I added some of
my own data and then I messed some stuff
up as well just to
demonstrate some of these things that
we're going to be looking at today this
is not a full project so we're not
actually going to be using this to
create any visualizations or anything
like that all this is just
for demonstration purposes but we will
be doing a full project in about two or
three videos in this Excel series
where we're going to be going from start
to finish with a real data set so
if that's something that you're
wanting then we will absolutely be doing
that now something that you may be
wondering is how do you actually
identify what you need to clean in
the data how do you know what to look for
well some of the obvious things are
things like formatting and
standardization so things like
this James Monroe in all caps that
happens all the time within real data
and so you want to
standardize that or this all lowercase
you want to standardize that you want
that all to be the same there's also
things like right here where we have
this Whig and this Whig with a bunch of
random stuff after it this happens all
the time where it's not completely
standardized and you may even notice
there are some spelling
errors in here and we'll
look through that in a little bit and
then there are things like
additional spaces where there shouldn't
be spaces there are things like
currencies that you need to be aware of
if you were importing this into or going
to be importing this into a SQL database
things like currencies can be just a
problem or be really unnecessary it
may actually cause more issues in the
long run so you may just want to
take that to the base value and
then dates are always an issue always
always always so always look at your
dates make sure they're
formatted correctly make sure they're
all the same these are the types of
things that right when I glance at this
data set these are things that I'm
looking for one other thing that is
actually the first thing that we're
going to start out with is you want to
make sure that your data is not
duplicated because if your data has
duplicate data in it and you don't want
that it's not supposed to be there there
are some specific use cases where
duplicated data is okay you
want to get rid of that and it's very
easy to do in Excel the first thing
we're going to do we're going to go
to this Data tab we're going to go right
over here and we're going to see if
there are any duplicates in our data so
we're just going to go up to Remove
Duplicates it's going to automatically
choose all of your columns to check
against so it's going to go from A all
the way through I it's going to see is
the exact same data in all these rows and if it is it's going to get rid of it
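The whole-row check described above can be pictured with a short Python sketch. This is only an illustration of the idea, not what Excel runs internally: the sample rows are made up, and the comparison here is an exact match on every column, which may differ from how Excel compares text.

```python
# Illustrative sketch: the sample rows below are made up.
def remove_duplicate_rows(rows):
    """Return rows with later exact duplicates dropped, keeping first occurrences."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row)  # every column takes part in the comparison
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


rows = [
    ["George Washington", "Nonpartisan"],
    ["Barack Obama", "Democratic"],
    ["Barack Obama", "Democratic"],  # same values in every column
]
cleaned = remove_duplicate_rows(rows)
print(f"{len(rows) - len(cleaned)} duplicate row(s) removed")
```

A row only counts as a duplicate when every column matches, so two rows that differ in a single cell are both kept.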
and so we're going to click OK and
it did find one duplicate and I'll show
you that one real quick because
it was right here so Barack Obama
was here twice and then I'm going to
hit Ctrl+Z to go back I'm
going to hit Ctrl+Y to go forward and it
removed that row completely now
in this example you may be able to spot
that with your eye but in a real data
set where you have 10,000 or 100,000 rows
there's absolutely no way you're going
to see that or it's very unlikely that
you are going to see that there's
duplicated data in there so just running
a quick removal of
duplicates is really important to
make sure that you have gotten rid of
those things so that's one of the first
things that I do we're going to go
into a lot of these different columns
and I'm going to show you
different techniques or things that I do
when I look at actual data so I'm going
to come right over here I'm going to
insert and this is what I actually do I
usually create a separate column
especially when I'm working with this
because I don't want to change this one
I don't want to go in here and
say equals UPPER equals PROPER
etc there are different ways that
you can change names not a lot but
these are the main ones
and all of them are completely okay so
for example I'm going to type equals UPPER
and I'm going to go like this
and close my parentheses so I selected
this cell I closed my parentheses hit Enter
and I'm going to go to the
bottom right I'm going to double click
this and it's going to apply to all of
them it is completely okay to have your
data like this if you want it to be like
that if you want it to be all lower
you can do that if you want it to be in
proper case you can do that
there are different uses for all
of them and honestly as long as it's all
the same typically it's okay but
for example if you're selling
this to like a third party company or
something like that they may have
what they want for their ingestion
process when they take your file in if
you send a weekly file or a
monthly file they may want it exactly
how they want it and you can change that
to what they want but as long as
it's standardized for you it's all the
same for you that is a good thing so now
we have all of these in proper
case that's typically what I do or I
use upper those are the ones I use the
most I don't usually use lower and if
you go in here and you type in
lower it changes it to all
lower I don't typically do that and
I'm going to say
president dash fixed and so now all of
these names all of these different
uppercase and lowercase these are all
fixed and it just makes it so much
easier to read and you don't have
different uppercase and lowercase
issues it's all the same so I'm going to keep keep that right
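For anyone following along in Python rather than Excel, the three casing options mentioned here have rough counterparts in the built-in string methods. This is a sketch under assumptions: the sample names are only illustrative, and `str.title()` is a close but not guaranteed match for Excel's PROPER on unusual names.

```python
# Sample values are illustrative, not the actual worksheet contents.
names = ["JAMES MONROE", "john quincy adams", "Andrew Jackson"]

# Keep the originals untouched and build separate "fixed" lists,
# mirroring the habit of cleaning into a new column.
upper_names = [name.upper() for name in names]    # like =UPPER(cell)
lower_names = [name.lower() for name in names]    # like =LOWER(cell)
proper_names = [name.title() for name in names]   # roughly like =PROPER(cell)

for original, fixed in zip(names, proper_names):
    print(f"{original!r} -> {fixed!r}")
```

Whichever casing is chosen, the point made in the video still applies: pick one and apply it to the whole column.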
there if we move a little bit to the
right if you look at this prior now this
prior is a mess it has stuff all over
and to be honest this is not really
something that I would probably be using
like in a real data set I would look
at this column and I would say this is
pretty useless if I had a very
specific use case for the data in
this column I might try to
parse it out and do something but I
don't this is a completely
useless column to me so I'm actually going
to skip this one I'm going to go to this
party one and this party one to me
looks pretty important because this is
something that I know I can group by
and I can create visualizations with
and kind of break that out and if you
look right here
we're going to add a filter so now let's
open up party and take a look so if we
look right here we have Democratic
Democratic-Republican Federalist
nonpartisan Republican Republicans
Whig and Whig with a date and some
information in the back of it and then
some blanks and it's really important
when we're looking at these
ones that we think we might group by
that we have these properly grouped
so Republican and Republicans to me
right off the bat looks like a spelling
error and so I'm just going to deselect
all I'm going to go to Republican and
Republicans and it's literally
Republican all the way down except
for this last one and to me that's just
something that I would update so I would
just go right here and do that if I didn't
do that and then I tried to create let's
say a pivot table on here it'd have its
own group of Republicans and it wouldn't
be added to Republican and maybe that's
on purpose but let's just presume that
we know this data extremely well and
that's not supposed to be like that
right again that just comes back to
knowing your data really well
understanding what
it should look like and we know that it
should not be like that so we're going
to fix that the next thing that we're
going to fix and as you can see it
got rid of it the next thing we're going to
fix is this Whig
that's just like an error
that's some issue on the data side
and we're just going to fix that by
updating it and that's it I would always
be keeping a copy of this with the
raw data somewhere else because this
is presumably like a working document
this is not
a um you know you aren't saving over your original file let's just say that
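The filter check and the manual fixes to the party column can be sketched in Python as "list the distinct values, then map the variants onto one spelling". The values and the `fixes` mapping below are hypothetical stand-ins for what the filter showed, and the raw list is left untouched, in line with keeping a copy of the raw data.

```python
from collections import Counter

# Hypothetical values standing in for the party column.
raw_parties = [
    "Democratic",
    "Republican",
    "Republicans",                      # looks like a spelling error
    "Whig",
    "Whig (hypothetical extra text)",   # same party with stray text after it
    "Republican",
]

# Assumed corrections; in practice these come from knowing the data.
fixes = {"Republicans": "Republican"}


def standardize(value):
    """Map known variants onto a single spelling so they group together."""
    value = fixes.get(value, value)
    if value.startswith("Whig"):
        return "Whig"
    return value


before = Counter(raw_parties)                        # what the filter list would show
after = Counter(standardize(p) for p in raw_parties)
print(sorted(before))
print(sorted(after))
```

Counting before and after makes it easy to see that the stray variants no longer form their own groups, which is what would otherwise happen in a pivot table.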
and then let's take a look at these
blanks real quick okay so there are
these rows right here that have nothing
I think we're okay but let's see if
there's anything different 47 48 okay so yeah
it's just these ones right here that
have no data in them anyway it's just
seeing it in the filter so not an issue
at all so okay we're looking good we've
gone all the way over we fixed this
president we skipped this one we
cleaned up this party and I kept this
one in here because I'm not exactly sure
if that's Democratic or Republican so
I'm going to keep it as its own thing
I'm not a huge history buff in that
aspect the next one right here
is really easy
this is something that happens all the
time actually most often
it happens on numerical data so like
there'll be a number of 1,1
and then there'll be a space after it
for absolutely no reason and it
happens all the time it does happen like
this as well where you'll see this
and all you've got to do is type TRIM and
select the cell we're going to close
that parenthesis and we're going to
apply that all the way down what is so
fantastic about TRIM is that it's
really intuitive and it knows basically
everything it needs to do for example
it gets rid of the spaces before
it gets rid of extra spaces in the
middle and it'll get rid of extra
spaces at the end which you wouldn't
be able to see but they are there and
they absolutely can cause issues if
you have spaces at the end that you
cannot see let's take this one for
example like if I had spaces at the end
that can cause issues when you insert
or put that into a database that
happens a lot with numbers
when you're putting that into SQL
that can cause issues and so you really it is important to actually do that trim
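The behavior described for TRIM (leading spaces removed, runs of spaces in the middle collapsed to one, trailing spaces removed) can be mimicked with a small Python helper. The function name `trim_like_excel` is made up for this sketch, and it only handles ordinary space characters.

```python
def trim_like_excel(text):
    """Drop leading/trailing spaces and collapse runs of spaces to one."""
    return " ".join(part for part in text.split(" ") if part)


raw = "  Joe   Biden  "
print(repr(raw), "->", repr(trim_like_excel(raw)))

# A trailing space is invisible on screen but still makes values unequal,
# which is the kind of issue that surfaces later in a database.
print("1001 " == "1001")                   # False
print(trim_like_excel("1001 ") == "1001")  # True
```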
and you can do that on all of your
columns or just ones that you know
you're having issues with but once you
import that data into SQL you will know
if there's an issue or not when you
actually try to start using it so we're
going to say vice and we're going to say
fixed there we go this next one
is one that you'll run into a lot when
you're working with numerical data you
will encounter so many different issues
one that I run into a lot is I've
worked with a lot of cost data or
pricing data and when it's in Excel
it sometimes comes in with these
currency symbols like a dollar sign a pound
sign things like that and when you put
that into
SQL it just is a nuisance right you're
not going to be able to run calculations on it it's
going to go in as text or it's going
to be like a string right because it has
that special character and you don't
want that you don't want to have to then
go in and then change things around you
just want to be able to start um you know doing calculations on those numbers
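The currency problem described above can be sketched in Python: a value that carries a symbol is just text until the symbol and thousands separators are stripped. The helper name, the list of symbols, and the sample amounts are all assumptions made for illustration.

```python
from decimal import Decimal

CURRENCY_SYMBOLS = "$£€"  # assumed set of symbols to strip


def currency_to_number(text):
    """Turn text like '$400,000.00' into a number that can be summed."""
    cleaned = text.strip()
    for symbol in CURRENCY_SYMBOLS:
        cleaned = cleaned.replace(symbol, "")
    cleaned = cleaned.replace(",", "")  # drop thousands separators
    return Decimal(cleaned)


salaries = ["$25,000.00", "$400,000.00"]  # made-up sample values
total = sum(currency_to_number(s) for s in salaries)
print(total)
```

`Decimal` is used here instead of `float` so that money values keep their exact decimal digits; that is a design choice of this sketch, not something stated in the video.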
so what you can do is sometimes it'll
come in as text sometimes it'll come
in as currency which I think this
one's a currency we are just going to
change that to be a number and then
we're going to get rid of these
and get rid of those it
doesn't look as pretty but that is much
more useful than actually having the
currency on there with the decimals this
actually is so much easier when
you want to use it for almost anything
because you're able to add and do
things properly in other systems in
Excel I think it does understand it
but that can cause issues so
that is how you do that the next thing
that we're going to look at is these
dates and just notoriously whenever I
see a date field I know there's going to
be an issue with it it's very rare
that I get a date field that is perfect
it genuinely is a
novelty when that happens and most of
the time it has to do with let's say
a date comes into Excel and it's in a
text format or dates come into Excel and
they're not the same in this example
they are not the same and we just
want them to all be similar they say
date if you look right here it says
date it says date it looks like it
should be the same but if we go
like this it all looks the same right
there's no issues at all if we were to
try to use that it may or may not be
an issue but we don't want to leave that
to chance later on if you're using this
with Python or something like that it
can cause issues maybe not in SQL
because it may see
what's in the underlying cell not just
what we see but some systems won't and
so you want to make sure that they're all the same
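If the same data is later handled in Python, one way to get every date into a single format is to try a few known layouts and flag anything that matches none of them. This is a sketch under assumptions: the list of formats, the sample values, and the month/day/year output style are all illustrative, and month names are assumed to be in English.

```python
from datetime import datetime

# Assumed input layouts; extend this list to match the real data.
KNOWN_FORMATS = ["%m/%d/%Y", "%B %d, %Y", "%Y-%m-%d"]


def to_short_date(text):
    """Return the date as month/day/year, or None if no known layout matches."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
        return f"{parsed.month}/{parsed.day}/{parsed.year}"
    return None  # leave for manual review instead of guessing


samples = ["1/20/2021", "January 20, 2009", "2017-01-20", "44216"]
for value in samples:
    print(repr(value), "->", to_short_date(value))
```

Returning `None` for values that do not parse, such as a stray number sitting in a date column, keeps the odd entries visible rather than silently converting them.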
all the same and so you know what we were doing back
and so you know what we were doing back here with um oops with the party and we
here with um oops with the party and we were looking at this uh this filter and
were looking at this uh this filter and identifying the issues I usually do that
identifying the issues I usually do that on date fields as well and and
on date fields as well and and oftentimes um I know just for just for
oftentimes um I know just for just for demonstration purposes ofttimes I will
demonstration purposes ofttimes I will get something like that and then I'll
get something like that and then I'll come up here and I'll notice that
come up here and I'll notice that there's this one random number that
there's this one random number that happens all the time all the time um and
happens all the time all the time um and so you know you want to make sure that
so you know you want to make sure that you um that you look at these things and
you um that you look at these things and just just do at least a quick glance if
just just do at least a quick glance if not kind of doing a kind of a deep dive
not kind of doing a kind of a deep dive into it but all we're going to do is
into it but all we're going to do is we're going to do both of these and
we're going to do both of these and we're going to do a short date and let's
we're going to do a short date and let's take a look and see if that fixed it and
take a look and see if that fixed it and so now they are all the same format and
so now they are all the same format and that is fantastic that is exactly what
that is fantastic that is exactly what we want we're going to go back through
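For readers who will later load this data with Python, the same "make every date the same format" step can be sketched as below. This is only an illustration under stated assumptions: the list of input formats and the sample values are made up, not taken from the video's data set, and a value that matches no known format (like the stray number mentioned above) is flagged rather than guessed at.

```python
from datetime import datetime

# Hypothetical formats we expect to find in the column; adjust to your own data.
KNOWN_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%B %d, %Y"]

def to_short_date(raw):
    """Return the date as MM/DD/YYYY, or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next candidate format
        return parsed.strftime("%m/%d/%Y")
    return None  # leave for manual review instead of guessing

# Made-up sample values, including one stray number.
dates = ["3/15/2021", "2021-03-16", "March 17, 2021", "44272"]
cleaned = [to_short_date(d) for d in dates]
print(cleaned)
```

Returning `None` for anything unrecognized keeps the odd values visible, which matches the advice to at least glance over the column before trusting it.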
we want we're going to go back through here we're going to get rid of these um
here we're going to get rid of these um again this is a
again this is a working um this is a working document
working um this is a working document oops uh we need to we're I'm going to do
oops uh we need to we're I'm going to do um control shift down oops let me go
um control shift down oops let me go back up do control shift down and copy
back up do control shift down and copy and what I'm going to do right now is
and what I'm going to do right now is I'm actually going to copy let me do it
I'm actually going to copy let me do it right here I'll show you sometimes I do
right here I'll show you sometimes I do this does just depends I'm going to go
this does just depends I'm going to go right here I'm going to hit rightclick
right here I'm going to hit rightclick and I'm going to paste as a value which
and I'm going to paste as a value which means it's not going to take the
means it's not going to take the calculation or the formula that I just
calculation or the formula that I just did
did uh it's going to actually paste it as
uh it's going to actually paste it as that value so we just replaced it um
that value so we just replaced it um right here you can see up here it says
right here you can see up here it says equals trim of
equals trim of G2 this now now that I copied and pasted
G2 this now now that I copied and pasted it over as a value um it got rid of that
it over as a value um it got rid of that um calculation and now it is actually a
um calculation and now it is actually a string so we don't need this anymore and
string so we don't need this anymore and I'll do the same thing over here as
I'll do the same thing over here as well I'm going to control shift down
well I'm going to control shift down copy and I just hit the right key uh or
copy and I just hit the right key uh or the left key sorry now I'm going to
the left key sorry now I'm going to right click and I'm going to do paste as
right click and I'm going to do paste as a value and again it has this proper and
a value and again it has this proper and now it doesn't have the proper it's
now it doesn't have the proper it's actually the value that was here so
actually the value that was here so that's really important to note uh and
that's really important to note uh and we're going to get rid of that one and
we're going to get rid of that one and so now what we have is is already
so now what we have is is already looking much better now one of the last
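As a rough Python analogue of the TRIM and PROPER steps followed by "paste as a value", the sketch below cleans a list of made-up names. The helper names `trim` and `proper` are hypothetical, and the behavior only approximates Excel: `split()` collapses every kind of whitespace, not just spaces, and `str.title()` does not always capitalize exactly as PROPER does.

```python
def trim(text):
    # Approximates Excel TRIM: drop leading/trailing whitespace and
    # collapse runs of whitespace between words to a single space.
    return " ".join(text.split())

def proper(text):
    # Approximates Excel PROPER: capitalize the first letter of each word.
    return text.title()

# Made-up sample values with stray spaces and inconsistent casing.
raw_names = ["  john   smith ", "JANE DOE", "mary  ANN lee"]

# The result is a list of plain strings with no link back to the originals,
# which is the same idea as pasting values instead of formulas.
cleaned = [proper(trim(name)) for name in raw_names]
print(cleaned)
```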
looking much better now one of the last things I we're going to look at is
things I we're going to look at is deleting columns that we are not going
deleting columns that we are not going to use and this is why it's so important
to use and this is why it's so important to keep a backup or or or the raw data
to keep a backup or or or the raw data not in this file because if you start
not in this file because if you start saving over this file and this is your
saving over this file and this is your raw file uh that can mess up a lot of
raw file uh that can mess up a lot of things and that happens to me before and
things and that happens to me before and it's terrible and then you have to
it's terrible and then you have to request another file or you have to go
request another file or you have to go back and find it or something like that
back and find it or something like that it's terrible um so so this is our
it's terrible um so so this is our working document so we can mess with
working document so we can mess with this and do whatever we want for our
this and do whatever we want for our purposes now for us um I can already
purposes now for us um I can already tell you that this prior is a bunch of
tell you that this prior is a bunch of nonsense and we do not need it we're not
nonsense and we do not need it we're not going to use it for anything and it and
going to use it for anything and it and if we have um this is a small very small
if we have um this is a small very small data set this only has like um let's say
data set this only has like um let's say you know one two three four five six
you know one two three four five six seven eight we have like eight columns
seven eight we have like eight columns that we're you know kind of using that
that we're you know kind of using that has data eight or nine now that's a
has data eight or nine now that's a small data I've had ones with literally
small data I've had ones with literally like hundreds um and and it has so many
like hundreds um and and it has so many columns uh so much data and sometimes
columns uh so much data and sometimes it's good to just trim it back to the
it's good to just trim it back to the things you know you're going to use this
things you know you're going to use this to me is absolutely useless um we're
to me is absolutely useless um we're going to delete that
going to delete that and then right over here it's pretty
and then right over here it's pretty redundant um it's just one number off
redundant um it's just one number off but if we scroll down just a little bit
but if we scroll down just a little bit um it goes it's basically just counts
um it goes it's basically just counts it's a you could even call it a unique
it's a you could even call it a unique um identifier if you want sure why not
um identifier if you want sure why not but we don't need both um so we're going
but we don't need both um so we're going to get rid of this first one and now we
to get rid of this first one and now we have more of the useful and relevant
have more of the useful and relevant data rather than the stuff that we
data rather than the stuff that we absolutely know that we are not going to
absolutely know that we are not going to use um these date updated and date
use um these date updated and date created we may never use them but we
created we may never use them but we might um so it doesn't hurt to keep it
might um so it doesn't hurt to keep it on hand those other ones are ones that
on hand those other ones are ones that we are almost certain we will never use
we are almost certain we will never use again keep a backup just in case you
again keep a backup just in case you need it you can always go back and get
need it you can always go back and get it so you know if you go back to what we
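The "drop what you won't use, but keep the raw data untouched" idea can be sketched in Python as follows. The column names and rows here are invented for illustration and are not the columns from the video; the point is only that the working copy is built as a new object, so the raw rows are never modified.

```python
# Made-up raw rows; "row_number" and "notes" stand in for columns we won't use.
raw_rows = [
    {"id": 1, "row_number": 2, "name": "John Smith", "notes": "n/a", "date_created": "03/15/2021"},
    {"id": 2, "row_number": 3, "name": "Jane Doe", "notes": "n/a", "date_created": "03/16/2021"},
]

COLUMNS_TO_DROP = {"row_number", "notes"}

# Build a separate working copy without the unwanted columns.
working_rows = [
    {col: val for col, val in row.items() if col not in COLUMNS_TO_DROP}
    for row in raw_rows
]

print(list(working_rows[0]))  # columns kept in the working copy
print(list(raw_rows[0]))      # raw data still has every column
```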
So if you go back to what we started with and look at what we have now, it is much cleaner and much more usable. These are small, subtle changes, especially with this very small data set of only 50 rows, or 46 rows, but you're going to be working with data sets that are thousands, tens of thousands, hundreds of thousands of rows, and you need to know how to look at this data, standardize it, and format it properly for what you're going to be using it for. If you're keeping it in Excel, there are different things you may do than if you're putting it into a database or using Python to access it, so you need to know your use case. But these are some things that I do all the time to clean up the data before I use it for something, whether I'm creating pivot tables or putting it into SQL.

Hopefully that gives you an idea of some of the things you should be looking for when you're actually cleaning data. It's really important to understand why you're making these changes, because some of the things that I did today may not be things you want to do on a different data set that has different uses and different purposes. So take everything that I've said and apply it with a little grain of salt to your data set, because your specific needs may be different from what I wanted when I was cleaning mine.

I hope this gave you a small glimpse of some of the things that I'm looking for when I clean a data set, or when I get a new data set in and I'm analyzing it and figuring out what I need to fix in it. I hope this has been helpful. With that being said, thank you so much for watching, I really appreciate it. If you liked this video, be sure to like and subscribe below, and I'll see you in the next video.
[Music]

What's going on everybody, welcome back to the Excel tutorial series. Today we're going to create an entire project in Excel. If you've never done a complete project in Excel, where you take the data, clean it, and then create an actual dashboard where people can click on things and filter things, this is going to be a really great learning opportunity, as well as potentially a simple project that you can use for your portfolio. Or you can spice things up and go a little farther than what we're going to be doing in today's video. I will walk you through every single step of the way, and hopefully we learn something together. Without further ado, let's jump onto my screen and get started with the project.

All right, so this is the data set that we're going to be working with. I will leave a link in the description to my GitHub where you can go and download it, so you can be working with the exact same data set that I am using. Before we actually get into this data and start looking at it, I'm going to show you what the final dashboard is going to look like. We're going to create a few different types of visualizations, nothing too crazy, and then we'll create some interactive filters for our data as well.

So let's go right on over to our data set. I'm going to hide this because we are not going to use that. What I am going to do before we do anything is create a dashboard sheet, a pivot table sheet, and a working sheet. All these things have different uses, and I'll explain that as we go along. So this is our data set, and I'm going to copy it over to our working sheet. When I go into Excel and I'm working on something, I don't like to use just the one sheet I was given, in case I mess something up and it saves over, or there's some issue. I like to create a working sheet and keep the raw data right over here. It just makes my life easier; I don't have to save it and then open up a different Excel file to compare them. So we have our bike buyers sheet, which is our raw data, and this is our working sheet, which is the one we'll actually be working on today.
So let's start looking at it really quick and just glance and see what data we're working with. Then we'll start cleaning it up, making it more useful for what we are going to be using it for, and then we'll start building out the dashboard.

Right here we have an ID that should be a unique ID for each person. This is their marital status, so married or single. This is their gender, male or female. We have their income, children, their education, their occupation, whether they own a home, how many cars they own, how long their commute is, the region where they live, their age, and whether they purchased a bike. This column right here is extremely important: it tells us whether they did or did not buy a bike. So we got their information, they were looking for a bike, and they either decided not to buy one or they did buy one. We're going to be using that one a lot in this video. So this is basically the data set that we're working with: some of the demographics and information behind the person.
So when we are cleaning the data, before we do anything, I like to see if there are any duplicates in here. What we're going to do is come right up here and go to (where is it?) right here, Remove Duplicates. We're going to click on that. It selects every single column; we just want to see if there's any useless duplicated data that we do not need. The data has a header, so we're going to click OK. All right, so we had a ton of duplicates in there, for whatever reason. I'm glad we did that; otherwise we would have not-good data, and we don't want that.
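For comparison, removing exact duplicate rows can be sketched in plain Python as below. The rows are invented sample values, not the bike buyers data, and the function keeps the first occurrence of each row, which is also what Excel's Remove Duplicates does when every column is selected.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each row, comparing every column."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # the whole row is the duplicate key
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Made-up rows: (ID, Marital Status, Gender, Purchased Bike)
rows = [
    (101, "M", "F", "No"),
    (102, "S", "M", "Yes"),
    (101, "M", "F", "No"),   # exact duplicate of the first row
    (103, "M", "M", "No"),
]

deduped = remove_duplicates(rows)
print(len(rows) - len(deduped), "duplicate row(s) removed")
```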
Let's start right over here. The ID, of course, we're not going to change. The marital status and gender are M's and S's, and F's and M's. This isn't inherently a bad thing, but we have to think about it from the perspective of someone who's going to be using this dashboard. Do they know what M and S are? Do they know what M and F are? If they don't, it's better to just spell it out for the most part, so let's do that.

We're going to click on column B and hit Ctrl+H, which brings up Find and Replace. Now, there's an M in both of these columns and they mean different things: one means married and one means male. So what we're going to do is search by columns, and we'll have Match case checked. I don't think that's going to change anything, but it just makes the match case-sensitive. We're going to find M, replace it with Married, and hit Replace All. Awesome. Then we do S as Single.

This next one is super easy; we're going to do the exact same thing right here. So select column C and hit Ctrl+H. It's still set to search by column, so we'll do M as Male and replace all of those, and F as Female and replace all of those. That's great.
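Because "M" means something different in each column, any scripted version of this step needs one mapping per column. A minimal Python sketch, using assumed column names and made-up rows, might look like this:

```python
# One lookup table per column, since "M" is Married in one and Male in the other.
MARITAL_STATUS = {"M": "Married", "S": "Single"}
GENDER = {"M": "Male", "F": "Female"}

# Made-up rows; the column names are assumptions for illustration.
rows = [
    {"Marital Status": "M", "Gender": "F"},
    {"Marital Status": "S", "Gender": "M"},
    {"Marital Status": "M", "Gender": "M"},
]

for row in rows:
    # .get(value, value) leaves any unexpected code unchanged instead of failing.
    row["Marital Status"] = MARITAL_STATUS.get(row["Marital Status"], row["Marital Status"])
    row["Gender"] = GENDER.get(row["Gender"], row["Gender"])

print(rows)
```

Keeping the two dictionaries separate plays the same role as selecting a single column before running Find and Replace.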
The next column right here is income. In a previous video I talked about how I don't typically like it in this format, and that's true. If you're doing calculations on it or anything else, having the dollar sign, or having it formatted as a currency, can sometimes mess it up. We're not really going to mess with it too much right now. What we can do is just make sure all of it is currency; we'll just go like that to make it a little simpler. But we're not going to change it to a numeric. We will use this in the visualization and see how it looks, and if we need to, we'll come back and change it. If not, we'll keep it how it is. So that's all we're going to do to that one.
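To illustrate why a currency-formatted value can get in the way of calculations outside Excel, here is a small Python sketch. It assumes the income arrives as text such as "$40,000.00" (made-up values), which has to be stripped of the dollar sign and thousands separators before it can be summed; the helper name `parse_currency` is hypothetical.

```python
from decimal import Decimal

def parse_currency(text):
    """Convert a string like '$40,000.00' into a Decimal."""
    return Decimal(text.replace("$", "").replace(",", "").strip())

# Made-up income values stored as currency-formatted text.
incomes = ["$40,000.00", "$30,000.00", "$80,000.00"]

total = sum(parse_currency(value) for value in incomes)
print(total)
```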
The children column, those look good. We have education: partial college, partial high school. This looks fine to me. If there are any spelling errors or anything like that, of course we need to clean that up, but it doesn't look like there are. Occupation: skilled manual, manual. Okay, those should be separate. Are they a homeowner? That should just be yes or no. All right, we have cars: 1, 2, 3, 4. Good night, who owns four cars?

Then we have the commute distance. There's nothing terrible about this. It's giving you ranges, which can be a good thing. I say let's keep it for now, but I have a feeling that when we get further along and start using it in the visualization, we may want to change this. So let's hold off for now, and if needed, we will come back and change it. And then we have our region, and that looks totally fine.
And we have our age. When you're using ages, typically you have some type of age bracket or age range, and you do that because there are so many ages in here. It's 25 all the way down to 89, and if you're using that in some type of visualization, it could get really messy. So you'll create brackets around these so that you can condense it and make it a little bit easier to understand. Let's do that: we'll create a new column and then we can use that for our dashboard.

So let's go right up here and create a new column. We'll call this Age Brackets. What we can do is use an IF statement to say if it's older than or less than some value, and give them these ranges. That's one way to do it, and that's the way we're going to do it right now. So let's go up here. We're going to type equals, then IF, and close that parenthesis. Now what we're going to say is: if this (we'll go right back up here) is less than 31, and then a comma. So if they are less than 31, what do we want to call them? We'll call them Adolescent. Oops, that's not how you spell adolescent. And then if they're not, we're going to say it's Invalid. Okay, let's just see if this one works first.

All right, it's not working at all. Okay, so basically what we did was incorrect; we did it backward. I said L2 is greater than 31. No, we want to do it like this, so let's do that now. All right, and it should pull up Adolescent if they're under the age of 31, so 30 or below is basically what it's saying. If they're 31 they'll be Invalid, but if they're 30 or below it's Adolescent. So it is working properly. Let's see what it says. Perfect, so this one is working.
Now what we want to do is build on this and make it a nested IF statement, if you've ever heard of that or done that before. So this is our first IF statement, and this Invalid is our value_if_false. This whole statement is going to become the value_if_false for a different IF statement. Let me write it out and hopefully that'll make sense. We're going to say IF, open parenthesis, and we're going to do it like this. Let's just get rid of this for a second. All right, what did I do? Oops, give me a second. Okay, let me just write that out again. We have our IF, there we go.

So now we're going to write the next part of it. We're going to say if L2 is, and this time we're going to do greater than or equal to 31, so now it's going to include that 31. Right here we did anything less than 31, so that's 30 and below; this one is going to be 31 and above. We're going to say these people are Middle Age, and if not, then it's going to go to this other IF statement. And then we need to close it, I believe. So now let's try this. All right, fantastic.
right fantastic now if um everybody should be in one of these areas right
should be in one of these areas right everyone should either be an adolescent
everyone should either be an adolescent or middle age because basically all
or middle age because basically all we're saying is is if they're older than
we're saying is is if they're older than 31 or 30 or below that's all these two
31 or 30 or below that's all these two statements do so we have um you know our
statements do so we have um you know our next group now we can add and go even
next group now we can add and go even further into this and now we can use
further into this and now we can use this entire thing as the um what was it
this entire thing as the um what was it called the value if false section so
called the value if false section so that's what we're going to do we're
that's what we're going to do we're going to do one more so we're have three
going to do one more so we're have three different categories so we're going to
different categories so we're going to say if and do uh an open parenthesis and
say if and do uh an open parenthesis and we're going to say if oh actually Let's
we're going to say if oh actually Let's Do It
Do It um let's not do it to this one
um let's not do it to this one let's do to this top one just
let's do to this top one just easier uh so we're going to say if open
easier uh so we're going to say if open parenthesis we're going to say L2 and
parenthesis we're going to say L2 and this time we're going to say anybody
this time we're going to say anybody over the age of 50 uh or we can do 55
over the age of 50 uh or we can do 55 let's do 55 so we'll do 55 and we're
let's do 55 so we'll do 55 and we're going to call them
going to call them old and we'll do a comma and this is the
old and we'll do a comma and this is the value if false statement and we need to
value if false statement and we need to close our parenthesis so let's try this
close our parenthesis so let's try this anybody over the age of 55 should have
anybody over the age of 55 should have old um you know maybe we'll do 54 so
old um you know maybe we'll do 54 so anybody who is 55 is considered old I
anybody who is 55 is considered old I think that's fair I think that's fair
think that's fair I think that's fair guys oops I should have
guys oops I should have done I should have done that to this one
done I should have done that to this one let me get out of this and we'll do
let me get out of this and we'll do 54 my dad is 55 that's why I'm doing it
54 my dad is 55 that's why I'm doing it like this this is fre
like this this is fre dead CU he should be in this old
dead CU he should be in this old category to be fair so now we have
category to be fair so now we have adolescent adolescent middle-age and old
adolescent adolescent middle-age and old these are three categories so we can now
these are three categories so we can now have these buckets these different
have these buckets these different groups of Ages and it's much more usable
groups of Ages and it's much more usable than these individual ages um and so we
than these individual ages um and so we will be using this in our in our
will be using this in our in our dashboard for sure now our next one is
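The nested IF described here can be sketched outside of Excel as well. The snippet below is only an illustration in Python, assuming the cutoffs mentioned in the video (over 54 is old, 31 and above is middle age, everything else is adolescent). The function name age_bracket and the sample ages are made up for the example.

```python
def age_bracket(age: int) -> str:
    """Mirror the nested IF: test the highest cutoff first, then fall through."""
    if age > 54:           # 55 and up
        return "Old"
    if age >= 31:          # 31 through 54
        return "Middle Age"
    return "Adolescent"    # 30 and below


# Try the values right at each boundary to confirm nobody falls between buckets.
for age in (25, 30, 31, 54, 55):
    print(age, age_bracket(age))
```

Checking the boundary ages (30, 31, 54, 55) is the same sanity check as changing the cutoff from 55 to 54 so that a 55-year-old lands in the old category.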
Now our next one is the purchased bike, and we're not going to do anything with that, so that
is that one. And you know, there wasn't a ton to clean up here. We removed some duplicates.
I don't know why it says that. What did I do? Married, married... what does this even mean?
Did I write that? Did I mess this up, guys? Oh, when I did the M and the S replacement in
there, it replaced it with married and single. It's supposed to say marital status. Oops.
Thanks for catching that, guys. I hope that's how you spell marital, we'll see. So we are
going to keep it just like this.
Now what we are going to do is build pivot tables with this data. So we had our raw data, we
have our working sheet, and now we want to create pivot tables, and pivot tables are how you
actually build your dashboards and your visualizations. So we're going to go right here,
whoops, get rid of that, we're going to go right here, we're going to go to Insert and say
PivotTable, and it's going to ask us what range. So we're going to go back to the working
sheet and we'll just click here and hit Ctrl+A. This is going to select all of our data for
us, so it's really easy, and we're going to hit OK. And so now we have all of our pivot...
I don't need to pull it out that far, that was way too far. And now we have all of our pivot
table information over here, and so that should make it really easy to actually build out.
So what we're going to do is start selecting what columns and what data we actually want to
work with. The first one that we're going to build out is a dashboard that is basically
looking at the average income of somebody who either bought or did not buy a bike. So in this
one we're going to need their income, that's definitely going to be a value right here. But
we want to break it out by male and female, so let's look at their gender; we're going to
pull that down into the rows. This is basically a sum, and no, let's make this an average. So
I just clicked right here, I went to the Value Field Settings, and we're just going to do an
average. All right. And as you can see there are four decimal places. We'll keep it as is
right now, but we may need to go back and change something. Then we're going to look at
whether they purchased a bike or not, and we're going to put that right here. So we can see
right here that for the people who did not buy a bike, the females' average salary was 53,000
and the average salary for males was 56,000. For yes, the ones who did buy a bike, the
average salary was 55,000 for females and 60,000 for males. So the people who had a little
bit more money are buying bikes, and you can also see that the men are making more money in
this data set overall, in general.
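For anyone who wants to see what this pivot table is doing under the hood, here is a rough Python sketch: group the rows by gender and purchase status, then average the income in each group. The rows below are invented for illustration and are not the values from the bike sales data set; the function name average_income is also just a placeholder.

```python
from collections import defaultdict

# Hypothetical rows: (gender, income, purchased_bike). Not the real data set.
rows = [
    ("Female", 40000, "No"),
    ("Female", 60000, "No"),
    ("Male", 50000, "No"),
    ("Female", 70000, "Yes"),
    ("Male", 80000, "Yes"),
    ("Male", 60000, "Yes"),
]


def average_income(rows):
    """Average income per (gender, purchased) group, like rows x columns in a pivot."""
    totals = defaultdict(lambda: [0, 0])  # group key -> [running sum, row count]
    for gender, income, purchased in rows:
        bucket = totals[(gender, purchased)]
        bucket[0] += income
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


averages = average_income(rows)
for (gender, purchased), avg in sorted(averages.items()):
    # The ",.0f" format adds thousands separators and drops the decimals.
    print(f"{gender:6} {purchased:3} {avg:,.0f}")
```

Switching the summary from sum to average in the Value Field Settings corresponds to dividing each group's total by its row count here, and the number format plays the same role as adding commas and trimming decimals in the chart.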
So let's make the visualization really quick. But you know, I'm not a huge fan of these
decimal points, and maybe we can just change that in the visualization, we'll see. Oops,
that's not what I meant to do. Let's do that. So what we are going to do is click into here,
click Insert, and go to these Recommended Charts, and it's going to bring up basically every
single type that we would want, and we can just click in here and see which one looks good.
Oh yeah, I love those 3D ones, those are my favorite, you guys know that. Let's use this one
right here, pretty simple. Whoops, let's pull this right over here. And as is, it looks
pretty good. It shows male and female, we have the average incomes right here, and whether
they did or did not purchase, so at a glance it's pretty easy to see. If you want to change
it up style-wise, go for it; I'm just going to keep it as is. But let's see if there's
anything we need to add, right? Do we want to add these axis titles? For the most part I tend
to do that; it makes it pretty easy to see. So we can go in here and just click it like this,
and we'll say income, and, oops, we'll do gender. So that's what that is.
And let's go back in here. Do we want to add a chart title? We definitely want to add a chart
title; for most of these we'll add a chart title for sure. So we'll say Average Income Per
Purchase. I don't know if that's 100% right, but we'll use it. If we need to change it to be
by gender or something, we can. For now, let's see, do we want to add data labels? Definitely
not. A data table? We can do this, it may make it a little easier to read. I will say again,
these decimal points are really throwing me off. Let's go see if we can change it in here.
Let's see if we can just make these numbers. OK, and we can keep it like that, or we can even
do something like this and add commas. Yeah, I'm going to keep it just like this, I think
this just looks the best. Again, I'm adding commas here and changing the decimal places right
here; it just makes it look a little nicer, a little cleaner. So let's keep this exactly how
it is. We can always change things if we want to come back to it.
So we created our pivot table and then we created our visualization, and that's basically
exactly what we're going to do for all of these, because again, all of these need pivot
tables in order to create the visualization. So let's get out of here. We're going to scroll
down and create our next pivot table, and once we get done with all of the pivot tables and
all the visualizations that we need, then we will start.
control a we're going do okay and basically do the exact same thing that
basically do the exact same thing that we did um this time we're going to look
we did um this time we're going to look at the distance so for this one I wanted
at the distance so for this one I wanted to see you know I try to you know I
to see you know I try to you know I created this already I've already done
created this already I've already done this entire project through but I
this entire project through but I haven't really talked about why or what
haven't really talked about why or what we're going to look at for this one you
we're going to look at for this one you know know we're looking at is their
know know we're looking at is their income does it change whether they
income does it change whether they bought or didn't buy one um so if they
bought or didn't buy one um so if they said yes you know is there a reason are
said yes you know is there a reason are they making more money is you know are
they making more money is you know are price points are the customers do they
price points are the customers do they make more money so you we cater to them
make more money so you we cater to them or not uh that's a good question uh
or not uh that's a good question uh another thing is you know we're we sell
another thing is you know we're we sell bikes or this person sells bikes so
bikes or this person sells bikes so commuting distance definitely makes a
commuting distance definitely makes a difference you know does the person who
difference you know does the person who is buying a bike live one mile away from
is buying a bike live one mile away from where they work or 20 miles away uh this
where they work or 20 miles away uh this will help us determine this next
will help us determine this next visualization will help us determine you
visualization will help us determine you know who who is doing that or who's
know who who is doing that or who's buying it so what we are going to do is
buying it so what we are going to do is we are going to look at the um that one
we are going to look at the um that one that we were looking at earlier the
that we were looking at earlier the commute distance so we're going to bring
commute distance so we're going to bring that right over here so we have these
that right over here so we have these you know one mile 10 Mile 1.2
you know one mile 10 Mile 1.2 Etc now we are going to uh again we're
Etc now we are going to uh again we're going to look at if they purchased a
going to look at if they purchased a bike
bike that's really important and let's make
that's really important and let's make that the column as well so now what we
that the column as well so now what we have is a count of these Nos and yeses
have is a count of these Nos and yeses whether they did or did not buy a bike
whether they did or did not buy a bike um one of the issues I already see and
um one of the issues I already see and we'll I'm going to visualize it and then
we'll I'm going to visualize it and then I'll show you that this 10 miles you
I'll show you that this 10 miles you know it's right next to the 0.1 so it's
know it's right next to the 0.1 so it's not an order um and that could be that
not an order um and that could be that could be an issue um so we may have to
could be an issue um so we may have to revise that somehow to put it at the
revise that somehow to put it at the very bottom because we can either do
very bottom because we can either do ascending
ascending or descending uh either one I don't
or descending uh either one I don't think is going to work so we may have to
think is going to work so we may have to work through that in just a second um I
work through that in just a second um I don't know if I did that my I plan for
don't know if I did that my I plan for that um yeah so it has this big dip
that um yeah so it has this big dip um yeah so let's let's create it um
um yeah so let's let's create it um that's okay we're going to figure this
that's okay we're going to figure this one out together because I honestly um I
one out together because I honestly um I didn't plan for this one so okay we have
didn't plan for this one so okay we have 0.1 miles that's exactly where it needs
0.1 miles that's exactly where it needs to be the one the two the five that's
to be the one the two the five that's exactly where it needs to be this 10
exactly where it needs to be this 10 miles is not and let's see if I change
miles is not and let's see if I change that 10 10 plus miles to 10 miles plus
that 10 10 plus miles to 10 miles plus let's see if that'll put it down here
let's see if that'll put it down here because I I don't know if it's looking
because I I don't know if it's looking at I don't know if it's reading it weird
at I don't know if it's reading it weird um but let's go into this working sheet
um but let's go into this working sheet and let's go right here and we're going
and let's go right here and we're going to do controll H and we'll do oops not
to do controll H and we'll do oops not this
this one um 10 miles plus let's get that in
one um 10 miles plus let's get that in there and we're going to do
there and we're going to do 10 uh
10 uh miles plus I I don't know if that's
miles plus I I don't know if that's actually going to work um we will see so
actually going to work um we will see so let's go back to the pivot
let's go back to the pivot table let's re go to the data let's
table let's re go to the data let's refresh uh no it didn't it didn't change
refresh uh no it didn't it didn't change it um okay so let's think about this
it um okay so let's think about this maybe if we change it to like a letter
maybe if we change it to like a letter it might change down here so start it
it might change down here so start it with uh miles that could work um let's
with uh miles that could work um let's try it it okay it's already
try it it okay it's already selected let's do the 10 plus miles okay
selected let's do the 10 plus miles okay so let's
so let's do
do um M uh more than 10
um M uh more than 10 miles and we'll replace all let's get
miles and we'll replace all let's get rid of
rid of this let's go to the pivot and refresh
this let's go to the pivot and refresh all right okay so it's not perfect but
all right okay so it's not perfect but it works
it works um and for what we're doing I think
um and for what we're doing I think we'll keep it how it is so we have our
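The reason the rename works is that the pivot table sorts these labels as text, not as numbers. Here is a small Python sketch of the same idea. The label spellings are assumptions about how the commute distance column is written, and Python's text ordering is not identical to Excel's, so the out-of-place label lands in a slightly different spot, but the underlying problem and the fix are the same.

```python
# Assumed label spellings, listed in the order we actually want on the chart.
labels = ["0-1 Miles", "1-2 Miles", "2-5 Miles", "5-10 Miles", "10+ Miles"]

# A plain text sort compares character by character, so "10+" is not treated as ten.
text_sorted = sorted(labels)

# Renaming the last bucket so it starts with a letter pushes it after every digit.
renamed = ["More than 10 Miles" if label == "10+ Miles" else label for label in labels]
renamed_sorted = sorted(renamed)

print(text_sorted)     # "10+ Miles" is no longer last
print(renamed_sorted)  # the renamed bucket sorts to the end
```

Another common option, not used in the video, is to prefix each label with a sort key, but a single rename is enough when only one bucket is out of place.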
So we have our second one, and there are different ways you can change this one. On the last
one we did a ton of different stuff. We can just do Commute Distance, and we can say, what do
we want to say on this one? What is this? Oh, this is the count. Do we have to keep this one?
No, there we go, I'm just going to do just one and say Commute Distance. And let's add a
chart title. Let's say Distance Per Customer. That's not 100% true, because it's no or yes,
that's the important part of this. It's distance, average distance... let's see, we'll just
say Customer Commute. All right, and we'll keep it just like that. Perfect. Let me see,
I don't think there's anything else we need to add on that one.
down here we're going to create our very last one uh we only had three so you
last one uh we only had three so you know sometimes you'll have a ton
know sometimes you'll have a ton sometimes you'll have like one on each
sometimes you'll have like one on each sheet and you'll create multiple sheets
sheet and you'll create multiple sheets but um do contr a um now we have our
but um do contr a um now we have our thing now this one we're going to be
thing now this one we're going to be looking at these age brackets that we
looking at these age brackets that we were looking at that we created um
were looking at that we created um something that I do honestly a lot is is
something that I do honestly a lot is is kind of bracket things in into groups
kind of bracket things in into groups like this and you know for this I'm just
like this and you know for this I'm just kind of made them up but you know it's
kind of made them up but you know it's good to know how to do this because I I
good to know how to do this because I I promise you this one happens a lot or I
promise you this one happens a lot or I use this one a ton and then we just want
use this one a ton and then we just want to look at who purchased a bike uh so
to look at who purchased a bike uh so the same thing as we did before so like
the same thing as we did before so like purchase a bike count of the purchase um
purchase a bike count of the purchase um you know pretty easy so we just have to
you know pretty easy so we just have to count of either no or yes for these age
count of either no or yes for these age ranges um and let's go to the insert
ranges um and let's go to the insert we'll go to
we'll go to recommendation um I personally like a
recommendation um I personally like a good line for this one um so
good line for this one um so let's this is already interesting we
let's this is already interesting we could do something like
could do something like this that's nice see this one versus
this that's nice see this one versus this it just adds a dot it looks nice
this it just adds a dot it looks nice we'll keep that one
we'll keep that one um so just really quick at a glance
um so just really quick at a glance really interesting people under the age
really interesting people under the age of 30 are not buying that many bikes um
of 30 are not buying that many bikes um age 30 to
age 30 to 54 uh 31 to 54 buying a ton of bikes uh
54 uh 31 to 54 buying a ton of bikes uh they buy more bikes or look at bikes
they buy more bikes or look at bikes more than anybody really interesting um
more than anybody really interesting um but yeah we'll make the dashboard in a
but yeah we'll make the dashboard in a little bit um let's make these chart
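A count-by-bracket pivot like this one can be sketched in a few lines of Python. The people list below is invented purely for illustration, so the counts say nothing about the real data set; the age_bracket helper assumes the same cutoffs used for the age brackets column (30 and below, 31 to 54, 55 and up).

```python
from collections import Counter

# Hypothetical (age, purchased_bike) pairs. Not the real data set.
people = [
    (25, "No"), (28, "No"), (29, "Yes"),
    (35, "Yes"), (42, "Yes"), (47, "No"),
    (58, "No"), (61, "Yes"),
]


def age_bracket(age: int) -> str:
    """Assumed cutoffs: 30 and below, 31 to 54, 55 and up."""
    if age > 54:
        return "Old"
    if age >= 31:
        return "Middle Age"
    return "Adolescent"


# Count rows per (bracket, purchased) pair, like a pivot with a count in Values.
counts = Counter((age_bracket(age), purchased) for age, purchased in people)

for bracket in ("Adolescent", "Middle Age", "Old"):
    print(bracket, "No:", counts[(bracket, "No")], "Yes:", counts[(bracket, "Yes")])
```

Grouping first and counting second is what keeps the chart down to three points per line instead of one point for every individual age.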
Let's make these titles. We'll do the vert... oops, the horizontal; we'll just call this Age
Bracket. And then we'll add a chart title. Again, you can add some extra stuff if you want
to, but you don't need to. None of this other stuff we really need; I'm just looking at the
stuff we do need or do want. So what do we want to call this one? Let's call it Customer Age
Brackets. It's not perfect, but we'll keep it as is.
For comparison, let me see if I can use this real quick: instead of the age brackets, I'm
going to get rid of this and use the age. And then let's go to Insert, Recommended Charts,
we'll use a line, and we'll use this. So this compared to this: just think of it like, if a
customer, or not a customer, if somebody you're working with is trying to use this dashboard
to understand it, this is going to, I don't know, it might melt their brain. It just makes
no sense. Well, it makes sense, it's just all over the place, and it's really hard to make
sense of. I mean, you can kind of see a pattern going up around the mid-30s and then it
trends downward, but it's hard to see, it really is. So doing these brackets really helps.
And you can even add, you know, adolescent, 0 to 30, underneath it. In fact we may want to do
that. Why not? Let's do that. Oh, whoops. So why don't we go back, I'm doing this on the fly.
What am I doing? Whoops. And this is all calculated, but let's do Adolescent 0 to 30, let's
do Middle Aged 31 through 54, and then Old 55 Plus. Let's see if this breaks anything, I hope
it doesn't. We'll go back to our pivot table and refresh the data. OK, it did mess with
stuff. OK, never mind, guys, that was a terrible idea, don't do that. Let's get rid of that.
I'm glad we tested it out, though; I like to see if it was going to work. No, it messed with
the order of things. I intentionally named them adolescent, middle age, and old because it
makes sense for the visualization, but if I change something and it messes with it, I'm not
going to mess with it. It was just an idea on the fly, guys, come on.
All right, so let's start building out our dashboard now. When we're building our dashboard,
what I personally like to do is have this pivot table sheet, and then I will copy the charts
over, and later we'll hide these other sheets, and I'll explain that in a little bit. But
I like to have this one for us. So we're going to copy this; I just click on it, hit Ctrl+C,
and we're going to paste it right over here. Let's just make them small for now. Oh gosh, no,
let's not do that. Oh, these look terrible. OK, anyways, let's copy this one over. Oops. OK,
what did I just do? Oh, I didn't copy this one. Whoops, it's not copying. OK, we're going to
go copy, hit paste. Fantastic. Oops. Guys, look away, this is tough to watch. This is tough
for me to watch and I'm the one doing it. All right, let's go to this last one, I'm going to
try it again. All right, it worked this time. So now we have our three visualizations. This
is perfect, but now we actually want to create a dashboard. How do you do that? How do you
make it look nice? And then we're going to add some filters and stuff like that. What
happened here? What changed? What did we do? Oh my goodness gracious. All right, let's copy
this, let's paste this, let's get rid of this. I don't even know how that happened, I've
never seen that before, that was wild. Excel is trying to destroy my whole video. I mean,
I'm doing this for you, Excel. Good night.
all what we're going to do and how you
make this at least look nice um first
off we can get rid of these gridlines
pretty easily and I recommend that you
do that when you make a dashboard it just
makes it look cleaner makes it look like
an actual dashboard um let's go to View
and Gridlines so we can get rid of
these gridlines it just makes it look
nicer um we're going to make you know we
can choose any color here I'm just
going to choose a
color I like this and we're
basically creating like a header right
if you're using like Tableau or
something um we're going to merge and
center so it takes every single cell
that we have highlighted and makes it into
one let's call this um bike sales uh I
have I think I called it bike sales
dashboard let's just call it that um you
know see what happens let's get that
let's make it white and make it much
larger than it
is okay okay
um sure let's do that doesn't look bad
um what is it doing there we go uh let's
center that perfect um it's not
perfect but we're going to use it all
right so now we kind of want to organize
these and you know everybody has their
different way of doing it uh I'm just
going to start building it out myself
and just see how it
looks uh and then we'll go from there I
like this one there um we can
put this one this one's kind of a
longer one so I'll probably put it at
the bottom let's see how it
looks um but we'll put this one right
here try to line it up geez let's
zoom in a little bit let's try to line
this up see what it looks
like let's extend it to the
end that doesn't look too bad uh needs
to move up just a hair and I'll show you
how to kind of align these in a second
but um that looks not bad and we'll kind
of try to align these as well let me
zoom out and extend this the length of
this just to make it look nice um you
know now what you can do and you know
this is something that's pretty simple
is you can select both of these and we're
going to go to Shape Format and we can
just align these it's really nice to
align especially like the top and
maybe like the left to right but
we're going to align these to the top
and they just kind of align themselves
on the very top now these look much
better this one is a larger
visualization so I'm going to
keep it how it is um and I'm going to
keep this one how it is so it is going
to be a little bit smaller as you can
tell and then we'll have this one um and
I'm going to do that
um this is going to bother me if I
don't align these so let me do this
Shape Format align to the
right and it's not exactly what I wanted
to happen
because oh jeez what am I doing that's
not exactly what I wanted to happen I
actually wanted this one to align
with this one it did
the opposite um so let me just scoot
this back all right visually looks fine
but that's how you do it if you want to
do it um if you have multiple of
them like this it can make it look
bad so we have our dashboard this is
already looking really good I like how
this looks colors are coordinated we
have kind of a theme throughout um and
it looks nice I actually kind
of want to change this one um
to
um let's
see maybe if I did it like that it'd look
nicer than all of them yeah this does
look nicer um it doesn't change much
either guys should I do it all right
we're going for it we're changing the
design on the fly should I do it for all
of
them let's
see it doesn't fit doesn't fit um all
right guys just ignore what I'm doing uh
don't do any of this I'm just messing
around at this point so this is really
great to have it really is and what we
want to do is there are other elements
there are other things that people would
like to be able to filter by and be able
to look at but it's not in this
visualization um to be more specific one
field that could be really interesting
is married versus single are single
people buying more or um married people
buying more you know it'd be nice to
filter on it so we're going to click on
uh any of these actually and we're going
to go up to PivotChart Analyze and
we'll click Insert Slicer now we can
choose which ones we want to be able to
filter on all at the same time or one at
a time I'm just going to do the first
one by itself and then I'll show you how
to do other ones um but this one is the
marital status so this is the married
single the one we were just looking at
and we can drag this right over
here bring it in a little
bit all right and we don't need all that
space so we're going to boop boop boop
boop all the way up
now while we're doing this um
because we selected this uh this
visualization it only is working on that
one right now we of course want it to
apply to all of them it's not hard to do
all we're going to do is we're going to
click on we're going to make sure we're
clicking on this we're going to go up to
Slicer we're going to hit Report
Connections um and if you remember we
have this um this pivot table that we're
working with um and this is where all of
our pivots are coming from so we're
going to actually apply it to all of
them this is our sheet uh and this is the
name of the pivot table now again we
created that fourth one we're not using
it but we're going to apply it to all of
them so now when we click on it it's
going to apply to all of them so at a
quick glance let's see what single
people are doing
um interesting interesting um you know
when I'm looking at just these
numbers right here married people these
individuals are making a lot more like
eight um sometimes eight to like 10,000
more on average than their single
counterparts um you know again that's a
rough estimate but it's interesting
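What Report Connections does can be sketched outside Excel: one slicer selection filters the rows once, and every connected pivot is recomputed from that same filtered set. The Python below is only an illustration under assumptions. The field names (`marital_status`, `gender`, `income`, `purchased`), the sample values, and the helper names are made up for the sketch and are not taken from the workbook.

```python
from statistics import mean

# Hypothetical rows standing in for the bike sales sheet (values are made up).
rows = [
    {"marital_status": "Married", "gender": "Male", "income": 80000, "purchased": "Yes"},
    {"marital_status": "Married", "gender": "Female", "income": 70000, "purchased": "No"},
    {"marital_status": "Single", "gender": "Male", "income": 60000, "purchased": "Yes"},
    {"marital_status": "Single", "gender": "Female", "income": 50000, "purchased": "Yes"},
]


def avg_income_by_gender(data):
    """One 'pivot': average income per gender."""
    groups = {}
    for row in data:
        groups.setdefault(row["gender"], []).append(row["income"])
    return {gender: mean(values) for gender, values in groups.items()}


def purchase_counts(data):
    """Another 'pivot': number of rows per purchase outcome."""
    counts = {}
    for row in data:
        counts[row["purchased"]] = counts.get(row["purchased"], 0) + 1
    return counts


# Stand-in for Report Connections: every pivot listed here reacts to the slicer.
connected_pivots = {"avg_income": avg_income_by_gender, "purchases": purchase_counts}


def apply_slicer(data, field, selected):
    """Filter the rows once, then recompute every connected pivot from them."""
    filtered = [row for row in data if row[field] in selected]
    return {name: pivot(filtered) for name, pivot in connected_pivots.items()}


single_view = apply_slicer(rows, "marital_status", {"Single"})
print(single_view)
```

A pivot that is left out of `connected_pivots` would keep showing unfiltered numbers, which is the behavior seen before the slicer was connected to every pivot table.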
so now what we can do is we're going to
create more of these so we're going to
go to uh PivotChart Analyze we're going
to go to Slicer now we already did
marital status but what if we want to
look at things like uh region and maybe
something like their education so let's
bring up both of those and look now two
of them come up so let's add the region
right here
we'll bring that in just a little bit
see if we can match it nailed it all
right now we're going to put that up
we'll bring this one
down just like this bring it over see if
I can match it again come
on almost nailed it I don't know if I
nailed it but it's close all right kind
of bring this up a little bit bring this
up and we have to do the exact same
thing that we did with this one because
right now again it only applies to that
one um chart so what we want to do is we
want to go to Slicer Report Connections
add it to all of them okay do the same
thing with education Report Connections bada
bing bada boom we are looking good and
now uh let's get rid of all of them it's
just going to be everybody so now we can
kind of slice and dice and choose what
we want we want to look at people who
have a bachelor's degree who live in
Europe and are single
and this is the information that we have
on those people so now we can narrow it
down by certain demographics even
further and look at this key information
so we may not you know look at counts
and averages of these things but we're
able to filter on them uh and that's
really great to know so people with bachelor's
degrees on average are making 60 to 70,000
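Stacking several slicers amounts to an AND across fields: a row stays only if it matches the selection in every slicer, and a cleared slicer lets everybody through. Here is a minimal Python sketch of that idea. The field names, category labels, and income values are hypothetical and only mirror the kind of selection described (bachelor's degree, Europe, single).

```python
from statistics import mean

# Hypothetical rows; the labels and incomes are illustrative, not workbook data.
rows = [
    {"education": "Bachelors", "region": "Europe", "marital_status": "Single", "income": 60000},
    {"education": "Bachelors", "region": "Europe", "marital_status": "Single", "income": 70000},
    {"education": "Bachelors", "region": "Europe", "marital_status": "Married", "income": 80000},
    {"education": "Graduate Degree", "region": "Pacific", "marital_status": "Single", "income": 90000},
]


def slice_rows(data, selections):
    """Keep rows that match every slicer; an empty selection means no filter."""
    return [
        row
        for row in data
        if all(not chosen or row[field] in chosen for field, chosen in selections.items())
    ]


def average_income(data):
    """Average of the income field over the given rows."""
    return mean([row["income"] for row in data])


selections = {
    "education": {"Bachelors"},
    "region": {"Europe"},
    "marital_status": {"Single"},
}
subset = slice_rows(rows, selections)
print(len(subset), average_income(subset))

# Clearing every slicer brings back the full data set.
everybody = slice_rows(rows, {"education": set(), "region": set(), "marital_status": set()})
print(len(everybody))
```

Within a single slicer, picking two buttons would act as an OR (for example `{"Bachelors", "Graduate Degree"}`), while separate slicers combine as an AND.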
um let's look at
graduate
degrees okay a little
more um but you know again I'm just
looking at random stuff um but you can
mess around with this take a look at
some stuff um this to me I want to make
this color darker I feel like it'd look
nicer darker there we go oh yeah that's
way better this to me is a good
dashboard right you have key information
that you're looking at nice
visualizations it's color coordinated
you have these slicers on the side um to
me this is a fantastic just
simple dashboard and there are so many
other things that you can do with this
data and you can make it unique and you
can add your own spin on it and I highly
recommend that you do that push yourself
go past what we just did today and add
your own stuff and use this and then
you can add this to your portfolio
website and show this off and show
people that you know how to use Excel
which is a fantastic thing to know how
to use and show off so with that being
said I hope that this project was
helpful I hope that you learned
something along the way I know I did um
I was learning things as we were going
and I hope that you didn't mind that I
took some detours along the way um for
your amusement as well as my learning uh
so with that being said thank you so
much for joining me I really appreciate
it I hope you have a good day and
[Music]
goodbye
what's going on everybody
welcome back to another video today we
are starting our Tableau tutorial
[Music]
series now this series is for absolute
beginners so if you have never used
Tableau before you are in the perfect place
I'm going to take you all the way from
the very beginning of installing it and
just understanding what Tableau is and
how you can use it all the way to
creating dashboards and sharing them now
personally I hate those videos that are
like 3 hours long and they just expect
you to go through it uh I like to break
my videos up into chunks so if you have
ever done my SQL tutorials you'll
know that I like to break things up so
it gives you time to try them out and do
them yourself and then you can move on
to the next video so I'm going to be
breaking this up into five separate
videos but in this video I'm going to
show you how to install Tableau for free
I'm going to show you the user interface
we're going to download a data set that
you can find on Kaggle and then we will
build our first visualization together
with that being said let's jump over to my
screen and we'll get started all right
so the very first thing that we need to
do is you need to actually download
Tableau so we're not going to be using
Tableau we're going to be using a free
version called Tableau Public it has a
lot of the same features except of
course it's not uh every single feature
that regular Tableau has but it is
absolutely perfect for learning it and
for using it and you can even build
um you know dashboards and share those
for your
portfolio um I'm going to put this link
in the description so you can just go
and click on that and all you have
to do is input your email right here
we're going to click download the app um
and then it should start to download and
then you can save that and then you're
going to open this up now I'm going to
open it up I don't know what it's going
to do I already have it downloaded um
but it should open up and look hopefully
like what you're seeing on my screen in
just a second let's see what it does um I
hope you can see this but it says
Tableau Public um it says I already have
it set up but you're going to click
install and go through all that um all
that setup stuff uh so I'm going to exit
out of here but I'm going to go over
here and type in Tableau Public uh and
it's 2021.3 that's the current version
that they have out if you're doing this
in the future they may have you know
different versions um so you should be
able to pull this up right here now um
I'm going to go and get our data set
that we're going to be using and I'm
going to show you how to get that as
well and then we will actually jump into
Tableau and start uh using it so let's
go over here I'm going to get a data set
from Kaggle I wanted something pretty
generic uh to show you in future videos
I'm going to show you some special or
not special but just different
visualizations that you might use um and
we'll get different data sets for those
because of course no one data set
covers all these other types of
visualizations so um we're starting off
pretty simple right here we're going to
be getting one called video game sales
um and we can take a really quick look
at it um here are some of the fields
that you're going to be having uh like
rank name platform the year genre and
then some sales data and this is what it
actually looks like it's called vgsales
so video game sales it's a
CSV and um you know here are the fields
and we have our data and all we are
going to do is we're going to download
that and I will save it now when you
download it it's going to be saved into
a zip file so we need to go to our
downloads uh let's refresh this here's
our archive we need to go in here you
can just copy it and paste it
right back into here um and just so you
know that is a uh a CSV so be aware of
that so what we want to do is we want to
come in here now since it is a CSV
we're not going to be using
Microsoft Excel we're going to be using
the text file so we'll come in here
we'll take vgsales now uh one thing I
want to do before I do that is I'm going
to rename mine uh
vgsales_1 um I've already prepared for this
and so I already have that in there um
so I want to make a distinct one for
myself you do not have to do that so
we'll come back here um and then we're
going to do text file and vgsales we're
going to open that
up
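The download, unzip, and open steps can also be scripted. The Python sketch below reads a CSV straight out of a zip archive using only the standard library. It is self-contained, so it builds a tiny stand-in archive first. The names `archive.zip` and `vgsales.csv` and the sample row are assumptions for illustration, not the real Kaggle download.

```python
import csv
import io
import os
import tempfile
import zipfile


def read_csv_from_zip(zip_path, member):
    """Return the rows of one CSV file stored inside a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Wrap the binary member in a text layer so csv can parse it.
        with io.TextIOWrapper(archive.open(member), encoding="utf-8", newline="") as text:
            return list(csv.DictReader(text))


# Build a small stand-in archive so the example runs without any download.
with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "archive.zip")  # assumed file name
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(
            "vgsales.csv",  # assumed member name
            "Rank,Name,Genre,Global_Sales\n1,Example Game,Sports,1.5\n",
        )
    game_rows = read_csv_from_zip(zip_path, "vgsales.csv")

print(game_rows[0]["Name"], game_rows[0]["Global_Sales"])
```

Note that `csv.DictReader` returns every value as text, so deciding which fields are numbers is a separate step.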
and when it pulls up right here um you
can bring in other tables and then you
can start to join them together and
create those relationships we are not
going to be doing that in this video
we'll do that in a separate one um as
for you know just getting started you
know we're not going to be using that
but you can see some of these things or
some of these fields and if you notice
they um they're either Abc or
they're a number so it starts to
categorize what this field type is so is
it a string is it numeric it starts to
automatically do that and that's all
done within
Tableau and so it just kind of reads it
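How Tableau decides between text and number is internal to the tool and not described here, but the general idea of reading a column and guessing its type can be sketched in Python. The rule below (a column is a number only if every non-empty value parses as a float) and the sample values are assumptions for illustration.

```python
def infer_field_type(values):
    """Label a column 'number' if every non-empty value parses as a float, else 'string'."""
    non_empty = [value for value in values if value.strip()]
    if not non_empty:
        return "string"
    try:
        for value in non_empty:
            float(value)
    except ValueError:
        return "string"
    return "number"


# Hypothetical columns as raw text, the way a CSV reader would hand them over.
columns = {
    "Name": ["Game A", "Game B"],
    "Rank": ["1", "2"],
    "Global_Sales": ["2.5", "1.0"],
}
field_types = {name: infer_field_type(values) for name, values in columns.items()}
print(field_types)
```

One stray text value, such as a missing-value marker, is enough for this rule to call the whole column a string, which is a common reason a numeric-looking field shows up as text.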
and that's what it does um what we're going
to do is I'm going to click right down
here it's called go to worksheet um the
worksheets are where you're going to
actually start being able to build your
visualizations your charts your graphs
all these things um and so you know we
have this in here now and so we're just
going to click right here on go to
worksheet as you can see here is
vgsales_1 you will not have the underscore
one if you did not add that like I did
uh but right down here you can see all
the fields that we just imported from
that data set and they even created one
right here for us uh they just generated
that field uh based on the file so it's a
count of all the rows really so what I'm
going to do is I'm just going to walk
you through uh basically what we're
looking at some of the things that we're
going to be using today there will be
things that I don't talk about but I'm
going to highlight those in future
videos when we start using those or
going over them um and so let's just
start with the most obvious one it's way
over here I'm sure you saw it when
this first came up on the screen because
it has all these different charts and
visualizations and graphs and uh these
will become available as you start
dragging and dropping your data into this
sheet and so if I go right here it says
for scatter plots try zero or more
dimensions two to four measures so what
our dimensions are are right here what
our measures are are right down here and
so typically uh things like you say
genre or names or strings like that
are going to be these uh dimensions and
then a lot of times the numerical
is going to be our measures
next what I want to show you is right
here so you can take something like
Global Sales and you can drag it right
here into your rows and then it takes
your rows and so it automatically
created a sum of Global Sales now if we
take that away and let's say we drag it
right here it's going to give us a
column
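The dimension versus measure split and the automatic sum can be sketched in a few lines of Python. This follows the rule of thumb from the walkthrough (text fields are typically dimensions, numeric fields are typically measures), not Tableau's actual logic, and the rows and field names are made up for illustration.

```python
# Hypothetical rows; field names and values are illustrative only.
rows = [
    {"Name": "Game A", "Genre": "Sports", "Global_Sales": 2.5},
    {"Name": "Game B", "Genre": "Action", "Global_Sales": 1.0},
    {"Name": "Game C", "Genre": "Sports", "Global_Sales": 0.5},
]


def split_fields(data):
    """Rule of thumb: numeric fields are measures, everything else is a dimension."""
    dimensions, measures = [], []
    for field, value in data[0].items():
        if isinstance(value, (int, float)):
            measures.append(field)
        else:
            dimensions.append(field)
    return dimensions, measures


dimensions, measures = split_fields(rows)
print(dimensions, measures)

# Dropping a measure on a shelf with nothing else aggregates it over every row.
total_global_sales = sum(row["Global_Sales"] for row in rows)
print(total_global_sales)
```

With no dimension in the view there is nothing to group by, so the sum collapses to a single number, which is why a lone measure draws just one bar.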
column now you can also do it right up here you
now you can also do it right up here you don't have to um drag it on screen you
don't have to um drag it on screen you can
can also just add it to the column or the
also just add it to the column or the row that's typically what I do I it's
row that's typically what I do I it's just more intuitive to me um or you can
just more intuitive to me um or you can drop it in this section right here and
drop it in this section right here and it does its best to assign it some type
it does its best to assign it some type of um some type of visualization and so
of um some type of visualization and so that's what it always is trying to do it
that's what it always is trying to do it is trying to say okay this is what
is trying to say okay this is what you're trying to do let me try to to get
you're trying to do let me try to to get the best visualization for the data that
the best visualization for the data that you're giving me now while we are here
you're giving me now while we are here um it went down here into marks and
um it went down here into marks and marks is a very important area it's
marks is a very important area it's where you can add color size text detail
where you can add color size text detail and Tool tip and I'm not going to go
and Tool tip and I'm not going to go into what all those are cuz I'm just
into what all those are cuz I'm just going to show you so let's start pulling
going to show you so let's start pulling some fields in here and creating a
some fields in here and creating a visualization and then I'm going to show
visualization and then I'm going to show you how all of that works including
you how all of that works including filters as well so the first thing that
filters as well so the first thing that we are going to look at is global save
we are going to look at is global save and let's put that in the rows and then
and let's put that in the rows and then I'm going to take year and I'm going to
I'm going to take year and I'm going to make that the column and this is
make that the column and this is basically exactly what uh I wanted to do
basically exactly what uh I wanted to do now as of right now it has only the year
now as of right now it has only the year and it's looking at Global sales for
and it's looking at Global sales for everything but we want to break that out
everything but we want to break that out a little bit better I Want to Break It
a little bit better I Want to Break It Out by let's do genre so different genre
Out by let's do genre so different genre of games now if I add that right here to
of games now if I add that right here to this column s it is going to break it up
this column s it is going to break it up by year and genre if I add it right here
by year and genre if I add it right here is going to break it out by the year of
is going to break it out by the year of course but then in each individual row
course but then in each individual row has the different genre that's not what
has the different genre that's not what we want we want to keep this type of
we want we want to keep this type of line graph uh and what we're going to do
line graph uh and what we're going to do is we're going to add it to
is we're going to add it to Marks and you can't really see it based
Marks and you can't really see it based off of these colors but they're all
off of these colors but they're all different so we have action J genre we
different so we have action J genre we have the sports genre racing uh role
have the sports genre racing uh role playing all these different genres
playing all these different genres within it now we can get rid of that cuz
within it now we can get rid of that cuz we don't need it
we don't need it anymore uh and this is where these U
And this is where these marks really come in handy, because you can start basically doing
what you want with them. For the genre, I want to be able to see all these different genres
with different colors; to me that just makes the most sense. So I'm going to put it on Color
right here, and automatically it assigns every single genre its own color and gives us this
legend right over here. So it's really easy to see (well, when you have a smaller number of
them it's much easier), but I know that red is sports, and I can go right here and find red,
and that is sports. It makes it a lot easier than when it is all the same color blue.
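If it helps to see what that sum of Global Sales per year and genre amounts to, here is a rough sketch in plain Python. The sample rows and the function name are made up for illustration; this is only the grouping idea, not how Tableau does it internally.

```python
from collections import defaultdict

# Hypothetical sample rows: (year, genre, global_sales in millions).
rows = [
    (2008, "Action", 10.0),
    (2008, "Sports", 7.5),
    (2008, "Action", 2.5),
    (2009, "Sports", 4.0),
    (2009, "Action", 6.0),
]


def sum_sales_by_year_and_genre(rows):
    """Return {genre: {year: total_sales}}, i.e. one 'line' per genre."""
    totals = defaultdict(lambda: defaultdict(float))
    for year, genre, sales in rows:
        totals[genre][year] += sales
    # Convert to plain dicts so the result prints and compares cleanly.
    return {genre: dict(by_year) for genre, by_year in totals.items()}


lines = sum_sales_by_year_and_genre(rows)
for genre, by_year in sorted(lines.items()):
    print(genre, sorted(by_year.items()))
```

Each genre ends up with its own year-to-total mapping, which is roughly what one colored line on the chart represents.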
when it is all the same color blue so what you can do after that is you can
what you can do after that is you can also add things like uh a label to it so
also add things like uh a label to it so if we take label and we or we take genre
if we take label and we or we take genre put label you can click right here and
put label you can click right here and you can get rid of the labels that you
you can get rid of the labels that you have and you can see them right down
have and you can see them right down here or you can also change uh the font
here or you can also change uh the font so if you want to make it orange or or
so if you want to make it orange or or whatever color you can do all those same
whatever color you can do all those same things and you can also do things like
things and you can also do things like changing where you see these things so
changing where you see these things so for Action you're going to see it a ton
for Action you're going to see it a ton because for each year action is is at
because for each year action is is at the is on the higher end and so you're
the is on the higher end and so you're seeing those in those mins and Maxes you
seeing those in those mins and Maxes you can also do it for a selected area so if
can also do it for a selected area so if I come in here and I select it it's then
I come in here and I select it it's then going to show me what those are so label
going to show me what those are so label is really really uh useful really
is really really uh useful really helpful let me get rid of that really
helpful let me get rid of that really quick uh you can also do it where the
quick uh you can also do it where the lines end so line ends is at the
lines end so line ends is at the beginning and the end and you can also
beginning and the end and you can also take that away or put that back on so
take that away or put that back on so labels are really important labels
labels are really important labels aren't very helpful when you're doing at
aren't very helpful when you're doing at least I don't find that it's super
least I don't find that it's super helpful when you're doing things like
helpful when you're doing things like genre so when you're doing your
genre so when you're doing your Dimensions so I'm going to get rid of
So I'm going to get rid of that, and I'm actually going to bring our Global Sales over here,
and let's label that. Right now I think it's labeling the line ends; we want to do the min
and max. Now if we do min and max on the table, it's just going to give us the max and the
min, which is zero and then 139.0. It's a little bit more useful if we do it for each line.
This at least gives us some context. I probably wouldn't do this in an actual visualization,
but it gives you some understanding of how it works. So now I know that, right over here,
the max for action and for sports is right around 138, 139, so it's pretty easy to see.
And you can again go in here and remove the max or remove the mins, whichever one you feel
is best. You'll probably keep the maximums in there for each category. So this is really
quickly becoming a pretty usable visualization.
That's not the only label that you can add. We still are using Year over here, so we can
always drop Year in there as well and create a label. So now, let's see, this one is the
puzzle genre, and we also have the year that it had the maximum sales. Just some things that
you can do; you don't have to add that.
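To make the "max for the table" versus "max for each line" difference a bit more concrete, here is a small Python sketch. The numbers are invented for illustration (loosely echoing the 138 and 139 mentioned above), and the function name is hypothetical.

```python
# Hypothetical yearly totals per genre, in millions: {genre: {year: sales}}.
lines = {
    "Action": {2008: 136.0, 2009: 139.0, 2010: 117.5},
    "Sports": {2008: 95.0, 2009: 138.0, 2010: 92.0},
    "Puzzle": {2008: 15.5, 2009: 20.0, 2010: 11.0},
}


def peak_per_line(lines):
    """For each genre, return the (year, sales) point with the highest sales."""
    return {
        genre: max(by_year.items(), key=lambda point: point[1])
        for genre, by_year in lines.items()
    }


# Scope "each line": one labeled maximum per genre, with its year.
peaks = peak_per_line(lines)
for genre, (year, sales) in peaks.items():
    print(f"{genre}: max {sales} in {year}")

# Scope "table": a single maximum across every genre and year.
table_max = max(sales for by_year in lines.values() for sales in by_year.values())
print("table max:", table_max)
```

The per-line version is the one that tells you something about each genre, which matches why it reads as more useful on the chart.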
Now let's go up here and take a look at filters, because filters are really important.
If you are making this for a client, or you're making this for somebody, you want them to be
able to filter down to the very specific information that they want to see. So let's take
the platform. There are lots of different platforms, as you can see, PS4, Xbox, if you're
familiar with these. We'll click all of these and click OK. So now this is an option as a
filter, and all we're going to do is click on this arrow right here and say Show Filter.
Right now all of them are selected, so every single one is being taken into account for this
visualization. But let's say we come down here and say, okay, I don't want to see sales for
any of these: PS, the original, or PlayStation 2, 3, or 4. So I'm going to get rid of this
one, this one, this one, and this one, and you can immediately see the changes happening.
Now none of those sales are being accounted for and added to the sum of Global Sales right
here at all.
So that is just how a filter can work. You can also get rid of all of them and go in and
pick very specific ones, so if you only want to see the PlayStation sales, you can go in
there and do that as well. Really, really handy. Filters are things that you at least want
to have as an option for most of your visualizations, at least that's what I've found,
especially when you're doing client-facing work. They like to get in there and mess around
and look at it in different ways, so that's one that I think is really useful to have.
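As a rough picture of what the platform filter is doing to the sum, here is a minimal Python sketch. The platform codes and sales figures are placeholders I made up, and `total_sales` is a hypothetical helper, not anything from Tableau.

```python
# Hypothetical sample rows: (platform, global_sales in millions).
rows = [
    ("PS", 3.0),
    ("PS2", 5.0),
    ("PS3", 4.0),
    ("PS4", 2.0),
    ("X360", 6.0),
    ("Wii", 8.0),
]


def total_sales(rows, selected_platforms):
    """Sum sales only for rows whose platform is ticked in the filter."""
    return sum(sales for platform, sales in rows if platform in selected_platforms)


all_platforms = {platform for platform, _ in rows}
playstation = {"PS", "PS2", "PS3", "PS4"}

everything = total_sales(rows, all_platforms)                # every box ticked
no_playstation = total_sales(rows, all_platforms - playstation)  # PlayStation unticked
only_playstation = total_sales(rows, playstation)            # only PlayStation ticked

print(everything, no_playstation, only_playstation)
```

Unticking a platform simply drops its rows before the sum is taken, so the totals on the chart shrink accordingly.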
The very last thing that we want to do is actually add this to a dashboard. Let's say we
come right down here and add a new worksheet. Actually, we might change one more thing on
that last one, but we'll just make a really simple one. We'll give it Genre, and we'll give
it Global Sales as the rows, and there's this nifty button right up here, which is a sorting
button, so I'm going to sort like that. I'm going to add the genre in just as we did and
give it different colors. Perfect.
Now we have two really quick, different visualizations, right? What I want to do is show you
how to combine those. What you are going to do is come in here and do New Dashboard; that's
what this button is right here. When we come in here, the size is extremely small. It's very
easy to fix that: all we're going to do is click right here, go to this dropdown, and click
Automatic. Now it is a much larger size for us to actually drop our visualizations into.
Let's put Sheet 1 in, and let's put it up top. So now it looks a little bit like this.
Not perfect, but again, if I wanted to make this look a lot better, I definitely would.
Then you can go over here and rename these things. You can also do that back in the actual
worksheets, but you can do it here as well, and then start customizing it and building it out.
That's not what this video is for. That is the last video, where we're going to build an
entire dashboard. It'll be kind of like a small project that you can put in your portfolio.
If you have gotten this far and you don't want to wait for these other videos to come out,
and you just want to jump straight into creating an entire portfolio project, I have an
entire portfolio project series that covers SQL, Python, and Tableau, so go check out that
series. I have one video dedicated to Tableau. It's like 45 minutes or an hour long, and it
covers a lot of the things that we're going to cover in here, as well as a few other things.
I appreciate you checking out this video. In future videos we're going to be going over
things like creating bins, calculated fields, doing joins, and then creating a final project
and putting it all together. So thank you so much for joining me, I really appreciate it.
If you liked this video, be sure to like and subscribe below, and I will see you in the
next video.
[Music]
What's going on, everybody? Welcome back to the Tableau tutorial series. In this video
we're going to be going over bins and calculated fields.
[Music]
All right, so let's jump right into it. The first thing that we're going to look at are bins.
Bins are basically just groupings or ranges of numerical values, so we cannot create bins
for Genre, Name, Platform, or anything like that. We have to use something with this sign
right here, which means that it is numeric: so Year, or all this sales data, or this ranking
data. We're going to use what we worked on in our very first tutorial, and what we're going
to use to demonstrate how bins work is this Year right down here. Right now we have a range
of 1993 all the way up to 2018, and we're going to create some bins to group and create
ranges for these years.
It's pretty simple. I'm going to come right over here to Year and this little dropdown on
the side, and we're going to go down to Create and then down to Bins. It's going to ask for
the size of bins, and it's going to give you a recommendation based off of the information
that is already provided: the min and the max, the range of these values. You don't have to
use it, but usually it does give a good estimate of what you might want to consider. If you
were thinking, hey, maybe do a bin size of like 20, and they're recommending two, think
about why they might be doing that. We're going to change ours to five. You can always
change what this field is going to be called; I'm just going to give it an old exclamation
point just to really spice things up here. So we're going to click OK, and as you can see,
it adds it right up here. It is no longer numeric; now it is categorical. So this is no
longer just 1, 2, 3, 4, 5; it's ranges, it's groups. We're going to get rid of this Year
really quick. Actually, let's keep it up there for a second and see what happens. We're
going to bring this up, and we'll get rid of this Year, and this is what it spits out for us.
Now, I did look at the data when I was prepping for this, and there are some nulls in the
years. So all we're going to do for this is go like this and exclude the nulls. It's
probably not something you should be doing if you're doing this for work, but this is for
demonstration purposes, so we can do whatever we want. As you can see, we now have these
ranges. This range starts at 1990, and it includes 1990 all the way up to 1994, and then
it's 1995 to 1999. Just really quickly we can tell that the years 2000 to 2004 were a huge,
huge group of years for game sales. These are the global sales for these video games.
So it is really helpful, it's very useful. You can do this on a lot of different
information: we could do this on the sales data, you can do this on age, you can do it on
years like we did. So really quickly, that is how bins work. I would say it's pretty
straightforward.
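If you want to see the binning logic spelled out, here is a small Python sketch of bins of size five on a year field. It assumes each bin is identified by its lower edge (so the 1990 bin covers 1990 through 1994, as described above); the rows and function names are made up for illustration.

```python
from collections import defaultdict

BIN_SIZE = 5  # the bin size chosen in the video

# Hypothetical sample rows: (year, global_sales in millions); None is a null year.
rows = [
    (1990, 1.0),
    (1994, 2.0),
    (1995, 3.0),
    (2003, 4.0),
    (2004, 5.0),
    (None, 9.0),
]


def bin_start(year, size=BIN_SIZE):
    """Return the lower edge of the bin that contains `year`."""
    return (year // size) * size


def sales_by_year_bin(rows, size=BIN_SIZE):
    """Sum sales per year bin, skipping rows with a null year."""
    totals = defaultdict(float)
    for year, sales in rows:
        if year is None:
            continue  # mirrors excluding the nulls in the video
        totals[bin_start(year, size)] += sales
    return dict(totals)


bins = sales_by_year_bin(rows)
for start in sorted(bins):
    print(f"{start}-{start + BIN_SIZE - 1}: {bins[start]}")
```

Once the years are mapped to a bin start, the field behaves like a category: you group by it and sum, the same way you would by genre.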
Now this is a perfect time to segue into the next part of the video, which is calculated
fields. Right over here on this left-hand side we see that the global sales, which are in
millions, go all the way up to 900 million, and we created these beautiful bins right down
here. But let's look within these, from 1999 to 2015, and see which of these has the highest
percentage. Of course it's going to be this one, but we can do something called a quick
table calculation. We'll create our own calculation later, I'll show you how to do that,
but we're going to do a quick table calculation, and we're going to do the percent of total.
So now we have these bins, and instead of just seeing the total amount of sales that they
had, we see the actual percentages based off these year ranges, which is really useful.
It's something that you could absolutely put in some real work that you do for a client.
Now really quick, just to show you something that you can do: if you hold Control and drag
this over here, you can actually save that calculation. So we can say Percentage of Global
Sales, and that actually saves it as a measure for us.
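The percent of total is simple arithmetic: each bin's sales divided by the sum across all bins. Here is a minimal Python sketch with made-up bin totals; the function name is hypothetical, and it assumes the total is taken over every bin shown.

```python
# Hypothetical totals per year bin, in millions: {bin_start: sales}.
bin_totals = {1995: 100.0, 2000: 250.0, 2005: 150.0}


def percent_of_total(totals):
    """Return each value as a percentage of the sum of all values."""
    grand_total = sum(totals.values())
    return {key: 100.0 * value / grand_total for key, value in totals.items()}


shares = percent_of_total(bin_totals)
for start, share in sorted(shares.items()):
    print(f"{start}: {share:.1f}%")
```

The shares always add up to 100, so the chart keeps its shape while the axis switches from millions to percentages.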
So that was a quick calculation, but let's look at how to actually create a calculated
field. If we do this right here, what is going to come up is just the Global Sales, and you
can do a lot of what you would basically do in Excel: multiplication, division, subtraction,
and a few other things. But we're going to keep it super, super simple today. All I'm going
to do is take Global Sales and subtract: I'm going to type an open bracket and say EU Sales,
and it autocompletes for me. I'm going to click OK, and it created Calculation 2. I'm going
to come in here and just name it Global Sales minus EU Sales, and let's drag this over.
These are different: one is a percentage, and one is in terms of sum, so I'm just going to
bring this in right here, and now we are comparing against the same thing. If we look at the
global sales, we have probably right around 950 million-ish in this 2000 to 2004 bin, and
for Global Sales minus the EU Sales we're looking at, you know, 650 million. So there is a
noticeable difference. This is just one of the ways that you can use calculated fields, to
show the difference between two numbers, or you can do more advanced calculations depending
on the data that you actually have.
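For anyone who thinks better in code, here is a rough Python sketch of that calculated field: subtract EU sales from global sales on each row, then sum both measures per year bin so they can be compared side by side. The rows and names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (year_bin, global_sales, eu_sales), in millions.
rows = [
    (2000, 40.0, 12.0),
    (2000, 30.0, 8.0),
    (2005, 25.0, 10.0),
]


def global_minus_eu(global_sales, eu_sales):
    """Row-level calculation: global sales minus EU sales."""
    return global_sales - eu_sales


def compare_by_bin(rows):
    """Return {bin: (sum of global sales, sum of global minus EU sales)}."""
    global_total = defaultdict(float)
    non_eu_total = defaultdict(float)
    for year_bin, global_sales, eu_sales in rows:
        global_total[year_bin] += global_sales
        non_eu_total[year_bin] += global_minus_eu(global_sales, eu_sales)
    return {b: (global_total[b], non_eu_total[b]) for b in global_total}


comparison = compare_by_bin(rows)
for year_bin, (total, non_eu) in sorted(comparison.items()):
    print(year_bin, total, non_eu)
```

Because both measures are sums in the same units, the gap between them in each bin is exactly the EU sales for that bin.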
So that's it for this video. I hope you learned a little bit more about bins and calculated
fields. In the next video we're going to be looking at a ton of different visualizations and
graphs and charts, and just exploring what options really are out there for visualizing our
data. Thank you guys so much for joining me, I really appreciate it. If you liked this
video, be sure to like and subscribe below, and I will see you in the next video.
[Music]
What's going on, everybody? Welcome back to the Tableau tutorial series. In this video
we're going to be looking at lots of different visualizations, including the scatter plot
and density maps.
[Music]
Now, before we jump into the tutorial, I have some very exciting news. In just two days,
on October 7th, I'm going to be partnering with Alteryx to host a webinar. This webinar is
completely for people who are wanting to change careers to become a data analyst. Now, you
did hear that right: I will be the host of the event, but we will be bringing on guests as
well, who are industry experts who actually changed careers to become data analysts, much
like myself. They'll be sharing their stories of how they actually transitioned careers,
along with the tools that they found extremely useful and helpful to make that switch, and
they'll be giving lots of advice along the way. So if you are somebody who is wanting to
change careers to become a data analyst, or just wanting to learn about data analytics,
this is an absolutely fantastic place to learn a lot more about that. I will leave a link in
the description, so be sure to go and sign up for that. Again, I'm going to be there, so it
should be really fun. Without further ado, let's jump onto my screen and start the tutorial.
Now we are about to look at a ton of different visualizations. Over here you can see just an
array of them, but not all of them are ones that I actually think are useful or ones that I
would actually recommend using. So I'm going to take you through some of the ones that I
absolutely think are worth learning and using and trying out, and I'm just going to show you
how I might use them, how they might look, and how you can navigate them a little bit.
you can navigate them a little bit now before we do that we do need to go
before we do that we do need to go download one data set it's this
download one data set it's this Starbucks location worldwide yes we're
Starbucks location worldwide yes we're going to do a little bit of longitude
going to do a little bit of longitude latitude here and all we have to do is
latitude here and all we have to do is click this downloads button and it will
click this downloads button and it will download we're going to do that into
download we're going to do that into downloads we'll save that uh yeah I've
downloads we'll save that uh yeah I've already done that but you know I'm doing
already done that but you know I'm doing this with you guys I'm doing it for you
this with you guys I'm doing it for you so let's go to our
so let's go to our downloads now we have have here we want
downloads now we have have here we want to come in here we're going to copy it
to come in here we're going to copy it or um you can cut it and then we're
or um you can cut it and then we're going to paste it here yeah replace it
going to paste it here yeah replace it perfect and now we have it ready to go
We'll come in here. Let's do a new sheet. I already have it in there, but I'm
just going to show you what I would do: New Data Source, we'll do Text File,
we'll do directory, and we will open it. Let's see what data we have in here
before we actually begin, just super quickly. We have the brand, so whatever
company has it, and then a bunch of location information: street address, city,
the state. This is all in the United States, so that's basically it. What we are
going to do is go over to this Sheet 3, and we have this directory 2. That's the
one I just pulled in, exact same thing as directory.
So the first visualization that we are going to look at is a bar and line graph.
What we're going to take is the Year right here, take these Global Sales and
these NA Sales, and we're going to be doing this one right here. This has a
combination of two separate types of visualizations. Sometimes you just have
lines, sometimes you just have these bar graphs or bar charts, and here we're
combining the two. It's very nice, I like how this looks. Now, if you notice, if
I put this NA Sales behind it, it kind of cuts off, so now this Global Sales is
in front. We're going to put that back, I just wanted to show you that.

Right here there's All, SUM(Global Sales), and SUM(NA Sales). If we go into this
All and click this dropdown, we can change it to a line, we can change it to
basically whatever we want. I just hit Ctrl+Z to reverse that. But what we can
do is go in here and change this color. Let's see if we can just make it red. Is
that possible? See what I did, I made it orange. That works for me, just
something to stick out a little bit more. Choose whatever color you want.

This is a really nice visualization, and it's one that I have used in the past.
We're looking at Global Sales versus NA Sales, so it's very easy to see the
distinction between the two and how one was doing in a specific year versus how
the other one was doing in that same year. So I really like this. If you want to
do something like keeping it consistent, you can do two bars. I don't really
like this one as much. And again, you can really change it up, there's lots of
different ones that you can do. Again, I prefer the line, but do whatever you
think is best. I'm going to change it back because this is not how I want to
keep it, but there you go. So that is the first one that we are going to look at.
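As a side note on the bar and line chart: what that view plots boils down to two
sums per year, one drawn as bars and one drawn as a line. Here is a minimal
sketch in plain Python, assuming made-up rows with hypothetical field names
(`year`, `global_sales`, `na_sales`). It only illustrates the aggregation and is
not how Tableau computes it internally.

```python
from collections import defaultdict

# Hypothetical sample rows; field names and numbers are made up for illustration.
rows = [
    {"year": 2006, "global_sales": 82.7, "na_sales": 41.5},
    {"year": 2006, "global_sales": 30.0, "na_sales": 11.4},
    {"year": 2008, "global_sales": 35.8, "na_sales": 15.9},
    {"year": 2008, "global_sales": 22.0, "na_sales": 9.1},
]


def sum_by_year(rows):
    """Return {year: (sum of global sales, sum of NA sales)}, sorted by year."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        totals[row["year"]][0] += row["global_sales"]
        totals[row["year"]][1] += row["na_sales"]
    return {
        year: (round(global_total, 2), round(na_total, 2))
        for year, (global_total, na_total) in sorted(totals.items())
    }


totals = sum_by_year(rows)
for year, (global_total, na_total) in totals.items():
    # One bar height (global) and one line point (NA) per year.
    print(year, global_total, na_total)
```

Each year ends up with one bar value and one line value, which is why the two
series are easy to compare year by year.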
Let's move on to the second one, and we actually will be using our Starbucks
data here.

Now, when you bring in data that has any type of map or address or postal code
or things like that, or country, it's typically going to create this latitude
and longitude. It's going to generate that. What we want to do is bring this
Longitude right up here and this Latitude right there. If you do the Show Me
right now, it's giving us this, but what we want to do is add what we're looking
for. So what will we actually be trying to search for on this map? You can do
anything, like a postal code, and it will drag us right here. Let's come over to
this. This allows us to scroll around a little bit. We're going to mess around
with this one for just a little bit, and let me see if I can... that's nice.
That might be too big, let me back up one. So at least in the continental US,
and a little bit down here, these are the postal codes.

Right now we're looking at postal codes, and there is a lot that you can do with
this. Really, color will make almost no difference, it just becomes this mess,
so you don't typically want to do something like that, at least not for this.
Let's go to Size. If we make it really small, you can kind of see these
groupings, these pairings, typically of larger cities or major metropolitan
areas. So you can do this and it's really, really easy. I don't recommend
labeling this. I don't even know if it'll do it. It would be an absolute mess to
try to label all these postcodes. Well, let's bring this out and let's bring
these State and Provinces in.
Right now we have these little tiny dots on here, and I think what we want to do
is not increase the size, but over here we want to actually do this and make it
a map. So now it's going to fill in all the states. Why not, we'll add some
color here. But we can... it has them numbered? I didn't think they were
numbered. Oh, that's interesting. I haven't seen that, I didn't look at that
before, I just found that interesting. But now we can see what states Starbucks
is in, and as you can see, they're in all 50 states. It's something interesting
to look at, to think about.

Now if we go right up here we can again choose a different type, and we're going
to go to Density. Right now it's just doing a density on the state. We're going
to get rid of that and bring back postal code. I'm just switching it up on you a
little bit. You can make it as small or as big as you'd like. I like to do
somewhere in the middle, probably right about there is fine. I don't think it's
going to make sense to really add any color here. Again, all these postal codes
are different, so it's just going to be a complete mishmash. But this is kind of
how you can use a density map, and you can do this with countries, you can do
this with postal codes, you can do this with any type of address or
location-based data.

So that is how you can use a map. Again, there's lots of different ways to use a
map, so I'm not going to show you every single way, but in a really brief way,
this is how you can use a map to actually visualize your data that does have
location-based information in it.
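To make the idea behind the density map a bit more concrete: a density view is
essentially showing where a lot of points sit close together. A rough sketch in
plain Python, assuming made-up (latitude, longitude) pairs and a simple
one-degree grid. Tableau's own density rendering is smoother than a grid count,
so treat this only as an illustration of the idea.

```python
from collections import Counter

# Hypothetical store coordinates as (latitude, longitude); made up for illustration.
stores = [
    (47.61, -122.33),
    (47.62, -122.35),
    (47.66, -122.31),
    (40.71, -74.01),
    (40.73, -74.05),
    (34.05, -118.24),
]


def grid_counts(points, cell_size=1.0):
    """Count how many points fall into each grid cell of cell_size degrees."""
    counts = Counter()
    for lat, lon in points:
        # Floor division snaps each coordinate to the lower corner of its cell.
        cell = (lat // cell_size * cell_size, lon // cell_size * cell_size)
        counts[cell] += 1
    return counts


density = grid_counts(stores)
for cell, count in density.most_common():
    # Cells with higher counts are the "hot" areas on a density map.
    print(cell, count)
```

The cells with the most points are the ones that would show up as the brightest
areas, which is why major metropolitan areas stand out.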
So let's go over to Sheet 3. This data that we have over here just allows for a
lot of different types of visualizations, so we're going to use this one. There
are lots of other ones that you might see out there, like this one right here.
We obviously wouldn't be using this. We might do something like this, change the
label, and maybe add... why do I have both of these in here? Let's get rid of
this. Oops, that's not what I meant. Let's actually add that. Let's do the SUM
of Global Sales, and we'll just make that into a label as well.

So that's what you can do with these, and how you're able to use them and
visualize them. Again, you'll see these often, but these are not often ones that
I would recommend you use. That's very similar to these packed bubbles. You can
add these Global Sales in here and again add the label. It's just that the
information it's trying to tell you is sometimes not as straightforward, right?
You kind of have to search for it a little bit, you kind of have to look around.
But you can find some good visualizations in here for very specific types of
data, so these are just ones to consider.
One that you'll see all the time is this guy right here. Let me see if I can
expand this a little bit because this is very small. Let's see, I just want
Global Sales, and let's label that. The size... how do I expand this? I haven't
done this in a while. Let me just expand this. I don't use pie charts. What is
happening? This is an incredibly large pie chart. Oh my gosh, this is becoming a
problem. There we go. What I actually wanted to do was label the Genre as well,
as I've been doing in all the other ones, and we'll label this.

Now look, whether you are a fan of pie charts or not, you have to understand
that people use them. Some people just like how they look, and for certain data
they can do well. For things that have a lot of different groupings or
categories it usually isn't super great, but it does give you some type of order
of things, it gives you a quick glance, and people use them, right? So let's not
pretend like it's the hideous stepchild. People use it, people have it in their
dashboards and their visualizations all over, so it's best to just know what
they look like, know how to do them, and know how to use them best. Again, I'm
not a super huge fan of it myself. I've used it once or twice.
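For reference, a pie chart is just each category's share of the whole. A small
sketch in plain Python, assuming made-up genre names and sales totals (these are
not values from the data set), shows what each slice represents:

```python
# Hypothetical global sales totals per genre; names and numbers are made up.
sales_by_genre = {"Action": 50.0, "Sports": 30.0, "Puzzle": 15.0, "Strategy": 5.0}


def pie_shares(totals):
    """Return each category's percentage of the grand total, largest first."""
    grand_total = sum(totals.values())
    shares = {name: value * 100 / grand_total for name, value in totals.items()}
    return dict(sorted(shares.items(), key=lambda item: item[1], reverse=True))


shares = pie_shares(sales_by_genre)
for genre, percent in shares.items():
    # Each percentage corresponds to the angle of one slice (percent of 360 degrees).
    print(f"{genre}: {percent:.1f}%")
```

With only a handful of categories the slices are easy to compare. With many
categories the shares get small and similar, which is where a pie chart stops
being readable.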
But one to look out for, and again you can come over here and use it, is called
a box-and-whisker plot. It's good for these large distributions. You know, this
is like the median, upper, upper, lower, lower. I don't use these a lot, but I
know a lot of people who love them. It's something to just look at or mess
around with a little bit. It's pretty straightforward, I think, and it does give
you some good insight into your data if you know how to use it.
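If the box-and-whisker plot is new to you, it helps to see the handful of
numbers it is drawn from. Here is a minimal sketch using Python's standard
`statistics` module on made-up values. The 1.5 x IQR whisker rule used here is a
common convention and an assumption on my part, not something shown in the
video.

```python
import statistics

# Hypothetical values; made up for illustration.
values = [1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 30]


def box_plot_summary(data):
    """Return the numbers a box-and-whisker plot is typically drawn from."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: whiskers reach the most extreme points that lie
    # within 1.5 * IQR of the box; anything beyond that is an outlier.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "lower_whisker": min(inside),
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": max(inside),
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


summary = box_plot_summary(values)
for name, value in summary.items():
    print(name, value)
```

The box spans the first to third quartile with the median inside it, the
whiskers show how far the rest of the data reaches, and anything past them is
plotted as an individual outlier.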
Now there is one last one that I want to show you. I'm just going to create it
on a new sheet to make it easy. We'll do Year here, we'll do SUM of, let's do NA
Sales, why not, and we are going to make this like this. Now it's very similar
to a line chart, but when we break it out by the Genre and we add some color,
it's just a different way to visualize this information. You can potentially add
some stuff in here like some labels if you want to, depending on how it looks
for you, but this is just another way to visualize the data.

I wanted to give you guys some options, some things that you might want to look
at if you haven't already used these before. Every single one that I've shown
you is one that I've used at least once. This one I maybe have literally only
used once. But the first ones that I showed you, the ones I pointed out as the
ones that I really wanted you to know, are great visualizations to learn how to
use and learn how to make useful for the data that you have.

With that being said, that is all that we are looking at in this video. Again, I
tried to keep it super easy. I just wanted to show you some different
visualizations, the data that you can use to get those visualizations, and just
some other options in case you wanted to get a little bit spontaneous, a little
bit out there, a little bit funky, to show your boss or something like that.
Thank you guys so much for watching, I really appreciate it. If you like this
video, be sure to like and subscribe below, and I will see you in the next video.
What's going on everybody, welcome back to another video. Today we're looking at
joins in Tableau.

Now before we get into the tutorial, I want to give a huge shout out to today's
sponsor, and that is Udemy. They are having a massive Black Friday sale, so
everything is about 85% off. If you've been looking at a course, now is the time
to buy it. If you are looking at learning and taking an actual full Tableau
course, there are fantastic ones on Udemy that I have taken myself. So be sure
to go and check out Udemy while they're having this huge sale. I will include a
link in the description if you want to check them out. Now let's get into the
tutorial.
All right, let's get started. First we're going to start off in Excel. I'm going
to walk you through the data that we're working with, then we're going to put it
into Tableau, and I'm going to show you how to do all those joins in Tableau.

So the first table that we have is this demographics table. We have employee ID,
name of employee, employee age, and employee gender. Now look right here,
because this will be important going forward: in the demographics table we have
10 individuals, and they each have an employee ID. When we go to the job title
table, we have our employee ID, employee name, and the job title, but this one
is missing something: Ryan Howard is missing his employee ID. And then in the
very last one there are only seven employee IDs and no names. We're going to use
all of that, and I'm going to show you how to actually do the joins in Tableau.

Tableau does a really fantastic job of visualizing them for you, so it takes a
lot of the guesswork out. I am going to include a link to my joins video in SQL,
because these two are very closely connected, and if you understand how the
joins work in SQL, you'll understand how the joins work in Tableau. It's almost
the exact same thing. So with that being said, let's jump over to Tableau.
So I'm going to pull this up and go right over here, and now we have where we
can connect to our data. We're going to click Microsoft Excel. I'm going to
scroll down here to the Tableau joins file and open this up. I have it open, so
I can't use it, so let me get rid of that and let's open it again. Perfect.

I'm going to show you how to actually open up the joins in a second, but what
you need to understand is that when you first come here, Tableau doesn't
automatically let you use joins. It uses something called relationships. There
are joins on the back end, but they call them relationships because Tableau is
inferring all of these things. It's trying to go in and make that inference for
you, so it takes a lot of the work off of you, and most of the time that works.
You just plug these two things in here, like demographics and job title, and it
is going to help you build what they call relationships. You can click on this
and learn how relationships differ from joins. Again, there's not a huge
difference, but it's not as customizable, and you can't as easily do left joins
or full joins or all these things that we're about to look at.
So I'm going to take this one off. What we're going to do to actually be able to
look at the joins and choose what joins we want to use is go to this dropdown
and click Open. Now we are in a place where we can actually create the joins,
and again, it's just much more customizable. Back when I was using Tableau
regularly, I would use relationships when it was pretty simple and
straightforward, because they almost always got it right, but the joins just
make more sense to me in the way they're visualized, so most of the time I'd be
using the joins.

So let's pull over this job title table right here, and it's going to make this
connection. Before, if you remember, about 30 seconds ago, when it connected
them it was just a line, and it gave us this option down here to edit the
relationship. But now it's giving us this visualization. Let's click on it
really quick, and what is going to come up is the different types of joins that
you can do. You can do an inner join, a left join, a right join, and a full
outer join, and then you can actually choose the different data sources and how
you're connecting them.

Again, I'm going to walk through a little bit of this, but I think the SQL video
that I did on this shows it so well. I would highly recommend using that, and I
recommend learning SQL too, so, you know, two birds, one stone. I'm going to get
into each of the joins, how they work, and what data is going to be displayed.
These visualizations are really going to be helpful, and I think it's just nice
that they have them, because it's a little reminder: okay, this is what this
join is, or this is what that join is. So super, super simple.
so right
now we have the demographics table and
we have the job title table and so what
it's doing right now and let's get rid
of this what it's doing right now is
it's doing an inner join and so it's
pulling everything that overlaps if it
matches on the employee ID and the
employee ID and so right now you only
see 1001 through 1009 but if you
remember in the demographics table we
had 1001 all the way through 1010 so
where's that 10th one well the 10th one
is not there and that is because in this
job title employee ID it only went up
to 1009 and then Ryan Howard just didn't
have an employee ID in there for
whatever reason so that data is going to
be missing now when you are using actual
data sets very large data sets which we
will use in the next video when we walk
through an entire
project when you use large data sets
this can be the difference between clean
data and very wrong data and
visualizing it correctly and showing
completely wrong numbers and so you
really need to be sure you understand
how your data works together when you're
doing these joins
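The inner join behavior described above can be sketched in plain Python. This is only an illustration of the idea, not how Tableau implements joins: the sample rows, the snake_case column names, and the `inner_join` helper are assumptions made for the example. Only Ryan Howard, the two table names, and the missing employee ID come from the video.

```python
# Hypothetical sample rows modelled on the two tables in the video.
demographics = [
    {"employee_id": 1001, "name_of_employee": "Jim Halpert"},
    {"employee_id": 1009, "name_of_employee": "Kevin Malone"},
    {"employee_id": 1010, "name_of_employee": "Ryan Howard"},
]
job_titles = [
    {"employee_id": 1001, "employee_name": "Jim Halpert", "job_title": "Salesman"},
    {"employee_id": 1009, "employee_name": "Kevin Malone", "job_title": "Accountant"},
    # Ryan's row has no employee ID, which is why the inner join drops him.
    {"employee_id": None, "employee_name": "Ryan Howard", "job_title": "Temp"},
]


def inner_join(left, right, left_key, right_key):
    """Pair every left row with every right row whose key value is equal."""
    pairs = []
    for left_row in left:
        for right_row in right:
            # A missing key (None) never matches, similar to NULL in SQL.
            if left_row[left_key] is not None and left_row[left_key] == right_row[right_key]:
                pairs.append((left_row, right_row))
    return pairs


matched = inner_join(demographics, job_titles, "employee_id", "employee_id")
matched_ids = {left_row["employee_id"] for left_row, _ in matched}

# Rows that silently disappear from the joined result.
dropped = [row["name_of_employee"] for row in demographics
           if row["employee_id"] not in matched_ids]
print(len(matched), dropped)
```

Listing the dropped rows like this is one quick way to check a join before trusting the numbers in a visualization.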
so how can we fix this
how can we make it to where we can
see all of the data well right now we're
only making it to where the employee
ID is equal to the employee ID so we're
only going to see 1001 through 1009
we're never going to see
Ryan so there are two different types of
joins that we could do to make it see it
and then there's something else that we
can join on to where we can see that
data the first that we can look at is
the right join and what this does is
it's going to take everything that is
the same but also everything from this
job title table regardless of if it has
a match in the demographics table so
you know this visualization
does it all it's going to show
everything in the right table regardless
and it's only going to show things from
this table if there's a match so let's
try this one and we should see Ryan
Howard in the job title table so let's
click on it and if we scroll down
there's going to be null null null null
null until we get to
over here where we now have the data
that we had in that actual table but
again this wasn't a match and so we
weren't able to see that data so this
gives us a way to where we can see all
of it everything from that right
table this job title table
and now we're
going to click on the full outer now the
full outer is going to take everything
from both regardless of if there is a
match at all and so right here you're
going to see Ryan Howard and Ryan Howard
now why are there two different rows for
it well because in the demographics
table there was an employee ID so we're
seeing the employee ID Ryan Howard his
age and his gender and over here there
was no match right but in the job title
table again this one didn't have an
employee ID and so we are going to be
able to see this data but over here it
has no match and so that's why it's
showing us two different rows because
there was no connection there was no
match there that's what a full outer
join is going to do now just for the
purposes of seeing what this one does as
well we have the left-hand table and now
we are able to see the 1010 that we
didn't see before and it's putting in
nulls over here because there's no
match so that is what we have
so far
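The right, full outer, and left joins walked through above can be sketched with one small helper. This is a minimal sketch under assumptions: the rows, the shared `name` column, and the `join` function are hypothetical, and `None` stands in for the nulls Tableau shows. It is not Tableau's implementation.

```python
# Hypothetical sample rows; Ryan Howard has an ID on one side only.
demographics = [
    {"employee_id": 1001, "name": "Jim Halpert"},
    {"employee_id": 1010, "name": "Ryan Howard"},
]
job_titles = [
    {"employee_id": 1001, "name": "Jim Halpert", "job_title": "Salesman"},
    {"employee_id": None, "name": "Ryan Howard", "job_title": "Temp"},
]


def join(left, right, key, how="inner"):
    """Equi-join on `key`; `how` is "inner", "left", "right" or "full".

    Each result is a (left_row, right_row) pair, with None standing in
    for the nulls that appear when one side has no match.
    """
    if how not in {"inner", "left", "right", "full"}:
        raise ValueError(f"unknown join type: {how}")
    pairs = []
    matched_right = set()  # indexes of right rows that found a partner
    for left_row in left:
        found = False
        for index, right_row in enumerate(right):
            if left_row[key] is not None and left_row[key] == right_row[key]:
                pairs.append((left_row, right_row))
                matched_right.add(index)
                found = True
        # Left and full joins keep unmatched left rows.
        if not found and how in {"left", "full"}:
            pairs.append((left_row, None))
    # Right and full joins keep unmatched right rows.
    if how in {"right", "full"}:
        for index, right_row in enumerate(right):
            if index not in matched_right:
                pairs.append((None, right_row))
    return pairs


for how in ("inner", "left", "right", "full"):
    print(how, len(join(demographics, job_titles, "employee_id", how)))
```

With these rows the full join returns Ryan Howard twice, once from each table, which mirrors the two separate rows seen in the video.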
now like I said just a second
ago there is a way that we can
do this without using the employee IDs
we're allowed to use a different join
clause now there is the name of the
employee in both of them this one is
called name of employee and in the job
title it's called employee name they
don't have to have the same column name
in order to join it you can do whatever
you want so I'm going to get rid of this
one and now we are only tying it on the
employee name and let's do an inner
join and it should be basically
everything except the only piece of
data that wasn't filled in which is that
1010 over on the job title table and so
this way was a slightly different maybe
less thought of way because normally
if there's an ID you go on the
IDs but because we had a lack of data
in one of the tables in the job
title table we decided to use a
different column to join on and now
we're able to look at all the data
together
so super quickly that is an
inner join a left join a right join and
a full outer join and it's pretty easily
visualized here and you're able to
change what you're joining on right here
but you can also do multiple so
if we want to do the employee ID and the
employee ID you can do that as well and
you can keep going as many as you'd
like and
right here you can change some of
these things there aren't a
lot of use cases for this but you
know you can absolutely do this and
mess around with this I'm not
going to go through it in the tutorial
because again 95 plus percent of the
joins you're doing you're going to want
to do it to where this equals this and
if you want to get into where it doesn't
equal or all these other things which
is more complicated I think it's much
better to learn that in SQL that's my
personal preference and so again it's
all in the SQL tutorial if you want to
check that one out so you're able to
join on multiple things
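The two ideas above, joining on columns that have different names and joining on more than one clause, can be sketched as follows. The rows and the `inner_join_on` helper are hypothetical, and joining on a name assumes names are unique, which real data often violates.

```python
# Hypothetical sample rows; the name columns differ between the tables.
demographics = [
    {"employee_id": 1001, "name_of_employee": "Jim Halpert"},
    {"employee_id": 1010, "name_of_employee": "Ryan Howard"},
]
job_titles = [
    {"employee_id": 1001, "employee_name": "Jim Halpert", "job_title": "Salesman"},
    {"employee_id": None, "employee_name": "Ryan Howard", "job_title": "Temp"},
]


def inner_join_on(left, right, clauses):
    """Inner join where every (left_column, right_column) clause must be equal."""
    return [
        (left_row, right_row)
        for left_row in left
        for right_row in right
        if all(
            left_row[left_col] is not None
            and left_row[left_col] == right_row[right_col]
            for left_col, right_col in clauses
        )
    ]


# One clause on the name columns: Ryan now matches.
by_name = inner_join_on(
    demographics, job_titles, [("name_of_employee", "employee_name")]
)

# Two clauses: the name AND the ID must match, so Ryan drops out again.
by_name_and_id = inner_join_on(
    demographics,
    job_titles,
    [("name_of_employee", "employee_name"), ("employee_id", "employee_id")],
)
print(len(by_name), len(by_name_and_id))
```

Adding clauses only ever narrows an inner join, which is worth keeping in mind before stacking several of them.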
now let's get rid of
that one because we can actually bring
in this salary one as well and what
you'll see right down
here is that we have our employee ID and
this is all coming from the demographics
so employee ID name of employee
age employee gender then right over here
we have the job title table so employee
ID job title employee name job title and
then right over here is our
salary table and so we have employee ID
salary and employee salary so again this
is a way that you can put all of this
data into one place
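Bringing three tables into one place, as described above, can be sketched with dictionary lookups keyed on the employee ID. Every row, salary figure, and helper name here is a made-up assumption for illustration, and the sketch assumes each employee ID appears at most once per table.

```python
# Hypothetical sample rows for the three tables.
demographics = [
    {"employee_id": 1001, "name_of_employee": "Jim Halpert", "age": 30},
    {"employee_id": 1002, "name_of_employee": "Pam Beesly", "age": 30},
]
job_titles = [
    {"employee_id": 1001, "job_title": "Salesman"},
    {"employee_id": 1002, "job_title": "Receptionist"},
]
salaries = [
    {"employee_id": 1001, "salary": 45000},
    {"employee_id": 1002, "salary": 36000},
]


def index_by(rows, key):
    """Map each non-null key value to its row (assumes the key is unique)."""
    return {row[key]: row for row in rows if row[key] is not None}


jobs_by_id = index_by(job_titles, "employee_id")
salary_by_id = index_by(salaries, "employee_id")

combined = []
for person in demographics:
    emp_id = person["employee_id"]
    # Inner join across all three tables: keep the row only if both lookups hit.
    if emp_id in jobs_by_id and emp_id in salary_by_id:
        combined.append({
            **person,
            "job_title": jobs_by_id[emp_id]["job_title"],
            "salary": salary_by_id[emp_id]["salary"],
        })

for row in combined:
    print(row["name_of_employee"], row["job_title"], row["salary"])
```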
and in just a
second we'll go into the
worksheet right down here I'm going to
show you kind of how it looks because it
looks a little bit different than
previous tutorials and so I want to show
you how that actually all works together
but again you can create these joins
as well and do the exact same thing
that we just looked at and customize the
joins customize what you're
joining on and then you have your
finished product and so right now we
have our demographics plus Tableau joins
file and we can rename that if we want
I'm going to call this demographics
plus joins
demo and click enter and so now that is
saved
so now let's go down to the go
to worksheet we're going to click on
that and so up here on our left side
this may look a little bit different
than it normally does because it's
broken out on the measure names and
the measure values it's broken out by
the tables that they were joined on so
we can pull in the employee gender now
and we can pull in the employee name now
and we can pull in the employee ID
again if we want to from the job title
table and we can pull in the employee ID
from the salary table we could do that
if we wanted to it makes no sense
for actually creating any visualizations
but you know you can do that and
you wouldn't be able to do that
if you hadn't joined these together and
so down here in the measure values the
values that we have are from the
demographics table and the salary table
all of the stuff from the
job title table none of those things
were values and so there are
going to be no values from it down here
and so really quick let's take the name
of the employee let's take their salary
sure why not let's order
that let's take the employee
salary we'll do
color and expand this out a little
bit maybe one more time oops just like
that and there you go
so that is how you
do joins in Tableau and I think Tableau
does a really fantastic job of making it
pretty simple they have the different
types of joins when you click on that
join button and it shows you the
inner and the left and the right and the
full outer and they make it pretty
simple and it's just really
useful to be able to see that while
you're creating it and see the output
below like we just did a second ago
it just makes it so simple to create
those joins and then just keep going
because you already know what your
output is going to be and you can kind
of mess around with it and make sure
you're getting the data that you need in
the very next video we're going to be
doing an entire project in Tableau we're
going to be using a lot more data and
it's going to be a complete project
that you can add to your portfolio and
it's going to be a really good time so I
hope that you join me for that one I
appreciate your time I hope that this
was helpful thank you guys so much for
watching I really appreciate it if you
like this video be sure to like and
subscribe below and I'll see you in the
next video
[Music]
what's going on everybody welcome back
to the Tableau tutorial series this is
our very last video in the series and
today we'll be doing an entire project
[Music]
now if you're watching this
video I hope that you watched the other
four videos in this series just so you
can get the basics down you kind of know
what you're doing this won't be a
crazy hard project this is a beginner
tutorial series so I'm trying to make
this super easy so you can follow along
nothing super complicated I promise
and if you were wanting to go above and
beyond and just make a lot of different
dashboards or try a lot of different
things there's a ton of data in here and
so I'll show you some of the things that
I would do you know as we go through it
some of the things that I would be
looking at and some of the different
visualizations that I might do as well
but again in this video we're going to
be sticking to a lot of the basics so
I'll switch over to my screen in just a
second I will show
you the final product and then we will
actually walk through step by step
how to do the entire dashboard and at
the end you should have a completed
project that you can add to your
portfolio or you know just share on
LinkedIn if you want to do that as well
with that being said let's jump over to
my screen and let's get started
all
right so let's get me off screen and
show you what we're going to be working
on today this is the final dashboard
that we're actually going to be building
and so it's nothing crazy right I'm sure
you have seen all of these things before
and I'm just going to help you kind
of build it out show you what to do the
buttons to click and it's really
going to be a simple walkthrough by the
end of this you should be able to do all
these things very easily and I highly
encourage looking at the data and
looking at these visualizations and
seeing what else you can do with it
there's a lot of different colors a lot
of different visualizations that you
can do with this data I'm just showing
you this today and so the more you go
out there and the more you do this on
your own and you mess around with stuff
and choose different things and see
how it all works the better you're going
to get and so I highly highly encourage
doing that so what we are going to be
working with today is an Airbnb data set
I'm going to show you that in just a
second and I'm going to show you the
data and we're going to just jump right
into it
all right so this is the data
set that we are going to be using this
is the Seattle Airbnb open data set and
let's scroll down really quick
there's three different CSVs in here and
so this is some of the data that we're
going to be working with some dates on
listings and some pricing and then
there's the actual listing that shows
the actual street address the location
the price the bedrooms all of this good
stuff and then there's a
reviews one and it has you know some
comments and you know talks about some
of the reviews so this is what we're
going to be working with but you don't
have to go in here and download it I
have already combined all these CSVs
into one I've put it on the GitHub so
I'll have a link below so you can just
click on that and you don't have to do
all the stuff that I did to get this set
up
just so you know this is from 2016
so this data set is a little bit old if
you want to you can come right here and
I will leave this link as well and you
can get the data set from you know what
is this a couple weeks ago
they are continuing to update this
this is always updated and so you can go
ahead and download these but some of
these are .csv.gz files so you may
need to like convert it I don't want to
go through that process you know
in the video and so I am just going to
go with what is literally in Kaggle
and use that but if you want to
have an updated one for your project I
just advise you to go in here and grab
it yourself and that should be perfectly
good
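If you do grab a newer download that arrives as a .csv.gz file, the conversion mentioned above can be done with Python's standard library. This is a minimal sketch: the file names and the two helper functions are hypothetical, and the demo writes its own tiny compressed file so it runs without any download.

```python
import csv
import gzip
import os
import tempfile


def read_csv_gz(path):
    """Read a gzip-compressed CSV file into a list of dicts, one per row."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


def write_plain_csv(rows, path):
    """Write the rows back out as an uncompressed .csv (needs at least one row)."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Demo with a throwaway file; swap in the path of a real download instead.
with tempfile.TemporaryDirectory() as folder:
    gz_path = os.path.join(folder, "listings.csv.gz")
    with gzip.open(gz_path, mode="wt", encoding="utf-8", newline="") as handle:
        handle.write("id,price\n1,$85.00\n")

    rows = read_csv_gz(gz_path)

    csv_path = os.path.join(folder, "listings.csv")
    write_plain_csv(rows, csv_path)
    with open(csv_path, encoding="utf-8") as handle:
        plain_text = handle.read()

print(rows)
```

The plain .csv that comes out can then be opened in Excel or connected to Tableau like any other text file.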
so go ahead and download the data
set from the GitHub and we should be
good to go so this is the Excel that I
was just talking about this has all of
our CSVs in one place this is you know
an Excel workbook so in this reviews
actually let's start with the listings
because that's kind of where it all
stems from we have our listing and
the data in here is
you know really extensive there's a lot
of data in here so let's go over it
really quick the listing refers to the
actual home that they're renting out the
Airbnb so it shows their
location and there's a lot more
location information over here I'm
getting into it in just a second so
there's the neighborhood the city state
zip code all stuff that you know may
be useful there's a latitude and
longitude it shows what type of property
it is so that's really really good
right over here it has you know how many
bathrooms bedrooms and beds you know
sometimes if it's a five bedroom house
it has seven beds so that's why there's
those two different fields I don't
know if you're familiar with Airbnb and
you know what they have on there but
just something to note they have the
price this is the price per day this is
a weekly price a monthly price and if
there's a deposit needed and then a
cleaning fee as well so a bunch of
financial data that's you know super
useful we go into it a little bit but
there's so much you can do with that
you know if you want to dig into that
and that's kind of it the rest of it's
pretty useless and there's
a lot so there's so much data in here
you know more than half by far is
nothing you would put in any type of
visualization and this is pretty
common you're not going to
get data where every column is one
you're going to be able to use a lot of
times it's just a lot of useless junk
and so you have to know what you're
looking for and know what's actually
useful so that's the listing then we
have reviews
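Picking out the useful listing columns described above could look like the sketch below. The column names are assumptions based on the public Kaggle listings file and may differ in your copy, the money columns are assumed to be text such as "$85.00", and both helper functions are hypothetical.

```python
# Column names assumed from the public Kaggle listings file; adjust to your copy.
USEFUL_COLUMNS = [
    "id", "neighbourhood", "zipcode", "latitude", "longitude",
    "property_type", "bathrooms", "bedrooms", "beds",
    "price", "weekly_price", "monthly_price",
    "security_deposit", "cleaning_fee",
]
MONEY_COLUMNS = {
    "price", "weekly_price", "monthly_price",
    "security_deposit", "cleaning_fee",
}


def parse_money(text):
    """Turn '$1,250.00' into 1250.0; blank or missing values become None."""
    cleaned = (text or "").replace("$", "").replace(",", "").strip()
    return float(cleaned) if cleaned else None


def trim_listing(row):
    """Keep only the useful columns and convert the money columns to numbers."""
    trimmed = {}
    for column in USEFUL_COLUMNS:
        value = row.get(column, "")
        trimmed[column] = parse_money(value) if column in MONEY_COLUMNS else value
    return trimmed


# A made-up raw row with one column we do not care about.
raw = {"id": "1", "price": "$85.00", "cleaning_fee": "", "host_about": "long text"}
listing = trim_listing(raw)
print(listing["price"], "host_about" in listing)
```

Trimming early like this keeps the workbook small and makes it obvious which fields a dashboard actually depends on.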
now what's really a little bit confusing
in here and something that you just need
to kind of understand about the data
and something that if you get
a data analyst job you need to
understand your data because it's very
easy to come in here and say okay
there's an ID field and here's an ID
field so that means that those are the
same well not in this case this ID
field is actually the review's ID
not the reviewer ID that refers to like
the person this is the review's ID this
listing ID is the actual ID from the
listing table
so really important to
note and so then they
just have their comment there what they
left as a review and then on the
calendar I don't know why I'm
scrolled down we have this listing
ID again so again that listing ID is
equal to the ID in this listing table
and we have a date and a price so this
refers to a specific location and on
this day they got $85 for it somebody
rented it out and so then there's
these like t's and f's let's try to
find a blank one really quick here's a
blank one so there's these t's and f's
the t means that it was taken the f
means that it's vacant I don't know
exactly what a t or f means
but we can deduce that much from
this and so you can see when and how
much this person was making or this
home made in that time
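The link between the calendar tab and the listing tab described above (the calendar's listing ID equals the listing's own ID) can be sketched like this. All rows are made up, and the field names are assumptions based on the public Kaggle files. One caution: as far as I can tell the flag column in the public file is named `available`, which suggests a t may mean the night was open for booking at that price rather than taken, so treat the totals as priced days, not confirmed income.

```python
from collections import defaultdict

# Hypothetical rows shaped like the listing and calendar tabs.
listings = [
    {"id": 1, "neighbourhood": "Ballard"},
    {"id": 2, "neighbourhood": "Fremont"},
]
calendar = [
    {"listing_id": 1, "date": "2016-01-04", "available": "t", "price": 85.0},
    {"listing_id": 1, "date": "2016-01-05", "available": "t", "price": 85.0},
    {"listing_id": 1, "date": "2016-01-06", "available": "f", "price": None},
    {"listing_id": 2, "date": "2016-01-04", "available": "f", "price": None},
]

# calendar.listing_id points at listing.id, so index the listings by id.
listing_by_id = {listing["id"]: listing for listing in listings}

totals = defaultdict(lambda: {"priced_days": 0, "price_total": 0.0})
for day in calendar:
    # Skip days without a price and days whose listing is unknown.
    if day["listing_id"] in listing_by_id and day["price"] is not None:
        entry = totals[day["listing_id"]]
        entry["priced_days"] += 1
        entry["price_total"] += day["price"]

report = [
    {
        "listing_id": listing_id,
        "neighbourhood": listing_by_id[listing_id]["neighbourhood"],
        **entry,
    }
    for listing_id, entry in sorted(totals.items())
]
print(report)
```

The same pattern applies to the reviews tab, where the listing ID column, not the review's own ID, is the one that matches the listing table.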
homeade uh in that time so really really good data in here there's a lot to work
good data in here there's a lot to work with um and and so we're just going to
with um and and so we're just going to be kind of I'll give you a little bit of
be kind of I'll give you a little bit of a use case for it in a second and then
a use case for it in a second and then we're going to start trying to answer
we're going to start trying to answer some of those the building out some of
some of those the building out some of the visualizations for that use case uh
the visualizations for that use case uh again you could have 20 different use
again you could have 20 different use cases for this data or more um honestly
cases for this data or more um honestly for this data where you can build out
for this data where you can build out different dashboards and different
different dashboards and different reports literally with just this data
reports literally with just this data but you know we're doing a pretty
but you know we're doing a pretty General broad project and so it's hard
General broad project and so it's hard to answer all of them so let's jump over
to answer all of them so let's jump over to Tableau we're going to get started on
to Tableau we're going to get started on this and we are going to build out
this and we are going to build out everything all right so let's come right
here. This is Microsoft Excel, so we'll open that up. We'll do this one, open it, and give it just a second. It says it's executing the query; it's pulling the data in.
All right, so we have our calendar, our listings, and our reviews. Those are the different tabs at the bottom. We're going to start with the listings; this is kind of the main one. I didn't show you, but there are about 3,600 locations in there. Let's just have it update automatically; I don't know why we need to click on that.
So we have our listings, we have our calendar, and our reviews. What we're going to do is come in here and open it as we did in our very last video, for the joins. Now that we've opened it, we can go in here and do the joins as needed. So let's go over here and start with calendar. Put it right there. That was super slow, I apologize.
All right, let's wait for it to get the data and start setting everything up. I did not think it would take this long, I apologize. No, take your time. So let's click on here, and right now it has the
join based on the price, which obviously is not going to work. If you remember, there is no ID in this calendar, it's just the listing ID. We can actually look right here: there's just the listing ID. So we're going to set listing ID equal to ID.
And right down here we can see (well, you can't see it, but it shows) that there is a lot of data, and so we know that it is now pulling in data correctly, because it's showing up down here. So that's a good thing.
Now in listings there are about 3,600 listings, and all the data that's in listings is going to be in there. But on the calendar, because we converted from a CSV to an Excel workbook, it isn't able to store as much information, so some of the rows in calendar may have gotten cut off. So we can just keep this as an inner join, because we know that if it's in listings it is going to be in calendar. There may be some in calendar that aren't in listings, so if we really, really wanted to, we could do a full outer join or something like that. I haven't really thought through this as I'm talking through it in my head, but we know that everything that's in listings is going to be in calendar, and so we don't really need to do anything other than an inner join.
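
As a rough sketch of what that join clause does, here is the same idea in plain Python. This is not what Tableau runs internally; the rows are made up, and the column names (`listing_id`, `id`, `zipcode`, `price`) are assumptions based on the sheets described above.

```python
# Toy stand-ins for the listings and calendar sheets (values are made up).
listings = [
    {"id": 1, "zipcode": "98134"},
    {"id": 2, "zipcode": "98101"},
]
calendar = [
    {"listing_id": 1, "date": "2016-01-04", "price": 85.0},
    {"listing_id": 1, "date": "2016-01-05", "price": 85.0},
    {"listing_id": 2, "date": "2016-01-04", "price": 120.0},
    {"listing_id": 3, "date": "2016-01-04", "price": 60.0},  # no matching listing
]


def inner_join(left, right, left_key, right_key):
    """Return one merged row for every (left, right) pair whose keys match."""
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[left_key], []):
            joined.append({**right_row, **left_row})
    return joined


# The join clause from the walkthrough: calendar.listing_id = listings.id
calendar_listings = inner_join(calendar, listings, "listing_id", "id")
print(len(calendar_listings))
```

The calendar row for listing 3 has no partner in listings, so an inner join drops it. A full outer join would keep it, with the listing columns left empty.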
And we can also pull in these reviews, and it's going to do the same thing as before, where it's pulling in the data, and it defaults to ID equals ID. Now we know that that is not correct, because the ID in here is referring to the review ID. We need to go to the listing ID, so the listings ID needs to tie to that listing ID.
If we do the ID, it goes down to 2,555 rows, and that's just random luck: there happen to be some numbers that are in both fields that tie together. If we do the correct one, where we pick the listing ID, it bumps it up to, I think, 2,373,000... oh, maybe more than that, 23 million rows. A lot, lot more.
And so it's super important to get these joins right, to tie them together on the right fields. If you just go off what Tableau tells you (because it has that automated step where it goes into these fields and says, okay, these have the exact same column name, so they're most likely what you're looking for), well, it was incorrect in this case. So it's really important to check those things and make sure you're pulling in the right data.
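
To make the wrong-key problem concrete, here is a small sketch in plain Python with made-up rows. It assumes the reviews sheet has its own `id` (the review ID) plus a `listing_id`, as described above. Because each listing ID is unique, counting the reviews that find a match gives the same row count an inner join would.

```python
listings = [{"id": 1}, {"id": 2}, {"id": 3}]
reviews = [
    {"id": 2, "listing_id": 1, "comments": "Great stay"},   # review ID 2 collides with listing 2
    {"id": 50, "listing_id": 1, "comments": "Very clean"},
    {"id": 51, "listing_id": 3, "comments": "Nice view"},
    {"id": 52, "listing_id": 3, "comments": "Would return"},
]
listing_ids = {row["id"] for row in listings}

# Default suggestion: reviews.id = listings.id (same column name, wrong meaning).
wrong_key_rows = [r for r in reviews if r["id"] in listing_ids]

# Corrected clause: reviews.listing_id = listings.id
right_key_rows = [r for r in reviews if r["listing_id"] in listing_ids]

print(len(wrong_key_rows), len(right_key_rows))
```

The wrong key still returns rows, just far fewer, and each one pairs a review with an unrelated listing. That is why a join that "works" still needs a sanity check on the row count and the matched fields.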
Again, we're going to keep that as an inner join. If you wanted to, you could try to see if there's any other data that correlates. We're keeping it simple today, but sometimes you need to join on multiple things, so that's just a tip.
So let's get out of here, and we are good to go. This is our listings plus Tableau full project; that's what we'll be working with. And we were able to tie all three of these (tables or sheets or whatever you want to call them) together.
So let's go over here to our first worksheet. Let's see. All right, so this says Tableau Public only works with less than 15 million rows of data. We have 23 million rows of data, so that's a problem. When I did this before it didn't do that, so we're going to work through this together. So this is date, reviews... I believe this is the date for the calendar, which is going to be a lot of rows of data, and so I'm sure that's part of it. Let's
see. Let's do years; we only want 2016. Oops, we only want 2016. Okay, let's see what that does. Let's see if that gets us under what we need. We only want 2016 data anyway, so if it's in 2017 we were going to take it out. So we'll see if that gets us underneath. If this ends up taking like 20 minutes, I will just cut it, and you won't have to wait as long as I'm waiting. So let's see how long it takes.
All right, so it took about 20 minutes and it did absolutely nothing. One thing I do know is that we don't actually use this reviews table at all; it's just for demonstration purposes. So we're going to remove that, and let's see if that helps us in any way. If it does, we're just going to keep it as is. The reviews table is really just for demonstrating how to do the join, but we weren't actually using any of its data for any of the visualizations, although you could. Again, I'm going to see how long this takes, and I'll cut ahead.
All right, so that worked perfectly. It apparently took out all the rows that we needed to get under that level. Again, I was just doing that to show you the joins, how you needed to change the columns to make sure that it joined properly. We don't actually use it for any of the visualizations, so the end product is going to be totally fine.
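
One plausible reason the row count was so high with reviews attached (the video does not confirm this) is join fan-out: when calendar and reviews are both joined to listings on the listing ID, every calendar row for a listing is paired with every review for that listing. A minimal sketch with made-up rows and assumed column names:

```python
listings = [{"id": 1}, {"id": 2}]

# Three calendar days per listing.
calendar = [{"listing_id": lid, "day": d} for lid in (1, 2) for d in range(3)]

# Two reviews for listing 1, four reviews for listing 2.
reviews = [{"listing_id": 1, "id": 100 + i} for i in range(2)]
reviews += [{"listing_id": 2, "id": 200 + i} for i in range(4)]

# listings joined to calendar only
two_table = [
    (l, c)
    for l in listings
    for c in calendar if c["listing_id"] == l["id"]
]

# listings joined to calendar AND reviews
three_table = [
    (l, c, r)
    for l in listings
    for c in calendar if c["listing_id"] == l["id"]
    for r in reviews if r["listing_id"] == l["id"]
]

print(len(two_table), len(three_table))
```

Here 6 calendar rows become 18 joined rows (3 x 2 plus 3 x 4). Dropping the reviews table removes the multiplication, which would be consistent with the row count falling back under the limit.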
I don't know why this didn't happen to me when I created this whole thing already, so we're just going to move forward, because I make mistakes. So let's keep moving.
The first one that we are going to make is that colorful one. I'll probably pop it up on screen so you can see it (well, if I remember, I'm going to pop it up on screen). It's the colorful one, the price by zip code. So we're going to be looking at these zip codes to see how expensive each zip code is.
And before we actually start, I just remembered I want to talk to you about the use case for this data. I want you to imagine that you're working for somebody and they're like, hey, I want to start an Airbnb business. I want to know where I should go. Where should I buy a home, put it up on Airbnb, and start renting it out? Where's the best place? What are some of the factors that I should be looking at? And so that's kind of what our use case is.
Some of the things that he cares about are things like bedrooms, location, which is really important, and how much he's actually going to get: how much money can he charge? And so he's trying to optimize that to make sure that whatever rental he gets, he can make the most profit from it, instead of choosing something that he thinks would work but in the end he's actually not making that much money. So those things are important. So that's our use case: we're trying to help this guy out, help him find a really good Airbnb. So let's take a look at these
zip codes real quick. We have quite a few of them, and there's one that's null. We'll exclude that; if it doesn't have a zip code, we'll just exclude those, because they're not going to show up on these visualizations anyway.
And so we want to look at the price. So we just want to find the price, which should actually be down here. And not the sum; no, we want to look at the average price. And let's order that. This is great. So this is the most expensive one, zip code 98134, at $206 for the average price.
But let's give that some color really quick. Where's the zip code? It's up here. So let's take that zip code, put it right over here on Color, and it's going to give it some assorted colors. When we do the map in just a little bit, these colors will match what we're doing in there; I like to try to color coordinate things. We're not going too crazy with the colors today.
So this is our very first visualization. Congratulations, it is complete. We can label this one, and we can just do "Price by Zip Code", and I'll make that bold; I usually like it bold. We'll apply that, and boom, first one is done. And this is our starting place to say, hey, person who's looking to buy this Airbnb, here are the zip codes where they are able to charge the most for their Airbnb.
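
The bar chart above boils down to three steps: drop rows with no zip code, average the price per zip code, and sort from highest to lowest. A minimal sketch in plain Python, with made-up rows and assumed column names (`zipcode`, `price`):

```python
from collections import defaultdict
from statistics import mean

listings = [
    {"id": 1, "zipcode": "98134", "price": 250.0},
    {"id": 2, "zipcode": "98134", "price": 150.0},
    {"id": 3, "zipcode": "98101", "price": 120.0},
    {"id": 4, "zipcode": "98101", "price": 100.0},
    {"id": 5, "zipcode": None, "price": 90.0},  # null zip code, excluded below
]

prices_by_zip = defaultdict(list)
for row in listings:
    if row["zipcode"] is not None:
        prices_by_zip[row["zipcode"]].append(row["price"])

# Average (not sum) per zip code, ordered from most to least expensive.
avg_price_by_zip = sorted(
    ((zipcode, mean(prices)) for zipcode, prices in prices_by_zip.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
print(avg_price_by_zip)
```

Using the sum instead would mostly rank zip codes by how many listings they contain, which is why the walkthrough switches the measure to average.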
So let's go over to the second sheet, and we are going to be doing the map. The map is pretty easy once you actually get the data that you need, although there's a lot of different data that you can use for the actual map. You need something that shows the location, and there's a lot of things that show location in here. In fact, they already provide a latitude and longitude, and then at the bottom there's a generated latitude and longitude from some different fields. And then there are states, there are zip codes, and, I think, yeah, country. There's a lot of location data in here, so which one do we want to use?
We want to stay consistent. We don't want to deviate from that and start using different longitude and latitude coordinates, because that could throw off our results completely. We want to stay consistent with what we're using, so we actually want to use this zip code.
When we pull it up here, it's going to show these zip codes, but right over here we're going to click on this one, and now it's going to separate them out. So now we have all of these separated out. What you might get when you first do this is it might look like this, and you may have to zoom in. I know that happened to me the other time. Excuse me, go to here. That's what happened to me when I first did it, so know that that may happen.
it so uh know that that may happen and we want to change the colors the
and we want to change the colors the exact same way that we did them before
exact same way that we did them before so we're just going over here we're
so we're just going over here we're doing color and these colors do um they
doing color and these colors do um they do should match up with the um with the
do should match up with the um with the other ones let me um exclude this let me
other ones let me um exclude this let me see if it does 98134 that's the
see if it does 98134 that's the blue and right over here 98134 that's a
blue and right over here 98134 that's a blue I I I believe believe they are
blue I I I believe believe they are going to be the same yep and so just
going to be the same yep and so just scrolling back if you look at the ZIP
scrolling back if you look at the ZIP code on the far right uh they are the
code on the far right uh they are the same so if you're looking like this
same so if you're looking like this section right over here I I'm just
section right over here I I'm just wanting to make sure I'm not going crazy
wanting to make sure I'm not going crazy uh before I get into this and realize
uh before I get into this and realize I'm not correct at all so uh now what we
I'm not correct at all so uh now what we want is you know this doesn't really
want is you know this doesn't really give us any information if I was just to
give us any information if I was just to glance at this map I would have no idea
glance at this map I would have no idea what you're trying to show me um any
what you're trying to show me um any information off this so we want to show
information off this so we want to show some actual
some actual information so first thing that we're
going to do is add the label to this so that you can see it. When you're going over here and you see, okay, here's this zip code: in the dashboard, when we create it, you can click on this, but if you just want to do it visually without having to click anywhere, you'll be able to see, okay, 98134, that's right here. So this location right here is able to charge a lot of money; it's probably a really nice neighborhood.
And we can back that up by putting the average price on there. These two visualizations really go hand in hand. We're going to add... oops, not the sum. This one needs to be the average, so you go to this measure, the sum, go to Average, and there you go. And these should match, so this should be 206.125... 206. So this all matches.
And we can actually change that size a little bit if you want to get it within each of these areas; adjust it as you see fit. I think that's fine right there. No need to mess with it anymore. All right, so let me see. I think
that is everything for this one. I don't know if I want to add anything else. No, I'm going to keep it how it is. So that is our second visualization.
Again, these two are directly correlated; there are just different ways to visualize it. On this one you can see on the map where it is and the average price; on this one you can see it from highest to lowest. So sometimes when you're doing these visualizations, you're going to have these accompanying visualizations in your dashboard. That's very normal. So let's move over to the third one. And for this
third one, something that our guy was looking at is, he's like, okay, I'm thinking about listing it on Airbnb, but I also want to live in it, so I want to know the best times to actually put it on the market for people to be able to use. And so I was like, okay, man, no problem. Let's take a look at when people are spending the most money on Airbnbs.
And we actually had that calendar, if you remember. Let's look at this calendar. So we have this available, the date, the listing, all of that stuff. Let's look at the date in here. We obviously don't want it like this; we want it to be more of a time series, and we're going to be doing that based off of the price for the calendar. So let's go see if we can find that really quick.
Okay, here's the price. Where is that calendar one? Let me see. Okay, there's the calendar. Oh, here. I totally forgot where that was supposed to be. Ooh, that looks terrible. Okay, let's start working on this, because this needs some work, obviously. This is the worst visualization I have ever seen, so we need to work on this a little bit. What
need to work on this a little bit what we need to do is we need to change oh
we need to do is we need to change oh whoops we need to change some the way
whoops we need to change some the way that these dates are are seen so right
that these dates are are seen so right here is a these are two separate things
here is a these are two separate things so if I go right here and I Do by
so if I go right here and I Do by quarter it's just going to change the
quarter it's just going to change the quarters here right that's that isn't
quarters here right that's that isn't really helpful we actually want to keep
really helpful we actually want to keep the year here what we want to do it is
the year here what we want to do it is by year we want to separate it by year
by year we want to separate it by year um but we want to separate it let's just
um but we want to separate it let's just do I don't know let's try weak and see
do I don't know let's try weak and see what it looks like okay this is great
what it looks like okay this is great this is this is what we're looking at
Again, if we went back and changed this to quarter and then changed it to week, it would show the quarters, but it wouldn't show everything, right? This isn't all the data that we need, so you really need to make sure that you're doing this correctly. By default it's almost always year. But if you're looking at it via quarter, so let's say somebody comes in and says, "Hey, I want to break these out by quarters and not year over year," that's how you would do this. But within the year, we want to break it out by week.
You see this huge drop-off at the end. That is actually because the data doesn't go past that. There's just one day of data, or one week of data, in here with January 2017 data. So it drops off because this is the sum, and it only adds up to about 591,000 compared to around 2 million. We want to get rid of that. How do we do that? Let's see, I think it's filter. No, it's not format, what am I thinking? Bear with me. Let's add a filter. I was looking for it, I just couldn't find it. Let's bring it back to the 31st and see if that fixes what we need. Perfect, that's all you had to do.
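For anyone who wants to check the same idea outside Tableau, here is a minimal sketch in plain Python. It assumes the calendar data is available as `(date, price)` pairs, uses ISO weeks (Tableau's week numbering may differ), and treats December 31, 2016 as the cutoff, which is my reading of "the 31st" rather than something stated outright. The function name `weekly_revenue` and the sample rows are made up for illustration.

```python
from collections import defaultdict
from datetime import date


def weekly_revenue(rows, last_day):
    """Sum price per (ISO year, ISO week), skipping rows dated after last_day."""
    totals = defaultdict(float)
    for day, price in rows:
        if day > last_day:
            continue  # drop the partial period past the cutoff
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        totals[(iso[0], iso[1])] += price
    return dict(sorted(totals.items()))


# Hypothetical sample rows: (date, nightly price)
rows = [
    (date(2016, 12, 19), 100.0),
    (date(2016, 12, 20), 150.0),
    (date(2016, 12, 27), 200.0),
    (date(2017, 1, 2), 50.0),  # the lone row after the cutoff
]

revenue = weekly_revenue(rows, date(2016, 12, 31))
print(revenue)
```

Without the cutoff, the single January 2017 row would form its own tiny week and produce the same kind of misleading drop-off at the end of the line chart.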
And the reason that this is helpful: oftentimes you'd have several years' worth of data in here, and then you could even do something like this one, where it has multiple lines. The reason that this is helpful is because if I'm telling my friend (I'm going to say it's a friend, or a business partner, whatever you want to use this use case for), I'm going to tell him, "Hey, from the beginning of January all the way until even February, it's really low. It's half." There are not a lot of people traveling then, because everyone travels at the end of the year, in November and December, for the holidays to visit family, and then in the summer for vacations. Just based off this one thing, I would tell him, "Hey, over the summer, and then at the end of the year during the holidays, that's when I would be renting out your Airbnb."
So just this one very simple visualization can help him understand the best times to do that. That may be intuitive, and you may have already known it, but you can prove it with the data, which is always really helpful. Let's see, is there anything else that we need to do with this? I'm just going to label it, and I'm going to say "Revenue for Year".
Let's do bold and apply. There we go. Did I label this last one? I didn't. Let's label that last one, and we'll do "Price per Zip Code". We'll just keep it at that, keep it simple. All right, I believe we have two more. We've done three of them: we've got the zip codes and we've got the time of the year. Now, something else that he was wanting to know is just how things affect the price, and something that's going to affect the price of the actual Airbnb is the number of bedrooms. The larger the house and the more bedrooms, the more it's going to cost, typically. So we can take a look at that. Let's pull in these bedrooms.
And that will be our columns. No, it won't. I knew this was going to happen, I just forgot it until right now. What we have right now is actually a value, so it's a number, and that's totally reasonable. If we go right here and do count distinct, you can see there are only seven or so values: it goes zero bedrooms, 1, 2, 3, 4, 5, 6, all the way up to seven bedrooms. Right now it has it as a numerical value. We want to change that so these are treated as labels, not a value. So we're going to remove this, go right down here, click this dropdown, and say Convert to Dimension. Now we're going to add it as a dimension. There, that looks much more normal.
Really quick, I'm going to keep these in here for a second, but we're going to get rid of these nulls and zeros, because if a home has zero bedrooms, that's a problem. We want to look at the price again, so let's go down here in the listings. It should be the price. Now, this is the price for the location per day.
If you want to look at monthly or stuff like that, they have that data, but we're just going to do the price: the average price, not the sum. Although the sum is helpful, so just really quick before we change it: this is going to show you which ones are bringing in the most money. It also may show you which ones are the most common. Those are all different visualizations that we can do. The one that brings in the most money has $63 million worth of listings when they all add up, so those one-bedrooms are doing phenomenal. About half of that are two-bedrooms at 30 million, then three-bedrooms at 18 million, and so on and so forth. So there's a ton of one-bedroom ones. We could even keep that in there, you know, if we wanted to.
And we do something similar later, but you can keep something like this in there. What we will do really quick, though, is the same thing that we've been doing, which is keeping the average. And we are going to get rid of this null, because if it doesn't have the bedrooms, that's not helpful to us, and if it has zero bedrooms, that's genuinely a problem. I will not be renting an Airbnb with my family that has zero bedrooms in it.
So now we have this, and it would be really helpful to be able to see the values in the visualization. It's just kind of hard to see as is, so it does not hurt to add that right here as a label. Why is it angled like that? Maybe I just need to move it out more. That looks much better. That's the average price. That cannot be right. That's the sum, that's why. So let's go over here and make that average as well. Much better, because if the price was $3 million for a three-bedroom, I would not be going there. So this is really, really useful information for our friend, right?
If he wants to get into that one-bedroom area, he's not going to be making a lot of money. It may be low cost up front, but he's not going to be making a lot of money. It goes up significantly when you reach these five- and six-bedroom homes, which makes sense. If it has five or six bedrooms in it, it's probably a really large, really nice home, and you can charge a lot more money. Our friend is extremely wealthy, he can buy whatever he wants, and so he may be looking at these larger ones, seeing that there's a much higher nightly price, and potentially a higher return on his investment, the more bedrooms he goes. So we're going to keep it just as it is. Let me see if there's anything else that we want to do with this. No, we're going to keep it just like this.
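The sum-versus-average mix-up above is easy to reproduce in code. Here is a small sketch in plain Python, assuming each listing is a dictionary with hypothetical `bedrooms` and `price` keys. The sample numbers are invented, not taken from the Airbnb data, and `price_by_bedrooms` is just an illustrative name.

```python
from collections import defaultdict
from statistics import mean


def price_by_bedrooms(listings):
    """Group nightly prices by bedroom count, skipping null or zero bedrooms."""
    groups = defaultdict(list)
    for item in listings:
        bedrooms = item.get("bedrooms")
        if not bedrooms:  # None (null) or 0, the rows filtered out above
            continue
        groups[bedrooms].append(item["price"])
    return {
        count: {"sum": sum(prices), "avg": mean(prices)}
        for count, prices in sorted(groups.items())
    }


# Hypothetical listings: many cheap one-bedrooms, one expensive five-bedroom
listings = [
    {"id": 1, "bedrooms": 1, "price": 150.0},
    {"id": 2, "bedrooms": 1, "price": 170.0},
    {"id": 3, "bedrooms": 1, "price": 190.0},
    {"id": 4, "bedrooms": 5, "price": 450.0},
    {"id": 5, "bedrooms": 0, "price": 60.0},
    {"id": 6, "bedrooms": None, "price": 75.0},
]

summary = price_by_bedrooms(listings)
for count, stats in summary.items():
    print(count, stats)
```

In this toy data the one-bedrooms win on the sum simply because there are more of them, while the five-bedroom wins on the average. That is the same pattern the two versions of the chart show.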
And the last one is by far the easiest, and we actually just discussed it a little bit. We want to know what his competition looks like for the bedrooms specifically. So let's go back up to the bedrooms. We want that one to be right here in our rows, so we show these, and then we just want a count of how many listings there are. We can do that via the listings ID. Here are our listings; each ID represents one location, or one home, so we're going to put that right here. That looks absolutely terrible. What am I doing wrong here? Let me see.
One thing we need to do is get rid of these nulls and zeros, so let's do that really quick. And then we don't want to use just the ID. I'm realizing now what I'm doing: I need to convert this to a numeric so we can do a count on it. Oops, let me see what is happening. This is terrible. All right, let's put this back. Let me see if I can just do an attribute. Let's do the count, and let's do text. No, it needs to be a distinct count, because that counts each unique ID once instead of counting every row. Okay, it took some figuring out. I'm going to keep that in there, because a lot of you guys like seeing when I make mistakes. It makes it feel like when you make mistakes, it's okay.
I'm all about that, so I'm leaving that in there. You guys can see me fail a little bit. I just forgot how to do that for a second. And this is exactly what we're looking for. It showed us something like this in that visualization we were looking at earlier, before we switched it to the average price. This is showing us that there are about 1,800 one-bedrooms, 483 two-bedrooms, 206 three-bedrooms, 55 four-bedrooms, only 20 five-bedrooms, and five six-bedrooms. So the more bedrooms you go up, the less and less competition there's going to be.
Now, is there a lot of demand for four-bedroom, five-bedroom, six-bedroom homes? That's for our friend to figure out. Well, maybe we'll help him out with that later with the data. We could look at the reviews that we had. There's so much data in here, and we could absolutely figure that out. But for what it's worth, we're giving him this initial stuff, and he'll have follow-up questions for us later. That's how it always works, I promise.
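To make the difference between a plain count and a distinct count concrete, here is a rough sketch in plain Python. It assumes the rows are `(listing_id, bedrooms)` pairs in which the same listing can appear more than once, for example after listings are joined to another table. That repetition is an assumption for illustration, and the IDs below are made up.

```python
from collections import defaultdict


def listings_per_bedroom(rows):
    """Count distinct listing IDs per bedroom count, skipping null or zero bedrooms."""
    ids_by_bedrooms = defaultdict(set)
    for listing_id, bedrooms in rows:
        if not bedrooms:  # None (null) or 0
            continue
        ids_by_bedrooms[bedrooms].add(listing_id)  # a set keeps each ID once
    return {count: len(ids) for count, ids in sorted(ids_by_bedrooms.items())}


# Hypothetical rows; listings 101 and 201 each appear twice
rows = [
    (101, 1), (101, 1), (102, 1),
    (201, 2), (201, 2),
    (301, None), (302, 0),
]

distinct_counts = listings_per_bedroom(rows)
row_count_one_bedroom = sum(1 for _, bedrooms in rows if bedrooms == 1)

print(distinct_counts)
print(row_count_one_bedroom)
```

The plain row count for one-bedrooms is three, while the distinct count is two, because listing 101 shows up twice. The distinct count is the number that represents competing homes.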
So now we're good with this one. Let's label this one. Did I label the last one? I will go back and look. I'm going to butcher this one: I'm going to do "Distinct Count of Bedroom Listings". That may not make sense at all, but we're keeping it. Apply. Okay, let me see if I added the label on the previous one. I didn't, so let me do that real quick. We'll do "Average Price per Bedroom". Again, I'm just going with whatever is coming to my head. This probably wouldn't be what I would keep for an actual project, but it works for now. So we have our five visualizations: one, two, three, four, and five.
And let's create our dashboard. That's going to be this button right here, so we're going to click that. We're going to go right here and say Automatic, because we want to use this entire area. Now we're just going to start pulling them over, and I'm going to start from the very first one and go to the very last one to keep it really simple. This very first one, we'll pull it over. It's going to take up the entire space until you start adding all the other ones. We'll include this one right here, and let's leave it as it is; we'll adjust it once it gets to its final place. Now we have number three. We'll add this one on this side. It looks terrible right now, but give it a second.
Then we have number four. We're going to add that across the top. Okay, it's already starting to look a little better. You don't have to keep this in here, but you definitely can. Let's start to adjust things a little bit. Oops. Okay.
Let's see if I can zoom in one more. Nope. I'm going to do it just like that. Actually, let me see if I can make it even just a little bit closer. Perfect, that's the best you're going to get. If you didn't see, I used this magnifying glass, and then I could click on the area that I wanted to see. So we're going to keep that just like that. We're going to move this over, because that is definitely not as important, and then we're going to move this way over as well. Keep it just like that.
Again, this is something where, if you want to, you can click on this. I can't remember how to get those connected, but you definitely can. Okay, I was just clicking on the wrong one, that's why. But you can click over here and it'll filter based on that. So if I go to this one, oops. Dang. Oh jeez, what am I doing? Oh, this is a travesty. Okay, let's try to get this back. All right, I'm not touching it, guys. You get the gist. You can mess around with it yourself. I'm not messing this up.
Okay, so the next thing we need to add is the very last one. That's going to go right up here, and then we're just going to kind of move it off to the side. Let's see. Yeah, it has this caption, if you've never seen something like this before. I actually want to make this bigger as well. Oh jeez, give me a second, it's kind of lagging a little bit. Let's make this a little bit taller. Maybe I don't want it as wide, but I definitely want it a little taller. Give it a second. Let me scooch this back, just like that. That's fine, we can keep it like that. In my original one I didn't have this, so you can get rid of it if you want; you can just exit out right here. But there you have it.
This is the entire thing. We started from the very start: we started with this one, then this one, and this is all of our zip code work. Then we took a look at the calendar, where we looked at the price and did some time series visualization. And then we looked at the bedrooms and the count of listings per bedroom. So this should be really helpful for our friend. It should be an initial dashboard to get him going, and once he sees this, he's going to have a million other questions, and he's going to want another dashboard for different data that's in there.
He's going to ask, "Okay, well, what if I want to rent it out weekly, or for the month? How many five-star reviews are people giving on one-bedroom, two-bedroom, three-bedroom homes?" These are all things that he may ask, and then we'd have to build them out. In the real world, this is what happens all the time. They make a request and then they're like, "Oh, this is great, but I also want this." So your friend is going to be right in line with just about everyone else that has ever gotten a dashboard, for work or for personal use. With that being said, this is it. We have done the entire thing.
Now, if you want to share this, it is super easy to share, and I'm going to try to remember how to share it. We're going to do Save to Tableau Public As, and we're going to name it. Let's do "Air BnB". Is it a capital B? Is it like that? No, that doesn't look right. "Airbnb Full Project", and we'll save. That is being created right now. I will save this so if you guys want to go look at it, you can, and I'll provide a link in the description as well. See if yours looks similar to mine, or better than mine. Give it a second, because it's thinking. All right, so here it is. Here's our final project.
If you followed step by step, then you should get this exact dashboard, or one very similar to it. Again, if you want to have the up-to-date data, I encourage you to go to that link in the description that has the most recent data. They update that, I believe, monthly. So you can go there, get the most recent data, and create a beautiful project just like this, but with the most recent data. Again, I used the Kaggle data, just so you guys remember. And I encourage you to look at the different data points that are in the Excel file. There is so much in there. Honestly, there are probably 30 or 40 other fields in there that we never even touched, but for this project we're keeping it pretty simple. So go do that. Make completely unique dashboards and visualizations, create projects, and add them to your portfolio so that you can create a fantastic portfolio website and get a job. That's what this is all about.
It's about upskilling and getting these skills so that you can get a job or do better in your job. So I hope this has been helpful. I really appreciate you guys joining me and doing this entire project with me. I have no idea how long this is. This could be like an hour for all I know, so thank you so much for sticking with me this entire time. If you liked this video, be sure to like and subscribe below, and I will see you in the next video.

What's going on everybody, welcome
back to another video. Today we're going to be starting our Power BI tutorial series. Now, I am super excited to start this series with you guys. We are going to be breaking this up into about six or seven videos. I don't really like those super long videos where it's like four hours long; I like breaking mine up into chunks, so that's what we're going to do. This is the beginner series, so we're going to start with the very basics and work our way up, and I'm going to walk you through every single step of the way. It'll be very easy to follow. Everything will be provided for you, so all you really have to do is follow along, and by the end of it you should know Power BI a lot better and have a lot more comfort using it. Now, before we actually jump onto my screen, I want to give a huge shout out to the sponsor of this video, and that is Udemy.
sponsor of this video and that is udemy you guys know that I absolutely love
you guys know that I absolutely love udemy I've been using them for years and
udemy I've been using them for years and that is no exception when it comes to
that is no exception when it comes to powerbi I have taken some of the best
powerbi I have taken some of the best powerbi courses ever on udemy so I
powerbi courses ever on udemy so I highly recommend you checking out the
highly recommend you checking out the ones that I have in the description
ones that I have in the description these are ones that I actually took and
these are ones that I actually took and I loved the most so if you're looking
I loved the most so if you're looking for a full powerbi course I highly
for a full powerbi course I highly recommend checking out you to me thank
recommend checking out you to me thank you so much again to our sponsor and now
you so much again to our sponsor and now without further Ado let's jump onto my
without further Ado let's jump onto my screen and get started with a tutorial
all right so the first thing I'm going
to do is download powerbi desktop I will
leave this link in the description so
you can just click on it go to it and
download it we're going to click this
download free button and once we click
it you can go to the Microsoft store and
I already have it downloaded so when you
see it uh it'll already say downloaded
but um for you you can go in here you
can click download and it will download
it for you I'm on Microsoft uh but it
may look a little bit different for you
if you're on a different system but once
that is done we are going to open up
powerbi so let's go right down here to
our search let's go to
powerbi and it is going to open up for us all right so right away this is what
it's going to look like when you open it
and we're going to go right over here to
get data and let's click on that it's
going to open up this window and it's
going to give us a lot of different
options for where we can get data from
now some of these are free and some you
need to upgrade for but just taking
a quick glance through here you have a
ton of options there's databases there's
um you know blob storages there's
PostgreSQL or different SQL databases um
there's Google analytics there's a lot
of places and you can go through the
process to connect to that data and you
can pull that data in from those data
sources now for what we are doing we're
just going to be using an Excel I'm
going to leave the Excel that I'm going
to be using in the description you can
go and download it and walk through this
with me so what we're going to do is
click on Excel workbook and we're going
to click connect so we're going to go
right here in our powerbi tutorials
folder and we're going to click on
apocalypse food prep so let's click on
that and it is going to connect and pull
that data in now right here we have our
Navigator and so if you had a lot of
different sheets you can click on that
and choose which ones to pull in I just
clicked on it right over here and we're
able to preview the data but I can't
load or transform it yet I need to
select which sheets I'm bringing in so
we only have one so that's the only one
we're going to bring in so you can go
ahead and load the data or you can click
on transform data it's going to take us
to powerbi power query which is going to
allow us to transform our data so I'm
going to have an entire video on how to
transform the data but I'm going to give
you a really quick glance at it to kind
of show you what it is so right up here
it says our power query editor this is
the window to basically transform your
data and get it ready for your
visualizations now you can do this in
Excel if you want to and do that
beforehand or you can do it here and there
are lots of things that we can do in
here as you can see at the top again
I'll have an entire video dedicated to
just power query but let's take a quick
look at the data and see if there's
anything we want to transform quickly
before we actually go and start building
our visualizations so over here we have the
store where we purchased it we have the
product that we purchased the price that
we paid and the date that we bought it
now the first thing that jumps out to me
is that this just says date on it um we
might want to say date_purchased
and we're going to hit
enter and if you noticed right over here
on these applied steps it says renamed
columns everything that you do every
single step that you apply to transform
this data is going to be right over here
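One way to picture the applied steps list is as a recorded script that gets replayed, in order, against the untouched source data. The plain-Python sketch below is only an analogy and is not how Power Query is implemented. The rows, the helper names `rename_column` and `apply_steps`, and the column name `date_purchased` are assumptions made up for illustration.

```python
# Hypothetical rows standing in for the spreadsheet; the values are made up.
rows = [
    {"Store": "Costco", "Product": "Rice", "Price": 12.0, "Date": "2022-01-01"},
    {"Store": "Target", "Product": "Rice", "Price": 10.0, "Date": "2022-01-01"},
]


def rename_column(old, new):
    """Return a step that renames one column in every row (illustrative helper)."""
    def step(table):
        return [
            {(new if key == old else key): value for key, value in row.items()}
            for row in table
        ]
    return step


def apply_steps(source, steps):
    """Replay every recorded step, in order, starting from the original source."""
    table = source
    for _name, step in steps:
        table = step(table)
    return table


# Each entry mirrors one line in the applied steps pane: a label plus a transformation.
applied_steps = [("Renamed Columns", rename_column("Date", "date_purchased"))]
renamed = apply_steps(rows, applied_steps)
print(list(renamed[0]))

# Clicking X on a step is modeled here as removing it and replaying what is left.
applied_steps.pop()
restored = apply_steps(rows, applied_steps)
print(list(restored[0]))
```

Because the source rows are never modified in place, dropping a step and replaying the remaining ones is enough to get back to the earlier state.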
and if I want to if I go back and I say
you know I really didn't want to rename
that column I can just click X and it is
going to get rid of that and take it
back to its original state so again I'm
just going to say purchase
and we're going to enter that now this
is our apocalypse food prep so this is
food that we are buying for the
apocalypse um for this example and if we
look at our products we have bottled
water canned vegetables dried beans milk
and rice and all of that stuff makes
sense except for the milk uh milk will
not stay or last long in the apocalypse
so I think what we're going to do is
we're going to filter that out really
quickly and we're gonna click okay and
right over here again it says filtered rows
and so now if we scroll down there's no
milk so what we are going to do is we
are going to go over here to close and
apply and it is going to actually load
the data into powerbi
desktop so on this left-hand side it
immediately takes us to the report tab
and what we want to do is go right here
to the data
tab and take a look at our data so again
there's our date purchased and as you
can see the milk is not in there another
tab that we're going to take a look at
um and again in this report tab this is
where we actually build our
visualizations the data is where we can
see the data and and change it up a
little bit and change some small things
about it like sorting the columns or
even creating a new column and over here
we have this other tab and it is called
model and this is especially useful when
you have multiple tables or multiple
excels and you need to join them to kind
of connect them together we don't have
that but in a future video I'm going to
walk through how to use this entire
tab so now let's go back to the
data tab and I want to just look at the
data really quickly before we go over to
the report tab and we start building our
first visualization as you can see I've
been buying these different products in
different months so this rice I've been
purchasing in January February March and
April and I've been buying it from three
different locations because I wanted to
see if I was spending less money at one
location on all of the products so then
I would just shop there in the future
and save a lot of money or if there were
specific products that were really cheap
at one location but others they were
cheaper at a different location so I
should just buy like the dried beans at
Costco but everything else I should be
buying at Walmart and so that's what
we're going to look at in just a little
bit so let's go over to the report tab
right up here at the top there's this
data section so you can kind of choose
if you want to add any more data now
that we are here we can also write
queries or transform the data like we
were looking at in the power query
editor window over here in the insert section we
can add a new visualization or a text
box and then in the calculation section
we can create a new measure or a
quick measure and then over here we have
share where you can actually publish
your report or your dashboard online now
over on the visualization section on
this far right this is a very important
area this is where a lot of the actual
creating of the dashboards happens so
let's take a look really quick and we'll
get into a lot of these things as we're
actually building our dashboard so we're
not just sitting here looking and
talking we're going to be actually
building and doing all right so we're
going to click right here on this drop
down on sheet one it's going to show us
all of our columns now two of the things
that we wanted to look at were where are
we spending the least amount of money
buying the exact same product that'll
help us determine where we want to shop
and the second thing was should I be
buying all my products at the same place
or are there certain products that
are going to be cheaper at a
specific store and I should buy it there
so let's start out with the first one
which we're just going to see uh with
the store and the
price uh where we're spending the least
amount of money and just at a quick
glance we can see we're spending the
least amount of money at Costco at $210
versus Target at $219 and Walmart at $225 and
that really answers our question but we
want to visualize it better be able to
see it in an easier way so we're going
to go right over here and we can click
on a lot of these but the one that
probably makes the most sense is the
stacked column
chart and it's going to show Walmart
Target and Costco now they're all the
same color let's add a legend so we're
just going to drag store over here down
to this Legend and let's make this
larger while we're working on it so now
we can see we're spending the most
amount of money at Walmart right in
between at Target and then at Costco is
the lowest and so right there we know
that Costco is the place to go for our
apocalypse food prep but is it going to
be that way for every product I don't
know let's take a look let's put this up
in this corner and let's start a new one
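The total-by-store chart is doing a filter followed by a group-and-sum. Here is a minimal plain-Python sketch of that same calculation, assuming a tiny made-up purchase list. The prices are hypothetical and will not reproduce the $210, $219 and $225 totals from the real workbook.

```python
from collections import defaultdict

# Hypothetical (store, product, price) rows; not the real spreadsheet values.
purchases = [
    ("Costco", "Rice", 12.0),
    ("Costco", "Dried Beans", 8.0),
    ("Costco", "Milk", 3.0),
    ("Target", "Rice", 10.0),
    ("Target", "Dried Beans", 11.0),
    ("Target", "Milk", 3.0),
    ("Walmart", "Rice", 14.0),
    ("Walmart", "Dried Beans", 10.0),
    ("Walmart", "Milk", 3.5),
]

# Same idea as the "filtered rows" step: drop every milk purchase.
kept = [row for row in purchases if row[1] != "Milk"]

# Same idea as putting store and price on the chart: sum the price per store.
total_by_store = defaultdict(float)
for store, _product, price in kept:
    total_by_store[store] += price

# The shortest bar on the chart is the store with the smallest total.
cheapest_store = min(total_by_store, key=total_by_store.get)
print(dict(total_by_store), cheapest_store)
```

With these made-up numbers Costco comes out lowest, which matches the direction of the result in the video but not its actual amounts.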
we're going to need to select the
product for sure and the price and
probably Additionally the store as well
and let's click
on let's not do this one we need a
clustered column chart that's what we
need let's bring this over here let's
expand this quite a bit and so really at
a glance this is giving us everything
that we need we can see each product
right here and we can see how much we're
paying per store and so for Rice we're
paying it looks like a lot more for our
rice at Walmart while Target is
actually where we are paying the least
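The clustered column chart is answering a two-level question: for each product, how much did we spend at each store, and which store was lowest. A rough plain-Python sketch of that breakdown follows, again using hypothetical prices rather than the real workbook data.

```python
from collections import defaultdict

# Hypothetical (store, product, price) rows with milk already filtered out.
purchases = [
    ("Costco", "Rice", 12.0),
    ("Costco", "Dried Beans", 8.0),
    ("Target", "Rice", 10.0),
    ("Target", "Dried Beans", 11.0),
    ("Walmart", "Rice", 14.0),
    ("Walmart", "Dried Beans", 10.0),
]

# One cluster per product, one bar per store inside that cluster.
spend = defaultdict(lambda: defaultdict(float))
for store, product, price in purchases:
    spend[product][store] += price

# For every product, pick the store with the smallest bar.
best_store = {
    product: min(stores, key=stores.get) for product, stores in spend.items()
}
print(best_store)
```

Reading the result per product instead of per store is what makes it possible to see that one store can win overall while another wins on a single item.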
now if we look at all of these it looks
like for Costco the only one that we're
really paying a lot more on is on our
rice but for our dried beans our bottled
water we're paying quite a bit less and
really it's pretty negligible for these
canned vegetables we're paying maybe
what 60 cents 50 60 cents more per can
so that's pretty negligible but for the
big ticket items um we're really
spending a lot less at Costco if we
wanted to save just a little bit
more money we could go to Target for our
rice now if I want to make this more
like a dashboard and we're only keeping
these two things I'm going to kind of
size them kind of like this whoops going
to show you that in a little bit I'm
going to size them a little bit like
this so now that we have that looking
good we want to change the title of both
of these so what we're going to do is go
over here in our visualizations and
format your visual uh and we are going
to go to this General go to title and
now we can name it anything we really
want for this we're going to say best
store for
product and while we're in here one
other thing that I wanted to do is I
want to go to this visual go right down
here to these data labels now we haven't
added any data labels so I'm going to
click on and you'll see exactly what it
does uh it just puts the labels and the
numbers above it so you don't have to
actually like hover over it and see what
it is now it is actually rounding these
numbers so what we're going to do is go
down here we're going to go down to
values and we'll go down to display
units and it's on auto so it's auto
rounding those numbers and we're just
going to say none so we can see the
actual value of these
numbers and we can do the exact same
thing over here it probably is a good
thing to do um and it just is going to
visualize it a little bit differently in
here but you can always change that if
you want to go over here to
title and we're going to say total by
store and now we're going to take a look
and so in a matter of minutes we were
able to take our data from an Excel put
it into powerbi transform it a little
bit then we were able to create these
visualizations that gave us concrete
answers to some very important topics we
now know that Costco is the place to go
for basically every single product
except if we're buying rice and if we
want to save just a few dollars we're
going to head over to Target and that's
genuinely going to change my shopping
habits for the next several years until
the apocalypse happens so in future
videos we're going to dive into a lot of
the things that we looked at today but
just in more detail and then at the very
end of the series we're going to have an
entire project where we really use every
single part of powerbi and create a
beautiful dashboard and so that's all we
have for our very first video in our
powerbi series I hope it was helpful if
you like this video be sure to like And
subscribe below and I'll see you in the
next
video
[Music]
what's going on everybody today we're
continuing our powerbi tutorial series
and in this video we're going to be
looking at power query
[Music]
now power query is really great
because it allows you to actually
transform the data before you actually
get it into powerbi so if you want to
make any changes like adding or deleting
a column or changing the data type or a
ton of other things you can do all of
that in power query now without further
Ado let's jump on my screen and get
started with the tutorial all right so
before we jump over to powerbi and start
using power query I wanted to take a
look at the data and this is the Excel
from our last video called apocalypse
food prep and in that video we went
through and we bought some rice some
beans water vegetables and milk all for
the apocalypse getting prepared for that
now we decided to buy some additional
things like rope some flashlights duct
tape and a water filter several water
filters and after we purchased those uh
our boss or whoever we're working with
or somebody decided to go and make a
pivot table now in this pivot table they
kind of broke it out by Costco Target
and Walmart and had all the items had
some subtotals as well as some Grand
totals right here and then they decided
to kind of copy and paste that into this
and you'll see this a lot when you're
working with uh people who use Excel
they like to kind of make things like
this maybe make it into like a table or
or format a little bit differently but
you'll see stuff like this a lot so this
is what we're going to actually pull
into Power query and work with now we're
going to imagine that this is all we
have this is the only thing we were
working with and I'll kind of reference
this pivot table a little bit but we're
going to pretend this is all we have and
we want to transform it to make it a lot
more usable to where we can make
visualizations with it so let's hop over
to powerbi and pull this excel in so
what we're going to do is click import
data from Excel we're going to click
apocalypse food prep and click open and
then it's going to bring up this window
right here now this is where we can
choose what data to bring in so we can
take a preview and just click on it real
quick and this is the pivot table that
we were looking at so it does have that
pivot table so we are able to pull in
just a pivot table and then we have the
purchase overview where it's kind of
that formatted um thing that we were just
looking at with all the colors we're
going to pull both of those in so we're
going to pull in the pivot table and the
purchase overview now we could just load
it or we could transform it and we're
going to click transform and that's
going to bring us to power query so
let's click on transform data so now
really quick before we actually jump
into working through this and
transforming it I want to show you what
the power query editor looks like so if
we go right over here we have our
queries and these are the tables that we
actually pulled in and we can click on
those and kind of go back and forth
between them now up top we have our
ribbon and the ribbon offers a lot of
functionality we have things like remove
columns keep rows remove rows split
columns these are all things that we're
likely to use when using this power
query editor there's also another tab
called transform where there's a lot of
functionality here as well things like
unpivoting a column or transposing
columns and rows and using a first row
as a header some of the things that
we'll be looking at today there's also
another tab called add a column and this
one's pretty self-explanatory where you
can add additional columns like duplicating
a column creating an index column or a
conditional column those are the three
main ones there's also view tools and
help but we're not going to really be
looking at those today and then on the
far right side we have our query
settings you can do things like change
the name so we call it pivot table
2022 and it'll update right over here on
our query side and we have our applied
steps now our applied steps are
extremely important and very very useful
anytime we make any change to transform
this data it's going to be documented
right here and then we can go back and
look at it or we could even delete that
change in the future if we want to and
go back to a previous version of what we
just did so when we loaded the data into
powerbi it did a few things for us it
shows the source the navigation and it
promoted the headers and then it also
changed the data type so if we want to
check we can actually see those things
or change those things like this Source
right here we can click on this little
icon and it's going to bring up the
actual path where we got this file so if
we wanted to change that or it
changes in the future we can come
here and we can change this file path
but we're not going to do that right now
so let's click on cancel and let's go
back down to change type so it promoted
these headers and obviously these
headers are not correct we're looking at
this pivot table and not the purchase
overview but it changed these column
headers and so in the future if we
wanted to we could easily change those
but it did that for us and it changed
the type as well so if you look right
here it says
abc123 all the way over here it's where
it just says ABC ABC means it's only
going to be text where abc123 means it
could be basically anything uh text or
it could be numeric so now let's go over
to purchase overview and this is the one
that we're actually going to be working
on the most but we might be looking at
pivot table just a little bit to kind of
reference it and see some of the
differences so before we do anything
let's just take a look at how powerbi
decided to take this data in so it chose
this apocalypse food prep overview as
kind of the First Column and that was
kind of our header or the title of what
we were looking at before and then all
these other columns are basically column
1 2 3 four five so that's something that
we're going to want to change in just a
little bit there's also all these blank
uh columns right here at the top and
kind of these null values as we go along
and we'll take a look at those and we
kind of are going to want to get rid of
some of this and just clean this up to
make it more usable for our powerbi
visualizations this may be perfectly
fine and acceptable in an Excel but when
you're pulling it into powerbi the real
reason you're pulling it in is to create
visualizations not just for it to look good
in an Excel so we're going to need to
clean this up quite a bit so let's go
right up top the first thing that I want
to do is I want to get rid of these top
rows so we're going to go to this top
ribbon and we're going to click remove
rows and we're going to select remove
top rows and we're going to select two
because we have one two rows of all
nulls and those are completely useless
we just want to get rid of them right
away so let's click okay and it removed
those the next thing that we want to do
is this location product and all
these dates these are actually the
column headers that we wanted so what we
need to do now is we want to go over to
transform and we want to say use first
row as
headers and just like that we have
location products and these dates as our
headers exactly how we wanted them
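
For anyone who thinks better in code, the two steps above (remove top rows, then use first row as headers) can be sketched in plain Python. This is only an illustration of the idea and not what Power Query runs internally; the sample rows, product names and function names are made up for the example.

```python
# Hypothetical rows as they might arrive from the purchase overview sheet:
# two all-null rows, then the real header row, then the data rows.
raw_rows = [
    [None, None, None],
    [None, None, None],
    ["Location", "Product", "1/1/2022"],
    ["Costco", "Rice", 2.7],
    ["Target", "Beans", 2.5],
]


def remove_top_rows(rows, count):
    """Drop the first `count` rows, similar to Remove Rows > Remove Top Rows."""
    return rows[count:]


def promote_headers(rows):
    """Treat the first remaining row as column names, similar to Use First Row as Headers."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


table = promote_headers(remove_top_rows(raw_rows, 2))
for record in table:
    print(record)
```

Each record now carries the real column names instead of generic ones like column 1 and column 2.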
let's say for whatever reason you know
we made a mistake and we needed to go
back we would just select remove top
rows and that would be perfectly fine
now you can see over here it promoted
the headers but it's also changed the
data type so before if we went to before
we promoted the headers these were all
abc123 abc123 because it had a lot of
different data types in there so it just
kind of made a generic data type but
when we promoted these headers the first
thing that it decided to do was also
change this data type for us giving us
its best guess as to what this data type
is and it decided to do this decimal so
this 1.2 is a decimal but we're
actually going to change that and all
you have to do is click on this 1.2 uh
or the data type that it has right
here for you and we're going to click on
fixed decimal number and let's do
replace
current and now it's just a little bit
better so now it's 2.70 2.5 and that's
normally how we would read uh values
like this because this is money so we
would normally read it to the second
decimal just like that and if we have it
on the second decimal for some we should
probably have it on the second decimal
for all of them so really quickly
I'm going to go through and I'm just
going to change that and it should be
pretty quick so hang with me for just a
second all right that is perfect now for
the purposes of what we're about to do
we don't actually need these subtotals
or this Costco total Target total and
Walmart total as well as the grand total
really we want to get rid of those and
so what we're going to do is we're going
to go right over here we're going to
click on this drop down and we're going
to try to filter this data before we
actually load it into powerbi so we're
going to filter and we're going to say
remove empty and let's remove those and
it's going to take out all of those
nulls if we wanted to try to filter this
out by saying something like Costco
total or Target total we could do that
by going right here clicking this drop
down on products going to text filters
and saying does not contain and let's do
insert and we're going to say does not
contain and we want to say total and
let's click okay and again it
filtered out all of those things
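
Here is a rough plain Python sketch of those two filters, remove empty and does not contain. The rows, the column names and the choice to ignore upper and lower case are all assumptions made for the example, and the layout of the real workbook may differ.

```python
# Hypothetical rows: subtotal rows have an empty Location and a Product
# label that contains the word "Total".
rows = [
    {"Location": "Costco", "Product": "Rice", "Cost": 2.7},
    {"Location": None, "Product": "Costco Total", "Cost": 2.7},
    {"Location": "Target", "Product": "Beans", "Cost": 2.5},
    {"Location": None, "Product": "Grand Total", "Cost": 5.2},
]


def remove_empty(rows, column):
    """Keep rows where `column` has a value, similar to the Remove Empty filter."""
    return [row for row in rows if row[column] not in (None, "")]


def does_not_contain(rows, column, text):
    """Keep rows where `column` does not contain `text`.

    Ignoring case here is a choice made for this sketch only.
    """
    needle = text.lower()
    return [row for row in rows if needle not in str(row[column]).lower()]


by_empty = remove_empty(rows, "Location")
by_text = does_not_contain(rows, "Product", "total")
print(by_empty == by_text)
```

With this sample data both filters keep the same two detail rows, which is why either option works for dropping the subtotals.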
there's a few different options that you
can do if you want to filter out rows
that contain either null values or
specific values now the next thing that
we're going to do is actually get rid of
a column this grand total column and so
what we're going to do is we're going to
click on the very top part where it says
grand total we're going to go back over
here to home and we're going to click on
remove columns and it says insert that's
because we're on this filtered rows one
right here um but what we're going to do
is just insert that and it'll insert
right there that's totally fine we can
just move it to the bottom now we got
rid of this column entirely now this
looks really good visually I like how
this looks I like how everything is set
up the biggest thing about this is that
when you're actually wanting to use this
for visualizations these columns as
dates don't really work too well and
so what we're going to want to do is
we're going to want to transpose this or
pivot this to where these dates are
actually rows so what we're going to do
is select the first date which is
January 1st all the way through April
1st and we're going to hit shift and
click on that April 1st right there to
select all of them at the same time and
then we're going to go over here to the
transform Tab and we're going to click
unpivot columns and let's see what this
does and so now what we've done is we've
basically recreated our original Excel
that we had so let's go back and take a
look really quickly at that so this
looks almost identical to what we have
in powerbi right now and this is
extremely usable and very good for
visualization
and is much much better than this but
again we were pretending that this is
what we were given at the beginning so
you have to imagine you know somebody
just handing you this and you need to
make it much more usable for
visualizations in the future which
happens a lot and we actually wanted to
create this we just weren't given this
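
The unpivot step is easier to picture with a small example. Below is a minimal plain Python sketch of turning date columns into rows; the sample values are invented, and the function is a stand-in for the idea rather than Power Query's own implementation. The default output names Attribute and Value match what Power Query calls the two new columns.

```python
# Hypothetical wide table: one column per date, as in the purchase overview.
wide = [
    {"Location": "Costco", "Product": "Rice", "1/1/2022": 2.7, "2/1/2022": 2.75},
    {"Location": "Target", "Product": "Beans", "1/1/2022": 2.5, "2/1/2022": 2.55},
]


def unpivot(rows, value_columns, attribute_name="Attribute", value_name="Value"):
    """Turn each selected column into its own row (wide to long)."""
    long_rows = []
    for row in rows:
        # Columns that were not selected are repeated on every output row.
        kept = {key: val for key, val in row.items() if key not in value_columns}
        for column in value_columns:
            long_rows.append({**kept, attribute_name: column, value_name: row[column]})
    return long_rows


long = unpivot(wide, ["1/1/2022", "2/1/2022"])
for record in long:
    print(record)
```

Two wide rows with two date columns become four long rows, one per store, product and date, which is the shape that works well for visualizations.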
now a few last things that we might want
to do is we want to clean this up just a
little bit we're going to select the
data type and change this to date and
then we're going to select the value and
I double clicked on the value and I
actually want to call this cost uh or
product cost
and then for the location I
actually want this to be called
store so now this looks really good but
I want to show you one thing really
quickly on this pivot table 2022 so
let's go back here this looks very
similar to how we had it when it first
started one thing I wanted to show you
uh really quickly and I want to click on
this first one we're going to make
this our column header and then we're
going to try to Pivot or unpivot this
January February March April so really
quickly let's do that so we're going to
transform use first row as
headers so now we have this January
February March April now if you notice
these are not dates these are actually
text it says January February March and
April so if we go to do this and we
click
unpivot and here's the columns that are
created when we unpivot it it is January
February March and April these are not
dates so we cannot go and change this to
a date because that would error out
because it's actually text so it's
something that you want to look out for
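
To see why a bare month name cannot be converted, here is a small Python sketch using the standard datetime module. The date format and the year 2022 are assumptions for the example, and this only mirrors the idea of the error rather than how powerbi itself converts types.

```python
from datetime import datetime


def try_parse_date(text, fmt="%m/%d/%Y"):
    """Return a date if `text` matches `fmt`, otherwise None."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None


full_date = try_parse_date("1/1/2022")   # a complete date parses fine
month_only = try_parse_date("January")   # a bare month name does not

# One possible workaround, assuming we already know the year is 2022:
# add the missing year to the text before converting it.
patched = try_parse_date("January 2022", fmt="%B %Y")

print(full_date, month_only, patched)
```

A header like January has no day or year, so there is nothing to build a full date from until that information is added at the source or in a transformation step.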
it's something that you need to be aware
of and you can change that in the pivot
table so you want to be aware of how it
actually sits and looks in the Excel or
whatever data source you're pulling from
before you actually pull it into Power
query to transform and now the very last
thing that we need to do to finalize all
of this is go over here to close and
apply and once we click that everything
that we've worked on is going to be
applied to the actual data and it's
going to load into powerbi to create our
visualizations so let's go ahead and
click on that and so now the data has
been pulled into powerbi let's go right
down here to data and we can see the
data right here if we need to transform
this data again we can bring it back
into the power query editor window by
just clicking the transform data button
and it's going to bring us right back so
I hope that this was helpful thank you
so much for watching if you like this
video like and subscribe below and check
out all my other videos and everything
data analyst related I'll see you in the
next video
[Music]
what's going on everybody welcome
back to the powerbi tutorial Series
today we're going to be taking a look at
building
[Music] relationships now when you import
multiple tables from either the same
data source or multiple data sources you
want to tie them together so that when
you're creating your visualizations
everything is connected so in this
tutorial we'll be walking through how to
create those relationships to make sure
that all of your tables are connected
properly and without further ado let's
jump onto my screen and get started with
the tutorial all right so before we jump
over to powerbi and start creating our
relationships and our model I want to
take a look at the data in Excel we
realized we were buying so many products
for the apocalypse that we decided to
start our own store and we have several
customers and some client information
down here and so I wanted to take a look
at some of the columns and these tables
that we're going to be looking at first
thing we have is the apocalypse store
these are the things that we are selling
I know it's a very limited inventory but
these are the really high sellers these
are the ones that I wanted to sell so we
have this product ID our product name
price and production cost then we have
this apocalypse sales this is how many
sales we've actually made to our
customers so we have this customer ID
our customer name product ID order ID
units sold and the date it was purchased
and then we have our customer
information right here here are all of
our clients so we have this customer ID
customer address city state and zip code
so now that we've taken a look at our
data let's go and load it into powerbi
so we're going to say import data from
Excel we're going to choose this model
right here we're going to click open and
we are going to want all three of these
so I'm going to click on all of them and
we're just going to load it we're not
going to transform the data at
all so now the data has been loaded
let's go right over here on the left
hand side to our model Tab and let's
scoot this over just a little bit and
move back and we're going to move these
tables up to where it's a little bit
easier to
see so right off the bat you can already
see that there are these lines between
these tables so there are already
relationships that powerbi has
automatically detected and created from
my experience powerbi actually does a
really good job at creating these
relationships automatically but we're
going to go in and take a look at these
and kind of see what everything means
and then we're going to go back and
create these relationships from scratch
just to make sure that we know how to do
every single part so to get it started
let's double click on this line
connecting the customer information
table to the apocalypse sales
table and it's going to bring up this
edit relationship page right here so
this line right here connecting these
two tables actually gives us quite a bit
of information without actually having
to click into this edit relationship
page what this is showing is that we
have a one to many relationship and
there's only one or a single crossfilter
direction and you can find both of those
things right down here and I'm going to
walk through what those mean in just a
little bit on this page you can also see
the columns that powerbi decided to
choose in order to tie these two tables
together now for our example they
decided to use the customer and customer
right here from the customer information
table as well as the apocalypse sales but I
don't really want to use those
specifically because on this apocalypse
sales table I might remove this customer
information and just keep the customer
ID it may have chosen these customer
columns because they have the exact same
name and really the same information but
I want to use this customer ID anyways
so what I'm going to do is I'm going to
click on that column and click on this
column and then I'm going to click okay
and if we go back into it by double
clicking again we're going to see that
and now save that and if we did what we
just did before which is kind of hover
over it it's going to show us what those
two tables are joined on so opening this
back up let's go down here to this
cardinality and cross filter Direction
cardinality has several different
options that you can choose from you
have many to one one to one one to many
and many to many now for this example
we're looking at apocalypse sales and
we're going apocalypse sales down to
customer information now there are a lot
of rows in the apocalypse sales but
there's very few in this customer
information and there's only one
customer per row whereas in the
apocalypse sales up here the customer
can have several rows for several
different orders so that's why the
cardinality is many to one now if we
flip this and we say we want the
customer information here and we want
the apocalypse sales down here we tie
that together now it's going to flip and
it's going to say one to many
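
One way to reason about cardinality is to check whether the key column is unique on each side of the relationship. The plain Python sketch below does that with made up customer IDs; it illustrates the concept only and is not a description of how powerbi detects relationships.

```python
def cardinality(left_keys, right_keys):
    """Describe a relationship by whether the key repeats on each side."""
    left_unique = len(set(left_keys)) == len(left_keys)
    right_unique = len(set(right_keys)) == len(right_keys)
    names = {
        (True, True): "one to one",
        (False, True): "many to one",
        (True, False): "one to many",
        (False, False): "many to many",
    }
    return names[(left_unique, right_unique)]


# Hypothetical customer IDs: a customer can appear on many sales rows
# but only once in the customer information table.
sales_customer_ids = [1, 1, 2, 3, 3, 3]
customer_info_ids = [1, 2, 3]

print(cardinality(sales_customer_ids, customer_info_ids))  # sales first
print(cardinality(customer_info_ids, sales_customer_ids))  # flipped
```

Listing the sales table first gives many to one, and flipping the two tables gives one to many, which is the same flip we see in the edit relationship window.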
Now let's look at the cross filter direction, and there's only two options here: it's either single or both. If we choose both and we click OK, this now goes from a single arrow pointing in one direction to two arrows pointing in both directions. But what does this really mean? In order to demonstrate this I'm going to put this back to a single direction, and what we're going to try to do is connect the data over here, or the columns over here, to the columns in this Apocalypse Store. So let's go over here to build a visualization. What we're going to do is take this Customer Information, and let's just say we want to look at State, so I'm going to click on State right here and I'm just going to make this into a table. The Customer Information table is only tied right now to the sales table, so we're actually going to go over to the Apocalypse Store, and we want to see how many product IDs are being bought in these different states. So really quickly we're going to come up here and create a new measure, and all we're going to say is this measure is the count of Apocalypse Store Product ID. We're going to create that, and now we're going to select it so it's added to that table.
So now what this is showing is that there are 10 products for each of these states, but that's not actually technically correct, because not every state purchased these 10 different items. If we go back to our model and we change both of these to a both direction, then we can go back and see what changed in our numbers. So now let's go back to our visualization, and now we can see that Minnesota actually only ordered seven different product IDs, Miss[unclear] 8, New York 9, and Texas 10. This is actually much more accurate than before. When you use the both option, it takes these tables and treats them as if they are a single table, but the single option is not going to do that.
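The effect of the cross filter direction can be imitated in plain Python. This is a rough sketch under stated assumptions, not how Power BI evaluates filters internally: with "single", a State filter reaches the sales rows but stops there, so the product count stays at the full product list; with "both", the filtered sales rows also restrict which products are counted. The table contents and the function name `product_count_by_state` are hypothetical.

```python
# Hypothetical miniature versions of the three tables.
customers = {"C1": "Minnesota", "C2": "Texas"}          # customer id -> state
store_products = ["P1", "P2", "P3"]                      # one row per product
sales = [("C1", "P1"), ("C1", "P2"), ("C1", "P1"),       # (customer id, product id)
         ("C2", "P1"), ("C2", "P2"), ("C2", "P3")]


def product_count_by_state(cross_filter):
    """Count store product IDs per state under a 'single' or 'both' direction."""
    if cross_filter not in ("single", "both"):
        raise ValueError("cross_filter must be 'single' or 'both'")
    result = {}
    for state in sorted(set(customers.values())):
        state_customers = {c for c, s in customers.items() if s == state}
        # The State filter always reaches the sales rows.
        products_sold = {p for c, p in sales if c in state_customers}
        if cross_filter == "both":
            # The filter keeps flowing from sales into the store table.
            visible = [p for p in store_products if p in products_sold]
        else:
            # The filter stops at sales; every store product is still counted.
            visible = store_products
        result[state] = len(visible)
    return result


single_counts = product_count_by_state("single")
both_counts = product_count_by_state("both")
print(single_counts)
print(both_counts)
```

In this toy data the single direction reports every product for every state, while both reports only the products each state actually bought, which is the same kind of change seen in the table visual.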
And so for our example, where we're trying to connect this table to this table, we want both. One of the last things that I want to show you is this option right down here which says "Make this relationship active". Now if we don't click this, and there are other options in here that connect these things, like the customer to the customer, then that may be the active relationship. But if I select this as the active relationship, that means this is going to become the default relationship between these two tables. So now let's come out of here. We're going to click cancel, zoom in just a little bit, and bring these tables a little bit closer so we can zoom in just a little bit more. Now we are going to go ahead and delete these, so we're going to say delete, yes, and delete, yes. Just for demonstration purposes we're going to build these relationships from scratch. We're going to come over to the Customer Information table and we're going to drag it all the way over here and put it on top of this cust ID, or the customer ID, in Apocalypse Sales, and it's going to automatically create that relationship. We can open this up, and as you can see it created the relationship between this customer ID in Apocalypse Sales and the customer ID in Customer Information. It also defaulted the cardinality to many to one and the cross filter direction to single, so we're going to go ahead and change that to both and click OK.
And then we're going to come over here to the Product ID in Apocalypse Store and drag this over the Product ID in Apocalypse Sales. Again, if we open it up, it created that relationship for us and it created the cardinality automatically, and we're going to change this cross filter direction to both and click OK. And so on a really small scale, that is how it works. Of course, it becomes a little bit more complex the more tables that you add and the more relationships that are created, but this is how you're going to actually create the relationships in the Model tab within Power BI. I hope that this tutorial has helped you understand this concept a little bit better. Thank you guys so much for watching, I really appreciate it. If you like this video, be sure to like and subscribe below, and I'll see you in the next video.
[Music]
What's going on everybody, welcome back to the Power BI tutorial series. Today we're going to be taking a look at DAX.
[Music]
Now, DAX stands for Data Analysis Expressions, and it's basically a library of functions and operators that help you build formulas. You can use DAX to create measures and calculated columns within Power BI, which can really give you a lot of insight into your data. Honestly, it is not super complicated, and hopefully by the end of this video you'll have a lot more confidence actually using DAX in Power BI. So without further ado, let's jump onto my screen and get started with the tutorial. All right, so let's take a look at our tables and data before we get started. We have two tables: the Apocalypse Sales and the Apocalypse Store. For this Apocalypse Sales table we have the customer, product ID, order ID, units sold, and the date it was purchased, and then for the Apocalypse Store we have product ID, product name, price, and production cost. Now these are joined together, or they do have a relationship together, via the product ID. So what we're going to be using are these new measures and new columns to create our DAX functions. Really quickly, let's go over to this Report tab and let's drop down our fields over here so we can see everything. And so to get us started,
we're going to go right up here to Apocalypse Sales, right-click, and click New measure. It's going to open up this right here, which is basically our bar where we can create our functions. Right here it's automatically given us the name Measure, but we can change that, and we're going to say Count of Sales, so now we can start writing our DAX function. That's just going to be the name of it, and what's going to show up right over here once we click enter. So let's go over here and we're going to say COUNT, and as we're typing it's automatically giving us options. It has something called IntelliSense. If you've ever used other Microsoft products, IntelliSense is their kind of autocompletion that helps you look at other options very quickly. So we're just going to click on this COUNT, and it's prompting us to put in a column name. We can come down here and select one, or we can type it out and it'll try to predict and help us choose which column to select. For us, we're going to use this Order ID, but let's just start typing it out. We'll say order ID, and then we can click on it, and we're going to close this parenthesis and click enter, or you can go over here and click this check mark, but we're just going to click enter. Over on this right side it finalized that and saved that, and we can actually look at that by clicking on this box next to it. We want to look at this in a table, so now we can see that there are 74 sales. Now for this we want to see who's buying our products, we want to see what our client name is. So we're going to go over here, we're going to choose Customer, and we're going to put Customer on top of sales, and we're just going to take a look at it like this. So now we can see that our number one customer is Uncle Joe's Prep Shop, with 22 orders. Now they have the most orders with us, but it doesn't necessarily mean that they're spending the most money with us, but we can take a look at that later.
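What the Count of Sales measure does can be mirrored in a few lines of Python. This is only an illustrative sketch: the measure counts the non-blank order IDs, and putting Customer in the table re-evaluates the same count once per customer. Apart from Uncle Joe's Prep Shop, the customer names and all order IDs below are invented for the example.

```python
from collections import Counter

# Hypothetical order rows; only the first customer name comes from the video.
orders = [
    {"customer": "Uncle Joe's Prep Shop", "order_id": 1},
    {"customer": "Uncle Joe's Prep Shop", "order_id": 2},
    {"customer": "Uncle Joe's Prep Shop", "order_id": 3},
    {"customer": "Example Outfitters", "order_id": 4},
    {"customer": "Example Outfitters", "order_id": 5},
    {"customer": "Sample Supply Co", "order_id": 6},
]


def count_of_sales(rows):
    """Count rows that have an order ID, skipping blanks (None)."""
    return sum(1 for row in rows if row["order_id"] is not None)


# The measure on its own: one number for the whole table.
total_sales = count_of_sales(orders)

# The same measure broken down by customer, as in the table visual.
sales_by_customer = Counter(
    row["customer"] for row in orders if row["order_id"] is not None
)
top_customer, top_count = sales_by_customer.most_common(1)[0]
print(total_sales, top_customer, top_count)
```

The measure itself never changes; only the grouping around it does, which is why the same Count of Sales field works both as a single total and as a per-customer column.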
The next thing that I want to take a look at is how many products we're actually selling. What are our big products that we're selling? We have 10 different items, but I don't know exactly which one is selling the best. If one is doing really poorly and getting no orders, this is something that I want to look into. So all we're going to do is go right back up here to Apocalypse Sales again, right-click and select New measure, and for this one we're going to call it Sum of Products Sold. All we're going to start out with is SUM, and if this seems familiar to something like Excel, you're 100% correct, it is very similar. Remember, these are both Microsoft products, so there's going to be similar functionality in both of them, and so DAX is going to have a lot of similarities to how it works in Excel. So we're going to do an open parenthesis, and now what we're going to choose is this Units Sold. We want to sum up all of these units sold and see how many we're actually selling. So we're going to say units sold, I'm going to hit tab, it's going to autocomplete that, I'm going to close my parenthesis, and I'm going to come over here and click this check mark. So now it's created that measure, and we're already selected in this table, so all we have to do is click the check box and it's going to show us that we have 3,000 total products sold. And we can go through here and see what the big sellers are, and probably the biggest one that I see right off the bat is this Multi-Tool Survival Knife.
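As a rough Python equivalent of the Sum of Products Sold measure, the sketch below adds up a units-sold column for the whole table and then per product to find the best seller. The product names are the ones mentioned in the video, but the unit numbers are placeholders.

```python
from collections import defaultdict

# Hypothetical (product name, units sold) rows.
sales_rows = [
    ("Multi-Tool Survival Knife", 40),
    ("Nylon Rope", 15),
    ("Multi-Tool Survival Knife", 25),
    ("Waterproof Matches", 30),
    ("Nylon Rope", 10),
]


def sum_of_products_sold(rows):
    """Add up the units-sold value of every row passed in."""
    return sum(units for _, units in rows)


total_units = sum_of_products_sold(sales_rows)

# The same sum, grouped by product, to see which item sells the most.
units_by_product = defaultdict(int)
for product, units in sales_rows:
    units_by_product[product] += units

best_seller = max(units_by_product, key=units_by_product.get)
print(total_units, best_seller, units_by_product[best_seller])
```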
These DAX functions that you can write can be very simple and lead to really good insights that you can use for the visualizations later on. Now I want to take a look at the difference between something like SUM, which is an aggregator function, and something like SUMX, which is an iterator function, because if you add X to some of these aggregator functions you can make them into an iterator function. So you can have SUM and SUMX, or AVERAGE and AVERAGEX; adding X onto the end of them can make them into an iterator function. So let's take a look and see how that actually works. I'm going to show you the difference, and then I'm going to talk through the difference at the end. Really quickly, let's go back to our data and let's go to the Apocalypse Store. Now what we have right here is the price and the production cost, and we want to see how much profit we're getting from each of these, as well as take a look at the units sold and see how much money we are actually making. So what we're going to do is come back over here, go to Apocalypse Store, right-click and create a measure, and in just a little bit we're going to be creating a new column, and that'll kind of show the difference really well. So we're going to create this new measure and we're going to name it
Profit. We're going to come over here, and what we're going to do is start with our sums. We're going to take the sum of the price, and then we're going to close that parenthesis, and we're going to subtract the sum of the production cost. All that does is it says if we sold something for $20 and it only cost us $10, that's $10 in profit for that item. Then what we're going to want to do is encapsulate that in parentheses really quickly, because we're about to multiply, and then we're going to use SUM again, and now we're going to take the units sold, so how many units were actually sold at that profit that we just made. So let's see if that works, and let's click the check right here. And so we have the profit, so let's click on the profit. Oops, that's not what I wanted to do. Let's create a new table: we're going to click Profit, let's make it a table, and I'm going to pull this right over here. Now we have our profit, but what I really want to know is which customer is spending the most money at my store. So we're going to come right over here, we're going to click on Customer, and I'm going to put Customer at the top, and just at a glance we can see that Uncle Joe's Prep Shop is spending the most money at the store. Now what I want to show you is the difference between SUM and SUMX. So what I'm going to do is go back to this Profit and copy this entire thing, and we're going to go back here to this table. Now we just created a measure, and we were able to break it down by each customer. So let's go back over here, let's go up here to Home, and we're going to create a new column, and we're going to call this
Profit_Column, and we're going to literally paste the exact same thing into here and hit enter. And each row is the exact same thing. What it's doing is going through the price and adding all of it up and calculating it at the bottom; it's adding the production cost, going all the way down and calculating it at the bottom; and then it's going over and looking at how many units it sold. Then it's performing this calculation up here, and then it gives us the total, and it's doing it for every single row. But that's not really what we wanted to show. What we wanted to show is the profit for each row. What we wanted to say is: here's the price for the rope, the production cost for the rope, and then how many units we actually sold, and then it'll calculate that and give us the actual profit for just that row. But we cannot do it by just using this SUM. What we need to do is use something called SUMX. So let's add another column.
Let's go back to Home, say New column, and now we're going to say Profit_Column_SumX. Now we're going to use SUMX and hit tab, and we need to choose the table that we want to put this in, so we're going to say Apocalypse Sales, because that's the table that we're looking at right here. We're going to say comma, and now we need to input an expression. It says it returns the sum of an expression evaluated for each row in a table. Before, when you're just using SUM, it's looking at all of these combined. Now it's taking it row by row. So what we're going to do is basically input the same thing as we did before. I'm going to copy, I'm going to paste that. It's not going to be correct, I need to get rid of these SUMs, but it's basically the exact same equation. Give me just a second, and let's get rid of this SUM, and let's see if this works. So let's click the check button, and now this looks a lot better. What this is now showing us is, at a row level, this nylon rope made us $51,000, almost $52,000, the waterproof matches made us $115,000, and we can go down and look at each item and see how much that actually made us, versus this Profit_Column. And so that is the biggest difference between SUM and SUMX. Hopefully that made sense.
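The aggregator versus iterator distinction can be seen with two rows of made-up numbers in Python. This sketch assumes price, production cost, and units sold sit side by side in one flattened table, which is a simplification of the two-table model in the video; the figures are placeholders, not the real dataset.

```python
# Hypothetical rows: (product, price, production cost, units sold).
rows = [
    ("Nylon Rope", 20, 10, 5),
    ("Waterproof Matches", 10, 4, 10),
]

# Aggregator style, like (SUM(price) - SUM(cost)) * SUM(units):
# each column is totalled first, then the arithmetic runs once on the totals.
total_price = sum(price for _, price, _, _ in rows)
total_cost = sum(cost for _, _, cost, _ in rows)
total_units = sum(units for _, _, _, units in rows)
aggregate_profit = (total_price - total_cost) * total_units

# Iterator style, like SUMX(table, (price - cost) * units):
# the expression runs once per row, and the per-row results are summed.
row_profits = [(price - cost) * units for _, price, cost, units in rows]
iterated_profit = sum(row_profits)

print(aggregate_profit)   # one grand total, identical for every row it is shown on
print(row_profits)        # a separate profit for each row
print(iterated_profit)
```

With these numbers the aggregator version gives (30 - 14) * 15 = 240, while the row-by-row version gives 50 + 60 = 110. The first figure mixes the rope's margin with the matches' unit count, which is why the pasted SUM formula showed the same inflated value on every row.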
I know that SUM and SUMX, and the difference between an aggregator function and an iterator function, can be a little bit confusing, especially if you've never done it before, but hopefully that was a good example for you to understand that concept. Now let's go back over here to Apocalypse Sales. Right here we have a Date Purchased. Now in DAX we have some ways that we can interact with dates, and so I want to take a look at those really quickly. We're going to go right up here and click on New column, and we're just going to leave that as Column, but what we're going to say is day. There's a few different ones: we have DAY, DATESYTD, NEXTDAY, PREVIOUSDAY, and WEEKDAY, and they all are pretty self-explanatory if you click on them. Let's click on WEEKDAY. It says it's going to return a number from 1 to 7 identifying the day of the week of a date. So let's use this really quickly. We're going to say Date Purchased and hit tab, hit comma, and it's going to give us three different options. Basically it's a one, a two, and a three. Right here, if you hit this Read more button, you can read more on it. This one is going to say Sunday is equal to one, Saturday is equal to seven. I like this one personally, which is Monday equals one. In my brain it just makes more sense. So I'm going to click on two, I'm going to close that parenthesis, and let's say Day of Week for the column name. Let's click that check mark, and now Saturdays are equal to six, Mondays are equal to one. This allows us to see which day of the week people are buying the most products on, or which day of the week somebody is submitting their orders on.
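The three numbering options offered for WEEKDAY can be reproduced with Python's standard library. The helper name `dax_style_weekday` is hypothetical; it is a sketch of the numbering schemes only (1: Sunday=1 through Saturday=7, 2: Monday=1 through Sunday=7, 3: Monday=0 through Sunday=6), and the sample dates are arbitrary days in early 2022.

```python
from datetime import date


def dax_style_weekday(day, return_type=1):
    """Return a day-of-week number using one of three numbering schemes."""
    iso = day.isoweekday()          # Python's ISO numbering: Monday=1 ... Sunday=7
    if return_type == 1:
        return iso % 7 + 1          # Sunday=1 ... Saturday=7
    if return_type == 2:
        return iso                  # Monday=1 ... Sunday=7
    if return_type == 3:
        return iso - 1              # Monday=0 ... Sunday=6
    raise ValueError("return_type must be 1, 2, or 3")


saturday = date(2022, 1, 1)
monday = date(2022, 1, 3)

# Option 2 is the one chosen in the video: Monday is 1 and Saturday is 6.
print(dax_style_weekday(monday, 2), dax_style_weekday(saturday, 2))
# Option 1 would instead make Saturday 7 and Sunday 1.
print(dax_style_weekday(saturday, 1))
```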
go over to our report let's get rid of
this we're just going to move this oh jeez
I hate moving stuff sometimes all right
really quickly I want to show you the
difference between what we just did and
what we already have so we have this
date purchased and let's make that into
a bar
graph and what we're going to be taking
a look at is actually the units sold so
right here we have this and obviously
we don't want 2022 we're going to
get rid of the year we only have one
quarter right here we can see January
February March so we can tell that
January has the most sales or the most
units sold in that month if we get rid
of that we go down to day we do have
some information but we don't know what
day of the week it is it could change
from month to month and it's really hard
to tell if there's any
pattern there at all that's where what
we just created comes in handy so let's
recreate this exact same thing but
instead we're going to use day of week
so we're going to select day of week and
units sold let's drag that down and move
this over right here and this day of the
week should be on the
x-axis and it's really easy now to see if
there's a pattern here there's really
not at least not for this fake data that
we have but I want these
data labels on really
quickly it's not easy to see if
there's any pattern again Monday has the
most so maybe I mean it goes
down a little bit and then it picks back
up so maybe the middle of the week is our lowest
sales day our Wednesdays and
Thursdays are a little bit lower than
the rest and the beginning and the end
of the week tend to be the highest again
not a huge pattern but you know it's
much easier to see if there is a pattern
from week to week or what day of the
week now that we use this WEEKDAY
function and so this can be really
really useful let's go back here to our
data now we're going to look at our last
DAX function for this video let's go up
here and create a new column and we're
going to be looking at something called
the IF statement now if you've ever used
Excel I'm sure you have heard of this
and you can do the exact same thing here
in Power BI and so we're going to name
this one order size order underscore size and
so all we're going to say is IF we're
going to click on this one right here we
need to perform our logical test and
then we want to say if it's true what's
our value and if it's false what is our
value so what we're going to be looking
at is units sold so we're looking at
order size so we're going to say if units
sold is greater than
25 what's going to happen if it is true
if the order is larger than 25 we want
to say it's a big order and if it's not
we want to say it's a small
order super simple we'll close that
parenthesis we'll click okay and now
really quickly we're able to see if this
is a big order or a small order and so
that is all I have for you today there
are a lot of other DAX functions but the
ones that we looked at today are ones
that are very common ones that you'll
see the most and there can be a lot of
really complex and intricate DAX
functions that you can create and in our
project at the end of this series I will
be sure to include some more complex DAX
functions but hopefully this gave you a
good introduction into DAX so you know
how to use it a little bit better thank
you guys so much for watching I really
appreciate it if you like this video be
sure to like and subscribe and check out
all of my other videos on everything
data analyst related I will see you in
the next
video [Music]
[Music]
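The two calculated columns from this video can be sketched in plain Python for anyone who wants to check the logic outside Power BI. In DAX they would look roughly like `WEEKDAY([Date Purchased], 2)` and `IF([Units Sold] > 25, "Big Order", "Small Order")`; the column names and the exact label strings are assumptions, and `dax_weekday` and `order_size` are hypothetical helper names that only mimic the numbering and the comparison.

```python
from datetime import date


def dax_weekday(d: date, return_type: int = 1) -> int:
    """Mimic the numbering of DAX WEEKDAY(date, return_type)."""
    iso = d.isoweekday()  # Python: Monday=1 ... Sunday=7
    if return_type == 1:
        return iso % 7 + 1  # Sunday=1 ... Saturday=7
    if return_type == 2:
        return iso  # Monday=1 ... Sunday=7 (the option picked in the video)
    if return_type == 3:
        return iso - 1  # Monday=0 ... Sunday=6
    raise ValueError("return_type must be 1, 2, or 3")


def order_size(units_sold: int, threshold: int = 25) -> str:
    """Mimic IF(units_sold > 25, big, small); labels are assumed strings."""
    return "Big Order" if units_sold > threshold else "Small Order"


if __name__ == "__main__":
    saturday = date(2022, 1, 1)  # 1 January 2022 was a Saturday
    print(dax_weekday(saturday, 1), dax_weekday(saturday, 2))
    print(order_size(26), order_size(25))
```

Note that the comparison is strictly greater than, so an order of exactly 25 units counts as a small order.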
what's going on everybody welcome back
to the Power BI tutorial series today
we're going to be looking at how to
drill down in
visualizations so when I say drill down
I mean you're basically adding another
layer beneath the top layer of the
visualization and when somebody clicks
or drills down in that data they can see
more insights and more information on
the top level of data when you drill
down you can also drill up and I will
show you how to do that in this tutorial
so without further ado let's jump on my
screen and get started with the tutorial
all right so before we get started I
wanted to remind you that you can find
the data that we're going to be working
with in this tutorial in the description
you can go and download it from my
GitHub now the two tables I'm going to
be looking at are apocalypse sales and
purchase tracker and if you've ever
created any visualizations you've
probably seen something like this where
you'll have the store and the price and
these are the things that we actually
bought so this is the total amount of
apocalypse prepping equipment that we
bought and we'll put the store in this
legend right here and you've probably
seen something like this and if you're
anything like me you're going to be in a
meeting and you're going to be
presenting this and some higher up is
going to be like hey Alex that looks
great but I want to you know see what
things we actually bought in Target and
how much this cost can you create a
visualization for that and you're going
to be like well I could or I could use
drill down so you could have done this
in the first place which you should
have so all
we're going to do is we're going to say
the product right
here and these are going to be the
actual things and we're going to put it
right under store now you can't see
these things right but there is a
hierarchy here so once we added this
these options became available let's
take it out and all those just
disappeared and then if we add it back
right here they came back and so you can
do right here which is click to turn
on drill down you can go to the next
level in the hierarchy or you can even
expand all down one level in the
hierarchy so let's look at each of those
really quickly so let's click on this
one it's just going to turn on drill
down mode so now if I go and I click on
Target it's going to drill down into
these and if we want to I can then put
product under this
legend and we can see all of those
things but of course if we go back up
it's going to be all broken up into this
clustered column chart which is more
like
this which isn't exactly what we were
going for but it works now let me get
rid of this I actually want store in the
legend now if we turn that off and we
click it doesn't do that anymore so what
it does now is it just highlights
Walmart it highlights Costco it
highlights Target so we're going to keep
that on but we can also do something
called going down to the next level of the
hierarchy so let's click on that and so
now this is going to go down to the next
level down to this product level because
that is the next level and now it's
going to show us each of those things
but it's going to have it broken out by
the store and so it's a completely
different visualization but all within
the same realm of the data that we're
looking at and what we actually care
about so let's go back up in the
hierarchy and then let's use this one
right here which is expand all down one
level in the hierarchy and so this one
is again extremely similar except it
just visualizes it differently and now
what it's doing is Walmart rice Target
dried beans Costco rice so instead of
having it all like this one where
it's stacked on top of each other it's
breaking it down individually so this
one column would become three separate
columns now I'm going to minimize this
right here I'm actually going to go
back up in the hierarchy just for visual
purposes now I'm going to show you one
more example we're going to use this
apocalypse sales up here and this is one
that I actually use all the time so the
one you've seen you know you'll get
stuff like that especially if you're
working with like sales and stuff but I
work in operations right so I have a lot
of order IDs product IDs stuff like that
now this one genuinely I use
quite often I'll have a customer let's
make it we'll just go like this we have
a customer and we have units sold and
let's use the customer as the legend so
let's make this one quite a bit
larger and I'll have something like this
and they'll say okay well we want to see
the order IDs that go with it because
we want to know what orders are actually
happening for each of these people
obviously I'm not using this exact data
but very very very similar and all you
have to do is take these order IDs and
slide it right under here under customer
and this visualization right here is
something I've done a thousand times
because what happens is some
stakeholder in our company is saying hey
Alex we want this and we want to know we
want to drill down on this IP address we
want to drill down on this certain
database we want to drill down on
something and we want to see the order
IDs within them so then all you do is
you turn on drill mode or drill down
mode you'll click on it and you can see
every single order ID that's in there
and then they can go and look those up
in their system and resolve them or
whatever they're trying to do with it
and it helps a ton and it's very very
useful this one is extremely applicable
and that's really all drill down is
again you have these different
hierarchies as well but for different
things it's not as useful as you can see
we also have this hierarchy which again
is not as useful so it just depends on
the data that you're using and how you
want to use this drill down effect but I
promise you that drill down is used all
the time especially when you're giving
presentations where people want to know
more information than just the
visualization that you're presenting so
I hope that this has been helpful I hope
that you understand drill down a little
bit better if you like this video be
sure to like and subscribe and check out
all my other videos on Power BI thank you
and I'll see you in the next video
[Music]
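The three drill options from this video can be thought of as different groupings of the same rows. The sketch below is only an illustration in plain Python, not how Power BI works internally; the rows in `PURCHASES` are made-up values, and the function names are hypothetical.

```python
from collections import defaultdict

# Made-up purchase rows for illustration only: (store, product, price).
PURCHASES = [
    ("Target", "Rice", 20.0),
    ("Target", "Dried Beans", 15.0),
    ("Walmart", "Rice", 18.0),
    ("Costco", "Rice", 30.0),
    ("Costco", "Dried Beans", 25.0),
]


def by_store(rows):
    """Top level of the hierarchy: one total per store."""
    totals = defaultdict(float)
    for store, _product, price in rows:
        totals[store] += price
    return dict(totals)


def drill_down(rows, store):
    """Drill-down mode: click one store and see only its products."""
    totals = defaultdict(float)
    for row_store, product, price in rows:
        if row_store == store:
            totals[product] += price
    return dict(totals)


def next_level(rows):
    """Next level: one column per product, split by the store legend."""
    totals = defaultdict(lambda: defaultdict(float))
    for store, product, price in rows:
        totals[product][store] += price
    return {product: dict(stores) for product, stores in totals.items()}


def expand_all(rows):
    """Expand all: one separate column per (store, product) pair."""
    totals = defaultdict(float)
    for store, product, price in rows:
        totals[(store, product)] += price
    return dict(totals)


if __name__ == "__main__":
    print(by_store(PURCHASES))
    print(drill_down(PURCHASES, "Target"))
    print(next_level(PURCHASES))
    print(expand_all(PURCHASES))
```

The customer and order ID example follows the same shape: swap store for customer and product for order ID, and `drill_down` lists every order under the customer that was clicked.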
what's going on everybody welcome back
to the Power BI tutorial series today
we're going to be taking a look at
conditional
formatting now conditional formatting
may sound familiar because we looked at
it in the Excel series and it's very
similar how you use it in Excel versus
how you use it in Power BI conditional
formatting allows you to take a table or
a matrix within Power BI and use those
cells to color code them and create
gradients and different visualizations
within the actual table or matrix I'm
excited to start this one so let's jump
over to my screen and get started with the
tutorial all right so before we get
started if you want to use the data that
we're using in this video you can find
it in the description on my GitHub now
conditional formatting is super simple
and you've most likely used it in Excel
before but you can also use it in
Power BI and let me show you how to do
that so the first thing we're going to
do is come over to our apocalypse
store and we're going to pull up our
product name as well as the price and
what we can do is come over here and
we're going to go to price and it has to
be under the columns so you can't come
over here and do this we're going to
come right over here to price and we're
going to right click and let's go to
conditional formatting and we have
background color font color icons and
web URL let's take a look at background
color first this is most likely the one
that we'll look at the most so we're
going to get this pop up and I'm going
to slide this over now there's a lot of
different things we can customize in
here and the first thing I want to take
a look at is format style we have the
gradient and what it's going to say is
the lowest value will be this color
highest value will be this color it'll
give us this gradient color scale and so
we'll use that in just a little bit but
we can also create rules kind of like an
if statement and if it is between this
range and this range we give it a color
and if it's between a different range
and a different range we'll give it a
different color so we'll also try that
one and then we have this field value
and this one is one that honestly I
don't use that much I've used it maybe
once and what you can do is select a
text field like customer and you can do
some summarizations on the first and
last and that is it so what we're going
to do is we're going to look at gradient
specifically for not the customer but
we're going to go back to the apocalypse
store and we're going to do it on the
price now what I'm going to do is keep
it as the count because this is what the
default is and we're going to go back
and fix it later but what we want our
lowest value to be is this bright green
showing that it's a cheap product
it's easy to purchase the high value
ones are going to be this shade of
red more expensive and we'll do it on
the count now remember the count is on
each of these and we're not doing a
count of how many are sold we're doing a
count of each product so it's just one
per row so it all should be the same
color let's take a look so it is all the
same color but what we really want to
show is the actual price not just the
count of the price so let's go back to
conditional formatting we're going to
click the background color again and
this time we're going to change the
summarization now you can do sum you can
do average minimum maximum it really
doesn't matter for this example the
number is the same regardless of
which one we choose so we can just
choose the minimum and it's going to
choose the minimum of each row which is
the price so we're just going to select
minimum for this example we'll select
okay and it should correct it
accordingly which means the bright green
is the lowest and it goes all the way up
to the highest which is the red now
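To see why the count version came out as one flat color and the minimum version came out as a scale, here is a rough sketch of a two-color gradient. It assumes a simple linear blend between the lowest and highest value, which is a guess at the idea rather than Power BI's documented behavior; `gradient_color` and the RGB values for green and red are hypothetical.

```python
def gradient_color(value, lowest, highest,
                   low_rgb=(0, 255, 0), high_rgb=(255, 0, 0)):
    """Blend linearly from low_rgb (lowest value) to high_rgb (highest value)."""
    if highest == lowest:
        # Every value is identical (like a count of 1 per row),
        # so every cell ends up with the same color.
        position = 0.0
    else:
        position = (value - lowest) / (highest - lowest)
    return tuple(
        round(low + (high - low) * position)
        for low, high in zip(low_rgb, high_rgb)
    )


if __name__ == "__main__":
    # Count of price: one per row, so lowest == highest == 1.
    print(gradient_color(1, 1, 1))
    # Actual price (assumed sample range of 10 to 50).
    for price in (10, 20, 50):
        print(price, gradient_color(price, 10, 50))
```

With the count, the lowest and highest values are both 1, so there is nothing to spread the colors across; with the actual price, each row lands somewhere different between green and red.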
let's go over here to apocalypse sales
we'll add in the units
sold and let's move that out a little
bit and I'm doing that on purpose
because we're about to look at something
within the conditional formatting so
let's go to units sold and we'll look at
the conditional formatting for this one
now if you noticed we now have a new one
on here called data bars now we're able
to see data bars on units sold and
not price because units sold is something
like a sum an average something that's
aggregated but let's take a look at
data bars because I want to show you
how to use this and then we'll go back
to the background color so for data bars
we are going to be taking a look at the
lowest to the highest value again we're
going to go from bright green all the
way
to this exact red it's going to be from
left to right and what it's going to
show you is if it is a positive number
which all of these are it is going to be a
green bar basically representing the
number that you see in here along this
line so let's click
okay and we're going to be able to see
the highest numbers and let's scooch
this over quite a bit so you can kind of
get a better understanding and we're
going to do it from highest to lowest so
we sold the most multi-tool survival
knives at 477 and so this entire bar
this row is entirely filled up or almost
all the way filled up while as it gets
lower and as we sell only 182 solar
battery flashlights the bar is going to
represent that and show that now I'm
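The data bar idea is just a bar whose length is proportional to the number in the cell. A minimal sketch, assuming the bar is scaled from zero up to the largest value in the column (Power BI lets you change the minimum and maximum, so this is one possible setting, not the definitive one); `bar_length` and `render_bar` are hypothetical names, and the two values are the ones read out in the video.

```python
def bar_length(value, max_value, width=20):
    """Number of bar segments for a cell, scaled from 0 to max_value."""
    if max_value <= 0:
        return 0
    return round(max(value, 0) / max_value * width)


def render_bar(value, max_value, width=20):
    """Draw the bar as text, padded to the full cell width."""
    filled = bar_length(value, max_value, width)
    return "#" * filled + "." * (width - filled)


# Units sold as read out in the video.
UNITS_SOLD = {
    "Multi-tool Survival Knife": 477,
    "Solar Battery Flashlight": 182,
}

if __name__ == "__main__":
    highest = max(UNITS_SOLD.values())
    for product, units in UNITS_SOLD.items():
        print(f"{product:<28}{render_bar(units, highest)} {units}")
```

The top seller fills the whole cell, and the flashlight at 182 fills well under half of it.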
represent that and show that now I'm about to completely mess up this
about to completely mess up this visualization on purpose because it's
about to get very messy to show you that
you can do a little bit too much it
is possible what we're going to do is
we're going to go right over here to
this background color for unit sold and
instead of gradient let's look at rules
now with the price we just did a
gradient scale but we can do basically
groups of these and say if a number is
greater than or equal to this number
then it's going to be a certain color
and then if it's in a different range we
can give it a different color so we're
going to say if it's greater than or
equal to zero and we're going to say
number not percent and if it's less than
266 because we have 265 right here let's
make it a nice gold a beautiful
lovely mustard gold just great now
we're going to say if it's greater than
or equal to we'll do 266 because
the first rule is less than 266 so it should be
greater than or equal to 266 number and
if it is less than we'll say
500 now we want to do this one and we'll
give it let's do like a peach and
we'll click okay and now we have another
conditional formatting on top of that that can give us more information now
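The two rules just entered can be read as a small lookup: each rule is a half-open range (greater than or equal to the lower bound, less than the upper bound) with a color attached. The sketch below is plain Python, not anything Power BI runs, and the color names are placeholders for whatever is picked from the palette.

```python
# Sketch of the two background-color rules from the dialog.
# Color names are placeholders; the thresholds come from the video.
RULES = [
    (0, 266, "gold"),     # 0 <= value < 266
    (266, 500, "peach"),  # 266 <= value < 500
]


def background_color(units_sold):
    """Return the color of the first rule whose range contains the value."""
    for lower, upper, color in RULES:
        if lower <= units_sold < upper:
            return color
    return None  # no rule matched, so the cell keeps its default background


for value in (0, 265, 266, 499, 500):
    print(value, background_color(value))
```

Because the first range stops just below 266 and the second starts at 266, no value falls into both rules or into a gap between them.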
again you should not do this it's just
too many now let's go one step further
and make it even more ridiculous and
show you one more thing before I show
you how you may actually want to use
this let's go back to unit sold we're
going to right-click go to conditional
formatting and you can do something
called icons font color is the exact
same thing as background color except it
changes the font and so I'm not
really going to look into that one icons
are very simple extremely similar to
Excel and how you've seen them and the
rules that you can apply to them are
basically the same as if you're doing
like a gradient and it's these if
statements that we saw before now it
auto gives us this right here which
basically says 0 to 33% 33 to 67 67 to
100 if it's in the bottom 33% it gives us
this red the middle is yellow and the
top is green so we can go through and
change all of this but honestly this
looks pretty good so let's click on
it and so the ones that are our least
sellers are these red ones right here
and the top sellers are up here now this is just based on unit sold and this
looks absolutely terrible so let's kind
of take this exact information but make
it a little bit better so we're going to
create a new visualization or at least a
new table so let's click on product name
and we'll take the price unit sold and
revenue and what I think makes the most
sense for looking at revenue is these
data bars right here but there's only
one problem I can't do that because it's
not summarized like unit sold was but
what I can do to get those data
bars is I can come right down here
instead of saying don't summarize I can
summarize it and I can just click
sum so now it's summarized it's the
exact same number but if I right-click
on here as sum of revenue and go to
conditional formatting I can now use
those data bars and so we're going to
use those data bars and we're going to
say for the lowest value and the highest
value and let's just make it a
nice maybe a darker green I don't want
it to well that's hideous let's
make it this color right here a nice
dark green and there's no negative so it
doesn't really matter we're going to go
left to right and you can show the bar
only but we're going to keep the value because
I want to see it and we're going to go
just like this we're going to
order and this is pretty telling
honestly I did not think the
weatherproof jackets were performing so
well but I mean they are by far the number
one seller so you know our weatherproof
jackets multitool survival knives and
the nylon rope are outperforming
all of our other products so those
might be the ones that I focus on the
most while duct tape the n95 masks and
waterproof matches I mean those are
garbage so I might be looking
to replace those in the near future with
some other items that might sell a
little bit better so that's how you use conditional formatting and it's actually
pretty useful there are a lot of times
where I've done something like this in
an actual visualization for work and it
looks something like this it just
depends on what you're visualizing but
this is very much a simple thing that
you can do to just add a little bit more
information and an actual visual
to this little chart or table that
you're going to create sometimes it's
just better to have these simple
visualizations on this table rather than
just having the numbers themselves it makes
it a little bit easier to read and
understand so again I hope that this was
helpful thank you guys so much for
watching I really appreciate it if you
like this video be sure to like and
subscribe and check out all my other
videos on powerbi and I'll see you in
the next video
what's going on everybody welcome back
to the powerbi tutorial series today
we're going to be taking a look at bins
and lists now bins and lists are really
useful because they allow you to group
things together to analyze and visualize
them easier so in this tutorial I'll
show you how to create your bins and
lists and then we'll create some
visualizations to show you how it can be
helpful so without further ado let's
jump on my screen start with a tutorial all right so before we get started I
wanted to let you know you can go and
download the data that we're going to be
using in this tutorial in the
description below it's on my GitHub so we
are going to be looking at bins and
lists today and for this we're going
to be going over here to this apocalypse
sales and let's open up our data
right over here and we want to look at
apocalypse sales really quickly I feel
like more people would know what a bin
is so we'll kind of start with a list
and go a little bit backwards from how we
normally would we're going to use this
customer column right here for a list
really quickly and you can do that in
two ways you can come up here and you
can right-click on the customer and go
to new group or you can come over here
under the fields section on the
far right and go to customer right-click
and click new group so let's click on that
now and right now it's only giving us the
list type it's not giving us bins
because bins have to be numeric so we
really can't do that at the moment so
we're going to call this just customer
groups or we'll actually call it
list just so it's easier to recognize
when we create it and so all we're going
to do is we're going to basically group
these but it's going to be called a list
and so what we're going to do is we're
going to select one and then
select another and we're going to say group and
click on this group button and then it
creates this Alex the analyst apocalypse
Preppers and this prep for anything
prepping store group so it kind of named
it for us but if we double-click on it
then we can rename this and we can call
this the best prepping
stores and then we have these last two
and we can click on one and then
hold control and click on the other one
so we get both of them and then we can
click group and
we'll double click and we'll call this the worst prepping stores
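The list that the dialog builds is essentially a lookup from each customer to a group name. A minimal sketch in plain Python, under some assumptions: only store names that are actually said in the video are included (the second store in the worst group is never named, so it is left out), and the "Other" fallback for ungrouped customers is a hypothetical choice, not something shown on screen.

```python
# Hypothetical lookup mirroring the two groups built in the dialog.
# Only store names mentioned in the video are listed here.
CUSTOMER_GROUPS = {
    "Alex the Analyst Apocalypse Preppers": "Best Prepping Stores",
    "Prep for Anything Prepping Store": "Best Prepping Stores",
    "Uncle Joe's Prep Shop": "Worst Prepping Stores",
}


def customer_group(customer, default="Other"):
    """Return the group for a customer, like a chain of if statements."""
    # The "Other" default is an assumption made for this sketch.
    return CUSTOMER_GROUPS.get(customer, default)


print(customer_group("Uncle Joe's Prep Shop"))
print(customer_group("Some Unlisted Store"))
```

Writing this as a dictionary rather than nested if statements keeps the grouping in one place, which is roughly what the dialog does for you.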
and that's it that's all we
have to do
and if you want to undo this and
switch it up you
can click on ungroup but we're not going
to do that we're going to click
okay and here is the column that it
created and it basically tells us what
list we put it in if it's Uncle Joe's
Prep Shop that's in the worst prepping
stores list and if it's the Alex the
analyst apocalypse Preppers that is in
the best prepping stores so it's kind of like an if statement you could even
create a calculated column on this
customer column with an if statement this is
just a lot faster and a lot easier than
doing that but it basically would do the
exact same thing now you can use lists
as well on numeric columns so let's
say we have order
ID and we'll go to new group and it's
going to auto go to bin because
typically that's what you'll use but you
can do list as well and let's say
we'll group these and call
these the
first orders because
we're looking at order IDs so these are the
first orders and then we will go back
here on the left side we're
going to click at the
top we're going to shift-click
group all of these and we'll say the
latest
orders and you absolutely can do this
again this is kind of like an if
statement right so you're saying if it
falls between this range and this range
then it's called the first orders and if
it's between this range and this other
range it's the latest orders again
it's just a much simpler version of an
if statement and so you don't have to
write it all out you can just have this
user interface kind of do it for you
and it's really useful so now
let's talk about bins and by far the easiest way to demonstrate this and I'll
show you one other way but by far the
easiest way to show this is by using
age and so for absolutely no reason
whatsoever these customers who are
right here in this customer information
decided to give us some of their
buyer information on who is actually
buying their products on their website
or in their store they just decided to
give it to us as well as some simple
demographic information I don't know
why but what we're going to use bins for
is grouping these into age brackets so you
know you might be interested and say well
I want to know if my core population who
are buying my products are within a
certain range and you don't want to
look at every single age because then
in your visualizations
it's not going to look right you want to
kind of group them make it easier to visualize so what we're going to do is
we're going to go through here and we're
going to basically go by tens so 10 20
30 40 50 60 and see what age bracket
these people fall in so we're going to
go to age we're going to right-click and
we're going to say new group and we're
going to go to bin and we'll leave it as
the default age bins and you can do two
things you can do the size of the bins
which splits it by
this number right here or you can go
based on the number of bins so if you
only want to do five different bins
it'll calculate that for you and it'll
say okay if you only want five bins
the size is going to have to be 12.2 if
you want 10 bins it can be 6.1 but it is
completely up to you on how you want to
do that you can do the size and we'll
just say every 10 which is what we're
going to do or you can go through and
specify how
many bins you actually want so
let's go ahead and click okay and it's
going to create those bins for us so if
somebody is 78 they're going to be in
the 70 bin if somebody's 41 they'll be
in the 40 bin if somebody is 29 they'll
be in the 20 bin and so on and so forth
so when we go to visualize this we don't
have you know 71 72 73 74 and a lot
more things on our visualization it'll just be the 70 or it'll just be the 20
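The arithmetic behind both bin options is short enough to sketch in plain Python. Two assumptions are made here: a bin is labeled with its lower edge (which matches 78 landing in 70, 41 in 40, and 29 in 20), and the size for a requested number of bins is the spread between the smallest and largest age divided by that number. The 18 to 79 range below is a guess that happens to reproduce the 12.2 and 6.1 figures; the real minimum and maximum are not stated in the video.

```python
import math


def bin_by_size(value, size):
    """Label a value with the lower edge of its bin, e.g. 78 -> 70 for size 10."""
    return math.floor(value / size) * size


def size_for_bin_count(minimum, maximum, count):
    """Bin size needed to split the range minimum..maximum into `count` bins."""
    return (maximum - minimum) / count


for age in (78, 41, 29):
    print(age, "->", bin_by_size(age, 10))

# Assumed range: a spread of 61 years (for example 18 to 79).
print(size_for_bin_count(18, 79, 5))
print(size_for_bin_count(18, 79, 10))
```

Choosing the size directly gives round, readable labels like 20, 30, 40, while choosing the count gives an exact number of bars but awkward edges.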
now we can also use bins on dates as
well so let's go back to apocalypse
sales we have this date purchased so we
can create a bin for this as well so
let's go to date purchased let's go new
group now you can also create a list and
that's totally fine if you would like to
do that and it would look kind of
like this where you can go through and
you can select
a group of dates you can group
those and say this is going to be
January and you can do that and
that's totally okay but for this one
we're going to do bins I think it's a
little bit easier to do bins because
what we can do is go right here and we
can specify if we want seconds minutes
hours days months or years and so for
the data that we have it goes January
February and March so we're going to do
months and we're going to say the bin
size is going to be one month so each
month should have its own bin so it'll
be three bins total so we're going to
select
okay and as you can see on this right
side we have January of 2022 and that
corresponds to the January dates over here then
it goes down to February and then it
goes down to March and then when we
visualize this we don't have to do
the hierarchy stuff that we do in
here where we filter it down to
months we can just use this right here and that will be our month's column so
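A one-month date bin can be pictured as snapping every date to the first day of its month. The sketch below is plain Python with made-up purchase dates; only the January to March 2022 range comes from the video, and the function name is hypothetical.

```python
from collections import Counter
from datetime import date

# Made-up purchase dates for illustration only.
purchases = [
    date(2022, 1, 5),
    date(2022, 1, 28),
    date(2022, 2, 14),
    date(2022, 3, 2),
]


def month_bin(d):
    """Map a date to the first day of its month, i.e. a bin of size one month."""
    return d.replace(day=1)


# Count how many purchases land in each monthly bin.
counts = Counter(month_bin(d) for d in purchases)
for month_start, n in sorted(counts.items()):
    print(month_start.strftime("%B %Y"), n)
```

With three months of data this yields three bins, one per month, which is the same column the dialog adds to the table.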
now let's go over to our visualizations
and we'll see how this looks really
quickly we're not going to look at all
of them but we will take a look at a few
of them so the first one that we can
look at is age so let's look at the
buyer ID and then we'll do age as well
and so let's spread this
out and we can see the distribution of
our buyers so it looks like we have very
few who are in the 10 range thank
goodness and we can even put the age
right under here under the age bins and
now we kind of have this
drill down and so if we go right here
and we drill down right there this will
actually give us the breakdown so this
is what
our visualization would have looked
like if we had just kept it as age because
now we're drilling down into the age and
so it looks like we have one 18-year-old
and maybe a 20-year-old as well let's
go back up yeah so it looks like we only
have one buyer ID yes so there's only
one 18-year-old so of legal age to start
buying you know all this prepping
equipment and probably buying online
and stuff like that which makes sense
right so this gives you kind of a
quick breakdown in the bins rather than
doing it the alternative way so now
let's take a look at the customer list
as well as the unit sold and it looks
like the best prepping store is
actually performing much worse
surprisingly than the worst prepping
store and so I hope this gave you a really good idea of how to use bins and
lists within powerbi thank you so much
for watching if you like this video be
sure to like and subscribe and check out
all my other videos on powerbi I'll see
you in the next video
what's going on everybody welcome back
to the powerbi tutorial series today
we're going to be taking a look at all
types of visualizations now when you're working
in powerbi there are a lot of different
options to create visualizations and you
may not always be sure which one to use
and so that's what this video is for I'm
going to walk you through a lot of the
visualizations that I like and use a
lot as well as point out some of
the ones that I don't like as much so
that you get a feel for the ones
that I think are really popular and that
are used the most so without further ado
let's jump into powerbi and start taking a look all right before we jump into it
there is a link in the description where
you can get the data that we're going to
be using for these visualizations if you
want to practice them yourself before we
actually get into it we do need to
combine these tables and if you download that
Excel file and you see this you'll have to do
the same thing all we have to say is
that this product ID is the same as this
product ID purchased and now we are good
to go it's one to many and it's okay if
it's one way so right over here under
this visualizations tab there are lots
of different options and it can be a
little bit overwhelming you don't really
know which one to choose there are some
in here that I have almost never used
for my job so I'll point those out
as we go through but the main focus is
going to be on the ones that I
do use or have used and showing you
how to actually create that
visualization and maybe spice it up just a
little bit but we have a lot of them to go through so let's jump right into it
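Before the charts, it may help to see what the one-to-many link set up at the start of this video means in data terms: each product ID appears once on the product side, while many sales rows can point at it through product ID purchased. The sketch below is plain Python with invented rows; the snake_case field names and the IDs are assumptions, and only the product names come from the earlier video.

```python
# Hypothetical rows; values are invented for illustration.
products = [
    {"product_id": 1, "product_name": "Weatherproof Jacket"},
    {"product_id": 2, "product_name": "Nylon Rope"},
]
sales = [
    {"product_id_purchased": 1, "units_sold": 3},
    {"product_id_purchased": 1, "units_sold": 5},
    {"product_id_purchased": 2, "units_sold": 2},
]

# The "one" side: each product ID may appear only once.
product_by_id = {}
for row in products:
    if row["product_id"] in product_by_id:
        raise ValueError(f"duplicate product_id {row['product_id']}")
    product_by_id[row["product_id"]] = row

# The "many" side: every sale looks up its single matching product.
units_by_product = {}
for sale in sales:
    name = product_by_id[sale["product_id_purchased"]]["product_name"]
    units_by_product[name] = units_by_product.get(name, 0) + sale["units_sold"]

print(units_by_product)
```

This is why clicking product name and unit sold from two different tables still produces one sensible total per product in the charts that follow.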
and the very first one that we're going
to start with probably the easiest one
and the one that you'll recognize the
most is a stacked bar chart and what we're
going to do is go right over here
to the product name and we want this
unit sold as well so we're going to
click product name and it's going to go
straight into the y-axis for us and then
we're going to click unit sold and that
will go into the x-axis automatically it
just kind of intuitively knows but
sometimes it will make a mistake and
then you can just fix it or flip it and
let me make this much
larger we do want this to be a little
bit more color-coded that is what this
legend is down here so what we're going
to do is drag this product name down to
the legend and now we have each product
as its own color and in previous videos we have
color and in previous videos we have gone through and looked at some of these
gone through and looked at some of these Visual and general options that you have
Visual and general options that you have when you're actually creating these
when you're actually creating these visualizations but we're going to do
visualizations but we're going to do some of them while we're in here as well
some of them while we're in here as well so we're just going to go down here
so we're just going to go down here we're going to choose data labels and
we're going to choose data labels and we're going to shrink that and if you go
we're going to shrink that and if you go higher the higher you go the less you
higher the higher you go the less you see so if you want all of them all the
see so if you want all of them all the way down to the green we're going to go
way down to the green we're going to go right about there and we're going to
right about there and we're going to make it smaller so now we can go ahead
make it smaller so now we can go ahead and click anywhere outside of that
and click anywhere outside of that visualization and now we can create a
visualization and now we can create a new one if we had just kept it like this
new one if we had just kept it like this where we were still interacting with
where we were still interacting with this visualization and we clicked on a
this visualization and we clicked on a different one it would have then changed
different one it would have then changed our visualization completely which we
our visualization completely which we don't want so let's hit contrl Z click
don't want so let's hit contrl Z click out of it and now we can create a new
out of it and now we can create a new one let's go right over here to this
100% stacked column chart I'm going to
click on it drag it over here and make
it much larger and we're going to come
right over here to this customer
information and we're going to click on
customer and then we're going to go up
to units sold and click on units sold and
we want to break these out and so
basically what this is doing is it's
breaking it out by each of these shops
and we can see the total of what they're
buying the units sold but we want to see
exactly what products make up this
percentage of this 100% so we're going
to go right over here to product name
we're going to drag that down to the
legend and as you can see now we have
each of these products and each of the
products is up here so this backpack we
can see the backpack right here backpack
right here and right here and we can see
which customer is buying what percentage
of their purchases so for this prep for
anything prep store they have a very
large percentage 40% is duct tape so
they're buying a lot of duct tape so
really quickly we're able to see what
clients are purchasing or which clients
are purchasing what products the most
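If you want to sanity-check what a 100% stacked column chart is showing, the same percentages can be worked out by hand. The snippet below is only a minimal sketch in plain Python: the shop names and unit counts are made up for illustration (they are not the values from the video's data set), and `share_by_customer` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical rows: (customer, product, units_sold). Made-up numbers.
sales = [
    ("Shop A", "Duct Tape", 40),
    ("Shop A", "Backpack", 35),
    ("Shop A", "Nylon Rope", 25),
    ("Shop B", "Duct Tape", 10),
    ("Shop B", "Water Purifier", 30),
]

def share_by_customer(rows):
    """Return {customer: {product: percent of that customer's units}}."""
    units = defaultdict(lambda: defaultdict(int))
    for customer, product, qty in rows:
        units[customer][product] += qty

    shares = {}
    for customer, products in units.items():
        total = sum(products.values())
        # Each customer's products are scaled so they add up to 100.
        shares[customer] = {p: 100 * q / total for p, q in products.items()}
    return shares

shares = share_by_customer(sales)
print(shares["Shop A"]["Duct Tape"])  # 40.0
```

Every column in the chart is one customer scaled to 100, which is why a small shop and a large shop end up the same height.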
so just like this Alex analyst apocalypse
Preppers they're buying a lot of water
purifiers we like drinking clean water
um you know that's just what my audience
likes and so you know we can easily get
a quick glance of that again we're going
to go in here I tend to like putting
these data labels on here that's just
my preference
so you know something like this it looks
nice it looks clean um we can always go
back and change these names which we'll
do for this one so we're going to go
over here go to title we'll go down to
the text and we'll do
customer
oops customer purchase oh jeez
breakdown pretend I'm really good at
spelling and we're going to do it just
like that we'll get out of there so now
we have customer purchase breakdown and
that looks really nice it's a good uh a
good visualization and we're going to
bring that right over here we're going
to have a lot on the screen so I may
have to uh make them smaller or larger
to fit
everything all right so let's go on to
our next one another really common
visualization is this one right here
which is the line chart and the line
chart is great especially when you're
using things like dates I have found
this one to be the best and a lot
of people use this as well so we're
going to go right over here and click on
date purchased and then units sold and
on the x-axis you can see it's broken up
by year quarter month and day so we
don't want to do it that high level we
only have three months of data in here
so we're going to get rid of the year
we're going to get rid of the quarter
and then we at least have this and let's
break it out because right now we're
looking at all of the units sold so
we're going to drag the product name
right down here to the legend and now it
breaks it out by the actual product and
for each month in January February or
March you can follow these products and
see how they did in each of those months
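Dropping year and quarter from the date hierarchy and putting product name on the legend amounts to summing units sold per product per month. Here is a rough sketch of that grouping in plain Python. The dates and quantities are invented for illustration, and `units_by_month` is a hypothetical helper, not anything Power BI exposes.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (date_purchased, product, units_sold). Made-up values.
purchases = [
    (date(2022, 1, 5), "Duct Tape", 10),
    (date(2022, 1, 20), "Duct Tape", 5),
    (date(2022, 2, 3), "Nylon Rope", 7),
    (date(2022, 3, 14), "Duct Tape", 8),
]

def units_by_month(rows):
    """Sum units per (product, month number), ignoring year, quarter and day."""
    totals = defaultdict(int)
    for day, product, qty in rows:
        totals[(product, day.month)] += qty
    return dict(totals)

monthly = units_by_month(purchases)
print(monthly[("Duct Tape", 1)])  # 15
```

Each product becomes one line on the chart, with one point per month.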
and if we wanted to we can come right
over here to the filter on the product
name and we could filter it by maybe the
top three so let's do multi-tool
survival knife the nylon rope and the
duct tape and we can have it just like
this and you know you can do those for
any product that you want but again we
just want to do it for those three just
for an example and that really doesn't
give us a ton of information we could
even go down to the day and you know it
might give us a little bit more
information and so we'll keep it like
that and we can go over here change the
name as well we're not going to do this
for all of them again we're just looking
at the different types of visualizations
I think are really good to know but
we'll change this one as well to
products
purchased by
date we'll keep it just like that again
nothing fancy we're just trying to look
at a bunch of different stuff so let's
put this over here down here now let's
click out of there and there are other
ones in here um that are definitely
useful and you absolutely can use um
like this one is a stacked bar chart
this one is a stacked column chart it's
basically the same thing just a
different orientation like we went to
here it's just a different orientation
it's the same thing um just like this
clustered bar chart clustered column chart
it's just its orientation either
horizontal or
vertical then we have things like an
area chart uh stacked area chart not
really things that I've used too much in
previous positions one that I have used
though is a line and clustered column
chart so it kind of combines a few of
these with you know you have these bar
charts as well as line charts into one
visualization so let's look at this one
because this is one that I have used
several times in my actual job so for
our x-axis we'll use the product name
then we'll look at something like the
price and so let's make this a lot
larger so you can actually see it so now
we have the price and now we can look at
something like the production cost and
that can
be our line y-axis so now we're looking
at the price of it how much someone is
actually paying for it and then we're
looking at how much it's costing us to
actually produce that product and so
really quickly at a glance you can kind
of see that it's around the halfway to
2/3 point on most of these you can see
that the production cost is always lower
than the actual price because of course
we're out here to make a profit on these
products
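The "halfway to two-thirds" observation is just production cost divided by price. As a small sketch of that check in plain Python, assuming invented prices and costs (the real values are not given in the transcript) and a hypothetical `cost_share` helper:

```python
# Hypothetical (price, production_cost) pairs; not the values from the video.
products = {
    "Backpack": (40.0, 20.0),
    "Duct Tape": (10.0, 6.0),
    "Nylon Rope": (15.0, 9.0),
}

def cost_share(price, cost):
    """Production cost as a fraction of the selling price."""
    return cost / price

ratios = {name: cost_share(price, cost) for name, (price, cost) in products.items()}

# True when every product costs less to make than it sells for.
all_profitable = all(ratio < 1 for ratio in ratios.values())
print(ratios["Backpack"], all_profitable)  # 0.5 True
```

On the combined chart this is the line sitting somewhere inside each column rather than above it.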
so let's minimize this one
we're going to put this one right down
here let's make it even smaller let's
click out of that and the next one that
we're going to take a look at is a
scatter chart so let's click on that and
make it much larger
oops there we go so let's use the price
and the production cost again and so our
x-axis is the price our y-axis is the
production cost but now we need to fill
in these values right here so let's go
over here and click on the product name
and drag that into values and so now we
have our values we just don't know what
they are but we can see it so let's drag
this down to legend as well and it
breaks it out and we kind of have this
scatter plot and you know for this fake
data that we're using it doesn't really
show a lot but if you're using real
data you can definitely find outliers
and trends and patterns using this type
of visualization let's go ahead and make
that one small as well drag it right
down into the
corner now let's go right over here and
we have the dreaded pie chart um
and donut chart now look I think it's kind
of a joke in the data analyst community
about pie charts and donut charts but
at the same time people use them and
they request them and so sometimes
you're going to use it whether you like
it or not so let's click on the donut
chart and let's make this one a lot
larger and let's go over here and let's
click on
state and we're also going to click on
total purchased and that's really all
you have to do these ones are pretty
straightforward you can change a few
different things like where these labels
are if you want them inside you can also
do that and that would look totally fine
um again I'm just not a super huge fan
but you will get this one requested
people like this and want to see it and
the reason a lot of analysts don't like
using this is because when you start
glancing at these it's really hard to
tell the difference between these sizes
if you look at something like this you
can easily see that this is larger like
if you're looking at this one the
multi-tool survival knife is obviously
the longest and it gets shorter shorter
shorter shorter but when you start
getting in here it's really hard to
approximate the size I would not be able
to tell the difference between this 5.63
5.78 two uh 7.72 I would not be able to
tell really the difference between these
or kind of the difference between
them very easily that's why a lot of
people don't want to use them in general
so again I want to show you this one
because I think it's worth noting and
worth knowing how to use but I don't
really push people towards this because
I don't think it's the best
visualization available most of the time
all right the next two are super easy
but are used all the time uh maybe more
than some of these even but they're just
so easy to use so I kind of saved them
for last this one is the card and all
the card is is it displays one number or
multiple numbers if you want to use a
multi-row card but we'll just look at the
card for now all we're going to look at
is the total purchased and it's just
going to display it just like this and
you can make it as large or as small as
you'd like and normally it goes on like
the top and you'll put a card here a card
here um just for example I'll kind of
show you how this might look so it'll look
something like this right and at the top
it'll have different usually high
overarching information and this is
super common to see and I'm sure if
you've looked at other people's
visualizations you'll see something like
this this is usually totals or averages
or something like that in here where
it's super easy to look at so like right
here this is total purchased and we can
go in and look at the minimum and then
we can go over here and this one can be
a count and so it gives us a lot of
information just at a really quick
glance and then we have all of our more
in-depth colorful visualizations that
kind of have more information than just
a single piece like the card does
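Each card is one aggregate over a single column. As a tiny sketch in plain Python, here are the three used in the walkthrough (total, minimum, count), computed over made-up total purchased values rather than the data set from the video:

```python
# Hypothetical "total purchased" values, one per order. Made-up numbers.
total_purchased = [120.0, 45.5, 300.0, 80.0]

# One entry per card on the top row of the dashboard.
cards = {
    "total": sum(total_purchased),
    "minimum": min(total_purchased),
    "count": len(total_purchased),
}

print(cards)  # {'total': 545.5, 'minimum': 45.5, 'count': 4}
```

Switching a card between sum, minimum and count changes only the aggregate, not the underlying column.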
and then the very last one that I'm going to
show you is this one right here which is
the table and this one is obviously
extremely popular it's like a little
Excel table and we can go in here and we
can get the customer wherever that is
and then we'll also get the units sold
and this is what it looks like and it's
super easy and oftentimes you'll have it
like on the side as well uh and all the
other visualizations over here and so
you know if we're going to take all
these visualizations and pretend they
were like a real thing you know there's
a lot in here but we'll just kind of
really quickly do this um you know we
might have something like this and we'll
make this larger and make this
wider and you know we have a lot of
information just in here and this is not
a project so don't go put this on your
portfolio I just threw a ton of random
visualizations on you know this
dashboard but you can already see a lot
of these you most likely have seen in
other people's work in other people's
visualizations on LinkedIn or on YouTube
these are very common very very popular
and again we did not go through all of
the ones over here there are maps that
you can use but I haven't used maps ever
in my job there are things like gauges
and decomposition trees and waterfall
charts and uh tree maps and all these
different things but I really have never
used those in my actual job and I don't
see them a lot in other people's work
either otherwise I would be telling you
to learn these and use these but again
try them out see which ones you like if
you like this video be sure to like and
subscribe below and go check out all the
other Power BI tutorial videos that I
have on my channel and I will see you in
the
[Music]
next
what's going on everybody welcome
back to the Power BI tutorial series
today we are going to be working on our
final
now this is our final project of the
Power BI tutorial series so if you have
not watched all of those videos leading
up to this I recommend going and
watching those videos so you can make
sure that you know all the things that
we're going to be looking at in today's
project I am really excited to work on
this project with you because I think it
is a really good one and it uses real
data that we collected about a month ago
where I took a survey of data
professionals and this is the raw data
that we're going to be looking at and so
I think it's just really interesting
that we collected our own data and now
we're using it for a project we're going to
transform the data using Power Query and
then we'll actually create the
visualizations and finalize the
dashboards as well as create a theme and
a different color scheme to kind of make
it a little bit more unique without
further ado let's jump onto my screen
and get started with the project
all right so before we jump into it I wanted
to let you know that you can get the
data below it is on my GitHub you can go
and download this exact file that we're
going to be looking at now in the past
several projects we have been using this
fake apocalypse data set you know it was
fun it was you know whatever this
data set is real this is a real data set
it was a survey that I took from data
professionals I posted on LinkedIn and
Twitter and all these other places and
we had about 600 700 people who
responded to the questions so before we
actually get into it and start cleaning
the data and doing all this stuff in
Power BI I just wanted to show you the
data all right so this is the CSV that I
downloaded from the survey website that
I used and this is completely raw data I
haven't done anything to it at all let's
go through the data really quickly and
we'll kind of see what we have and we
are not going to make any changes at all
in Excel we're going to do all of our
transformations or at least a few
transformations in Power BI because again
this is a Power BI tutorial and project
so I want you to kind of learn how to
use that and not use Excel because you
can go through my Excel tutorial if you
want to do that so let's just look at it
in Excel and then we'll move it over to
Power BI and actually start transforming
the data so we have this unique ID these
are all the people that actually took it
oops don't want to do that we have an
email which this was completely
anonymous I didn't collect any data or
user data on this then we have the date
taken um and let's get into the actual
good information then we have all of
these questions so we have question one
which title fits you best and they can
choose things now uh let's add a filter
really quickly so that we can look at this
now you had the pre-selected ones which
were like data analyst architect
engineer but then there was an option
where you could say other and you could
specify what that was so if you look
in here we're going to have all these
different other please specify with
different titles right and there were a
lot of them now typically what you want
to do is really clean this up and we're
not going to be doing a ton ton ton of
data cleaning but we are going to do
some in Power BI but none in here but
typically with this amount of data and
the way that it's formatted we would do
so much data cleaning um with this one I
mean there is a lot of work to be
done um like this current year salary
this is one that I would absolutely be
cleaning up because it's ranges and it
has a dash and a k and all these
numbers this is something that I would
be cleaning up and using but we're not
going to be cleaning this up right now
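To give a feel for what cleaning a salary range column could involve, here is a minimal sketch in plain Python. It assumes labels shaped like `41k-65k` with both ends in thousands, which is an assumption about the format and not something confirmed here, and `salary_midpoint` is a hypothetical helper. This is not the cleaning done in this project, which stays in Power BI.

```python
import re

def salary_midpoint(label):
    """Midpoint of a range like '41k-65k', or None if it is not a two-number range.

    Assumes both ends are given in thousands (an assumption about the data).
    """
    numbers = [int(n) for n in re.findall(r"\d+", label)]
    if len(numbers) != 2:
        return None  # e.g. an open-ended bucket such as '225k+'
    low, high = numbers
    return (low + high) / 2 * 1000

print(salary_midpoint("41k-65k"))  # 53000.0
print(salary_midpoint("225k+"))    # None
```

A single number per respondent is what makes averages and card visuals possible later on; a text range cannot be averaged directly.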
going to be cleaning this up right now so anyways let's just get into it let's
so anyways let's just get into it let's see what questions we asked uh we have
see what questions we asked uh we have the yearly salary what industry do you
the yearly salary what industry do you work in favorite programming
work in favorite programming language then there were a lot of
language then there were a lot of different options this is like one
different options this is like one question where they picked multiple
question where they picked multiple options so is how happy are you in your
options so is how happy are you in your current position with the following you
current position with the following you have your salary work life
have your salary work life balance um then we have co-workers
balance um then we have co-workers management upward Mobility learning new
management upward Mobility learning new things um and they could rank it from
things um and they could rank it from zero to 10 so some people ranked upward
zero to 10 so some people ranked upward Mobility a 10 some ranked it a zero or a
Mobility a 10 some ranked it a zero or a one um and again they can answer however
one um and again they can answer however they want how difficult was it to break
they want how difficult was it to break into Data very very difficult very easy
into Data very very difficult very easy um if you're looking for a new job we
um if you're looking for a new job we have you know what would you be looking
for remote work better salary etc we
have male female which country you're from
and then this is more like demographics
so if you're a male how old you are and
this was in a range so this is like
a sliding bar so you could slide it to
the exact age you had there's some
people who are apparently 92 which if
that's true I mean good for you man or
woman actually really quickly I'm going
to see just while we're here I'm
going to see if this is a male or a
female oh it's a female from India very
cool so we have all this information
and it is a lot of information when you
have something like this I mean there is
so much data cleaning that can be done I
mean I already see like 20 plus
different things that I would need to do
to make this a lot better and we also
have date taken and the time taken as
well as how long they took on it like
the time spent really just really
interesting data but again this is a
beginner tutorial series this is the
beginner project so we're not going to
do anything too crazy I will be
using this exact data set in a future
video doing a lot more data cleaning and
creating a much more advanced
visualization with what we have and what
we're looking at right here but for this
video we're just going to be doing a
pretty simple visualization and
dashboard that you can use to
practice with or put on your portfolio
if you know that's where you're at right
now so let's get out of here and let's put this into powerbi so let's exit out
and let's come right over here to import
data from Excel we'll click on powerbi
final project and
open give that a second doing this all
in real time we only have the one so
we won't be practicing any
joins or anything but we're not going to
load it we're going to transform this
data so let's put it into the power query
editor and now we have all of our data
in here and it should look extremely
familiar now
when I start looking at this information
I kind of need to know beforehand what I
want to get out of this do I need to
clean every single column do I just need
to clean a few of them do I need to get
rid of columns that's kind of where my
head's at and so right off the bat I can
already tell you that there are columns
that we can just delete to get out of
our way so we're going to do that at the
beginning so that we don't have to do
that later on or they're just in our way
so I'm going to click on browser and
then I'm going to hit shift and I'm
going to go over here to
referrer and I'm just going to go up here
to remove columns and everything that we
do is going to go over here to these
applied steps if you've been following
this series you know we can remove
things add things but anything we do
will show up right over here so we can
track it and go back if we need to now
one column that I know for sure that I'm
going to be using quite a bit is this
which title fits you best in your
current role because I specifically
wanted to do a breakdown of different
people's roles and how much they make
and different stuff like that so I know
that I want to use this but as we saw
before the issue is
it's not very clean right it has data
analyst data architect engineer
scientist database developer and then
like a hundred different options and
then a student or none of these right
and so for the purpose of this video
right here we are not going to take
every single one of these options
because this involves a lot more data
cleaning let me give you an example this
says software engineer this also says
software engineer and with AI these two
would typically be combined or
standardized to software engineer but
it's not very easy to do that in powerbi
we could do that in Excel but not really
in powerbi or even SQL if we pull this
from a SQL database and you can find
lots of different you know options of
that we have data manager and data
manager if we separated these out these
would be different options when we
created our visualizations and we don't
want that so what we are going to do
and this is going to be kind of an
easy way out to just make sure that this
is pretty clean and we don't
have a thousand different options we're
going to turn these into other so we're
going to simplify this a lot and then we're
going to use this so we'll have maybe
six or seven options instead of the you
know let's say 50 that we would have if
we actually did the harder work which is to
just break it out standardize it and clean it up that way so what we're going
to do is we're going to click on this
right here and we're going to go up here
to split column in this ribbon up top
we'll go to split
column and we want to do it by a
delimiter and if you notice let me see
if I can move this over if you notice we
have other and then we have this
parenthesis and in no other option
is there a parenthesis so what we're
going to do is we're going to use a
custom one and we're going to use this open
parenthesis what that's going to do is
it's going to separate it by this
parenthesis it's going to leave the
other it's going to create separate
columns just one separate column for
each of these and we can do that at each
occurrence or we can do the leftmost and
we really only need it for the
leftmost because there's only one of
these left-handed or left-sided
brackets or whatever this
is called and then let's go and click
okay and it should create another column
so it's going to have .1 and .2 and
now if we click on this now we
only have these options we have analyst
architect engineer data scientist
database developer other and student
looking or none that is what we want it
makes it so much simpler and it's not
perfect but again I'm trying to show you
what we are able to do in powerbi so now
we're just going to remove that column
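A rough Python sketch of what that left-most split does, for anyone who wants to see the logic outside powerbi. The function name keep_before_delimiter and the sample titles are made up for illustration; they are not rows from the survey file, and this is not how Power Query implements the step internally.

```python
def keep_before_delimiter(value: str, delimiter: str) -> str:
    """Return the text before the first (left-most) delimiter, trimmed."""
    # str.partition splits at the first occurrence only, which mirrors the
    # "left-most delimiter" choice. If the delimiter is absent, the whole
    # value comes back unchanged.
    head, _sep, _tail = value.partition(delimiter)
    return head.strip()


# Hypothetical sample answers, not taken from the real survey data.
titles = [
    "Data Analyst",
    "Other (Please Specify):Software Engineer",
    "Other (Please Specify):Data Manager",
    "Student/Looking/None",
]

cleaned_titles = [keep_before_delimiter(title, "(") for title in titles]
print(cleaned_titles)
```

Every free-text answer collapses into the single label Other, which is the trade-off described above: far fewer categories, at the cost of losing what people typed in. The same helper works with a different delimiter, for example a colon.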
and we're going to go and do the exact same thing to this one as well because I
know that we want to use this and I
really wanted to use this one as well
but if we look at this one also
there's a lot so I said what is your
favorite programming language and
there were pre-selected answers like
JavaScript Java C++ python R things like
that and then there was an other option
and in this other option I mean it was
free text so they can fill it in as they
want I mean there's four five six
different ways that people put SQL that
is something I would standardize and you
know that would be the way I cleaned it
but that's not how we did it in here so
we're going to do the same thing we're
going to keep that other so we're going
to split this column again we'll use a
delimiter and for this delimiter though
we're going to use a colon so we're
going to do a colon
right there we'll just do the leftmost
we'll click okay and then we have our
options and it's much simpler now I
really would have rather kept all these
because SQL's in there quite a bit
but you know a lot of people don't think
SQL is even a programming language
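The standardizing that gets skipped here (several spellings of SQL, or two versions of data manager) could look something like this in Python. It is only a sketch: the alias table and the sample answers are invented for illustration, and a real cleanup would need a longer table built from the actual responses.

```python
from collections import Counter

# Hypothetical alias table: lowercased spelling -> canonical label.
CANONICAL = {
    "sql": "SQL",
    "s.q.l": "SQL",
    "sequel": "SQL",
    "data manager": "Data Manager",
    "software engineer": "Software Engineer",
}


def standardize(answer: str) -> str:
    """Map a free-text answer to its canonical label when one is known."""
    # Lowercase and collapse runs of whitespace so that " Sql " and "SQL"
    # produce the same lookup key.
    key = " ".join(answer.lower().split())
    return CANONICAL.get(key, answer.strip())


# Invented free-text answers showing the kind of variation described above.
answers = ["SQL", "sql", " Sql ", "S.Q.L", "Data  manager", "Rust"]
standardized = [standardize(a) for a in answers]
counts = Counter(standardized)
print(counts)
```

With the variants merged, SQL shows up as one category with its full count instead of being spread across several spellings or hidden under Other.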
we're going to delete that column now one that I just skipped and I kind of
wanted to go back to is this current
yearly salary I really want to use this
let's see if we can use it here's what
I want to do with it and this is not
perfect for this video I want to try
it what I want to do is break up these
numbers 106 125 and then take the
average of those numbers so then we'll
use some DAX in there so we'll take
106 125 split that into two separate
columns then we'll create a third column
that will give us the average of those
two numbers so we'll do 106 plus 125
divided by two and then we'll have the
average of that now that is not perfect
but it's going to give us at least you
know an average a kind of roundabout
number because they gave us this range
they said my salary is between 106 and
125,000 so if we say that their salary
was about 115,000 at least it makes it
usable it's a numeric value instead of
being this which is text which really
we could use and I'll show you how
to do that because we're going to keep
this column I'll create a copy of this
and I'll show you the difference between
this and using the average but
for this data cleaning portion let's
just try it let's see what we can do and
see if we can make it work
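To make the arithmetic concrete, here is the midpoint calculation as a tiny Python sketch, with salaries in thousands. The function name range_midpoint is just for illustration. The result is an estimate of where someone falls inside the range they picked, not their real salary.

```python
def range_midpoint(low: float, high: float) -> float:
    """Average of the two ends of a salary range."""
    return (low + high) / 2


# The 106 to 125 range used as the example above, in thousands.
example_midpoint = range_midpoint(106, 125)
print(example_midpoint)
```

For the 106 to 125 range this works out to 115.5, so a respondent in that bracket is treated as earning roughly 115,000.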
let's create a duplicate so we're going to uh duplicate the column so now we
have this copy at the very very end and
we can use this one instead of having to
use the original way way way back here
so we're going to leave that one how it
is and we're going to use this one so
let's go ahead and split this one up
we're going to click on the column
header then we're going to click on
split column and we'll do it by digit to
non-digit and if you look at it right
here it's broken it out kind of in
the fact that now in this one we just
have numeric values and in this one we
have k dash numeric or just dash numeric and
now this can be easily cleaned whereas
this one we can just completely get rid
of because it's only k so we'll just
remove that column and then in this one
we're going to right-click we're going to
click on replace values and
we'll just do a k we'll replace
with nothing we'll do okay and then for
the last one we'll go to replace values
and we'll do the dash or the minus sign
and we'll replace that with nothing and so
now we have our values as well oh we
also have a plus let me get rid of that
because that's when some people had 250
or 225,000 plus so for that one the
average is just going to be 225 we'll
have to specify that in our DAX I
forgot but actually if somebody has
225 let me find this plus really quick
let me filter by it because that's a
lot faster what we actually want to do
for the purpose of this one is we want
to put 225 here so that when we do 225
plus 225 divided by two it comes out to
225 that's just what we're going to put
it as and there's only two people so
I'm actually going to replace this I'm
going to do replace values I'm going to say
plus with
225 and we'll click okay awesome we can
unfilter these select all so we're going to go right up here to add column we're
going to say custom
column and we're going to go right over
here actually let's make it
average salary let's make it average
salary so we're going to insert this I'm
going to say parentheses and we're going to say
plus this
insert and close the parenthesis divided
by two and it says no syntax errors have
been detected let's click on okay and
it's giving us an error so it's saying
we cannot apply operator + to types
text and text which makes perfect sense
these aren't numbers so let's make it
a whole number and let's make it a whole
number and then let's see if this will
actually work
no or maybe we just need to try a whole
other one so let's try
add column custom
column let's try this all again see if
I can make it work
insert do this one
plus this
one and we'll do divided by two and let's
try this one and there we go so now let's get rid of this column
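The same cleaning steps can be sketched in Python to show why the first attempt failed. The custom column formula in Power Query Editor is written in M, and there + on two text values is an error, which is why both columns had to be changed to whole numbers first. The helper names and sample ranges below are assumptions for illustration, not the actual survey rows.

```python
import re


def split_digit_to_non_digit(text: str) -> list:
    """Split wherever a digit is immediately followed by a non-digit."""
    # Zero-width match: nothing is consumed, the string is only cut.
    return re.split(r"(?<=\d)(?=\D)", text)


def salary_midpoint(salary_range: str) -> float:
    """Estimate a salary in thousands from a range such as '106k-125k'."""
    parts = split_digit_to_non_digit(salary_range)
    low_text = parts[0]
    # The second piece looks like "k-125", "-40" or "k+": drop "k" and "-".
    high_text = parts[1].replace("k", "").replace("-", "")
    # An open-ended range leaves only "+", so reuse the lower bound.
    if high_text == "+":
        high_text = low_text
    # Convert text to whole numbers before doing any arithmetic.
    return (int(low_text) + int(high_text)) / 2


# Unlike M, Python does not raise on text + text; it concatenates instead,
# so forgetting the conversion would quietly give a wrong value.
text_sum = "106" + "125"

# Hypothetical salary ranges in the format described above.
samples = ["106k-125k", "41k-65k", "0-40k", "225k+"]
midpoints = {s: salary_midpoint(s) for s in samples}
print(midpoints)
```

The open-ended bracket is the weakest part of this approach: treating 225 plus as exactly 225 understates anyone above that figure, which is acceptable here only because so few respondents fall in it.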
and we can actually remove these
ones as well because now we have this
average salary
column which when we look at this or
when we use this we can let me see if
I can just move this way way way over
all right I might cut because this is
taking forever so if you take the
average of these two numbers you'll get
53 if you take the average of 0 and 40
you'll get 20 so now we have this
average salary and again when we get to
the actual visualization part I'll show
you why this isn't as useful as having
this average salary and just a reminder
this is not perfect I wouldn't
typically do this especially if I had it
in Excel or if I was you know creating
this survey in a different way I would
probably have a very specific value
where they could do it on a slider but
this is how it is so we've at least made
it usable or more usable in my mind and
we have a few other things that we can
change like what industry do you work in
where we can break this one out so I'm
going to go ahead and break this one out
as well
as this one right here which country do
you live in I'm going to break both
of those out to where it's the country
or other I'm not going to have these
other values although there are a lot of
them because there's a lot of people who
live in these different countries but we
can't really do that super well in here
because again the same issue kept
happening Argentina Argentina Argentina
Australia so we can't normalize those
values unless we spend just a copious
amount of time doing that so I'm going
to go ahead and do these I'm going to
speed this up so it
goes a lot faster so I'm just going to
go silent and let this happen really
quick and then we'll get to the end and
we'll actually start building our
visualizations all right so we've split
them up and as you can see we have all
these options as well as other and I
think you know let me tell you
there is so much more that we could do
with this I mean just so many other
things but this is like the bare
minimum of what we need for this project
so let's go ahead and close and apply
this and if we need to come back at any
point and actually fix anything or
change anything we can so it's not like
that's permanent so as you can see we
have everything over here we have all
our data as it is transformed in here as
well and now we can start building out
our visualization let's go back to our
report and let's start building
something out all right so let's add a
title to our
dashboard we want to make this right at
the top we'll call this the
data professional survey
breakdown and let's make that quite
a bit larger make it bold why not and we'll
put that in the
center and now let's add some
effects let's change that background to
something like it's too dark something
like this and I do not like that bold
let's take that
off there we go so something like this
just as a quick title to what we're
about to do what we are about to build
so we're going to start off with the most simple visualizations that we're
going to do and we'll kind of work our
way towards the harder ones so
the first one that we're going to start
off with is a card and the cards are
obviously like just super super easy
they usually just display one piece of
information so we're going to go right
over here to the very bottom at the
unique ID and we're going to select it
and we're going to say count
distinct or count it doesn't matter
it says 630 count of unique ID now we're
not going to keep that as is we're
actually going to go right over here
we're going to say rename for this
visual and it says count of unique ID
but we're going to say count
of survey takers and you can say
whatever you want here but in general
that is what it is we're counting
how many people you know took this
survey and that's just kind of a total
maybe I should say total amount of
survey takers but you can say count of
survey takers how many people took this
survey so let's click out of there let's
click on card let's make it about the
same size we're going to drag it up
here and try to make them about the same
in a little bit we'll make them
the same size but for this one we're
going to look at age so we're going to
look at current age so I'm going to click
on that and we'll say we want the average
age so our average survey taker is almost
30 years old so let's go right over here
we're going to say rename for this
visual we'll say average age of
survey oops this might be too
long average age of survey taker
name it whatever you'd like so again these are meant to be highlevel numbers
these are meant to be highlevel numbers so when somebody's looking at your
so when somebody's looking at your dashboard they can just really quickly
dashboard they can just really quickly glance at this and know exactly what it
glance at this and know exactly what it is instead of like some of these other
is instead of like some of these other visualizations that we're about to
visualizations that we're about to create they don't really have to dig
create they don't really have to dig into it look at the x- axis the y axis
into it look at the x- axis the y axis the the different uh Legend colors and
the the different uh Legend colors and whatnot they can just see these high
whatnot they can just see these high numbers and get a really quick glance of
numbers and get a really quick glance of the data now let's create our first
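If you want to sanity check what those two cards are computing outside the dashboard, here is a minimal plain-Python sketch. The field names `unique_id` and `current_age` and the three rows are assumptions made up for illustration; they are not the real survey data.

```python
from statistics import mean

# Toy rows for illustration only (hypothetical field names, not the real survey).
responses = [
    {"unique_id": "a1", "current_age": 25},
    {"unique_id": "b2", "current_age": 31},
    {"unique_id": "c3", "current_age": 34},
]

# Card 1: count of survey takers, i.e. the number of distinct IDs.
survey_takers = len({row["unique_id"] for row in responses})

# Card 2: average age of survey taker.
average_age = mean(row["current_age"] for row in responses)

print(survey_takers, average_age)
```

Both cards boil down to a single aggregate over the whole table, which is why they work as at-a-glance numbers.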
now let's create our first visualization and what we're going to do for
that one is a clustered bar chart so let's go ahead and click on the
clustered bar chart we can make it as small or as large as we'd like and
for this one we're going to be looking at the job titles now remember we
kind of changed the job titles or you know transformed those if you want
to say that so we're going to look at job titles and then we're going to
look at their average salary and if you remember we transformed that one
as well we have an average salary now this one looks like text right now
so it may not work properly and what we're actually going to do is go
over here I want to see the average salary so let's click on average
salary and see if we can change this data type from text to a decimal
number let's click yes I forgot to do that when we were transforming it
and there we go this is perfect so now we can go back and we can select
our average salary and as you can see it has this function symbol and so
now we can click on it and it'll look a lot better and although this
says average salary as the title it's actually doing a count or the sum
so we can click average right here and what we want to do is actually
break this down by the job title and so now we can see data scientists
are making the most by far they're making an average of 93,000 at least
from the survey takers that took it then we have our data engineers
making 65,000 data architects are making 63 and then where are the data
analysts data analysts are right here making 55 so again we had 630
people take this survey and the vast majority of them were data analysts
so this one's probably the most accurate out of all of them
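The two steps in that stretch, changing the salary column from text to a decimal number and then averaging it per job title, can be sketched in plain Python like this. The field names and the toy values are assumptions for illustration only; the real numbers come from the survey.

```python
from collections import defaultdict
from statistics import mean

# Toy rows (hypothetical); salary arrives as text, as it did in the report.
rows = [
    {"job_title": "Data Analyst", "average_salary": "53.0"},
    {"job_title": "Data Analyst", "average_salary": "63.0"},
    {"job_title": "Data Scientist", "average_salary": "95.5"},
]

salaries_by_title = defaultdict(list)
for row in rows:
    # Same idea as changing the data type from text to decimal number:
    # text cannot be averaged, so convert it before aggregating.
    salaries_by_title[row["job_title"]].append(float(row["average_salary"]))

# Aggregate with an average (not a count or a sum) per job title.
average_by_title = {
    title: mean(values) for title, values in salaries_by_title.items()
}

print(average_by_title)
```

A title with only a handful of respondents gets an average built from very few values, which is the reason the data analyst figure is the one to trust most here.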
and I actually don't like how this looks as the clustered bar chart
let's try the stacked bar chart and put this as the legend that's more
what I was going for I didn't want it as skinny because with the
clustered one you typically have multiple options per x-axis and so I
think that's why it was that little skinny line but this one is more
what I was looking for but let's make that smaller and let's definitely
change that title because good night this is incredibly long let's go
over here to format visual we'll go to general then title and we're just
going to say average salary by job title just like that and this looks a
lot better now we're not going to format our whole dashboard yet we're
going to create our visualizations and then we're going to organize
everything and kind of play Tetris with it to make it look the best so
we're just going to minimize this and put it right up here for now but
we will go back and make everything look better at the end and actually
while we're here I also want to change this as well so rename for this
we're going to say job title oops why did I do that job title and for
this one we're just going to rename it average salary there we go looks
much better much cleaner took away a lot of the anxiety that I was
feeling about two minutes ago when we first put that up there
so let's go on to our second visualization the next one that I'm
interested in is actually what programming language people were using
the most so we have salary there's a thousand different things we can
look at in here but I want to know what is people's favorite programming
language so let's take a look at that so we have favorite programming
language let's find that so we have our favorite programming language
and we also have how many people actually took it or the unique people
so right now this is columns we don't want that let's do a clustered
column chart click on this right here and it looks like here we go that
is kind of what we're looking for and instead of count of unique ID
we'll say let's do count of voters and for favorite programming language
we'll say favorite oops favorite programming language and get rid of
that as well and then we're going to go into here also and change the
title and say favorite programming languages or favorite programming
language just like this now let's make this a lot bigger so you can see
it but really quickly at a glance you can see Python is by far the most
popular then R other C++ JavaScript Java now all we're seeing is the
count so it's all the same it's just blue we can see how many people
voted for each one but if we wanted to break it out similar to how we
did with the job titles we could still do that so all we'd have to do is
bring this job title down to the legend and now it breaks out like this
and that's not exactly what I was going for I was going more for
something like this where we can still see the whole count but now we
can see who is actually voting for these things
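Here is a rough plain-Python sketch of what putting job title on the legend does to the counts: the total per language stays the same, it just gets split into segments. The vote pairs below are made up for illustration and are not taken from the survey.

```python
from collections import Counter

# Toy (language, job title) pairs; hypothetical data for illustration.
votes = [
    ("Python", "Data Analyst"),
    ("Python", "Data Scientist"),
    ("R", "Data Analyst"),
    ("Python", "Data Analyst"),
]

# Plain column chart: one bar per language, height = count of voters.
count_of_voters = Counter(language for language, _ in votes)

# Stacked version: the same bars, split into one segment per job title.
stacked_segments = Counter(votes)

# The segments of a bar always add back up to the whole bar.
python_total = sum(
    n for (language, _), n in stacked_segments.items() if language == "Python"
)

print(count_of_voters["Python"], python_total)
```

That is the difference between the stacked chart and the clustered one: stacking keeps the overall count readable while still showing who voted.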
so I'm just not a huge fan of the colors that are pre-selected here and
kind of the whole theme of this dashboard at the very end we're going to
completely revamp this change a bunch of colors the background and make
this look a lot nicer rather than just the white background like we have
it and so for now let's just make this a lot smaller and put it into
this corner these will not be staying there but we need room to create
our next visualizations and just a cleaner space to do things
now the next thing that I really want to include is a way to break down
where they're from their country because especially something like
salary is very dependent on your country whereas the average salary in
the United States for a data analyst may be like 60,000 in another
country it could be 20,000 that could bring down the average quite a bit
so we need a way to be able to break that down now we can do something
like a filled map and there's no problem with that at all but you know
for what we're building what we're creating it's probably not going to
work out the best I mean this looks okay we could stick it in the corner
or something and you can do that and that's perfectly fine I think what
I'm going to do is something like a tree map which I don't use a lot but
I want something where they can look at the distinct values and just
click on it and it'll be right there for them so they don't have to
filter it out on their own or know geography and look at this map they
can just read Canada other United Kingdom India United States and click
on that and so for example let's click over here on United States the
numbers change quite a bit now the average salary for a data scientist
is 139,000 for a data analyst it's 80 and if we look at India you know
the average salary for a data scientist is 68 the average salary is 26
for a data analyst that doesn't mean that they make less money in India
that just means that the cost of living is probably lower in India
therefore they don't need the higher US dollar salary because again this
was all done in US dollars so just something to think about let's click
out of that so we'll keep that one as well
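Clicking a country in the tree map is effectively a filter applied before the average is taken. A minimal sketch of that idea in plain Python, assuming hypothetical field names and toy salaries (in thousands) that are not the survey's real values:

```python
from statistics import mean

# Toy rows for illustration; salaries are in thousands of US dollars.
rows = [
    {"country": "United States", "job_title": "Data Analyst", "salary": 80.0},
    {"country": "United States", "job_title": "Data Analyst", "salary": 90.0},
    {"country": "India", "job_title": "Data Analyst", "salary": 20.0},
]


def average_salary(rows, job_title, country=None):
    """Average salary for a job title, optionally restricted to one country."""
    selected = [
        row["salary"]
        for row in rows
        if row["job_title"] == job_title
        and (country is None or row["country"] == country)
    ]
    return mean(selected)


overall = average_salary(rows, "Data Analyst")
us_only = average_salary(rows, "Data Analyst", country="United States")

print(overall, us_only)
```

With the toy rows, the unfiltered average sits well below the United States average because the lower-salary country pulls it down, which is the effect described above.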
so now let's create our next visualization and this is one that I do not
get to use enough in my actual job so we're going to use it in this
project and it's going to be this gauge right here so let's add that one
put it right over here we're going to add two of those let's just go
ahead and add another one while we're at it because we're going to have
them right here right next to each other the first one and these ones
are really good for looking at these kind of surveys and I don't get to
work with surveys enough but we can see you know how happy are they in
terms of work life balance so we can add that we're going to add work
life balance and right now it's doing a count and we don't have minimum
or maximum values in there yet so it's going to look kind of weird but
we're going to look at the average rating or the average score of these
then we're going to pull this over to the minimum value and we want to
set that to the minimum and pull this over and add the maximum value so
now it actually has zero to 10 and it shows how happy the average person
is with which one was this how happy the average person is with their
work life balance they rate it about a 5.74 overall
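The gauge needs three numbers from the same column: the average for the needle, and the minimum and maximum for the two ends of the scale. A small plain-Python sketch under that assumption, with toy scores that are not the real survey responses:

```python
from statistics import mean

# Toy ratings on a 0-10 scale; hypothetical values for illustration.
scores = [0, 10, 7, 5, 8]

gauge_min = min(scores)    # left end of the gauge
gauge_max = max(scores)    # right end of the gauge
gauge_value = mean(scores)  # where the gauge is filled to

# Fraction of the arc that is filled in.
fill_fraction = (gauge_value - gauge_min) / (gauge_max - gauge_min)

print(gauge_min, gauge_max, gauge_value, fill_fraction)
```

Without the minimum and maximum the average has no scale to sit on, which is why the gauge looks odd until both are added.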
now let's really quickly change the title of this because this is
ridiculous I want to say happy with work life balance so this is their
rating you know change it to whatever title you want that's what I'm
going to do and we'll also do happy with their salary let's click on
salary we'll add that to minimum and we'll add the maximum value as well
to make sure that we know how to use that and then we'll take the
average so not many people are happy with their salary I'm just finding
out I mean this is a real survey this is real data so I mean it's pretty
interesting let's go to the title let's go to happy with or maybe it's
happiness happiness with salary maybe that's what we should make it and
I'm going to change that over here as well I think it sounds better some
of this I've already planned out some I haven't this is not something
I've planned out so we're going to say happiness with work life balance
happiness with salary really interesting we may go back and tweak these
just a little bit in the future
but the very last visualization that we're going to do is male versus
female kind of got to have that in there I don't typically like pie
charts and donut charts but you know I'm just feeling it so let's try it
and let's see let's make this larger so we have male female and what do
we want to look at like what do we want to measure so we have male
versus female we can measure anything but maybe what we'll do is the
average salary again I mean we've kind of only looked at salary once in
this one right here and a little bit of like how happy they are but
we'll look at the average salary between males and females and then
we'll look at not the current age oops I meant average salary and then
we'll look at the average and it looks like the average salary is
actually really close for males versus females 55 for female versus 53
for male so actually the females are a little bit higher congratulations
so they're just a little bit higher in terms of pay
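One thing worth knowing about a donut chart of two averages: each slice is sized by its value divided by the sum of the values, so two close averages show up as two nearly equal halves. A quick sketch using the 55 and 53 quoted above (treated here as thousands, which is an assumption):

```python
# Average salaries as quoted in the video, assumed to be in thousands.
average_salary = {"Female": 55, "Male": 53}

total = sum(average_salary.values())

# Each slice's share of the ring is its value divided by the sum of all values.
shares = {group: value / total for group, value in average_salary.items()}

for group, share in shares.items():
    print(f"{group}: {share:.1%}")
```

A two-point gap in the averages only moves the split slightly away from half and half, so reading the actual labels matters more than eyeballing the ring.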
so now we need to start organizing all of this cleaning it up making it
look a lot better than it does right now it looks great you know but we
can do a lot more with this so we're going to keep all these kind of
over on this left hand side I'm gonna put this I want this up here we
also need to change that title I want this up here and again we're going
to kind of change the theme as we go I just want to format it right
we'll have it just like this let's change the title of this let's go to
title and we're going to say country of survey takers the survey takers
I'm not really stuck on that if you think of something better I would go
with that but you know it definitely doesn't look bad and where did my
other visualization go there it goes I think this one I want to make
kind of more tall so I might move it this way jeez I hate having a lot
of visualizations on here it just really is annoying to me
so what we're going to do I think we're gonna step this to the side put
this to the side as well I want to make it to where it's just okay I
didn't want it to cut off we'll do that might make these a little bigger
actually so I want it to kind of match the size like right there I'll
match this perfect this one I kind of want to bring over here and bring
it down a little bit maybe something like this maybe I'm not sure I'm
not sold on that I added a few different visualizations that I didn't
have in my original so now I'm kind of having to do this on the fly so I
might fast forward some of the parts where I'm like really thinking
about it or taking too much time on it but I'm going to bring this down
a little bit actually because I don't like how close that is to the text
above it but one thing we do need to do I'm going to put this up kind of
like this I think that looks fine I think I'm going to put this at the
very bottom so let's make some room for it all right just like that
stretch it to the side and we'll lower it and I think we'll keep that as
is kind of like this okay there's a lot going on in here and there are
some things I'm just noticing as we're walking through this that I kind
of missed like I need to change some titles and stuff like that so let
me go ahead and change some of those things
so we're going to do title average salary by gender or by sex do it like
that average salary by sex I also don't like that it's in the middle I
don't like that it's on the outside I want them on the inside for this
so let's go to the details let's go to inside and see if that looks any
better oh that looks terrible let me see if I can change that maybe I
don't no I definitely want it I guess we'll do outside you can't even
see the information oh the decimal is crazy long let me go and see if I
can change that decimal to just like a whole number or like 1.1 because
that's a problem so maybe I need to go over here to the value all right
so I think I want to change this one it's just not working out exactly
how I wanted and you guys know if I make mistakes I'm going to keep it
in here so you guys can see it I hoped that this was going to turn out
better but it didn't
one that I do want to add because this is kind of a breakdown and a nice
visualization I want to add this difficulty piece so I want to add this
how difficult was it for you to break into data science let's get rid of
these and I want to click on this really quickly see what it gives us
values okay so now this shows us percentages of how easy it was again
it's neither easy nor difficult difficult easy very difficult very easy
these numbers make absolutely no sense we need to kind of order them a
little better
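The percentages and the ordering problem can be sketched in plain Python: count each answer, divide by the total, and list the results from hardest to easiest instead of in whatever order the chart picks. The answers below are toy values for illustration, and the exact category labels are assumptions based on how they are read out above.

```python
from collections import Counter

# Logical order from hardest to easiest (labels assumed from the video).
DIFFICULTY_ORDER = [
    "Very Difficult",
    "Difficult",
    "Neither easy nor difficult",
    "Easy",
    "Very Easy",
]

# Toy answers; hypothetical, not the survey responses.
answers = [
    "Easy", "Difficult", "Neither easy nor difficult", "Difficult",
    "Very Easy", "Very Difficult", "Difficult", "Easy",
]

counts = Counter(answers)
total = len(answers)

# Percentage per category, listed in the logical order rather than by size.
percentages = [
    (level, 100 * counts[level] / total) for level in DIFFICULTY_ORDER
]

for level, pct in percentages:
    print(f"{level}: {pct:.1f}%")
```

In the video the fix is done with colors rather than by reordering the slices, but the goal is the same: make the scale from very difficult to very easy readable at a glance.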
so I'm going to come over here to slices we have our colors over here we
want very difficult to be like the most difficult so we're going to make
that red and then we want difficult to be maybe like an orange let's see
if we can find an orange there we have an orange this does not look red
enough there we go oh no no no very difficult is red difficult is orange
we have neither easy nor difficult and that's kind of a neutral let's
see if we have something neutral in here kind of like this yellow I
don't know let's try it out then we have easy and very easy and these
will be like our blues so I'm going to keep that kind of like a dark
bluish and then our blue for very easy is just going to be like really
blue and that doesn't look bad I mean look I'm not a color person I'm
not great with colors and we're going to kind of organize this in just a
little bit but this looks better to me but we need to change up some
stuff as well like the title we need to do difficulty to break into data
there we go and we're also going to change this title right here we'll
just say difficulty this looks better to me again not perfect and
there's a thousand different things you could have done but that's just
what we're going to do
we're going to do I need to go through here and see what I need to change so
here and see what I need to change so right off the bat I can see I need to
right off the bat I can see I need to change this
change this um to let's see right here I'm going to
um to let's see right here I'm going to rename this job title just like we did
rename this job title just like we did in this one right here uh count of
in this one right here uh count of Voters that's fine progr language
Voters that's fine progr language breaking into difficulty happiness
breaking into difficulty happiness happiness average count okay okay so
what we have here is very close to a
finished product now it's not 100%
complete I mean I do want to make it
look a little nicer rather than just the
typical white so what we're going to do
is go up here we'll go to what
is it View and we have all these
different themes and we're just going
to play around with it see if we can
find something that we like this
doesn't look too bad it's not really my
style we can do this one Frontier this
is pretty neat I kind of am digging this
we might come back to it I like the
natural tones I don't know why I said
tones like that but I did this one's
not bad but it's not
my style I don't like how dark that
is and so maybe it's like you know we
change the background color of all
of these as well and
match it with something else whatever
you want genuinely you customize this
however you want I kind of like this one
it's kind of groovy man and it's not
perfect by any means but what we can do
is customize this current theme
we can come in here and customize this theme
however we'd like I personally don't
want color five which is the data
analyst color I don't
want to go and change it just because I
don't like it but I don't really like
that color per se you know I might want
to choose a different color but it
has to be muted like that it
has a style to it so you can come in
here and you can customize this and make
it however you'd like and really
mess around with it play around
with it for me I'm just going to keep
it how it is because I don't really want
to mess with it and break it or anything
like that so let me just put that up
just a tiny bit so this is it this is
the project I hope that it was helpful
I am not joking when I say that
because I'm going to do a different project
I'm going to go really in depth in another
project it's probably going to be like a
two-hour project it's going to be crazy
long well for a YouTube video but I
can see doing a thousand different things
with this data creating a really great
dashboard really cleaning the data which
is a large part of actually doing
this and we didn't do much data cleaning
at all there's just so much you can do
with this and so really dig into this
see what you like see what you don't
like see what you want to clean what you
don't want to clean you could put it in
SQL you could put it in Excel and
just standardize the data to
make it a lot more usable do whatever
you want with it I mean I took this
survey for you guys so that we could use it
so go out and use it and make the best
dashboard that you possibly can so I
hope that this was helpful I hope that
you enjoyed this
thank you so much for watching if you
like this video be sure to like and
subscribe below and I'll see you in the
next video
[Music]
what's going on everybody welcome back
to another video today we're going to be
starting our Python tutorial series
[Music]
now I am extremely excited for
this series we're going to be walking
through all the things that you need to
know to get started in Python we'll be
looking at variables data types for
loops while loops operators and a ton more
after this beginner series we're going
to be going into another set of series
where we look at pandas Matplotlib
Seaborn web scraping and more now in
this video we're just going to be
setting up our environment to where we
can learn Python in future videos in
this series we're going to be using
Jupyter notebooks for all of our
tutorials because I feel like it's a
really great place to learn the basics
but then in future videos I'll show you
different IDEs that you can use for
your Python code I genuinely cannot wait
to get started on this series I
absolutely love Python so without
further ado let's jump on my screen I'm
going to show you how to install Jupyter
notebooks all right so let's get started
by downloading Anaconda Anaconda is an
open-source distribution of Python and
R products so within Anaconda is our
Jupyter notebooks as well as a lot of
other things but we're going to be using
it for our Jupyter notebooks so let's go
right down here and if I hit download
it's going to download for me because
I'm on Windows but if you want
additional installers if you're running
on Mac or Linux then you can get those
all right here now if you are running on
Windows just make sure to check your
system to see if it's 32-bit or 64-bit
you can go into About in your
system settings to find that information
I'm going to click on this 64
bit it's going to pop up on my screen
right here and I'm going to click
save now it's going to start downloading
it says it could take a little while
but honestly it's going to take probably
about two to three minutes and then we'll
get going now that it's done I'm just
going to click on it and it's going to
pull up this window right here we are
just going to click next because we want
to install it this is our license
agreement you can read through this if
you would like I will not I'm just going
to click I agree now we can select our
installation type and you can either
select it for just me or if you have
multiple admins or users on one laptop
you can do that as well for me it's just
me so I'm going to use this one as it
recommends now it's going to show you
where it's installing it on your
computer this is the actual file path
it's going to take about 3.5 gigs of
space I have plenty of space but make
sure you have enough space and then once
you do you can come right over here to
next and now we can do some advanced
options we can add Anaconda3 to my PATH
environment variable
and when you're using Python you
typically have a default path with
whatever Python IDE or notebook that
you're using I use a lot of Visual
Studio Code so if I do this I'm worried
it might mess something up so I am not
going to do this it also says it doesn't
recommend it again messing with these
paths is kind of something that you
might want to do once you know more
about Python so I don't really recommend
you having this checked we can also
register Anaconda3 as my default Python
3.9 you can do this one and I'm going to keep
it this way just so I have the exact
same settings as you do so let's go
ahead and click install and now it is
going to actually install this on your
computer now once that's complete we can
hit next and now we're going to hit next
again and finally we're going to hit
finish but if you want to you can have
this tutorial and this getting started
with Anaconda I don't want either of
them because I don't need them but if
you would like to have those keep those
checked and you can get those let's
click finish now let's go down and
we're going to search for Anaconda and
it'll say Anaconda Navigator and we're
going to click on that and it should
open up for us so this is what you
should be seeing on your screen this is
the Anaconda Navigator and this is where
that distribution of Python and R is
going to be so we have a lot of
different options in here and some of
them may look familiar we have things
like Visual Studio Code Spyder
RStudio and then right up here we have
our Jupyter notebooks and this is what
we're going to be using throughout
our tutorials so let's go ahead and
click on launch and this is what should
kind of pop up on your screen now I've
been using this a lot so I have a ton
of notebooks and files in here but if
you are just now seeing this it might be
completely blank or just have some you
know default folders in here but this is
where we're going to open up a new
Jupyter notebook where we can write code
and all the things that we're going to
be learning in future tutorials and you
can use this area to save things and
create folders and organize everything
if you already have some notebooks from
previous projects or something you can
upload them here but what we're going to
do is go right to this new we're going
to click on the drop down and we're
going to open up a Python 3 kernel and
so we're going to open this up right
here now right here is where we're going
to be spending 99% of our time in future
videos this is where we're going to
write all of our code so right here is a
cell and this is where we can type
things so I can say print I can do the
famous hello world
and then I'll run that by pressing shift
enter and this is where all of our code
is going to go these are called cells so
each one of these is a cell and we have
a ton of stuff up here and I'm going to
get to that in just a second
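As a rough sketch, the hello world cell described above might look like this. The variable name message is only an illustrative assumption and is not typed in the video; in the notebook you would press Shift+Enter to run the cell.

```python
# Minimal sketch of a first notebook cell.
# "message" is a hypothetical name used here for illustration only.
message = "Hello World"

# print() writes the text to the output area under the cell.
print(message)
```

Running the cell should show the text directly below it, which is how every code cell in a notebook reports its output.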
one thing I
wanted to show you is that you don't
only have to write code here you can
also do something called markdown and so
markdown is its own kind of you could
say language but it's just a
different way of writing especially
within a notebook so all we're going to
do is do this little hashtag and
actually I think it's a pound sign but
I'm going to call it hashtag we're going to
do that and we're going to say first
notebook and then if I run that we have
our first notebook and we can make
little comments and little notes like
that that don't actually run any code
they just kind of organize things for us
and I'm going to do that in a lot of our
future videos so I just want to show you
how to do that now let's look right up
here a lot of these things are pretty
important one of the first things
that's really important is actually
saving this so let's say we wanted to
change the title I'm going to do AAA
because I want it to be at the beginning
so I can show you this I'm going to do AAA
new notebook and I'm going to rename it
and then I'm going to save that so if I
go right back over here you can see AAA
new notebook that green means that it's
currently running and when I say running
I mean right up here and if we wanted to
we could go ahead and shut that down which
means it wouldn't run the code anymore
and then we'd have to start up a new
kernel so let's go ahead and do that
I didn't plan on doing that but let's do
it so we have no notebooks running and
right here it says we have a dead kernel
so this was our Python 3 kernel and now
since I stopped it it's no longer
processing anything so let's go ahead
and say try restarting
now and it says kernel is ready so it's
back up and running and we're good to go
the next thing is this button right here
now this is an insert cell below so if I
have a lot of code I know I'm going to
be writing I can click that a lot and
I often do that because I just don't
like having to do that all the time so I
make a bunch of cells just so I can use
them you can delete cells so say we have
some code here we'll say here and we
have code here and then we have this
empty cell right here we can just get
rid of that by doing this cut selected
cells we can also copy selected cells so
if I hit copy selected cells I can
go right here and say paste selected
cells and as you can see it pasted that
exact same cell you can also move this
up and down so I can actually take this
one and say I wanted it in this location
I can take this cell and move it up or I
can move it down and that's just an easy
way to kind of organize it instead of
having to like copy this and move it
right down here and paste it you can
just take this cell and move it up which
is really nice now earlier when I ran
this code right here I hit shift enter
you can also hit run and it'll run the cell
and select the one
below so you can hit run and it works
properly if you're running a script and
it's taking forever and it's not working
properly or at least you don't think
it's working properly you can stop that
by doing this interrupt the kernel right
here and anything you're trying to do
within this kernel if it's just not
working properly it'll stop it you can
restart it then you can try fixing your
code you can also hit this button if you
want to restart your kernel and this
button if you want to restart the kernel
and then rerun the entire notebook as we
talked about just a second ago we have
our code and our markdown we're not
going to talk about either of these other options
because we're not going to use them
throughout the entire series the next
thing I want to show you is right up
here if you open this file menu we can create
a new notebook we can open an existing
notebook we can copy it save it rename
it all that good stuff we can also edit
it so a lot of these things that we were
talking about you can cut the cells and
copy the cells using these shortcuts if
you would like to we can also go to view and
you can toggle a lot of these things if
you would like to which just means it'll
show it or not show it depending on what
you want so if we toggle this toolbar
it'll take away the toolbar for us or if
we go back and we toggle the toolbar we
can bring it back we can also insert a
few different things like inserting a
cell above or a cell below so instead of
using this plus button you can just press
A or B adding above or below we also
have the cell menu in which we can run our
cells or run all of them or all above or
all below and then we have our kernel menu
right here which we were talking about
earlier where we can interrupt it and
restart it there are widgets we're
not going to be looking at any widgets
in this series but if it's something
you're interested in you can definitely
do that and then we have help so if you
are looking for some help on any of
these things especially some of these
references which are really nice you can
use those and you can also edit your own
keyboard shortcuts and now that we've
walked through all of that you now have
Anaconda and Jupyter notebooks installed
on your computer in future videos this
is where we're going to be writing all
of our Python code so be sure to check
those out so we can learn Python
together thank you guys so much for
watching I hope you were able to get
everything installed correctly I am
super excited for this series ahead of
us if you like this video be sure to
like and subscribe below and I will see
you in the next video
[Music]
hello everybody today we're going to be
learning about variables in Python a
variable is basically just a container
for storing data values so you'll take a
value like a number or a string you can
assign it to a variable and then the
variable will carry and contain whatever
you put into it so for example let's go
right over here we're going to say x and
this is going to be our variable we're
going to say is equal to now we can
assign the value to it so let's say I
want to put
22 x is now equal to 22 so we won't have
to write out the number 22 in later
scripts that we write we can just say x
because x is equal to 22 it now contains
that number so now we can hit enter and
say print we do an open parenthesis and
we'll say x now I'm going to hit shift
enter and now it prints out that 22
because we are printing x and x is equal
to 22 this is our value and this is our
variable
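The assignment and print steps just described can be sketched as a single cell. This is a minimal illustration of what was typed, not a copy of the notebook on screen.

```python
# Assign the value 22 to the variable x.
x = 22

# Printing x shows the value it currently holds.
print(x)
```

Here x is the variable and 22 is the value, so any later cell can use x instead of writing the number out again.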
one really great thing about
variables is that Python assigns the
data type for you it's going to automatically do
this so we didn't have to go and tell x
that it's an integer it just
automatically knew that 22 is a number
so we can check that by saying type and
then open parenthesis and writing x and
we'll do shift enter again and this says
that x is an integer type
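A small sketch of that type check, assuming x still holds 22 from the earlier cell (it is reassigned here only so the snippet runs on its own):

```python
# Recreate x so this snippet is self-contained.
x = 22

# type() reports the data type Python inferred from the value.
print(type(x))  # prints <class 'int'>
```

Nothing in the assignment mentions int; Python works the type out from the value on the right-hand side.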
now we only
assigned an integer to x let's try
assigning a string value or some text to
a variable so we'll say y is equal to
let's say mint chocolate chip I'm
feeling some ice cream today so we'll
say mint chocolate chip now if we print
that again we'll do print open
parenthesis y and do shift enter it'll
print mint chocolate chip and if we look
at the type we can see that the type is
a string this time and not an integer
now again we did not tell it that x was
an integer and y was a string it just
automatically knew this
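The string example can be sketched the same way. The exact capitalization of the text is an assumption, since the transcript does not show what was typed.

```python
# Assign some text (a string) to y; the capitalization is assumed.
y = "Mint Chocolate Chip"

print(y)
print(type(y))  # prints <class 'str'>
```

Quotes around the value are what make it a string, so the same y = ... syntax works for both numbers and text.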
let's go up here
really quickly we're going to add
several rows in here because we're about
to write a lot of different variables
and really learn in depth how to use
variables the next thing to know about
variables is that you can overwrite
previous variables right now we have
mint chocolate chip and that is assigned
to the variable y so if I go down here I
say print y I hit shift enter it's going
to print out mint chocolate chip but
if I go right above it I say y is equal
to and let's say chocolate if I print
that out it's now going to say chocolate
whereas up here before I'm reassigning y
it's still going to say mint chocolate
chip so if I come right down here and I
copy this and I'm going to paste this
right here initially it is going to
assign chocolate to y but then right
here it will overwrite y
as mint chocolate chip and when we hit
shift enter it's going to show mint
chocolate chip
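A small sketch of the overwrite behavior, assuming the statements run top to bottom in one cell. The variable `first_value` is a hypothetical helper added here only to show what `y` held before it was overwritten.

```python
# y starts out as chocolate
y = "chocolate"
first_value = y  # hypothetical helper: remember the earlier value
print(y)

# A later assignment replaces the earlier one
y = "mint chocolate chip"
print(y)
```

The last assignment that runs wins, so the order of the lines (and the order you run notebook cells in) decides what `y` holds.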
variables are also case
sensitive so if I come up here and I say
a capital Y this is a lowercase y and
this is a capital Y it is going to print
out the correct one instead of mint
chocolate chip and then if I go down
here to the print and I type the capital
Y it will give us the mint chocolate
chip
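To make the case-sensitivity point concrete, here is a short sketch. Which flavor goes with which name is an assumption; the point is only that `y` and `Y` are two separate variables.

```python
# Lowercase y and capital Y are different names
y = "chocolate"
Y = "mint chocolate chip"

print(y)  # the value stored under lowercase y
print(Y)  # the value stored under capital Y
```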
up till now we've only assigned one
value to one variable but we can
actually assign multiple values to
multiple variables so let's do x comma y
comma z is equal to and now we can
assign multiple values to all of those
so we can say
chocolate and then we'll do a comma
then we can say vanilla and then
we'll do another comma and we'll say
rocky road now this is going to assign
chocolate to x
vanilla to y and rocky road to z so what
we can do is we'll say
print and we'll go print print print and
we'll say x y and z so it prints out
chocolate vanilla and rocky road and
these are our three different values
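The multiple assignment described here can be sketched like this, assuming one line with three names on the left and three values on the right:

```python
# Three names on the left, three values on the right, matched by position
x, y, z = "chocolate", "vanilla", "rocky road"

print(x)
print(y)
print(z)
```

The number of names and the number of values have to match, otherwise Python raises a `ValueError`.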
we
can also assign multiple variables to
one value and we can do this by saying x
is equal to y is equal to z is equal to
and we can put whatever we would like
let's do root beer float then we'll come
back up here we'll copy this and let's
print off our x our y and z and they are
all the exact same
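A quick sketch of assigning one value to several variables at once, using the same value as in the video:

```python
# Chained assignment: all three names point at the same value
x = y = z = "root beer float"

print(x)
print(y)
print(z)
```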
now so far we've
really only looked at integers and
strings but you can assign things like
lists dictionaries tuples and sets all to
variables as well so let's go right down
here so let's create our very first list
I'm going to say ice_cream is equal to
and that is our variable right there the
ice_cream is our variable so now we're
going to do an open bracket like this
and we're going to come up here and copy
all of these values and we're going to
stick it within our list so now within
ice_cream we have three string values
chocolate vanilla and rocky road all
within this list so what we can do is we
can say x comma y comma z is equal to
ice_cream so now these three values
chocolate vanilla and rocky road will be
assigned to these three variables x y
and z and we can copy this print up
here and we'll hit shift enter and now
the x y and z all were assigned these
values of chocolate vanilla and rocky
road
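Unpacking a list into variables, as just described, might look like the sketch below. The list name `ice_cream` follows the video; the rest is illustration.

```python
# A list holding three string values
ice_cream = ["chocolate", "vanilla", "rocky road"]

# Unpack the list: the first item goes to x, the second to y, the third to z
x, y, z = ice_cream

print(x)
print(y)
print(z)
```

This only works when the list has exactly as many items as there are names on the left.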
now something that we just did
which is really important or something
that you really need to consider is how
you name your variables so right here we
have ice_cream now this to me is exactly
how I usually write my variables but
there are many different ways that you
can write your variables so let's take a
look at that really quickly and let's
add just a few more because I have a
feeling we're going to go a little bit
longer than what we have so there are a
few best practices for naming variables
first I'm going to show you kind of what
a lot of people will do I'll show you
some good practices and I'm going to
show you some bad practices as well that
you should avoid doing the first thing
that we're going to look at is something
called camel case and let's say we want
to name it test variable case
now if we have a test variable case
the camel case is going to look like
this we'll have lowercase test and then
we'll have uppercase variable and
uppercase case so testVariableCase is equal to this is what
this variable is going to look like and
we can assign it vanilla
swirl and this is what your camel case
will look like it's going to be
lowercase and then all the rest of those
uh compound words or however you want to
say that these letters are going to be
capitalized to kind of separate where
the words end and begin let's go right
down here we're going to copy this the
next one is called Pascal case so Pascal
case is going to look just a little bit
different instead of the lowercase t in
test it's going to be a capital T in
test so TestVariableCase again this is
a very similar way of writing it very
similar to camel case but just a
capital at the beginning now let's look
at the last one and this one is my
personal favorite this one is going to
be the snake case now this one is quite
a bit different in the fact that you
don't use any capital letters and you
separate everything using underscores so
we're going to write
test_variable_case now
typically let me have them all in there
typically these are the best practices
these are what you typically want to do
but probably the best one to use is
this snake case right here what a lot of
people say is that it improves
readability if you take a look at either
the camel case or the Pascal case which
you will see people do it's not as easy
to distinguish exactly what it says and
the name of a variable is important
because you can gain information from it
if people name them appropriately so
when I'm naming variables I usually
write it in snake case because I just
find it a lot easier to read because
each word is broken up by this
underscore
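The three naming styles side by side, as a sketch. The names and the "vanilla swirl" value follow the video; assigning the same value to all three is an assumption made to keep the example short.

```python
# Camel case: first word lowercase, later words capitalized
testVariableCase = "vanilla swirl"

# Pascal case: every word capitalized, including the first
TestVariableCase = "vanilla swirl"

# Snake case: all lowercase, words separated by underscores
test_variable_case = "vanilla swirl"

print(testVariableCase, TestVariableCase, test_variable_case)
```

All three are valid Python and they are three different variables. Snake case is also what the PEP 8 style guide recommends for variable names.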
so now let's look at
some good variable names these are all
ones that you can use or could use let's
do something like testvar so testvar
is completely appropriate we can also do
something like test_var
we could do _test_var
you'll see that often as
well people will start it with an
underscore you can do
TestVar capital T capital V
in TestVar or you could even do
something like testvar2 now adding a
number to your variable is not
inherently a bad thing usually it's
semi-frowned upon but there are
definitely some use cases where you can
use it but one thing that you cannot do
is do something
like putting the two at the front if you
put the two at the front it no longer
works it won't run properly at all so
we're going to take that out so we can't
do that so I'm going to use this as an
example of what you should not do you
also can't use a dash so something like
test-var2 that doesn't work either and
you also can't use something like a
space or a comma or really any kind of
symbol like a period or a backslash or
equal sign none of those things will
work within your variable name
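One way to check these naming rules without triggering a syntax error is the built-in string method `str.isidentifier()`. This is not what the video does; it is a swapped-in technique for illustration, and the list `candidates` and the dictionary `results` are hypothetical names.

```python
# Names from the video, good and bad
candidates = [
    "testvar", "test_var", "_test_var", "TestVar", "testvar2",  # fine
    "2testvar", "test-var2", "test var", "test.var",            # not allowed
]

# isidentifier() reports whether a string follows Python's naming rules
results = {name: name.isidentifier() for name in candidates}

for name, is_valid in results.items():
    print(name, is_valid)
```

Note that `isidentifier()` only checks the characters. It does not tell you whether a name is a reserved keyword such as `for` or `class`.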
now another
thing that you can do with your
variable is use the plus sign so let's
assign this we'll say x is equal to and
we'll do a string we'll say ice
cream is my
favorite and then we'll do a plus sign
and we'll say period now what this will
do is it will literally add these two
strings together so let's do print and
we'll do x so now it says ice cream is
my favorite one thing that we cannot do
in a variable is we cannot add a string
and a number or an integer so we can't
do ice cream is my favorite plus 2 if we
try to do that it will give us this
error right here so in this error it's
saying you can only concatenate a string
not an integer to a string so only a
string plus a string for this example
you can also do and we'll say
y is equal
to 3 + 2 and it should output five
because you can also add an integer and
an integer
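A sketch of the plus sign with strings and with integers. The `try`/`except` around the failing line is an addition so the example keeps running; in the video the error is simply shown in the notebook.

```python
# String + string joins the two pieces of text
x = "ice cream is my favorite" + "."
print(x)

# String + integer is not allowed and raises a TypeError
try:
    broken = "ice cream is my favorite" + 2
except TypeError as error:
    print("TypeError:", error)

# Integer + integer is ordinary addition
y = 3 + 2
print(y)
```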
now so far we've only been
outputting one variable in the print
statement but you can actually add
multiple variables within a print
statement so let's go right down here
we're going to say let's give it some
more right there so we'll say x is equal
to ice
cream and we'll say y is equal
to is and then the last one z is equal
to my favorite and we'll do a period at
the end now we can go to the bottom and
we can say print x + y + z
and when we run
that we get ice cream is my favorite with no spaces between the pieces now
we can actually add a space before is a
space before my and when we hit shift
enter it says ice cream is my favorite
you can also do this exact same thing
with numbers as well so we'll say x is equal to
1 y is equal to 2 and z is equal to three so this
should equal six now one thing that we
tried to do was assign to one variable a
string plus an integer and that did not
work but what you can do is you can take
something like this and you can say ice
cream and we'll get rid of this one and
we'll get rid of the z now saying plus
is actually not going to work let's try
running this
so again we can't concatenate these but
what we can do in the print statement is
we can separate it by a comma so when we
add this comma it should work properly
let's hit enter and it says ice cream 2
again this makes no sense but you are
able to combine a string and an integer
separating by a comma
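Putting the print examples together as a rough sketch. The names `sentence`, `total`, and `scoops` are hypothetical, added so the results can be inspected.

```python
# Three strings; the leading spaces keep the words apart
x = "ice cream"
y = " is"
z = " my favorite."
sentence = x + y + z
print(sentence)

# The same pattern with integers adds the numbers
total = 1 + 2 + 3
print(total)

# A comma in print() passes separate arguments, so mixing types is fine;
# print() converts each one to text and puts a space between them
scoops = 2
print("ice cream", scoops)
```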
now this is the
meat and potatoes of variables there are
some other things as well but some of
those things are a little bit more
advanced and not something I wanted to
cover in this tutorial although we may
be looking at some of those things in
future tutorials
but this is definitely the basics what
you really really need to know about
variables I hope that this video was
helpful if it was be sure to like and
subscribe below and I will see you in
the next
[Music]
video
hello everybody today we're going
to be talking about data types in Python
data types are the classification of the
data that you are storing these
classifications tell you what operations
can be performed on your data we're
going to be looking at the main data
types within python including numeric
sequence type set boolean and dictionary
so let's get started actually writing
some of this out and first let's look at
numeric there are three different types
of numeric data types we have integers
floats and complex numbers let's take a
look at integers an integer is basically
just a whole number whether it's
positive or negative so an integer could
be a 12 and we can check that by saying
type we'll do an open parenthesis and a
close parenthesis and if we say the type
of 12 it's going to give us an integer
or if we say a -2 that is also an
integer we can also perform basic
calculations like -2 + 100 and that'll
tell us it is also an integer so whether
it's just a static value or you're
performing an operation on it it's still
going to be that data type if those
numbers are whole numbers whether
negative or positive now let's take this
exact one and let's say
12 and we'll do +
10.25 when we run this it's no longer
going to be a whole number it'll now be
a float so let's check this and now this
is a float type because it is no longer a
whole number it's now a decimal number
and the last data type within the
numeric data type is called complex
let's copy this right down here now
personally this is not one that I've
used almost ever but it is one just
worth noting so you can do 12 plus and
let's say 3j
and if we do this it's going to give us
a complex the complex data type is used
for imaginary numbers for me it's not
often used but if you do use it j is
used as that imaginary unit if you use
something like c or any other letter
it's going to give you an error j is the
only one that will work with it
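The three numeric types can be checked with `type()`, roughly as in the video. The variable names below are hypothetical and only there to hold the results.

```python
# Whole numbers, positive or negative, are integers
int_type = type(12)
negative_type = type(-2)
sum_type = type(-2 + 100)

# Adding a decimal number to an integer gives a float
float_type = type(12 + 10.25)

# A j suffix marks the imaginary part of a complex number
complex_type = type(12 + 3j)

print(int_type, negative_type, sum_type, float_type, complex_type)
```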
now
let's take a look at boolean values so
we'll say boolean the boolean data type
only has two built-in values either True
or False so let's go right down here and
say type
True and when we run this it'll say bool
which stands for boolean we can do the
exact same thing with False that is also
boolean and this can be used with
something like a comparison operator so
let's say 1 is greater than 5 and let's
check this this is giving us a boolean
because it's telling us whether one is
greater than five let's bring that right
down here this will give us a False so
it's telling us that one is not greater
than five and just as we got a False we
can say 1 is equal to 1 with a double equals sign and this
should give us a True
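A short sketch of the boolean checks described here. The names `is_greater` and `is_equal` are hypothetical.

```python
# True and False are the two boolean values
print(type(True))
print(type(False))

# Comparison operators produce booleans
is_greater = 1 > 5   # one is not greater than five
is_equal = 1 == 1    # double equals compares, single equals assigns

print(is_greater)
print(is_equal)
print(type(is_greater))
```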
so now let's take
a look at our sequence type data types
and that includes strings lists and
tuples let's start off by looking at
strings in python strings are sequences of
unicode characters
when you're using strings you put them
either in a single quote a double quote
or a triple quote I call them
apostrophes it's just what I was raised
to call them but most people who use
python call them quotes so right here we
have a single quote and that works well
we can do a double quote and that works
also and as you can see they are the
exact same output and then we have a
triple quote just like this and this is
called a multi-line so we can write on
multiple lines here so let's write a
nice little poem so we'll say the ice
cream vanquished my longing for
sweets upon this diet
I look
away it no longer
exists on this day and then if we run
that it's going to look a little bit
weird it's basically giving us the raw
text which is completely fine but let's
call this a
multi-line and we're going to make this
a variable called multiline and we're going to
come down here and say
print and before I run this I have to
make sure that this cell has been run so now let's
print out our multiline and now we have
our nice little poem right down here now
something to know about these single and
double quotes is how they're actually
used so if we use a single quote and we
say I've always wanted to eat a gallon
of ice cream and then we do an
apostrophe at the end obviously
something went wrong here what went
wrong is when you use a single quote and
then within your text within your
sentence you have another apostrophe
it's going to give you an error so what
we want to do is whenever we have a
single quote within it we need to use a double
quote these double quotes will negate
any single quotes that you have within
your statement they won't however negate
another double quote so you need to make
sure you aren't using double quotes
within your sentence if you want to do
something like that you need to use the
triple quotes like we did above so we
can do double double and then let's
paste this within
it
and anything you do within these triple
quotes will be completely fine as long
as you don't do triple quotes within
your triple quotes we'll say this is
wrong so even though it's between these
two triple quotes it doesn't work
exactly again you just have to
understand how that works you have to
use the proper apostrophes or quotes
within your string and just to check
this we can always say here's our
multiline we can always say type of
multiline and that is still a string
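The quoting rules can be sketched as follows. The exact line breaks in the poem and the variable names `single`, `double`, `wish`, and `multiline` are assumptions; the text comes from the video.

```python
# Single and double quotes produce the same kind of string
single = 'vanilla'
double = "vanilla"
print(single == double)

# An apostrophe inside the text works when the string uses double quotes
wish = "I've always wanted to eat a gallon of ice cream"
print(wish)

# Triple quotes allow text that spans several lines
multiline = """The ice cream vanquished my longing for sweets
upon this diet I look away
it no longer exists on this day"""
print(multiline)
print(type(multiline))
```

Printing `multiline` shows the line breaks. Evaluating it on its own in a notebook shows the raw text with `\n` where each line ends, which is the "a little bit weird" output mentioned above.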
one really important thing to know about
strings is that they can be indexed
indexing means that you can look up
characters by position and that index starts at zero
so let's go ahead and create a variable
and we'll just say a is equal to and
let's do the all popular hello world
let's run this and now when we print the
string we can say a and we're going to
do a bracket and now we can search
throughout our string using the index so
all you have to do is do a colon and we
can say five what this is going to do is
it's going to go from position zero all
the way up to but not including five which should give us
the whole hello I believe let's run this
and it's giving us the first five
positions of this string we can also get
rid of the colon and just say something
like five and then when we run this it's
actually going to give us position five
so this is 0 1 2 3 4 and then five is
the space let's do six so we can see the
actual letter and that is our w we can
also use a negative when we're indexing
through our string so we could say -3
and it'll give us the l because it's
-1 -2 and -3 counting from the end we can also specify a
range if we don't want to use the
default of zero so before we did 0 to
five and it started at zero because that
was our default but we could also do two
to five let's run this and now we go
position 0 1 and then we start at 2 l l o
position 0 1 and then we start at 2 L L now we can also also multiply strings
now we can also also multiply strings and we have this a hello world so we can
and we have this a hello world so we can do a * 3 and if we run this it'll give
do a * 3 and if we run this it'll give us hello world three times and we can
us hello world three times and we can also do a plus a and that is Hello World
also do a plus a and that is Hello World hello world now let's go down here and
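The indexing, slicing, and repetition steps described above can be sketched in a few lines. This is only an illustration: the exact string typed in the video is not visible in the transcript, so the value "Hello World!" (capitalization and trailing exclamation mark included) is an assumption chosen to match the outputs that are read aloud.

```python
# Assumed value: the transcript only says "hello world"; the exact
# capitalization and the trailing "!" are a guess.
a = "Hello World!"

first_five = a[:5]   # slice from position 0 up to, but not including, 5
space = a[5]         # a single position: the space between the words
letter = a[6]        # the first letter of the second word
from_end = a[-3]     # negative indexes count back from the end
middle = a[2:5]      # start at position 2 instead of the default 0

tripled = a * 3      # repetition: the string three times in a row
doubled = a + a      # concatenation: the string followed by itself

print(first_five, repr(space), letter, from_end, middle)
print(tripled)
print(doubled)
```

Note that the end position of a slice is exclusive, which is why `a[:5]` returns five characters (positions 0 through 4) and not six.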
Now let's go down here and take a look at lists. Lists are really fantastic because they store multiple values. The string was stored as one value with multiple characters, but a list can store multiple separate values.
So let's create our very first list. We'll say list really quickly, and then we'll put a bracket, and a bracket means this is going to be a list. There are other ones, like a squiggly bracket and a parenthesis; these denote different data types. The bracket is what makes a list a list. To keep it super simple we'll say 1, 2, 3, and we'll run this, and now we have a list that has three separate values in it. The comma in our list denotes that they are separate values. And a list is indexed just like a string is indexed, so position zero is this one, position one is the two, and position two is the three.
Now, when we made this list we didn't have to use any quotes, because these are numbers. But if we wanted to create a list with string values, we have to do it with our quotes. So we'll say, in quotes, cookie dough, then we'll do a comma to separate the value, then we'll say strawberry, and then we'll do one more and this will just be chocolate. When we run this we have all three of these values stored in our list.
Now, one of the best things about lists is that you can have any data type within them. They don't just have to be numbers or strings; you can basically put anything you want in there. So let's create a new list and let's say vanilla, and then we'll do 3, and then we'll add a list within a list and we'll say scoops, comma, spoon. Then we'll get out of that list and add another value of True for a Boolean. Now we can hit Shift+Enter, and we just created a list with several different data types within one list.
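As a rough sketch of the three lists built above (the variable names `numbers`, `flavors`, and `mixed` are hypothetical, since the transcript does not name them, and the capitalization of the strings is assumed):

```python
# A list of numbers: no quotes needed.
numbers = [1, 2, 3]

# A list of strings: each value is quoted and separated by a comma.
flavors = ["Cookie Dough", "Strawberry", "Chocolate"]

# A list can mix data types, including another list and a Boolean.
mixed = ["Vanilla", 3, ["Scoops", "Spoon"], True]

# Lists are indexed from zero, just like strings.
print(numbers[0], numbers[1], numbers[2])

# Show the type of every item stored in the mixed list.
item_types = [type(item).__name__ for item in mixed]
print(item_types)
```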
Now let's take this one list right here with all of our different ice cream flavors. We'll say ice_cream is equal to this list. One thing that's really great about lists is that they are changeable. That means we can change the data in here, and we can also add and remove items from the list after we've already created it. So let's take ice_cream and we'll say ice_cream.append, and this is going to append it to the very end of the list. We do an open parenthesis and let's say salted caramel. Now when we run this and call it just like this, it's going to take this list, add salted caramel to the end, and we'll print it off. As you can see, it was added to the list.
And just like I said before, let me go down here, we can also change things in this list. So let's say ice_cream, and then we need to look at the indexed position, so we're going to say zero, and that's going to be this cookie dough right here. We can say that is equal to, so we can now change that value. Let's call that butter pecan. And now when we call it, we can see that the cookie dough was changed to butter pecan.
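A minimal sketch of the two changes just described, appending to the end of a list and overwriting the item at an index. The name `ice_cream` follows the spoken transcript; the string capitalization is assumed.

```python
ice_cream = ["Cookie Dough", "Strawberry", "Chocolate"]

# append() adds one item to the very end of the list, in place.
ice_cream.append("Salted Caramel")
print(ice_cream)

# Assigning to an index replaces the value stored at that position.
ice_cream[0] = "Butter Pecan"
print(ice_cream)
```

Both operations modify the existing list rather than creating a new one, which is what "changeable" (mutable) means here.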
Another thing that you saw just a little bit ago is something called a list within a list, basically a nested list. So we had scoops, spoon, True. Let's give this a name and we'll say nested_list is equal to it. Now when we run this we have this nested list. If we look at the index and we say zero, we'll get vanilla. If we say two, we'll get scoops and spoon. Now, since we have a list within a list, we can also look at the index of that nested list, so let's now say one, and that should give us just spoon. And you can go on and on and on with this: you can do lists within lists within lists, and all of them will have indexing that you can call.
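The nested indexing above can be sketched as follows. The name `nested_list` comes from the spoken transcript; the other names and the string capitalization are assumptions for illustration.

```python
nested_list = ["Vanilla", 3, ["Scoops", "Spoon"], True]

first_item = nested_list[0]      # the first top-level value
inner_list = nested_list[2]      # the whole inner list
inner_item = nested_list[2][1]   # index the inner list a second time

print(first_item)
print(inner_list)
print(inner_item)
```

Each extra pair of brackets steps one level deeper, so the same pattern works for lists nested more than once.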
Now let's go down here and start taking a look at tuples. A list and a tuple are actually quite similar, but the biggest difference between a list and a tuple is that a tuple is something called immutable. It means it cannot be modified or changed after it's created. Let's go right up here, we're going to say tuple, and let's write our very first tuple. We'll say tuple_scoops is equal to, and then we'll do an open parenthesis. Now these open parentheses you've seen if you do something like a print statement, but that's different because that's executing a function. This is actually creating a tuple, which is going to store data for us. So we'll say 1, 2, 3, 2, and 1. Let's go ahead and create that tuple, and we can just check the data type really quickly, and it's a tuple.
Just like we saw before, a tuple is also indexed, so if we go to the very first position, which is a one, we will get the output of one. But we can't do something like append and then add a value like three. If we do that, it's going to say tuple object has no attribute append. That's just because you cannot change or add anything to a tuple, like we were talking about before.
Typically people will use tuples when data is never going to change. An example might be something like a city name, a country, a location, something that won't change. They definitely have their use cases, but I don't think they're as popular as just using a list.
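A short sketch of the tuple behavior described above. The name `tuple_scoops` is reconstructed from the spoken transcript; the try/except wrapper is added here only so the example keeps running after the error.

```python
tuple_scoops = (1, 2, 3, 2, 1)

print(type(tuple_scoops))  # the tuple class
print(tuple_scoops[0])     # tuples are indexed like lists and strings

# Tuples are immutable, so they have no append() method at all.
try:
    tuple_scoops.append(3)
except AttributeError as error:
    message = str(error)
    print(message)
```

The tuple is left unchanged after the failed call, which is the property that makes tuples a reasonable fit for values that should stay fixed.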
So now let's scroll down and start taking a look at sets, but really quickly let me add a few more cells for us, and let's say sets. A set is somewhat similar to a list and a tuple, but it's a little bit different in the fact that it doesn't have any duplicate elements. Another big difference is that the values within a set cannot be accessed using an index, because a set doesn't have an index; it's actually unordered. We can still loop through the items in a set with something like a for loop, but we can't access them using the bracket and an index position.
So let's go ahead and create our very first set. We're going to say daily_pints, then equal to, and to create a set we're going to use these squiggly brackets. I don't know if there's an actual name for those, if I'm being honest. I call them squiggly brackets and that's what we're going to go with. We're going to put in a 1, a 2, and a 3. Let's go ahead and run this and look at the type, and as you can see it is a set. When we print this out it's going to show us 1, 2, and 3, and those are all the values within our set.
But if we copy this and say daily_pints_log, this is going to be every single day, so maybe I had different values. Now when we run this and do the exact same thing, when we print it it's going to have just the unique values within that set.
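A minimal sketch of how a set drops duplicates. The "squiggly brackets" are usually called curly braces. The names `daily_pints` and `daily_pints_log` are reconstructed from the spoken transcript, and the repeated values in the log are assumptions, since the numbers typed in the video are not read aloud at this point.

```python
daily_pints = {1, 2, 3}
print(type(daily_pints))

# Assumed values: repeated entries collapse to one copy each.
daily_pints_log = {1, 2, 3, 1, 2, 4, 5, 6, 31, 3}
print(daily_pints_log)

# A set has no positions, so indexing it raises a TypeError.
try:
    daily_pints[0]
    indexing_failed = False
except TypeError as error:
    indexing_failed = True
    print("Indexing failed:", error)

# Looping still works; sorted() is used only to get a stable order.
for pints in sorted(daily_pints_log):
    print(pints)
```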
Now, a use case for sets, and this is something that I've done in the past, is comparing two separate sets. Maybe you have a list or a tuple and you convert that into a set, and that will narrow it down to its unique values. Then you can compare the unique values of one set to the unique values in another set, and see what's the same and what's different.
So let's go down here and let's say wifes_daily_pints_log. Just copy this right here, we'll say is equal to, let's do our squiggly brackets, and let's do 1, 2, just random numbers. So now this is my daily log and this is my wife's daily log, and we can compare these values. Let's go right down here and say print. We'll do my daily_pints_log, and then we'll do this bar right here [|], and this is going to show us the combined unique values. It's basically like putting them all in one set and then trimming it down to just the unique values. Then we'll take wifes_daily_pints_log. We actually need to run this first. When we run this we should see all the unique values between these two sets, and as you can see: 0, 1, 2, 3, 4, 5, 6, 7, 24, 31. So these are all the unique values between these two sets.
We can also do another one, and instead of this bar we're going to do this symbol right here, which I believe is called an ampersand, don't quote me on that. When we run this it's going to show what matches, meaning which ones show up in both sets. So the only ones that show up in both sets are 1, 2, 3, and 5.
We can also do the opposite of that with a minus sign, and this is going to show us what doesn't match. So we have 4, 6, and 31. Now where is our 24 that was in wifes_daily_pints_log? It's in this one, but we're subtracting the values of this one. So let's reverse this, and we'll say daily_pints_log, and let's run it. Now those are our other values. We're taking the values of this one, subtracting all the ones that are the same, and getting the remaining values.
And then for our last one we can get rid of this and do this symbol right here [^], and this is going to show if a value is in one or the other but not in both. So let's run this. These values are completely unique to each of those sets.
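The four comparisons above correspond to Python's set operators `|` (union), `&` (intersection), `-` (difference), and `^` (symmetric difference). In this sketch the two sets are assumptions: the transcript does not list their contents, so the values were picked to be consistent with the results that are read aloud (the union, the intersection, and the first difference).

```python
# Assumed contents, chosen to match the outputs quoted in the transcript.
daily_pints_log = {1, 2, 3, 4, 5, 6, 31}
wifes_daily_pints_log = {0, 1, 2, 3, 5, 7, 24}

combined = daily_pints_log | wifes_daily_pints_log      # in either set
in_both = daily_pints_log & wifes_daily_pints_log       # in both sets
only_mine = daily_pints_log - wifes_daily_pints_log     # mine, minus shared
only_wifes = wifes_daily_pints_log - daily_pints_log    # reversed order
in_one_only = daily_pints_log ^ wifes_daily_pints_log   # in one, not both

for label, result in [
    ("union", combined),
    ("intersection", in_both),
    ("mine - wife's", only_mine),
    ("wife's - mine", only_wifes),
    ("symmetric difference", in_one_only),
]:
    print(label, sorted(result))
```

The difference operator is the only one of the four where the order of the operands matters, which is why reversing it surfaces the 24.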
Now the very last one that we're going to look at in this video is dictionaries. So let's go right down here, add a few cells, and say dictionaries. I saved dictionaries for last because this one is probably the most different out of all the previous data types that we've looked at. Within a dictionary we have something called a key-value pair. That means when we use a dictionary it's not like a list, where you just have a value, comma, value, comma, value. We have a key that indicates what that value is attributed to.
So let's write out a dictionary to see how this looks. We're going to say dict_cream, and just like a set we use squiggly brackets, but the thing that differentiates it is that in a dictionary we'll have that key-value pair, whereas in a set each value is just separated by a comma. So let's write name, and this is our key, and then we do a colon, and this is where we input our value, so we're going to say Alex Freberg. Then we separate that key-value pair with a comma, and now we can do another key-value pair. We'll say weekly intake and a colon, and we'll say 5 pints of ice cream. Do a comma, and then we'll do favorite ice creams, and now what we're going to do is put in a list. So within this dictionary we can also add a list. We'll do MCC for mint chocolate chip, and then we'll add chocolate, another one of my favorites. So now we have our very first dictionary. Let's copy this and run it, and let's just look at the type, and as you can see it says that this is a dictionary. Let's also print it out.
Now if we want to, we can take our dict_cream and say .values with an open and close parenthesis, and when we execute this we'll see all of the values within this dictionary. So here are our values of Alex Freberg, 5, and mint chocolate chip and chocolate. We can also say keys, and when we run this we get all of the keys: name, weekly intake, and favorite ice creams. And we can also say items, so this key-value pair is one item and this key-value pair is another item.
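A sketch of the dictionary built above and its three views. The variable name `dict_cream` and the exact key spellings are assumptions reconstructed from the spoken transcript.

```python
# Each entry is a key, a colon, and a value; entries are comma-separated.
dict_cream = {
    "name": "Alex Freberg",
    "weekly intake": 5,
    "favorite ice creams": ["MCC", "Chocolate"],  # a list as a value
}

print(type(dict_cream))

values = list(dict_cream.values())  # just the values
keys = list(dict_cream.keys())      # just the keys
items = list(dict_cream.items())    # (key, value) pairs

print(values)
print(keys)
print(items)
```

The `list()` calls are there only to make the results easy to print and compare; the methods themselves return view objects that can be looped over directly.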
Now, one difference between something like a list and a dictionary is how you call the index. You can't call it by doing something like this, where you just do a bracket, oops, and say zero. In theory this would take this very first one, right, our very first key-value pair, but that's going to give us an error. How you call a dictionary is actually by the key, so it doesn't technically have an index, but you can specify what you want to call and take it out. So we're going to say name, and this is going to call that key right here, and when we run this we'll get the value, which is Alex Freberg.
One other thing that you can do is update information in a dictionary, which we can't with some other data types. So for the name it was Alex Freberg; now let's say Christine Freberg. And when we update that I'm also going to print the dictionary. Get rid of this. So it's going to put Christine Freberg in that value of the name. Let's go ahead and run this, and now it changed the name from Alex Freberg to Christine Freberg.
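A minimal sketch of looking a value up by key and then overwriting it. The name `dict_cream` and the key spellings are assumptions reconstructed from the spoken transcript; the try/except is added only so the example continues past the error.

```python
dict_cream = {
    "name": "Alex Freberg",
    "weekly intake": 5,
    "favorite ice creams": ["MCC", "Chocolate"],
}

# A dictionary has no positions: 0 is treated as a key, and it is missing.
try:
    dict_cream[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True
    print("KeyError: there is no key 0")

# Look the value up by its key instead.
name_before = dict_cream["name"]
print(name_before)

# Assigning to an existing key replaces its value.
dict_cream["name"] = "Christine Freberg"
print(dict_cream)
```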
We can also update all of these values at one time. So let's copy this and I'm going to put it right down here. I'm going to say dict_cream.update, then we're going to put a bracket, or not a bracket but parentheses, around these. So now what we're going to do is update this entire thing. Let me take this and say print this dictionary. Now we can update this to anything we want, so in here I'll say weight, and because of all that ice cream I now weigh 300 lb. Let's run this, and as you can see it did not delete our key-value pairs right here; instead it just added to them. When you're using update we can't actually delete; that's the delete statement, and I'll show you that in just a second. All we did was add this new value. It's also going to check and see if you changed anything in your key-value pairs, so we can go in here and change this value, and we'll say 10. Now when we run this, the value of this key-value pair was changed.
But let's say we do want to delete it. We'll say del, which stands for delete, then dict_cream, and now let's specify the key that we want to get rid of, which will also delete the value with it. Let's say weight, and then let's print that again. As you can see, the weight was deleted from that dictionary.
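The update and delete steps above might look like the sketch below. Several details are assumptions: the name `dict_cream`, the key spellings, and which value is changed to 10 (the transcript only says "change this value"; the weekly intake is used here as a plausible guess).

```python
dict_cream = {
    "name": "Christine Freberg",
    "weekly intake": 5,
    "favorite ice creams": ["MCC", "Chocolate"],
}

# update() adds new keys and overwrites existing ones; it never removes.
dict_cream.update({
    "name": "Christine Freberg",
    "weekly intake": 10,       # assumed: the value changed to 10
    "favorite ice creams": ["MCC", "Chocolate"],
    "weight": 300,             # a brand-new key-value pair
})
after_update = dict(dict_cream)  # copy kept for comparison
print(after_update)

# del removes the key and its value together.
del dict_cream["weight"]
print(dict_cream)
```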
So that is all we're going to cover in this data types video. Thank you guys so much for watching, I really appreciate it. If you like this video, be sure to like and subscribe below, and I'll see you in the next video.
[Music]
Hello everybody, today we're going to be taking a look at comparison, logical, and membership operators in Python. Operators are used to perform operations on variables and values. For example, you're often going to want to compare two separate values to see if they are the same or if they're different within Python, and that's where the comparison operators come in. Right here you can see our operators and what they do. So this equal sign, equal sign stands for equal. We have does not equal, greater than, less than, greater than or equal to, and less than or equal to. Honestly, I use these almost every single time I use Python, so these are very important to know and know how to use. Let's get rid of that really quickly and actually start writing it out and see how these comparison operators work in Python.
The very first one that we're going to look at is equal to. Now you can't just say 10 = 10. Let's try running that really quickly by hitting Shift+Enter. It's going to say cannot assign to literal. That's because a single equal sign is for assigning a variable: we're trying to say 10 is equal to 10 and then call that 10 later, but that's not how this actually works. What we're trying to do is determine whether 10 is equal to 10, so we're going to say equal sign, equal sign, and if we run that by hitting Shift+Enter again, it's going to say True. Now if we put something else like 50 in there and try to run this, it's going to say False. So really what you're going to get when you use these comparison operators is either a True or a False.
If we take this right down here, we can also say does not equal, and we're going to use an exclamation point, equal sign. That says 10 is not equal to 50, and that should be True.
You can also compare strings and variables. So let's go right down here and we're going to say vanilla is equal to chocolate, and when we run this it'll say False. Now if they were the same, just like when we did our numbers, it should say True. And we can also compare variables, so we'll say x is equal to vanilla and y is equal to chocolate, and then when we come down here we can say x is equal to y and it'll give us a False, and we say x is not equal to y and it'll give us a True.
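A brief sketch of the equality checks above. The variable names `x` and `y` follow the transcript; the other names are hypothetical. The built-in `compile()` is used only to show, without stopping the example, that a single equal sign between two literals is rejected as a syntax error.

```python
# "10 = 10" is an assignment, and a literal cannot be assigned to.
try:
    compile("10 = 10", "<example>", "exec")
    assignment_rejected = False
except SyntaxError:
    assignment_rejected = True

same_numbers = 10 == 10                      # True
different_numbers = 10 == 50                 # False
not_equal_numbers = 10 != 50                 # True
different_strings = "Vanilla" == "Chocolate" # False

x = "Vanilla"
y = "Chocolate"
variables_equal = x == y       # False
variables_not_equal = x != y   # True

print(assignment_rejected, same_numbers, different_numbers)
print(not_equal_numbers, different_strings)
print(variables_equal, variables_not_equal)
```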
give us a true the next one that we're going to take take a look at is the less
going to take take a look at is the less than so let's copy this one right up
than so let's copy this one right up here let's scroll
here let's scroll down and let's say 10 is less than 50
down and let's say 10 is less than 50 now this will come out as true now let's
now this will come out as true now let's say we put a 10 in here before 10 was of
say we put a 10 in here before 10 was of course less than 50 but is 10 less than
course less than 50 but is 10 less than 10 no that's false because they are the
10 no that's false because they are the same so if we want an output that is
same so if we want an output that is true all we would have to add is an
true all we would have to add is an equal sign right here and this would say
equal sign right here and this would say 10 is less than or it is equal to 10 and
10 is less than or it is equal to 10 and now it's true of course we can say the
now it's true of course we can say the exact same thing by saying greater than
exact same thing by saying greater than so 10 is equal or greater than 10
so 10 is equal or greater than 10 that'll be true because 10 is equal to
that'll be true because 10 is equal to 10 we can also say 50 is greater or
10 we can also say 50 is greater or equal to 10 because 50 is obviously
equal to 10 because 50 is obviously greater than 10 now let's look at
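A minimal sketch of the comparison checks walked through above. The variable names on the left are only labels added here so each result can be inspected; the comparisons themselves are the ones from the walkthrough.

```python
# Every comparison operator evaluates to a bool: True or False.
equal_numbers = 10 == 50                   # False, the values differ
not_equal_numbers = 10 != 50               # True
equal_strings = "vanilla" == "chocolate"   # False

# Variables are compared by the values they hold.
x = "vanilla"
y = "chocolate"
same_flavor = x == y                       # False
different_flavor = x != y                  # True

# A strict comparison fails on equal values; adding "=" allows equality.
strictly_less = 10 < 10                    # False
less_or_equal = 10 <= 10                   # True
greater_or_equal = 50 >= 10                # True

print(equal_numbers, not_equal_numbers, equal_strings)
print(same_flavor, different_flavor)
print(strictly_less, less_or_equal, greater_or_equal)
```

In a notebook each expression can also be run on its own line in a cell, which shows the resulting True or False directly.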
now let's look at
logical operators that are often
combined with comparison operators so
our operators are and or and not so if
you have an and that returns true if
both statements are true if it's or at least
one of the statements has to be true and
the not basically reverses the result so
if it was going to return true it would
return false I don't use this not one a
lot but I will show you how it works so
let's actually test that out so before
we were saying 10 is greater than 50 and
of course this returned false so now
let's add parentheses around this 10
is greater than 50 and we're going to
say and we'll do an open parenthesis 50
is greater than 10 now this statement
right here is true 50 is greater than 10
so we have a true statement and a false
statement but this and is going to look
at both of them and it's going to say
they both need to be true in order to
return a true so let's try running this
and we still have a false if we want it
to return true we're going to have to
change this to make it a true statement
so 70 is greater than 50 and 50 is
greater than 10 when we run this it
should return true now let's look at the
or so let's copy this and we'll say 10
is greater than 50 or 50 is greater than
10 now this is a false statement and
this is a true statement so if even one
of them is a true statement the output
should be true
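A short sketch of the three logical operators defined above, using the same numbers as the walkthrough. The names on the left are labels added here for illustration only.

```python
# "and" is True only when both sides are True.
one_side_false = (10 > 50) and (50 > 10)   # False
both_sides_true = (70 > 50) and (50 > 10)  # True

# "or" is True when at least one side is True.
either_side = (10 > 50) or (50 > 10)       # True

# "not" reverses whatever result follows it.
reversed_result = not (50 > 10)            # False

print(one_side_false, both_sides_true, either_side, reversed_result)
```

The parentheses are not required by Python here, but they make it easier to see which comparison belongs to which side.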
and again we can do this
even with strings so we can do
vanilla and
chocolate there we go and vanilla is
actually greater than chocolate because
V is a higher number in alphabetical
order so V is like 20 something whereas
the C in chocolate is three right so it actually
looks at the spelling for this so if we
say or here it will come out true and if
we say and here it should also be true
because V is greater than C and 50 is
greater than 10 so this should also be
true
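A small sketch of why the string comparison above comes out true. Python compares strings character by character using each character's Unicode code point, which matches alphabetical order as long as the letters share the same case. The capitalized example at the end is an addition here, not something from the walkthrough.

```python
# ord() returns the code point Python uses when ordering characters.
first_letters = (ord("v"), ord("c"))               # (118, 99)

vanilla_is_greater = "vanilla" > "chocolate"       # True, "v" sorts after "c"
with_or = ("vanilla" > "chocolate") or (50 > 10)   # True
with_and = ("vanilla" > "chocolate") and (50 > 10) # True

# Caveat: uppercase letters have lower code points than lowercase ones,
# so mixed case does not follow plain alphabetical order.
mixed_case = "Vanilla" > "chocolate"               # False, "V" is 86 and "c" is 99

print(first_letters, vanilla_is_greater, with_or, with_and, mixed_case)
```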
now let's copy this right here and
we're going to say not so what we had
before is 50 is greater than 10 that
returned true but now all we're doing is
putting not in front of it so instead of
returning true it's going to return
false so now let's take a look at
membership operators and we use this to
check if something whether it's a value
or a string or something like that is
within another value or string or
sequence our operators are in and not in
so it's pretty simple in is
going to return true if a sequence
with the specified value is present in the
object just like we were talking about
and not in is basically the exact
same thing but it's true if it's not in that object so
let's start out by taking a look at a
string we're going to say ice_cream is
equal to I love chocolate ice cream
and then we're going to say love
in ice_cream and that will return
true so all we're doing is searching if
the word love or that string is in this
larger string we could also just do that
by literally copying this and putting
this where this is so we can check is
this string part of this string and
it'll say true we can also make a list
so we'll say Scoops is equal to and then
we'll do a bracket and we'll say 1 2 3 4
5 and then we'll say 2 in Scoops
so all we're doing is searching to see
if 2 is within this list and that
should return true now if we put a 6
here and we said not in it will also
return true because 6 is not in Scoops
and that is true and just like we did we
could also say wanted underscore Scoops
and we'll say 8 so I wanted 8
Scoops so we can say wanted Scoops not in
Scoops and this should return true
because there's not an 8 within the
Scoops list and if we said in
and we said we wanted 8 is that
within our list that we created
that's going to return a false
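A minimal sketch of the membership checks above. The variable names `ice_cream`, `scoops` and `wanted_scoops` follow the spoken names; the labels holding each result are added here for illustration.

```python
# "in" on strings looks for a substring.
ice_cream = "I love chocolate ice cream"
love_found = "love" in ice_cream                   # True

# "in" on a list looks for an element equal to the value.
scoops = [1, 2, 3, 4, 5]
two_found = 2 in scoops                            # True
six_missing = 6 not in scoops                      # True

# The value being searched for can come from a variable.
wanted_scoops = 8
wanted_missing = wanted_scoops not in scoops       # True, 8 is not in the list
wanted_found = wanted_scoops in scoops             # False

print(love_found, two_found, six_missing, wanted_missing, wanted_found)
```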
so that is a quick breakdown of comparison
logical and membership operators I hope
that this was helpful thank you guys so
much for watching if you like this video
be sure to like and subscribe and I will
see you in the next video
[Music]
hello everybody today we're going to be
taking a look at the if statement within
python now it's actually the if elif else
statement but that's a mouthful so I'm
just going to call it the if else
statement now we have this flowchart and
I apologize for it being blurry but this is
the absolute best one that I could find
right up top we have our if condition
now if this if condition is true we're
going to run a body of code but if that
condition is false we're going to go
over here and go to the elif condition the
elif condition or statement is basically
saying if the first if statement doesn't
work let's try this if statement if this
elif statement is true it goes to this
body of code if it's false it'll come
over here to the else and the else is
basically if all these things don't work
then run this body of code now you can
have as many elif statements as you
want but you can only have one if
statement and one else statement so
let's write out some code and see how
this actually looks let's first start
off by writing if that is our if
statement and now we have to write our
condition which is about to be either
met or not met so we'll say if 25 is
greater than 10 which is true we'll say
colon and then we're going to hit enter
and it's going to automatically indent
that line of code for us and this is our
body of code so if 25 is greater than 10
our body of code will execute so for us
we're just going to write print and
we'll say it worked now if we run this
it's going to check is 25 greater than
10 if that is true print this so
let's hit shift enter and it worked now
let's take this exact code we'll paste
it right down here and we'll say is less
than and right now this if statement is
not true so it's not actually going to
work as you can see there's no output
there's nothing that happened really but
it did check to see if 25 was less than
10 but it just wasn't true now we can
use our else statement so we're going to
come right down here and we're going to
say else and we'll do a colon and we'll
hit enter again automatically indenting
and we're going to say print and we're
going to say it did not work dot dot dot
so what it's going to do is it's going
to come up here and check is 25 less
than 10 no it's not so this body of code
is not going to be executed it's going
to go right down to this else statement
now this else statement is going to be
executed there's no condition on this so
the if statement has a condition 25 is
less than 10 this has no condition so if
this doesn't work if this is false it's
going to come down here and it will run
this body of code let's run this by
pressing shift enter and as you can see
our output is it did not work now let's
go back up here and put greater than
because this is now true it's going to
say if 25 is greater than 10 print it
worked and then it's going to stop it's
not going to go to this else statement at
all so let's run this and our output is
it worked so what if we have a lot of
different conditions that we want to try
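A minimal sketch of the if and else walkthrough above. Storing the text in a variable called `message` before printing is a choice made here so the outcome is easy to check; in the walkthrough the print calls sit directly inside each body.

```python
# The condition is False, so the if body is skipped and the else body runs.
if 25 < 10:
    message = "it worked"
else:
    message = "it did not work..."
print(message)

# With the comparison flipped the if body runs and the else is never reached.
if 25 > 10:
    flipped_message = "it worked"
else:
    flipped_message = "it did not work..."
print(flipped_message)
```

The indentation after each colon is what marks a line as part of that body, which is why the notebook indents automatically after pressing enter.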
let's come right down here this is where
the elif comes in so really quickly
let's change this to a not true a false
statement we're going to go down and say
elif and we're going to say if 25 is less than
let's say
30 we'll say elif
worked so now it's going to check is 25
less than 10 no it's not let's look at
the next condition is 25 less than 30
and if it is we'll print elif worked so
let's try running this and elif worked
now we can do as many of these elif
statements as we want let's
just try a few of them right here so
we'll say 25 is less than 20 25 is less
than
21 and let's do 40 and let's do 50 so
we'll say elif elif 2 elif 3 and elif 4 now if you
look at this the first one that is
actually going to work is this 25 less than 40
right here once this one is checked and
it comes out as true none of the other
elif or else statements will run so let's
try this one it should be
elif 3 and this one ran properly
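A sketch of the elif chain described above, assuming the four conditions are 25 compared against 20, 21, 40 and 50 as spoken. The variable `number`, the `result` variable and the exact message strings are assumptions made here for illustration.

```python
number = 25

# Conditions are checked from top to bottom and only the first True branch runs.
if number < 10:
    result = "it worked"
elif number < 20:
    result = "elif worked"
elif number < 21:
    result = "elif 2 worked"
elif number < 40:
    result = "elif 3 worked"      # first condition that is True for 25
elif number < 50:
    result = "elif 4 worked"      # also True, but never reached
else:
    result = "it did not work..."

print(result)
```

Because evaluation stops at the first match, the order of the elif branches matters: putting the `< 50` check before the `< 40` check would change which message is chosen.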
now within
our condition so far we've only used a
comparison operator we can also use a
logical operator like and or or so we
can say if 25 is less than 10 which it's
not and let's use or so we'll
say or 1 is less than 3 which is
true if we run this now it will actually
work so we can use several different
types of operators within our if
statement to see if a condition is true
or not or several conditions are true
there's also a way to write an if else
statement in one line if you want to do
that so we can write print we'll say it
worked and then we'll come over here and
say if 10 is greater than 30 and then
we'll write else print and we'll say it
did not work just like we had before
except now it's all occurring on one
line so let's just try this and see if
it works so it's saying print it worked
if 10 is greater than 30 which it wasn't
so it went to the else statement and then
it printed out our body right here
although we didn't have any indentation
or multiple lines it was all done in one
line
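A minimal sketch of the one-line form described above. The first statement mirrors the spoken version; the second, which picks a value and stores it in a variable named `message`, is a common variant added here and is not from the walkthrough.

```python
# One-line if else as described: the expression before "if" runs when the
# condition is True, otherwise the expression after "else" runs.
print("it worked") if 10 > 30 else print("it did not work...")

# The same construct is often used to choose a value rather than an action.
message = "it worked" if 10 > 30 else "it did not work..."
print(message)
```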
now there's one other thing that we
haven't looked at yet and I'm going
to show it to you really quickly and
that's a nested if statement so when we
run this it's going to say it worked it
works because it says 25 is less than 10
or 1 is less than 3 since this is
true it's going to print out it worked
but we can also do a nested if statement
so we can do multiple if statements as
well so we're going to hit enter and
we'll say if and we'll do a true
statement here so we'll say if 10 is
greater than 5 let's do a colon hit
enter then we'll say print and then
we'll type a string saying this nested
if
statement
worked now let's try this out and
see what we get so it went through the
first if statement it said it was true
and it prints out it worked this is
still the body of code so it goes down
to this next if statement and it says if
10 is greater than 5 we're going to
print this out and you could do this on
and on and on it can basically go on
forever and you can create really
in-depth logic and that actually happens
a lot when you start writing more
advanced code
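A sketch of the nested if statement described above, combined with the `or` condition used in the outer check. Collecting the text in a list named `messages` is an assumption made here so the order of execution can be inspected; the walkthrough prints directly.

```python
messages = []

# The outer condition is True because one side of the "or" is True.
if (25 < 10) or (1 < 3):
    messages.append("it worked")
    # This inner if is part of the outer body, so it is only checked
    # when the outer condition has already passed.
    if 10 > 5:
        messages.append("this nested if statement worked")

print("\n".join(messages))
```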
so I hope that this was
helpful I hope that you understand the
if else statement better I hope that you
understand how nested if statements work
as well thank you guys so much for
watching if you like this video be sure
to like and subscribe below and I'll see
you in the next video
[Music]
hello everybody today we're going
to be learning about for Loops in Python
the for Loop is used to iterate over a
sequence which could be a list a tuple
an array a string or even a dictionary
here's the list that we'll be working
with throughout this video and I have
this little diagram right here which
kind of explains how a for Loop works
the for Loop is going to start by
looking at the very first item in our
sequence or our list and that's going to
be our one right here it's going to ask
is this the last element in our list and
it is not so it's going to go down to
this body of the for Loop now we can
have a thousand different things that
can happen in the body of the for loop
as we're about to look at in just a
second then it's going to go up to the
next element and ask is this the last
element reached so it'll be no again
because we'll be going to the two and
then the three and then the four and the
five once it reaches the five it'll go
to the body of the for Loop and then when
it asks if that's the last element the
answer would be yes because it's
iterated through all the items within
the list and then we would exit the loop
and the for Loop would be over now that
may not have made perfect sense but
let's actually start writing out the
syntax of a for Loop so we can
understand this better to start our for
loop we're going to say for and
then we're going to give it a temporary
variable for this for Loop so it's a
variable as it iterates through these
numbers it's going to assign the
variable to that number so for this one
we're just going to say number because
it's pretty appropriate because these
are all numbers and then we're going to
say in integers now right here you can
put just about anything this could be
the list this could be a tuple this
could be a string even but that is what
we're going to iterate through so we're
saying for the variable each of these
numbers within this list of integers and
then we're going to write a colon and below that
is the body of code that's going to
actually be executed when we run through
and iterate through our list so for our
first example we're going to start off
super simple and all we're going to do
is say print open parentheses and say
number as it iterates through the 1 2 3
4 and 5 number becomes our variable
that is going to be printed so during
that first loop our 1 will be printed
because that will be assigned right here
then through the next iteration the 2
will be assigned and it'll be put right
here in each Loop until the very end so
let's hit shift
enter and as you can see it did exactly
that
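A minimal sketch of the first for loop described above, using the list name `integers` and the loop variable `number` as spoken. The `seen` list is an addition here that records each value so the iteration order can be checked.

```python
integers = [1, 2, 3, 4, 5]
seen = []

# On each pass the next element of the list is assigned to "number",
# then the indented body runs once with that value.
for number in integers:
    print(number)
    seen.append(number)
```

The loop ends on its own after the last element, which is the exit step in the diagram described earlier.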
now in this body and I'll copy and
paste this down here in this body we
really can do just about anything we
want we don't even have to use this
variable number right here we can just
print yep if we wanted to and what it's
going to do is for each iteration all
five of those every time it Loops
through it's going to print off yep so
let's hit shift enter and it printed it
off for us so really we weren't even
using the numbers within the list we
were really just using it as almost a
counter now let's copy this integers
once again let's go right up here and
let's go copy this for Loop that we
wrote now we do not have to call this
number this can be anything you want any
variable name that you'd like to name it
we could call it
jelly and we can
do jelly plus
jelly I think you're getting the picture
right when it Loops through that one
it's doing 1 plus 1 when it Loops
through the 2 it's doing 2 plus 2
that is basically how a for Loop works
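A short sketch of the two variations described above: a body that ignores the loop variable, and a loop variable with an arbitrary name. The `yep_count` counter and the `doubled` list are additions here so the results can be checked.

```python
integers = [1, 2, 3, 4, 5]

# The body does not have to use the loop variable; here the list only
# controls how many times the body runs.
yep_count = 0
for number in integers:
    print("yep")
    yep_count += 1

# The loop variable can have any valid name.
doubled = []
for jelly in integers:
    print(jelly + jelly)
    doubled.append(jelly + jelly)
```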
now for a dictionary it's going to
handle it a little bit differently so
let's create a dictionary really quickly
so we'll say ice
cream dictionary is equal to we're going
to do squiggly brackets so we're going
to say name and we're going to say colon
we need to assign our value for that
item so we're going to say Alex freeberg
we'll do our next one separated by a
comma and we'll say weekly intake and
I'll say five Scoops per week the next
one we will do is favorite ice creams
and for this one we're going to do
something a little bit different for
this we're going to have a list within
this dictionary so we'll say within our
list of my favorite ice creams we'll say
mint chocolate chip and I'll just do MCC
for that and we'll separate that out by
a comma and we'll say chocolate so now
we have this dictionary ice cream dict
and within it we have my name my weekly
intake and my favorite ice creams with a
list in there as well let's hit shift
enter and now we're going to start
writing our for Loop now the for Loop is
going to look very similar but to call a
dictionary it's just a little bit
different so we're going to say for the
cream in ice underscore
cream underscore
dictionary dot values and then we're going
to do parentheses and then a colon now
we're going to print the cream so in
order to indicate what we actually want
to pull we have to specify within the
dictionary what we want are we pulling
the item are we pulling the value we
need to specify this so that's why we
have this dot values right here so let's
run this and see what we get so as you
can see we are pulling in the values
right here that's why we're pulling in
Alex freeberg 5 and mint chocolate chip
slash chocolate
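A sketch of the dictionary and the values loop described above. The variable name `ice_cream_dict`, the exact key strings and the spelling of the name are assumptions reconstructed from the spoken description; `values_seen` is added here only to make the result checkable.

```python
# A dictionary with three keys; the last value is itself a list.
ice_cream_dict = {
    "name": "Alex Freeberg",
    "weekly intake": 5,
    "favorite ice creams": ["MCC", "Chocolate"],
}

# .values() yields only the values, in insertion order.
values_seen = []
for cream in ice_cream_dict.values():
    print(cream)
    values_seen.append(cream)
```

Looping over the dictionary directly, without `.values()`, would yield the keys instead.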
SL chocolate now we are able to call both of those both the key and the value
So let's go right down here and do both the key and the value, so we can pull
two things at one time. We're going to do this by saying .items(). We could
also use .keys() if we just wanted the keys, but we want .items(), so we're
going to get both of them. We're going to go right down here and say for key,
value in ice_cream_dictionary.items(), then print, and let's write key, then a
comma, then let's give it a little arrow or something like that, then another
comma, and we'll say value. Let's print this off and see what we get. It's
looping through, and for each key and value it's showing the key, so that's
name, then weekly intake, then favorite ice creams. It's giving us a little
arrow, and then we're also printing off the value. So we have name, Alex
Freberg; weekly intake, 5; favorite ice creams, mint chocolate chip and
chocolate.
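The key-and-value loop can be sketched like this. As before, the dictionary contents and the "->" arrow string are assumptions based on the narration, not a copy of the original notebook.

```python
# Assumed reconstruction of the dictionary from the narration.
ice_cream_dictionary = {
    "name": "Alex Freberg",
    "weekly intake": 5,
    "favorite ice creams": ["Mint Chocolate Chip", "Chocolate"],
}

# .items() yields (key, value) pairs, which the for statement
# unpacks into the two loop variables.
for key, value in ice_cream_dictionary.items():
    print(key, "->", value)

# .keys() is the method to use when only the keys are wanted.
for key in ice_cream_dictionary.keys():
    print(key)
```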
So now let's talk about nested for loops. We've looked at for loops, and we
understand how they work and why they do what they do, but what about a nested
for loop, a for loop within a for loop? For this example, let's create two
separate lists. Let's create flavors, and let's make that a list by using
square brackets. We'll do vanilla, the classic, then chocolate, and then cookie
dough, all great flavors. That's our first list. Then we're going to say
toppings, and we'll use brackets for that as well, and we'll say hot fudge,
then Oreos, and then marshmallows. Is that how you spell marshmallows? I think
it's an e. That looks wrong; I might be spelling it wrong, but that's okay.
Let's save this by hitting Shift+Enter, and now we have our flavors and our
toppings.
So now let's write our first for loop. We're going to say for one, as in our
number one for loop, in flavors, and we'll do a colon and hit Enter. Now we can
write our second for loop, so we're going to say for two in toppings, then a
colon and Enter. Then we're going to say print, open parenthesis, and then one,
so we're printing the one in flavors, then a comma, then the string 'topped
with', then a comma, and two. What this is essentially going to do is take the
very first one in flavors and then loop through all of two as well, so we're
going to loop through hot fudge, Oreos, and marshmallows. Once we print that
off, we loop all the way back to flavors and look at the next iteration, the
next item in the sequence of the first for loop.
So let's run this really quickly and see what we get. As you can see, it goes
vanilla, vanilla, vanilla: vanilla is topped with the hot fudge, the Oreos, and
the marshmallows. Then we start iterating through the second item in our first
for loop. So there's that hierarchy: we iterate completely through the inner
loop before we go back to the very first for loop and move on to its next item.
That is essentially how a nested for loop works.
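The nested loop walked through above can be sketched as follows. The loop variable names one and two come from the narration; the exact capitalization of the list entries is an assumption.

```python
flavors = ["Vanilla", "Chocolate", "Cookie Dough"]
toppings = ["Hot Fudge", "Oreos", "Marshmallows"]

# The inner loop runs to completion for every single item
# of the outer loop, so this prints 3 x 3 = 9 lines.
for one in flavors:
    for two in toppings:
        print(one, "topped with", two)
```

The first three lines of output all start with Vanilla, because the outer loop only advances once the inner loop has gone through every topping.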
These nested for loops can get very complicated. In fact, for loops in general
can get very complicated the more you add to them and the more you want to do
with them, but that is basically how a for loop and a nested for loop work.
Thank you guys so much for watching. Be sure to like and subscribe below, and
I'll see you in the next video.
[Music]
Hello everybody, today we're going to be taking a look at while loops in
Python. The while loop in Python is used to repeat a block of code as long as
the test condition is true. The difference between a for loop and a while loop
is that a for loop iterates over the entire sequence regardless of a condition,
but the while loop only keeps iterating as long as a specific condition is met.
Once that condition is not met, the loop stops and it's not going to iterate
any further. If we take a look at this flowchart right here, we enter the while
loop and we have a test condition. The first time that this test condition
comes back false, it's going to exit the while loop. So let's start actually
writing out the code and see how this while loop works.
So let's create a variable. We're just going to say number is equal to one, and
then we'll say while, and now we need to write the condition that has to be met
in order for the block of code beneath it to run. We're going to say while
number is less than five, then a colon and Enter, and now this is our block of
code. We're going to say print and then number. Now what we need to do is
basically create a counter, so we're going to say number equals number + 1. If
you've never done something like this, it's kind of like a counter. Most people
start it at zero; in fact, let's start it at zero. Each time it runs through
this while loop, it's going to add one to this number up here, so it becomes a
one, a two, a three each time it iterates. Once this number is no longer less
than five, it'll exit the while loop and it will no longer run.
so let's run this really quick by hitting shift enter so it starts at zero
hitting shift enter so it starts at zero and it's going to say while the number
and it's going to say while the number is less than five print number so the
is less than five print number so the first time that it runs through it is
first time that it runs through it is zero and so it prints zero and then it
zero and so it prints zero and then it adds one two number and then it
adds one two number and then it continues that y Loop right here and it
continues that y Loop right here and it keeps looping through this portion it
keeps looping through this portion it never goes back up here to this line of
never goes back up here to this line of code this is just our variable that we
code this is just our variable that we start with and then once this condition
start with and then once this condition is no longer met once it is is false
is no longer met once it is is false then it's going to break out of that
then it's going to break out of that code now that we basically know how a y
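A minimal sketch of the counter loop described above, starting at zero as in the narration:

```python
number = 0  # set once, before the loop; this line is never re-run

# The condition is checked before every pass through the body.
while number < 5:
    print(number)
    number = number + 1  # the counter; without it the loop never ends

# Prints 0, 1, 2, 3, 4. The loop exits once number reaches 5.
```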
Now that we basically know how a while loop works, let's look at something
called a break statement. Let's copy this right down here, and what we're going
to say is if number is equal to three, break. With the break statement we can
stop the loop even if the while condition is still true. While this number is
less than five it's going to continue to loop through, but now we have this
break statement, so it says if the number equals three, break out of this while
loop; if that is false, we continue adding to the number just like normal.
Let's execute this. As you can see, it only went to three instead of four like
before, because each time it ran through the while loop it checked whether the
number was equal to three. Once it got to three, that became true, and we broke
out of the while loop.
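The break version can be sketched like this, assuming the if check sits between the print and the counter as the narration suggests:

```python
number = 0

while number < 5:
    print(number)
    if number == 3:
        break  # leave the loop immediately, even though 3 < 5 is still true
    number = number + 1

# Prints 0, 1, 2, 3. The counter line is skipped on the last pass,
# so number is still 3 after the loop.
```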
The next thing that I want to look at, and we'll copy this right down here, is
an else statement, much like the one you use with an if statement. We can use
the else statement with a while loop: the loop runs its block of code, and when
the condition is no longer true, the else statement is activated. So we'll go
right down here and say else, then a colon and Enter, and then we'll say print,
and we'll say no longer less than five. Because this if statement is still in
there, it will break, so let's change the three to six, and then we'll run
this. It iterates through this block of code, and once the condition is no
longer true, once we drop out of the loop, we go to our else statement. As long
as the condition is true, it continues to iterate, but once the condition is
not met, it goes to our else statement and runs that line of code. Now, the
else statement only triggers if the while condition is no longer true. If we
have something like this if statement that causes it to break out of the while
loop, the else statement will not run. So let's say if the number is three, and
we run this: the else statement no longer triggers, so this body of code is not
run.
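Both cases from this part can be put side by side. This is a sketch under the assumption that the message string reads as narrated; the two loops are otherwise identical except for the number in the if check.

```python
# First run: the if checks for 6, which number never reaches, so the
# loop ends because the while condition turns false and else runs.
number = 0
while number < 5:
    print(number)
    if number == 6:
        break
    number = number + 1
else:
    print("No longer less than 5")

# Second run: break fires when number is 3, so the else block is skipped.
number = 0
while number < 5:
    print(number)
    if number == 3:
        break
    number = number + 1
else:
    print("No longer less than 5")
```

The else block belongs to the while statement, not to the if, which is why it lines up with while in the indentation.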
Now the next thing that I want to look at is the continue statement. If the
continue statement is triggered, it skips all remaining statements in the
current iteration of the loop and goes to the next iteration. To demonstrate
this, I'm going to change this break into a continue. Before, when we had the
break, if the number was equal to three it would stop the loop completely. When
we change this to continue, which we'll do right now, it no longer runs any of
the subsequent code in this block; it just goes straight back up to the
beginning and re-checks the while condition. So what's going to happen when we
run this is that number becomes three, it continues back to the top of the
while loop, but the number never gets one added to it, so this basically
creates an infinite loop. Let's try this really quickly. As you can see, it's
going to stay three forever. It won't stop on its own, so I'm just going to
stop the code really quick.
So if we just change up the order in which we're doing things, we're going to
take this from there and put it down here. Now, instead of printing the number
immediately and adding to the number later, we add to the number right away,
then we say if it is three, continue, and otherwise it prints the number. Let's
try executing this and see what happens. As you can see, we no longer have the
three in our output. When we got to the number three, it continued and didn't
execute this line right here, which prints off that number.
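Here is a sketch of the reordered loop. The printed list is not part of the narration; it is a hypothetical addition so the result is easy to inspect afterwards, and the else block from the earlier cell is left out to keep the focus on continue.

```python
number = 0
printed = []  # hypothetical helper, only here to record what got printed

while number < 5:
    number = number + 1  # increment first, so continue can't stall the loop
    if number == 3:
        continue  # skip the rest of this pass; go back to the while check
    print(number)
    printed.append(number)

# Prints 1, 2, 4, 5. With the increment placed after the continue
# instead, number would stay at 3 and the loop would never end.
```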
So that really is the basics of the while loop. I hope that this was helpful,
and I hope that you learned something in this video. If you did, be sure to
like and subscribe below, and I'll see you in the next video.
[Music]
Hello everybody, today we're going to be taking a look at functions in Python.
A function is a block of code which only runs when you call it. Right here
we're defining our function, and this is the body of code that is going to be
run when we actually call it. Right here we have our function call, and all
we're doing is writing the function name with the parentheses; that is
basically us calling that function. And then we have our output. Throughout
this video I'm going to show you how to write a function as well as pass
arguments to that function, and then a few other things like arbitrary
arguments, keyword arguments, and arbitrary keyword arguments. All of these
things are really important to know when you are using functions. So let's get
started by writing our very first function together.
We're going to start off by saying def; that is the keyword for defining a
function. Then we can name our function, and for this one we're just going to
do first_function. Then we do open and close parentheses and put a colon. We'll
hit Enter and it'll automatically indent for us, and this is where our body of
code is going to go. Within our body of code we can write just about anything,
and in this video I'm not going to get super advanced; we're just going to walk
through the basics to make sure that you understand how to use functions. So
for right now all we're going to say is print, open parenthesis, an apostrophe,
and we'll say we did it. Now we're going to hit Shift+Enter, and this is not
going to do anything, or at least you won't see any output from it. If we want
to see the output, or we actually want to run that function (and some functions
don't have outputs), what we have to do is just copy this and put it right down
here, and now we're actually going to call our function. Let's go ahead and hit
Shift+Enter, and now we've successfully called our first function.
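A minimal sketch of the define-then-call steps above. The exact message string is an assumption based on the narration.

```python
# Defining the function runs nothing yet; it only stores the body
# under the name first_function.
def first_function():
    print("We did it")

# The body runs only when the function is called, name plus parentheses.
first_function()
```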
This function is about as simple as it could possibly be, but now let's take it
up a notch and start looking at arguments. Let's go right down here and say def
number_squared, and we'll do parentheses and our colon as well. Really quickly:
when you're naming your function, it's kind of like naming a variable. You can
use something like x or y, but I tend to like to be a little bit more
descriptive. Now let's take a look at passing an argument into a function. The
argument is going to be passed right here in the parentheses, and for us I'm
just going to call it number. Then we hit Enter and write our body of code. All
we're going to do for this is type print, open parenthesis, number, then two
stars (at least that's what I call them) and a two. What this does is take the
number that we pass into our function, put it right here in our body of code,
and, for what we're doing, raise it to the power of two. So when the user, or
you, call this function, this number is something that you can specify. It's an
argument that you can input that will then be used in this body of code.
So let's copy this right here and put it right down here into this next cell,
and we'll say five. This five is going to be passed into this function and used
right here in this print statement. Let's run it, and it should come out as, I
believe, 25. That is my fault: I forgot to actually run this block of code, so
I'm going to hit Shift+Enter. Now we've defined our function up here, and now
we can actually call it. We'll hit Shift+Enter, and we got our output of 25.
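The single-argument function can be sketched as follows, using the names from the narration:

```python
def number_squared(number):
    # ** is the exponent operator, so this raises number to the power of 2
    print(number ** 2)

number_squared(5)  # prints 25
```

In a notebook, the cell with the def has to be run before the call, otherwise Python raises a NameError because the name number_squared does not exist yet.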
Now, in this function we only passed one argument, but you can basically pass
as many arguments as you want; you just have to separate them with commas.
Let's copy this and put it right down here. Now we'll say
number_squared_custom, and then we'll do number and then power. So now we can
specify our number as well as the power that we want to raise it to. Instead of
having two, which is what you'd call hardcoded, we can now customize that, and
we'll have power. Now when we call this function we can specify the number and
the power, and both of those will go into this body of code and be used, so we
can customize those numbers. Let's copy this, and we'll say 5 to the power of
3. Let's make sure I ran this, so let's do Shift+Enter, and now we will call
our function. Let's hit Shift+Enter, and we got 5 to the power of 3, which is
125. Just one last thing to mention: if you have two arguments within your
function and you are calling it right here, you have to pass in two arguments;
you can't just have one. If we have just a five right here, it's going to error
out. We have to specify both arguments for it to work.
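A sketch of the two-argument version, including the error case mentioned at the end. The try/except wrapper is an addition for illustration so the example runs to completion; in the video the call would simply fail.

```python
def number_squared_custom(number, power):
    # Both values are supplied by the caller; nothing is hardcoded.
    print(number ** power)

number_squared_custom(5, 3)  # prints 125

# Leaving out the second argument raises a TypeError.
try:
    number_squared_custom(5)
except TypeError as error:
    print("TypeError:", error)
```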
Now let's take a look at arbitrary arguments. Arbitrary arguments are really
interesting because if you don't know how many arguments you want to pass
through, if you don't know if it's one, two, or three, you can specify that
later when you're calling the function, so you don't have to know that
information up front. Let's define our function: we're going to say def, then
number_args, and we'll do parentheses and a colon. Within the parentheses right
here, typically we would just specify what our argument will be: it will be
number, or it will be a word, right? But what we're going to do is something
called an arbitrary argument, so the count is unknown. We're going to put a
star and then we'll say args.
exactly like this typically if you're looking at tutorials that'll have star
looking at tutorials that'll have star args in there or if you're looking at
args in there or if you're looking at just a generic piece of code this is
just a generic piece of code this is what it will look like but for us we're
what it will look like but for us we're going to actually put number so again we
going to actually put number so again we have the star and then we have our
have the star and then we have our arbitrary argument right here and then
arbitrary argument right here and then we'll hit enter and we're going to say
we'll hit enter and we're going to say print open parentheses and this is where
print open parentheses and this is where it's going to get a little bit different
it's going to get a little bit different so we're going to say number and then
so we're going to say number and then we're going to do an open bracket and
we're going to do an open bracket and let's say zero and then we'll do that
let's say zero and then we'll do that times and then we'll say number again
times and then we'll say number again with a bracket of one so in a little bit
with a bracket of one so in a little bit once we run this and then we call this
once we run this and then we call this number args function right here we're
number args function right here we're going to need to specify the number zero
going to need to specify the number zero and the number one that's going to be
and the number one that's going to be called so let's go ahead and run this
called so let's go ahead and run this and then we are going to call it and
and then we are going to call it and let's say 5 comma 6 comma 1 2 8 so right
let's say 5 comma 6 comma 1 2 8 so right up here we did not know how many
up here we did not know how many arguments we were going to pass through
arguments we were going to pass through it could be five it could be a thousand
it could be five it could be a thousand we could also call in a tuple and that's
we could also call in a tuple and that's what this is right here we're calling in
what this is right here we're calling in a tupal so what it's going to do now is
a tupal so what it's going to do now is when it calls this number it's going to
when it calls this number it's going to call the very first within that tupal
call the very first within that tupal which will be that five and then it'll
which will be that five and then it'll also call in this number which will be
also call in this number which will be the first position which is the six so
the first position which is the six so let's hit shift enter and it's going to
let's hit shift enter and it's going to multiply these numbers together so 5 * 6
multiply these numbers together so 5 * 6 is equal to 30 now like I just said this
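Here is a minimal sketch of the arbitrary-argument function described above. The name number_args and the values 5, 6, 1, 2, 8 come from the video; returning the product (instead of printing inside the function) is an assumption made here so the result can be reused.

```python
def number_args(*number):
    # The star packs every positional argument into one tuple called number
    # Only the first two values are used; any extra values are ignored
    return number[0] * number[1]


result = number_args(5, 6, 1, 2, 8)
print(result)  # 30
```

Only index 0 and index 1 are read, so the 1, 2 and 8 are accepted but never used.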
now like I just said this is a tuple so we don't actually have to write out
these numbers like we just did we can pass through a tuple when we are
actually calling this function let's do that right up here let's call it
args_tuple and we'll do an open parenthesis and we'll use the same numbers
let's just copy them to make it easier and now we've created this tuple
right here which we can then pass in and this is a lot more handy a lot more
specific and this is most likely how someone would do something like this so
let's now create this and now we can copy args_tuple and pass it through now
really quickly this is going to fail and I'm doing that on purpose because I
want to show you what you need to do in order to pass through this tuple so
right now it's going to say tuple index out of range because the whole tuple
is being treated as one single argument all you have to do in order to use
this is specify a star before it just like you did when you were creating
your argument up here you have to put a star in front of the tuple that we
just passed through and now let's try running this and now it works properly
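The failure and the fix can be sketched like this. The function is repeated so the block runs on its own, and the try/except is an addition here (not something shown in the video) so the error can be displayed without stopping the script.

```python
def number_args(*number):
    # number is a tuple of all positional arguments
    return number[0] * number[1]


args_tuple = (5, 6, 1, 2, 8)

try:
    # Without a star the whole tuple is ONE argument, so number[1] does not exist
    number_args(args_tuple)
except IndexError as error:
    message = str(error)
    print(message)

# With a star the tuple is unpacked into five separate arguments
unpacked_result = number_args(*args_tuple)
print(unpacked_result)
```

The star at the call site mirrors the star in the definition: one unpacks a tuple into arguments, the other packs arguments into a tuple.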
now the last two things that we're going to look at are keyword arguments
and arbitrary keyword arguments there are more things that you can learn and
do within functions but again I'm just trying to teach you the basics to
make sure that you understand how they work so let's go right up here and a
keyword argument is kind of similar to this right here so let's actually
copy this and put it right down here now a keyword argument is very similar
in that you're going to specify your parameters right here but what we did
up here let me bring this down when we actually called the function was we
just put in a five and a three and when we did that it automatically
assigned five to number and three to power and that's totally fine and you
can do that but if you want a little bit more control you can use a keyword
argument so right here we could say power is equal to five and number is
equal to three so I just switched it around right before number was assigned
five and power was assigned three but I just switched it to show you how
this might work so let's run both of these and now it's 3 to the power of 5
which is 243 so that essentially is a keyword argument again it just gives
you a little bit more control you don't have to put them in specific
positions like you do if you're just passing multiple positional arguments
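A small sketch of positional versus keyword calls. The parameter names number and power come from the video, but the function itself is defined before this part of the transcript, so the name raise_to_power and the body below are assumptions made for illustration.

```python
def raise_to_power(number, power):
    # Hypothetical stand-in for the two-parameter function used in the video
    return number ** power


# Positional call: values are matched by position (number=5, power=3)
positional = raise_to_power(5, 3)

# Keyword call: values are matched by name, so the order no longer matters
keyword = raise_to_power(power=5, number=3)

print(positional)  # 125
print(keyword)     # 243
```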
now let's come right down here we're going to create basically another
custom function so for this one we're going to write def number_kwarg and
then we'll do an open parenthesis a colon and enter and what this one takes
is an arbitrary keyword argument now to specify an arbitrary argument all we
did was a star and then we input number but if we're doing an arbitrary
keyword argument we actually have to have two stars right here so let's
start taking a look and again arbitrary means we don't really know how many
keyword arguments we want to pass into our function so we're just going to
put star star number and then later within our body of code and when we're
calling it we'll be able to specify them and just like the arbitrary
argument before the arbitrary keyword argument means we really just don't
know how many keyword arguments we're going to need to pass into our
function
so to demonstrate this let's write print do an open parenthesis and we'll
say oops need to do an apostrophe my number is just like that with a little
space and we'll say plus and this is kind of where it gets a little
interesting or a little bit more tricky so we're going to say number so this
is us referencing our number and then we're going to do a bracket and then
I'm actually going to go to calling the function it's a little bit backward
or a little bit different than what you might think but when we're calling
it what I'm going to do is I'm going to say integer is equal to let's just
do some random number now when we're referencing that keyword within our
body of code what we're going to do is actually type out integer just like
this inside the bracket and this looks a little bit different but what this
allows us to do is we can put as many keyword arguments in here as we want
later and I'll show you in just a second but for us we're just creating this
key and this value when we are calling the function so now when we create
this and we run this oh whoops I forgot this has to be a string so let's run
this again and now it will say my number is 2309 then we're going to add
we'll say plus and this isn't going to look great but we'll say my other
number because this will all be on the same line that's okay my other number
and then we'll say number and we can specify again what we want in there so
now we can go down here to where we're calling it we'll put a comma and
we'll say integer oops integer2 is equal to we'll do a random number and
then we'll put integer2 in the bracket right here and then we'll add plus
right here so we don't error out we'll create this we'll run this and as you
can see both numbers were passed through again the formatting is terrible
but now you can see that you have this arbitrary keyword argument right here
and all we have to do is put star star number and we can pass through as
many of these arbitrary keyword arguments as we want as long as we just
specify them when we're calling it
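Here is a rough sketch of the arbitrary keyword argument function. The keys integer and integer2 and the value 2309 come from the video; the function name number_kwarg is a best reading of the captions, the second value 4356 is made up, and wrapping the values in str() is one assumed way to handle the "this has to be a string" fix mentioned above.

```python
def number_kwarg(**number):
    # Two stars pack every keyword argument into a dict called number,
    # e.g. {"integer": 2309, "integer2": 4356}
    # str() is needed because a string and an int cannot be joined with +
    return (
        "My number is: " + str(number["integer"])
        + " My other number: " + str(number["integer2"])
    )


sentence = number_kwarg(integer=2309, integer2=4356)
print(sentence)
```

The keyword used at the call site becomes the dictionary key inside the function, which is why the body looks up number["integer"] rather than a variable named integer.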
so that's all we're going to look at in today's video on functions there are
of course other things that you can do within functions and it can get a
little bit more advanced but I wanted to show you the basics the meat and
potatoes of things I definitely think you should know in order to get
started using functions I hope that you were able to understand functions
better because of this video if you did be sure to like and subscribe below
and I will see you in the next video [Music]
hello everybody today we're going to be talking about converting data types
in Python in this video I'm going to show you how to convert several
different data types including strings numbers sets tuples and even
dictionaries so let's start off by creating a variable we'll say num_int is
equal to 7 and we can check that data type by saying type and then inserting
our variable num_int and that will tell us that our data type for this
variable is an integer let's go ahead and create another one we're going to
say num_string is equal to and for this one we'll also do a seven but in
quotes let's check the type and we'll do an open parenthesis and we'll say
the type of num_string and that one is a string now let's say we wanted to
add those we'll say num_sum so the sum of num_int plus num_string now when
we're adding these two values it is not going to work it's going to give us
an error and it's going to say unsupported operand types for plus int and
str so it cannot add an integer and a string together what we need to do in
order to add these two numbers is to convert that string into an integer so
let's go right up here let's add another cell and let's say
num_string_converted is equal to and we want to convert it into an integer
so all we have to do to convert it into an integer is type int and then
we're going to say num_string and that is as easy as it's going to get all
we have to do is say int with our num_string inside of it and then it's
going to convert it and we can even check it right after by saying type
num_string_converted and let's run this and now we can see that it was
converted into an integer so now let's add that num_string_converted right
here let's copy it and replace that string with the converted one and let's
actually print out that num_sum and it worked properly
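The string to integer conversion above can be sketched as follows. The variable names follow how they are spoken in the video, so the exact spelling is an assumption, and the try/except is added here only so the error message can be shown without stopping the script.

```python
num_int = 7
num_string = "7"

try:
    # An int and a str cannot be added together
    num_sum = num_int + num_string
except TypeError as error:
    error_message = str(error)
    print(error_message)

# int() turns the text "7" into the number 7
num_string_converted = int(num_string)
print(type(num_string_converted))

num_sum = num_int + num_string_converted
print(num_sum)  # 14
```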
now we did not specify what type of value this num_sum was going to be but
because there were two integers in here it's going to automatically apply
that data type of integer to num_sum let's go right down here and now let's
look at how we can convert lists sets and tuples so now let's say we have a
list_type and that's equal to 1 2 3 and we can check it again by saying type
and that is a list let's say we want to convert it to a tuple it's fairly
easy all we're going to do is write tuple and pass in list_type and what
comes back is now going to be a tuple and we can check that by saying type
and wrapping it around this tuple call and it shows us that it is converting
that list into a tuple now we can also convert a list into a set but it may
change the actual values within it let's check that out really quickly so
let's say we have this list and let's add a few more values to this just
like that now let's say we want to convert it to a set so we're going to run
this and we'll say set of list_type and let's try running this and see what
the output is so this is something that you really need to be aware of when
you are converting data types because a set does not act the same as a list
a set is basically going to take only the unique values in the list and it
fundamentally changes the data that was in that original list and just to
check the data type we can say type I'm just doing this for all of them and
as you can see that is now a set
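A short sketch of the list conversions described above. The starting list 1, 2, 3 comes from the video; the extra values added before the set conversion are not spelled out in the captions, so the repeated values below are assumed.

```python
list_type = [1, 2, 3]

# tuple() returns a new tuple; list_type itself is still a list
as_tuple = tuple(list_type)
print(as_tuple, type(as_tuple))

# Assumed extra values: duplicates make the effect of set() visible
list_type = [1, 2, 3, 3, 2, 1]

# set() keeps only the unique values, so the data is no longer the same
as_set = set(list_type)
print(as_set, type(as_set))
```

Converting to a set also drops any guarantee about order, so it is not a safe round trip if the original order or the duplicates matter.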
now let's go down here and take a look at dictionaries now let's say we have
a dictionary called dict_type and we'll do a squiggly bracket and we'll say
name and we'll do a colon and we'll say Alex then we'll do age and a colon
and we'll say 28 and then we'll do hair colon N/A and so really quickly
let's take that dict_type and just confirm that it is a dictionary and it is
and now what we're going to do is take a look at all of the items within
that dictionary so we're going to do dict_type dot items open parenthesis
and this is going to show us all the items within it now we can also take
this and look at something like the values and when we run that these are
our values so within our dictionary we have items and that's what this is
right here this key and value pair is one item and then within that we have
our values which are right here so Alex 28 and N/A and then we have
something called a key and this is the key so name age and hair are all keys
and we can look at those by saying dot keys so let's say we want to take all
of the keys and put them into a list what we're going to do is take this
right here and say list we'll do an open parenthesis and wrap it around that
right there so it says list and we're converting these keys into a list and
let's run that and now this is a list and let's just check the type as well
just to confirm and as you can see it was converted properly into a list and
we can do the exact same thing with values and the values can also be
converted into a list
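The dictionary steps can be sketched like this. The keys and values follow the video as closely as the captions allow; the variable name dict_type and the "N/A" spelling for the hair value are assumptions.

```python
dict_type = {"name": "Alex", "age": 28, "hair": "N/A"}
print(type(dict_type))

# items() gives (key, value) pairs, keys() the keys, values() the values
print(dict_type.items())
print(dict_type.values())
print(dict_type.keys())

# Each of these views can be wrapped in list() to get a real list
keys_list = list(dict_type.keys())
values_list = list(dict_type.values())
print(keys_list, type(keys_list))
print(values_list, type(values_list))
```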
now we can also convert longer strings that aren't just numbers like we did
above in our very first example so let's do long_string and we'll say I like
to party now we're going to take this string and we're going to say list of
long_string so we're going to convert this string into a list and let's see
what happens so it took every single character in that string including the
spaces and put it into a list and we could also do a set as well that one's
a lot shorter because it's only keeping the unique values
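A quick sketch of converting a longer string. The text comes from the video, and the variable name long_string follows how it is spoken, so its exact spelling is an assumption.

```python
long_string = "I like to party"

# list() splits a string into single characters, spaces included
characters = list(long_string)
print(characters)
print(len(characters))

# set() keeps each distinct character once, so the result is shorter
unique_characters = set(long_string)
print(len(unique_characters))
```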
so that is how you convert data types in Python thank you guys so much for
watching I really appreciate it if you like this video be sure to like and
subscribe below and I'll see you in the next video [Music]
hello everybody today we're going to be working on building a BMI calculator
in Python now before we get started I want to show you this BMI calculator
that I found online and it shows you the basic calculation that they use and
that's the one we're going to use in this video and they also have this
calculator right down here and some ranges that we can use for our
calculator as well so for reference I weigh about 170 pounds and I'm about 5
foot 9 let's calculate this so I'm about a 25.1 BMI which falls into the
overweight category that's unfortunate but we can see exactly how this works
and how ours should work when we actually build it so we're going to kind of
reference this throughout the video so let's go right over here to our BMI
calculator we need to collect weight and height and then run this
calculation right here so let's go ahead and copy this and we're going to
put it right down here and so now we have our calculation
so what we need is input from a user and there is an input function within
Python that we're going to be using so let me actually give myself a few
more cells so the first thing that we need to collect is their weight let's
type out weight right here we'll say weight is equal to and this is where
we'll use our input function so we'll say input and when we actually run
this it's just going to give us this blank box where a user can input
something we'll say Alex so our output is what the user actually typed in
and it does save it to this variable so if we say print weight it will still
print out Alex now this is where we want the user to input their weight so
we want to kind of give them a prompt for this we'll put a string in here so
I'll do a double quote and then I'll say enter your weight in and we're
using pounds so we'll say pounds colon space so now when we do this it'll
say enter your weight in pounds I'll say 170 and then when we run this it
does store that now let's do print weight again and now it's only storing
the value of 170 it's not actually storing this prompt string right here so
that's really important for when we do our calculations later
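Here is a hedged sketch of what input does with the prompt. In the notebook you would simply write weight = input("Enter your weight in pounds: "); the helper ask_weight and its read parameter are assumptions added here so the example can run without anyone typing.

```python
def ask_weight(read=input):
    # input() shows the prompt, waits for the user, and returns what they typed
    # The prompt text is only displayed; it is never part of the returned value
    return read("Enter your weight in pounds: ")


# A stand-in reader pretends the user typed 170
weight = ask_weight(read=lambda prompt: "170")
print(weight)        # 170
print(type(weight))  # <class 'str'>
```

Calling ask_weight() with no argument falls back to the real input function and waits for the keyboard.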
um I'm going to save this right down here because I'm sure I'm going to use
that later so we have that and it's working now we need to also do our
height so let's copy this and we'll put it right here and we'll do height
and enter your height in inches so now for this one if we hit enter it's
actually still running so let's stop it really quick and interrupt it and
let's try running this again so it's going to say enter your weight in
pounds that's the first input I'll say 170 and then when I hit enter it's
going to prompt me for that second input and so in inches 5 foot 9 is 69
inches and then I can hit enter again and now we have both of our inputs
our inputs now we need this calculation right down here and just like that so
right down here and just like that so now we have weight in pounds time 703
now we have weight in pounds time 703 divided by height in inches by height in
divided by height in inches by height in inches so we actually have weight and
it's already written in there but I'm
just going to do it like this we'll do
weight times 703 so that's pounds there
our weight in pounds times 703 divided by
now we have our height in
inches times the height in inches so
this is our calculation right here so
let's do this exact same thing let's run
this and this times of course is not
going to work whoops we need to do our
star for both of these all right now
this is our calculation so let's run
this so we have
170 and that's pounds and inches was 69
hit
enter and it says can't multiply
sequence by non-int of type str
ah that's because these are being stored
as strings right down here I'll do and
we'll do type of height we run that this
is actually a string so we want to
change that because we don't need that
anymore
so we don't want it to be a string
we need those to be integers or floats
or really anything besides a string it
just needs to be numerical uh so integer
or float really so let's do integer and
we'll wrap that input in it and we'll do
the same thing for this
one now we have an integer for our
weight an integer for our height so now
when we're running this calculation it
should work properly let's run this
again our pounds are
170 our height is 69 inches
and it's not giving us our output
because we're not printing anything okay
so I just need to
do
print BMI so let's try this again 170
69 and there is our BMI 25.1 so it
worked the exact same as this one so
they input well we input our height we
inputted our or we inputted our weight we
inputted our height and then it
calculated our BMI
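To make the step above concrete, here is a minimal sketch of the calculation and of why the inputs have to be wrapped in int(). The function names and the prompt wording in the comments are made up for this sketch and are not from the video; the formula (weight in pounds times 703, divided by height in inches times height in inches) is the one used in the walkthrough.

```python
def bmi_from_imperial(weight_lb, height_in):
    """BMI from pounds and inches: weight * 703 / (height * height)."""
    return weight_lb * 703 / (height_in * height_in)


def bmi_from_text(weight_text, height_text):
    """Turn the text that input() returns into integers, then compute BMI."""
    # input() always returns a str. Multiplying two strings together raises
    # TypeError, so each value is converted with int() before the math runs.
    weight_lb = int(weight_text)
    height_in = int(height_text)
    return bmi_from_imperial(weight_lb, height_in)


# In a notebook the two strings would come from input(), for example:
#   weight = int(input("Enter your weight in pounds: "))   # hypothetical prompt
#   height = int(input("Enter your height in inches: "))   # hypothetical prompt
bmi = bmi_from_text("170", "69")
print(round(bmi, 1))  # prints 25.1
```

Nothing is shown until print is called, which is the same thing the walkthrough runs into right after the conversion is fixed.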
the next thing that we
need to do is we need to kind of give
the user some context is that good is
their BMI within a good range a bad
range we don't know uh so let's go ahead
and I'm going to see if I can copy this
I don't know if this will work or
not let's go ahead and copy this right
down here perfect so what we now need to
do is we need to say okay if the user
has given us this input we want to give
them or tell them if they are a normal
weight overweight obese severely obese
anything like that and we have these
ranges so that should help us out quite
a bit so let's just write our if
statement and then we'll include it up
here but let's go down here and we'll
say if and then we'll do BMI and let's
just say BMI is greater than zero so if
it's greater than zero if they had any
input where the BMI was not zero which
should be every time if they do it
properly and they don't you know put a
string in there or something or type out
forty which maybe we should make a prompt
for that if that happens then we can say
if we'll do
BMI and now we need to give that first
range so this range right here so if
it's under 18.5 so we need to do a less
than so if it's less than
18.5 and it just says under it doesn't
say under or equal to so I'll keep it at
18.5 so if it's under
18.5 then let's give kind of the output
we'll say print
and the output or basically the
prompt is underweight so we'll just say
you are
under lowercase underweight and just
like that um then we're going to pass
several elif statements through here
well let's just say else so I guess this
would be like if they are if they don't
input something properly if something
messes up
maybe we could write something like um
print
oops I'm thinking all this through we
can write print
enter valid
inputs or something like this or we can
always change that but let's really
quickly let's run
this okay so I'm not in that range uh
let's make the next one so then I can be
within a certain range
oops and we need we should need one
more at minimum so we'll say
elif and
elif these next two are this 24.9 so it's
going to check this one first so if it's
18.5 or below 18.5 it's automatically
going to print this one so this next one
we don't have to do like a range or
anything we can just say if it's below
if it's between 25 and 29.9 so this one
actually should be less than or equal to
um this one is normal oh whoops
24.9 so this one is
24.9 this one is going to say you are
normal weight so let's run this
now let's see BMI was
25.1 oh guys I'm just messing up here I
apologize all right this is the one that
I was part of so now it's going to be
I'm part of the overweight crowd now now
let's run this and now our prompt is you
are overweight cuz remember the BMI was
saved right here as
25.1 down here if we run through this
it's saying no you're not in
oops get rid of that no you're not
under 18.5 you're not under
24.9 if you're under
29.9 you are overweight so that did work
properly so that's really good
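The reason each elif only needs an upper bound is that the branches are checked from top to bottom and the first match wins. A small sketch of the three ranges covered so far, assuming a helper name (bmi_category) and a placeholder final message that are not in the video:

```python
def bmi_category(bmi):
    """Return a label for bmi using upper bounds only, checked in order."""
    if bmi < 18.5:
        return "underweight"
    elif bmi <= 24.9:
        # Only reached when bmi is already 18.5 or more.
        return "normal weight"
    elif bmi <= 29.9:
        # Only reached when bmi is already above 24.9.
        return "overweight"
    else:
        # Placeholder for the higher ranges added later in the walkthrough.
        return "above the ranges handled so far"


print(bmi_category(25.1))  # prints overweight
```

One side effect of this style is that a value between two printed ranges, such as 24.95, falls through to the next branch instead of being left unclassified.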
and I
don't think I want this to be our output
for a person because we're going to add
this up here it's just going to give us
the BMI and then the output is going to
say you are overweight uh let's make it
a little bit more customized um I'm
going to say name is equal to input and
then we'll say
enter your
name um so it'll be enter your name
we'll do Alex
170 69 there's our BMI now it's going to
run through this logic or it will run
through this logic in just a
second
when we actually finish this so then we
have
34.9 and let's do one
more oops and then this one's going to
be for
39.9 so this one was overweight this one is
obese severely obese so we'll say
severely that you spell it really obese
and then anything that's over that 40
and over so if it's not this one
anything else should be morbidly obese
so actually this else statement right
here should
say uh you
are you are severely obese this is going
to say morbidly morbidly obese now I
added that name up here because I wanted
to add that down below actually so
we're going to say uh name plus and then
we'll do like
comma you are underweight so it'll be a
little bit more personalized uh I think
it'll I think it'll be a nice touch I
really do we'll do it like this and
we'll say you and let's go back and do
that to all of
them and let me see how quickly I can do
this oh whoops what did I do get rid of that
name plus you like that geez you guys are
seeing me mess up ah name plus you and
then name plus you so now let's run this
and now it's a little more personalized
it says Alex you are overweight so this
is all really good now this is an if
statement um what we had done before I
think is actually what we should put
right down here so we'll say else and
then if that doesn't work we'll say what
do we say enter valid input we'll just
put that um and let me see if I can
test this out I don't know if this
will error out or if this will even
work let me just see if I can mess with
it and see if I can get it to work
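On the question of whether bad input will error out: because the inputs are wrapped in int(), typing a word such as forty raises ValueError on that line, before the if and else ever run. The final else is therefore only reached when the computed BMI is zero or negative. One possible way to catch the bad text first is sketched below; the helper name parse_positive_int is hypothetical and this check is not part of the video's code.

```python
def parse_positive_int(text):
    """Return text as a positive int, or None when it is not usable."""
    try:
        value = int(text)
    except ValueError:
        # int("forty") ends up here instead of crashing the notebook cell.
        return None
    return value if value > 0 else None


for raw in ["170", "forty", "-5"]:
    number = parse_positive_int(raw)
    if number is None:
        print(repr(raw), "-> enter valid input")
    else:
        print(repr(raw), "->", number)
```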
actually let's copy this we're going to
copy this whole thing we're going to
include it right
here and now we have basically our
entire calculator so um let's run this
enter your name we'll say Alex enter
your pounds 170 enter your inches
69 and then it's going to say
25.1 Alex you are overweight and that's
perfect
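Put together, the calculator at this point could look roughly like the sketch below. The ranges and labels follow the walkthrough; the function names, the prompt wording, and the ask parameter (which lets scripted answers stand in for typing) are assumptions made for this example.

```python
def bmi_message(name, weight_lb, height_in):
    """Return the BMI and a personalized message for it."""
    bmi = weight_lb * 703 / (height_in * height_in)
    if bmi > 0:
        if bmi < 18.5:
            label = "underweight"
        elif bmi <= 24.9:
            label = "normal weight"
        elif bmi <= 29.9:
            label = "overweight"
        elif bmi <= 34.9:
            label = "obese"
        elif bmi <= 39.9:
            label = "severely obese"
        else:
            label = "morbidly obese"
        return bmi, name + ", you are " + label + "."
    return bmi, "Enter valid input."


def run_calculator(ask=input):
    """Ask for name, weight and height, then print the BMI and the message."""
    name = ask("Enter your name: ")
    weight_lb = int(ask("Enter your weight in pounds: "))
    height_in = int(ask("Enter your height in inches: "))
    bmi, message = bmi_message(name, weight_lb, height_in)
    print(round(bmi, 1))
    print(message)


# Scripted answers replace keyboard input so the example runs unattended.
answers = iter(["Alex", "170", "69"])
run_calculator(ask=lambda prompt: next(answers))
```

Calling run_calculator() with no argument uses the real input() prompts instead of the scripted answers.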
we could even go as far as
adding like some feedback we say you are
overweight and then it would be a period
and we could say um you need to exercise
more
stop sitting and writing so many python
tutorials so now if we run this we'll do
Alex
170 69 it says Alex you are overweight
you need to exercise more and stop
sitting and writing so many python
tutorials period and that's it this is
the entire project um you can go a ton
farther you can include much more
complex logic you could even build out a
UI to create your own you know app just
like this where it has this input and
this UI you can build that out within
jupyter notebooks with python um but
that's not really what this tutorial is
for this is just to kind of help you um
think through some of the logic of
creating something like this so you know
I hope that this was helpful I hope that
this was fun I like creating stuff like
this we have two other projects that
we're going to do and maybe I'll include
more but we have two right now that I
have planned um and I hope those
are helpful this is probably our easiest
one and they'll get a little bit more
difficult in the next projects so I hope
that this was fun I hope that this was
helpful and that you can now kind of
utilize those python skills that you've
been working on if you like this video
be sure to like and subscribe below and
I'll see you in the next
[Music]
video
hello everybody today we're going
to be creating an automatic file sorter
for your files in file explorer now out
of all the projects that we've done in
this series so far I think this one
might be the most difficult but I also
think this one is the most cool because
it has some real life applications so
without further ado let's take a look at
some files that we have right down here
in my file explorer so I have this
beautiful picture of Rosie uh right here
this is a PNG file I have a CSV file and
a text file and I want to sort all of
them into their own folders depending on
what kind of file it is so if I go right
in here and I click on this one I go to
properties I can see that this is a PNG
file um if I go into this one I don't
need to but if I go into this one it's a
CSV file and of course this one is a
text file so I want three separate
folders in here and I want them to
automatically go into those folders
without me having to drag and drop and
going and clicking now we only have four
files here but imagine if we have
thousands of files
how much time that could save us so
let's get out of here and let's start
writing our code so we're going to say
import os comma and then we're going to
say shutil now os obviously stands for
operating system shutil uh I don't know
what it's actually supposed to stand for
but what it will allow us to do is do
some high-level operations on our files
in file explorer so we're going to go
ahead and import those and now that we
have those imported
uh something that's going to be very
important for us to have throughout this
whole thing and this is anytime I'm
working with like directories or
something like this we want to get this
path down so I'm going to go ahead and
copy this
path and we're just going to say path is
equal to and we'll do this right here so
let's run this and I need to put an r
right here to make this a raw string um so
when you don't have the r uh it's going
to read in these you know these
backslashes and these colons and
different stuff if we do r it's just
going to read it in as the raw string
and that's what we want
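A quick illustration of what the r prefix changes. In a normal string Python treats a backslash followed by certain letters as an escape sequence, so a Windows path can silently turn into something else; colons are not affected. The path below is hypothetical and chosen so that it contains two such sequences.

```python
# Hypothetical Windows-style path used only for illustration.
plain = "C:\new_files\table.csv"   # \n becomes a newline, \t becomes a tab
raw = r"C:\new_files\table.csv"    # the r prefix keeps every backslash as typed

print(repr(plain))
print(repr(raw))
print(len(raw) - len(plain))       # the raw version keeps two more characters
```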
so here's what
we need to do there's a few
different things that have to happen
when we are writing this out one thing
is we need to go in here and we need
to see this path and we need to see are
there folders in here already um if not
we need to create a folder so that's one
of the first things that we need to do
the next thing that we need is it needs
to check each of these files
individually identify what kind of file
it is and then put it into the correct
folder so we have to create the folder
then check these and then place it into
the correct folder so let's go right out
of here
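The three steps just listed (create the folder if needed, check each file, place it in the right folder) can be sketched end to end as below. This is one possible shape for the plan rather than the code written in the video: the function name sort_files, the extension-to-folder mapping, and the use of os.path.splitext and shutil.move are assumptions for this example, and it runs in a temporary folder so nothing real is moved.

```python
import os
import shutil
import tempfile


def sort_files(path, folder_for_extension):
    """Move each file in path into the folder mapped to its extension."""
    for file_name in os.listdir(path):
        source = os.path.join(path, file_name)
        if not os.path.isfile(source):
            continue  # skip folders, including the destination folders
        extension = os.path.splitext(file_name)[1].lower()
        folder = folder_for_extension.get(extension)
        if folder is None:
            continue  # leave unknown file types where they are
        target_dir = os.path.join(path, folder)
        os.makedirs(target_dir, exist_ok=True)  # create the folder if missing
        shutil.move(source, os.path.join(target_dir, file_name))


# Demo in a throwaway directory with three empty files.
demo = tempfile.mkdtemp()
for name in ["results.csv", "notes.txt", "rosie.png"]:
    open(os.path.join(demo, name), "w").close()

sort_files(demo, {".csv": "csv files", ".png": "image files", ".txt": "text files"})
print(sorted(os.listdir(demo)))
```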
so what we're going to start
doing is we're going to start working
with these paths and these directories
and some of these things you may never
have seen before but that's okay I'll
try to explain it as I go through so the
first thing that we're going to write is
os.listdir uh and what this is
actually going to do is show us all the
files in there we're going to say path
so it should show us all the files
within path and so here are our results
so we have the data professional results
fake text file our image and our other
image so this is actually showing us
what files are in that path and that's
super important because we're probably
going to have to loop through this in
some way later
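A small sketch of what os.listdir returns, using a temporary folder with made-up file names in place of the real directory. Two details are worth knowing before looping over the result: the entries are bare names rather than full paths, and subfolders are listed alongside files.

```python
import os
import tempfile

# Stand-in for the real folder: three empty files and one subfolder.
path = tempfile.mkdtemp()
for name in ["survey.csv", "notes.txt", "rosie.png"]:
    open(os.path.join(path, name), "w").close()
os.mkdir(os.path.join(path, "csv files"))

entries = os.listdir(path)  # names only, in no guaranteed order
print(sorted(entries))

# Keep only real files by joining each name back onto the folder path.
files_only = [e for e in entries if os.path.isfile(os.path.join(path, e))]
print(sorted(files_only))
```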
um I wrote this all out
before so I kind of remember but I'm
doing this all off the top of my head so
I guarantee you throughout this I'll
make some mistakes but what we now need
to do is we need to create folders or
check if there's a folder and create it
if it isn't there that's um the next
step that we need to take so let's go
right down here and we want to check if
this path exists already so if that
folder already exists so we're going to
say
os.path.exists so this is going to
check does this path just like this path
up here does it already exist and then
we're going to do an open parenthesis
we'll say path so that's our path now we
need to add a folder name to this um we
could hardcode it so we could do plus we
could say CSV files and that could work
so it would say does this path already
exist and we can try running this and
it's going to say false so this doesn't
already exist but the thing is we
need to create three separate paths so we
could do this by just hardcoding it in
by saying CSV files image files um and
text files or we can just put this all
in a list and loop through it I think
it's just going to be easier to do that
or I don't know visually it's going to
be easier so we'll do uh folder underscore
names and we'll say is equal to and
we'll create a list so I think I want to
call it CSV files comma um image files
or PNG files whatever you want to write
and then we'll do text
files text files and then we can go
right down here um a little for loop uh
I think what we'll do actually let's
write
folder underscore names um then we can
put something like uh let's write loop
why not um so a little trick for the for
loop is you're going to say for and we'll
say loop and we'll just do a range
because we want it to basically go
through here we don't want it to
actually give us these file names we
just want it to count zero one and two so
if we do range from zero to two zero uh zero
one two that should work if we do um this
then when it loops through it's going to
call folder names and say zero which
would be CSV files image files and text
files um so
let's uh yeah I need a colon let's run
through this really quickly uh shouldn't
do
anything
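One thing to watch with the range trick: range stops before its second argument, so range from zero to two only counts zero and one. The sketch below shows that, plus two ways to cover every item in the list. The variable names folder_names and loop follow the walkthrough; the lowercase folder names are just example values.

```python
folder_names = ["csv files", "image files", "text files"]

# range(0, 2) stops before 2, so the third folder would be skipped.
print(list(range(0, 2)))  # prints [0, 1]

# range(len(folder_names)) always covers every index, here 0, 1 and 2.
for loop in range(len(folder_names)):
    print(loop, folder_names[loop])

# Looping over the list directly avoids the index bookkeeping altogether.
for folder in folder_names:
    print(folder)
```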
but what we can do now is we
can say okay if this does not exist what
we can do is actually create it so we'll
say
if not so if this does not exist then
what we're going to do is take
this and we'll say
os.makedirs and then we'll do
just like that um I think it's
makedirs I think that's
correct um so let's test this out really
quickly let's see if this
works and invalid syntax I need a
colon okay so I just ran this let's see
if it did actually make those
folders let's refresh it and it didn't
so let's just print this off um so if
not let's just print let's see does this
actually
work let's do
if
okay ah okay so I think I know what
might be happening I think it's giving
us it actually be let's check this
really quick go to python
tutorials oh
no I think it's
creating yeah it's creating these python
tutorial images right here whoops okay
so I just figured it out um let's go
back into python tutorials don't take a
look at any of those notebooks those are
secret um we were creating them in the
wrong place um and that's because of
this right here we need a backslash so
we need to actually include a backslash
right here in this path we didn't
have that um EOL while scanning string
literal okay so this backslash could
cause an issue let's see if I can do
forward slashes on all these just stick
with me guys I might cut this out I
might not we'll see if this is important
just going to keep talking while we're
doing it um let's run
this okay so now that we're doing these
forward slashes we're still checking
let's make sure we can still check those
files good
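The string error above has a specific cause: a raw string cannot end with a single backslash, because the backslash still stops the closing quote from ending the string. The sketch below shows the failure safely by handing the bad line to compile() as text, then shows two spellings that do parse. The folder in the examples is hypothetical, and the exact error message depends on the Python version.

```python
# The text of a line like:  path = r"C:\Users\demo\files\"
# It is kept inside a normal string so this example itself still parses.
bad_source = 'path = r"C:\\Users\\demo\\files\\"'
try:
    compile(bad_source, "<example>", "exec")
except SyntaxError as error:
    print("SyntaxError:", error.msg)

# Two spellings that work when the path needs a trailing separator.
with_forward_slashes = "C:/Users/demo/files/"
with_doubled_backslashes = "C:\\Users\\demo\\files\\"
print(with_forward_slashes + "csv files")
print(with_doubled_backslashes + "csv files")
```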
now when we loop through this
I'm not going to well yeah I can print
it off doesn't matter I'm going to print
it and we'll see if that name works and
then we're also going to
um uh I said if so if it exists then
make it no no no so if not I think the
not did make sense we just weren't sure
we had to do some um checking so if it
doesn't exist then we're going to create it and
we'll keep the print in there because it
doesn't really matter so it's going to
create the CSV and image but it didn't
create the text let's see okay let's uh
I don't know why this would work but
let's run it okay so I think I just had
the wrong range so now we have our
images all three or we have our
folders all three folders
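Both problems hit in this part (folders created in the wrong place because the separator was missing, and only two of three folders created because of the range) can be avoided with os.path.join and a direct loop over the list. This is a sketch of that alternative rather than the code typed in the video; it uses a temporary folder as a stand-in for the real path.

```python
import os
import tempfile

path = tempfile.mkdtemp()  # stand-in for the real folder path
folder_names = ["csv files", "image files", "text files"]

for folder in folder_names:
    # os.path.join inserts the separator, so the new folder lands inside path
    # instead of having its name glued onto the last folder in the path.
    target = os.path.join(path, folder)
    if not os.path.exists(target):
        os.makedirs(target)

print(sorted(os.listdir(path)))
```

Running the loop a second time does nothing, because the exists check skips folders that are already there.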
folders all three folders now we need to write a script that will read in these
write a script that will read in these and check and see what kind of file it
and check and see what kind of file it is and place it into the correct
is and place it into the correct folder so let's come right down here and
folder so let's come right down here and let's see what we need to do so now I
let's see what we need to do so now I think we need to use this right here um
think we need to use this right here um I think we need to Loop through this to
I think we need to Loop through this to be able to check each one so we need to
be able to check each one so we need to name this so we'll just do um file name
name this so we'll just do um file name is equal to run that so now we have this
is equal to run that so now we have this file name um and what we can do is Loop
file name um and what we can do is Loop through this so let's say let's say for
through this so let's say let's say for file in file name so we're going to Loop
file in file name so we're going to Loop through this now when it goes through it
through this now when it goes through it needs to check the it's going to check
needs to check the it's going to check the file path and in the file path it'll
the file path and in the file path it'll say. txt CSV so let's say um if I think
say. txt CSV so let's say um if I think it should be CSV Let's test it on this
it should be CSV Let's test it on this one but if CSV is
one but if CSV is in file name or actually it's file so if
in file name or actually it's file so if if it's in
if it's in file and not in and oh not not in if
file and not in and oh not not in if it's also not in this I believe because
it's also not in this I believe because we're going to check we're going to
we're going to check we're going to check each of those folders so we're
check each of those folders so we're going to Loop through and it's going to
going to Loop through and it's going to check and see if the CSV so if that
check and see if the CSV so if that string is in the
string is in the file then what we want to do is check
file then what we want to do is check that it's also not in here that's
that it's also not in here that's actually just the folder we also need um
actually just the folder we also need um also we're not doing that for Loop
also we're not doing that for Loop anymore um
Okay, I'm sorry, I'm talking this through and figuring it out as I go, because I may have
forgotten some of this. So we're going to say this: that's the csv files folder, so we need to
check this one. Let's do it like this. Oops. Okay, so it's going to check against csv files, and I
think it needs a slash in between, so it's going to say the path, so there's our path, plus slash
csv files. Actually no, it needs to be like this, because we're going to check that. I got it. All
right, I figured it out. Then we're going to check if this file is in there. Yeah, so that's
right. So it says: if ".csv" is in the file, which comes from, where am I looking, oh, file_name,
so if it's in that list of the actual files, which is all of these; if we find ".csv" in any of
these files and it's not already in here, so it's going to say path plus csv files, did I say
files? Yeah, csv files, plus file. Okay,
that all looks correct. So if it's not in there, we're going to use shutil.move. This is how we
actually move the file; it gives us the ability to move what we want. With move, we need to take
the file from its initial path to its new path, so we'll specify both, separated by a comma. First
we need to specify its original path, which should just be this without the folder part. I think
it should be the path plus the file, because this is where it is now: it's in this path with that
file name. Then we need to say we want to move it to here. That is what we want to do. So let's
check it with just this one and see if it works. Okay, it ran through. Let's go check. Aha, now
that CSV file is gone. Perfect,
that is exactly what we wanted to happen. Now we can just recreate this for both our PNG files,
our image files, and our text files. So we'll say elif, and elif again. Let's do ".png", then
we'll do image files and image files, because again we're just doing the exact same thing. The
next one's going to be text files and text files, so this one's going to check for ".txt". Now do
we need anything else? We'll just say else and print "This file type is not included", or, if
there are multiple files, we'll say "There are files in this path that were not moved!" Okay, so
if we run through this, it's going to catch our CSV, catch our PNG, catch our text, and if not
it'll say "There are files in this path that were not moved!" All right, now let's run through
this.
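The sorting loop described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the exact code from the video: the folder names and the sort_files helper are hypothetical, it compares the real file extension instead of searching for a substring, and it replaces the if / elif / else chain with a dictionary lookup. It also skips anything that is not a regular file, because an else branch inside the loop would otherwise fire once for every entry that matches no extension, including the destination folders themselves.

```python
import os
import shutil
import tempfile

# Hypothetical folder names, mirroring the three folders created earlier.
FOLDER_FOR_EXTENSION = {
    ".csv": "csv files",
    ".png": "image files",
    ".txt": "text files",
}


def sort_files(path):
    """Move each file in `path` into the folder that matches its extension.

    Returns the sorted names of files whose type has no matching folder.
    """
    not_moved = []
    for name in os.listdir(path):
        source = os.path.join(path, name)
        if not os.path.isfile(source):
            continue  # skip the destination folders themselves
        extension = os.path.splitext(name)[1].lower()
        folder = FOLDER_FOR_EXTENSION.get(extension)
        if folder is None:
            not_moved.append(name)
            continue
        destination_dir = os.path.join(path, folder)
        os.makedirs(destination_dir, exist_ok=True)
        destination = os.path.join(destination_dir, name)
        # Same guard as in the lesson: only move if it is not already there.
        if not os.path.exists(destination):
            shutil.move(source, destination)
    return sorted(not_moved)


# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as demo_path:
    for name in ("data.csv", "photo.png", "notes.txt", "report.pdf"):
        open(os.path.join(demo_path, name), "w").close()
    leftover = sort_files(demo_path)
    moved_csv = os.path.exists(os.path.join(demo_path, "csv files", "data.csv"))
    print(leftover, moved_csv)
```

Returning the list of leftover files once, instead of printing inside the loop, avoids one message per unmatched entry.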
Uh-oh. That's because it goes if, elif, elif, and then it falls into this else statement. I don't
know, let's circle back around to that in a second. All of them were moved properly, and that's
really good. I'm going to take the else out for now and just run it. We may or may not go back to
that, but let's check and see if everything worked properly. Let's go into the csv files folder,
and we have our CSV file. Let's go into our image files, and we have our images. And let's go into
our text files, and there are our text files. Now, is there anything else that we need to do? I
don't believe so. But what I can do is take all this, include it in here, and I'm going to
basically restart
it, just to see if it works properly from scratch. I just want to make sure that I didn't miss
anything. We'll delete these, and I'm just going to rerun everything. We did our imports, we
created our path, these are our file names, and then when we run this it should take our folder
names and check through them: if a folder isn't already created, it's going to create it. We don't
need it to print, so let's get rid of that. Then for each file within our file names, it checks
each one. We check if there's a ".csv" and whether it's already in that folder. If it's in that
folder, then it doesn't do anything, but if it isn't, so "and not", it is going to move it to that
location. So it's going to check CSV, PNG, and text. I think everything should work properly.
Let's run this. It looks like it's working. Good, good, good, and perfect: it worked exactly how I
had hoped. That's great. So
this is the automatic file sorter in File Explorer project. You can go even a step further. I had
to come in here and manually run this, but you can put a timer on it so that it automatically does
this maybe every hour, every day, or every 30 minutes. Especially if you create something like an
executable for this, you can run it in your background.
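One way to picture the timer idea is a small loop that calls a task and then sleeps. This is only a sketch under assumptions: run_every is a hypothetical helper, not something shown in the video, and the max_runs parameter exists only so the demo stops. An operating-system scheduler such as Windows Task Scheduler or cron is a common alternative to keeping a Python loop running.

```python
import time


def run_every(interval_seconds, task, max_runs=None):
    """Call `task()` repeatedly, sleeping `interval_seconds` between calls.

    With max_runs=None the loop never ends; a number limits how many times
    the task is called.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        # Only sleep if another run is still coming.
        if max_runs is None or runs < max_runs:
            time.sleep(interval_seconds)
    return runs


# Demo: a stand-in task that just records when it was called.
calls = []
completed = run_every(0.01, lambda: calls.append(time.time()), max_runs=3)
print(completed, len(calls))
```

For the file sorter, the task would be whatever function does the sorting, and the interval would be something like 3600 seconds for an hourly run.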
If you are curious about how to do that, I think I did something similar in my web scraping
project, my Amazon web scraping project, if you want to go check that one out. But we're not going
to do it in this project; this is all I wanted to show you how to do. So I hope that this was
helpful. I hope that this project was interesting and that you liked it, and I hope that you
learned something. If you did, be sure to like and subscribe below, and I will see you in the next
video.
What's going on everybody! Welcome back to another video. Today we're going to be starting our
Python web scraping tutorial series. This is more of a continuation of the Python tutorial series,
but because we're going to be focusing on web scraping for three or four videos, I wanted to make
it its own little miniseries. In this series I'm going to show you the basics of web scraping: how
to actually look at HTML, how to inspect a web page, how to pull that data in, and then even put
it into a CSV file so you can save it and use it. In this series we're just covering the basics,
which is a fantastic place to start, but in future series I'll be going into some of the more
advanced web scraping topics as well. So without further ado, let's jump on my screen and get
started with web scraping.
The first thing that we need to learn is HTML. HTML stands for HyperText Markup Language, and it's
used to describe all of the elements on a web page. When we actually go to a website and start
pulling data and information, we need to know HTML so we can specify exactly what we want to take
off of that website. So that's where HTML comes in, and we're going to look at the basics,
understanding just the basic structure of HTML. Then we'll go look at a real website, and you'll
see that it's a little bit more difficult than what we have right here, but these are the basic
building blocks to get to what the HTML actually looks like on a website.
Now this is basically what HTML looks like. We have these angle brackets with things like <html>,
<head>, <title>, and <body>, and you'll notice that we have a <body> and then a </body> at the
bottom. This forward-slash body denotes the end of the body section in HTML, so everything inside
of this is within the body. So there is this hierarchy within HTML. We have <html> and </html> at
the bottom, which encapsulate all the HTML on the website. Then we have things like <head> and
</head>, <body> and </body>. Within these sections we usually have things like classes, tags,
attributes, text, and other things that we'll get to in different lessons. But one of the easiest
ones to notice and look at are tags, things like a <p> tag or a <title> tag. Within these tags,
because this is a super simple example, we have these strings here, like "My first web page", and
this is what's called a variable string.
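To make that hierarchy concrete, here is a small sketch that walks a tiny page and prints each tag indented by how deeply it is nested. It uses html.parser from Python's standard library purely for illustration (it is not the library used later in this series), and both the sample page and the OutlinePrinter class are assumptions made up for this example.

```python
from html.parser import HTMLParser

# A tiny page in the same shape as the slide; the exact wording is an assumption.
SIMPLE_PAGE = """
<html>
  <head>
    <title>My first web page</title>
  </head>
  <body>
    <p>Some paragraph text.</p>
  </body>
</html>
"""


class OutlinePrinter(HTMLParser):
    """Collect each opening tag and each piece of text with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append("  " * self.depth + f"<{tag}>")
        self.depth += 1  # everything until the closing tag is nested inside

    def handle_endtag(self, tag):
        self.depth -= 1  # the closing tag ends that section

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.outline.append("  " * self.depth + repr(text))


parser = OutlinePrinter()
parser.feed(SIMPLE_PAGE)
print("\n".join(parser.outline))
```

The printed outline shows <html> at the outer level, <head> and <body> inside it, and the text strings nested inside their tags.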
And this is actual text that we could take out of this web page. Now that you understand the super
basics of HTML, let's actually go to our website. I'm going to have a link down below, but it's
going to be this one right here. This is basically just a website that you can practice web
scraping on; it's called scrapethissite.com. What we're going to do is look at the HTML behind
this web page, and you can do this on any website that you go on. So we're going to right-click
and go down to Inspect. Now right off the bat, this looks a lot more complicated and a lot more
complex than the very simple illustration that we were looking at, but let's roll this up just a
little bit. You'll notice we have <html> and </html> at the bottom, we have a <head> and there is
the end of the head, and then a <body> and the end of the body. So in a super simple sense it is
similar, but the information that's within it is a lot more difficult. Now if we look at this
title right here, this is our title tag. If we click this little arrow, this is our dropdown, and
you'll notice that here we have the string "Hockey Teams: Forms, Searching and Pagination". Now
let's say we didn't want to click on that and go find it. There's something that's super helpful
within this inspection page that you can click on right here; it says "Select an element in the
page to inspect it". So we're going to click on that, and as we go through our page and click on
this title, it's going to take us to exactly where this is in our HTML. This is extremely helpful,
extremely useful. For example, let's say the data I want is down here: I want to take in the
Boston Bruins. I can click on it and it's going to take me to exactly where that is in the HTML.
This is where we can start writing our web scraping script to specify: okay, I'm looking for a
<tr> tag, I'm looking for a <td> tag, I'm looking for the class called team. This is all
information that we can use to specify exactly what we want to pull out of our web page.
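As a rough illustration of what "specifying" means, the sketch below pulls the text out of a <td> cell by matching its tag name and class. Everything here is an assumption for demonstration: the HTML snippet is made up to resemble one table row (the class names and cell values are placeholders, not copied from the site), CellFinder is a hypothetical helper, and it uses the standard library's html.parser rather than the scraping library introduced in the next lesson.

```python
from html.parser import HTMLParser

# Hypothetical snippet shaped like one table row; class names are assumptions.
ROW_HTML = """
<table class="table">
  <tr class="team">
    <td class="name">Boston Bruins</td>
    <td class="year">(some year)</td>
  </tr>
</table>
"""


class CellFinder(HTMLParser):
    """Collect the text of every <td> whose class attribute equals `wanted_class`."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.inside_match = False
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [("class", "name")]
        if tag == "td" and dict(attrs).get("class") == self.wanted_class:
            self.inside_match = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.inside_match = False

    def handle_data(self, data):
        if self.inside_match and data.strip():
            self.found.append(data.strip())


finder = CellFinder("name")
finder.feed(ROW_HTML)
print(finder.found)
```

The tag name and the class are exactly the pieces of information the inspector shows when you click on an element.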
There are other things that we didn't really look at in our simple illustration as well. Let's
come right over here. There are things like hrefs. These are hyperlinks: this is just regular
text, but inside of it is this hyperlink where, if we clicked on it, it would take us to another
website, and typically that's denoted by this href right here. Then you'll typically see things
like a <p> tag, which stands for a paragraph. Now the last thing that I want to show you while
we're here, and we're going to learn a lot more in the next several lessons, is that if we come
right down here there is
this actual entire table here. Let's try to find this table. I'm having trouble selecting the
entire thing, but let's select this team name, and if we look at this team name you can see that
what's encapsulating it is this <table> tag. These are super helpful because a table tag takes in
the entire table. If we collapse this and look just at this, it says class "table", and then we
have the end of this table tag. When we open it, it's going to have all of this information. As
you can see, as I'm highlighting over it, we have these <th> tags for the headers, these <td> tags
for the individual data, and these <tr> tags for the rows. This is something that we'll look at
when we're actually scraping all of the data from this table in a future lesson. So this is how we
can use HTML, how we can inspect the web page and see exactly what's going on under the hood, and
in future lessons we'll see how we can use this HTML to specify exactly what data we want to pull
out. Thank you guys so much for watching. If you like this video, be sure to like and subscribe
below. I will see you in the next lesson. [Music]
Hello everybody, in this lesson
we're going to be taking a look at Beautiful Soup and requests. These packages in Python are
really useful; these are the two main ones that I used when I was first starting out with web
scraping, and they can get a lot of what you want done in order to get that information out. Of
course there are other packages that you can use that may be a little bit more advanced, but
again, this is just the beginner series. In a future series we'll look at other packages that have
some more advanced functionality. So what we're going to be doing is importing these packages,
then getting all of the HTML from our website and making sure that it's in a usable state. Then in
the next lesson we're going to query around in the HTML and pick and choose exactly what we want;
we'll look at things like tags, variable strings, classes, attributes, and more.
So let's get started by importing our packages. What we're going to say is from bs4, this is the
module that we're taking it from, import BeautifulSoup. Then we're going to come down and say
import requests. Now let's go ahead and run this. I'm going to hit Shift+Enter, and it works well
for me. If this does not work for you, you may need to actually install bs4, so you may have to go
to your terminal window and say pip install beautifulsoup4, which is the package that provides
bs4. I'll just let you Google how to do that if you need to, because it's pretty easy. But if
you're using Jupyter notebooks through Anaconda, like how we set it up at the beginning of this
Python series, then you should be totally fine; it should be there for you.
the next thing that we need to do is specify where we're taking this HTML
specify where we're taking this HTML from so what we need to actually do is
from so what we need to actually do is come right over here to our web page and
come right over here to our web page and we need to get the URL so we're going to
we need to get the URL so we're going to go here we're going to copy this URL and
go here we're going to copy this URL and I'm just going to put it right here for
I'm just going to put it right here for a second and what we're going to do is
a second and what we're going to do is we're going to be using this URL quite a
we're going to be using this URL quite a bit so we just want to assign it to a
bit so we just want to assign it to a variable so just say URL is equal to and
variable so just say URL is equal to and then we'll put it right in here now we
then we'll put it right in here now we can get rid of that so now this is our
can get rid of that so now this is our URL going forward this is where we're
URL going forward this is where we're going to be pulling data from let's go
going to be pulling data from let's go ahead and run this now we're going to
ahead and run this now we're going to use requests and what we're going to do
use requests and what we're going to do is we're going to say
is we're going to say requests.get and then we're going to put
requests.get and then we're going to put in url now this get function is going to
in url now this get function is going to use the request Library it's going to
use the request Library it's going to send a get request to that URL and it's
send a get request to that URL and it's going to return a response object let's
going to return a response object let's go ahead and run
go ahead and run this as you can see here I got a
this as you can see here I got a response of 200 if you got something
response of 200 if you got something like a 204 or a 400 or 401 or 404 all
like a 204 or a 400 or 401 or 404 all these things are potentially bad
these things are potentially bad something like a 204 would mean there
something like a 204 would mean there was no content in the actual web page
was no content in the actual web page 400 means a bad request so it was
400 means a bad request so it was invalid the server couldn't process it
invalid the server couldn't process it and you don't get any response if you
and you don't get any response if you got a 404 that might be one that you're
got a 404 that might be one that you're familiar with that's an error that means
familiar with that's an error that means the server cannot be found the next
the server cannot be found the next thing that we're going to do is take the
thing that we're going to do is take the HTML now if you remember we come right
HTML now if you remember we come right back here and we inspect this we have
back here and we inspect this we have all this HTML right here now on this web
all this HTML right here now on this web page specifically right now it's
page specifically right now it's completely static it's not a bunch of
completely static it's not a bunch of moving stuff or anything like that
moving stuff or anything like that usually when you're looking at HTML if
usually when you're looking at HTML if you're looking at something like Amazon
you're looking at something like Amazon and those web pages can update but when
and those web pages can update but when you actually pull that into python
you actually pull that into python you're basically getting a snapshot of
you're basically getting a snapshot of the HTML at that time so what we're
the HTML at that time so what we're going to do is bring in all of this HTML
going to do is bring in all of this HTML which is our snapshot of our website and
which is our snapshot of our website and then we can take a look at it so we're
then we can take a look at it so we're going to come right down here and now
going to come right down here and now we're going to say beautiful soup so now
we're going to say beautiful soup so now we'll use the beautiful soup package or
we'll use the beautiful soup package or Library so we need to say beautiful soup
Library so we need to say beautiful soup and we're going do an open parenthesis
and we're going do an open parenthesis we're going to do two things there's two
we're going to do two things there's two parameters that we need to put in here
parameters that we need to put in here first we need to put in this get request
first we need to put in this get request we actually need to name this and we'll
we actually need to name this and we'll call this page we'll say page is equal
call this page we'll say page is equal to and let's run this and now we're
to and let's run this and now we're going to put that page in here and what
going to put that page in here and what we're going to say is do text so the
we're going to say is do text so the page is what's sending that request and
page is what's sending that request and then the text is what's retrieving the
then the text is what's retrieving the actual raw HTML that we're going to be
actual raw HTML that we're going to be using then we're going to put a comma
using then we're going to put a comma here and what we need to specify is how
here and what we need to specify is how we're going to parse this information
we're going to parse this information now this is an HTML so what we're going
now this is an HTML so what we're going to do is HTML just like this this is a
to do is HTML just like this this is a standard this is already built into to
standard this is already built into to this Library so we don't need to go any
this Library so we don't need to go any further but it's basically going to
further but it's basically going to parse the information in an HTML format
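Put together, the steps so far look roughly like the sketch below. A few assumptions to flag: fetch_soup is a hypothetical helper name, the timeout and raise_for_status call are additions not shown in the video, and the parser is written as "html.parser" (the name of Python's built-in parser) where the video passes "html". To keep the example runnable without a network connection, the demonstration at the bottom parses a made-up snapshot string instead of downloading a page.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url):
    """Download `url` and return the parsed HTML as a BeautifulSoup object."""
    page = requests.get(url, timeout=10)  # response object for the GET request
    page.raise_for_status()  # stop early on 4xx/5xx responses
    return BeautifulSoup(page.text, "html.parser")  # page.text is the raw HTML


# Offline demonstration on a hypothetical snapshot, so nothing is downloaded here.
snapshot = "<html><head><title>Demo page</title></head><body><p>Hello</p></body></html>"
soup = BeautifulSoup(snapshot, "html.parser")
print(soup.title.string)
print(soup.p.get_text())
```

With a real page, the equivalent would be soup = fetch_soup(url), using the url variable assigned earlier in the lesson.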
Let's go ahead and run this and see what we get. As you can see, we have a lot of
information, and as we scroll down I'll try to point out some things that we've
already looked at in previous lessons, something like this th tag, which should
look very familiar; that's the title. Then we have these td tags, and of course
if we scroll down even further we'll have things like a tr tag. These are all
things that we looked at in that first lesson when learning about HTML. Now again,
we want to assign this to a variable, so we're going to say soup is equal to this
information right here. I'm not going to go into all the history behind Beautiful
Soup. What I will say is that the guy who created this Beautiful Soup library said
that it takes this really messy HTML, or XML, which you can also use it for, and
makes it into this kind of beautiful soup. I just thought that was kind of funny,
but that's why we're calling it soup right here. We're going to go ahead and run
this, then we'll come right down here, say print(soup), and run it. Now we have
everything in here: we have our html, our head, we have some hrefs and some links
in here. Let's scroll down a little bit more, and then we have our body right
there, and of course we have a bunch of information in here. In the next lesson,
what we're going to be doing is learning how to query all of this to take specific
information out, and basically understand a lot of what's going on in this HTML to
make sure we can actually get what we need.
Now if this looks really kind of messy to you and it just doesn't make a lot of
sense, there is one more thing that I'm going to show you. We'll come right down
here and say soup.prettify(). If you've ever used other programming languages,
prettify is very common in a lot of them, where it just makes things a little
easier to visualize and see. You'll notice that it has this hierarchy built in,
whereas if we scroll up there's no hierarchy built in; it's all just down this
left-hand side. So if you want to view it and just visually see the differences,
this does help a lot, but it doesn't actually help a lot when you're querying it
or using find and find_all, which is what we're going to look at in the next lesson.
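To see the difference between the plain soup and the prettified version, here's a small sketch. The sample HTML is made up so it runs offline, and "html.parser" is my choice of parser rather than something stated in the lesson.

```python
from bs4 import BeautifulSoup

# Made-up HTML, written on one line on purpose
sample_html = "<html><body><div><p>Hello</p></div></body></html>"
soup = BeautifulSoup(sample_html, "html.parser")

print(soup)               # printed as-is: everything stays on one line

pretty = soup.prettify()  # returns a string with one tag per line, indented by depth
print(pretty)
```

prettify only changes how the HTML is displayed; searching with find and find_all still happens on the soup object itself.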
So that is our lesson on Beautiful Soup and requests. In the next two lessons
we're going to be looking at find and find_all, as well as really diving into
things like variable strings, tags, classes, and all those things. Then in the
last lesson we're going to do this mini project where we try to get all the data
from the table on this web page that we've been using and put it into a pandas
DataFrame. So thank you guys so much for watching, I really appreciate it. If you
like this video be sure to like and subscribe below, and I will see you in the
next lesson. [Music]
Hello everybody, in this lesson we're going to be taking a look at find and
find_all. Really, we're going to be looking at a ton of different things in this
lesson; this is where we really start digging in and seeing how we can extract
specific information from our web page. But in order to do that, let's set
everything up and bring in the HTML like we did in the last lesson. We're just
going to write all this out one more time, for practice if nothing else, and then
we'll get into actually getting that information from the HTML. We're going to
start by saying from bs4 import BeautifulSoup, there we go, and import requests,
and we'll go ahead and run this. Then we're going to come up here and grab our
HTML, or sorry, our URL, so we'll say url is equal to, and we'll have that right
here. Now we need to say page is equal to, and then we'll do requests.get and put
in our url right here, and we'll run this. Lastly we need our soup, so we'll say
soup is equal to BeautifulSoup, there we go, and within our parentheses we need to
specify page.text, because we need that, and our parser, which is 'html'. There we
go. Let's go ahead and run this and print it out to make sure it's working. And
there we go, we have our soup right here. All of this should look really similar
to our last lesson, and so now we've brought in the HTML from our page. We have a
lot, a lot, a lot of information in here. Now really quickly, let's come over and
inspect our web page.
Now in here we have a ton of information, right? We have a bunch of different tags
and classes and all these other things, but how do we actually use these? Well,
that's where find and find_all come into play. They're pretty similar, and you'll
see that in just a little bit. Let's say we want to take one of these tags; let's
come down and say we just want to take this div tag. There are going to be a lot
of different div tags in our HTML, but let's just come right here, go down, and
call soup. We're going to say soup, that's all of our information, and then .find.
Within our parentheses we can specify a lot of different things, but we're going
to keep it really simple right now and just say 'div'. Let's go ahead and run
this. What this is going to bring up is the very first div tag in our HTML, and
that's going to be this information right here. Now let's copy this and do the
exact same thing, except we're going to say find_all (find, underscore, all).
Let's run this, and now we have a ton more information. Really, all find and
find_all do is find the information. find is only going to find the first result
in our HTML; that's the div with class container. Let's go back up to the top:
that's our div class container. But find_all is going to find all of them and put
them in a list for you. So it's going to have this first one, which goes down to
this </div> right here, and then we have a comma which separates our next div tag.
So that is how we can use it.
to specify one of these div tags we pulled in a ton of them but we want to
pulled in a ton of them but we want to just look for one of them well this is
just look for one of them well this is something where the class comes in handy
something where the class comes in handy because right now we have class is equal
because right now we have class is equal to container classes equal to co
to container classes equal to co md-12 I don't know what these are at the
md-12 I don't know what these are at the off the top of my head but um usually
off the top of my head but um usually they'll be somewhat unique and we can
they'll be somewhat unique and we can use these to help us specify what we're
use these to help us specify what we're looking for for example just kind of
looking for for example just kind of glancing of this we could also use this
glancing of this we could also use this a tag if we wanted to look at this so we
a tag if we wanted to look at this so we could say oh we're looking for uh these
could say oh we're looking for uh these hrefs so we have an hre here and this
hrefs so we have an hre here and this right down here we have this hre as well
right down here we have this hre as well which again uh if you remember from
which again uh if you remember from previous lesson that stands for a
previous lesson that stands for a hyperlink now something like the class
hyperlink now something like the class or the href um or these IDs these are
or the href um or these IDs these are all attributes so we can specify or kind
all attributes so we can specify or kind of filter Down based off of these now
of filter Down based off of these now let's try it so what we can do is we can
let's try it so what we can do is we can do class first and this is kind of the
do class first and this is kind of the default uh within something like find
default uh within something like find all is you can even do class underscore
all is you can even do class underscore we can come right back up we have this
we can come right back up we have this div and then here's our class so again
div and then here's our class so again we have to have the div and the class if
we have to have the div and the class if we took this a tag this is an a tag
we took this a tag this is an a tag which would go right here with the class
which would go right here with the class of something like navlink or something
of something like navlink or something like navlink again down here we need to
like navlink again down here we need to specify that more but we have our div so
specify that more but we have our div so we'll say CL Cole
we'll say CL Cole md12 right here and let's go ahead and
md12 right here and let's go ahead and run this and now it's going to pull in
run this and now it's going to pull in just that information now we're still
just that information now we're still getting a list because we have multiple
getting a list because we have multiple of these so this div class uh Cole md-12
of these so this div class uh Cole md-12 doesn't just happen once if we scroll
doesn't just happen once if we scroll down we'll see it multiple times
down we'll see it multiple times something like right here uh or actually
something like right here uh or actually let me see right here so here's this
let me see right here so here's this comma then here's our next one so we
comma then here's our next one so we have two of these uh div tags with a
have two of these uh div tags with a class of coal- md-12 and in each of
class of coal- md-12 and in each of these we have different information this
these we have different information this looks like a paragraph with this P tag
looks like a paragraph with this P tag right here and let's scroll back up uh
right here and let's scroll back up uh so I also think we should try out doing
so I also think we should try out doing something like this P tag typically
something like this P tag typically these P tags stand for paragraphs or
these P tags stand for paragraphs or they have text information in them let's
they have text information in them let's try to P tag really quickly and let's
try to P tag really quickly and let's just see what we get and let's run this
just see what we get and let's run this and it looks like we get multiple P tags
and it looks like we get multiple P tags now if we come back here you can see
now if we come back here you can see that there's this information and it's
that there's this information and it's this information that we're pulling in
this information that we're pulling in and I'm just you know noticing that from
and I'm just you know noticing that from right here and then we have this
right here and then we have this information right here and it looks like
information right here and it looks like there's one more which is this href
there's one more which is this href which looks like this open source so
which looks like this open source so data via and then that uh hyperlink or
data via and then that uh hyperlink or that link right there so we have three
that link right there so we have three different P tags now just to verify and
different P tags now just to verify and make sure that that's correct what we
make sure that that's correct what we could do is come over here we're going
could do is come over here we're going to click on this paragraph it's going to
to click on this paragraph it's going to take us to that P tag where the class is
take us to that P tag where the class is equal to lead let's come over here and
equal to lead let's come over here and look at this paragraph now we have
look at this paragraph now we have another P tag right over here with the
another P tag right over here with the class is equal to glyphicon glyphicon
class is equal to glyphicon glyphicon education I have no idea what that means
education I have no idea what that means um and then we'll go to our last one
um and then we'll go to our last one which is right here where the P tag is
which is right here where the P tag is equal to uh we have a tag HRA class uh
equal to uh we have a tag HRA class uh and a bunch of other information so
and a bunch of other information so let's say we just wanted to pull in this
let's say we just wanted to pull in this paragraph right here let's go here and
paragraph right here let's go here and see how we can specify this information
see how we can specify this information so it looks like P or the class is equal
so it looks like P or the class is equal to lead that looks like it's going to be
to lead that looks like it's going to be unique to just that one so if we come
unique to just that one so if we come down here we're going to say comma and
down here we're going to say comma and it was class so you can do uh class
it was class so you can do uh class underscore is equal to and then we're
underscore is equal to and then we're going to say lead let's try running this
going to say lead let's try running this and we're just pulling in that
and we're just pulling in that information now let's say we actually
information now let's say we actually want to pull in this paragraph We
want to pull in this paragraph We actually want this text right here and
actually want this text right here and this is a very real use case you know
this is a very real use case you know let's say I'm trying to pull in some
let's say I'm trying to pull in some information or or a paragraph of text
information or or a paragraph of text well let's copy this and what we're
well let's copy this and what we're going to then do is say. text and let's
going to then do is say. text and let's run this now we're going to get an error
run this now we're going to get an error right here and this is a very common
right here and this is a very common error because we're trying to use find
error because we're trying to use find all unfortunately find all does not have
all unfortunately find all does not have a text attribute we actually need to
a text attribute we actually need to change this to find typically when I'm
change this to find typically when I'm working with these find and find alls
working with these find and find alls I'm using findall most of the time until
I'm using findall most of the time until I want to start extracting text then
I want to start extracting text then when I specify it I'll change this back
when I specify it I'll change this back to find just like this now let's try
to find just like this now let's try this and now we're getting in
this and now we're getting in parentheses this information now this is
parentheses this information now this is all wonky it needs to definitely be
all wonky it needs to definitely be cleaned up a little bit but if we code
cleaned up a little bit but if we code back up it's no longer in a list and we
back up it's no longer in a list and we no longer have things like these P tags
no longer have things like these P tags in here or this class attribute so we're
in here or this class attribute so we're really just trying to pull out this
really just trying to pull out this information now again this does not look
information now again this does not look perfect we could even try to do
perfect we could even try to do something like do strip look like
something like do strip look like there's some white space uh that cleans
there's some white space uh that cleans it up a little bit this definitely looks
it up a little bit this definitely looks a little better um and we could
a little better um and we could definitely go in here and clean this up
definitely go in here and clean this up more but just for you know an example
more but just for you know an example this is how we can then extract that
this is how we can then extract that information now let's look at one more
information now let's look at one more example this is some information and
example this is some information and this is what we're going to do kind of
this is what we're going to do kind of our little mini project in the next
our little mini project in the next lesson on let's say we wanted to take
lesson on let's say we wanted to take all this information what if we wanted
all this information what if we wanted to pull in something like the team name
to pull in something like the team name that's going to be in right here in this
that's going to be in right here in this TR tag and each of these TR tags have th
TR tag and each of these TR tags have th tags underneath them so if we scroll
tags underneath them so if we scroll down you'll notice that each row is this
down you'll notice that each row is this TR tag so let's go ahead and search for
TR tag so let's go ahead and search for let's do th let's just search for that
let's do th let's just search for that first so let's come right back up here
first so let's come right back up here let's use this find
let's use this find all and we'll get rid of this text for
all and we'll get rid of this text for right now and let's just say we want to
right now and let's just say we want to look for the TR is that what we said we
look for the TR is that what we said we were looking for no th so let's say
were looking for no th so let's say we're looking for th let's go ahead and
we're looking for th let's go ahead and run this so we're going to have
run this so we're going to have underneath this th we have team name
underneath this th we have team name year wins losses and notice these are
year wins losses and notice these are all the titles so these titles are the
all the titles so these titles are the only ones with these th tags if we go
only ones with these th tags if we go down you'll notice that the data is
down you'll notice that the data is actually TD tags so now let's go back
actually TD tags so now let's go back and look for TD we'll say
and look for TD we'll say D and this is going to be a lot longer
D and this is going to be a lot longer we have a lot of information but these
we have a lot of information but these are all the rows of data let's see if we
are all the rows of data let's see if we can just get one piece of this data
can just get one piece of this data we're going to get back we want just
we're going to get back we want just this team name that's all we're trying
this team name that's all we're trying to pull in for now um and then we'll try
to pull in for now um and then we'll try to get this row and then in the next
to get this row and then in the next lesson we're going to try to get all of
lesson we're going to try to get all of this information make it look really
this information make it look really nice and then we'll put it into a
nice and then we'll put it into a panda's data frame so let's just get
panda's data frame so let's just get this team name right now let's go ahead
this team name right now let's go ahead we're going to say th
we're going to say th let's run this and we have this th and
let's run this and we have this th and now that we know we're getting this
now that we know we're getting this information in we can
information in we can do find let's run this so there's our
do find let's run this so there's our team name we're just going to say. text
team name we're just going to say. text and again we can do do strip just like
and again we can do do strip just like that and Bam we have our team name so
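Here's a small sketch of the th and td steps on a tiny table. The column titles follow the ones read out in the lesson, but the data row is invented, and "html.parser" is an assumed parser choice.

```python
from bs4 import BeautifulSoup

# Tiny made-up table: th cells are the titles, td cells are the data
sample_html = """
<table class="table">
  <tr><th> Team Name </th><th> Year </th><th> Wins </th><th> Losses </th></tr>
  <tr><td> Example Team </td><td> 2000 </td><td> 10 </td><td> 5 </td></tr>
</table>
"""
soup = BeautifulSoup(sample_html, "html.parser")

headers = [th.text.strip() for th in soup.find_all("th")]  # all the titles
first_header = soup.find("th").text.strip()                # just the first title
cells = [td.text.strip() for td in soup.find_all("td")]    # all the data cells

print(headers)
print(first_header)
print(cells)
```

On the real page find_all('td') returns the cells of every row in one long flat list, which is why the output gets so much longer.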
So you can kind of start getting the idea of how we're pulling this information
out. We're really just specifying exactly what we're seeing in this HTML. What's
really, really helpful, and something that I do all the time, is inspecting it.
I'm just searching: what piece of information do I want? Then I go ahead and click
on it, and I'm looking at where this is sitting in the hierarchy. It's within the
body, it's within this table with the class of table, then it's down here in this
tr tag and then this td tag. So I'm looking at the hierarchy and specifying
exactly what I'm looking for.
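One way to follow that hierarchy in code is to search inside the result of an earlier search: table first, then rows, then cells. This is only a sketch under assumptions; the table contents are invented, "html.parser" is my parser choice, and the lesson itself doesn't chain the calls this way.

```python
from bs4 import BeautifulSoup

# Invented page: body > table.table > tr > th/td
sample_html = """
<body>
  <table class="table">
    <tr><th>Team Name</th><th>Year</th></tr>
    <tr><td>Example Team A</td><td>2000</td></tr>
    <tr><td>Example Team B</td><td>2001</td></tr>
  </table>
</body>
"""
soup = BeautifulSoup(sample_html, "html.parser")

table = soup.find("table", class_="table")  # step 1: the table with class "table"

rows = []
for tr in table.find_all("tr"):             # step 2: every row inside that table
    cells = [td.text.strip() for td in tr.find_all("td")]  # step 3: cells in the row
    if cells:                               # the title row has th cells only, so skip it
        rows.append(cells)

print(rows)
```

Searching from table instead of soup keeps any other tables on the page out of the results.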
So that is what we looked at in today's lesson; that's how we can use find and
find_all. We were able to look at classes, tags, attributes, and variable strings,
which is this right here, getting that text, and we looked at find and find_all,
how they pull that information in, and how we can specify exactly what we're
looking for. Now in the next lesson, which is definitely going to be the most
exciting one, we're going to try to pull in all of this information, every single
thing, because we'll be able to put it all into a DataFrame, and then we can use
pandas to really search and manipulate that data within the DataFrame. So with
that being said, that is the end of this lesson. If you like this video be sure to
like and subscribe, and I will see you in the next lesson. [Music]
Hello everybody, in this lesson we are going to be scraping data from a real
website and putting it into a pandas DataFrame, and maybe even exporting it to CSV
if we're feeling a bit spicy. In the last several lessons we've been looking at
this page right here, and I even promised that we were going to be pulling this
data. But as I was building out the project, I honestly thought it was a little
bit too easy, since in the last lesson we already pulled some information out of
this table, and I want to kind of throw you guys off. So we're going to be pulling
from a different table: we're going to go on to Wikipedia and look at the list of
the largest companies in the United States by revenue, and we're going to be
pulling all of this information. So if you thought this was going to be an easy
little mini project, it's now a full project, because why not. Let's get started.
What we're going to do is import BeautifulSoup and requests, get this information,
and see how we can do this. It's going to get a little bit more complicated and a
little bit more tricky; we're going to have to format things properly to get it
into our pandas DataFrame, to make it look good and make it more usable. So let's
go ahead and get rid of this easy table; we don't want that one. We're going to
come in here and just start off. This should look really familiar by now: we're
going to say from bs4 import BeautifulSoup. I don't know if you've noticed, but
I've messed up spelling BeautifulSoup in every single video; I've noticed. Let's
run this, and now we need to go ahead and get our URL. Let's come up here and get
our URL, say url is equal to, and we'll just keep it all in the same cell really
quickly, because we know this by heart by now, right? We'll say requests.get and
then url, to make sure that we're getting that information. It gives us a response
object; hopefully it'll be 200, which means a good response.
mean a good response and then we'll say soup is equal to and then we'll say
soup is equal to and then we'll say beautiful soup and we'll do our page.
beautiful soup and we'll do our page. text now we're pulling in the
text now we're pulling in the information from this URL and then we
information from this URL and then we use our parser which will be oops HTML
use our parser which will be oops HTML and let's go ahead and run this looks
and let's go ahead and run this looks like everything went well let's print
like everything went well let's print our soup now this is completely new to
our soup now this is completely new to you it's completely new to me I don't
you it's completely new to me I don't know what I'm doing uh but it looks like
know what I'm doing uh but it looks like we're pulling in the information am I
we're pulling in the information am I right so we got a lot of things going
right so we got a lot of things going for us uh the uh stuff was imported
for us uh the uh stuff was imported properly we got our URL we got our soup
properly we got our URL we got our soup which is uh not beautiful in my opinion
which is uh not beautiful in my opinion but let's keep on rolling let's come
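The setup described above can be sketched roughly as follows. The helper name fetch_soup, the timeout, and the built-in "html.parser" are choices made for this sketch (the video passes 'html' as the parser), and the URL is left as a parameter because the address is not spelled out in the transcript. So that the example runs without a network connection, the parsing step is shown on a tiny made-up HTML string.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url: str) -> BeautifulSoup:
    """Download a page and parse it (hypothetical helper, not from the video)."""
    page = requests.get(url, timeout=30)
    # raise_for_status() raises an exception for 4xx/5xx error responses;
    # a successful request normally reports status code 200.
    page.raise_for_status()
    return BeautifulSoup(page.text, "html.parser")


# Offline stand-in for page.text, so the parsing step can be tried without a request.
sample_html = (
    "<html><body><h1>Largest companies</h1>"
    "<table><tr><th>Rank</th></tr></table></body></html>"
)
soup = BeautifulSoup(sample_html, "html.parser")
print(soup.find("h1").text)
```

Calling fetch_soup with the page address would give the same kind of soup object for the real page.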
Let's come right down here. Now what we need to do is specify what data we're looking for, so let's come and inspect this web page. The only information that we're going to want is right in here. We're going to want these titles, or these headers, so Rank, Name, Industry, etc., and then we are for sure going to want all of this information. Let's just scroll down and see if there's anything tricky in here. All right, that looks pretty good. And there is another table, so there's not just one table in here; there are two tables on this page, so that might change things for us.
But let's come right back and inspect our page by using this little button right here. Let's see if I can highlight just this part of the page. Oh, it's not going... let's do that right there. So now we have this "wikitable sortable" class. I'm going to come right here and say copy the outer HTML, and just paste it in here real quick. That's a ton of information; I didn't think it was going to copy all of it, so we're just going to delete that. I just wanted to keep that class, because I wanted to then come right down here at the bottom and see what this other table looks like. I don't know if it's part of the first one or if it's its own table; I can't tell. Let's look at this Rank and come up. It's under this table tag, and it looks like it's its own table, but it says "wikitable sortable jquery-tablesorter", the same "wikitable sortable jquery-tablesorter" as the first one. So it looks like there are two tables with the same class, which shouldn't be a problem if we're using find to get our data, because we should be taking the first one, which will be this table, and this is the table we want. If we wanted the other one, we could just use find_all, and since it returns a list, we could use indexing to pull that table, right? But I think we're going to be okay with just pulling in this one.
So let's go ahead and do our find. We'll do soup.find, and we could do find_all or we could just do find, on 'table'. Let's just try this and see what we get, and if it pulls in the one that we're looking for, that'd be great. Now, this does not look correct at all. I don't know what table it's pulling in. Oh, maybe it's this right here; this might be a table. Yeah, it is. So we have this "more citations" box, and it comes first. So actually we are going to have to do exactly what I was talking about.
Let's pull this. Well, we could do comma, class_ right here... let's do both. You know what, this is a learning opportunity; let's do both. So let me go back up to the top, because I need these, and let's come right down here. I want to add in another cell; actually, I'll just push this one up. There we go. So we're going to say find_all. Let's run this. So now we have multiple tables, and again we got that weird one first, but if we scroll down, here's our comma, and then here's our "wikitable sortable", and then we have Rank, Name, Industry, all the ones that we were hoping to see. And I guarantee you, if you scroll all the way to the bottom, we're going to see, potentially, Wells Fargo and Goldman Sachs. Let's see... yeah, here we go: Ford Motor, Wells Fargo, Goldman Sachs. That's this table right here, so now we're looking at the third table. But again, this is a list, so we can use indexing on it. We won't choose position zero, because that's this one right here, which we did not like. We'll take position one. Let's run this and go back up to the top, and this is our table right here: Rank, Name, Industry. Just to confirm on the page: Rank, Name, Industry, etc. So this is the information we want, and we're able to specify that with our find_all.
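The difference between find and find_all described above can be tried on a small example. The HTML below is a made-up miniature of the page layout (a notice box followed by two data tables), and the class name "notice-box" is invented for illustration; only the find, find_all, and indexing calls mirror what is done in the video.

```python
from bs4 import BeautifulSoup

# Made-up miniature of the page: a notice box first, then two data tables.
sample_html = """
<table class="notice-box"><tr><td>This article needs more citations.</td></tr></table>
<table class="wikitable sortable"><tr><th>Rank</th><th>Name</th><th>Industry</th></tr></table>
<table class="wikitable sortable"><tr><th>Rank</th><th>Name</th><th>Profits</th></tr></table>
"""
soup = BeautifulSoup(sample_html, "html.parser")

first = soup.find("table")           # only the first match: the notice box
all_tables = soup.find_all("table")  # every match, in page order, as a list-like ResultSet
table = all_tables[1]                # position 1 is the table we actually want

print(first["class"])
print(len(all_tables))
print([th.text for th in table.find_all("th")])
```

Indexing by position is simple, but it depends on the page keeping its tables in the same order.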
We now want to make this the only information that we're looking at, so I'm just going to copy this. We didn't need to use our class for this one. You probably could have, so let's actually put this right down here; this will be our table. But first I'll come right here and say soup.find, just for demonstration purposes. We'll do 'table', comma, class_ is equal to, and then we'll paste in this class right here, and let's see if we get the correct output. Let's run this, and it looks like we're getting a NoneType object. If I remember right, it looks like the actual class is this right here, "wikitable sortable", so let's run this instead, and I've got to get rid of the index. There we go. Okay, so we were able to pull it in just using find: find, 'table', class_, and it says "wikitable sortable". At least that's the HTML that we're pulling in right here.
Let me go back, because I don't know if that's what I was seeing earlier. Let's find this Rank again and go back up. Where's the Rank? There we go. So here's our Rank, and let's go up to the table, and there's our class. Yeah, and to me that's a little bit odd. It says "wikitable sortable jquery-tablesorter" right here, but in our actual Python script it was only pulling in "wikitable sortable"; it wasn't pulling in the "jquery-tablesorter" part. Why? I'm not 100% sure, but these are all things that we're working through, and we were able to figure it out.
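A likely explanation for the mismatch, offered here as an editorial note rather than something stated in the video: requests downloads the HTML as the server sends it and does not run JavaScript, while the browser inspector shows the page after its scripts have run, and the "jquery-tablesorter" class is most likely added by one of those scripts. The sketch below uses made-up HTML to show how class_ matching behaves in that situation.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for what the server sends, before any JavaScript runs.
served_html = '<table class="wikitable sortable"><tr><th>Rank</th></tr></table>'
soup = BeautifulSoup(served_html, "html.parser")

# Class string as seen in the browser inspector: nothing in the served HTML matches it.
from_browser = soup.find("table", class_="wikitable sortable jquery-tablesorter")
# Class string exactly as it appears in the downloaded HTML: matches.
from_source = soup.find("table", class_="wikitable sortable")
# A single class name matches any table that carries that class.
by_one_class = soup.find("table", class_="wikitable")

print(from_browser)
print(from_source is not None, by_one_class is not None)
```

Because find returns None when nothing matches, indexing or calling methods on that result is what produces the NoneType error.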
So we're going to make this our table. We're going to say table is equal to soup.find_all('table')[1], and let's run this. If we print out our table, we have this table, and now this is the only data that we're looking at. The first thing that I want to get is these titles, or these headers, right here. Let's go in here and look at this information. You can see that these are within th tags, and we can pull out those th tags really easily. Let's come right down here; we're just going to say 'th' in our find_all, and we can get rid of this. Let's run this. Now, these are our only th tags, because everything else is in a tr tag for the rows of data. So these th tags are pretty unique, which makes it really easy, which is really great, because then we can just say world_titles is equal to that.
So now we have these titles, but they're not perfect, so what we're going to do is loop through them. world_titles, and I'll kind of walk through what I'm talking about, is a list, and each item is within these th tags: th, and then there's the string that we're trying to get. So we can easily take this list and use a list comprehension, and we can do that right down here. I'm going to keep this where we can see it. We'll say world_table_titles is equal to, and now we'll do our list comprehension. It should be super easy: we'll just say for title in world_titles, and then what do we want? We want title.text. That's it, because we're just taking the text from each of these. We're looping through and getting Rank, then looping through and getting Name, then Industry. That's it. So let's print our world_table_titles and see if it worked. And it did. This looks like it needs to be cleaned up just a little bit, so let's go ahead and do that while we're here, before we actually put it into the pandas DataFrame. What we're going to do is try to get rid of those \n characters. If we do .strip() on the whole thing, that doesn't work, because this is a list. What we can do is title.text.strip() right here, inside the comprehension. Let's try it in there. There we go. So now our world_table_titles is good to go.
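As a small illustration of the header clean-up, the sketch below builds the same kind of list comprehension on a made-up table whose header cells end in a newline, the way the scraped ones do. It assumes the search is run on the table element, and the variable names follow the ones spoken in the video.

```python
from bs4 import BeautifulSoup

# Made-up table; each header cell ends with a newline, as in the scraped page.
sample_html = """
<table>
<tr><th>Rank
</th><th>Name
</th><th>Industry
</th></tr>
</table>
"""
table = BeautifulSoup(sample_html, "html.parser").find("table")

world_titles = table.find_all("th")
raw_titles = [title.text for title in world_titles]
# strip() is a string method, so it is applied to each title, not to the list.
world_table_titles = [title.text.strip() for title in world_titles]

print(raw_titles)  # each entry still ends with "\n"
print(world_table_titles)
```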
Now, I'm actually noticing one thing that may be odd. On the page we have Rank, Name, Industry, and it goes to Headquarters, but in here we're getting Rank, Name, Industry, and then Profits, which is from this other table right here, which we don't want. Let's scroll back up and backtrack to see where this happened. We did find_all on 'table', and we're looking at the first wikitable, right? And then we print the table... ah, okay, I think I found the issue. Let's backtrack again; we're working through this together, and we're going to make mistakes. The table is what we actually wanted to search. We just did soup.find_all('th'), which is going to pull in that secondary table. Jeez, we were not thinking here. We need to do find_all on the table, not the soup, because on the soup we were looking at all of the tables. Oh, what a rookie mistake. Okay, let's go back. Now it's just down to Headquarters. Let's go ahead and run this, and run this; now we just have the columns through Headquarters. Let's run this. Now we are sitting pretty.
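The mistake above comes down to which object find_all is called on. A minimal sketch with made-up HTML (the "notice-box" class and the shortened header lists are invented for illustration):

```python
from bs4 import BeautifulSoup

# Made-up page: a notice box, the table we want, and a second table we do not want.
sample_html = """
<table class="notice-box"><tr><td>notice</td></tr></table>
<table class="wikitable sortable"><tr><th>Rank</th><th>Name</th><th>Headquarters</th></tr></table>
<table class="wikitable sortable"><tr><th>Rank</th><th>Name</th><th>Profits</th></tr></table>
"""
soup = BeautifulSoup(sample_html, "html.parser")
table = soup.find_all("table")[1]

# Searching the whole soup returns the headers of every table on the page.
from_soup = [th.text.strip() for th in soup.find_all("th")]
# Searching the table element returns only that table's headers.
from_table = [th.text.strip() for th in table.find_all("th")]

print(from_soup)
print(from_table)
```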
Excuse my mistakes. Hey, listen, if it happens to me, it happens to you, I promise you. This is a little project we're creating here, so we're going to run into issues, and that's okay. We're figuring it out as we go. Now, what I want to do before we start pulling in all the data is put this into our pandas DataFrame. We'll have the headers there for us, so we won't have to add them later, and it just makes things easier in general, trust me. So we're going to import pandas as pd. Let's go ahead and run this. Now we're going to create our DataFrame. We have these world_table_titles, so what we're going to do is pd.DataFrame, and then in here, for our columns, we'll say columns is equal to world_table_titles. Let's go ahead and say that's our df, and call df right here. Let's run it. There we go. So we were able to extract those headers, the titles of these columns, and put them into our DataFrame. We're set up and ready to go; we're rocking and rolling.
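Creating the empty DataFrame from the header list looks roughly like this. The header list here is a shortened, illustrative stand-in for the scraped titles.

```python
import pandas as pd

# Shortened, illustrative stand-in for the scraped header list.
world_table_titles = ["Rank", "Name", "Industry"]

# An empty DataFrame that already has its column names but no rows yet.
df = pd.DataFrame(columns=world_table_titles)

print(df.columns.tolist())
print(len(df))
```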
The next thing we need, let's go back up, is to start pulling in this data right here, so we have to see how we can pull it in. If you remember, we had those th tags, and those were our titles, as you can see when I highlight over them. Down here we have these td tags, and those are all encapsulated within a tr tag. So the tr represents the row, right? Then the td represents the data within those rows: r for rows, d for data. Let's see how we can use that to get the information that we want. Let's go back up here. I'm just going to take this, because again we're only pulling from table, not soup. Not soup; what were we thinking? Let's go ahead and look at 'tr'. Let's run this. Now, when we do this with tr, the results do come in with the headers, so later on we're going to have to get rid of those; we don't want to pull them in and have them as part of our data. But if we scroll down, there's our Walmart, and we have the location, all within these td tags. Then, of course, it's separated by a comma, and we have our td for 2; above we had our td for 1. So row one, row two, row three, all the way down. We'll easily be able to use this, right? This is our column data, and we can even call it that: column_data is equal to table.find_all('tr'). We'll run that.
What we're going to do is loop through that, because it was all in a list. We're going to loop through that information, but instead of looking at the tr tag, we're going to look at the td tag. So let's come right down here. We'll say for row in column_data, and we'll do a colon. Now, inside the loop, we'll do something like row.find_all, and what are we looking for? We're not looking for the tr; we're looking for the td. Just for now, let's print this off and see what it looks like. Apparently I didn't run this column_data cell; that's why it didn't work. Let's run this.
What we actually need to do is something almost exactly like this list comprehension, and I'm going to put it right below instead of printing this off. Again, this is all in a list; we're using find_all, so we're printing off another list, which isn't actually super helpful. For all this data that we're pulling in, what we can do is call this row_data, and then we'll put row_data in here. So we'll say for data in row_data, and we'll take data.text.strip(). And instead of world_table_titles, we can change this name to individual_row_data, right? Now let's print off individual_row_data. It's the exact same process that we were doing up here, and that's how we cleaned up the titles. We may not need the strip, but let's just run this and see what we get. There we go. I'm sure strip was helpful; let's actually get rid of it and check. Yeah, strip was helpful; it's the exact same thing that happened on the last one, so let's keep it. Let's run this. Now let's just glance at this information and look through it. This looks exactly like the information that's in the table. Let's just confirm with this first one: 572,754, 2.4, and so on, the same values that are on the page. So this looks exactly correct.
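The row loop described above can be sketched like this on a made-up two-row table. The company names are placeholders, and the rows list exists only so the result can be inspected afterwards; it is not part of the video's code.

```python
from bs4 import BeautifulSoup

# Made-up table: one header row, then two data rows whose cells end in a newline.
sample_html = """
<table>
<tr><th>Rank</th><th>Name</th></tr>
<tr><td>1
</td><td>Company A
</td></tr>
<tr><td>2
</td><td>Company B
</td></tr>
</table>
"""
table = BeautifulSoup(sample_html, "html.parser").find("table")

column_data = table.find_all("tr")  # one entry per row, header row included
rows = []                           # kept only so the output can be inspected
for row in column_data:
    row_data = row.find_all("td")   # the cells of this row
    individual_row_data = [data.text.strip() for data in row_data]
    rows.append(individual_row_data)
    print(individual_row_data)
```

The header row contains th cells rather than td cells, so it comes out as an empty list, which is why it has to be dealt with before loading the data.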
Now we have to figure out a way to get this into our table, because again, these are all individual lists. It's not like we're putting all of this in at one time; we can't just take the entire table and plop it into the DataFrame. We need a way to put this in one row at a time. If you're just here for web scraping and you haven't taken my pandas series, that's totally fine; that's not what we're here for anyway. What we can do is take our individual_row_data and put it in one at a time. The reason we have to do that is this: when we had it like this, let's go back, it was printing out all of it, but what it's really doing is printing the rows off one at a time, and it's only going to save the current row of data. After the loop, it will only have saved this last one. So what we actually want is that every time it loops through, we append this information onto the DataFrame. Eventually it's going to end up with this last one, but as it goes through, let's run this, it puts this one in, then the next time it loops through it puts this one in, and so on, all the way down. So let's see how we can do this.
We have our DataFrame right here. Let's get rid of this and bring our DataFrame in. Again, like I just mentioned, if you don't know pandas, go take my series on that; it's really good, and we do something very similar to this in that series, so I'm not going to walk through the entire logic. But there is something called loc, which stands for location, that you use when you're looking at the index of a DataFrame, and we're going to use that to our advantage. We're going to take the length of the DataFrame, so we're looking at how many rows are in it, and say that's our length. Then we're going to use that length when we're actually putting in the new information. Pretty cool. So we're going to say df.loc, then a bracket, and we're putting in that length. We're checking the length of our DataFrame each time it loops through, and then we're putting the information in the next position. That's exactly what we're doing. So let's go ahead and set that equal to individual_row_data.
the individual row data um so let's just recap We're looping through this TR this
So let's just recap. We're looping through these tr tags. This is our column
data, so each tr is a row of data. As we loop through it, we're doing find_all
and looking for td tags. That's our individual data, so that's our row data.
Then we're taking each piece of data, getting out the text, and stripping it to
clean it, and now it's in a list for each individual row. Then we're looking at
our current data frame, which has nothing in it right now, looking at the length
of it, and appending each row of this information into the next position. So
let's go ahead and run this.
It's working, it's thinking, and it looks like we got an issue: "cannot set a
row with mismatched columns." Now we're encountering an issue, not one that I
got earlier, but we're going to cancel this out and figure it out together. So
let's print off our individual row data and look at this. This first one is
empty, and I'm almost certain that's the issue. I didn't encounter this when I
wrote this lesson, but I'm almost certain that this is the issue right here. So
let's take the column data but start at position one. Not parentheses, I need
brackets, because this is a list, so it should work. And there we go, now that
first one's gone and we just have the information. I didn't even think about
that a second ago, but I'm glad we're running into it in case you ran into that
issue. Let's go ahead and try this again, and it looked like it worked.
So let's pull our data frame down (I could have just written df), and now this
is looking fantastic. These three dots just mean there's information in there
that it doesn't want to display, but it looks like we have our rank, name,
industry, revenue, revenue growth, employees, and headquarters for every single
one. So this is perfect. This is exactly what I was hoping to get.
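The append-by-length pattern described above can be sketched roughly as follows. This is a minimal illustration, not the exact notebook from the lesson: the names column_titles and scraped_rows and the sample values are placeholders standing in for what BeautifulSoup would return, and the empty first row mimics the header tr that has no td cells.

```python
import pandas as pd

# Placeholder stand-ins for the scraped data (assumed, not from the lesson):
# in the lesson the titles come from the header cells and each row from a <tr>.
column_titles = ["Rank", "Name", "Industry"]
scraped_rows = [
    [],                              # the header <tr> yields no <td> cells
    ["1", "Company A", "Retail"],
    ["2", "Company B", "Energy"],
]

df = pd.DataFrame(columns=column_titles)

# Skip the first (empty) row; assigning it would raise
# "cannot set a row with mismatched columns".
for individual_row_data in scraped_rows[1:]:
    length = len(df)                      # rows currently in the data frame
    df.loc[length] = individual_row_data  # a new index label appends a row

print(df)
```

Because len(df) grows by one on every pass, each row lands at the next free index label.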
Now you can go in and use pandas to manipulate this, change it, and dive into
all the information in there, but we can also export this into a CSV if that's
what you're wanting. We can easily do that by saying df.to_csv, and then within
here we're just going to do r and specify our file path. So let's come down here
to our file path and go to our folder for our output. I'm just going to take
this path. I have it in my OneDrive, Documents, Python web scraping, folder for
output. I already made this, and I'm just going to put it right down here. Now I
do have to specify what we're going to call this. We'll just call this
Companies, and then we have to say .csv. That is very important.
Now if we run this, I already know, just because we have this Rank and this
index here, that we're going to keep this index in the output. Not great, but
let's run it and look at our output. There's our Companies file, and when we
pull this up, as you can see, this is not what we want, because we have this
extra thing right here. If we're automating this, that would get super annoying.
So what we're going to do is go back and say index equals False. Let's go out of
here, and right down here we're going to say comma, index=False, so it's going
to take this index and not export it into the CSV. Now let's go ahead and run
this, pull up our folder one more time, and refresh just to make sure. Should be
good, and now this looks a lot better. So we're able to take all of that
information and put it into a CSV, and it's all there.
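As a small sketch of the difference index=False makes, the snippet below writes the same data frame twice and compares the header lines. The temporary folder and the file names are assumptions used so the example runs anywhere; in the lesson the path is a raw string pointing at the output folder.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"Rank": ["1", "2"], "Name": ["Company A", "Company B"]})

out_dir = tempfile.mkdtemp()  # stand-in for the lesson's output folder
with_index = os.path.join(out_dir, "companies_with_index.csv")
without_index = os.path.join(out_dir, "companies.csv")

df.to_csv(with_index)                  # default: index saved as an unnamed first column
df.to_csv(without_index, index=False)  # index left out of the file

with open(with_index) as f:
    header_with_index = f.readline().strip()
with open(without_index) as f:
    header_without_index = f.readline().strip()

print(header_with_index)     # ,Rank,Name
print(header_without_index)  # Rank,Name
```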
So this is the whole project. If we scroll all the way back up, let's just
glance at what we did. We brought in our libraries and packages, we specified
our URL, we brought in our soup, and then we tried to find our table. That took
a little bit of testing out, but we knew that the table was the second one, so
in position one, and we took that table. We were also able to specify it using
find with the class. Of course we just wanted to work with that table, since
that's all the data we wanted, so we specified that this is our table and worked
with just that table going forward. We encountered some small issues, user
errors on my end, but we were able to get our world titles and put those into
our data frame right here using pandas. Next we went back and got all the row
data and the individual data from those rows and put it into our pandas data
frame. Then we came below and exported this into an actual CSV file. So that is
how we can use web scraping to get data from something like a table and put it
into a pandas data frame. I hope that this lesson was helpful. I know we
encountered some issues. That's on my end and I apologize, but if you run into
those same issues, hopefully that helped. If you liked this, be sure to like and
subscribe below. I appreciate you, I love you, and I will see you in the next
lesson. [Music]
So the first thing that we need to do is import our pandas library, so we're
going to say import pandas. This will import the pandas library, but it's pretty
commonplace to give it an alias, and as a standard when using pandas, people
will say as pd. That's just a quick alias you can use. It's what I always use,
because that's how I learned it, and I want to teach it to you the right way, so
that's how we're going to do it in this video. So let's hit Shift+Enter.
Now that that is imported, we can start reading in our files. Right down here
I'm going to open up my File Explorer, and we have several different types of
files in here: CSV files, text files, JSON files, and an Excel worksheet, which
is a little bit different than a CSV. We're going to import all of those, and
I'm going to show you how to import them as well as some of the things you need
to be aware of when you're importing. So the first thing that we need to say is
pd dot, and let's read in a CSV, because that's a pretty common one. We'll say
read_csv, and this is literally all you have to write in order to call it in.
Now, it's not going to call it in as a string like it would in one of our
previous videos, where we were just using the regular file handling built into
Python. When you're using pandas, it calls it in as a data frame, and I'll talk
about some of the nuances of that.
So let's go down to our File Explorer. We have this countries of the world CSV.
You just need to click on it, right-click, and choose Copy as path, and that's
literally going to copy that file path for us. You don't have to type it out
manually (you can if you'd like), and we're just going to paste it in between
these parentheses. Now if we run it right now, it will not work. I'll do that
for you. It's saying we have this Unicode error. Basically what's happening is
it's reading these backslashes in the path as the start of escape sequences
instead of as plain characters. What we need to do is read this in as raw text,
so we're just going to put an r in front, and now it's going to read this as a
literal string, or a literal value, and not interpret all these backslashes,
which does make a big difference. When we run this, it's going to populate our
very first data frame. So let's go ahead and run it, and now we have this CSV in
here with our country and our region.
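A rough sketch of the raw-string point: the Windows-style path below is a made-up example (not the path from the video), and the CSV contents are placeholder rows written to a temporary file so the read_csv call actually runs.

```python
import os
import tempfile

import pandas as pd

# A Windows-style path written two ways. The plain form "C:\Users\..." would
# not compile, because \U is read as the start of a Unicode escape sequence.
raw_path = r"C:\Users\example\countries of the world.csv"        # raw string
escaped_path = "C:\\Users\\example\\countries of the world.csv"  # doubled backslashes
print(raw_path == escaped_path)

# Runnable demo on a temporary file, so nothing depends on that path existing.
csv_path = os.path.join(tempfile.mkdtemp(), "countries of the world.csv")
with open(csv_path, "w") as f:
    f.write("Country,Region\nCountry A,Region X\nCountry B,Region Y\n")

df = pd.read_csv(csv_path)  # first row becomes the headers; a 0, 1, ... index is added
print(df)
```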
Now if we go and pull up this file, let's do that really quickly, let's bring up
this countries of the world file. It automatically populated those headers for
us in the data frame, but in the file we don't have any column for those 0, 1,
2, 3. If we go back, as you can see right here, there's this index, and that's
really important in a data frame. It's really what makes a data frame a data
frame, and we use the index a lot in pandas. We're able to filter on the index,
search on the index, and a lot of other things, which I'll show you in future
videos. But this is basically how you read in a file.
Now if we go right up here in between these parentheses and hit Shift+Tab, this
is going to come up for us. Let's hit this plus button. These are all the
arguments, all the things that we can specify when we're reading in a file, and
there are a lot of different options. So let's go ahead and take a look really
quickly.
Really quickly, I wanted to give a huge shout-out to the sponsor of this entire
pandas series, and that is Udemy. Udemy has some of the best courses at the best
prices, and it is no exception when it comes to pandas courses. If you want to
master pandas, this is the course that I would recommend. It's going to teach
you just about everything you need to know about pandas. So huge shout-out to
Udemy for sponsoring this pandas series, and let's get back to the video.
The first thing is obviously the file path. We can specify a separator, which is
listed with no default here, so when we're reading in the CSV it's automatically
going to assume it's a comma, because it's a comma-separated file. You can
choose delimiters, headers, names, index columns, and a lot of other things, as
you can see right here. Now I will say that I use almost none of these. The few
that I'm going to show you in just a second are up at the very top, but you can
do a ton of different things, and I'm just going to slowly scroll through them.
So that's what those are. You can also go down here. This is our docstring, and
you can see exactly how these parameters work. It'll give you some text and walk
you through how to use them. Again, most of these you'll probably never use, but
things like a separator could actually be useful, and things like a header could
be useful, because it's possible that you want to rename your headers, or you
don't have a header in your CSV and you don't want it to auto-populate that
header. So that is something that you can specify.
So for example, this header one, I'll show you how to do this. The default
behavior is to infer the column names: if no names are passed, the behavior is
identical to header=0. So it's saying that first row, that first index, which is
like right here, that zero, is going to be read in as a header. But we can come
right over here and do comma, header is equal to None, and as you can see, there
are no headers now. Instead it's another index, so we have indexes on both the
x-axis and the y-axis, and right now we have this 0 and 1 index indicating the
first column and the second column. If we want to specify those names, we can
say header equals None, then names is equal to, and we'll give it a list. The
first one was Country, and what's that second one? Oh, Region. So right here
that's the first row, but we'll rename it and just say Country, Region. When we
run that, we've now populated Country and Region. We're just pretending that our
CSV does not have these values in it and we have to name the columns ourselves.
That's how you do it. But let's get rid of all that, because we actually do want
those in there, so we're just going to remove those and read it in as normal,
and there we go.
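The header and names options discussed above might look like this in practice. The in-memory text and its rows are placeholders; io.StringIO is used here only so the example runs without a file on disk.

```python
import io

import pandas as pd

csv_text = "Country,Region\nCountry A,Region X\nCountry B,Region Y\n"
headerless_text = "Country A,Region X\nCountry B,Region Y\n"

# Default (same as header=0): the first row is used as the column names.
default_df = pd.read_csv(io.StringIO(csv_text))

# header=None: columns are labelled 0 and 1, and the header row becomes data.
no_header_df = pd.read_csv(io.StringIO(csv_text), header=None)

# header=None plus names: supply column names for a file that has none.
named_df = pd.read_csv(
    io.StringIO(headerless_text), header=None, names=["Country", "Region"]
)

print(default_df.shape, no_header_df.shape, named_df.shape)
```

Note that passing header=None on a file that does contain a header row keeps that row as ordinary data, which is why no_header_df has one more row than default_df.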
Now typically when you're reading in a file, you want to assign that to a
variable. Almost always, when you see any tutorial or anybody online, or even
when you're actually working, people will say df is equal to. df stands for data
frame. Again, this is a data frame. In the next video in this series I'm going
to walk through what a series is as well as what a data frame is, because that's
pretty important to know when you're working with these data frames. But we'll
assign it to this variable, and then we'll call it by saying df, and we'll run
it. That's typically how you'll do things, because you want to save this data
frame so later on you can do things like df dot and call different methods on
it. It's not as easy to do that if you're reading in this entire CSV and
importing it every time.
So let's copy this, because now we're going to import a different type of file.
We've been doing read_csv, but we can also import text files, and you can do
that with read_csv. Let's look at this one. It's the same countries of the world
file, except now it's a text file, because I just converted it for this video.
I'll copy that as a path, and now when we do this (oops, let me get those quotes
in there) it'll say world.txt. It will still run, but as you can see, this did
not import properly. We have this Country\tRegion, and all of our values have
that same \t in them. That's because we need to use a separator. I'll show you
in just a little bit how we can do this a different way, but with read_csv this
is how we can do it: we'll just say sep is equal to, and we need to do \t. Now
let's try running this, and as you can see, it now has it broken out into
Country and Region.
We could also do it the more proper way, and this is the way you should do it.
I'll get rid of these really quickly (I just wanted to keep them there in case
you wanted to see that), but you can also do read_table. Let's get rid of this
separator, so now we have no separator and we're just reading it in as a table.
Let's run this, and it reads it in properly the first time. read_table can be
used for tons of different data types, but typically I've been using it for
things like text files. We can also read in that CSV, so let's change this right
here to .csv. We can read it in, but just like in the last one, when we read in
the text file using read_csv, with read_table you're going to need to specify
the separator. So I'll just copy this and we'll say comma, and now it reads it
in properly. Again, you can use that for a ton of different file types, but you
need to specify a few more things if you don't want to use the more specific
read_ function when you're using pandas.
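Here is a short sketch of the separator behavior walked through above, using placeholder tab-separated and comma-separated text held in memory rather than the files from the video.

```python
import io

import pandas as pd

tab_text = "Country\tRegion\nCountry A\tRegion X\nCountry B\tRegion Y\n"
comma_text = tab_text.replace("\t", ",")

# read_csv assumes commas, so tab-separated text collapses into one column.
one_column = pd.read_csv(io.StringIO(tab_text))

# Telling read_csv the separator is a tab splits it correctly.
with_sep = pd.read_csv(io.StringIO(tab_text), sep="\t")

# read_table defaults to a tab separator, so no sep argument is needed.
as_table = pd.read_table(io.StringIO(tab_text))

# read_table can read comma-separated text too, if the separator is given.
csv_as_table = pd.read_table(io.StringIO(comma_text), sep=",")

print(one_column.shape, with_sep.shape, as_table.shape, csv_as_table.shape)
```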
Now let's copy this again. We're going to go right down here, and now let's do
JSON files. JSON files usually hold semi-structured data, which is definitely
different from very structured data like a CSV, where it has columns and rows.
So let's go to our File Explorer. We have this JSON sample, and we'll copy this
as the path. Let's paste it right here, and we'll do read_json. Again, these
different functions were built out specifically for these file types. That's why
each one has a different name. So now we're reading this in as JSON. Let's read
it in, and it read it in properly.
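A minimal read_json sketch follows. The structure of the JSON sample file in the video is not shown in the transcript, so the list-of-records layout and the values below are assumptions chosen only to illustrate the call.

```python
import io

import pandas as pd

# Assumed layout: a list of records, one object per row (placeholder values).
json_text = (
    '[{"Country": "Country A", "Region": "Region X"},'
    ' {"Country": "Country B", "Region": "Region Y"}]'
)

# read_json accepts a path or a file-like object; StringIO stands in for a file.
df = pd.read_json(io.StringIO(json_text))
print(df)
```

More deeply nested JSON usually needs extra handling before it fits a table, which is part of what "semi-structured" means here.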
properly now let's go ahead and copy this and take a look at Excel files
this and take a look at Excel files because Excel files are a little bit
because Excel files are a little bit different than other ones that we've
different than other ones that we've looked at
looked at um so let's just do read uncore
um so let's just do read uncore Excel and let's go down to our file
Excel and let's go down to our file explorer and let's actually open up this
explorer and let's actually open up this workbook as you can see we have sheet
workbook as you can see we have sheet one right here but we also have this
one right here but we also have this world population which has a lot more
world population which has a lot more data let's say we just wanted to read in
data let's say we just wanted to read in sheet one we can do that or by default
sheet one we can do that or by default it's going to read in this world
it's going to read in this world population because it's the first sheet
population because it's the first sheet in the Excel file well let's go ahead
in the Excel file well let's go ahead and take a look at that let's get out of
and take a look at that let's get out of here and and let's say oops I forgot to
here and and let's say oops I forgot to copy the file path let's go ahead and
copy the file path let's go ahead and copy as path and we'll put it right
copy as path and we'll put it right here and let's just read it in with no
here and let's just read it in with no arguments or anything in there or no
arguments or anything in there or no parameters when we read it in it's
parameters when we read it in it's reading in that very first sheet so this
reading in that very first sheet so this is the one that has all of the data now
is the one that has all of the data now let's say we wanted to read in that
let's say we wanted to read in that extra sheet name or the second sheet
extra sheet name or the second sheet name we'll just go comma sheet undor
name we'll just go comma sheet undor name say is equal to and then we can
name say is equal to and then we can specify sheet was it sheet one like this
specify sheet was it sheet one like this yes it was so we just had to specify the
yes it was so we just had to specify the sheet name right here and then it
sheet name right here and then it brought in that sheet instead of the
brought in that sheet instead of the default which is the very first sheet in
default which is the very first sheet in that Excel now that definitely covers a
now that definitely covers a
lot of how you read in those files again
you can come in here and hit Shift+Tab
and this plus sign and take a look at
all the documentation and you can
specify a lot of different things
that I didn't think were very important
for you guys to know especially if
you're just starting out the ones that
we looked at today are what I would say
are the ones that I use almost all
the time so I wanted to show you those
but if you're interested in any of these
other ones or you have very unique data
and you need them um you know it's
worth really getting in here and
figuring things out
a few other things
that I wanted to show you just in this
kind of first video or this intro video
on how to read in files um one thing
that you may have noticed especially in
this file right here is we're only
looking at the first five rows and then the
last five so if we wanted to see all the
data the rest of it is hidden behind these
little three dots right here right we
want to be able to see that data but
right now we can't and that's because of
some settings that are already within
pandas and all we need to do is change
that so this one has 234 rows and four
columns so obviously we can see all the
columns well let's just change the rows
all we'll say is pd.set_option
now what we need to do is we're going to
change the rows we're not going to
change the columns at least not on this
one so we'll say quote
display.max_rows so for whatever
data we bring in this sets
the max rows it's going to be able to show
and then we'll say
235 although there's 234 rows I'm just
going to be safe let's run
this and now it has changed it so let's
read in this file again and you'll see
how it's
changed now we have all the rows and
we have this little bar on the right
that allows us to go down all the way to
the bottom and all the way to the top so
now we can actually look and kind of
skim and see our values I like that
better than just having that you know
shorter version um we can do the exact
same thing on columns as well so if we
look at this one this is our JSON file
it has the same thing right here we have
what was it 38 columns but we can only
see I think it's maybe it's 20 or
something like that I can't remember um
but we have 38 and we can only see like
let's say 15 of them or 20 of them we'll
do the exact same thing and we'll just
say pd.set_option with
display.max_columns
and we'll set that to 40 for
that one when we run this oops let's get
over here when we run this one again we
can now scroll over and see every single
one of our columns now that one is in
my opinion a lot more useful I like
being able to see every single column so
definitely something that you should be
using especially when you have these
really large files you want to be able
to see a lot of the data and a lot of
the columns so when you're slicing and
dicing and doing all the things that
we're about to learn in this pandas
series you know what you're
looking at
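Here is a minimal sketch of the two display settings described above. The one-column frame is made-up stand-in data with 234 rows, not the real world population sheet; the option names and values are the ones mentioned in the video.

```python
import pandas as pd

# Made-up stand-in for the 234-row sheet (assumption: the real data is not loaded here).
df = pd.DataFrame({"Rank": range(1, 235)})

# Show up to 235 rows before pandas truncates the output with "...".
pd.set_option("display.max_rows", 235)

# Show up to 40 columns before pandas truncates the output with "...".
pd.set_option("display.max_columns", 40)

# Read the settings back to confirm they were applied.
print(pd.get_option("display.max_rows"))
print(pd.get_option("display.max_columns"))
```

These settings apply to the whole session, so every data frame displayed afterwards uses them; pd.reset_option("display.max_rows") puts a setting back to its default.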
I also want to show you just
how to kind of look at your data in
these data frames as well that's also
pretty important so let's go right down
here and the very last one that we
imported was this one right here this
read_excel so this data frame is the
only one that's going to be read in
let's run it um this is the last one to
be run so this variable right here df uh
it won't hold any of these other
ones which we can always go back and
change typically you'll do
something like df2 you want
to do something like that um so let's
keep df2 oops so what we're
going to do is we're going to bring
df2 right down here and we want to
take a look at some of this data we want
to know a little bit more about it
something that you can do is
df2.info and we'll do an open and close parenthesis
and when we run this it's going to give
us a really quick breakdown of a little
bit of our data so we have our columns
right here Rank CCA3 Country and
Capital it's saying we have
234 values in each of those columns and because
there's
let me scroll up here because there's
234 uh rows that tells me that there's
no missing data in here at least not you
know completely missing like null values
there is something in each of
those rows the count tells me it's non-
null so there's no null values and it
tells me the data type so it's bringing
them in as an integer an object an object and
an object and it also tells us how much
memory it's using which is also pretty
neat because when you get really really
large data sets memory usage and
knowing how to work around that stuff
does become more important than when
you're working with these really small you
know sample sizes that we're looking at
we can also do oops let me get rid of
that we can also do df2
and we'll do shape and for this one we
do not need the
parentheses and all this is going to
tell us is we have 234 rows and four
columns we're also able to look at uh
the first few values or rows in each of
these data frames so we can just say
df2.head and if we do that
it's going to give us the first five
rows but we can specify how many we
want we can say head 10 it'll give us
the first 10 rows right here we can do
the exact same thing and let's go right
down here and we'll say tail so that'll
give us the last 10 rows within our data
frame
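The inspection calls above can be sketched on a small frame. The six rows below are made-up toy values with the same four column names as the sheet in the video (Rank, CCA3, Country, Capital); df2 is just the variable name used in the walkthrough.

```python
import pandas as pd

# Toy data (assumption): same column names as the video's sheet, invented values.
df2 = pd.DataFrame({
    "Rank": [36, 138, 34, 213, 203, 42],
    "CCA3": ["AFG", "ALB", "DZA", "ASM", "AND", "AGO"],
    "Country": ["Afghanistan", "Albania", "Algeria",
                "American Samoa", "Andorra", "Angola"],
    "Capital": ["Kabul", "Tirana", "Algiers",
                "Pago Pago", "Andorra la Vella", "Luanda"],
})

df2.info()           # column names, non-null counts, dtypes, memory usage
print(df2.shape)     # attribute, no parentheses: (rows, columns)
print(df2.head(3))   # first 3 rows; head() with no argument gives 5
print(df2.tail(2))   # last 2 rows; tail() with no argument gives 5
```

If the non-null count for a column is lower than the number of rows reported by shape, that column has missing values.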
now let's copy this and let's say
we don't want to actually look at all of
these values or all these columns we can
specify that by saying df2 and oops
let's get rid of all of
this and in a bracket with a quote we'll
say Rank and now we can take just a look
at the Rank data now we can't pull a row by
doing the index or at least not like
this if we want to use this index that
is right here we can but there are
special indexers called loc and iloc for
that and I'm going to have an entire
video on this because it does get a
little bit more complex but there's
df2.loc and there's df2.iloc which stand for
location and integer location those are only for
the indexes whether it's the x axis or
the y axis those are the indexes and for
loc it's looking for the actual
label the actual value of the index so
if we come up here to df2 we
can specify
224 and it'll give us this information
right here in a little different format
so let's go df2.loc bracket and we'll say
224 and when we run this it gives us our
Rank CCA3 Country Capital with our values
over here kind of like a dictionary
almost now let's copy this and we'll say
df2.iloc and right now these look the
exact same but we haven't really talked
a lot about changing the index and you
can change the index to a string or a
different column or something like that
and we'll look at that in future videos
iloc looks at the integer location
so even if these um let's go right up
here even if this index had changed to
let's say this Rank or this CCA3 or
Country or whatever you make this index
iloc will still look at the integer
location so that 224 would still be 224
even if the label was Uzbekistan
so then when we look at this it's going
to be the exact same but if we had
changed that index loc is the one
that we could search on by label and we could
search
Uzbekistan is that how you spell Uzbekistan
hey I nailed it so that is how you use
loc and iloc
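A short sketch of the difference between selecting a column, loc, and iloc. The three rows are toy values (the Rank numbers are invented), and by_country is a hypothetical name for the re-indexed frame; it is not used in the video.

```python
import pandas as pd

# Toy data (assumption): three rows with invented Rank values.
df2 = pd.DataFrame({
    "Rank": [133, 43, 181],
    "Country": ["Uruguay", "Uzbekistan", "Vanuatu"],
})

ranks = df2["Rank"]            # one column, returned as a Series

# With the default index the labels are 0, 1, 2, so label 1 and
# position 1 happen to point at the same row.
row_by_label = df2.loc[1]      # loc: look up by index label
row_by_position = df2.iloc[1]  # iloc: look up by integer position

# After the index changes, loc needs the new label but iloc still
# uses the position.
by_country = df2.set_index("Country")
print(by_country.loc["Uzbekistan"])
print(by_country.iloc[1])
```

Both prints show the same row, once found by its label and once by its position.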
again I just wanted to show
you a little bit about how you can look
at your data frame or search within your
data frame now in future videos I'm
going to dive a lot deeper into a lot of
the concepts that we just looked at
because I just kind of touched on them I
wanted you to have a brief introduction
to them so that in future videos I'm not
just dropping everything on you all at
once so hopefully this was a good quick
introduction to those topics uh you
should be able to read in a file now see
your data frame and kind of look at it
in a few different ways that we just
looked at and I hope that that was
helpful and if it was be sure to check
out all my other videos on Python and
pandas and if you like this video be
sure to like and subscribe below and I
will see you in the next
video
[Music]
hello everybody today we're going to be
looking at filtering and ordering data
frames in pandas there are a lot of
different ways you can filter and order
your data in pandas and I'm going to try
to show you all of the main ways that
you can do that so let's kick it off by
importing our data set so we're going to
say df is equal to and we'll say
pd and I need to import pandas so
we'll say import pandas as pd
that's pretty important I think um so
pd.read_csv and we'll do r for a raw string and then
we'll put in the path to the world population CSV so
let's run this and pull up our data frame right
here and this is the data frame that
we're going to be filtering through and
ordering in pandas so let's kick it off
the first thing that we can do is filter
based off of the columns so the data
within our columns so Asia Europe Africa
or whatever data we may have in that
column let's go right down here we're
going to say df and then within the brackets we're
going to specify what column we're going
to be filtering on so we're going to say
df with another bracket and we'll say
Rank so we're going to be looking at
this Rank column right here and we'll
say in that Rank column we want to do
greater than 10 and that's actually
going to be a lot of them let's do less
than so when we run this it's only going
to return the rows where Rank is less
than 10 we can also do less than or equal
to you know all of these um comparison
operators so less than or equal to so
now we have all of the ranks 1 through
10
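The comparison filter above can be sketched as follows. The frame is toy data with invented Rank values standing in for the world population CSV; under_ten and top_ten are hypothetical variable names.

```python
import pandas as pd

# Toy data (assumption): invented ranks, not the real CSV.
df = pd.DataFrame({
    "Rank": [8, 7, 36, 1, 10, 138],
    "Country": ["Bangladesh", "Brazil", "Afghanistan",
                "China", "Mexico", "Albania"],
})

# df["Rank"] < 10 is a True/False Series; putting it inside df[...]
# keeps only the rows where it is True.
under_ten = df[df["Rank"] < 10]

# Any comparison operator works the same way, for example <=.
top_ten = df[df["Rank"] <= 10]

print(under_ten)
print(top_ten)
```

The original row labels are kept, which is why the index of the filtered result can have gaps.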
now if we look at these countries we
can filter by specific values almost
exactly like we did here but instead of
doing a comparison operator like we did
right here and typing out those names
let's say Bangladesh and Brazil we can
use the isin function almost like the IN
operator in SQL if you know SQL so let's
go right down here and we're going to
say specific_countries so
right now we're just going to make a
list of the countries that we want and
then we'll say
Bangladesh and
Brazil so let's go right down here and
we'll say okay for these specific
countries from the data frame let's do
our bracket we'll say in this Country
column so we'll do df and then
another bracket for Country so in this
Country column we can do .isin and
then an open parenthesis and then look
for our specific_countries so we're
looking at just this column and we're
saying isin so we're asking are
these values within this column and
we're getting this error and this looks
very very odd let me um this doesn't
look right there we go I just had some
syntax errors I apologize I made it way
more complicated than it needs to be but
here's how you use this isin function
so we're looking at Bangladesh and
Brazil and we return those rows with
Bangladesh and Brazil
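A minimal sketch of the isin filter just described, on toy data with invented Rank values. specific_countries is the list name used in the video; matches is a hypothetical name for the result.

```python
import pandas as pd

# Toy data (assumption): invented ranks, not the real CSV.
df = pd.DataFrame({
    "Rank": [8, 7, 36, 1],
    "Country": ["Bangladesh", "Brazil", "Afghanistan", "China"],
})

# The values we want to keep, similar to IN (...) in SQL.
specific_countries = ["Bangladesh", "Brazil"]

# isin returns True for each row whose Country is in the list.
matches = df[df["Country"].isin(specific_countries)]
print(matches)
```

Note the bracket order: the inner df["Country"].isin(...) builds the True/False mask and the outer df[...] applies it.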
really quickly I
wanted to give a huge shout out to the
sponsor of this entire pandas series and
that is Udemy Udemy has some of the best
courses at the best prices and it is no
exception when it comes to pandas
courses if you want to master pandas
this is the course that I would
recommend it's going to teach you just
about everything you need to know about
pandas so huge shout out to Udemy for
sponsoring this pandas series and let's
get back to the video we can also do a
contains function kind of similar to
isin except it's more like the LIKE in SQL
as well I'm comparing a lot of this to
SQL because when you're filtering things
my brain always goes to SQL but
in pandas it's called str.contains so
let's do let's actually copy this
because I don't want to make the same
mistake again let's do that and we'll do
the bracket but instead of .isin
we're going to do .str.contains and
then an open parenthesis so we're going
to be looking for a string
if it contains let's do
United almost like United States or
any other United so let's run this and
as you can see we have United Arab
Emirates United Kingdom United States
United States Virgin Islands so we can
kind of search for a specific string or
substring within our data or
within that Country column
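The contains search above can be sketched like this. The country names are the ones read out in the video plus one non-matching row; the frame itself is toy data and united is a hypothetical variable name.

```python
import pandas as pd

# Toy data (assumption): only a Country column, built by hand.
df = pd.DataFrame({
    "Country": [
        "United Arab Emirates",
        "United Kingdom",
        "United States",
        "United States Virgin Islands",
        "Uruguay",
    ],
})

# .str.contains checks each string for the substring, a bit like
# LIKE '%United%' in SQL.
united = df[df["Country"].str.contains("United")]
print(united)
```

str.contains only works on text columns, and by default it treats the pattern as a regular expression; passing regex=False matches the text literally.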
now so far
we've only been looking at how you can
filter on these columns we can also
filter based off of the index as well
and there's two different ways you can
do it or two of the main ways there's
filter and then there's loc and iloc loc
stands for location and iloc stands for
integer location and if you've seen
other previous videos I've kind of
mentioned those so we can take a quick
look at all of those so really quickly
we need to set an index because the
index right now is uh not the best we'll
set our index to
Country so let's say
df2
is equal to df.set_index and we'll
say Country I'm just doing df2 because
later on I want to use that original data frame
again so I'm just going to assign it to
another data frame so that we can just
easily switch back and forth so now we
have this index as the country and what
we can do is use the filter function so
let's go down here we'll say
df2
.filter and we'll do an open parenthesis
and now we can specify our items so
these are actually going to be
specifying which columns we want to keep
so we're going to say items is equal to
then we'll make a list we'll say
Continent hope that's how we spell
continent I'm always messing up with my
uh my stuff here my spelling then we'll
do CCA3 because why not you can specify
whichever ones you want when we run this
it's going to only bring in those two
columns now by default it's choosing the
axis for us but we can also specify
which axis we want to search on so if we
say axis is equal to zero it's actually
going to search this axis this is the
zero axis this is the one axis so where
our columns are is one so if we go back
and do one we're searching on that one
axis or those column headers again and
this is the default but you can specify
that so if you just want to search on uh
you know the index right here you can do
that and let's actually copy this and do
that right down here just so you can see
what it looks like but let's search for
Zimbabwe so items will be Zimbabwe and we'll
be looking at the zero axis which is the
up and down on the left hand side and
when we filter on that we can filter by
Zimbabwe by looking just at the country
index we can also use like just like
we did before with contains and I'll show you the
exact same demonstration that we did
where you can say like is equal to and
instead of having to put in the exact
um text you can just say United
just like we did before and we're
searching where the axis is equal to
zero which again is this left-hand
axis so now we're looking for United
and it's going to give us all of the
countries or all the index values that
have United in them like we were talking
about before
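Here is a small sketch of the three filter calls walked through above. The rows are toy data with the column names mentioned in the video; cols, zimbabwe and united are hypothetical variable names.

```python
import pandas as pd

# Toy data (assumption): a handful of hand-written rows.
df = pd.DataFrame({
    "CCA3": ["GBR", "USA", "URY", "ZWE"],
    "Country": ["United Kingdom", "United States", "Uruguay", "Zimbabwe"],
    "Continent": ["Europe", "North America", "South America", "Africa"],
})

# Keep df as it is and put the re-indexed copy in df2.
df2 = df.set_index("Country")

# Default axis for a DataFrame is the columns (axis=1): keep two columns.
cols = df2.filter(items=["Continent", "CCA3"])

# axis=0 filters on the index (the row labels on the left-hand side).
zimbabwe = df2.filter(items=["Zimbabwe"], axis=0)

# like= keeps every label that contains the given text.
united = df2.filter(like="United", axis=0)

print(cols)
print(zimbabwe)
print(united)
```

filter never looks at the values inside the table, only at the column names or the index labels, which is why the index had to be set to Country first.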
we also have loc and iloc so
we can say df2.loc now this takes a
specific value so we'll do United States
so loc is just looking at the
actual name or the label not its
position so if we search for United
States it's going to give us this right
here where it gives us all of the
columns for United States and then all
of the uh values for United States or we
can
do iloc which is the integer location
which is not the exact same because
for loc
we're looking at this string but
underneath it there still is a position
that's that integer location let's do a
completely random one let's just say
three if we look at position three
it's going to give us ASM which I'm not
exactly sure what it is but it still
gives us basically the same kind of
output which is the columns and the
values so that's another way that you
can search within your index when you're
actually trying to filter down that data
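A quick sketch of loc and iloc on a country index. The five rows are toy data; they are ordered so that position 3 is the ASM row, matching what the video shows, and us_row and fourth_row are hypothetical variable names.

```python
import pandas as pd

# Toy data (assumption): five hand-written rows in alphabetical order.
df = pd.DataFrame({
    "CCA3": ["AFG", "ALB", "DZA", "ASM", "USA"],
    "Country": ["Afghanistan", "Albania", "Algeria",
                "American Samoa", "United States"],
    "Continent": ["Asia", "Europe", "Africa", "Oceania", "North America"],
})
df2 = df.set_index("Country")

# loc looks up the row by its index label.
us_row = df2.loc["United States"]

# iloc looks up the row by position, counting from 0, so 3 is the
# fourth row no matter what the labels are.
fourth_row = df2.iloc[3]

print(us_row)
print(fourth_row)
```

Each lookup returns one row as a Series, with the column names on the left and the values on the right.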
Now let's go look at the order by, and let's start with the very first one that we looked at.
Let's do data frame; that's why I kept it, because I wanted to use it later. Now we can sort
and order these values. Instead of it just being kind of a jumbled mess in here, we can sort
these columns however we would like: ascending, descending, multiple columns, single columns.
Let's look at how to do that. So we'll say data frame, and then we'll look at rank again just
like we were doing above, and let's do data frame where it's less than 10. I should have just
gone and copied this, I apologize. So now we have this data frame where the rank is less than
10. Now we can do .sort_values, and this is the function that's going to allow us to sort
everything that we want to sort. So we can do by is equal to, and we'll just order it by the
exact same thing that we were filtering on; we'll do rank. So now what this is going to do is
order our rank column, and as you can see it did that: 1, 2, 3, 4, 5. We can also do it with
ascending or descending, so if you want to, you can look here and see what you can do. We'll
do ascending and we'll say that's equal to True. That's the automatic default, so that didn't
change anything, but if we say False it's going to be descending, from highest to lowest. So
now we have it in the opposite direction.
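Here is a minimal sketch of the filter-then-sort step just described. The column names `Rank` and `Country` and the sample rows are assumptions for illustration; the real CSV may name things differently.

```python
import pandas as pd

# Made-up sample rows; only the shape of the data matters here.
df = pd.DataFrame(
    {
        "Rank": [12, 3, 8, 1, 5, 27],
        "Country": ["Ethiopia", "United States", "Bangladesh", "China", "Pakistan", "Myanmar"],
    }
)

# Keep only the rows whose rank is below 10.
top = df[df["Rank"] < 10]

# ascending=True is the default, so these two calls differ only in direction.
lowest_first = top.sort_values(by="Rank")
highest_first = top.sort_values(by="Rank", ascending=False)

print(lowest_first["Rank"].tolist())
print(highest_first["Rank"].tolist())
```

`sort_values` returns a new data frame, so `top` itself keeps its original order.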
Now we don't have to just order or sort this on one single column. We can do multiple
columns, and we can do that by making a list right here, just like that, and we'll input
different ones as well. So now let's input our country. When we run this it will give us rank
of 9, 8, 7, 6 as well as the country of Russia, Bangladesh, Brazil. Now, if you noticed, the
country really didn't change because the rank stayed the exact same. That's because there's
an order of importance here, and it starts with the very first one. If we change this around
and put country first, with a comma right here, now the country is going to be sorted
descending and the rank would come second, so the rank isn't going to really have any effect
here. So now we have the country United States, Russia, Pakistan, and the rank really didn't
get ordered at all. Now if we want to see how that can actually work, let's put continent
right here and do country here. If we run this, it's first going to sort the continent, then
it's going to come back and go to the country and sort the country. So keep your eye right
here on this Asia area, because we're going to sort this differently. We have ascending
equals False, and that applies to both of these: it's False and False. But we can specify
which one we want to do. We can do a False here and a True here, so we'll do False comma
True. What this is going to do is say False for the continent, so the continent right here is
going to stay the exact same.
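The multi-column sort above can be sketched as follows. The rows and column names are illustrative assumptions; the point is how the list passed to `by` sets the order of importance and how a list passed to `ascending` sets one direction per column.

```python
import pandas as pd

# Made-up rows with a repeated continent so the second sort key matters.
df = pd.DataFrame(
    {
        "Continent": ["Asia", "Europe", "Asia", "South America", "Asia"],
        "Country": ["Pakistan", "Russia", "Bangladesh", "Brazil", "India"],
        "Rank": [5, 9, 8, 7, 2],
    }
)

# A single flag applies to every column in the list: both descending.
both_desc = df.sort_values(by=["Continent", "Country"], ascending=False)

# One flag per column: continents descending, countries ascending within each continent.
mixed = df.sort_values(by=["Continent", "Country"], ascending=[False, True])

print(both_desc["Country"].tolist())
print(mixed["Country"].tolist())
```

Only the Asia rows change order between the two results, because that is the only continent with more than one country in this sample.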
And so that is a lot of how you can filter and order your data within pandas. I hope that
this was helpful and that you enjoyed this video. If you liked it, be sure to like and
subscribe below, check out all my other videos on Python and pandas, and I will see you in
the next video.
[Music]
Hello everybody, today we're going to be looking at indexing in pandas. If you remember from
previous videos, the index is an object that stores the axis labels for all pandas objects.
The index in a data frame is extremely useful because it's customizable, and you can also
search and filter based off of that index. In this video we're going to talk all about
indexing: how you can change the index and customize it, as well as how you can search and
filter on that index. Then we're also going to be looking at something a little bit more
advanced called multi-indexing. You won't always use it, but it's really good to know in case
you come across a data frame that has that.
So let's get started by importing pandas: import pandas as pd. Now we'll get our first data
frame. We say df is equal to pd.read_csv, and I've already copied this, but we're going to do
r and then put this file path. I have this world population CSV; I will have that in the
description just like I do in all of my other videos. Let's run df and take a look at this
data frame. We have a lot of information here: we have rank, country, continent, population,
as well as the default index from zero all the way up to 233. Now if you haven't watched any
of my previous videos on pandas, the index is pretty important, and it's basically just a
number or a label for each row. It doesn't even necessarily have to be a unique number. You
can create or add an index yourself if you want to, and it doesn't have to be unique, but it
really should be unique, especially if you want to use it appropriately. For what we're
doing, the country is actually going to be a pretty great index, because the country is going
to be all unique: every single row is a different country along with its population. So let's
go ahead and add this country as our index.
Now we can do this in a lot of different ways, but the first way, if you already know what
you are going to create that index on, is to go right in here when we're reading in this
file. We'll say comma index_col, and we'll say that is equal to, in quotes, country. So we're
taking this country column and we're going to assign it as the index. Now let's read this in,
and as you can see this is our index now. It looks a little bit different. We didn't have
this country header right here before, which is specifying that this is still the country,
but you can tell that this is the index based off the bold letters as well as it being on the
far left. All the regular columns for the data are over here, while the country header is
right here and it's lower than all the others. That's just a quick way that you can see that
that is the index.
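Setting the index while reading the file, as described above, might look like the sketch below. To keep it runnable without the video's CSV, it reads from an in-memory string; the column names and numbers are placeholders, not the real file's contents.

```python
import io

import pandas as pd

# Placeholder CSV text standing in for the world population file.
csv_text = """Rank,Country,Continent,Population
36,Afghanistan,Asia,41128771
138,Albania,Europe,2842321
34,Algeria,Africa,44903225
"""

# Without index_col, pandas adds the default 0, 1, 2, ... index.
df_default = pd.read_csv(io.StringIO(csv_text))

# With index_col, the named column becomes the index instead of a regular column.
df = pd.read_csv(io.StringIO(csv_text), index_col="Country")

print(df)
```

With a real file you would pass the path in place of `io.StringIO(csv_text)`; the `index_col` argument works the same way.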
Now before we move on, I want to show you some other ways that you can do this as well, but
first I'm going to show you how to reverse this index. We had our data frame right here, so
we'll say df.reset_index, and then we'll say inplace is equal to True, which means we don't
have to assign this to another variable; the change is applied to the data frame itself. So
now when we run that data frame again, the index was reset to the default numbers.
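A short sketch of the reset step, using a made-up three-row frame that already has a country index:

```python
import pandas as pd

# Hypothetical frame whose index is the country name.
df = pd.DataFrame(
    {"Rank": [36, 138, 34]},
    index=pd.Index(["Afghanistan", "Albania", "Algeria"], name="Country"),
)

# inplace=True modifies df directly; the old index moves back into a regular column.
df.reset_index(inplace=True)

print(df)
```

After the call, `Country` is an ordinary column again and the index is back to the default integers.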
So now let's go down here and I'll show you how to do this in a different way. You can do
df.set_index, and then we'll just say country. It's very similar to when we were reading in
that file and we said index_col equals country. If we do this and we run it, it works, but if
we say df right down here, it's not going to save that. If we want to save it, just like we
did above, we're going to say inplace is equal to True. That is going to save it so we don't
have to assign it to another variable. So now when we run this and then display the data
frame right here, country will be our index again. Let's run this, and there we go.
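The point about the change not being saved can be shown with a small sketch. The frame is made up; the behaviour of `set_index` with and without `inplace=True` is what matters.

```python
import pandas as pd

# Hypothetical frame with the default integer index.
df = pd.DataFrame(
    {"Country": ["Afghanistan", "Albania", "Algeria"], "Rank": [36, 138, 34]}
)

# Without inplace, set_index returns a new frame and leaves df untouched.
preview = df.set_index("Country")
index_unchanged = list(df.index) == [0, 1, 2]

# With inplace=True, df itself is modified.
df.set_index("Country", inplace=True)

print(index_unchanged)
print(df.index.name)
```

Assigning the result back, as in `df = df.set_index("Country")`, is an equivalent way to keep the change.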
Really quickly, I wanted to give a huge shout out to the sponsor of this entire pandas
series, and that is Udemy. Udemy has some of the best courses at the best prices, and it is
no exception when it comes to pandas courses. If you want to master pandas, this is the
course that I would recommend. It's going to teach you just about everything you need to know
about pandas. So huge shout out to Udemy for sponsoring this pandas series, and let's get
back to the video.
Now what's really great about this index is we're able to search based off just this index,
so we can filter on it and basically look through our data with it. There are two very common
ways that people who use pandas will search through that index: loc and iloc, which stand for
location and integer location. Let's look at loc first. Let's say df.loc and then we'll do a
bracket. Now we're able to specify the actual string, the label. So let's go right up here
and let's say Albania. Again, this is just looking at the label. Let's run this. Now it's
going to bring up all the Albania data, just like here, where it kind of looks like a column
in a column. We can get this exact same data using iloc right here. When we ran loc we were
searching based off Albania, which is in the one position, so if we pull the one position
with iloc, this should give us the exact same data.
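As a rough sketch of the Albania lookup, assuming a small made-up frame where Albania happens to be the second row:

```python
import pandas as pd

# Hypothetical country-indexed frame; Albania sits at integer position 1.
df = pd.DataFrame(
    {"Rank": [36, 138, 34], "Continent": ["Asia", "Europe", "Africa"]},
    index=pd.Index(["Afghanistan", "Albania", "Algeria"], name="Country"),
)

# Label-based lookup: find the row whose index label is "Albania".
albania_by_label = df.loc["Albania"]

# Position-based lookup: take the row at position 1, whatever its label is.
albania_by_position = df.iloc[1]

# Each result is a Series, which is why it prints like a column of values.
print(albania_by_label)
```

The two results match only because of where Albania sits in this frame; if the rows were reordered, `iloc[1]` would return a different country while `loc["Albania"]` would not.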
Now let's take a look at multi-indexing, and we'll come back to a little bit of this in a
second. Multi-indexing is creating multiple indexes. We're not just going to create the
country as the index; now we're going to add an additional index on top of that. So let's
pull up our data frame. Right now we have the country as the index, but let's do reset_index
and we'll say inplace equals True. Let's run it. So now we have our data frame back. Now
let's set our index, but this time we're going to add the country as the index as well as the
continent. So we'll say df.set_index, then we'll do a parenthesis, and instead of just doing
country like we did before, we're going to create a list. We'll add continent and separate it
by a comma, so we have continent and country. Let's say inplace is equal to True. Now when we
run this we're going to have two indexes. Let's run this and see what it looks like. So now
we have country as well as continent as our index.
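Passing a list to `set_index`, as described above, produces a `MultiIndex`. A minimal sketch with made-up rows and assumed column names:

```python
import pandas as pd

# Hypothetical rows; continents repeat so the two-level index is visible.
df = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Algeria", "Angola", "Armenia"],
        "Continent": ["Asia", "Europe", "Africa", "Africa", "Asia"],
        "Rank": [36, 138, 34, 42, 140],
    }
)

# The order of the list sets the order of the index levels: continent first, then country.
df.set_index(["Continent", "Country"], inplace=True)

print(df.index.names)
print(df)
```

Each row is now labelled by a (continent, country) pair rather than a single value.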
Now you may notice that these indexes are repeating themselves. On this continent index we
have Europe right here and Europe right here, as well as Asia and Asia, and it looks a little
bit funky, but we are able to sort these values and make them look a lot better. So let's go
ahead and try this. We'll do df.sort_index, and when we run this it should sort our index
alphabetically. We can also look in here and see what kind of things we can specify. We can
specify the axis, but it's automatically going to be looking at zero. This is zero and this
is one, so we have two axes within our data frame. You can choose the level, ascending or
not, inplace, kind, sort_remaining, all of these different things. The only one that I really
think is worth looking at is ascending; we already know some of these other ones. Let's run
it. Now it's sorted these, and so now it's kind of grouped together: we have Africa and all
the African ones, as well as South America and all the South American ones.
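Sorting the two-level index can be sketched like this, again with a small made-up frame:

```python
import pandas as pd

# Hypothetical frame with a (Continent, Country) MultiIndex in unsorted order.
df = pd.DataFrame(
    {
        "Continent": ["Asia", "Europe", "Africa", "Africa", "Asia"],
        "Country": ["Afghanistan", "Albania", "Algeria", "Angola", "Armenia"],
        "Rank": [36, 138, 34, 42, 140],
    }
).set_index(["Continent", "Country"])

# sort_index sorts by the first level, then by the second level within it.
sorted_df = df.sort_index()

print(sorted_df)
```

Because `inplace` is not set here, `sort_index` returns a sorted copy and `df` keeps its original row order.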
Let's really quickly say pd.set_option, and we'll say display.max_rows, just like this. Let's
run it, and I need to specify a number right here. Let's see how many rows we have: 235. So
let's do 235. Let's run this.
And now when we run this you can see that Africa is all grouped together and all the
countries are in alphabetical order under it, and then we go all the way down to Asia, and
again it's all in alphabetical order. If we wanted to, we could say ascending equals False,
and then when we run this it's the exact opposite: it starts with South America, the last
one, and then goes in reverse alphabetical order. We could also make it a list, False comma
True, just like this, and then it would sort this first level as False and this next level as
True. So you can really customize it, but for what we're doing we don't need any of that. We
just need to be able to see this right here.
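The display option and the two `ascending` variants above might be sketched as follows. The data is made up, and `display.max_rows` is used on the assumption that the goal is to show every row of the output.

```python
import pandas as pd

# Hypothetical frame with a (Continent, Country) MultiIndex in unsorted order.
df = pd.DataFrame(
    {
        "Continent": ["Asia", "Europe", "Africa", "Africa", "Asia"],
        "Country": ["Afghanistan", "Albania", "Algeria", "Angola", "Armenia"],
        "Rank": [36, 138, 34, 42, 140],
    }
).set_index(["Continent", "Country"])

# Show up to 235 rows before pandas truncates the printed output.
pd.set_option("display.max_rows", 235)

# A single flag reverses every index level.
reversed_df = df.sort_index(ascending=False)

# One flag per level: continents descending, countries ascending within each continent.
mixed_df = df.sort_index(ascending=[False, True])

print(mixed_df)
```

The list form needs one entry per index level, in the same order as the levels.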
So now let's try to search by our index like we did before with df.loc. When we did that
before, we said, let's say, Angola. Now when we specify Angola it's not going to work
properly, because it's searching in this first index for the string that we give it. We can
search Africa. Let's search for Africa, and now we have all of the African countries. If we
want to narrow it down to Angola, we can go down another level by adding Angola. Now we have
what we were looking at before, where we're calling all the data for that one row. We
couldn't do it just based off Africa because we had an additional index right here, so once
we called both indexes we get this view.
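Here is a sketch of label lookups on the two-level index. The frame is made up and is sorted first, which is a choice made for this example rather than something the video does before searching.

```python
import pandas as pd

# Hypothetical frame with a sorted (Continent, Country) MultiIndex.
df = (
    pd.DataFrame(
        {
            "Continent": ["Asia", "Europe", "Africa", "Africa", "Asia"],
            "Country": ["Afghanistan", "Albania", "Algeria", "Angola", "Armenia"],
            "Rank": [36, 138, 34, 42, 140],
        }
    )
    .set_index(["Continent", "Country"])
    .sort_index()
)

# A first-level label returns every row under it, indexed by the remaining level.
africa = df.loc["Africa"]

# A tuple with both levels narrows the result to a single row.
angola = df.loc[("Africa", "Angola")]

# A second-level label on its own is looked up in the first level, so it is not found.
try:
    df.loc["Angola"]
    found_without_continent = True
except KeyError:
    found_without_continent = False

print(africa)
print(found_without_continent)
```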
But let's look at iloc really quick. When we run this, let's just say one, because counting
zero and then one right up here lands on Angola, so you might think it will pull up Angola.
Let's go ahead and run this, and it's still pulling up Albania. If you remember, when we
didn't have the multiple indexes it was pulling up Albania. The difference when you're doing
these multi-indexes is that loc is able to use them, whereas iloc does not go based off that
multi-indexing. It's going to go based off the initial position, the integer-based index.
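The iloc behaviour above can be sketched with a made-up frame that has a two-level index but has not been sorted in place:

```python
import pandas as pd

# Hypothetical frame: rows are still in their original order, with Albania second.
df = pd.DataFrame(
    {
        "Continent": ["Asia", "Europe", "Africa", "Africa", "Asia"],
        "Country": ["Afghanistan", "Albania", "Algeria", "Angola", "Armenia"],
        "Rank": [36, 138, 34, 42, 140],
    }
).set_index(["Continent", "Country"])

# iloc ignores the index labels entirely and just takes the row at position 1.
second_row = df.iloc[1]

# The Series name shows which (continent, country) pair that row belongs to.
print(second_row.name)
```

Viewing a sorted copy with `df.sort_index()` does not change what `df.iloc[1]` returns, because the underlying row order of `df` is unchanged.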
So that's a lot about indexing in pandas. We'll cover even a few more things in future videos
as we get more and more into pandas, but this is a lot of what indexing looks like within
pandas. Again, it's super important to learn and know how to do, because it's a pretty
important building block as we go through this pandas series. So I hope you enjoyed this
video on indexing. If you did, be sure to like and subscribe below, and I will see you in the
next video.
[Music]
Hello everybody, today we're going to be taking a look at the groupby function and
aggregating within pandas. groupby is going to group together the values in a column and
display them all on the same row, and this allows you to perform aggregate functions on those
groupings. So let's start by reading in our data and take a look. We're going to do import
pandas as pd, and then we're going to say our data frame is equal to pd.read_csv, with an
open parenthesis, r, and our file path. We're going to be looking at the flavors CSV right
here. So right here we have our flavor of ice cream, we have our base flavor, whether it was
vanilla or chocolate, whether I liked it or not, the flavor rating, texture rating, and its
overall or total rating. Now these are all my own personal scores. I've spent years
researching this, so these are all very accurate, but this should be a low-stress environment
to learn groupby and the aggregate functions.
So the first thing that we can do is look at our groupby. Now you can group by flavor, but as
you can see these are all unique values. What we need is something that has duplicate values,
or similar values on different rows, that'll group together. So this base flavor is actually
a perfect one to group on, and we'll do that by saying df.groupby, an open parenthesis, and
we'll just specify base flavor. This will group those flavors together. So let's run this,
and as you can see it actually is its own object: a DataFrameGroupBy object.
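A sketch of the grouping step, using a made-up ice cream table. The column names (`Flavor`, `Base Flavor`, `Liked`, and the rating columns) and all the scores are assumptions for illustration.

```python
import pandas as pd

# Hypothetical flavors table; two rows share each base flavor.
df = pd.DataFrame(
    {
        "Flavor": ["Chocolate", "Rocky Road", "Vanilla", "Cookie Dough"],
        "Base Flavor": ["Chocolate", "Chocolate", "Vanilla", "Vanilla"],
        "Liked": ["Yes", "Yes", "No", "Yes"],
        "Flavor Rating": [10, 8, 5, 9],
        "Texture Rating": [8, 6, 7, 9],
    }
)

# groupby alone does not compute anything yet; it returns a grouping object.
group_by_frame = df.groupby("Base Flavor")

print(type(group_by_frame).__name__)
print(group_by_frame.ngroups)
```

The grouping object only produces a table once an aggregation such as a mean is called on it.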
Group by object so now that we've grouped them let's give it a variable so
grouped them let's give it a variable so we'll say group underscore byor frame
we'll say group underscore byor frame let's say that's equal to Let's copy
let's say that's equal to Let's copy this we'll run it and now what we need
this we'll run it and now what we need to do is run our aggregations in order
to do is run our aggregations in order to get an output so we're going to say
to get an output so we're going to say mean and that's all we're going to put
mean and that's all we're going to put just for now just to get an output that
just for now just to get an output that we can take a look off and then we'll
we can take a look off and then we'll build from there so let's go ahead and
build from there so let's go ahead and run this and right here we have our base
run this and right here we have our base flavor which is now saying is the index
flavor which is now saying is the index of chocolate or vanilla and then it's
of chocolate or vanilla and then it's taking the mean or the average of all
taking the mean or the average of all the columns that have integers notice
the columns that have integers notice that it did not take the liked column
that it did not take the liked column and it did not take the flavor column
and it did not take the flavor column because those are strings and they
because those are strings and they cannot aggregate those and we'll take a
cannot aggregate those and we'll take a look at that later but it took all the
look at that later but it took all the values that have integers and then it
values that have integers and then it gave us the average of those ratings
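The grouping step described here can be sketched as follows. This is a minimal sketch, not the notebook from the video: the rows are made up and the column names ("Flavor", "Base Flavor", "Liked", "Flavor Rating", "Texture Rating") are assumptions. One thing to be aware of: in recent pandas releases (2.0 and later, as far as I know) calling .mean() on a group that still contains string columns raises an error instead of quietly skipping them, so the sketch passes numeric_only=True to get the behavior shown in the video.

```python
import pandas as pd

# Made-up stand-in for the ice cream data; column names are assumptions.
df = pd.DataFrame({
    "Flavor": ["Mint Chocolate Chip", "Chocolate", "Vanilla", "Cookie Dough",
               "Rocky Road", "Pistachio", "Cake Batter", "Neapolitan",
               "Chocolate Fudge Brownie"],
    "Base Flavor": ["Vanilla", "Chocolate", "Vanilla", "Vanilla", "Chocolate",
                    "Vanilla", "Vanilla", "Vanilla", "Chocolate"],
    "Liked": ["Yes", "Yes", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"],
    "Flavor Rating": [10, 9, 5, 7, 8, 2, 7, 4, 7],
    "Texture Rating": [8, 8, 5, 6, 7, 3, 6, 5, 6],
})

# Grouping alone produces a DataFrameGroupBy object, not a table.
group_by_frame = df.groupby("Base Flavor")
print(type(group_by_frame).__name__)

# An aggregation turns the groups into one row per base flavor.
# numeric_only=True skips the string columns (Flavor, Liked).
means = group_by_frame.mean(numeric_only=True)
print(means)
```

The grouped column becomes the index of the result, which is why "Chocolate" and "Vanilla" show up as row labels rather than as a regular column.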
Really quickly, I wanted to give a huge shout out to the sponsor of this entire
pandas series, and that is Udemy. Udemy has some of the best courses at the best
prices, and it is no exception when it comes to pandas courses. If you want to
master pandas, this is the course that I would recommend; it's going to teach you
just about everything you need to know about pandas. So huge shout out to Udemy
for sponsoring this pandas series, and let's get back to the video.
So right off the bat, as averages, the chocolate-based ones have a much higher
rating overall than the ones with vanilla bases. Now we can actually combine all
of this together into one line, and we can do something like this: we'll say
df.groupby and then mean, just like this, and this will actually run. Before, we
didn't have any aggregating function on there, so it only gave us the GroupBy
object, but now that we combine it all into one it will run properly.
Now there are a lot of different aggregate functions, but I'm going to show you
some of the most popular ones, or the most common ones that you will see. Let's
copy this right here, so we can do .count, and when we run this we can look at
the count, and this will show us the actual count of the rows that were
aggregated. For chocolate we have three, so there are going to be threes all the
way across, and for vanilla we had six. So we're looking at a higher count of
vanilla, which, if you're comparing it to this mean up here, could be a big skew
towards the chocolate, because if you have one or two good chocolates it could
really pull the numbers up, whereas if you had two good vanillas but all the
other ones were bad, it pulls that average down. So knowing the count of
something is really good.
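To make the point about group sizes concrete, here is a small sketch that puts the count next to the mean. The numbers are invented (three chocolate-based rows and six vanilla-based rows, matching the counts mentioned above) and the column names are assumptions.

```python
import pandas as pd

# Hypothetical ratings: 3 chocolate-based rows and 6 vanilla-based rows.
df = pd.DataFrame({
    "Base Flavor": ["Chocolate"] * 3 + ["Vanilla"] * 6,
    "Flavor Rating": [9, 8, 7, 10, 5, 7, 2, 7, 4],
})

grouped = df.groupby("Base Flavor")
counts = grouped.count()  # number of non-null values per column in each group
means = grouped.mean()    # only a numeric column is left, so no extra arguments

print(counts)
print(means)
```

Reading the two tables together shows that the chocolate average rests on half as many rows as the vanilla average, which is worth keeping in mind before comparing them.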
Let's take a look at the next one: we can do min and max, and I'll just run
these really quickly. We can do min, and when we run this, the first thing that
you should notice is that it now has a flavor and a liked column. That's because
min and max compare strings alphabetically: they look at the first letter, and
then at the following letters if several values start the same way, like
chocolate something, and then populate the result. So chocolate, with the c-h,
is the very first, or the minimum, value for that string, and cake batter is the
minimum value in vanilla as well. Now with the liked column it's interesting,
because apparently I liked all the chocolate ones. I'm going to go take a look:
chocolate I liked, chocolate I liked, chocolate I liked. So there is no "no"
option in this liked column for chocolate, so yes was the only option.
Now let's look at max. Whoops. It should do the exact opposite, which is going
to take the highest value, even if it's a string. So Rocky Road: the letter r
comes later in the alphabet, so that's what it's looking at, and so does vanilla,
and then we have yes as well. And then of course right here it's taking the max
value. Before, when we were looking at min, I just focused on the string columns,
but it still does the exact same thing to these integer columns as well. So the
max value for vanilla was mint chocolate chip, which has vanilla as its base, so
I had a rating of 10 for this vanilla row, or grouping. And then we can also
look at the sum, and there are all the sums for these. Again, it only does
integers, because we can't add the strings. Here are the sums, or the total
values, for all of them, and since we had six rows that were grouping into this
vanilla, we now have a much higher score for vanilla.
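A small sketch of min, max and sum on grouped data, again with made-up rows and assumed column names. It shows the string columns being compared alphabetically by min and max. For sum I pass numeric_only=True, since newer pandas versions may otherwise try to concatenate the strings rather than leave them out.

```python
import pandas as pd

# Made-up rows; column names are assumptions for illustration.
df = pd.DataFrame({
    "Flavor": ["Mint Chocolate Chip", "Chocolate", "Vanilla", "Cookie Dough",
               "Rocky Road", "Pistachio", "Cake Batter", "Neapolitan",
               "Chocolate Fudge Brownie"],
    "Base Flavor": ["Vanilla", "Chocolate", "Vanilla", "Vanilla", "Chocolate",
                    "Vanilla", "Vanilla", "Vanilla", "Chocolate"],
    "Liked": ["Yes", "Yes", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"],
    "Flavor Rating": [10, 9, 5, 7, 8, 2, 7, 4, 7],
})

grouped = df.groupby("Base Flavor")

mins = grouped.min()    # strings: alphabetically first value in each group
maxs = grouped.max()    # strings: alphabetically last value in each group
sums = grouped.sum(numeric_only=True)  # totals for the numeric columns only

print(mins)
print(maxs)
print(sums)
```

Note that each column is reduced independently, so the minimum flavor and the minimum rating in a row of the output do not have to come from the same original row.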
Now that's a really simple way to do your aggregations, but there is actually
an aggregation function, and let's take a look at this, because this is a little
bit more complex, although when I write it out I hope it makes a lot of sense.
We can do .agg, so this is our aggregate function, and what we need to pass into
our aggregate function is actually a dictionary. So let's do an open parenthesis
and a squiggly bracket, and then we need to specify what column we're going to
be aggregating on. Let's do this flavor rating; let's copy this, we'll do flavor
rating, and I need to put that as a string. Then we'll do a colon, and now we
can specify what aggregate functions we want. We've done sum, count, mean, min
and max, all of those, and we can actually put all of those in here and perform
all of those aggregations on just one column. So let's make a list, and let's
say mean, max, count and, what's another one, sum. So let's do all four of those
only on this flavor rating column. When we run this, we have our base flavor
right here, chocolate and vanilla, but now, instead of multiple rating columns,
we have one column, flavor rating, with multiple columns of our aggregations
under it. And it is possible to pass in multiple columns like that, so we'll do
texture rating: we'll just come right here and do a comma, then we'll say
texture rating and then a colon. I don't know why I spelled it out when I copied
it, but I did. Then we'll do the exact same ones, and now when we run it we're
getting the exact same columns: mean, max, count and sum for flavor rating, then
mean, max, count and sum for our texture rating.
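The dictionary form of .agg described above might look like this. It is a sketch with invented numbers and assumed column names; the keys of the dictionary are column names and the values are lists of aggregation names.

```python
import pandas as pd

# Made-up ratings; column names are assumptions.
df = pd.DataFrame({
    "Base Flavor": ["Vanilla", "Chocolate", "Vanilla", "Vanilla", "Chocolate",
                    "Vanilla", "Vanilla", "Vanilla", "Chocolate"],
    "Flavor Rating": [10, 9, 5, 7, 8, 2, 7, 4, 7],
    "Texture Rating": [8, 8, 5, 6, 7, 3, 6, 5, 6],
})

# One dictionary entry per column, each with its own list of aggregations.
result = df.groupby("Base Flavor").agg({
    "Flavor Rating": ["mean", "max", "count", "sum"],
    "Texture Rating": ["mean", "max", "count", "sum"],
})
print(result)

# The columns are two-level: (original column, aggregation name).
print(result.loc["Vanilla", ("Flavor Rating", "sum")])
```

Because the result has two-level column labels, a single value is addressed with a tuple such as ("Flavor Rating", "sum").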
Now so far we've only grouped on one column, but we can actually group on
multiple columns. Let's go back up here to our data, and I should have just
copied this down here; let's go back down and just look at this. So really we
only grouped it on this base flavor, but you can do multiple groupings, or group
by multiple columns. Let's do our base flavor, which we did already, as well as
the liked column. So we're going to say df.groupby, then we'll do an open
parenthesis, and then instead of just passing through one string we're going to
do a list, and we'll say base flavor, oops, comma, and then we'll do liked. So
now when it groups this it should put two groupings. Let's run this and just
see; oops, I've got to add an aggregation, let's just do mean. So now we have
our chocolate and our vanilla, and remember chocolate only had yes, so that's the
only one that it's going to group on, but vanilla had a no and a yes. So if we
look at the vanilla, we have our base flavor vanilla, and then within liked we
have a no and a yes, which can show us that within our vanilla, when we group on
these, our nos were really low but our yeses were really high. We actually had a
pretty similar rating, very close to the same rating, as the ones we really
liked in chocolate. And just like we did above, we can take this .agg, and I'm
going to copy this, and it'll perform it on each of those rows. Let me close
that, and what did I do wrong? Oh, I need the squiggly bracket. And it'll show
us each of those, so the mean, max, count and sum for all of the chocolate and
vanilla, as well as the groupings of liked, yes and no.
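Grouping on two columns can be sketched like this, using the same kind of made-up data (column names are assumptions). Passing a list to groupby produces one row per combination that actually occurs, so a chocolate "No" row simply does not appear.

```python
import pandas as pd

# Made-up data: every chocolate-based flavor is liked, vanilla is mixed.
df = pd.DataFrame({
    "Base Flavor": ["Vanilla", "Chocolate", "Vanilla", "Vanilla", "Chocolate",
                    "Vanilla", "Vanilla", "Vanilla", "Chocolate"],
    "Liked": ["Yes", "Yes", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"],
    "Flavor Rating": [10, 9, 5, 7, 8, 2, 7, 4, 7],
    "Texture Rating": [8, 8, 5, 6, 7, 3, 6, 5, 6],
})

by_base_and_liked = df.groupby(["Base Flavor", "Liked"])

# One row per (Base Flavor, Liked) combination present in the data.
means = by_base_and_liked.mean(numeric_only=True)
print(means)

# The dictionary form of .agg works on the two-column grouping as well.
stats = by_base_and_liked.agg({"Flavor Rating": ["mean", "max", "count", "sum"]})
print(stats)
```

The index of both results is two-level, so a row is addressed with a tuple such as ("Vanilla", "Yes").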
Now after we've looked at all that, and that's how I usually do it, there is
one shortcut function that can give you some of these things really quickly. So
let's go back up here and take this. It's just called describe, and if you've
ever used it, it's just going to give you a high-level overview of some of those
different aggregations. Let's run this, and it's going to give us our chocolate
and vanilla, and within each column it's going to give us our count, our mean,
our standard deviation, I believe is what that is, our minimum, the 25%, 50% and
75% values, and our max, which is the 100%, then the count and the mean again for
the next column. So a lot of those aggregate functions, but describe is a very
generalized function: we can't get as specific as we were with the previous ones
that we were looking at. I just wanted to throw this out there in case this is
something that you'd be interested in, because it technically is showing a lot
of those aggregate functions all at one time. So that is our group by and
aggregate functions within pandas.
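For reference, a minimal sketch of describe on grouped data, with invented numbers and assumed column names. Each numeric column gets the same eight statistics: count, mean, std, min, 25%, 50%, 75% and max.

```python
import pandas as pd

# Made-up ratings; column names are assumptions.
df = pd.DataFrame({
    "Base Flavor": ["Vanilla", "Chocolate", "Vanilla", "Vanilla", "Chocolate",
                    "Vanilla", "Vanilla", "Vanilla", "Chocolate"],
    "Flavor Rating": [10, 9, 5, 7, 8, 2, 7, 4, 7],
    "Texture Rating": [8, 8, 5, 6, 7, 3, 6, 5, 6],
})

summary = df.groupby("Base Flavor").describe()
print(summary)

# Statistics for a single column, one row per base flavor.
print(summary["Flavor Rating"])
```

describe is convenient for a first look, while .agg remains the tool for choosing exactly which statistics to compute for which column.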
I hope that that was helpful. I hope that you understood everything that we
were working on. If you like this video, be sure to like and subscribe and check
out all my other videos on Python as well as pandas, and I will see you in the
next video.
[Music]
Hello everybody, today we're going to be talking about merging, joining and
concatenating data frames in pandas. This whole video is basically around being
able to combine two separate data frames together into one data frame. These
join types are really important to understand when we're actually using the
merge and the join. Right here we have what's called an inner join, and the
shaded part is what's going to be returned: it's only the things that are in
both the left and the right data frames. Then we have an outer join, or a full
outer join, and this will take all the data from the left data frame and the
right data frame and everything that is similar, so basically it just takes
everything. We also have a left join, which is going to take everything from the
left, and then if there's anything that's similar it'll also include that. And
the exact opposite of that is the right join, which is going to give us
everything from the right data frame and everything that is similar, but it's
not going to give us anything that is just unique to the left data frame. So
this is just for reference, because in a little bit, when we start merging
these, these become very important, so I just wanted to show you how that works
visually.
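As a rough preview of the four join types from the diagrams, here is a tiny sketch. The two frames, the key column and the value columns are all hypothetical and exist only to show which keys survive each kind of join.

```python
import pandas as pd

# Two toy frames that share keys 2 and 3 (all names here are hypothetical).
left = pd.DataFrame({"key": [1, 2, 3], "left_val": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "right_val": ["x", "y", "z"]})

keys_by_how = {}
for how in ["inner", "outer", "left", "right"]:
    merged = left.merge(right, on="key", how=how)
    keys_by_how[how] = sorted(merged["key"].tolist())
    print(how, keys_by_how[how])
```

Inner keeps only the shared keys, outer keeps every key from both sides, and left or right keep every key from the named side plus whatever matches from the other.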
So let's get started by pulling in our files. First we're going to say import
pandas as pd; we'll run this, and then we'll say data frame one, and we'll also
have a data frame two. These are the different data frames, the left and the
right data frame, that we'll be using to join, merge and concatenate. So we'll
say data frame one is equal to pd.read_csv, and we'll do r, and here is our file
path. So we have this file, that's our Lord of the Rings CSV, and let's call
that really quickly so we can see what's in there. And I'm having a dyslexic
moment, because I typed the file name wrong; I apologize for that. But this is
our data frame one. We have three columns: their fellowship ID, 1001, 1002, 1003
and 1004; their first name, Frodo, Samwise, Gandalf and Pippin; and their
skills, hiding, gardening, spells and fireworks. So this is our very first data
frame that we're going to be working with. Let's go down a little bit, let's
pull this down here, and we're just going to say data frame two, and this is the
Lord of the Rings 2 file, so let's pull this one in. Now as you can see it's
very similar. We have fellowship ID 1001, 1002, 1006, 1007 and 1008, so we have
three different IDs here; we don't have 1006, 1007 and 1008 in this first data
frame. We also have the first name, so Frodo and Samwise are in both the first
and the second data frame, but now we have three new people, Boromir, Elrond and
Legolas, and now we have this age column, which again is unique to just this
second data frame.
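To follow along without the original files, the two data frames can be approximated as below. This is a sketch under assumptions: the column names (FellowshipID, FirstName, Skills, Age), the order of the three new rows and all of the ages are invented, and the CSV text is read from in-memory strings instead of a file path.

```python
import io
import pandas as pd

# Stand-ins for the two CSV files; contents are assumptions based on the
# values read out in the video, and the ages are made up.
lotr_csv = """FellowshipID,FirstName,Skills
1001,Frodo,Hiding
1002,Samwise,Gardening
1003,Gandalf,Spells
1004,Pippin,Fireworks
"""
lotr2_csv = """FellowshipID,FirstName,Age
1001,Frodo,50
1002,Samwise,39
1006,Legolas,2931
1007,Elrond,6520
1008,Boromir,51
"""

# pd.read_csv accepts a file path or any file-like object.
df1 = pd.read_csv(io.StringIO(lotr_csv))
df2 = pd.read_csv(io.StringIO(lotr2_csv))

print(df1)
print(df2)
```

With real files, the call would take a path instead, typically written as a raw string (the r prefix mentioned above) so that backslashes in Windows paths are not treated as escape characters.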
Really quickly, I want to give a huge shout out to the sponsor of this video,
and that is Zendesk. I've been using Zendesk for my company's customer
analytics and it has been absolutely phenomenal. They're going to be hosting a
conference called Zendesk Relate on May 10th, and they're going to talk all about
customer analytics, chatbots and AI in this space. You can attend in person in
San Francisco or you can attend virtually, but space is limited, so be sure to
apply if you want to attend. So if you are a business leader and you want to
make the most out of your customer data, or you want to learn customer data
analytics, I will leave links in the description. Again, huge shout out to
Zendesk for sponsoring this video.
Now the first one that I want to look at is merge, and I want to look at merge
first because I think this one is the most important; I use this one more than
any of the ones that we're going to talk about today. The merge is just like the
joins that we were just looking at, the outer, the inner, the left and the
right, and there's also one called cross, and I'll show you that one, although
if I'm being honest I don't really use that one that much, but it's worth
showing just in case you come into a scenario where you do want to do that. So
let's go right down here, and I want to be able to see these while we do it.
We're going to say df1, and when we specify df1 as the very first data frame and
say df1.merge, this is automatically going to be our left data frame. Then if we
do our parentheses right here and we say df2, this is our right data frame, and
let's see what happens when we do this.
So what it's going to do, and we didn't specify this, it's just a default, is
an inner join, so it's only going to give us an output where the specific
values, or the keys, are the same. Now you can't see this, but what is happening
is it's taking this fellowship ID and saying, I have 1001 here and 1002 here;
this is the exact same as up here with this fellowship ID of 1001 and 1002. But
when we look at 1003 and 1004, those aren't in this right data frame, and 1006,
1007 and 1008 are not in this left data frame, so the only ones that match are
1001 and 1002, and that's why they get pulled in down here. But because we
didn't explicitly say here's what I want to join or merge on between these two
data frames, it actually is looking at the fellowship ID and the first name, so
it's also matching these values of Frodo and Samwise, which are the same in
both, which is why it pulled them over. But really quickly, let's just check and
make sure that we did it on the inner join, because again we didn't specify
anything; that was just the default. So we're going to say how is equal to, and
then we'll say inner, and if we run this it's going to be the exact same,
because again inner is the default.
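The default behavior described here can be checked with a short sketch. The frames are rebuilt by hand with assumed column names and made-up ages. With no arguments, merge does an inner join on every column name the two frames share, which here means both FellowshipID and FirstName.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames (column names and ages assumed).
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

default_merge = df1.merge(df2)             # df1 is the left frame, df2 the right
inner_merge = df1.merge(df2, how="inner")  # spelling out the default

print(default_merge)
print(default_merge.equals(inner_merge))
```

Only the rows for 1001 and 1002 come back, and the shared columns appear once each, followed by Skills from the left frame and Age from the right.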
But now, just to show you how it's joining these two data frames together, I'm
going to say on is equal to, and then I'm only going to put fellowship ID, so
let's run this. Now the first thing that you may have noticed is this first name
underscore x and this first name underscore y. What the merge does as a default
is this: when you are only joining on fellowship ID, we have this right data
frame with fellowship ID and the left data frame with fellowship ID, and if
you're just joining on these and you're not joining on the two first name
columns, then it's going to separate those into an underscore x and an
underscore y. Even though they have the exact same values, since we are not
merging on that column it automatically separates that into two separate
columns, so we can see the values within each of those columns. If we go into
this on and we make a list, let's do it like that, and we say comma and then we
write first name, oops, first name, and then we run this, it's going to look
exactly like it did before. Again, it automatically pulled in both of these
columns when it was merging the first time, even though we didn't write
anything, but if we actually write this out, it's doing exactly what it was
doing when we just had df2; we're just now writing it out.
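The effect of the on argument can be sketched like this, using hand-built frames with assumed column names and made-up ages. Joining on FellowshipID alone leaves two FirstName columns, which pandas disambiguates with the _x and _y suffixes; listing both columns in on reproduces the default result.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames (column names and ages assumed).
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Join on the ID only: FirstName exists on both sides, so it is split in two.
on_id = df1.merge(df2, how="inner", on="FellowshipID")
print(on_id.columns.tolist())

# Join on both shared columns: same result as calling df1.merge(df2).
on_both = df1.merge(df2, how="inner", on=["FellowshipID", "FirstName"])
print(on_both)
```

The _x column holds the left frame's values and the _y column holds the right frame's values.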
just now writing it out now there are other arguments that we can pass into
other arguments that we can pass into this merge function let's hit shift Tab
this merge function let's hit shift Tab and let's scroll down here so within
and let's scroll down here so within this merge function we have a lot of
this merge function we have a lot of different arguments that you can pass
different arguments that you can pass into it first we have this right which
into it first we have this right which is the right data frame which is this
is the right data frame which is this data frame two then we have the how and
data frame two then we have the how and the on which we've already shown how to
the on which we've already shown how to do there's a left on right on left Index
do there's a left on right on left Index right index not something you'll
right index not something you'll probably use that much but you
probably use that much but you definitely can if you want to look into
definitely can if you want to look into that and there's all these doc strings
that and there's all these doc strings which show you exactly how to use all of
which show you exactly how to use all of these so if you're interest in looking
these so if you're interest in looking at the left and the right and the left
at the left and the right and the left index it's all in here the one that is
index it's all in here the one that is really good is the sort and you can sort
really good is the sort and you can sort it saying either it's false or true then
it saying either it's false or true then we have these suffixes now if you
we have these suffixes now if you remember when we took these out what it
remember when we took these out what it automatically did was it put in these
automatically did was it put in these underscore X and underscore y you can
underscore X and underscore y you can customize that and you can put in what
customize that and you can put in what whatever you'd like instead of the
whatever you'd like instead of the underscore X andore Y you can put in
underscore X andore Y you can put in some custom um string for that we also
some custom um string for that we also have an indicator and a validates again
have an indicator and a validates again all things you can go in here and look
all things you can go in here and look at I'm just going to show you the stuff
at I'm just going to show you the stuff that I use the most so these things
that I use the most so these things right here are things that I definitely
right here are things that I definitely use the most so now that we've looked at
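As a rough sketch of the suffixes and indicator arguments described above: the two small data frames below are stand-ins built by hand, and the Skills and Age values are assumed purely for illustration.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Merging on the ID only leaves FirstName in both frames, so it gets suffixes.
custom = df1.merge(df2, how="inner", on="FellowshipID",
                   suffixes=("_left", "_right"), sort=True)
print(custom.columns.tolist())

# indicator=True adds a _merge column recording where each row came from.
flagged = df1.merge(df2, how="outer", on=["FellowshipID", "FirstName"],
                    indicator=True)
print(flagged[["FellowshipID", "_merge"]])
```

The _merge column takes the values both, left_only, and right_only, which can be handy for checking what an outer merge actually matched.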
So now that we've looked at the inner join, let's copy this right down here and look at
the outer join. These get a little bit more tricky; I think the inner join is probably the
easiest one to understand. We'll look at the outer, which is spelled o-u-t-e-r. I don't
know why I always want to say o-t-t-e-r, but let's run this and see what we get. Now this
looks quite different. The inner join only gave us the rows whose values are the exact
same in both data frames. This one is going to give us all of the rows, regardless of
whether they match. So we have 1001, 1002, 1003, 1004, 1006, 1007, and 1008. Let's scroll
back up here: we have 1001 through 1004 in the left data frame, and 1001, 1002, 1006,
1007, and 1008 in the right, so we don't have a 1005. And then if you notice in this data
frame right here, if we can't join on the FellowshipID or the FirstName, like Legolas, who
doesn't have a matching value in the left data frame, it just gives us NaN, which is Not a
Number, and it's going to do that for any value where it couldn't find a match on either
the ID or the first name. In Age we also have that for the ones that weren't in the right
data frame. We only had 1001 and 1002 there, so we'll have the age for both Frodo and Sam,
but for Gandalf and Pippin we don't have their corresponding IDs, so it's just going to be
blank for Gandalf and Pippin, and you can see that right here. So again, outer joins are
kind of the opposite of inner joins: they return everything from both, and if there is
overlapping data, it won't be duplicated.
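The outer join walked through above might be reproduced like this. It is a minimal sketch on hand-built frames; the Skills and Age values are assumed, not taken from the video's files.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Outer merge keeps every row from both frames; unmatched cells become NaN.
outer = df1.merge(df2, how="outer", on=["FellowshipID", "FirstName"])
print(outer)

# Count the gaps: Age is missing for left-only rows, Skills for right-only rows.
print(outer[["Skills", "Age"]].isna().sum())
```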
Now let's go on to the left join. I'm going to pull this down right here, and now we're
just going to say how is equal to left, and let's run this. What this is going to do is
take everything from the left table, or the left data frame, right here, so everything
from data frame one. Then, if there is any overlap, it'll also pull in whatever we're able
to merge on from data frame two. So let's go back up to our data frames one and two. It's
going to pull everything from this left data frame, because we're specifying a left join,
so everything from the left data frame will be in there. We're also going to try to bring
in everything from the right, but only if it matches or is able to merge, so just this
information right here will come over. We weren't able to join on 1006, 1007, or 1008, so
really none of that information is going to come over. Let's go down and check on this.
Again we have 1001, 1002, 1003, and 1004, all of the data with this first name and skills;
everything is in here. But then we are trying to bring over the age, and we only have
matches with 1001 and 1002, so only these two values will come in.
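A small sketch of the left join just described, again on hand-built frames whose Skills and Age values are assumed for illustration:

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Left merge: every row of df1 survives; Age is filled only where df2 matches.
left = df1.merge(df2, how="left", on=["FellowshipID", "FirstName"])
print(left)
```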
Let's look at the right join, because it's basically the exact opposite of the left, in
that now we're keeping everything from the right-hand data frame, and if there's something
that matches in data frame one, then we pull that in. So this basically just looks like
data frame two, except we're pulling in that Skills column, and since only 1001 and 1002
are the same, that's why only their Skills values are here.
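The right join can be sketched the same way. The frames are hand-built and their Skills and Age values are assumed.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Right merge: every row of df2 survives; Skills is filled only where df1 matches.
right = df1.merge(df2, how="right", on=["FellowshipID", "FirstName"])
print(right)
```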
Now, those are the main types of merges that I use when I'm trying to merge a data frame,
but there is also one called a cross, or cross join. Let's look at this one, and this one
is quite a bit different. Here we go, let's run this. This one is different in that it
takes each row from the left data frame and pairs it with each row in the right data
frame. So for Frodo in this left data frame, it pairs him with Frodo in the right data
frame, then Samwise, Legolas, Elrond, and Boromir, all from the right data frame. Then it
goes to the next value, Samwise, and does the exact same thing: Frodo, Samwise, Legolas,
Elrond, Boromir. And it does that for every single row. Let's go right back up here. It's
taking this 1001 and pairing it with rows 1, 2, 3, 4, 5; then it's taking Samwise and
pairing it with 1, 2, 3, 4, 5; Gandalf, 1, 2, 3, 4, 5; Pippin; and you kind of see that
pattern. That's what a cross join is. In my opinion there are very few reasons for a cross
join, although if you ever do an interview where you're being interviewed on Python, you
will sometimes be asked about cross joins. But there aren't a lot of instances in actual
work where you really need a cross join.
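A cross join along the lines described above could look like the sketch below. It assumes pandas 1.2 or newer, where how="cross" is available, and uses hand-built frames with assumed Skills and Age values.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Cross merge pairs every left row with every right row, so no "on" is passed.
crossed = df1.merge(df2, how="cross")
print(crossed[["FirstName_x", "FirstName_y"]])
print(len(crossed))  # 4 left rows times 5 right rows
```

Because the result has one row per pair, its size is the product of the two row counts, which is one reason cross joins are rarely wanted on large tables.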
Now let's take a look at joins. The join function is pretty similar to the merge function
and can do a lot of the same things, except in my opinion the join function isn't as
easily understood as the merge function. It's a little bit more complicated, but let's
take a look and see how we can join these data frames together using the join function.
Let's go right up here. We're going to say df1.join and then pass in df2, very similar to
how we did it before, and let's try running this. It's not going to work. When we did the
merge function, it had a lot of defaults for us. Let's go down and see what this error is.
It says the columns overlap but no suffix was specified. So it's telling us that
FellowshipID and FirstName appear in both data frames, and unlike the merge, it's not able
to distinguish which is which, so we need to go in there and kind of help it out a little
bit. Again, a little bit more hands-on than the merge.
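The error described above can be reproduced with a short sketch. The frames are hand-built with assumed Skills and Age values, and the try/except is only there so the snippet runs to completion.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# join() matches on the index by default, so the shared column names collide.
message = None
try:
    df1.join(df2)
except ValueError as err:
    message = str(err)
    print(message)
```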
But let's see what we can do to make this work. Let's do a comma and we'll say on, and
really quickly let's open this up and see what we have. This one has fewer options than
the merge does. We have other, and that's our other data frame. We can do on, where we
specify what column we want to join on. Then we can look at how: do we want it to be a
left, an inner, an outer, the same kinds of joins as the merge. Then we have lsuffix and
rsuffix, and that right there is part of the issue that we were just facing, which is that
those columns have the same names. But if we set lsuffix, it'll add whatever string we
want to specify to columns that are in both the left and the right, so we can give them
unique names and we'll no longer have that issue. And then we can also sort it like we did
on the other one.
But anyways, let's go back to our on. We'll say on is equal to FellowshipID. Let's try
running this, and we're still getting an error. It's just not as simple as the merge, so
let's keep going. Now let's specify the type, so we'll say how is equal to outer. If we
run this, it still doesn't work; we're still getting the exact same issue with the left
suffix and the right suffix. So now let's finally resolve it. I just wanted to show you
how it was a little bit more frustrating. Now let's say lsuffix is equal to, and when we
did the merge it automatically used _x, but let's do _left, then a comma, and we'll say
rsuffix is equal to _right. Now when we run this it should work properly. Let's run this.
So this is our output, and it obviously looks quite a bit different. Over here we have
this FellowshipID, and then we also have FellowshipID_left, FirstName_left,
FellowshipID_right, and FirstName_right. So it just doesn't look right.
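A sketch of the call that finally runs, using hand-built frames with assumed Skills and Age values. Note that with on set, join compares the left FellowshipID column against the right frame's index (0 through 4 here), which is why the result looks odd.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# The suffixes resolve the name clash, but the keys still don't line up,
# because the left column is matched against the right frame's default index.
joined = df1.join(df2, on="FellowshipID", how="outer",
                  lsuffix="_left", rsuffix="_right")
print(joined.columns.tolist())
print(joined)
```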
Now, something I didn't specify when I first started this, because I kind of wanted to
show you, is that the join is usually better for when you're working with indexes. Before,
when we were using the merge, we were using the column names, and that worked really well
and was pretty easy to do. But as you can see right here, when we're trying to use these
column names with join, it's not working exceptionally well. Let's go ahead and create our
index, and then I can show you how this actually works, and how it works a little bit
better when we're working with just the index. You can get it to work just the same as the
merge; it's just a lot more work. So let's go right down here and say df4, so we'll create
a new data frame. We'll say df1.set_index, open parenthesis, and we'll say we want to set
this index on the FellowshipID, and then we're going to do the join. So we're setting that
index on the FellowshipID, and now we're going to join it with df2.set_index, and we're
also going to do that on the FellowshipID, so I'll just copy this. Oh geez, I hate it when
I do that. Okay, now we also want to specify the left and the right suffix, so I'll just
copy this, as we do need to specify it. Now let's try running data frame 4.
So really quick, just to recap: we were doing the same thing above, right? We have this
join, where we were joining data frame one with data frame two. Now we're joining data
frame one with data frame two again, except in both instances we're setting the index as
FellowshipID, so we're now joining on that index. Let's run this, and this should look a
lot more similar to the merge than the join that we did above, except now the FellowshipID
right here is actually an index, so it's just a little bit different. But we can still go
in here and do how is equal to outer. So we can still specify our different types of
joins, or the different ways that we can merge or join these data frames together. Again,
it's just a little bit different, and that's why for most instances I'm using that merge
function, because it's just a little bit more seamless, a little bit more intuitive. The
join function can still get the job done, but as you can see it takes a little bit more
work.
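The index-based join recapped above might be sketched as follows. The frames are hand-built with assumed Skills and Age values, and the suffix strings are arbitrary choices.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Put FellowshipID in the index on both sides, then join on that index.
# FirstName still exists in both frames, so suffixes are still required.
df4 = df1.set_index("FellowshipID").join(
    df2.set_index("FellowshipID"), lsuffix="_Left", rsuffix="_Right"
)
print(df4)

# The join type can still be changed; the default for join() is a left join.
df4_outer = df1.set_index("FellowshipID").join(
    df2.set_index("FellowshipID"), lsuffix="_Left", rsuffix="_Right", how="outer"
)
print(df4_outer)
```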
Now let's look at concatenate. Concatenating data frames can be really useful, and the
distinction between a merge or join and a concatenate is that the concatenate is kind of
like putting one data frame on top of the other, rather than putting one data frame next
to the other, which is what the merge and the join do. So concatenating is just a little
bit different in how it operates, but let's actually write this out and see how it looks.
Let's go up here and say pd.concat, open parenthesis, and then we're going to concatenate
data frame one comma data frame two, passed in as a list. That's all we have to write, so
let's run this. Just like I said, it literally took the first data frame, 1001 through
1004, and put it on top of the right data frame, 1001, 1002, 1006, 1007, 1008. So that is
our left data frame, this is our right data frame, and they're literally just sitting one
on top of the other. But just like when we merged with either a left or a right, when you
have these Skills and there aren't any values to populate them, it's going to say NaN. And
since we're not actually joining, we're not joining on 1001 and 1002, even though this row
and this row are the same; it's not populating that value because, again, we're not
joining these together. We're just concatenating, putting one on top of the other.
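A minimal sketch of the stacking behavior described above, on hand-built frames with assumed Skills and Age values:

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# concat takes a list of frames and stacks them; no row matching happens.
stacked = pd.concat([df1, df2])
print(stacked)
```

Frodo and Samwise each appear twice here, once per source frame, and the original row labels are kept, so the index repeats.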
Now if we go into this concat and hit Shift+Tab, there are a lot of different things that
we can do. There's the axis, which, if you remember, the zero axis is the left-hand index
and the axis of one is the top index, which is the columns, so you can specify that. We
can also do joins, and this is the one that I'm going to take a look at, but there are
other ones that you can look into as well. Let's look at join. Let's do a comma and say
join is equal to inner. Let's see what happens with this. As you can see, it is only
taking the columns that are the same; that's what this inner is doing. It's joining these
columns together, and the ones that were different it didn't take, because again we
weren't able to combine them; they aren't shared between both frames. Let's do an outer,
and now it's going to take all of them. Like I said, that's doing this on these columns
right here, but we can also do it on the other axis as well. So let's go ahead and say
axis is equal to one, and when we run this, now it's joining on this index right here of
0, 1, 2, 3, 4. So now these rows are being joined together and it's putting them side by
side, much like a merge would.
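The join and axis options for concat can be sketched like this, using hand-built frames with assumed Skills and Age values:

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# join="inner" keeps only the columns both frames share.
inner = pd.concat([df1, df2], join="inner")
print(inner.columns.tolist())

# axis=1 lines the frames up side by side on the row index (0 to 4).
side_by_side = pd.concat([df1, df2], axis=1)
print(side_by_side)
```

With axis=1 the rows are aligned by index label only, so row 4 has NaN in the df1 columns because df1 has no row with that label.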
So that's how concatenate works, and I'm going to show you one more thing. Again, it's not
up here in this title, because it's not one that I recommend, but there is one called
append. The append function is used to append rows from one data frame to the end of
another data frame and return that new data frame. So let's do df1.append, open
parenthesis, and pass in df2, very similar to how we've been doing other things, and let's
run this. As you can see, this is almost exactly what the concatenate did when we first
ran it. But if we read this warning, it's saying the frame.append method is deprecated and
will be removed from pandas in a future version; use pandas.concat instead. So it's
literally warning us that append is on its way out. If you want to do exactly what you're
doing right here, go and use concat, because that'll do the exact same thing. So I'm not
really going to show you any other variations of append, because there's no reason to;
it's going to be on its way out in a future version.
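Since DataFrame.append was removed in pandas 2.0, the sketch below shows the concat call that the warning points to. The frames are hand-built with assumed Skills and Age values, and ignore_index=True is an optional extra that renumbers the rows.

```python
import pandas as pd

# Hand-built stand-ins for the two data frames; Skills and Age values are assumed.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Replacement for the old df1.append(df2): stack the rows with concat.
appended = pd.concat([df1, df2], ignore_index=True)
print(appended)
```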
So that is our video on merge, join, and concatenate, and append as well, in pandas. I
hope that was helpful, and I hope that you learned something. This stuff is really
important, because oftentimes you're not just working with one CSV or one JSON or one text
file; you're working with multiple of them, and you need to combine them all into one data
frame. So this is a really, really important concept to understand. With that being said,
be sure to like and subscribe, check out all my other videos on Python and pandas, and I
will see you in the next video.
[Music]
Hello everybody, today we're going to be building visualizations in pandas. In this video
we'll look at how we can build visualizations like line plots, scatter plots, bar charts,
histograms, and more. I'll also show you some of the ways that you can customize these
visualizations to make them just a little bit better. With that being said, let's go right
over here and start importing our libraries. We'll start with import pandas as pd, and
this one is really all you need to import to create the visualizations in pandas. But we
may get a little bit crazy, so we're going to do a few different ones as well, like import
numpy as np, and then import matplotlib.pyplot as plt. Now, I may or may not use these;
when I get into visualizations I may want to change some different things, so we're going
to at least have them here in case we do want to use them. Let's go ahead and run this.
So now let's get the data set that we're going to be using. Let's say df is equal to
pd.read_csv, and let's get this file path in right here. We're going to be using these ice
cream ratings, so let's take a look at this really quickly. Now, these values are
completely randomly generated; they are not real in any way. But that's what we're going
to be using, because I just wanted something kind of generic, something that wouldn't be
too confusing, just something that we could use and that you can understand as just
numerical values. Let's also set that index really quick. We'll say df.set_index, pass in
Date, and assign that back to the data frame. Now we have this Date column right here as
our index, so we have January 1st, 2nd, 3rd, 4th, and then we have our ratings right here.
Again, these are all just integers, and they make it really easy to demonstrate how you
can visualize data, so that's why we're using them today.
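The loading step above could be sketched as follows. The CSV text is an inline stand-in for the ice cream ratings file, and apart from Date, the column names and all the numbers are assumptions made for illustration.

```python
import io

import pandas as pd

# Inline stand-in for the ratings file; rating column names and values are assumed.
csv_text = """Date,Flavor Rating,Texture Rating,Overall Rating
1/1/2022,7,4,5
1/2/2022,8,6,7
1/3/2022,3,9,6
1/4/2022,5,5,5
"""

# read_csv accepts a file path or any file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

# set_index returns a new frame, so assign the result back.
df = df.set_index("Date")
print(df)
```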
So the way that we visualize something in pandas is we use something called plot. So let's
just take our DataFrame, we'll do df.plot, and we'll do our parentheses. Now let's go in
here really quickly, let's hit Shift+Tab, and this is going to come up. This is pretty
important because this is going to tell us what we can do within this plot. Unfortunately
there isn't a quick overview, we just have this docstring, but we have our parameters right
here. These are what we can pass in to customize our visualization. So the data is going to
be our DataFrame, then we have our x and y labels. We can specify the kind, and this one's
important because you can specify what kind of visualization we want. We can do a line plot,
a horizontal or a vertical bar plot, a histogram, a box plot, and then a few others
including area, pie, density, all these other things. We can also specify if we want
subplots. And a lot of these things that I'm specifying, I'm going to show you how to do.
You can use different indexes, you can add titles, add grids, legends, styles, all these
different things. I mean, you can go through here, because there are a lot, but you can
specify and customize all of these things. We won't be going into all of them, but I will
show you some of the ones that I probably use the most and that I think are the most useful
to know right away.
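Outside a notebook there is no Shift+Tab, so here is a small sketch of one way to read the same docstring from plain Python. The one-column DataFrame is a made-up placeholder.

```python
import pandas as pd

# Placeholder data, only here so there is a DataFrame to inspect (values are made up).
df = pd.DataFrame({"Flavor Rating": [0.2, 0.7, 0.5]})

# df.plot is an accessor object; its docstring lists parameters such as kind and subplots.
doc = df.plot.__doc__
print(doc[:500])  # first part of the docstring; help(df.plot) prints all of it
```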
So let's get out of here, and we're just going to do df.plot. When we run this we'll get
this right here, and that was super, super easy. We created a line plot by literally doing
just about nothing, because by default it's going to give us a line plot. So if we come up
here and say kind, and let me get that out of the way, is equal to line, and we run this.
So by default, without us actually having to input anything, it's giving us that line plot,
but we can specify that it's a line plot. As you can see, we already have all of our data
right here. We didn't have to specify anything, it automatically took it in. It is
visualizing all three of these columns, and it has this little legend right here, and we can
specify where we want that; there is an argument to be able to do that. It also gave us
these tick marks of 0.2, 0.4, 0.6, 0.8, 1.0. Again, it read the data in and saw that it's
only going from 0.0 to 1.0, that is the peak, and so it automatically gave us these ticks.
Again, that's another thing that you can specify. We can make it go up to 2, 5, 10, 1,000,
whatever you want it to be. And then we're doing this based off of this Date value right
here.
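A rough sketch of the default line plot, with the legend moved and the ticks overridden. The data and column names are made up, and moving the legend through `ax.legend(loc=...)` is one matplotlib way to do it, not necessarily the argument referred to in the video.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

ax = df.plot(kind="line")                  # same result as df.plot(): line is the default
ax.legend(loc="upper left")                # reposition the legend on the returned Axes
ax.set_yticks([0.0, 0.5, 1.0, 1.5, 2.0])   # replace the automatic tick marks
```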
Really quickly, I wanted to give a huge shout out to the sponsor of this entire pandas
series, and that is Udemy. Udemy has some of the best courses at the best prices, and it is
no exception when it comes to pandas courses. If you want to master pandas, this is the
course that I would recommend. It's going to teach you just about everything you need to
know about pandas. So huge shout out to Udemy for sponsoring this pandas series, and let's
get back to the video.
If we wanted to break these out by the actual column, we could go in here and say subplot is
equal to True, and it's actually subplots, whoops. Now we can run that, and we can see each
of those columns being broken out by themselves. Instead of them all being in one
visualization, it's now three separate visualizations.
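A short sketch of the `subplots=True` argument, again on made-up data with assumed column names. With three columns, pandas returns one Axes per column.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

# Note the plural: the keyword is subplots, not subplot.
axes = df.plot(kind="line", subplots=True)
print(len(axes))  # one Axes per column
```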
Now let's go right over here. We're going to get rid of the subplots. I want to show you
just some of the different arguments that you can use to make this look nice, because I
don't want to do this on every single visualization, I just want to show you what you can
do. So we have this one right here. We can add a title. Notice there's no title or anything
really telling us what that is, so we can say comma title and we'll say Ice Cream Ratings.
If we run this, we now have this nice title right here. Now we can also customize the
labels, or the titles, for the x- and y-axis. It automatically took this Date, which is
right here, this is our Date index. It automatically took that for us, but we can customize
that if we'd like to. All we have to do is comma, and then we'll say xlabel is equal to, and
so our x is this Date one right here, and we can say Daily Ratings. And then we can do the
ylabel, we'll say ylabel is equal to, and for this one we can say Scores. Hope you cannot
hear my dog in the background because they're being insane, but let's go ahead and run this,
and now we have these Daily Ratings on the x-axis and on the y-axis we have Scores.
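The title and axis-label arguments can be sketched like this. The data is made up, and passing `xlabel` and `ylabel` straight to `plot` assumes pandas 1.1 or newer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

ax = df.plot(
    kind="line",
    title="Ice Cream Ratings",
    xlabel="Daily Ratings",  # replaces the index name ("Date") picked up automatically
    ylabel="Scores",
)
```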
Now let's go right down here and start taking a look at our next kind of visualization,
which is going to be a bar plot. So we'll do df.plot, we'll do kind is equal to, and for
this one we're going to say bar. Now this is what your typical bar plot will look like, and
a lot of the arguments that we just did on the line plot you can also apply to this bar
plot. Something that's unique to the bar plot is that you can also make it a stacked bar
plot. All we have to do is go in here, we'll say comma, and we'll say stacked is equal to
True. So now this is going to make it a stacked bar chart instead of just your regular bar
chart. Let's go ahead and run this, and as you can see this is now stacked on top of one
another, with each of these columns representing the values that they have.
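A small sketch of the stacked bar chart on made-up data (column names are assumptions). Each date gets one bar made of three stacked segments, one per column.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

# stacked=True piles the columns on top of each other instead of placing them side by side.
ax = df.plot(kind="bar", stacked=True)
print(len(ax.patches))  # one rectangle per value in the DataFrame
```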
Now we don't always have to do every single column, we can also specify the column that we
want. So let's take the Flavor Rating for example. We could do Flavor, oops, Flavor Rating,
good night, Flavor Rating, and then it's only going to take in that Flavor Rating column.
And if you notice, we don't have a legend. That's only when you have multiple columns, and
we are only looking at this one column, so all the values are right here.
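Plotting a single column might look like the sketch below. Selecting the column with square brackets is an assumption about what was typed, and the data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

# Selecting one column gives a Series; plotting a Series draws no legend by default.
ax = df["Flavor Rating"].plot(kind="bar")
```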
Now in this bar chart it automatically defaults to a vertical bar chart, but you can change
it to a horizontal bar chart. Let's go ahead and take a look at how to do that. Bring back
all of them, we'll do df.plot dot and then we'll say barh. And I don't know if I can keep
that kind equals bar, let me run this. Yeah, I need to get rid of that, because barh is its
own function. So now I'm going to run this. It should just have a stacked bar chart, except
now it should be horizontal. So now you can see this worked properly. It's basically the
exact same thing as a vertical bar chart, just now horizontal, which may look better,
especially depending on if you have values like this, or something else that just looks
better being horizontal.
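The horizontal version, sketched on made-up data with assumed column names. Since `barh` is its own method on the plot accessor, no `kind` argument is passed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

# barh already implies the kind; stacked still works the same way.
ax = df.plot.barh(stacked=True)
```

Writing `df.plot(kind="barh", stacked=True)` is an equivalent spelling.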
Now the next one that we're going to take a look at is the scatter plot. So we're going to
say df.plot.scatter, and if we run this we're going to get an error. What we need in order
to run this properly is to specify the x- and the y-axis in order for this scatter plot to
work. So let's go here and we'll say x is equal to, and we can take any of our columns that
we have up here. So we'll say x is equal to Texture Rating, and then, oops, y is equal to,
we'll do Overall Rating. Now when we run this it should work properly. Let's go ahead and
take a look. Now if we go in here and we do Shift+Tab, we can also see some other things
that we can specify. So let's go right down here. We have our x and we have our y, and those
are the ones that we just did. We can also pass through an s, which is going to change the
size of the actual dots right here in our scatter plot. Then we can also do a c, which is
the color of each point. Let's start with the s. Let's say s is equal to, let's just do 100.
Let's see what that looks like. So we have much larger dots. Let's do 500 and see what that
looks like. So we can make these much larger on our visualization, depending on what you're
looking for. We can also look at the color. Let's put comma c, so for color we can say color
is equal to, and let's do yellow. Let's see if this works. So now we've changed it to
yellow. That looks absolutely terrible, but it does work.
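A sketch of the scatter call with the size and color arguments. The data and column names are made up, and `c="yellow"` is used here as the color keyword.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

# x and y are required for a scatter plot; s sets the marker size and c the color.
ax = df.plot.scatter(x="Texture Rating", y="Overall Rating", s=500, c="yellow")
```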
Now let's move on to the histogram. A histogram is always a good one. It's very similar to
something like a bar chart, but what's great about a histogram is you can specify the bins.
So let's go ahead and say df.plot.hist, then we'll do an open parenthesis, and let's go
ahead and hit Shift+Tab in here and take a look at this one as well. So some of our
parameters are the actual columns of the DataFrame that we want to pull in. You can choose
the bins, and they have a default of 10 in here. So let's take a look at how this works.
We'll just run this as it is. So this is by default what this histogram is going to look
like. Let's go ahead and specify our bins. It was 10 by default, let's just do 20 and see
what that looks like. So there are smaller columns right off the bat. And remember,
histograms are really good for showing the distribution of variables, that's really what a
histogram is for. But of course, since these are completely random numbers, this histogram
isn't going to make any sense at all, but you can at least see visually how it works. And if
I didn't mention it before, which I should have, the bins represent how many bars, or
intervals, there are down here. So if we just do one, there's only going to be one very
large bar. We could even go further down from 10 and do five, so now there's only one, two,
three, four, five. So the distribution gets smaller and things get more compact. As you
spread it out again, like if we did 100, it's going to spread it out a lot. And this is what
it shows, it's showing the distribution of the values across however many bins you want. So
the 10 by default usually is pretty good for a lot of different things.
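To make the effect of `bins` concrete, here is a small sketch on made-up data (column names are assumptions). With 20 bins and three columns, pandas draws 20 bars per column over a shared set of bin edges.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

# bins is the number of equal-width intervals the value range is cut into (default 10).
ax = df.plot.hist(bins=20)
print(len(ax.patches))  # bars drawn: bins times number of columns
```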
Now let's go down here and look at the box plot. The box plot is a pretty interesting one.
Let's go ahead and visualize it really quickly, and then I'll explain how this one works.
Let's do df.boxplot. Let's run this. Really what we're looking at is some different markers
within our data. This line right here is the minimum value within that column. We also have
the bottom of the box, which is the 25th percentile of all the values within just this
column. This is 50%, then we have 75%, and then up here we have our maximum value. So I can
take a glance at this and see that we have a low minimum, a high maximum, and it definitely
skews towards the lower range. Whereas if I look over here, we have a lower minimum and a
higher maximum, and you can see that this median point is at 0.6 versus 0.4 over here, so
the skew is a lot higher.
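The markers described above can be checked against `describe()`, as in this sketch on made-up data (column names are assumptions). One caveat: with matplotlib's default settings the whiskers stop at the furthest points within 1.5 times the interquartile range of the box, so they only match the true minimum and maximum when there are no outliers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

ax = df.boxplot()  # one box per numeric column

# The same five markers as numbers: min, 25%, 50% (median), 75%, max.
summary = df.describe()
print(summary.loc[["min", "25%", "50%", "75%", "max"]])
```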
Now let's go down here and take a look at an area plot. We'll do df.plot.area, and let's
just run this. This is what we're going to get by default. Now something I wanted to show
you earlier, I just haven't gotten around to it: I want to show you something called figure
size, or figsize. So for this, it just looks small, looks a little bit cramped. Let's say we
want to increase the size of this, and we'll say figsize, oops, figsize is equal to, and
let's just do a parenthesis and say 10 comma 5. That should be pretty large. This is going
to make it a lot larger. Just something I wanted to throw in there. I look at these area
charts as pretty similar to a line chart. If we went and compared those, they'd be pretty
similar, but they're different visually, and you absolutely can use these for different
types of visualizations. But I don't use this one a lot, if I'm being honest. That's why
it's towards the end of the video, but you definitely can do it.
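A sketch of the area plot with `figsize`, on made-up data with assumed column names. The tuple is (width, height) in inches.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

# figsize works on the other plot kinds too; it is passed through to the figure.
ax = df.plot.area(figsize=(10, 5))
print(ax.figure.get_size_inches())
```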
Let's go on to our very last one of the video, that's going to be the beautiful pie chart.
Let's say df.plot.pie, do an open parenthesis, and let's run it. We're going to get this
error. That's because we need to specify what column we're working with here. So let's just
say the y, and that's what we need. Let me open this up for us. Right here we have our y,
and this is our label or column that we're going to plot. That's really all we need. So we
can just say y is equal to Flavor Rating, oops, Flavor Rating. Let's run this, and now we
get this visualization right here. Let's make this one a little bit bigger: figsize is equal
to 10 comma 6. Now it's a little bit bigger. It definitely depends. So this legend is going
to auto-populate. You can make this as big as you want, and obviously it's going to look a
little bit better if you do it larger. And these colors auto-populate. Now you can customize
these colors, although I've found that when you have a lot of them it's harder to customize
them as easily, but definitely look into it. Almost everything in here is something that you
can customize in some way, although it does get a little bit tricky. You definitely have to
do some research and some Googling around just to figure out how to do those things.
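The pie chart call, sketched on made-up data (column names are assumptions). The `y` argument names the column whose values become the wedges, and the index values become the wedge labels.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import pandas as pd

# Made-up stand-in for the ice cream ratings (column names are assumptions).
df = pd.DataFrame(
    {
        "Flavor Rating": [0.22, 0.64, 0.11, 0.47],
        "Texture Rating": [0.27, 0.88, 0.46, 0.33],
        "Overall Rating": [0.25, 0.76, 0.28, 0.40],
    },
    index=pd.Index(["1/1/2022", "1/2/2022", "1/3/2022", "1/4/2022"], name="Date"),
)

# Without y (or subplots=True) a DataFrame pie plot raises an error.
ax = df.plot.pie(y="Flavor Rating", figsize=(10, 6))
print(len(ax.patches))  # one wedge per row
```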
Now one last thing that I wanted to show, and something I could have probably done at the
beginning, is you can actually change how these visuals look, and we can do that pretty
easily within matplotlib. There are different styles. So let's go right here, let's add a
new cell, and we'll say print, and we'll do plt, so that's that matplotlib right here. We'll
do plt.style.available, and what this is going to do, whoops, what this is going to do is
show us all these different types of stylings that you can use to change up this
visualization. Then once we find the one that we like, we'll just do plt.style.use, and then
in the parentheses we'll just specify which one we want. Now there's all these seaborn ones,
and seaborn is a really great library. Let's try seaborn-deep. I haven't tried this one at
all. Let's go ahead and try this, and it just changes some of the colors, some of the
visuals. We can try something like fivethirtyeight. Let's try this. That looks quite a bit
different. And let's try something like classic. I don't know what this one looks like,
let's just try it. So you can try out all these different styles, find one that you like,
find one that you think looks really nice, and you can run with it through all your
visualizations.
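A sketch of listing and applying styles. One thing to watch for: from matplotlib 3.6 on, the seaborn styles are listed under names like `seaborn-v0_8-deep` rather than `seaborn-deep`, so the code below picks whichever spelling the installed version offers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import matplotlib.pyplot as plt

# List every style name this matplotlib installation knows about.
print(plt.style.available)

# Apply a style by name; it affects plots created after this call.
plt.style.use("fivethirtyeight")
plt.style.use("classic")

# The seaborn style names were renamed in newer matplotlib versions,
# so check which spelling is available before using it.
seaborn_names = [
    name for name in ("seaborn-deep", "seaborn-v0_8-deep") if name in plt.style.available
]
if seaborn_names:
    plt.style.use(seaborn_names[0])
```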
So this has been our video on visualizing data in pandas. I think it's a really good
introduction to how you can visualize data within Python, and in future videos we'll look at
matplotlib and seaborn, which are some really great libraries for visualizing data, which I
use a lot. So I hope that you enjoyed this video. If you did, be sure to check out all my
other videos on Python and pandas, and I will see you in the next video.
[Music]
Hello everybody, today we're going to be cleaning data using pandas. Now there are literally
hundreds of ways that you can clean data within pandas, but I'm going to show you some of
the ones that I use a lot and ones that I think are really good to know when you are
cleaning your data sets.
So we're going to start by saying import pandas as pd, and we're going to run that. Now
we're going to import our file, so we're going to say df is equal to pd, that's pandas, dot
read underscore, and we actually have this in an Excel file, so we'll say read, oops,
read_excel, do an open parenthesis, and we'll do r and then we'll paste the path right here.
Now we're just going to call that variable, so we'll call df and we'll actually read it in
and look at the data.
we'll actually read it in and look at the data so let's scroll down here and
the data so let's scroll down here and let's take a look at this data frame or
let's take a look at this data frame or this Excel file that we're reading in so
this Excel file that we're reading in so right off the bat we have this customer
right off the bat we have this customer ID that goes from 101 all the way down
ID that goes from 101 all the way down to
to 1020 we have this first name and
1020 we have this first name and everything looks pretty good here except
everything looks pretty good here except in this last name column uh looks like
in this last name column uh looks like we have some errors we have some forward
we have some errors we have some forward slashes some dots some null values um so
slashes some dots some null values um so definitely going to have to clean that
definitely going to have to clean that up because we don't want that in the
up because we don't want that in the data we have a phone number and it looks
data we have a phone number and it looks like we have a lot of different formats
like we have a lot of different formats um as well as Nas not a number um just
um as well as Nas not a number um just lots of different stuff so we're going
lots of different stuff so we're going to need to standardize that so clean it
to need to standardize that so clean it up and then standardize it to where it
up and then standardize it to where it all looks the same um we also have
all looks the same um we also have address and it looks like on some of
address and it looks like on some of these we just have a street address but
these we just have a street address but on some of the other ones we have like a
on some of the other ones we have like a street address and another location as
street address and another location as well as a zip code in some of them so
well as a zip code in some of them so we'll probably want to split those out
we'll probably want to split those out we have a paying customer uh which is
we have a paying customer uh which is yes and Nos and some of those are not
yes and Nos and some of those are not the same so I have to standardize that
the same so I have to standardize that we have a do not contact kind of the
we have a do not contact kind of the same thing as the paying customer and we
same thing as the paying customer and we have this not useful column which we'll
have this not useful column which we'll probably just want to get rid of okay so
Okay, so the scenario is that we got handed this list of names and we need to
clean it up and hand it off to the people who are actually going to make these
calls to this customer list. So they want all the data in here standardized and
cleaned so that the people who are making those calls can just make those calls
as quickly as possible. But they also don't want columns and rows that aren't
useful to them, so things like this not useful column we're probably going to
get rid of, and then ones that say do not contact, if it says yes, we should
not contact them, we probably will want to get rid of those somehow. So that's
a lot of what we're going to be doing to clean this data set.
Normally the very first thing that I do when I'm working with a data set, most
of the time, except very rare cases when you're actually supposed to have
duplicates, is I actually go and drop the duplicates from the data set
completely. All you have to do for that is say df.drop_duplicates(), so they
make it super easy for you. Let's just run it. And up here is our original data
set. We have this 19 and 20, and those are obviously duplicates. They have the
exact same data; it's just a duplicate row that we need to get rid of. If we
look right down here, we no longer have that 20. We now just have one row of
Anakin Skywalker. And of course we want to save that, so we're just going to
say df is equal to that, and df. So now it's going to save that to the data
frame variable again, and now when we run this, our data frame now does not
have any duplicates.
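A minimal sketch of that step, assuming pandas is installed. The column names and rows are made up to stand in for the spreadsheet, which isn't reproduced here; only drop_duplicates and the assign-back pattern come from the walkthrough.

```python
import pandas as pd

# Hypothetical stand-in for the customer list (names and values are assumptions).
df = pd.DataFrame({
    "CustomerID": [1018, 1019, 1020, 1020],
    "First_Name": ["Creed", "Toby", "Anakin", "Anakin"],
    "Last_Name": ["Bratton", "Flenderson", "Skywalker", "Skywalker"],
})

# drop_duplicates() returns a new DataFrame with fully identical rows removed;
# nothing is saved until the result is assigned back to the variable.
df = df.drop_duplicates()
print(df)
```

Only rows that match in every column are treated as duplicates here, so two customers who merely share a name would both be kept.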
That's definitely one of the easier steps that we're going to look at. Things
are going to get quite a bit more complicated as we go, but I'm starting out
kind of simple so that we can kind of get a feel for it, and then we'll start
getting into the really tough stuff.
So the next thing that I want to do is remove any columns that we don't need. I
don't want to clean data that we're not going to use. So if we're just looking
through here, they may need first name, last name, phone number for sure.
Address might give them some information of where they're calling to, or time
zone, so we want that. This not useful column looks like a pretty good
candidate to delete, and it's very easy to do that. We're going to go right
down here and we're going to say df.drop, and we'll do an open parenthesis.
Drop just means we are dropping that column, and we can specify that by saying
columns is equal to, and then we'll paste in that column that we want to
delete. So let's run this and see what it looks like. And it literally just
drops that column, exactly like we were talking about. It no longer has that
column. Again, we want to save that. We can always do inplace=True; if you
follow this tutorial series, you can always do inplace=True and that'll save it
as well. But just for our workflow, most of the time I'm going to assign it
back to that variable, just for keeping it the same.
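Here is a small sketch of both ways of saving the dropped column. The column name Not_Useful_Column and the sample rows are assumptions for illustration, not the actual spreadsheet.

```python
import pandas as pd

# Hypothetical sample; the real column names may differ.
df = pd.DataFrame({
    "First_Name": ["Frodo", "Harry"],
    "Phone_Number": ["123-545-5421", "876|678|3469"],
    "Not_Useful_Column": [True, False],
})

# Option 1 (the workflow used here): keep the returned DataFrame.
trimmed = df.drop(columns="Not_Useful_Column")

# Option 2: change the existing object; the call itself then returns None.
df.drop(columns="Not_Useful_Column", inplace=True)

print(list(trimmed.columns))
print(list(df.columns))
```

Both end up with the same columns; assigning back just keeps every step in the same "df = ..." shape.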
Really quickly, I wanted to give a huge shout out to the sponsor of this entire
pandas series, and that is Udemy. Udemy has some of the best courses at the
best prices, and it is no exception when it comes to pandas courses. If you
want to master pandas, this is the course that I would recommend. It's going to
teach you just about everything you need to know about pandas. So huge shout
out to Udemy for sponsoring this pandas series, and let's get back to the
video.
Now let's kind of go column by column and see what we need to fix, and we'll
start on this left-hand side. This customer ID to me looks perfectly fine; I'm
not going to mess with it at all. The first name at a glance also looks
perfectly fine. I don't see anything wrong with it visually, which is a good
thing, although sometimes that can be deceiving and that can cause errors down
the line, but we're not going to assume that there are errors in here.
Now let's look at this last name. Now the last name, obviously I'm seeing some
obvious things, things that we talked about when we were first looking at this
data set. We have this forward slash, which we definitely need to get rid of.
We have null values, so not a number right here. We have some periods, as well
as an underscore right here. So all those things I think we should clean up and
get rid of, so that when the person is making these calls it's all cleaned up
for them.
So how are we going to do that? We can actually do this in several different
ways, but let's just copy this last name. The first one I'm going to show you
is strip, and we'll write it kind of like this. We'll say data frame, and then
we'll specify the column that we're working with, because we don't want to make
these changes, or strip all of these values, from everywhere. We only want to
do it on just this column. If we do this and we don't specify the column name,
it will apply to everywhere. So if we're trying to do these, let's say these
underscores, maybe that would mess with something else in another column, and
we don't want that. So we just want to specify just this last name. So let's go
last name, .str.strip.
Now what strip does, and let's see if we can open this up really quickly. No,
we can't. I was hitting Shift+Tab in here to see if it could bring up some of
the notes on it. But what strip does is it takes either the left side or the
right side. Well, lstrip takes from the left side, rstrip takes from the right
side, and strip takes from both. But you can strip values off the left and the
right hand side, and we can specify those values. Now for what we're doing in
this column, we can just use strip, because as you can see, this forward slash,
these dots, as well as this underscore are all on the far sides. If there was a
value like Swan_son, the strip wouldn't work at all, because it's not on the
outside of the value or the word. So we can use strip. I'll also show you how
to use replace, and replace is another really good option for things like this.
But let's start with strip and just see what it looks like, and see if we can
get what we need done. So let's just run this for now, see what happens.
So it looks like nothing has changed, because again, we're not specifying any
specific value. Just by default it's only taking out white space, so like
spaces that shouldn't be there. That's what it does by default. Now we can
specify within this exactly what values we want to take out, so let's go ahead
and do that. Let's say lstrip, and let's try to take out these dots real quick.
So we're just going to do a parenthesis, dot dot dot. Now let's run this and
see what it looks like. For this one, Potter, it is now gone. So those three
dots were there before; let's just show it. So they were there, and then when I
ran it like this, now they're gone. That's what the lstrip does: it takes it
only off the left-hand side.
Now we can also do a forward slash, so we'll do something like this, and it'll
get rid of the White one, but as you can see, now we aren't taking out these
three dots, so they're still there. Now, is it possible to do something like
this where we put these values inside of a list? Let's try it. So we'll say
just like this, one, two, three. Let's run it, and no, it doesn't. This lstrip
actually sits within the realm of regular expression, so if you've ever worked
with regular expression you know it gets very complicated, very complex, so you
want to keep it kind of simple, especially with these values where we're just
taking a few out. So what we're going to do is we're going to do dot dot dot,
and we'll take it out one by one.
Now, in order to save this, because we want to save this, we want to take out
that value, we don't just want to say data frame equals, because that would be
very bad. What this would say is now this data frame is only equal to these
values that we're seeing right here. We want to only apply it to this column,
so we're going to go like this. So now when we do it and then we call the
entire data frame, it's only applying this to this one column, the last name
column. So let's run it. And now when we go down to Potter right here, it's
cleaned up.
So we're going to do the same thing but for those other values, and we'll do it
just like this. We'll do a forward slash, and it's a left strip. And then I'll
do the left strip on this underscore just to show you that it won't work, and
then we will go on from there. So it's not pulling it, because we're looking at
the left-hand side only. We need to use rstrip. So now let's use rstrip, and
now that looks perfect, has no underscore. So that's how you can use strip for
either the left side, the right side, or just strip by itself, which covers
both sides.
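The one-at-a-time approach can be sketched like this. The Last_Name column name and the sample values are assumptions chosen to mimic the stray characters described in the walkthrough.

```python
import pandas as pd

# Hypothetical last names with leading dots, a leading slash,
# a trailing underscore, and padding spaces.
df = pd.DataFrame({"Last_Name": ["...Potter", "/White", "Flenderson_", " Baggins "]})

# With no argument, strip only removes surrounding whitespace.
whitespace_only = df["Last_Name"].str.strip()

# lstrip only looks at the left side, so a trailing underscore survives.
left_underscore = df["Last_Name"].str.lstrip("_")

# Assign to the column, not to df, so the rest of the frame is untouched.
df["Last_Name"] = df["Last_Name"].str.lstrip("...")
df["Last_Name"] = df["Last_Name"].str.lstrip("/")
df["Last_Name"] = df["Last_Name"].str.rstrip("_")

print(whitespace_only.tolist())
print(left_underscore.tolist())
print(df["Last_Name"].tolist())
```

Note that the padded value keeps its spaces in the final column, because none of the three assigned calls asked for whitespace to be removed.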
Now I showed you all of that because I am going to show you a different way to
do it, and I apologize because I somewhat lied to you earlier. Let's run this
right here. Actually, we're just going to pull it in like this. We're going to
remove the duplicates again, bear with me, we're going to drop that column, and
then now we're sitting with that data frame again with those exact same
mistakes. I just wanted to reset it for a second.
There is a way that you can do this, and I just wanted to kind of show you how
you can do it. You can do this right here, and we'll say, so again we're just
looking at this column, just this column, and we're using strip, and let's get
rid of the r because we want to apply it to everywhere. You can input all of
those values in visually and it will clean it up. So let's say we want to get
rid of numbers, we'll do one two three. Then we can do the dot, so that's going
to be for a period, or for the dot dot dot Potter. We could also do the
underscore, and we can do the forward slash. So we put it all in one string
right here. Now let's take a look at this; we'll get rid of this really
quickly. Now let's take a look, and all of them were removed.
I showed you how to do it before because that's at least how my mind would
think about it. I'd think, oh, I can put it in a list and run it through this
lstrip or this rstrip and it would work, but that's not how strip works. You
have to kind of combine it all into one value.
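A short sketch of the combined-string version. As far as I know, the argument to str.strip is read as a set of individual characters rather than as a regular expression or a list, which would explain why one string holding every unwanted character works. The sample names are assumptions.

```python
import pandas as pd

# Hypothetical values; "Swan_son" has its underscore inside the name.
last_names = pd.Series(["...Potter", "/White", "Flenderson_", "Swan_son", "Skywalker"])

# Every character in "123._/" is removed from both ends until a character
# outside that set is reached; characters inside the value are left alone.
cleaned = last_names.str.strip("123._/")
print(cleaned.tolist())
```

The order of the characters in the string doesn't matter, and an interior underscore like the one in Swan_son is not touched, which is the case where replace would be needed instead.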
So yes, I deceived you, I apologize. But now when we call data frame and we
assign it to that column, so the last name column, or assigning what we just
did to this last name column, everything should look perfect. And it does. So
our customer ID, first name, last name are all cleaned up.
Now we're going to come to a much more difficult one. This is probably, if I'm
being honest, the hardest one. I said we were going to work up, but this is
probably the hardest one of the whole video: working with phone numbers. And
look at all these different types of formats. I mean, it's not going to be fun.
And imagine there's 20,000 of these. You can't just go and manually clean those
up; you need something to kind of automate that.
So that is what we're going to do. So let's go right down here, we'll copy the
data frame, and I'm going to pull it right here. So now we need to clean up
this phone number. What we want is it all to look exactly the same, unless it's
blank, and we'll keep it blank; we don't want to populate that data. But we
want all of them to look exactly like this one. And what we're going to do is,
right off the bat, we're going to take all of the non-numeric values and just
completely get rid of them, strip it down to just the numbers. So this 123-643,
or forward slash, will just be the numbers. Same with these bars and these
slashes and everything; all of these will just be numeric. Then we'll go back
and reformat it how we want to format it, which will look exactly like this
one, but we just want to do it for the entire column.
So let's go right up here, and we're going to try replace for the first time.
So let's do phone number, oops, that's not what I wanted. So we're going to do
a bracket, say phone number, .str.replace, just like we did before. Now we're
going to use some regular expression in here, and I'll kind of do a really high
overview, although I'm not going to dive super deep into the regular
expression. Then we're going to do a parenthesis, and within there we're going
to do a bracket. I can't remember what this is called; is it called a caret? I
think it's called a caret. I'm just going to call it that. It may not be
correct, but I think it's an upper arrow. So it's an upper arrow, a dash, oops,
a-z, A-Z, and then 0-9. Now at a super high level, what that character, that
first thing, is doing, it's saying we're going to return any character except,
and then we specify anything a to z, A to Z, upper or lowercase (and then
actually I think this should be like this, A to Z), and then 0 to 9. So any
value like a, b, c, 1, 2, 3, those are not going to be matched. It's going to
match all of them except these values.
And then we're going to replace them by saying comma, and we're going to
replace them with nothing, so this is just an empty string. So literally we're
taking everything that is not an a, b, c or a 1, 2, 3, so a letter or a number,
we're replacing all of that, and we're replacing it with nothing. So let's run
this and see what it looks like. And it looks like that worked properly. Now we
do have this Na, because we had an N-a for, I don't remember, maybe that was
Creed Bratton, but it worked for basically everything else. We're going to go
through the entire process, and then at the end we'll remove any values; we
want them to just be completely null. We don't want them to even see NaN and
wonder what that is. We just want it to be blank, and we'll do that at the very
end. So now that we know that that worked, let's assign it. We'll do df phone
number is equal to, and then we'll say data frame.
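As a rough sketch of that replace step: the Phone_Number column name and the sample values below are assumptions, and regex=True is something I've added explicitly because newer pandas versions treat the pattern as plain text unless told otherwise.

```python
import pandas as pd

# Hypothetical phone values in a few of the mixed formats described above.
df = pd.DataFrame({"Phone_Number": ["123-545-5421", "123/643/9775", "876|678|3469", "N/a"]})

# [^a-zA-Z0-9] matches any single character that is NOT a letter or a digit;
# every match is replaced with the empty string.
df["Phone_Number"] = df["Phone_Number"].str.replace(r"[^a-zA-Z0-9]", "", regex=True)
print(df["Phone_Number"].tolist())
```

Because letters are kept, a placeholder like "N/a" survives as "Na", which is why a later clean-up pass for those leftovers is still needed.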
And this looks a lot more standardized than it did before already. But now what
we want to do is try to format this, and I've done this many, many times. I
always use a lambda. You can definitely use a for loop; I just don't do it that
way myself, so I'm going to show you how to do it using a lambda. Let's get rid
of this, and we're going to say df phone number, we've already done that, I'm
just going to get rid of it. Now we're going to say df phone number, then we're
going to say .apply. We'll do an open parenthesis, and then this is where we're
going to build out our lambda. So we'll say lambda x, colon. Now this is where
we're going to kind of format it. So what I want to do is I want to take the
first three strings, one two three, then I want to add a slash, and then the
next three strings, add a slash or a dash, and then that be the value that's
returned.
So it's not super difficult. We're just going to do x, then a bracket (let me
get rid of that), an x and then a bracket, and then we want the 0 to 3. So it
goes 0, 1, 2. It doesn't include the three; it goes up to three. So 0, 1, 2,
that's our first three values. Then we'll do plus, and do a quote, and do a
dash. So this is our first kind of sequence, and I'm just going to copy this.
We'll do plus, and instead of three, or, we are going to start at three because
now it's inclusive. So we're going to go from three, and we're going to go all
the way up to six, so it should be 3, 4, 5, our next three values. Then we have
a dash, and we'll copy this, and we'll say plus, and now we go from six all the
way to 10.
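Here is what that lambda looks like when every value is already a clean ten-digit string. The sample digits are assumptions; the slicing itself follows the description above.

```python
import pandas as pd

# Hypothetical values that have already been stripped down to digits.
digits = pd.Series(["1235455421", "8766783469"])

# Slices include the start position and exclude the end position,
# so [0:3] covers positions 0, 1, 2 and [3:6] covers 3, 4, 5.
formatted = digits.apply(lambda x: x[0:3] + "-" + x[3:6] + "-" + x[6:10])
print(formatted.tolist())
```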
to 10 now let's try running this and as you can see we get an error now I
you can see we get an error now I already know what the error is float
already know what the error is float object is not subscriptable which means
object is not subscriptable which means we're trying to um basically look at it
we're trying to um basically look at it like a string right now it's not a
like a string right now it's not a string it's actually a number so let me
string it's actually a number so let me get rid of this for just a second I'm G
get rid of this for just a second I'm G show you what it's talking about so
show you what it's talking about so right now we have values that are floats
right now we have values that are floats and values that are strings or not even
and values that are strings or not even a number so we have values that are
a number so we have values that are strings or not a number so if we want to
strings or not a number so if we want to actually look through it like kind of
actually look through it like kind of like indexing if we want to do that they
like indexing if we want to do that they all have to be strings so we need to
all have to be strings so we need to change this entire column into Strings
change this entire column into Strings before we can apply this um formatting
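
A minimal sketch of why that error appears, in plain Python. The three sample values are assumptions, not the actual data: slicing works on a string, but a float (including a missing value, which is stored as the float NaN) cannot be indexed.

```python
import math

# Hypothetical mixed values like those in a phone number column:
# a digits-only string, a float, and a missing value (NaN is a float).
values = ["1235455421", 7066950392.0, math.nan]


def first_three(value):
    """Return the first three characters, or the error text if slicing fails."""
    try:
        return value[0:3]
    except TypeError as err:
        return f"TypeError: {err}"


results = [first_three(v) for v in values]
for v, r in zip(values, results):
    print(repr(v), "->", r)
```

Only the string can be sliced; the other two values raise the "not subscriptable" TypeError, which is why every value has to be a string first.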
now when I was creating this if I'm
being honest my first thought
was to do it like this str of
df phone number let's just run that
this is what the values look like and
I don't remember why it was doing
this but I
looked into it quite a bit and I was
like oh I need to apply this
conversion to a string on each value
not the entire
column so how we can do that is actually
fairly easy because we've already done a
lot of the heavy lifting we're just
going to copy this and we're going to
say
str of x and again lambda is
like a little anonymous function so you
could do this by saying for x in this
column we could do a for loop and
then say for every x it equals the
str of x and then it changes it to a
string but a lambda just does it a lot
quicker so
let's do that really quickly and all of
our values look exactly the same and
that's how we want it so we're just
going to copy this apply
it good and now we're going to take this
and we're going to run this again just
ignore all my commented out stuff
pretend I don't have that um so now when we run this it should work there we go
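
The two steps just described can be sketched as follows. This is an illustration under assumptions: the sample values and the column name "Phone_Number" are made up, and the numbers are taken to be already stripped down to digits.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column name "Phone_Number" is an assumption.
df = pd.DataFrame({"Phone_Number": ["1235455421", 7066950392, np.nan, "Na"]})

# Convert every value to a string, one value at a time.
as_text = df["Phone_Number"].apply(lambda x: str(x))

# The same conversion written as an explicit loop.
looped = [str(x) for x in df["Phone_Number"]]

# Slice each string into 3-3-4 groups joined by dashes.
formatted = as_text.apply(lambda x: x[0:3] + "-" + x[3:6] + "-" + x[6:10])
print(formatted.tolist())
```

Missing and placeholder values come out as "nan--" and "Na--" because slicing past the end of a short string returns an empty string rather than raising an error.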
now if we look at these numbers 123-
545-5421
and it does that for every single
one where there's values even when
there's nan or Na it's still adding
those dashes but we expected that so
let's apply it say is equal to and then
we'll look at the data
frame and this looks almost exactly what
we're hoping for we just need to get rid
of these so this nan-- and this Na--
we need to get rid of those and that is
super easy to do we're just going to
say so now that we've done it we'll
comment that out we'll say
df and let's copy this ignore the
messiness I do apologize for that it's
very messy but if you're following
along with me you get what we're doing
so df phone number so only on the phone
number say .str.replace
open parenthesis now we can
specify this value so we want to take
this exact
value and replace it with nothing and
let's just see if that does work it does
now we have these
Na's and so let's actually I'll paste
that right down here we're going to do
this is equal to and then we're just
going to take this entire string put it
right here and put this value as
what we're looking for and then
replacing and then when we call that
data frame it should work properly and
it is perfectly cleaned so we have every
single value all the exact same they
don't have different characters or
different you know formatting and we
got rid of all the ones that we don't
have or don't need all the ones that
were just random values so this column is now completely cleaned up again
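
A short sketch of that cleanup step, assuming the column already holds the dash-formatted strings. The sample values are made up; `regex=False` is passed so the placeholders are matched as literal text.

```python
import pandas as pd

# Hypothetical values as they look right after the dash formatting step.
phone = pd.Series(["123-545-5421", "nan--", "Na--"], name="Phone_Number")

# Remove the two leftover placeholders; regex=False treats them as literal text.
cleaned = phone.str.replace("nan--", "", regex=False)
cleaned = cleaned.str.replace("Na--", "", regex=False)
print(cleaned.tolist())
```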
definitely one of the more difficult
ones one that I've done a thousand
times I've had to work with a lot of
phone numbers and stuff like that
this one does get very tricky especially
if you have like a plus one which is
a country code that can get tricky
as well but on a kind of high
level this is how you can do that and
it's pretty neat how you can actually
you know clean up and standardize those
phone numbers so let's go right down
here let's run it the next thing that
we're going to look at is this address
now let's just pretend that the people
who are in the call center want all
these separated into three different
columns so they can read it easier see
what the zip code is where they live
you know whatever they want it for
let's just say we want to do that and
this is you know again for this use case
it may not make sense but you have to do
this I do this all the time you need
to split those columns now luckily all of these things are separated by a comma
so we can specify that we're going to
split on this comma and then we'll be
able to create three separate columns
based off of this one column which is
exactly what we want then we can name them
as well and we can do that very easily
by using this split so we're going to
say df and we want to
specify oh jeez not again so we want to
specify that we're looking at the
address then we're going to say
.str.split we'll do an open
parenthesis now the very first value
that we need to specify is what we're
splitting on so we want to split on the
comma so we want to specify that and
then we need to specify how many splits
from left to right it should make for
now we'll just start with one and then
we'll go from there let's just see what
this looks
like
so it doesn't really look like it did
anything let's do two well let's go back
to one and then let's say
expand equals true when we expand it
it's actually going to separate it I
believe okay so we're expanding now
we're only doing this with one comma so
we're only looking at the very first
comma and splitting it but in some of
these well just in one there is an
additional comma so we should do it up
to two let's do this okay so now we have
three columns if we just save it like
this it's going to give us these 0 1
2 basically these indexed values
for these columns and we don't want that
we want to specify what these actually
are and we can do that by saying df and
let me just do is equal to we'll do
bracket and then within there we're
going to specify our list so we have
three of them so I'm
going to do the first one this is the
street address so we'll say street
address the next one is and this one is
not a state but these all are states
so I'm just going to say
state and then for the very last one
that looks like a zip code so we'll say
zip and we'll do code in fact I also
want to do street underscore address so
what this is now going to do is these
three columns are going to be applied to
these three names and they'll basically
be appended it doesn't replace the
address we're not saying df address
equals the df address we're not
replacing it we're now creating
different columns so let's run it and
then let's also call it so they're right
over here on this right hand side I
couldn't see them at first but it did
exactly what we needed it to do so now
if we wanted to at the very end if we
want to we're not going to we could just
delete this address and keep the street address the state and the zip code
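
A sketch of the split described above, under assumptions: the three sample addresses and the column names are made up for illustration. In recent pandas versions the split limit has to be passed by keyword (`n=2`) rather than as a second positional argument.

```python
import pandas as pd

# Hypothetical addresses with zero, one, and two commas.
df = pd.DataFrame({
    "Address": [
        "123 Shire Lane, Shire",
        "3443 Dogwood Lane, Pennsylvania, 18503",
        "768 City Parkway",
    ]
})

# Split on at most two commas; expand=True returns one column per piece.
parts = df["Address"].str.split(",", n=2, expand=True)

# Attach the pieces as three new, named columns; "Address" itself is kept.
df[["Street_Address", "State", "Zip_Code"]] = parts
print(df)
```

Rows with fewer pieces are padded with missing values, and the pieces after each comma keep their leading space, so a follow-up `.str.strip()` may be worth adding.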
another really common thing that you can
do this happens often again with like
first name last name where you'll have
Alex freeberg but it's Alex comma
freeberg or Alex space freeberg and you
can separate those out into different
columns now the next one that we want to
look at is this paying customer and the
paying customer and do not contact are
very similar in the fact that it's
yes no n y yes no n y
and so let's go right on down here
and we're going to say df dot and we
want to just replace these values as all
yeses or all nos but just with the same
formatting just to keep it consistent
so let's make anything that's an n into
a no anything that's a y into a yes I
like it spelled out so let's change
anything that's a yes into a y anything
that's a no into an n that's
usually how I do it it just saves on data
because it's fewer characters although it
can often be very minimal but let's
specify the paying
customer we'll say df bracket paying
customer then we'll do .str.replace
so now we're just going to look
for those specific values so if it's a y
oops a capital Y then we'll say
yes now let's run it and now we have no
more y's we now just have yes although
now the existing yes values are yeses
okay we don't want to do that because it's
literally looking up here and saying
okay here's a y
let's change that y into a yes so now
it's doing yeses we don't want that so
let's look for the yes and change it
into a y now when we run this that looks
a lot better so we'll
do df paying customer is equal to and then
we'll copy this we'll do the exact same
thing
no and
n then let's call it and now that entire
column looks really good except for that
value right there but I'm going to leave
that because I'm just going to apply it
to the entire thing all at once to get
rid of those at the end instead of just
going column by column and then it's
literally going to be the exact
same thing so I'm not even going to
scroll down whoops I'm just going to put
it right up here because this is the
exact same thing I'm going to save us all
some
time and when we run this this looks exactly like what we're looking for
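
The "yeses" problem and its fix can be sketched like this. The sample values are made up; the last variant, a whole-value `Series.replace` with a mapping, is an alternative that the video does not use.

```python
import pandas as pd

# Hypothetical mix of spellings found in a yes/no column.
s = pd.Series(["Yes", "No", "Y", "N", "N/a"])

# Substring replacement: the "Y" inside "Yes" is replaced too.
naive = s.str.replace("Y", "Yes", regex=False)

# Going the other way is safe: shorten the long spellings.
shortened = s.str.replace("Yes", "Y", regex=False)
shortened = shortened.str.replace("No", "N", regex=False)

# Alternative: match whole cell values only, via a mapping.
mapped = s.replace({"Yes": "Y", "No": "N"})

print(naive.tolist())
print(shortened.tolist())
print(mapped.tolist())
```

`.str.replace` works on substrings, while `Series.replace` compares the entire value, which is why only the first variant produces "Yeses".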
again some not a number values but we
can get rid of that in just a second by
doing a replace over the entire data frame
and that is basically the end of
cleaning up individual columns now let's
go right down here we're going to say df
.str.replace
and then we'll first do these
values oops so we'll do oops let me do
that there we go and replace that with
nothing and let's just see what it looks
like oops data frame object has no
attribute str well that's because we were
looking at columns before yeah I think I
just need to get rid of this str we're not
looking at a column we're just really
doing it across the entire data frame now
let's try that
okay that worked
appropriately and we'll just say data
frame is equal to and then we'll copy
this and we'll do the nan as
well and we'll
do and now when we do this it is not
going to replace these because these
aren't actually a value because we're
looking for that string we actually need
to use and I completely forgot this
I'm not going to lie to you let's get
rid of this to get rid of those values
because it's literally not a number
there it is technically empty I
forgot we can do or we could not even
specify it we'll do df.fillna so
we're going to fill these values if
there's nothing in them we're going to
fill it and we're going to
say blank and when we run that every
value that doesn't have something in it
is going to show up blank even over here
where we only had a few all of them
throughout the data frame if it
doesn't have a value it is now blank
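
A sketch of the data-frame-wide cleanup, with made-up sample data and the placeholder text "N/a" assumed for illustration. It shows why `.str` fails on a whole data frame, why a text replace cannot catch real missing values, and how `fillna` handles them.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a text placeholder and real missing values (NaN).
df = pd.DataFrame({
    "Phone_Number": ["123-545-5421", "N/a", np.nan],
    "Do_Not_Contact": ["Y", np.nan, "N"],
})

# .str exists on a Series (one column), not on a whole DataFrame.
has_str_accessor = hasattr(df, "str")

# DataFrame.replace matches whole cell values across every column.
cleaned = df.replace("N/a", "")

# Real missing values are not text, so they need fillna instead.
cleaned = cleaned.fillna("")
print(cleaned)
```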
let's apply that and we'll run
this and now all of our cleaning
of the individual
columns is completely done we've removed
columns we've split columns we've
formatted and cleaned up phone numbers
we've also taken values off of
this last name column and
then we formatted and just kind of
standardized paying customer and do not
contact now they also asked us to only
give them a list of phone numbers that
they can call so if we take a look some
of these do not contacts are y which
means we cannot contact them and then
there are some that don't even have
phone numbers so we don't want to give
the people at the call center
people who don't have numbers so
we want to remove those now there's a few different ways that we can do this
but let's start with
this do not contact it seems like
the most obvious one now if it's blank
we want to give them a call we only want
to not call them if they've specifically
said we cannot call them so if it's y
we're not going to call them so what we
need to do it's not anything like this
we probably need to loop through this
column and then look at each row that
has a value of this and drop that entire
row and we'll probably need to
do that based off this index instead of
doing it based off just this column
that may not make sense but
let's actually start writing it
so we'll do for x in and we need to look
at our index so we're just going to do
df.index and we'll do a colon
enter and then we want to look at these
indexes how do we look at these indexes
we use loc that's going to be df
.loc and then we need to look at the value
which is this x right here so each time
it looks at the index it's looking at
the value but we want to look at the
value of this column do not contact I
don't know if I copied this before let
me copy it we only want to look at the
value in this one column if we didn't it
would look at a different value so we
don't want that so we're looking at just
that value if it's equal to y so if this
value is equal to y then we want to drop
it so we actually need to say
if so if this value x in this column is
equal to y then we want to do df.drop
and then we'll say x and I think we
have to say inplace equals true here
otherwise it won't take effect
otherwise you have to say like df is
equal to df yeah I don't want to
start messing with that let's just do
inplace equals true
and let's see if that works I can't
remember if this is going to work or not
invalid syntax okay
need a colon and now let's try to run
this okay yeah if we look at our
index we can already tell that there are
ones missing the one is missing
the three is missing let's see and
the 18 is missing so we already got rid
of those values and you can see
that there's no y's in here anymore
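
The loop just described, sketched on made-up data (the column names and values are assumptions). A boolean filter is included as a shorter alternative that the video does not use; both give the same rows.

```python
import pandas as pd

# Hypothetical call list; "Y" means the customer asked not to be contacted.
df = pd.DataFrame({
    "Phone_Number": ["123-545-5421", "876-678-3469", "", "304-762-2467"],
    "Do_Not_Contact": ["N", "Y", "", "Y"],
})

# Loop over the index labels, look up one cell with .loc, drop matching rows.
looped = df.copy()
for x in looped.index:
    if looped.loc[x, "Do_Not_Contact"] == "Y":
        looped.drop(x, inplace=True)

# Alternative: keep only the rows where the value is not "Y".
masked = df[df["Do_Not_Contact"] != "Y"]

print(looped.index.tolist())
```

Either way the surviving rows keep their original index labels, which is why gaps show up in the index afterwards.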
which is really good we can if we want to and we probably should we should
probably populate that really
quickly let me just go up here really
quick I'll copy this we probably should
populate that and I didn't plan on doing
this so if it's blank oops it's blank
give it an n and we want to assign it
to do not
contact do not contact
whoops let's see if that
works and we probably need to do
.str let's just see if it
works so if it's
blank dude okay I don't know why it's
giving us a triple
n maybe I need to strip
this or
something okay never mind let's not
do that but now we basically need to do the
exact same thing for this phone number
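
A likely explanation for the triple n, assuming the attempt used a substring replace with an empty search string (the exact call is not shown in the transcript): the empty string matches at every position, including before and after each character. The sample values are made up, and the whole-value `Series.replace` is a suggested alternative.

```python
import pandas as pd

# In plain Python, replacing "" inserts the new text around every character.
tripled = "N".replace("", "N")
print(tripled)

# Hypothetical column where blanks should become "N".
s = pd.Series(["N", "", "N"])

# Series.replace compares whole values, so only the truly blank cell changes.
filled = s.replace("", "N")
print(filled.tolist())
```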
um because if it's blank we don't want them calling it um so we can copy this
entire thing go right down here but
now we're looking at phone
number so now we're looking just at the
values within phone number and we only
want to look at if it's blank so if it
literally has no value we want to get
rid of it let's run this and see if it
works again it should good and now our
list is getting much smaller so you can
see in our index a lot of those rows
were removed and okay good actually this
worked itself out because these all have
n's so right now we're sitting
really good everything looks really
standardized cleaned everything looks
great I might drop this address if you
want to you can drop this address but
besides that this is all looking really
good this paying customer the
yeses and nos aren't really anything
now we could and we probably should
before we hand this off to the client or
the customer call list we probably
should reset this index because they
might be confused as to why there's
numbers missing or you know they might
use this index to show how many people
they've called or I don't know something
like that so let's go right down here
we're going to say df dot and then we'll
do reset_index
and let's just see what this looks
like it does work but as you can tell
it didn't get rid of that index
completely it actually took the index
and saved that original one we do not
need to save that whoops let's put it
right in here now we're just going to do
drop equals true and when we do that it
just completely resets it drops the
original index and gives us a new index
and that is what we want let's do df
equals and this is our final product
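
A small sketch of the difference between the two calls, using a made-up frame whose index has gaps left by dropped rows.

```python
import pandas as pd

# Hypothetical frame after rows 1, 3 and 4 were dropped.
df = pd.DataFrame({"Phone_Number": ["a", "b", "c"]}, index=[0, 2, 5])

# Default: the old index is kept as a new column named "index".
kept = df.reset_index()

# drop=True discards the old index and numbers the rows from zero.
dropped = df.reset_index(drop=True)

print(kept.columns.tolist())
print(dropped.index.tolist())
```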
one thing that I you definitely could have done here um and I made this a
have done here um and I made this a little probably more complicated than it
little probably more complicated than it needed to be um that was just how my
needed to be um that was just how my brain was working at the time when I'm
brain was working at the time when I'm you know typing this out we could have
you know typing this out we could have done DF do drop an a um which is
done DF do drop an a um which is literally going to look at these null
literally going to look at these null values um
values um before we couldn't do that with this one
before we couldn't do that with this one because these aren't we're not looking
because these aren't we're not looking at na we're looking at y's so we
at na we're looking at y's so we couldn't do that but because we're
couldn't do that but because we're looking at null values we could have
looking at null values we could have also done drop
also done drop na um and done subset is equal to and
na um and done subset is equal to and then done it just on this phone number
then done it just on this phone number and then done like this and done in
and then done like this and done in place equals true so we could have also
place equals true so we could have also done this and then said DF equals um I
done this and then said DF equals um I can't I mean I can run it it's just not
can't I mean I can run it it's just not going to do anything I can run it on the
going to do anything I can run it on the different column but that'll me mess
different column but that'll me mess everything up but this is another way
everything up but this is another way you can do it and I'll just save it in
you can do it and I'll just save it in case you want to um I'll say another way
case you want to um I'll say another way to drop null
to drop null values there you go and that'll just be
values there you go and that'll just be a note for us in the future um but this
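The dropna alternative mentioned here could look roughly like the sketch below. The column name Phone_Number and the rows are assumptions made for illustration. Note that with inplace=True the call returns None, so the result should not be assigned back to df.

```python
import numpy as np
import pandas as pd

# Toy data; "Phone_Number" is an assumed column name and the values are made up.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],
    "Phone_Number": ["123-456-7890", np.nan, "555-000-1111"],
})

# Another way to drop null values: remove rows where Phone_Number is missing,
# then renumber the index.
df = df.dropna(subset=["Phone_Number"]).reset_index(drop=True)

# In-place variant (do not write "df = ..." here, because it returns None):
# df.dropna(subset=["Phone_Number"], inplace=True)

print(df)
```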
but this is our final product it looks a lot different than when we first
started I mean we had mistakes here completely different formatting in the
phone number different address everything that we just talked about and this
looks just a lot better and you can tell why it's really important to do this
process because again we're working on a very small data set I purposely
created this data set with these mistakes because when you're looking at data
that has tens of thousands hundreds of thousands a million rows these are all
things that are going to be applied at a much larger scale and you won't be
able to see them as easily you'll have to do some exploratory data analysis to
find these mistakes and then you're going to need to clean the data or do it
at the same time when you're exploring the data so you'll clean it up as you
go but these are a lot of the ways that I clean data a lot of the things that
you can do to make your data just a lot more standardized a lot more visually
better and then it really helps later on with visualizations and your actual
data analysis so I hope that that was helpful I know that this was a long
video I'm sure it was but I hope that you got something out of this you
learned some of the techniques on how to actually clean data in pandas if you
like this video be sure to like and subscribe check out all my other videos on
pandas as well as python and I will see you in the next video
[Music]
hello everybody today we're going to be looking at exploratory data analysis
using pandas exploratory data analysis or EDA for short is basically just the
first look at your data during this process we'll look at identifying patterns
within the data understanding the relationships between the features and
looking at outliers that may exist within your data set during this process
you are looking for patterns and all these things but you're also looking for
mistakes and missing values that you need to clean up during your cleaning
process in the future now there are hundreds of ways to perform EDA on your
data set but we can't possibly look at every single thing so I'm just going to
show you what I think are some of the most popular and the best things that
you can do when you're first looking at a data set
the first thing that we're going to do is import our libraries so we'll do
import pandas as pd we're also going to import seaborn and matplotlib now
during this exploratory data analysis process I often like to visualize things
as I go because sometimes you just can't fully comprehend it unless you
visualize it and it gives you a larger broader glimpse of everything so we're
going to import seaborn as sns and then we'll import matplotlib.pyplot as plt
let's run this this should work okay perfect now we need to bring in our data
set so we've worked with that world population data set that is the exact one
that we're going to use now so we'll say df equals pd.read_csv then r and
we'll paste in our CSV path and this is what it should look like although your
path may be different be sure that you have the correct file path then we'll
read it in
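As a rough sketch of the read step: the video reads the file from disk with a raw-string path, which is shown only as a comment because the real path is not given. To keep the example runnable anywhere, a tiny made-up CSV is read from memory instead; the column names and values are assumptions.

```python
import io

import pandas as pd

# In the video the file is read from disk with a raw-string path, for example:
#   df = pd.read_csv(r"C:\path\to\world_population.csv")   # hypothetical path
# Here a small made-up CSV is read from memory so the sketch is self-contained.
csv_text = """Country,Continent,2022 Population
A,Asia,1000
B,Europe,
C,Africa,300
"""

df = pd.read_csv(io.StringIO(csv_text))

# The empty field for country B is read in as a missing value (NaN).
print(df)
```

The r prefix on the path tells Python not to treat backslashes as escape characters, which matters for Windows-style paths.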
now this data set should look extremely familiar if you've done some of my
previous pandas tutorials but I did make some alterations to this one took out
a little bit of data put in a little bit of data here and there to change
things up because if it was exactly how I pulled it and I got this data set
from kaggle if it was exactly how we pulled it like we've looked at in the
previous videos it's too simple we wouldn't actually be able to do some of the
things that I would like to show you so be sure to download this exact data
set for this video because it is a little bit different but what we're going
to do now is just try to get some high-level information from this now if
yours looks just a little bit different like your values are in scientific
notation I have applied this so many times I think it's still applied to this
you can do something and we'll write it right down here we're going to do
pd.set_option and we'll do an open parenthesis and we'll say in quotes
display.float_format and so we're going to change that float format by just
saying lambda x colon and then we're going to change basically how many
decimal points we're looking at so we'll do in quotes percent .2f and then
percent x this is going to format it appropriately I can run it and actually
it will change it this is at .1 I believe from the last time I did it so let's
run this and then let's run this again it'll change it to .2 so that's two
decimal places I like it at .1 but let's keep it at .2 why not that's how you
change that and I like looking at it like this a lot better than scientific
notation so just something to point out
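The display option described here can be sketched as follows. The number is a made-up large float of the kind pandas may otherwise show in scientific notation.

```python
import pandas as pd

# Show floats with two decimal places instead of scientific notation.
pd.set_option("display.float_format", lambda x: "%.2f" % x)

# Made-up value, large enough that it might otherwise print like 1.43e+09.
s = pd.Series([1425887337.0])
print(s)
```

This only changes how values are displayed, not the stored data. Calling pd.reset_option("display.float_format") restores the default.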
let's go down here and let's just pull up the data frame so we have this data
one of the first things that I like to do when I get a data set is to just
look at the info so we're going to do df.info() and this gives us just some
really high-level information this is how many columns we have here are the
column names here are how many values we have and if you notice we have 234 in
each of these columns until we get to this 2022 population once we get there
we start losing some values and then at the world population percentage we
have all of our values all 234 of them the count tells us that it's non-null
so it does have values in it and then we also have the data types and these
come in handy later these are really great to know and we'll be able to use
those in a few different ways later on in this tutorial
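A small sketch of what df.info() reports, on a made-up frame with assumed column names. The non-null counts it prints can also be obtained with df.count() when they are needed in code.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the world population data; values are made up.
df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "2022 Population": [1000.0, np.nan, 300.0],
    "World Population Percentage": [0.5, 0.1, 0.2],
})

# Prints the column names, the non-null count per column, and the dtypes.
df.info()

# The same non-null counts, returned as a Series instead of printed.
non_null = df.count()
print(non_null)
```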
really quickly I wanted to give a huge shout out to the sponsor of this entire
pandas series and that is Udemy Udemy has some of the best courses at the best
prices and it is no exception when it comes to pandas courses if you want to
master pandas this is the course that I would recommend it's going to teach
you just about everything you need to know about pandas so huge shout out to
Udemy for sponsoring this pandas series and let's get back to the video
the next thing that I really like to do is df.describe() this allows you to
get a really high-level overview of all of your columns very quickly you can
get the count the mean the standard deviation the minimum value and the
maximum value as well as the 25th 50th and 75th percentiles of your values so
just at a super quick glance there is a row somewhere in here this country
their population is 510 for 2022 and in fact if you go back to 1970 it was
higher it was at 752 that's just interesting then if we look at the max
population one has 1.42 billion I believe that's China and then over here in
1970 we have 822 million again I still believe that's China but this gives you
just a really nice high-level view of all of these values or all these
different calculations that you can run on it and we can run all of these
individually on specific columns too but it's just a nice high-level overview
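For illustration, describe() on a tiny frame. The 510 and 752 figures are the minimums mentioned in the video; the other numbers and the column names are made up.

```python
import pandas as pd

# 510 and 752 come from the walkthrough; the remaining values are made up.
df = pd.DataFrame({
    "2022 Population": [510, 2000, 350000],
    "1970 Population": [752, 1500, 200000],
})

# count, mean, std, min, 25%, 50%, 75%, max for every numeric column.
summary = df.describe()
print(summary)

# Any single statistic can also be computed on one column directly.
smallest_2022 = df["2022 Population"].min()
```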
one thing that we just talked about was the null values that we're seeing in
here I'd like to see how many values we're actually missing because that is a
problem we don't want to have too many missing values or it could really
obscure or change the data set entirely and so we don't want that so we'll say
df.isnull() and then .sum() and when we do this it's going to give us all the
columns and how many values we're actually missing now we have 234 rows of
data so we have 41 477 55424 so we definitely have data missing what we choose
to do with it in the data cleaning process maybe we want to populate it with a
median value maybe we just want to delete those countries entirely if the data
is missing I don't think you're going to do that but these are things that you
need to think about when you're actually finding these missing values this is
what the EDA process is all about we want to find outliers missing values
things that are wrong with the data or we can find insights into it while
we're doing this as well so this is definitely something that I would consider
when I'm actually going through that data cleaning process really really
important information to know
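A sketch of counting missing values per column, followed by one of the options mentioned here (filling with the median). The frame is made up, and whether to fill or drop is a judgment call for the cleaning step, not something this example decides.

```python
import numpy as np
import pandas as pd

# Toy data with two missing populations; values are made up.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "2022 Population": [1000.0, np.nan, 300.0, np.nan],
})

# Number of missing values in each column.
missing = df.isnull().sum()
print(missing)

# One possible treatment: fill the gaps with the column median.
median_2022 = df["2022 Population"].median()
filled = df["2022 Population"].fillna(median_2022)
```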
now let's go right down here go to our next cell say df.unique and it's
actually nunique this is going to show us how many unique values are actually
in each of these columns and this one makes the most sense for continents
because I think there's only seven continents right but we have six right here
and for all of these each of these ranks countries capitals should all be
unique that makes perfect sense as well as these populations they are such
specific numbers and such large numbers I would be shocked if any of these
were the same and then for these world population percentages it's much lower
and again that makes a lot of sense because when we're looking at and we'll
pull it up right here when we're looking at these world population percentages
a lot of them are really low 0.00 0.01 like this one 0.2 there are a lot of
really low values for those small countries and so those all count as one
unique value
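A minimal nunique sketch on made-up rows, showing why a rounded percentage column ends up with fewer unique values than the country column.

```python
import pandas as pd

# Toy data; names and values are made up.
df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Continent": ["Asia", "Asia", "Europe"],
    "World Population Percentage": [0.0, 0.0, 0.01],
})

# Number of distinct values in each column (missing values are not counted).
counts = df.nunique()
print(counts)
```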
now let's say we just have this data right here and we want to take a look at
some of the largest countries and we can easily do that we could say max and
take a look at the largest country but I want to be a little bit more
strategic I want to be able to look at some of the top range of countries and
we can do that based off this 2022 population so we'll say df.sort_values this
is how we sort not filter but order our data so we'll do sort_values and then
we'll do by is equal to and then we'll specify that we want this 2022
population and then actually let's just run this as is but we'll do head
because we just want to look at the top values so what we're looking at is
actually this 2022 population that's what we're sorting on basically and we're
looking at the very bottom values because it's sorting ascending so from
lowest to highest so this Vatican City in Europe is 510 that's the value that
we were looking at earlier
now we can do comma ascending equal to False because it was by default True
and then it'll give us the very largest ones so if we just take a look at the
top five largest by population we're looking at China India United States
Indonesia and Pakistan and we can even specify that we want the top 10 in this
head we can bring in the top 10 we also have Nigeria Brazil Bangladesh Russia
and Mexico and you can do this for literally any of these columns whether you
want to look at continent capital country you can sort on these and look at
them and you can even look at things like growth rate world percentage this
one seems really interesting let's just look at this one really quickly before
we move on to the next thing if we look at this world percentage just China
alone I believe yeah just China alone is 17.88% of the world so world
population percentage again just getting in here looking around that's all
we're really doing
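The sorting steps walked through here, sketched on made-up rows with an assumed column name. The default sort is ascending, so head() returns the smallest values unless ascending=False is passed.

```python
import pandas as pd

# Toy data; names and values are made up.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "2022 Population": [510, 9000, 120, 45000],
})

# Default is ascending=True, so head() shows the smallest populations.
smallest = df.sort_values(by="2022 Population").head(2)

# ascending=False puts the largest populations first; head(n) picks the top n.
largest = df.sort_values(by="2022 Population", ascending=False).head(2)

print(smallest)
print(largest)
```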
now I want to look at something and I have always liked doing this which is
looking at correlations so correlation between usually only numeric values we
can do that by saying df.corr() and we'll run this and what this is doing is
comparing every column to every other column and looking at how closely
correlated they are so this 2022 population if we look across the board it's
very highly I mean this is a one to one this is highly correlated to each
other and that's true for almost all of these populations they're very very
closely tied to each other which makes perfect sense because for most
countries they're going to be steadily increasing and so they're probably
almost exactly correlated but we can look at these populations and if you look
at the area it's only somewhat correlated and that's because some countries
have a very high population but a small area or vice versa a large area and a
very small population so there isn't a one-to-one correlation there
but it's hard to really just glance at this and understand everything that's
there we could just visualize it and it would be a lot easier so let's go
ahead and do that let's go down here we're just going to visualize this using
a heat map basically so we're going to say sns.heatmap and an open parenthesis
and the data that we're going to be looking at is df.corr() and then we also
want to say annot=True I'll show you what that looks like in just a little bit
but let's do plt.show() and this will be our first look we can get a little
glimpse of what it looks like but this looks absolutely terrible let's change
the figure size really quick so I want to make this much larger than it
already is we'll do plt.rcParams and then in quotes figure.figsize this
actually needs to be in brackets I believe just like this not parentheses and
we'll say it is equal to and now we can specify the value that we want let's
do 10 comma 7 and see if this looks any better no that doesn't look good let's
do 20 okay that looks a lot better
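A rough sketch of the heat map step on made-up data. The Agg backend line is an assumption added so the example also runs as a plain script without a display; in a Jupyter notebook it can be left out. Selecting numeric columns before corr() is likewise an addition for compatibility with newer pandas.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; omit this in a notebook

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data; values are made up.
df = pd.DataFrame({
    "2022 Population": [100, 200, 300, 400],
    "1970 Population": [50, 100, 150, 200],
    "Area": [10, 5, 40, 20],
})

# Make figures larger: rcParams is indexed with square brackets, like a dict.
plt.rcParams["figure.figsize"] = (20, 7)

# annot=True writes the correlation value inside each cell.
ax = sns.heatmap(df.select_dtypes(include="number").corr(), annot=True)
plt.show()
```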
and you know this is just a quick way because it gives you basically a
color-coded system highly correlated is this tan all the way down to basically
no correlation or negative correlation even which is black so when we're
looking at these 2022 populations and these are populations right down here on
this axis we can see very quickly that all of these are extremely highly
correlated whereas the rank really has nothing to do with it it's negatively
correlated then for the population and the world population percentage it
again is quite correlated except for the area density and growth rate so I
find that really interesting that the density the growth rate and the area
aren't really all that associated or correlated with the population numbers I
would have assumed that on some level they went hand in hand the area again
would make sense larger area larger population that kind of thing but even
density I guess density and growth rate growth rate I can see because that's a
percentage thing that could definitely be not correlated I thought the density
would be more correlated than it is all that to say this is one way that you
can look at your data see how correlated it is to one another that can
definitely help you know what to analyze and look at later when you're
actually doing your data analysis
Let's go right down here. Something that I do almost all the time when I'm doing any type of exploratory data analysis like this is group together columns and start looking at the data a little bit closer. So let's go ahead and group on the continent. Let's look at it right here; let's group on this Continent column.

Sometimes when you're doing this EDA you already know what the end goal of this data set is. You know what you're looking for and what you're going to visualize at the end, and that really comes in handy when doing this. But sometimes you don't; sometimes you're just going in blind. So far we've really just been going in blind. We're just throwing things at the wind, seeing some overviews, looking at correlation. That's all we've done. Now I want to get more specific. I want to have a use case, something that I'm looking for. Not doing full data analysis, not diving into the depths, but something we can aim for. So the use case, or the question for us, is: are there certain continents that have grown faster than others, and in which ways?

So we want to focus on these continents. We know that's the most important column for this use case, this very fake use case. We can group on this continent and look at these populations right here. We can't really see growth otherwise: you can see a growth rate, but for the density per kilometer we don't have multiple values, it's just a single static value. Same for growth rate, same for world population percentage. But we have the population over a long span, many, many years, you know, 50 years of data here. So with this we can see which countries have really done well, or which continents have really done well.
So without talking about it even more, let's do df.groupby, and then we'll say Continent. Oops, let me just copy this; I'm not good at spelling. We're going to say df.groupby on Continent, and then we'll do .mean(), and we can just do it just like this. And now we have Africa, Asia, Europe, North America, Oceania, and South America.
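The grouping step can be sketched roughly as below. The data is a tiny invented table, and `numeric_only=True` is an addition that the video does not mention: newer pandas releases generally refuse to average text columns such as Country, so the flag is assumed here to keep the sketch runnable.

```python
import pandas as pd

# Invented rows; only the column layout mirrors the video's data set.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Continent": ["Asia", "Asia", "Oceania", "Oceania"],
    "2022 Population": [400, 200, 10, 30],
    "1970 Population": [200, 100, 5, 15],
})

# One row per continent, each numeric column averaged within the group.
continent_means = df.groupby("Continent").mean(numeric_only=True)
print(continent_means)
```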
Okay, so if I'm being completely honest, I knew most of these. All right, I'm no geography expert, but I knew most of these. I don't know what this Oceania is; I genuinely don't know what that is. So let's just search for that value and see. We'll come back up here in just a second, but I want to understand what this is. So we're going to do df, and we'll say Continent (let me sound that out for you guys), then we'll do .str.contains (oops, contains, good night), and then I want to look for Oceania. Let's run this. Oh, I need to do it like this. Now let's run this. So now we're looking at our data frame, and we're seeing the rows where the Continent value is Oceania. Okay, so these look like islands, I'm guessing. We have Fiji, Guam, New Zealand, Papua New Guinea.
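A hedged sketch of that search, assuming the column is named Continent and the stored spelling is "Oceania". The table is a hand-built stand-in containing only the countries named above plus one other row for contrast.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["Fiji", "Guam", "Japan", "New Zealand", "Papua New Guinea"],
    "Continent": ["Oceania", "Oceania", "Asia", "Oceania", "Oceania"],
})

# str.contains returns a True/False Series; indexing df with it keeps the True rows.
mask = df["Continent"].str.contains("Oceania")
oceania = df[mask]
print(oceania)
```

Running the `str.contains` expression on its own only shows the True/False values, so the mask has to be passed back into `df[...]` to see the matching rows.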
Yeah, these all look like islands, I'm guessing, based off the continent Oceana, um, Oceania, O-cea, Oceania. Guys, this is tough for me, okay? I'm doing my best. You know, this is part of the EDA process: I don't know what that means. I don't know what Ocean, Ocean, Oceania is. Geez. I'm just going to call it Oceana. That's so wrong, but it's so easy for me to say. I now am seeing this, and it looks like islands, which would make sense, because they have the highest average rank, and I'm guessing that's because they're mostly just small countries.

So let's order this really quickly. We're going to do .sort_values, do an open parenthesis, and I want to sort on the population (we're just doing the average population), so we'll do by= with the average population column, and we'll do ascending=False. So we're looking at this average, or mean, population. Asia has the highest population on average, and then we have South America, Africa, Europe, North America, and then Oceana at the very bottom, which makes perfect sense: again, small islands.
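The ordering step might look like the sketch below. It starts from a small invented table of per-continent means rather than the real grouped output, and the column name is an assumption.

```python
import pandas as pd

# Stand-in for the grouped means; the numbers are made up.
continent_means = pd.DataFrame(
    {"2022 Population": [20.0, 300.0, 150.0]},
    index=pd.Index(["Oceania", "Asia", "Europe"], name="Continent"),
)

# Largest average population first.
ranked = continent_means.sort_values(by="2022 Population", ascending=False)
print(ranked)
```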
World population percentage: each of those countries in Asia makes up about 1% on average. Really interesting to know, just looking at this. And the density in Asia is far higher, almost double every single other continent. Really, really interesting, actually, now that I'm looking at this. That's something that I would actually look into, and I would be like, what is this Oceana, or Oceania? What does that mean? Let me look into that, let me explore that more, because I want to know this data set. I'm trying to really understand this data set well.

But what I want to do now is visualize this, because just looking at it, it's hard to picture. And again, the use case is: which continent has grown the fastest? It could be percentage-wise, it could be as a whole on average. Let's take a look.
So we're going to take this and copy it like this, and bring it right down here. Let's look at this. If I try to visualize this (and let's do that), let's do df2 is equal to this, because I already know it's not going to look good just based off how the data is sitting. We do df2, oops, what am I doing? I don't need to do that, but I will. Okay, df2, and we'll do df2.plot(). We'll run it just like this.

As you can see, Asia, South America, Africa, Europe, North America, Oceana: we can kind of understand what's happening, but these are the actual values that are being visualized, not the continents, which is what I wanted. In order to switch it, and it's actually pretty easy, and this is something that's good to know, we can transpose it so that these continents become the columns and the columns become the index. All we have to do is say df2.transpose(), with the parentheses right here. Let's just look at it, and then we'll save it. So now all these columns are right here in the index, and all of the index values are the columns. So let's say df3 is equal to this. I'm just doing that so I don't write over df or my earlier data frames. So now we have this data frame 3. Now let's do df3.plot(), and it should look quite a bit different. Whoops, I didn't run this. Let's run this, and run this.
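A small sketch of what the transpose does, using an invented two-continent table in place of the real grouped means. A pandas line plot draws one line per column, so the continents need to be the columns before plotting.

```python
import pandas as pd

# Stand-in for df2: continents in the index, population columns across the top.
df2 = pd.DataFrame(
    {"2022 Population": [300.0, 20.0], "1970 Population": [150.0, 7.5]},
    index=pd.Index(["Asia", "Oceania"], name="Continent"),
)

# Swap rows and columns: continents become columns, years become the index.
df3 = df2.transpose()
print(df3)
```

Saving the result under a new name, as done here with `df3`, leaves the earlier frame untouched.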
And as you can see, this does not look right at all. The reason is that we're not looking at only the correct columns. We have density in here, world population percentage, rank. We don't need any of those. The only ones that we want to keep are these ones right here, the population columns.

Now, we can do that, and we can just go right up here. This is where we created that data frame 2 that we transposed. We can go right up here and specify within this that we only want specific columns. We can go through and handwrite all of these, and by all means go for it, but I am going to go down here and say df.columns and run it. It's going to give us this list of all of our columns, and you can just copy this and put it right in here. I think I need a list; I think it needs to be like this. Let me try running this. Okay, so this worked properly.

You can do it just like this, or there's a little shortcut. If you want to do a shortcut, and I would hope you would, you would just do df.columns, just like how we looked at down here, except since this is an index, we can slice through it. So we can just say zero, one, two, okay, so we can do 5 up to 13, because I think it's seven, and let's see if this works. It may not; I may actually need to go like this. Let's see. There we go. So you can just use the indexing to save you some visual space, and it gives you the exact same output.
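Here is one way the column-slicing shortcut could be written. The table is invented and much narrower than the real one, so the slice is `2:5` in this sketch where the video uses 5 up to 13; wrapping the slice in `list(...)` is a cautious choice for the sketch, not something the video shows.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Continent": ["Asia", "Asia", "Oceania", "Oceania"],
    "2022 Population": [400, 200, 10, 30],
    "2000 Population": [300, 200, 8, 22],
    "1970 Population": [200, 100, 5, 15],
    "Density": [50.0, 70.0, 3.0, 5.0],
})

# df.columns is an Index, so it can be sliced by position like a list.
population_cols = list(df.columns[2:5])

# Average only the selected population columns within each continent.
df2 = df.groupby("Continent")[population_cols].mean()
print(df2)
```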
So now we have this; this is our df2. Now let's go down and transpose it, so now we just have these populations, and we have our continents right here. And then now we're going to plot it. This looks good, although it's backward. Okay, it's backward. So what I actually want to do is not this. That is a quick way to do it, although not the best way to do it. So I'm actually going to copy all of these, and although I said it would save us time, it did not at all. I'm going to put a bracket right here, I'm going to paste this in here, and I'm literally going to change these up. I might speed this up, or I might just have you sit through this, because this is an interesting part of the process and I want you to get the full experience. You know what, now that I'm talking about it, that is what we're going to do. You guys can hang out with me; this is a good time. We have 2010, 2015, 2020, and 2022. Now let's run it. What did I do? Oh, too many brackets. There we go. So now it's ordered appropriately: we have 1970 all the way up to 2022. This is how we want it. Let's transpose it appropriately.
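A rough sketch of the reorder, transpose, and plot sequence. In the video the column names are retyped by hand in chronological order; reversing the column order, as below, is a swapped-in shortcut that only works if the columns are strictly newest-first. The data is invented, and the `Agg` backend line is only there so the sketch runs without a display (a notebook would not need it).

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; an assumption for running outside a notebook
import pandas as pd

# Stand-in for df2 with columns running newest to oldest, as in the video.
df2 = pd.DataFrame(
    {
        "2022 Population": [300.0, 20.0],
        "2000 Population": [250.0, 15.0],
        "1970 Population": [150.0, 7.5],
    },
    index=pd.Index(["Asia", "Oceania"], name="Continent"),
)

# Reverse the column order so time runs oldest to newest.
df2 = df2[df2.columns[::-1]]

# Years down the index, continents as columns: one plotted line per continent.
df3 = df2.transpose()
ax = df3.plot()
```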
Let's run it, and now we basically have the inverted image of this. Now, just at a glance (and we haven't done anything to this except for literally what we are looking at), we can see that from 1970, China here, you know, Asia and China, are already in the lead by quite a bit, and it continues to drastically go up, especially in the 2000s. Like right here, it explodes, just straight up, then kind of keeps going up and starts leveling off. Every other continent, especially Oceana, is just really low; it never has done a bunch. Let's look at green. Green has gone up from, let's say, 0.1 up to about 0.2, so they've almost doubled in the last 50 years. And again, you can just get a high-level overview of each of these continents over the span of this time.

So this is one way that we can look at that use case. We're not going to harp on that too long. I just want to give you an example: when you're looking at this, sometimes you'll have something in mind of what you're looking for, and you go exploring and just find what's out there and find what you see.
The next thing I want to look at is a box plot. Now, I personally love box plots. They're really good for finding outliers, and there's a lot of outliers here. I already know this because the average and the 25th and 50th percentiles are very low, and then there are some really big outliers. But for your data set it may not be that way, and those outliers may be something that you really need to look into. Box plots have been something that I've used a lot, where I found those outliers that way, started to dig into the data, and came across some stuff where I'm like, oh, I have to clean this up, I have to go back to the source. It's really, really powerful and useful to be able to find these. So all you have to do is df.boxplot(), and let's take a look at it. This already looks good as is. Maybe I'll make it a little bit wider. Let's do figsize (oops, sorry), figsize is equal to, let's try, 20 by 10. Okay, that didn't help at all. I apologize, I thought it would, but let's keep going.
What this is showing us is that these little boxes down here, which are usually much, much larger because you have a more equal distribution of values, sit in the small values; this is where our averages lie. This number right here is the upper range, and then all these open circles actually stand for outliers. So we're looking at the 2022 population, and there's a lot of outliers. Now, knowing our data set is really important: for our data set, outliers are to be expected, especially when most countries or continents are small. All of these little dots are outlier countries, or outlier values, where each value corresponds to a country.

If this was a different data set, I would be searching on these and trying to find them so that I can see what's wrong with them, if anything, or if they are real numbers. Like if this was revenue, and everyone's revenue is way down here, and then there's one company that's making like 10 trillion dollars, that'd be an outlier up here, and it would definitely be something that you want to look into. For our data set, knowing that we're looking at population, this is more than acceptable, oddly enough. But that's what box plots are really good for: showing you some of those quartiles, the upper and the lower, as well as denoting these points that fall outside of those normal ranges for you to look into. So really, really useful.
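To make the outlier idea concrete, here is a minimal sketch on invented numbers. It draws the box plot and also computes the usual 1.5 times IQR fences by hand, which is the default rule matplotlib uses for its whiskers; the manual calculation is an addition for illustration and is not done in the video. The `Agg` backend line is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; not needed inside a notebook
import pandas as pd

# Invented values: seven small "countries" and one very large one.
df = pd.DataFrame({
    "Country": list("ABCDEFGH"),
    "2022 Population": [1, 2, 2, 3, 3, 4, 5, 100],
})

col = df["2022 Population"]
q1, q3 = col.quantile(0.25), col.quantile(0.75)
iqr = q3 - q1

# Points beyond these fences are the ones a box plot draws as separate circles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = df[(col < lower_fence) | (col > upper_fence)]
print(outliers)

# The plot itself; figsize is (width, height) in inches.
ax = df.boxplot(column="2022 Population", figsize=(20, 10))
```

Filtering the frame this way gives the actual rows behind the circles, which is the "searching on these" step described above.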
So now let's go down here and pull up our data frame again. We've kind of just zoomed through the whole EDA process. There was one last thing that I wanted to show you, and this is the very last thing that we're going to look at. We're ending on really a low point, if I'm being honest, because the last stuff was much more exciting. But there is something: df.dtypes. Oops, let's do df.dtypes, and we'll run this. Now, just like info, it gave us these values, but we're actually able to search on these values now. So these object, float, and integer types, we can search on those, which is really great, because we can do include= and pass something like number. None of these are numbers, right? Or none of them explicitly say number. But when we run it, I'm getting an error, 'Series' object not, oh, that's because I'm using dtypes, which is a Series. We need to do select_dtypes. Now let's run this. Now it's only returning the columns in this data frame where the data types are included in this number category, so you won't see Country or any of those text columns, the strings. If we want to see those, we go in here and say object and run that.

This is another really quick way we can filter those columns to look for something specific. We could even do float in here, and now it's not including that Rank column, which was an integer. So we can specify the data type and it'll filter all of the columns based off of that. When you're doing stuff like this, it is good to know what kind of data types you're working with and to look at just those data types, because there might be some type of analysis you want to perform on just the numeric, or just the string, or just the integer columns within your data set.
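A short sketch of filtering columns by data type, on an invented three-column table. The Country column is built with an explicit `object` dtype so the sketch behaves the same way regardless of how a given pandas version stores strings by default; that detail is an assumption of the sketch and not part of the video.

```python
import pandas as pd

df = pd.DataFrame({
    "Rank": [1, 2],                                     # integer column
    "Country": pd.Series(["A", "B"], dtype="object"),   # text column
    "Growth Rate": [1.01, 0.99],                        # float column
})

print(df.dtypes)  # one dtype per column, returned as a Series

numeric_cols = df.select_dtypes(include="number")  # integers and floats
float_cols = df.select_dtypes(include="float")     # floats only, so Rank drops out
text_cols = df.select_dtypes(include="object")     # text columns only
```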
So again, ending on a low note, I apologize. Everything else that we looked at, all those other things, are all things that I typically do in some way or another when I'm looking at a data set. Exploratory data analysis is really just the first look. You're looking at it, then you're going to be cleaning it up, doing the data cleaning process, and then you're going to be doing your actual data analysis, actually finding those trends and patterns and then visualizing it in some way to find some kind of meaning or insight or value from that data. And again, there's a thousand different ways you can go about this. It does typically depend on the data set, but these are a lot of the ways that you'll clean a lot of different data sets, and that's why I went into the things that we looked at in this video.

So I hope that you guys liked it. I hope that you enjoyed something in this tutorial. If you like this video, be sure to like and subscribe, as well as check out all my other videos on pandas and Python, and I will see you in the next video.
What's going on everybody, welcome back to another video. Today we are back with another data analyst portfolio project, where we will be scraping data from Amazon using Python.

Now you may be asking, do I need to know web scraping to become a data analyst? The answer is no, you absolutely don't need to know it, but it is a very cool skill to learn, and in fact I have used it in my job in the past. So it is useful, but you really don't need to know it. Something that it is used for is creating your own data sets, and we're going to be looking at one where you can create your own data set today. But there are a lot of other uses for web scraping, and I'm sure I'll talk a little bit more about that while we're actually walking through the project.

One last thing I want to say before we get started is that this is most likely an intermediate project. So if you are just now learning the basics of Python, this might be a little bit challenging for you, but I still recommend going through it, because I will do my best to walk through everything, every single step of the way, and explain all the concepts, so you can still learn something even if you aren't super good at Python right now. With that being said, let's jump over to my screen and get started on the project.
on the project all right so we are going to get started and if you didn't watch
to get started and if you didn't watch the last project I had people download
the last project I had people download Anaconda uh we use Jupiter notebooks um
Anaconda uh we use Jupiter notebooks um and I'll show you how to get to that in
and I'll show you how to get to that in just a second but I'll I'll leave this
just a second but I'll I'll leave this link in the description if you haven't
link in the description if you haven't done that already and you are just doing
done that already and you are just doing this project um but you'll go you'll
this project um but you'll go you'll download andaconda You Know download
download andaconda You Know download super easy um and you're going to open
super easy um and you're going to open up Jupiter notebooks I'll launch it
up Jupiter notebooks I'll launch it right now I already have it open uh but
right now I already have it open uh but I'll open up another one just for you
I'll open up another one just for you know the purposes of demonstration what
What we are going to do today is what people voted on. There were like 8,000 people that voted in the poll that I made of what data you wanted me to scrape. The options were Amazon, cryptocurrency, weather, and something else I don't remember. Overwhelmingly, like 70% of people, maybe even 80% (don't fact check me on that), voted for Amazon, and so I'm going to do it.
Now, there are many things that you can scrape off of Amazon, just a ton of stuff. I'm going to show you how to do it, how to make it useful, and how to make a data set, and it's going to be really interesting. There are lots of other ways to do this, too. I have already created this project, and I'm going to show you how to do it off of this page, when you're actually in an item. You can scrape basically anything in here, and I'll show you how to do that.
Another thing is a little bit more advanced, and that's why this first video is starting off on the easier side. It's not easy, but it's easier. The next video that I'm going to make is how to do multiple items, so this item, this item, this item, and then traverse through the different pages. There are 20 pages and you want all of that data, so how do you get all of that? That'll be the next project. I don't know when I plan on doing that. I have it like 90% of the way done, but I had this one completed, and so I wanted to get it out to you guys now. That will probably be the next project. I think that one is much more difficult, so if you can understand this one, then you should be able to understand the next project too. It's just a little bit more complicated.
So with that being said, we are going to actually get into the project. I'm going to delete one of these. All we're going to do is go to New and choose Python 3, and it'll open up a new one. We'll call this Amazon Web Scraper Project. Did I spell it right? Perfect.
The first thing that we need to do, or that we should do, is import our libraries, so I'm going to write import libraries. Oops, what am I doing? It's off to a terrible start. There we go, import libraries. Now, I'm not going to write out all the libraries. I have some things that I'm going to be copying and pasting throughout this. There are only a few things that I'm copying and pasting, and you can take a quick glance at them. They're things that I just don't want to waste time on, because this could be a long video. There is a link below, if you haven't clicked it already, that goes to the GitHub page where you can have all of this code already written. I do recommend writing it all yourself, because you will learn it much better, I promise. You'll make mistakes and you'll figure it out, and all that good stuff. But you will have that code available, so you can just go copy and paste it. That's what I would do.
What we are going to be using today is something called Beautiful Soup, along with requests. Then we're going to be using time and datetime. There's also an optional one, which I'm going to show you at the end. It is not really part of the project, it goes above and beyond, but this library here is for sending emails to yourself. I'll show you how you can use it if you want to. I already have the whole code written out, so you can just steal it, try it out yourself, and see if you can get it to work. This one is not as important, so I'll put it down here. So let's move on.
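As a rough sketch of this step, the cell below checks that the libraries named above can be imported before the rest of the notebook runs. The names bs4 and requests are the usual import names for Beautiful Soup and requests. smtplib is an assumption, since the video only describes it as a library for sending emails, and the missing_modules helper is hypothetical.

```python
import importlib.util

# Import names for the libraries mentioned in the video.
REQUIRED = ["bs4", "requests", "time", "datetime"]
# Assumption: the email library is Python's built-in smtplib.
OPTIONAL = ["smtplib"]


def missing_modules(names):
    """Return the names that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Install these before continuing:", ", ".join(missing))
else:
    print("All required libraries are available.")

# Once everything is available, the notebook's import cell would look like:
#   from bs4 import BeautifulSoup
#   import requests
#   import time
#   import datetime
#   import smtplib  # optional, only for the email step
```

Anaconda typically ships with both third-party libraries, so the check is mostly useful on a plain Python install.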
Now, one thing I want to say before we get too into it is that right here in front of me is a different laptop. It took me a solid 10 hours or so to write all of this, over the course of about two weeks in my free time. I'd pick it up on and off, an hour here, an hour there, to finish this project. I made a ton of mistakes and messed a bunch of things up, and I finally got it to work after a bunch of revisions. That's typically how things go when I do projects. So I'm about to give you a streamlined version of this, because I have all the code right down here, and I'm going to be glancing at it a lot, just so I don't make this video 20 hours of trying to remember all the code off the top of my head. I have it written out already. I already did the project, it works, and it's a good project. I don't want to waste your time, and I just want you to know that nobody should expect to be able to do this off the top of their head in an hour. Most people won't. It takes time, and you make mistakes. But let's get started on the project.
Now, what we're going to have to do is tell Beautiful Soup and requests where we are actually getting this data from, meaning what website, along with some information from our computer. Again, there's going to be a little copying and pasting in here, because you will never need to memorize this. Right here we're going to connect to the website, so I'm just going to write connect to website. Then we're going to say URL is equal to, and let's go get our URL. We have this page right here, so literally just go up here, press Ctrl+A, and copy that. Oops, that's the actual project, get rid of that. Paste it in here, and that is our URL. We will use that in just a second.
Let me just get some room here. What we're going to need next is something called headers. Again, you will never need to memorize this, so I'm just going to say headers, and I'm going to copy this in. Let me show you really quick how to get this, and why you don't need to know any of it. What this headers thing contains is something called a user agent, which identifies your browser and computer to the website. You can get yours by going to this link right here. I'm going to put this link in the description so that you can go and get it. There's something right here called the User-Agent, so all you have to do is copy it, just like this. I'm going to go back here and paste it in to show you that it's the exact same. There you go, it's the exact same. All of this extra stuff, Accept-Encoding, Accept with this HTML stuff, Connection close, you don't need to know any of it. I promise it'll never come in handy ever in life. Actually, there will be one person who that comes in handy for, and then they'll message me.
But we are now connecting using our computer and this URL. Then what we want to write is page equals, and this is where we start using these libraries. We're going to use requests.get, pull in that URL, and say headers is equal to our headers right here.
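The connection step described here might look roughly like the sketch below. The URL and the User-Agent string are placeholders, the other header values are typical ones rather than values copied from the video, and the fetch_page helper with its timeout and status check is an addition for illustration, not something shown on screen.

```python
import requests

# Placeholder product URL: paste the address of the item page you want to scrape.
URL = "https://www.amazon.com/dp/EXAMPLE"

# The User-Agent below is a placeholder. Replace it with the value reported
# for your own browser. The other entries are the extra fields mentioned in
# the video, filled in with typical values (an assumption).
headers = {
    "User-Agent": "Mozilla/5.0 (replace with your own user agent)",
    "Accept-Encoding": "gzip, deflate",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Connection": "close",
}


def fetch_page(url, request_headers, timeout=10):
    """Send a GET request with the given headers and return the response."""
    response = requests.get(url, headers=request_headers, timeout=timeout)
    # Fail loudly on 4xx/5xx instead of silently parsing an error page.
    response.raise_for_status()
    return response


# In the notebook this would be: page = fetch_page(URL, headers)
# It is left uncalled here so the sketch does not hit the network.
```

Sites like Amazon may still block or challenge automated requests, so a successful status code is not guaranteed even with these headers.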
So we have this, and this is where we're going to actually start bringing in the data. It's not going to look like that at first, but I'll try to print some stuff out as we go along so that you can see what it looks like and how we're going to make it more useful, because it comes in very dirty when we first get it, and some of the things I'm going to show you will help clean that up. Before we go any further, I don't want my head to be here for the entire time, so I'm going to get rid of myself so you can just see the page. It's less distracting. I hate when I feel like people are always watching me, and I want people to just focus on the code. So I will see you in a little bit. Let's get back into it.
All right, so what we are going to do is actually start using the Beautiful Soup library. We are going to say soup1 is equal to, and this is where we start bringing in Beautiful Soup. You guessed it, you're going to say BeautifulSoup, and then in parentheses we're going to do page.content. Again, these aren't really things that you need to memorize. We're just pulling in the content from the page, that's really all we're doing right now. It comes in as HTML, so after that we're going to put html.parser, in quotes. Actually, let me just name it soup1 in lowercase. I don't like using capital letters on stuff. Let's see if anything prints out real quick.
So we are literally pulling in all of the HTML. Let me go show you really quick, because we're going to get to this in a second anyway. If you come here, this is a static page, basically written in HTML. If you have never seen HTML before, a lot of this is stuff that most people will never use, but some of it is good to know. As you can see, I'm scrolling on this right side. By the way, I did right-click and Inspect, or Ctrl+Shift+I, whichever one works better for you. As I'm scrolling over this, you should see it highlighting different areas of the page. It's hard to get exactly what you want that way. Let's say we want this title. What I can do is click select element, go right here, and then select the header or the title of the page. For now, though, I just want to show you what we're pulling in. We're pulling in this DOCTYPE html and everything after it. That's what this is right here, this DOCTYPE html, and we're pulling every single thing in. That is what we're doing right now.
So let's go down a little bit and do soup2. It's basically just an upgrade to soup1. We'll do BeautifulSoup again, and this time we're going to pass in soup1, so we're pulling in that content again, and we're going to do .prettify() on it. If you don't know what prettify is, it is common in a lot of different languages and tools. It just makes things look better, that's really all it is. I don't know why I'm using double quotes here; you can use single ones if you want. Now let's print soup2, and it should be better formatted. Let's see if that's true. And it is. Before, if you could tell, it didn't have basically any formatting. It has a little bit of formatting now. It'll help in a second, and you'll see that.
want to do is go back and we want to actually get the data that we want now
actually get the data that we want now you can get any data you want I'm going
you can get any data you want I'm going to show you simple things really really
to show you simple things really really easy um in my in in in my opinion it
easy um in my in in in my opinion it gets more difficult the more complicated
gets more difficult the more complicated stuff you start pulling um and and
stuff you start pulling um and and you'll understand that as we go into it
you'll understand that as we go into it so what I'm going to do is I'm going to
so what I'm going to do is I'm going to select this and I'm going to select this
select this and I'm going to select this um the title I want that and so if you
um the title I want that and so if you do span ID it's equal to product uh
do span ID it's equal to product uh title so we need to remember that um
title so we need to remember that um class we don't need to know class I
class we don't need to know class I believe uh we're going to be using that
believe uh we're going to be using that ID this um ID equals product title so
ID this um ID equals product title so that's what we're going to be using um
that's what we're going to be using um class will come in in the next video
class will come in in the next video when we start looking at these uh but
when we start looking at these uh but not in this one so let's remember ID
not in this one so let's remember ID equals product title so let's go back
So let's go back over here. We have this soup2, which is basically all of that HTML, right down here. That is what we're pulling in, so we need to specify what we actually want. Let's say title, since that's what we're going to be getting, and we're going to do soup2, so taking all that content, then .find, and in parentheses we're going to say we want to find that id where it's equal to productTitle. Then we're going to do .get_text with open and close parentheses. Now let's print the title and see what we get. All right, that is exactly what we're looking for. It's the Funny Got Data MIS T-shirt, which is what we're trying to pull in. So that's perfect, that's exactly what we want.
Let me just do this to save me some time later on. We don't only want the title; we are also going to be pulling in the price. If you can guess, we'll be building a data set on the actual pricing. So let's go back here, use select element again, and go to this price. Again we're going to look at the id, and the id equals priceblock_ourprice, so fairly easy. You can copy this, but I'm just going to write it out. We're going to say price is equal to soup2.find, and then again id is equal to priceblock_ourprice. Did I spell that right? Oops, excuse me, there we go. And then the exact same thing, .get_text with parentheses. There's get_text, and there are other methods as well, so just know that get_text is the specific one we are using here. We might use a different one later on. Why do I have all this? Too much space. So now let's print the title and print the price and see what we get. Okay, so we have our title and we have our price. It looks like there's a lot of white space around them. We'll have to get rid of that in a little bit as we clean it up.
um you can get and this is up to you I'm not going to do this right now but I'm
not going to do this right now but I'm just going to show you how to do it you
just going to show you how to do it you can get this where you're pulling in the
can get this where you're pulling in the ratings um which is you know if you want
ratings um which is you know if you want to look at like how the ratings over
to look at like how the ratings over time or or what ratings are for specific
time or or what ratings are for specific products that could be really useful um
products that could be really useful um you can pull basically anything you can
you can pull basically anything you can go down the product details and look at
go down the product details and look at Dimensions uh anything you want on this
Dimensions uh anything you want on this page it is static so you can go in here
page it is static so you can go in here and pull anything it's it you just have
and pull anything it's it you just have to pull it from the HTML know where
to pull it from the HTML know where you're looking pull it in um and now
And now when we go back here, I'm going to show you how to use this, because we have this data, but how are we going to use it? That's the important part, I think. The first thing we need to do is clean this up a little bit, because if we tried to use it as is, it wouldn't be super useful. It's a little bit dirty, not super clean.
So let's start with the price, why not. We're going to say price.strip(), and that's just going to take the junk, basically the white space, off of either side. Let's run that real quick. So this is what we have. But I also don't want that dollar sign; I just want the numeric value. Later on we are going to be creating a process to put this into an Excel file. Again, we're trying to create a data set, and I don't want you to have to copy and paste stuff. It's all going to be automated to input this data into an Excel or CSV file for you, so think about making it useful in a CSV or Excel file later on. What we can do is add a bracket with 1 and then a colon, so it's just going to take everything from position one onward, which skips the dollar sign. Let's run that, and there we go. So let's just say price is equal to price.strip() and take everything after that first... what's the word for that? I can't remember the word. The first space? That's not the right word.
that's not the right word but all right let's do the title um this is basically
let's do the title um this is basically going to be the exact same thing um
going to be the exact same thing um super easy so we're just going to do
super easy so we're just going to do title. strip and open
title. strip and open parentheses um and we can you know if
parentheses um and we can you know if you want to do this exact same thing so
you want to do this exact same thing so now we have it it's a little bit cleaner
now we have it it's a little bit cleaner so this is what it originally looked
so this is what it originally looked like and now this is what it looks like
like and now this is what it looks like so you know nothing super crazy but you
so you know nothing super crazy but you know something interesting to know now
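As a rough sketch of the cleanup just described, assuming the scraped values come back as strings padded with whitespace. The raw_title and raw_price values below are made up for illustration and are not the actual page contents:

```python
# Hypothetical raw values standing in for what the scraper returned;
# the real strings depend on the product page.
raw_title = "\n\n   Example Data Analyst T-Shirt   \n"
raw_price = "\n   $16.99\n"

# strip() removes leading and trailing whitespace (spaces, tabs, newlines).
title = raw_title.strip()

# [1:] keeps everything from index 1 onward, dropping the first character,
# which here is the dollar sign.
price = raw_price.strip()[1:]

print(repr(title))
print(repr(price))
```

The slice only works if the first character really is the currency symbol, so it is worth printing the value before and after to check.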
Now, in the very next part (and let me just add a few of these, because it
makes me feel better) we're going to create our CSV and insert this data into
it. Later on I'm going to show you how to automate this process of pulling the
data to create a data set. Pulling this one time and putting it into a CSV
really doesn't do anything; you could just copy and paste it and save yourself
a lot of time. What I'm going to show you is doing it over time and having it
automated in the background. I guess that's a spoiler. So what we need to do
is create the CSV, insert the data into it, and then create a process to
append more data to that CSV. I'm doing a lot of talking, so let's do some
writing.
What we need is the csv library. I should have done this at the top, and maybe
I'll go back and move it later, but we're going to do import csv. In a CSV you
want headers and then you want the data. For our headers, which we'll call
header, we open a bracket and make the first one Title. You can call it
Product or whatever you want; I've been using title, so I'm going to call it
Title. Then we'll also have Price.

Now we need our data, so I'm going to say data is equal to... and this is
important. Let's check what our data is right now: type(price). These are
strings, and that's important to know. I don't want to get too much into
dictionaries and arrays and lists and strings, but this is a string, and on
its own it's not super usable for writing a row. What we're going to do is
make this a list, so I open a bracket and say data = [title, price]. Now if I
do type(data) and run that, it's a list. This is important because you can run
into a lot of issues with this stuff. It's really important to keep track of
what type your data is: is it a list, is it an array, is it a dictionary?
These things have a big impact, especially with this type of work, so I just
wanted to show you that really quick.
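A small sketch of the type check described above. The title and price values are placeholders, and the header names follow the ones chosen in the video:

```python
# Hypothetical cleaned values; both are plain strings at this point.
title = "Example Data Analyst T-Shirt"
price = "16.99"

print(type(price))   # each scraped value is a single string

# One CSV row is a list of values, and the header is a list of column names.
header = ["Title", "Price"]
data = [title, price]

print(type(data))
```

Keeping the header and the row as lists of the same length means each value lines up with its column when the row is written.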
What we're now going to do is create a CSV. I call it an Excel file or a CSV,
whatever you want to call it. We're going to say with open( and now we name
our file. You can name this whatever you want; I'm going to call it Amazon Web
Scraper Dataset, which is real long, with .csv on the end. Then we do 'w',
which means write. (Whoops, I typed an underscore at first, and I was
wondering why it was showing up in black.) Then we do newline='', and if you
don't know what newline is, all it does is make sure there isn't a blank line
between each row when we insert the data. Then we do encoding='UTF8', and that
is it, and we'll say as f.

Some of that you don't need to know and some of it's useful. The 'w' you
definitely need to know. The newline is good to know; I might take it out just
to show you what it does, because it's annoying if you don't have it, I
promise. The encoding is good to know too; the default depends on your system,
so it doesn't hurt to set it.

Next is something within the csv library: csv.writer(f), and we'll call that
writer. Then this is where we create the header, so we do
writer.writerow(header). This is just for the initial insertion of the data
into the CSV; the next one we write will be for appending data, which is going
to be a little bit different. These headers are going to be Title and Price.
For our last one we actually write the data, which is this data right here:
writer.writerow(data). So here we are creating the CSV, then inserting the
header, and then inserting the data. Super easy, and I think that's fairly
straightforward.
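The steps above can be sketched roughly as follows. The file name spelling, the temporary folder, and the row values are assumptions for illustration; only open(), csv.writer, and writerow come from the walkthrough:

```python
import csv
import os
import tempfile

header = ["Title", "Price"]
data = ["Example Data Analyst T-Shirt", "16.99"]  # hypothetical row

# Write into a temporary folder so this sketch does not touch real files.
csv_path = os.path.join(tempfile.mkdtemp(), "AmazonWebScraperDataset.csv")

# "w" creates the file, or overwrites it if it already exists.
# newline="" lets the csv module control line endings, which avoids
# blank rows between records on Windows.
with open(csv_path, "w", newline="", encoding="UTF8") as f:
    writer = csv.writer(f)
    writer.writerow(header)  # first row: column names
    writer.writerow(data)    # second row: the scraped values

# Read the file back to confirm what was written.
with open(csv_path, newline="", encoding="UTF8") as f:
    rows = list(csv.reader(f))

print(rows)
```

Because the mode is "w", running this again replaces the file rather than adding to it, which is why a separate append step is needed later.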
Right, now let's do this and see what happens. So I just ran it. Let's go over
here, it's in here somewhere... Amazon Web Scraper Dataset. Let's open that
up, and there we go. Oh jeez, this isn't good: it can't verify my
subscription. And why does it say 6.99? I'm going to go back and look, but I
think I know the issue. Other than that, this is exactly what we want. Of
course we want more data, and maybe some more useful data, and I'll show you
how to get that in just a second, but we just created that file out of thin
air; I didn't have it saved before. So we have this data set.

The issue was that I ran the price cleanup multiple times. Each run cuts off
the first character, so now it's 6.99, if I do it again it's .99, and if I
kept going it would get rid of everything. So I'm just going to run this
again, and run this again, and now everything's back to normal. Now if we run
this, it's going to overwrite the Amazon Web Scraper Dataset CSV and put the
data in properly. There we go. Oh jeez, guys, this is embarrassing. I don't
know why my subscription isn't activated. It's not going to matter for this
video, I guess, but that's really random. Anyway, we got what we need, and
that's perfect.
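The shrinking price above comes from re-running a cleanup step that is not safe to repeat. As a hedged sketch, here is the slicing approach next to an alternative that removes only a leading dollar sign. Both helper names and the sample value are hypothetical, and lstrip("$") is a swapped-in technique rather than what the video uses:

```python
def clean_price_slice(text):
    """Drop surrounding whitespace, then always drop the first character."""
    return text.strip()[1:]


def clean_price_symbol(text):
    """Drop surrounding whitespace, then drop leading '$' characters only."""
    return text.strip().lstrip("$")


scraped = "  $16.99  "  # hypothetical scraped value

once = clean_price_slice(scraped)
twice = clean_price_slice(once)        # re-running eats a digit

safe_once = clean_price_symbol(scraped)
safe_twice = clean_price_symbol(safe_once)  # re-running changes nothing

print(once, twice)
print(safe_once, safe_twice)
```

In a notebook, where cells are often re-run, a cleanup that gives the same result the second time is easier to live with.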
Now, what's actually important before we move on is some more useful data.
Something I like to do a lot with this type of work is add some type of date
stamp or timestamp, so I know when I collected the data. It usually comes in
handy later on, and I have never regretted putting it in there. I'll show you
really quick how to do it. You do import datetime (geez, I hate having to
format stuff like that), and then datetime.date.today(), and that gives us
this right here. We'll call it today: today = datetime.date.today(), and then
print(today). There we go: today's date is the 20th of August, 2021. Actually,
I'm going to get rid of that and put it back up here, right there, and run it
again. Then let's add this to our header, which we'll call Date, and add today
to our data, and we'll just run this again.
and what we can do just to check the data without having to open up the data
data without having to open up the data every single time which is super
every single time which is super annoying is we're going to use pandas
annoying is we're going to use pandas again I should have imported this at the
again I should have imported this at the top I'm just kind of um I'm not doing
top I'm just kind of um I'm not doing this off the top of my head but uh I
this off the top of my head but uh I didn't have it 100% planned so import
didn't have it 100% planned so import pandas and we're just going to say pd.
pandas and we're just going to say pd. read CSV and then we'll read it in um
read CSV and then we'll read it in um what you can do or what I often do is I
what you can do or what I often do is I go to properties and I go right
here and we'll say boom boom back slash this right here this I am doing
slash this right here this I am doing off the top of my head I don't do this
off the top of my head I don't do this often I think I have this memorized by
often I think I have this memorized by now uh I I I hope and then we'll do
now uh I I I hope and then we'll do print oh no we don't have to do print
print oh no we don't have to do print we'll just do this uh what do I do R
we'll just do this uh what do I do R let's actually call this um data frame
let's actually call this um data frame and we'll do
and we'll do print let's see what happens perfect
print let's see what happens perfect okay so what we have now is the new our
okay so what we have now is the new our new header our new data that we added in
new header our new data that we added in there so we have our title we have our
there so we have our title we have our price and we have our date now again you
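A hedged sketch of reading the CSV back with pandas. To keep it self-contained, the file contents are simulated with an in-memory string; the row values are placeholders, and with a real file you would pass its path to pd.read_csv instead:

```python
import io

import pandas as pd

# Stand-in for the CSV file on disk (hypothetical contents).
csv_text = (
    "Title,Price,Date\n"
    "Example Data Analyst T-Shirt,16.99,2021-08-20\n"
)

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
```

On Windows, a path copied from the file's properties contains backslashes, so prefixing the string with r (a raw string) keeps Python from treating them as escape characters.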
Now again, you can customize this however you want. Go back to the page and
find what you want: do you want to check whether it has a men's option, or
different colors, or pull in this other information? It really does not
matter, as long as you get what you need for whatever purpose you're making
this for. This is more of an introductory video on how to scrape data from
Amazon. The next video will probably be a little more difficult and in-depth,
but this is to get you guys started. So we now have this, and this is
beautiful.
Now, something you want to do when you're scraping data over time, and that's
what we're doing here, almost like a price tracker over time, is append data
to this. We can't only create the file, and that's what this cell does: if I
run it 100 times, it will only ever give me this first row. We need to append
data now, so let's pull this down here. I'm going to add a note that says now
we are appending data to the CSV. I haven't added a ton of notes; I'll try to
go back afterwards and add some for people who like to read notes.

What we're going to do is change this 'w' to 'a+', and that is how we append
the data. We no longer need the header, so we aren't going to write the header
anymore. There we go. So now, instead of creating that header and that first
row of data again, we leave the existing data alone, go to the next free row,
and append, which means add data on to the end. I wasn't going to run this
right now, but why not. I'll run it and then read the file in, and there's our
data. I'll run it a few more times, three or four, read it in again, and there
we go. It's all the exact same data, which is super boring, but very good to
have.
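To illustrate the difference between creating and appending, here is a small sketch that writes the file once with "w" and then appends three more rows with "a+". The path, file name spelling, and row values are assumptions for the example:

```python
import csv
import os
import tempfile

header = ["Title", "Price", "Date"]
data = ["Example Data Analyst T-Shirt", "16.99", "2021-08-20"]  # hypothetical

csv_path = os.path.join(tempfile.mkdtemp(), "AmazonWebScraperDataset.csv")

# Step 1: create the file with the header and the first row.
with open(csv_path, "w", newline="", encoding="UTF8") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerow(data)

# Step 2: append more rows. "a+" adds to the end of the existing file,
# so the header is not written again.
for _ in range(3):
    with open(csv_path, "a+", newline="", encoding="UTF8") as f:
        writer = csv.writer(f)
        writer.writerow(data)

with open(csv_path, newline="", encoding="UTF8") as f:
    rows = list(csv.reader(f))

print(len(rows))
```

Plain "a" would also append; "a+" additionally allows reading from the same file handle, which this sketch does not use.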
Now, we don't want to have to come in here and run this every day. Let's say
we're going to do this daily: we want a way where it does it while we sleep,
in the background on our laptop, and is easy to do. I don't want to set an
alarm on my phone and come in here every single morning. I want to automate
this. So how are we going to do that? Give me one second. If you didn't know,
I have three kids, and one of them is waking up. I'll be right back.

All right, I think he is asleep. At least, let's hope he's asleep.
So now what we're going to do is put all of this into check_price. You may
never have used one of these before. Oh geez, what are these things called?
They're used all the time, you'll know what it is... it's a function. I was
having writer's block or whatever that is. We're going to put it all in here,
and then we'll be able to use check_price later, because we want to automate
this. So let's go all the way back up and copy all of that in. (Oh jeez, I
hate this.) All right, everything goes in just like that. This pulls in all of
our data, down to the title and the price. We want to make it look right, so
we put it right here, and now we have it formatted properly. We want to add
our datetime as well, just like that. I'm sure there's a better way to do
this. Then we need this right here, just like that, so now we have our header
and our data, and then we want to pull this in right here. Okay. So everything
we just wrote out we are now putting into check_price. You can call it
whatever you want, it doesn't matter. Let's run that and see if we get any
errors. We don't, so this is now good to go.
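A rough sketch of what bundling the steps into one function might look like. The scraping itself is replaced by a placeholder, fetch_title_and_price, which is hypothetical and returns made-up strings, and passing csv_path as a parameter is an assumption made so the example can write to a temporary file:

```python
import csv
import datetime
import os
import tempfile


def fetch_title_and_price():
    """Placeholder for the scraping step; returns made-up raw strings."""
    return "\n  Example Data Analyst T-Shirt  \n", "\n  $16.99  \n"


def check_price(csv_path):
    """Collect one row (title, price, date) and append it to the CSV."""
    raw_title, raw_price = fetch_title_and_price()

    title = raw_title.strip()
    price = raw_price.strip()[1:]      # drop the leading dollar sign
    today = datetime.date.today()

    data = [title, price, today]
    with open(csv_path, "a+", newline="", encoding="UTF8") as f:
        writer = csv.writer(f)
        writer.writerow(data)
    return data


csv_path = os.path.join(tempfile.mkdtemp(), "AmazonWebScraperDataset.csv")
row = check_price(csv_path)
print(row)
```

Everything inside the function must be indented consistently, which is the formatting step fiddled with above.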
What we're going to use this for is putting it on a timer. Have you ever
wanted to check something once a day, once every 10 seconds, once a minute,
whatever you want, without having to actually pull up your phone and look at
it? This is how we're going to do that. We imported a library called time up
here, and that's what we're going to use now. We say while True: and inside it
check_price(), which is what we just wrote out, and then time.sleep(). How
much time you put in here is completely up to you. For the purposes of
demonstration I'm going to put 5 seconds, which means every 5 seconds it runs
through this entire process. Let's run this really quick. I'm going to let it
run for, let's say, 30 seconds, and then I'm going to read the file in right
here. When we looked at it earlier we had four, well, five rows of data. In
just a second I'm going to stop this and we'll see how much data is in there.
Let's stop it right now; it's been going long enough. Let's read it in: now we
have rows five, six, seven, eight, so I guess it ran for about 20 seconds.
That was for demonstration purposes. I would never run anything every 5
seconds, unless it was like Black Friday on Amazon.
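The timer loop described above runs until you stop it by hand. As a sketch under that caveat, here is a bounded variant: run_every and its max_runs parameter are hypothetical additions so the example finishes on its own, and the task is a stand-in for check_price:

```python
import time


def run_every(task, interval_seconds, max_runs=None):
    """Call task(), sleep, and repeat; loops forever when max_runs is None."""
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        time.sleep(interval_seconds)  # time.sleep takes seconds
    return runs


# Stand-in task that records when it was called.
calls = []
completed = run_every(lambda: calls.append(time.time()),
                      interval_seconds=0.01, max_runs=3)

print(completed, len(calls))
```

With max_runs left as None this behaves like the plain while True loop, and the notebook cell keeps running until it is interrupted.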
We can make this as long or as short as you want. You can run it every second
if you want; that doesn't make sense to me, but you can. Let's do a little bit
of math. I don't know this off the top of my head, so I'm going to do the math
with you live. Pretty exciting stuff, got the calculator out. time.sleep()
goes by seconds, by the way. You could write the calculation out up here in
the code, but I'm just going to put in the number. There are 60 seconds in a
minute and 60 minutes in an hour, so that's one hour, and there are 24 hours
in a day, so that's 86,400, I believe. Did I read that right? Yes.
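The same arithmetic can be written in the code instead of typed in as a bare number. This is only a sketch, and the constant names are made up for readability:

```python
import datetime

SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24

# 60 seconds * 60 minutes * 24 hours
seconds_per_day = SECONDS_PER_MINUTE * MINUTES_PER_HOUR * HOURS_PER_DAY
print(seconds_per_day)

# The standard library gives the same figure as a float.
same_value = datetime.timedelta(days=1).total_seconds()
print(same_value)
```

Passing seconds_per_day to time.sleep makes the interval self-explanatory to anyone reading the loop later.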
right yes so this now if I ran this and I'm going to this is going to check the
I'm going to this is going to check the price every single day and this is the
And this is a big part of this project: we want to create our own data
set. Something that I personally really love is a data set that I can do
some type of time series with. Now, it's probably not super exciting for
this item, but you get the idea: if this price were to change, we would
then see that reflected in the data at some point. You can do this on any
item you could ever imagine on Amazon; it's the exact same process. Some
items change often. This t-shirt will most likely never change, so again,
this is for demonstration purposes. The code itself will be nice to put in
a project, although the data set that you get from this probably won't be
the best, I would imagine. But notice that this is running. I can minimize
this, and it can run on my computer basically as long as my computer is
working.
One thing I will say before I go on to some more stuff: when I created
something similar to this for myself, I put it in Visual Studio Code and
not in Jupyter Notebooks. That's a personal preference, and I would look
into it if that is something you want. I think Visual Studio Code is a
little bit easier for automating these types of tasks, but for
illustrative and demonstration purposes you cannot beat Jupyter Notebooks.
That's why I did it this way.
So with all that being said, that is basically the end of the project. I'm
not going to stop this and run it again, but you get the point. We now
have a data set that... oh geez, all this again, it's hounding me, let me
get out of here. This is embarrassing, guys. We now have a CSV file with
data in it. You can run this in the background of your computer. I have
done it; I've run it for weeks, I've run it for months. If you restart
your computer, just come back in here and restart the process. It's the
same for any automated process, unless you start using an online
automation service, which will run it regardless of your computer, either
in the cloud or on some server. So this is a really good option. Again, if
you restart your computer, or something happens and you lose connection,
just come in here and run through this script again, except for the one
that deletes all your data. Don't run that one again; only run that one
time. In fact, what I would do is comment it out, so that any time I come
back in here I never accidentally delete all my data.
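Commenting out the step that recreates the file works, but another option is to write the code so that rerunning it can never wipe the file. The following is a sketch under assumptions: the column names in `HEADER` and the helper names `append_row` and `read_rows` are hypothetical, not taken from the video.

```python
import csv
import datetime
import os

HEADER = ["Title", "Price", "Date"]  # hypothetical column names


def append_row(csv_path, title, price, today=None):
    """Append one observation; write the header only if the file is new."""
    today = today or datetime.date.today()
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    # Mode "a" only ever adds to the end, so existing rows are never erased.
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(HEADER)
        writer.writerow([title, price, today.isoformat()])


def read_rows(csv_path):
    """Return every row of the CSV (header included) as lists of strings."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))
```

Because the file is only ever opened in append mode, restarting the computer and rerunning everything adds new rows instead of starting over.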
But that is what this project does. Now, something really interesting that
I have done in the past, that I thought was really cool and really useful:
I actually did it for some watches that I was watching, especially on
Black Friday. I was interested in a price drop or a specific price change.
What I basically did was say, if the price is lower than, let's say, $14,
it would then send an email. I'm going to show you the script that I used;
it still works. If this is something that you are interested in, this
could be a completely different project. I just think it's interesting and
I wanted to show it to you, although I wouldn't say this is part of the
final project. Let me just come in here. This is super simple, well, not
super simple: we're sending a mail, we're connecting to a server, we're
using Gmail, we're logging into our account (that is my email; you will
not get my password), we're creating the subject and the body, we create
this message, and then we send the mail. So I have this def send_mail. I'm
blanking on what this is called; I'm going to call it a function, which is
what a def block is. If that price drops below a certain point, it'll send
me an email. I have used this, and I was able to buy a watch that was,
let's say, 140 bucks for like 90 bucks in a Black Friday sale. I was
really, really happy about that. So this can be used in that way as well.
It's not something you need to write into your project, just something I'm
going to include down here if you want to try it. I think it's super
interesting and really fun to mess around with. I enjoyed this.
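The price-alert idea described above can be sketched roughly as follows. This is an illustration, not the script shown in the video: the helper names `price_dropped` and `build_alert`, the message wording, and the use of port 465 with `SMTP_SSL` are assumptions. Gmail may also require an app password rather than your normal account password.

```python
import smtplib
from email.message import EmailMessage


def price_dropped(price, threshold):
    """True when the current price is strictly below the alert threshold."""
    return price < threshold


def build_alert(sender, recipient, product, price):
    """Create the email (subject and body) without sending anything."""
    msg = EmailMessage()
    msg["Subject"] = f"Price drop: {product} is now ${price:.2f}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"{product} dropped to ${price:.2f}. Time to take a look.")
    return msg


def send_mail(msg, password, host="smtp.gmail.com", port=465):
    """Log in to the mail server and send the prepared message."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(msg["From"], password)
        server.send_message(msg)
```

Inside the price-checking loop you would then call something like `if price_dropped(price, 14): send_mail(build_alert(...), password)`. Keeping the message building separate from the sending makes it easy to check the wording without touching the network.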
So with that being said, this is the project. The next one, I promise you,
is probably going to get a lot more difficult. If you thought this one was
easy, which I hope you do, then that means you're pretty good at Python. I
hope to do many of these web scraping projects; I might even do all the
ones that I put in that poll, but I started with the one that was the most
popular. If you were able to get through this, I think that is fantastic.
I think this is a solid project for creating a data set, so use this how
you will. You can copy my code exactly; I don't have a problem with that.
Again, I don't think this is beginner level. There are some things that
are a little bit more advanced, not even advanced, just intermediate-level
things that you kind of learn as you get into it. I hope that this was
instructional, I hope I explained it well, and I hope that this is useful.
Again, when you actually use this, you'll have 22, 23, 24, 25; you'll see
a price change, a price change, a price change. Go use a product that you
are interested in or that fluctuates often. There are plenty of those on
Amazon, I promise you; there are some that literally change almost every
other day, like down a dollar, up a dollar, and then Black Friday just
goes crazy with these price changes. So use this as you will. I'm doing
this because I think it's really interesting and really useful. This, to
me, was a really good introduction to web scraping, because in the next
one it gets quite a bit more difficult. On a scale of difficulty this is
maybe a four, and it'll probably jump up to a seven on the next one; it's
just much more technical, more coding heavy. So look forward to that, if
that's something you look forward to. With that being said, I'm going to
go back over here for my send-off. I hope this was helpful and I hope that
you learned something. Don't get mad at me if it was too easy, and don't
get mad at me if it was too hard; I'm doing my best over here, so I
appreciate your patience. Thank you so much for watching, I really
appreciate it. If you like this video, be sure to like and subscribe
below, and I will see you in the next video.
[Music]
What's going on everybody, welcome back to another video. Today we're
going to be creating a script to automatically take data from a crypto
API.
[Music]
Now, this project stems from an earlier video where I walked through what
an API is and how you can use it. In that video I showed you how to use
CoinMarketCap's API so you could start pulling in their crypto data, and
in this video we're going to take it one step further and automate that
process. We're going to do a little bit of transformation with the data,
I'm going to show you some cool stuff about how you can use it, and maybe
we'll do a little bit of visualization at the end, but that is not the
main point of this video. It's mostly about the automation piece and a
little bit of the data cleaning piece as well. Fair warning: this is not a
beginner-level project. It's probably more like an intermediate project,
and it's not even a complete project per se, because we're not doing all
the data cleaning and we're not doing all the visualizations. But if you
follow along, we're going to cover a lot of different things, and you're
really going to set yourself up to be able to do just about anything you
want with this data or with different APIs that you pull from. So with
that being said, let's jump onto my screen and get started with the
project.
All right, so this is where we stopped in our last video. If you haven't
watched it, now is the time to go back and do that; I'll have a link in
the description. Also, all the code that we're going to be looking at and
working through today is going to be in a GitHub repo below, so you can
get all the code completely finished and just follow along, or you can
code it from scratch along with me. I do recommend writing it from scratch
if you can, because I think you'll learn more; you'll make mistakes and
you'll learn from them as we go. But it is up to you. So let's get
started. As you can see, we have the script right here, and I'm starting
basically from scratch. I have a completed one up here; I'm actually going
to get rid of those. We're going to start from exactly where we left off
in our last one. I'm going to run the script: this is going to pull from
our API, and we're going to look at the dictionary, set our option, and do
our json_normalize. This is literally where we left off in the last video,
so we have all of this data.
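As a reminder of what "set our option and do our json_normalize" does, here is a small sketch. The `response` dictionary is made-up sample data shaped like a nested API result, not real CoinMarketCap output, and no network call is made.

```python
import pandas as pd

# Show every column when a wide data frame is printed.
pd.set_option("display.max_columns", None)

# Made-up sample shaped like a nested API response (hypothetical values).
response = {
    "data": [
        {"name": "Bitcoin", "symbol": "BTC",
         "quote": {"USD": {"price": 100.0, "volume_24h": 5.0}}},
        {"name": "Ethereum", "symbol": "ETH",
         "quote": {"USD": {"price": 10.0, "volume_24h": 2.0}}},
    ]
}

# json_normalize flattens nested dictionaries into dotted column names,
# for example quote.USD.price.
df = pd.json_normalize(response["data"])
print(df)
```

Each entry in the list becomes one row, and each nested key becomes its own column.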
What we want to do with it is automate that process, because we don't want
to have to come in here, run this, and put it into a CSV manually or
something like that. We want to automate this data collection process so
that the data is just ready for us to use. So we're going to be using this
script, but we might want to add a little bit more to it first. The first
thing I want to do, before anything else, is something I like to do when
I'm creating these automation scripts: I like to add a timestamp. The
reason is that I want to know when each of those loops, you could say,
does its automated run. So if I do it every day, I want to know what time
of day I ran it, making sure each run ran successfully.
So all I'm going to do is add a new column at the end and just call it
timestamp. Let's go right up here, and we're going to say pd dot, and
there's something called to_datetime, so we're going to do
pd.to_datetime and then pass it 'now'. What this is literally going to do
is take the timestamp of right now, when it's running, and show that. Now,
of course, we need to add a new column for that. Let me see real quick: we
just have the data, so we need to create the data frame right here. We say
df equals this json_normalize, and then we say df, open a bracket, and say
timestamp. Are all of these lowercase? We're going to keep with the
lowercase, so 'timestamp', close the bracket, and say equals. So first off
this assigns df as our data frame, and then it adds the timestamp as a new
column. So let's run this really quickly.
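Put together, the timestamp step looks roughly like this. It is a sketch with made-up sample data standing in for the API response. The video calls `pd.to_datetime('now')`; `pd.Timestamp.now()` is swapped in here as a closely related call that returns the current local time.

```python
import pandas as pd

# Made-up sample standing in for the JSON the API returns.
data = {
    "data": [
        {"name": "Bitcoin", "last_updated": "2023-01-01T00:00:00.000Z"},
        {"name": "Ethereum", "last_updated": "2023-01-01T00:00:00.000Z"},
    ]
}

df = pd.json_normalize(data["data"])

# Assigning a single value to a new column repeats it on every row,
# so each row of this run carries the same run time.
df["timestamp"] = pd.Timestamp.now()
print(df)
```

The new column lands at the far right of the data frame, next to whatever update time the API itself reports.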
Let's go all the way to the right, and this is our timestamp. This is the
day and the time that I'm running it, so this is working properly. Now, if
you look really quickly, there is a last_updated in here, and it is very
close to this timestamp, but it is not the same thing. If you dig into
this data a little bit, last_updated is coming from CoinMarketCap's API,
and it is when the actual cryptocurrency was updated in their system. So
it is going to be really close, but it's not going to be exact. I don't
like to rely on built-in ones that come from an API; I want to make one
myself that's generated on the system where I'm running the automated
process. That's just something I do. So now we have this original data
frame created, and we have what we need, but what we want is to keep
adding data to it. We don't want it to just create these 5,000 rows once;
we want it to create 5,000, then 5,000, then 5,000 over time, whether
that's every day, every hour, or every week, however often you want to run
it. What I'm actually going to do is limit this a lot. I just want to look
at the top, let's say, 15, so we're going to do that and run through all
this again. Now I just have the top 15. It's going to be easier to see,
and it won't take as much time to run our scripts. Again, you can keep as
many as you'd like: if you want 100, 200, or all 5,000, you do whatever
you'd like.
But what we are now going to do is create a function using this original
script. Again, we have this data frame, and we are going to create a
script to automate this, one that is going to append data to this data
frame right here. That's the big thing we're trying to accomplish in this
project. So let's go up here; we'll take from here all the way to here,
copy it, and paste it down here. Now we need to create a function, so
we're going to say def, and we're going to call this api_runner, because
this is going to run our API whenever we need it to run. When you are
writing the body of a function it needs to be indented properly, so we
need to go over here and hit Tab, and we're going to do this all the way
down. I'm just going to skip forward to when it's all done. All right, so
now we have this URL. Because we're going to run this function as part of
the automated process, we also want to add this right here, so we need to
take this and add it; we'll just put it down here.
[Music]
Okay, let's do that.
So what we have so far is really close to what we want our function to be.
When we call this function, it's going to call the API, we're going to use
our key, we're going to load the response and format it right here, then
we're going to add this timestamp. Right now it's just going to print this
data frame, basically, but that's not what we want. What we want is to
actually append this data. So when it gets to this data right here, since
we already have the original data frame set up up top, we now want to say
that this one is going to be df2, and then append it to the original data
frame: we're going to say df.append and pass it df2. All this says is that
for the new data that's going to be coming in every time (let's say it's a
loop and it's just looping through, pulling the data, pulling the data,
pulling the data), we're going to create this data frame, add this
timestamp like we want, and then append that to the original data frame.
As of right now this looks good. We'll run it in a second; I'll create it.
So I just created it.
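One caution when following along today: `DataFrame.append` was deprecated and then removed in pandas 2.0, so on a current install the append step needs `pd.concat` instead. The sketch below shows the same idea with that swap. The function name `append_snapshot` and its arguments are hypothetical, and it takes the already-loaded JSON as input rather than calling the API itself.

```python
import pandas as pd


def append_snapshot(history, data):
    """Turn one API response into rows and add them to the running history.

    history: the data frame collected so far, or None on the first run.
    data:    the parsed JSON, with the rows under the "data" key.
    """
    snapshot = pd.json_normalize(data["data"])
    snapshot["timestamp"] = pd.Timestamp.now()
    if history is None:
        return snapshot
    # pd.concat stacks the new rows under the old ones; ignore_index
    # renumbers the rows 0..n-1 instead of repeating 0, 1, 2...
    return pd.concat([history, snapshot], ignore_index=True)
```

Usage is `df = append_snapshot(df, data)` on each run. Returning the combined frame, rather than changing a global variable inside the function, is a design choice for this sketch and not how the video's function is written.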
So now we need to actually create our script to automatically run this.
We're going to start with import os. Let me tell you, there are a thousand
different ways to do this, and there are better ways to do it, but they
are much more complicated and some cost money. I'm going to show you
different options for automating your Python scripts in future videos, but
this one is one I've used many, many times for different projects, and it
works. So I'm not going to show you the most complicated thing in the
world; I'm going to show you something that I've just used a lot. We're
going to say from time import time and from time import sleep; that one's
important. And now we're going to create our loop. What time, sleep, and
os (your operating system) are going to do is give us the ability to track
the time, so we can run through and call this function at the intervals
that we want.
So let's create our for loop. We're going to say for i in... now, you can
create this specific part in different ways, but what I'm going to do is say
range of, let's say, 333. And I say 333 because, if you remember from the
first video on the API, you only have 333 runs per day, and so if I ran this
333 times today, that would be our max. So that's why I'm using that 333, just
for reference. So now we're going to do api_runner. So in this loop we're
going to call this function up here, and then I want to have an output to show
that this is running through successfully. You can write anything here; we're
just going to say "API Runner completed successfully"... how do you spell
that? Successfully. That doesn't look right. I'm just going to say
"completed". All right, forget that, I don't remember how to spell
successfully. If it's spelled right, you guys spell it that way, but I can't
remember. Now we're going to use this sleep right here. This counts in
seconds; you can change it to minutes, hours, whatever. We're going to have it
run every minute, which is every 60 seconds, so I'm just going to say it's
going to sleep for one minute. And then we're going to say exit. So all this
is going to do, and this is again fairly simple, it's just a simple for loop:
it's going to call this API, it's going to tell us that it ran successfully,
and then it's going to wait for 60 seconds and it's going to run again. That's
it.
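The loop described above can be sketched like this. It is only a sketch, with assumptions: the body of api_runner is a placeholder for the real API pull, and run_on_schedule with its parameters (runs, delay_seconds, sleep_fn) is a hypothetical helper that is not in the video; it is there so the interval and run count are easy to change.

```python
from time import sleep


def api_runner():
    """Placeholder for the function that pulls the API data (hypothetical body)."""
    return "pulled"


def run_on_schedule(task, runs=333, delay_seconds=60, sleep_fn=sleep):
    """Call task() `runs` times, pausing delay_seconds after each call."""
    completed = 0
    for _ in range(runs):
        task()
        completed += 1
        print("API Runner completed")
        sleep_fn(delay_seconds)  # sleep() counts in seconds
    return completed


if __name__ == "__main__":
    # Small demo values so the sketch finishes in a couple of seconds;
    # the video uses 333 runs with a 60-second sleep.
    run_on_schedule(api_runner, runs=2, delay_seconds=1)
```

The defaults of 333 and 60 mirror the numbers used in the video: at most 333 calls, one per minute.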
So let's run this and see what happens, see if what we did works. So it ran
the first time. Now, I'm not going to bore you. Because I'm doing this live,
exactly what we're about to get is what we're going to use. I didn't run it
overnight or for a week so that we have a bunch of data; what you're going to
work with, I'm going to work with as well. So I'm going to wait a few minutes
and let this run, and I want you to do the same thing. I'm going to let this
run for maybe five minutes or so, and we'll work with what we have and keep
going with the project. Because again, the point of this project is not to
create the final product or all the visualizations; that will most likely be
in another video where we're taking all this data and doing all these things
with it. The point of this video is to automate it and clean it up to where we
can really use it, and then I'm going to let you guys loose and you can do
whatever you want with it. And I think it's really setting you up for a lot of
successful projects in the future that you can do all by yourself without me
having to walk you through it. So as you can see, it's already run through
twice. I'm going to pause for a second and let that run through just a few
more times, and then we will continue with the project.
All right, we are back, and of course it's only run, what, five times. It has
not reached the limit of 333, so we are perfectly fine. What I'm going to do
is just stop this by clicking this square up here, and it's going to give us
some error, and then we're going to check it and see what we have. I don't
know why it's taking so long, if I'm being honest. All right, so I interrupted
it. Let's run this and see what we got. I hope we have more than 15, because
if not I'm going to be very upset. Okay, so... well, I made a mistake. I was
supposed to put df right here, and I had df2. So change your script; do not do
what I just did. It's supposed to be df.append, and we're supposed to be
appending this df2 to the original data frame. So I messed up on that one.
Let's rerun that. Let's see: "local variable df referenced before
assignment". Okay, this is perfect, because this happened to me before. We're
running into all sorts of good stuff. I like to keep this stuff in my videos.
I laugh because I hate running into mistakes, but everybody says they're happy
that I do this, so I'm going to keep doing it. I'm not going to cut this out,
I promise. But what we actually need to do is go back up to this function,
because what happened is we called this df, and because it's in a function,
it's what they would call a local variable. What we need to do is state that
this is a global; it's just called a global, that's all it is. And so what
we're going to do is tab, and we're going to say global df. What this should
do is declare it as a global variable, and it should let this run properly.
Let's hope it does. All right, it's running.
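The error and the fix above can be reproduced without any API at all. Here is a minimal sketch, assuming a plain list named rows as a hypothetical stand-in for the data frame; the function names are also made up for illustration. Assigning to a name inside a function makes that name local, so reading it on the same line fails unless the function declares it global.

```python
rows = ["original pull"]  # module-level (global) name, standing in for df


def broken_runner():
    # Assigning to `rows` here makes it a local variable, so reading it on
    # the right-hand side fails before the assignment ever happens.
    rows = rows + ["new pull"]


def fixed_runner():
    global rows  # use the module-level `rows` instead of a new local one
    rows = rows + ["new pull"]


try:
    broken_runner()
except UnboundLocalError as error:
    # The exact wording of this message differs between Python versions.
    print("broken_runner failed:", error)

fixed_runner()
print(rows)
```

Declaring a global is the quick fix used in the video. Returning the new data from the function and reassigning it outside would be another way to get the same result.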
Again, I run into mistakes. Let me tell you something while we're here for
just a second: on this project I ran into probably a hundred mistakes or
errors, issues that I had to research for hours and hours. I'm legitimately on
Stack Overflow and just Googling and figuring these things out. There were a
lot of new things that I had never run into before, just on this project. And
so everything that you're seeing is from after I fixed all of those things and
had to really work through them. It was frustrating at times; I just couldn't
figure it out. And so what you're looking at is kind of the polished version
of that, now that I have everything laid out, because I can't spend 10 hours
on a project; nobody would watch it. So just know that if you are running into
some of these mistakes, or you run into mistakes later on when you're
expanding this project, that's completely normal. So what we're going to do is
let this run for a little bit, and then after maybe three or four minutes
we'll come back and we'll keep going with the project.
All right, so let's run this and check and see if we have the data that we're
looking for. And it looks like we do. Let's actually go back up here really
quick. We want to set this display max rows option, because I want to be able
to see all the rows and not just a few of them. That gives us this scrolling
instead of that dot dot dot that shows us just a few.
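For reference, the display option mentioned above can be set like this. This is a small sketch: the 100-row demo frame is hypothetical, and it is there only to have something long enough to be truncated by default.

```python
import pandas as pd

# Show every row instead of truncating long output with "...".
pd.set_option("display.max_rows", None)

# Hypothetical 100-row frame, just to have something long to display.
demo = pd.DataFrame({"row": range(100)})
print(demo)
```

Passing a number instead of None caps the display at that many rows.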
So there's our original 15, and then we have the next loop, and then we have
the next loop. Let me scroll over to the timestamps and I'll show you what I
mean. So this was run on 5/26... let's go down... 5/26 at 15:01:29.05, and
then the next one, you can see, was run at 36:31. These are all one minute
after each other. My original one was from earlier... 32, 33, yeah, so you can
see 32:31, 30:30, or 30:29, and this one was about 15 minutes ago, when I
first ran the original data frame, right?
All right guys, this is Alex from the future. I've actually completed this
entire project in the video, and you're about to see all that after this, but
I wanted to show you one more thing that you can do in this function up here
that I didn't show you originally, and that's how to actually put it into a
CSV. All we've done in this one is keep it all enclosed in a data frame, and
that's it. That may be great, but a lot of you guys are going to want to
automate this and put it into a CSV, and I want to show you how to do that.
All right, so what I'm going to show you really quickly is right here in this
folder: I have all these different API 3s and 4s. These were tests that I did
before. But what you can do is, instead of just putting it into a data frame,
you can actually append the data to a CSV and have that CSV sitting out there
for you, instead of just keeping it all in the data frame. There are a lot of
different uses for that. You may want to have that file separately from here
just in case something times out or something breaks, or your computer shuts
off or something like that. That is a legitimate concern.
So what we're going to do is say if not (this is basically an if statement)
os.path.isfile. What this is going to do is check if there's already a file
under this name. And we're going to put an r in front of the path. If you've
never done CSV stuff before, it's really important that you put that, or
you're going to get an error every time. So we're going to take this right
here, copy that, and put it right here, and then we're also going to do a
slash, and then we're going to name it. Let's name this API, because I don't
think I have that one in there; I think I deleted it. Yeah, so I don't have
API, so I'm just going to keep it API.csv. Then I'm going to close that
parenthesis and add a colon right here. And we're going to say, if that does
not exist, we are going to write this to it and create it. So we're going to
say df, that's this data frame right here, df.to_csv, and we're going to do
that r, and then we're going to copy this, so let's just replace it like that.
And then we're going to say comma, header (oops) is equal to column_names. So
what this is going to do if we run through this (and I'll talk about this in a
little bit, we'll have to change this up a little bit) is check to see if this
file right here exists. If it does not, it is going to create it and create
the column headers based off this data frame. That is what that does.
Now what we want to do is say else, and this next part that we're going to
write is saying: if there's already the API file there, we want to append the
data. We don't want to overwrite it or anything like that; we want to append
the data. So we're basically going to copy this, maybe not the whole thing,
but I already did it, so we're going to copy that, and we're going to say mode
equals "a", and "a" stands for append. And then we're going to say header
(oops, I keep messing up header) equals False, which means when it appends the
data, it's not going to write the column headers every time. You don't want
that, because if you added the headers every time you appended, every 15 rows
you'd have another set of headers that you'd have to go out into that CSV and
filter out and get rid of. So we're going to say header equals False.
Now, just a second ago I said you would need to mess with this just a little
bit, and you would, because every time you'd be putting in this data frame,
which it's already appending to, so you'd be creating a lot of duplicates if
you kept it exactly as is. What you need to do is basically take it back to
its bones, so you need to kind of keep it like this. So what you need to do
now is just run this, and it would work perfectly.
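Put together, the create-or-append logic described above looks roughly like this. This is a sketch under stated assumptions: the helper name append_to_csv, the temporary folder, and the sample batch are hypothetical, and the helper is given only the newest pull (not the accumulated data frame) so that rows are not duplicated in the file.

```python
import os
import tempfile

import pandas as pd


def append_to_csv(batch, csv_path):
    """Write batch to csv_path, creating the file with headers on first use."""
    if not os.path.isfile(csv_path):
        # First run: create the file and write the column headers once.
        batch.to_csv(csv_path, header=True)
    else:
        # Later runs: append the rows only, without repeating the headers.
        batch.to_csv(csv_path, mode="a", header=False)


# Hypothetical location. With a hand-typed Windows path, prefix the string
# with r (a raw string) so the backslashes are not read as escape sequences.
csv_path = os.path.join(tempfile.mkdtemp(), "API.csv")

# Hypothetical stand-in for one pull from the API.
batch = pd.DataFrame({"name": ["Bitcoin", "Ethereum"], "price": [29000.0, 1800.0]})

append_to_csv(batch, csv_path)  # first call creates the file
append_to_csv(batch, csv_path)  # second call appends two more rows
```

After these two calls the file holds one header line and four data rows.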
Let's test it really quick to see if it works, because I'm promising you
something and I want to make sure it actually works. Let's run it this time.
Okay, so it just ran for the first time, so it should have created this file.
Let's go see if that works properly. So now it just created that file, and now
we're going to see if it actually appends the data. So let's wait just one
time, and then I'm going to stop it and see if it works. Again, I'm just
verifying that what I'm telling you is actually working, because if it doesn't
I would feel terrible. We don't want that.
And while that's running, I'm actually going to add this, because now I want
to show you how to call it. Super easy: we're just going to do pd.read_csv,
we're going to call this just like that, and then we're going to say df and
we're just going to do 72, something random, because I've already done this
whole project and I don't want to mess anything up. So we're going to say
df72. So now let's stop this, and once that stops we're going to run this and
make sure that this actually pulled the data in. All right, so we interrupted
it. The file is ready to be read in, so let's read it in. There's our file.
Let's see, what did I mess up, or did I mess anything up? Ah, I didn't mess
anything up. This is the index for this file, and we already had this in here;
we'd probably be able to get rid of it. But if you see, we have 0, 1, 2, 3, 4,
5, 6, 7, 8, 9... 14, then we have 0, 1, 2, 3, and if we look at the timestamp
it should be one minute apart. So it's 11:19:45, and then it says 11:20:45. So
this worked exactly as planned.
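The extra index column mentioned above comes from to_csv writing the data frame index by default. Here is a small sketch of one way to handle it when reading the file back; the CSV text is hypothetical sample data shaped like the appended file.

```python
import io

import pandas as pd

# Hypothetical CSV text shaped like the appended file: the first, unnamed
# column is the data frame index that to_csv wrote by default.
csv_text = (
    ",name,price\n"
    "0,Bitcoin,29000.0\n"
    "1,Ethereum,1800.0\n"
    "0,Bitcoin,29010.0\n"
    "1,Ethereum,1795.0\n"
)

# Read as-is, the saved index shows up as an extra "Unnamed: 0" column.
with_extra = pd.read_csv(io.StringIO(csv_text))

# index_col=0 tells pandas to use that first column as the index instead.
without_extra = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(with_extra.columns.tolist())
print(without_extra.columns.tolist())
```

Another option is to pass index=False to to_csv when writing, so the index never reaches the file in the first place.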
Again, you have two different options. You can just keep it how it was before,
and I'll leave both of those options in the script so that you can choose
which one you want. But that's how you do that. So right here you're appending
it to a CSV file, and if you just keep this and get rid of all this, you're
just appending it to a data frame. Now please continue with the rest of the
video that I already have done. Again, I'm future Alex, so please continue
with the rest of the video.
Okay, so we have all this data. We have so many columns. Now, if you want to
completely just go and do your own thing, you absolutely can do that. I'm
going to mess around with a few things and show you something that I did that
I thought was really interesting, in order to visualize this data a little bit
and transform it a little bit to make it more usable. But we're not doing a
full data cleaning; that's not what this project is. That would be a very
large undertaking, because honestly this needs a lot of work. One thing that I
do want to clean up really quick is this right here. The math will be fine;
it's just that the way it's shown on here is in scientific notation, and I
don't like it. So what I'm going to do really quickly is just get rid of that.
So we're going to say pd.set_option, open parenthesis, and I'm going to say
"display.float_format" (this is just how this is formatting), then comma, and
now we're going to use this lambda: lambda x, colon, and we're going to say
"%.5f", and then percent x. Now, if you don't know what lambdas are, I highly
recommend looking those up. Again, this is not a beginner tutorial. Whoops:
no such key "display.floor_format". That makes sense; this is float. Yeah
guys, this is not a beginner's level. You can't use the floor format; this is
the float format. All right, so now let's take a look at this data frame that
we have. We're just going to type df, hit enter, and now our numbers are a
little bit more easily readable. I prefer it this way. You do not have to do
this; I'm doing this just because this is what I prefer.
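Written out, the display setting dictated above looks like this. This is a minimal sketch: the prices frame and its values are hypothetical and were chosen so that pandas would otherwise fall back to scientific notation.

```python
import pandas as pd

# Hypothetical values small and large enough that pandas would normally
# display the column in scientific notation.
prices = pd.DataFrame({"price": [0.00001234, 29000.5, 1234567890.123]})

# Display floats with five decimal places instead of scientific notation.
pd.set_option("display.float_format", lambda x: "%.5f" % x)
print(prices)

# Only the display changes; the stored values are untouched.
print(prices["price"][1] == 29000.5)
```

Calling pd.reset_option("display.float_format") restores the default formatting.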
So let's jump right into it. Something that I really thought was interesting when I saw this data is this percent change: percent change over 1 hour, 24 hours, 7 days, 30 days, 60 days, 90 days. If you're not in crypto or you don't do investing or anything like that, what this is going to show us is, I mean, it's pretty obvious, how much the price of this coin has changed over the last hour, 24 hours, seven days. So as you can see, it's barely fluctuated over the past 24 hours, a little bit over the past seven days, and a lot over the last 30 days, 60 days, and 90 days: 20, minus 26%, minus 33%. We're in May, and we just had kind of a crash in crypto a couple of weeks ago, so this tracks, right? But I want to visualize this and see how it's going to look, and whether I can gain any insight from that information just by having it all displayed for me. In its current state, we really cannot do that.
Now another issue, well, not an issue, but another thing that we have to take into consideration is that we have Bitcoin right here and we have Bitcoin right here again, from different pulls. We just did them a minute after each other, but for your project you may do a run each day, a run every hour, or something like that, right? If you did that, your data could be very different, and so you may just want to take this first one. But what I'm going to do for the sake of this project is group them.
So let's go down here and we're going to say df.groupby. If you've ever done something like SQL, this is how you group by in pandas. Basically we're going to group by the name, so Bitcoin, Ethereum, and so on. So we're going to do that on name, and I'm going to say sort is equal to False. Oops. I'm not going to sort it; you could say True there, but we're not going to, and I guess you'll see why later. We're going to do an open bracket, and now we need to choose what columns we're going to have. So I'm going to do another open bracket, and I'm just going to copy and paste these. I'm going to start right here at the quote percent change for one hour, so boom, and then go over one and take the 24 hours, paste that, comma. We have the 7-day, and we're going to do it like that, and I'm just going to do a comma. For the 30-day I'm going to paste the same one but manually change it to 30-day. Get rid of that at the end, I don't know what that is. Then we're going to do 60 days, comma, and our last one, which is 90 days. Let's see what that gives us.
It doesn't give us anything. Okay, I know what's wrong here. When we're grouping by something, we need to have an average, a mean, a mode, or something like that, right? So all we have to do is go to the end right here and do an average. Let's say this is for Bitcoin: we're going to take this number in the one-hour column for every row where it's Bitcoin, it's going to group them all together, and then it's going to average them. So for the past five minutes where it's been running, we're going to take the average, or the mean, of that. Let's run this again. Oops, I meant down here. Let's run this now.
Now what we have is all of these cryptos, these are all 15 that we have, and this is the average for 1 hour, 24 hours, 7 days, 30 days, 60 days, and 90 days. So we have all of our cryptocurrencies over here, our percent changes up top, and then our averages here as well.
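The grouping step can be sketched like this. The column names assume the flattened CoinMarketCap fields (quote.USD.percent_change_1h and so on), only two of the six columns are shown, and the numbers are made up.

```python
import pandas as pd

# Hypothetical pulls: two API runs a minute apart (values are made up).
df = pd.DataFrame({
    'name': ['Bitcoin', 'Avalanche', 'Bitcoin', 'Avalanche'],
    'quote.USD.percent_change_1h': [0.10, -0.20, 0.30, -0.40],
    'quote.USD.percent_change_24h': [1.00, 2.00, 3.00, 4.00],
})

change_cols = [
    'quote.USD.percent_change_1h',
    'quote.USD.percent_change_24h',
]

# sort=False keeps coins in first-appearance order instead of sorting
# them alphabetically; .mean() is the aggregation that was missing.
df3 = df.groupby('name', sort=False)[change_cols].mean()
print(df3)
```

Without an aggregation such as mean at the end, groupby only returns a grouped object rather than a table of results, which is why the first run showed nothing useful.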
Now, if you try to visualize this as is, it doesn't really work, because these percent changes are up here as columns, and we don't really want them as columns; that just doesn't work for actually creating the visualizations. We really need these to be rows. My initial thought when I was doing this was, of course, I need to pivot, if you've ever used pivot in Excel or Power BI or something like that. That was my first thought, and I tried everything and could not get it to work. I almost gave up until I ran across something called stacking. I think I have used it before, but to be completely frank, I couldn't remember how to do this. So once I saw what it was, I just did stack. Let's make that df4. You don't have to do this; you can keep it all in the original data frame. I like it for visual purposes, so you can see the progression that we're making. I like to create a new data frame, and I can always go back and look at data frame three as we go. But you don't have to do that, that's just what I'm doing.
So now let's take a look at this. Up here we had Bitcoin, and we had all these columns, and we had these numbers as rows, but now we have all of these as rows as well. This is much, much more usable, and if you've ever done something like pivot or stacking before, you'll know that you kind of have to do it if you really want to visualize this well.
But because we just stacked it, it kind of changed it. Let's do type of data frame three: before we stacked it, this was a data frame. Now let's go and look at data frame four: this is a series. This is no longer a data frame, and that's really important to remember, because we can no longer treat it as a data frame. So we want to get it back to a data frame; we don't want it to stay like that, because you can't really use it as a series.
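A small sketch of the stacking step and the type change it causes. The df3 here is a made-up stand-in for the grouped averages, with only two columns.

```python
import pandas as pd

# Hypothetical averaged output of the groupby step (values are made up).
df3 = pd.DataFrame(
    {
        'quote.USD.percent_change_1h': [0.2, -0.3],
        'quote.USD.percent_change_24h': [2.0, 3.0],
    },
    index=pd.Index(['Bitcoin', 'Ethereum'], name='name'),
)

# stack() moves the column labels into an inner index level, giving one
# row per (coin, column) pair.
df4 = df3.stack()

print(type(df3))  # a DataFrame before stacking
print(type(df4))  # a Series after stacking
print(df4)
```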
So what we're going to do, and let me just create a few more cells so you can see this better, is say df4 dot something called to_frame, so we're going to make this into a frame. Now we're going to specify the name, and I don't mean the name column like we have right here; I actually mean the name of these values right here. This is part of the stacking process. So let's go right here and call it, let's just say, values, and let's make this data frame five. Let's see the output for data frame five. So there's that values column, and this already looks a lot better, right? Let's look at type of data frame five: so now it's a data frame.
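As a rough sketch of that conversion, with a made-up stacked series standing in for df4:

```python
import pandas as pd

# Hypothetical stacked Series (values are made up); the index has the coin
# name as the outer level and the original column label as the inner level.
index = pd.MultiIndex.from_product(
    [['Bitcoin', 'Ethereum'],
     ['quote.USD.percent_change_1h', 'quote.USD.percent_change_24h']],
    names=['name', None],
)
df4 = pd.Series([0.2, 2.0, -0.3, 3.0], index=index)

# name= labels the single data column; it is unrelated to the 'name' level.
df5 = df4.to_frame(name='values')

print(type(df5))
print(df5)
```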
But the issue is that this name is kind of acting like an index, which we don't want, because we want to be able to use it. So it doesn't really have an index we can use at the moment, and we need to give it one. Typically when you set an index you'll do something like df5.set_index and then pass something like name. So let's do data frame six is equal to that, and we'll see what happens here; it's going to give us an error. Oops, what I meant is we're going to do df5, bracket, name, and that's a column, right? We're going to do that, and it's basically going to say that's not going to work.
What I want to do, and what we're going to do in this video, is create numbers. I really just want it to be numbered one, two, three, four, five. But we don't have that right now, and I can't just will it into existence, so we're going to create an index basically out of thin air. We're going to do pd.Index, and we basically want however many rows are in here; we want it to count how many are in here. Now, you can make this dynamic, and it probably wouldn't be that hard, but I'm going to take the super lazy route. Let's do df5.count: there are 90 values in here, so I'm going to do a range of 90. I would definitely make this dynamic normally, but again, I'm just being a little bit lazy. We'll say index is equal to that, and I'm going to put this index right here, so now it's going to literally index this for us.
Now, I've run into this issue many times. What I actually need to do is reset this index and do it properly the first time. So let's get rid of this and reset the index, and it actually fixed itself. What was happening is we were indexing something that was already indexed, and that was causing issues, in a nutshell. So we reset the index, and now this is what it looks like, and this is exactly what we want. This is really how we wanted it formatted for our visualizations: we have multiple rows for Bitcoin, and each of these columns is now a row with the value attached to it. Exactly what we wanted.
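Here is a sketch of why selecting name fails at this point and what resetting the index does, using a made-up df5 with four rows instead of 90. If you did want to build the index by hand, something like pd.Index(range(len(df5))) would avoid hard-coding the 90.

```python
import pandas as pd

# Hypothetical df5 from the to_frame step (values are made up).
index = pd.MultiIndex.from_product(
    [['Bitcoin', 'Ethereum'],
     ['quote.USD.percent_change_1h', 'quote.USD.percent_change_24h']],
    names=['name', None],
)
df5 = pd.DataFrame({'values': [0.2, 2.0, -0.3, 3.0]}, index=index)

# 'name' is an index level here, not a column, so df5['name'] raises KeyError.
try:
    df5['name']
    name_is_column = True
except KeyError:
    name_is_column = False

# reset_index() turns both index levels into ordinary columns and adds a
# fresh 0..n-1 integer index, so no hard-coded range is needed.
# The unnamed inner level becomes a column called 'level_1'.
df6 = df5.reset_index()
print(df6)
```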
Really quick: for whatever reason it names that column level_1. I don't know why, but we're just going to rename that column really quickly. We're going to do df6.rename, open parenthesis, columns equal to, then one of these bad boys, this type of bracket, the curly one, and we're going to say level_1, then a colon, and then what we want to change it to, and I'm just going to call this percent_change. Let's call this data frame seven. Again, you don't have to do that, I'm just doing it. So now this looks much, much better.
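A minimal sketch of the rename, with a made-up two-row df6. The level_1 name appears because the inner index level had no name when the index was reset.

```python
import pandas as pd

# Hypothetical df6 from the reset_index step (values are made up).
df6 = pd.DataFrame({
    'name': ['Bitcoin', 'Bitcoin'],
    'level_1': ['quote.USD.percent_change_1h', 'quote.USD.percent_change_24h'],
    'values': [0.2, 2.0],
})

# rename() returns a new DataFrame; df6 itself is left unchanged.
df7 = df6.rename(columns={'level_1': 'percent_change'})
print(df7)
```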
Now let's try to visualize this, because we haven't done any visualizations yet; we've just been messing with the data a little bit. I kind of want to see how we can use this. It's something that I personally am interested in, so I wanted to visualize how these changed over these time periods. But we need to import some stuff in order to visualize this, so we're going to import seaborn as sns, and we're going to import matplotlib as well. I don't know if we'll use it right now or at all, but we're going to add it in here either way.
So now those are added, and what we're going to do is come right here and do sns.catplot. We're going to say the x-axis is equal to percent_change. Then we have the y-axis: we want the y-axis to be these values right here, so comma, y is equal to values. Then comma, and we want to basically create a legend, I guess you could call it, so we're going to say hue is equal to name. I'll show you what it looks like without it, and then you can see that we need that. We're going to say the data is equal to data frame seven, and then we set the kind. Now let's run this and see what we get. Super quickly, with just limited inputs, here's what we have.
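The plotting call can be sketched as below. The sample rows are made up, the helper name plot_percent_changes is hypothetical, and kind='point' is an assumption, since the value passed for kind is not captured in the transcript.

```python
import pandas as pd

# Hypothetical long-format data in the shape of df7 (values are made up).
df7 = pd.DataFrame({
    'name': ['Bitcoin', 'Bitcoin', 'Ethereum', 'Ethereum'],
    'percent_change': [
        'quote.USD.percent_change_1h', 'quote.USD.percent_change_24h',
        'quote.USD.percent_change_1h', 'quote.USD.percent_change_24h',
    ],
    'values': [0.2, 2.0, -0.3, 3.0],
})


def plot_percent_changes(data):
    """Draw one series per coin across the percent-change labels."""
    import seaborn as sns  # imported here so the data prep runs without it

    # kind='point' is an assumption; the transcript does not record the value.
    return sns.catplot(x='percent_change', y='values', hue='name',
                       data=data, kind='point')


# In a notebook cell: plot_percent_changes(df7)
```

Leaving out hue would pool every coin into a single series, which is why the legend matters here.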
Now this looks really good. We could narrow this down to a few less if we wanted to, because there's a lot here and there's a lot of colors, but that's just because we have a lot of different coins. There's a few that are doing really well, I think this is Tron, and then we have a few that are not doing so well. But if you look down here, it's really hard to see this, and that's just because of the column names. So I actually want to change these values so that when we visualize it right down here, it doesn't look like that. I kind of want this to be at least one good visualization you can take out of here. This is definitely not perfect or complete by any means, but you can take that away from here.
I did Alt+Enter, which adds another cell; I could have just pushed plus, that was kind of the lazy way. What I'm going to do is change these values in here. How I'm going to do that is df7, and we only want to look at this one column, so we'll select that right there, and then dot replace, open parenthesis, and then a bracket. Just to show you one of them, I'm going to put in this one-hour value. Then I need a comma, another bracket, and this is what it's going to change to; I'm just going to say 1 hour. We'll do this one really quick, and then, since I don't want you to have to watch me type all this out, I'm going to go through and do the same for the rest. But let's see this really quick. As you can see, where it originally said quote.USD percent change 1 hour, it's now only 1 hour.
Now, this didn't actually change anything yet; we need to apply it to this column right here, so I'm going to assign the result back, and then we'll run df7 again. So now that has actually changed that value. Now I'm going to go through and update that for every single one.
All right, so I basically just put the other ones that we wanted to change in here, separated by commas. So I have the 24 hours, then the seven days, 30 days, 60 days, 90 days, and then this bracket over here, which tells it what to change them to: 24 hours, 7 days, 30 days, 60 days, 90 days. Let's run this. I haven't even tried it yet, and it looks like it worked properly.
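Here is a sketch of the label replacement. The long labels assume the flattened CoinMarketCap field names, the short labels are my own choice, and only three of the six are shown.

```python
import pandas as pd

# Hypothetical df7 rows (values are made up). The long labels assume the
# flattened CoinMarketCap field names; check them against your own columns.
df7 = pd.DataFrame({
    'name': ['Bitcoin', 'Bitcoin', 'Bitcoin'],
    'percent_change': [
        'quote.USD.percent_change_1h',
        'quote.USD.percent_change_24h',
        'quote.USD.percent_change_7d',
    ],
    'values': [0.2, 2.0, -5.0],
})

old_labels = [
    'quote.USD.percent_change_1h',
    'quote.USD.percent_change_24h',
    'quote.USD.percent_change_7d',
]
new_labels = ['1h', '24h', '7d']

# replace() returns a new Series, so assign it back to the column;
# calling it without assignment leaves df7 unchanged.
df7['percent_change'] = df7['percent_change'].replace(old_labels, new_labels)
print(df7)
```

The two lists are matched by position, so the first old label becomes the first new label, and so on.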
So now let's go back down here and run this again. And look at that: it looks so much cleaner, so much nicer. All of them have very little change over that 1 hour, and then you can look further back, and within 90 days a lot of these have gone down, which, again, if you're following crypto, you know there was a big crash recently. Especially all these altcoins that you're seeing right here went down a ton. I think this is Avalanche or Dai or whatever these ones are; they went down dramatically, whereas there's one up here, this lone wolf, that just did really well for whatever reason. So it's really interesting to see.
Now, this is a pretty specific visualization that I personally wanted to see and thought was interesting. You can do absolutely whatever you want with this data. There's so much here; you can do a lot with it, especially depending on how long you track it, right? I only did this over the course of about five minutes, but you can set this up and track it over a longer time.
a longer time now um let's say you wanted to do
now um let's say you wanted to do something much simpler uh you just
something much simpler uh you just wanted to look at like Bitcoin over that
wanted to look at like Bitcoin over that time that you you know uh uh took the
time that you you know uh uh took the data in that's going to be a lot simpler
data in that's going to be a lot simpler than what we just did and I'll show you
than what we just did and I'll show you how to do that really quickly so we're
how to do that really quickly so we're going to look at the data frame and we
going to look at the data frame and we are going to say uh or we're going to
are going to say uh or we're going to take specific columns we just want um a
take specific columns we just want um a few columns that we want to keep or or
few columns that we want to keep or or pull from so we're going to take uh oops
pull from so we're going to take uh oops we're going to take the name
we're going to take the name column we're going to do
column we're going to do (it might be easier if I copied them, but I'm just going to write them out) is quote.USD.price. This is the price of the actual cryptocurrency. Then we're going to do timestamp. Let's make this data frame, and we're just going to call it data frame 10 for absolutely no reason. Maybe a different name would have been easier.

So now we just have these columns. The reason I want to show you this is that you can query this really quickly and pull out only the rows that you want. Let's say we just wanted to look at Bitcoin. We're going to say data frame 10 dot query, open parenthesis, and we're going to say name is equal to Bitcoin. "Equal" is not a single equals sign here; when you're doing it like this, you need to write equals equals. We're going to do it just like that, and we're going to say data frame 10 is equal to the result. Let's try running that. I think something's wrong with it. Let's try it like this. All right, let's try that. There we go. I just needed a double quotation mark instead of a single quotation mark; that was the issue.

So now we have Bitcoin, we have the price, and we have these timestamps, which are the actual times when we ran it. This is the original data frame, and then in this project it took me 15 more minutes to get this one, and then we had it running properly for the next five minutes. So that's actually what we have.
Now if we want to visualize this really simply, what we can do is sns.lineplot, and that's going to be a little line chart or line graph, whatever you want to call it. We're going to say x is equal to timestamp, because we want the timestamp on the x-axis, and then y is equal to quote.USD.price. Let's see if that works. "Could not interpret timestamp for the parameter." That's because it doesn't know which data to use, so we add data equals data frame 10. Now let's try this.

All right, so this looks terrible. Let me just say sns.set_theme, open parenthesis, and we'll do style is equal to darkgrid. This looks a little better. Now again, we are looking at a very, very short time series, but we can look at just Bitcoin or we could look at multiple coins, and this line is showing us the trajectory over time. So you can get really creative with this. You can run this for a long time and show Bitcoin over days, weeks, or months, however long you run this.
And so that's really all I've got. Honestly, like I said, I wouldn't say this is a complete, full project, but I'm showing you how to do something to enable you to run with the ball and do basically whatever you want with this. You can pull data from a different API, or you can use this exact API and data, but I wanted to show you just a few things that I initially saw that I might do with the data. And you have so much. Let me go back to this original data frame. Actually, let's go to this one, this one's better. Look at all this data. You have so many numbers here, so many columns that we didn't even look at that you can use. There's a lot that you can use here, and I'm really trying to just set you up so that you can run with it and do whatever you want. I could have done a thousand different things here, but I tried to just show you two things that you can do with the data that I thought were pretty interesting or simple to do, and I want you guys to go out and do something way, way better than what I did.

So I hope that this was helpful. I hope that this showed you how to automate that process, so you don't have to sit there and click it and append it and do all these different things, and hopefully that will be helpful in your future projects. With that being said, thank you so much for watching. If you made it all the way to the end, you guys are fantastic. If you like this video, be sure to like and subscribe below. I'll see you in the next video.
[Music]
What's going on everybody, welcome back to another video. Today I'm going to be walking you through how to create your very own portfolio website.
[Music]
Now, we just completed our data analyst portfolio project series, where we walked through four projects in SQL, Tableau, and Python. If you have completed those projects, you now want to share them with potential employers, and I think the best way to do that is to create your own website. In just a little bit I'm going to show you two options for how you can actually create your own website. The first one is a website builder like wix.com, and the second one is hosting your own website through something called GitHub Pages. Now, if you have never created your own website before, it can sound a little bit daunting, but don't worry. I'm going to walk you through every single step of the way from the very start to the very end, and once you reach the end you'll have a complete data analyst portfolio website. So without further ado, let's jump on my screen and let's get started.
All right, so the website that you're looking at right now is the actual website that we are going to build in this video. It is hosted on GitHub Pages, or github.io, so this is actually being hosted right now by GitHub Pages. I'll leave a link in the description; if you type this in, you will get this page and you can check it out for yourself if you don't want to just watch me look at it.

It has this little header, and you can write a little bit about yourself, and then these are our actual projects. This is our data cleaning in SQL project, and then there's the COVID data exploration, the Tableau dashboards, and movie correlation with Python. This one is a future video; I plan on doing a few more of these projects because I just really enjoy them. And then there's this contact information at the bottom. So it's a really simple website and it gets the point across.

I have something similar to this for my own personal one; I use a different variation. This all comes from this website, HTML5 UP. There are lots of templates and lots of options that you can use. Again, the one we're going to be working with is this one, but I use a different one for mine, and they are really good, I mean super easy to build and customize yourself. And I will say again, I have no experience doing this. I just watched a YouTube video that showed me how to do this, and now I am creating my own YouTube video to show you how to do this, so it's coming pretty much full circle.

Like I said, there's no real narrative to it; it just links to your project. If you click on this, and let's just open a new tab, it'll take you right to the GitHub project, and then whoever is checking this out, like an employer or a recruiter, can see your code. So, super simple.
Another way that you can do this is creating your own website through a template or something like that, almost in a blog style. I imagine it being something very similar to this, where there's an introduction and you can talk about where you got the data set and how you got the data, and then you can have a more narrative approach with screenshots and with some code as well. This person included screenshots, and then there's the code right here that I can actually copy and paste, and it just walks through the logic of how the project was done. There's a story to it, really, and so that might be something that you're interested in.

Now, I have done something like this in the past, and I used Wix. You can do this completely for free. The one we're doing today is completely free as well, but if you want the customized URL, you do have to pay for it on Wix. You can get a free Wix website with Wix in the URL. So try this out. These are super easy; you can find thousands of templates and a million tutorials on how to do them. But that's not the one we're going to be working on today.
So with that being said, the very first thing that we need to do before we do anything is actually download Visual Studio Code. This is where we're going to open that HTML, and we're going to be working with it in there. Again, I don't know if I said this before, but it seems a little bit intimidating at first; once we actually start looking at it, it's a lot easier than it looks, I promise you. If you're like me and you have a Windows computer, you'll just go right here and install it. It's super easy to install, so I'm not going to walk you through how to do that. Of course, I already have it up and running down here.

Once you have that installed, you're going to come to this website (a link should be in the description). We are going to download this; all you have to click is the free download. It's going to pop up, I'm going to put it in my Downloads, and I'm going to click save. Fantastic. So let's go to Downloads, and it should be right here. Now if we open this up, it has a few different things in it. I'm using the Brave browser, so that's the symbol you see right here; if you're using Google Chrome, you should see the Chrome symbol there instead. But this is everything that you should be seeing.

What we want to do is take it out of this zip folder. There are ways to read into the zip with Visual Studio Code, but I want to make this as user friendly as I possibly can. So we're going to create a new folder. I could call it Massively, or you can call it Port Website, whatever you want to call it; I'm just going to do Port Website. I'm going to copy this in, not cut it, just in case I make a mistake. So I'm going to put all of those things in here.
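If you would rather script the unzip step than copy the files by hand, a minimal Python sketch could look like the one below. The helper name extract_template and the demo file names are hypothetical, not something from the video; the demo builds a throwaway archive so the example runs anywhere. Like copying instead of cutting, extractall leaves the original zip untouched.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_template(zip_path: Path, target_dir: Path) -> list[str]:
    """Extract every file from the template archive into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)  # the original .zip is left untouched
        return archive.namelist()


# Demo with a throwaway archive standing in for the downloaded template.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    demo_zip = tmp_path / "template.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("index.html", "<title>Massively by HTML5 UP</title>")
        archive.writestr("images/pic01.jpg", b"")

    extracted = extract_template(demo_zip, tmp_path / "Port Website")
    index_exists = (tmp_path / "Port Website" / "index.html").exists()
    print(extracted)
```

For the real download you would point zip_path at the file in your Downloads folder and target_dir at wherever you want the Port Website folder to live.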
And now what we're going to do is go to Visual Studio Code right here, and you should be greeted with this. We're just going to click Open Folder, go to Port Website, and click Select Folder, and you're going to say yes, I trust this one. Right over here are all of the documents that we were just looking at. We'll work a little bit in the images folder, because I'll show you how to add your own images, but really the only file we're going to be working in is this index.

Again, if you've never looked at HTML before, it does look a little bit complicated, but HTML to me is one of the more easily understood languages. Once you start getting into it, which we're about to (we're going to walk through the entire process), it actually makes a lot of sense and it is pretty simple.

Something that you're going to want is called Live Server. If I click right here and I click Open with Live Server (you don't have it yet, I'm guessing, unless you've done this before), it's going to open up this website, and this is what we're looking at right now. It has a bunch of gibberish, or some language that I do not know, and we can view this live. In just a second I'm going to take myself off screen, but before I do that, let's search for that extension. I think it's called Live Share... Live Server. Let me see what this is called. Yeah, Live Server. So come right here; there it is, that's the one. You just need to click Install. It takes like 5 seconds and it should be completely installed. What this does is it just hosts a local website. It's not something that anybody else can access, but it connects to your code, and when we make updates you can see those updates live. I'll show you all that in a second; just be sure to install that.
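To make "hosts a local website" a bit more concrete, here is a rough stand-in using only Python's standard library. It is not the Live Server extension and it does not auto-reload when you save; it only serves a folder at a 127.0.0.1 address that other people cannot reach. The helper name serve_folder and the demo page are assumptions for illustration.

```python
import functools
import http.client
import http.server
import tempfile
import threading
from pathlib import Path


def serve_folder(folder: Path, port: int = 0) -> http.server.ThreadingHTTPServer:
    """Serve the files in folder on 127.0.0.1 (port 0 picks a free port)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(folder)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Demo: serve a one-page site from a temporary folder and fetch it once.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    (site / "index.html").write_text(
        "<h1>Alex the Analyst Portfolio</h1>", encoding="utf-8"
    )

    server = serve_folder(site)
    port = server.server_address[1]

    connection = http.client.HTTPConnection("127.0.0.1", port)
    connection.request("GET", "/index.html")
    body = connection.getresponse().read().decode("utf-8")
    connection.close()

    server.shutdown()
    server.server_close()
    print(body)
```

In practice the extension is the more convenient choice for this tutorial, because it refreshes the browser every time you save.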
With that being said, let's get out of this and go back right here. I am going to take myself off screen so that you can see everything that I am seeing as well. It's been really great seeing you. I have lots of different videos coming up, lots of new projects. I really enjoyed this project series, and I think I'm just going to do more of them. All right, I'm going to get myself off screen.

So let's look at what we actually need to do. Let me see. Okay, so we're already connected to the live... actually, I got rid of it, whoops. Let's pull this over, and let's pull that, and we're going to Open with Live Server. I know this is going to be a little bit squished, and I'm sorry about that, but if we look right over here, this says "This is Massively". You can change that; that's this right here. We're going to say "Alex the Analyst Portfolio", and we'll get rid of this "Massively". You can go up here and hit Save, but I'm going to hit Ctrl+S, and just like that it updates on the website. Now again, this is just local, so it's nothing that anybody can see, so don't worry. What we're going to do is I'm going to walk you through the entire process of creating this, and then at the end I will show you how to host it on GitHub. Honestly, it's a fairly easy process; it just takes a little bit of time to customize it all. So let's get into it.

So we have this (you may not be able to see it, let me actually pull this up) where it says "Massively by HTML5 UP". We're going to customize that as well. Whoops, I don't want to do that every single time; I'm going to try not to go full screen and go back and everything like that. So we're just going to say "Alex the Analyst Portfolio", Ctrl+S, and right up here that changed it. Yeah, don't ask me that again, thank you. Right up here, which you probably can't see at the moment (we'll see that later), it customizes this tab, which is really cool.
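The text on the browser tab comes from the title element near the top of index.html. In the video that line is edited by hand in Visual Studio Code; the sketch below performs the same edit in Python just to show which piece of the file changes. The helper set_page_title is hypothetical, and the sample markup is an assumption about what the template's title line looks like.

```python
import re


def set_page_title(html: str, new_title: str) -> str:
    """Return html with the text inside the first <title> element replaced."""
    pattern = re.compile(
        r"(<title>)(.*?)(</title>)", flags=re.IGNORECASE | re.DOTALL
    )
    # Keep the opening and closing tags, swap only the text between them.
    return pattern.sub(
        lambda match: match.group(1) + new_title + match.group(3),
        html,
        count=1,
    )


# Assumed shape of the template's head section, for illustration only.
template_head = "<head>\n<title>Massively by HTML5 UP</title>\n</head>"
updated_head = set_page_title(template_head, "Alex the Analyst Portfolio")
print(updated_head)
```

For a single change like this, typing over the text in the editor is just as good; a script only starts to pay off if you maintain several pages.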
So let's go right down here. Now, this is where it says "a free, fully responsive HTML5 template". We can customize that, and I highly encourage you to do so. They actually included their Twitter handle right here, and you can do the same. If you look at this one right here, I included my Alex the Analyst handle that goes to my YouTube channel, and you can do the exact same thing: include your LinkedIn or your GitHub profile or whatever you want to include in there. So be aware that you can do that.

Oops, I need to click back in here. Don't write what I'm writing; I'm just going to make it really simple, but this part is meant to be a little bit about you and who you are. I'm going to say "Data analyst skilled in SQL, Tableau, and Python", and then I'm just going to get rid of all of this, everything from here over, and Ctrl+S. So, super simple. Actually, where was that? Here it is. We don't need that; we don't need anything from here over, probably to here, honestly. Let's see what that looks like. And again, you can use any website right here that you want, and you can customize what it looks like. I'm going to say "Alex the Analyst", and then whatever URL you want to include in there, that's what you need to put. So now if I hit Ctrl+S, it says "Alex the Analyst". So, pretty easy.
easy now we're going to go down and you can use this however you want to use it
can use this however you want to use it I would you can even make this um you
I would you can even make this um you can make this like one of your one of
can make this like one of your one of your readmes like a you and put the link
your readmes like a you and put the link for that I decided to include um again
for that I decided to include um again on this one I decided to include the
on this one I decided to include the project that I thought that we've done
project that I thought that we've done that was like the most impressive or the
that was like the most impressive or the I don't know the coolest one I don't
I don't know the coolest one I don't know if you consider data cleaning and
know if you consider data cleaning and SQ cool but um I do I think it's cool so
SQ cool but um I do I think it's cool so I included that one as my very first one
I included that one as my very first one so that's what we're going to do um
so that's what we're going to do um right here so we're going to go down and
right here so we're going to go down and it's going to
it's going to say
say let's say it says this is massively
let's say it says this is massively that's not
that's not it uh cool so let's see what oh okay I
it uh cool so let's see what oh okay I know what that is we'll come back to
know what that is we'll come back to this up here um in just a little bit I'm
this up here um in just a little bit I'm going to go full screen I'll show you
going to go full screen I'll show you what this is and then we'll come back to
what this is and then we'll come back to it but if we go right down here this is
it but if we go right down here this is our what they're calling a featured post
our what they're calling a featured post and then the ones below this are posts
So in our featured post, I'm going to get rid of the date; I don't want them to know that I just created it. Oops, I keep hitting Ctrl+A and selecting everything. For the title we're going to say "Data Cleaning in SQL," and we'll get rid of this and hit Ctrl+S. Again, I'm saving a lot so that you can see what I'm doing and where it's going.
We're going to get rid of basically all of this text, go back, and just say "In this project we clean housing data in SQL Server," and Ctrl+S. Super easy. Give it a little more description than that; I did in my other one, and you can see that website, so go check it out.
Then we'll have an image. At the end we're going to go back and redo all the images, but I'm not going to do that at this very moment.
So what now? You can keep this "Full Story" button; I chose to change it to "View Project." I hit Ctrl+S, and it says "View Project." I think that just looks better, especially if you're displaying a project.
Now we go into all the individual posts... actually, no, wait. What I want to show you really quick is how you actually link a project to this. Let's go right over here. That's our COVID one; here's the data cleaning project. All you have to do is take this website address, that's the URL, and put it right here.
There are three different places with an href, which is where you can put a link to a website. On here, it references this: they can click on the "Data Cleaning in SQL" title, they can click on the image, because this href is right next to the image, and they can also click on the "View Project" button. You can put it in all three. You just stick the URL right where that hashtag, or pound sign, is, and then we're going to save that.
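The step above, swapping each `href="#"` placeholder for the project URL, can be sketched in a few lines of Python. This is only an illustration: the markup below is a simplified stand-in for the template's featured post, and the image name and project URL are hypothetical placeholders, not values from the video.

```python
# Simplified, hypothetical featured-post markup with three placeholder links:
# the title, the image, and the button.
FEATURED_POST = """<article class="post featured">
  <h2><a href="#">Data Cleaning in SQL</a></h2>
  <a href="#" class="image main"><img src="images/housing.jpg" alt="" /></a>
  <a href="#" class="button large">View Project</a>
</article>"""


def link_placeholders(html: str, url: str) -> str:
    """Replace every href="#" placeholder in html with the given URL."""
    return html.replace('href="#"', f'href="{url}"')


# Placeholder URL; substitute the address of your own project.
PROJECT_URL = "https://github.com/your-username/your-project"

linked = link_placeholders(FEATURED_POST, PROJECT_URL)
print(linked)
```

Doing the same edit by hand in the editor, as in the video, is just as valid; the sketch only makes explicit that all three links end up pointing at the same address.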
Oops. Oh, this is embarrassing; I am not a web developer, as you can see. But then if I go in here, right-click, and say open link, it takes me to that project. Super simple, and we're going to do basically that for all of these.
I'm only going to show you three, and then you can do the rest, but I also want to show you how to put the Tableau one in. It's the exact same thing, but it's a little different, so I wanted to show it to you.
The next thing we're going to do is go down to the posts. Again, I'm going to get rid of this date. You can keep it in there if you want, that's totally fine; just update the date. This has that "Sed magna" text again; I think it might be some language that I just don't know about. The next one is "Data Exploration in SQL." I'm going to get rid of this, and we'll save that.
Perfect, and we'll change the button to "View Project." Cool. Now we need to customize this summary, so I'm just going to say something really simple: "Data exploration of [...] Server." There we go. Let's save that; we have "View Project."
Now let's go get our project. This is the data exploration one, so we're going to copy its URL and put it right in here, and right in here as well, and if you want to you can also include it right up here, so we have it in all three places. Again, once you click on these, the project will come up.
Let's go to the next one. We're going to get rid of this; this one is going to be our Tableau projects, so let me just copy that URL while we're here. If you have one specific project that you want to include, what you need to do is go in here, click View, and grab that URL. What I am doing is just sharing my Tableau Public page. If you have tons of projects in there and you want people to be able to see all of them and pick and choose what they want to look at, then just use the URL that we're using right here.
In the HTML, I'm going to put "Tableau Projects" as the title, paste the URL like this, and get rid of that hashtag, pound sign, whatever you want to call it. We'll hit Ctrl+S, and, oh, we've got to do this part as well. "This is my..." this is going to be terrible, don't use this. "This holds all of my Tableau dashboards." Please don't do this.
I am doing this because I don't want to take forever in a video to make it perfect. Then you're going to do the exact same thing for the rest. In this one right here I included four, so I'm going to keep four... no, I'm just going to do these three; I'm not going to take up more of our time. So we did those, and I'm just going to keep these three in for visual purposes.
Once you get down here, what we're going to do is delete some of this. This is our data exploration, and where's our Tableau? This is our Tableau right here, "Tableau Projects." The posts are separated by these article tags, so we're going to select from right here and go down, down, down to right here. Deleting that gets rid of all these other articles, or what they're calling posts. We'll get rid of those and hit save, and now, as you can see, we have our header, our first project, and our second and third. I would include the other projects that we've done in here so that it looks good.
This footer right here we don't need, because we don't have anything else in there, so we're going to get rid of that as well, and now we just have this information.
Now, I don't have anything where they can enter a name, email, and message. You can keep that in there if you'd like, but I am going to get rid of it. We're going to go right here; that's the section, so don't delete that section, we want it. I'm going to delete this footer section, as they're calling it, and now we have this address, phone, email, and social. I'm going to get to the social in just a second; it's again super easy.
For the address, I just put my location. I don't want to give somebody my address or put it on a website anywhere, so I'm just going to put Dallas, Texas, and we can keep it like that. Oops, we'll hit save, and it'll say Dallas, Texas. I hate the look of the zeros in the phone number, so we're going to change it to 123-456-7890.
Then for the email we'll put AlexTheAnalyst95@gmail.com. If you have issues with this, you can email me. I will try to respond to all your emails; I get a lot, so I will do my best, but that is my actual email, if you are curious.
Now that we have this, we also have the social media links. I want to display my LinkedIn and my GitHub, so I'm going to go over here and pull up LinkedIn. Perfect. I'm going to take my LinkedIn URL, and I'm going to get rid of these first two entries, because I'm only going to include two. For this one I'm going to type "LinkedIn," oops, "linkedin," and then right here I'm going to replace that with "linkedin" too, and you put the link right here.
Then we're going to go get the GitHub one. Let's do GitHub... oh, who is this? Sign up? What is going on? Let's just go back here; that was something I was viewing a while back. So we're going to take the GitHub URL and put it right here. It already has it as GitHub. Is this supposed to be lowercase? I think it is; let me see if this one is lowercase as well. Yeah, so do it like that, lowercase. I forgot that was how they did it.
Oh, and that's the label, which doesn't matter as much; this right here, the class, is actually the important part. When we go back here there is no LinkedIn icon, but when we save it, oops, when we save it, it has the LinkedIn icon, because it's already a class that was created in this HTML template. So we have that, and let me
bring this full screen really quick, because there are a few things that we couldn't see in that split screen. These right here are things we could not see before, and these as well. What we can do is go down here, copy these social links, and use them to replace the ones right here, so the top has them too, and then we're going to get rid of these two right here.
This says "This is Massively," and we're going to change that as well. Let's make the editor full screen for the first time. Feels good. I hate doing split screen, but I do it for you guys. So, "This is Massively": we're going to get rid of these two. This is the nav, the different tabs; we're going to get rid of those two tabs, and this one I'm just going to call "Projects." Once we go back and refresh, you'll see those changes.
So we made those changes. Here's the social media stuff: we're going to copy these two and replace all of these with them. Let's save that and go back. Now, as you can see, those two tabs are gone, this says "Projects," and there are only two icons right here. If you click on this one it goes to my LinkedIn, or your LinkedIn when you do it, and this one will take you to the GitHub. So it is all working as intended. This is great.
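The point about the class being the important part can be sketched as follows. This is a rough illustration, not the template's exact markup: it assumes the icons come from lowercase Font Awesome style class names of the form `fa-<name>`, and the `icon brands` prefix and both profile URLs are assumptions to replace with whatever your copy of the template uses.

```python
# Hypothetical profile URLs; replace with your own.
SOCIAL_LINKS = {
    "LinkedIn": "https://www.linkedin.com/in/your-profile",
    "GitHub": "https://github.com/your-username",
}


def social_item(name: str, url: str) -> str:
    """Build one icon list item.

    The lowercase fa-<name> class is what selects the icon; the label
    text is only descriptive, which is why its casing matters less.
    """
    icon_class = f"icon brands fa-{name.lower()}"
    return (
        f'<li><a href="{url}" class="{icon_class}">'
        f'<span class="label">{name}</span></a></li>'
    )


items = [social_item(name, url) for name, url in SOCIAL_LINKS.items()]
print("\n".join(items))
```

The same two list items would be pasted wherever the template lists its social icons, which in the video is both the top navigation and the bottom of the page.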
When you scroll down and it says "Massively," we can change that as well, and we should. Let's do that really quick: we'll just say "Alex the Analyst" and update that. There we go.
So, in a nutshell, this is a lot of it. We need images, and I don't think I set those up for this video, so I'm going to cut myself off for like two seconds and go pull those images in, because it could take a few minutes and I don't want to waste your time. Then I'll come back, so I'll see you in two
seconds. All right, so I just pulled over the images that we are going to use. Let's go to Downloads; they're right here: housing, Tableau, and COVID. If I open up this COVID one, this is what the image looks like, and it's what we're going to use for that COVID project. I'm going to copy these, go into the portfolio website folder that we have, go to images, and paste them in there.
Now that we have those images in there, let's go back and see what we've got. We just put the images in here; you'll have this images folder right here, and you can open it up and see everything in it. All we're going to do is replace the temporary images that the template had for us, and we should be golden. Then we're going to actually upload it to GitHub and create our website for free.
Let's go right down here. This is our very first one, the data cleaning in SQL, with the housing data. This image right over here says images/pic01.jpg. This is the housing one, so what we're going to do right here is type "housing," and it'll autocomplete for us, so the housing image should be in there now. The next one is the data exploration in SQL; that was with the COVID data, so we're going to get rid of this and say "covid," because that is the image that I have right over here. The last one is, excuse me, Tableau, so let's go right over here and type "tableau." Oh, I've got to save that.
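Since the page only shows an image when the `src` path matches a file in the images folder, a quick check before uploading can catch a mistyped name. The following is a minimal sketch using only the Python standard library; the file names in the demo are hypothetical, and it assumes image paths are written relative to the folder that holds index.html.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_images(html: str, site_root: Path) -> list[str]:
    """Return the local <img> paths in html that do not exist under site_root."""
    collector = ImgSrcCollector()
    collector.feed(html)
    local = [s for s in collector.sources
             if not s.startswith(("http://", "https://"))]
    return [s for s in local if not (site_root / s).is_file()]


# Demo in a throwaway folder: one image exists, the other does not.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images").mkdir()
    (root / "images" / "housing.jpg").write_bytes(b"")
    sample = ('<img src="images/housing.jpg" alt="" />'
              '<img src="images/covid.jpg" alt="" />')
    result = missing_images(sample, root)

print(result)
```

For a real site you would read index.html from the portfolio folder and pass that folder as `site_root`; an empty list means every referenced image was found.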
Ctrl+S. Perfect, and now let's look at it. There you go. Oh, this one still says "Full Story." I'm going to go change that; it just doesn't feel right. "View Project"... oh, that's not how you spell it. Okay, Ctrl+S. Perfect.
So now this looks a lot better, and when we host it through GitHub Pages, or github.io, this is what it's going to look like. You can add a lot more to it or take away from it. You can add as many projects as you want: copy those articles, or posts, and just keep adding them. So this is kind of what it's going to look like.
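Copying an article block for each new project can also be sketched as a small generator. This is an illustration under stated assumptions rather than the template's real markup: the article structure is simplified, and the summaries, URLs, and image names are placeholders, with only the two titles taken from the video.

```python
from html import escape

# Hypothetical entries: titles follow the video, while the summaries,
# URLs, and image file names are placeholders to replace with your own.
PROJECTS = [
    {
        "title": "Data Exploration in SQL",
        "summary": "Placeholder summary for the data exploration project.",
        "url": "https://github.com/your-username/data-exploration",
        "image": "images/covid.jpg",
    },
    {
        "title": "Tableau Projects",
        "summary": "Placeholder summary for the Tableau dashboards.",
        "url": "https://public.tableau.com/",
        "image": "images/tableau.jpg",
    },
]


def render_post(project: dict) -> str:
    """Render one post as an <article> block with the URL in all three links."""
    title = escape(project["title"])
    summary = escape(project["summary"])
    url = escape(project["url"], quote=True)
    image = escape(project["image"], quote=True)
    return (
        "<article>\n"
        f'  <header><h2><a href="{url}">{title}</a></h2></header>\n'
        f'  <a href="{url}" class="image fit"><img src="{image}" alt="" /></a>\n'
        f"  <p>{summary}</p>\n"
        f'  <a href="{url}" class="button">View Project</a>\n'
        "</article>"
    )


posts_html = "\n".join(render_post(p) for p in PROJECTS)
print(posts_html)
```

Adding a project then means adding one more dictionary to the list, which mirrors copying and editing one more article by hand.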
And it was not that hard, I don't think. I hope this was not too difficult; it's really just using a template and understanding a few basics of HTML. We have this all saved already, so what we are going to do now is upload it to GitHub.
Let's go right over here and go to Repositories. Where's the New button? Oh, I need to sign in. Okay, I'm going to cut this part so you can't see it. So we are going to create a new repository, and we're going to call it AlexTheAnalyst2.github.io. Write it just like that. If your name is, say, Alex Jimmy (I don't know why I said Jimmy), it would be AlexJimmy.github.io. You can always go back after the fact and change this, so it's not a big deal.
We're going to create this repository, then choose "uploading an existing file," and instead of choosing the files, we're just going to go right over here and drag them in. So we take this and drag it in right here. It'll take a little bit; it says 75, but it shouldn't take that long. Let's just wait for it. I'm taking a sip of water, I apologize. It is literally uploading everything that we had in there, so all the updates and changes and all the stuff that we had.
And it looks like it's done, so let's just write "initial commit" and click Commit changes. It is processing. It should be done very soon, as long as I have a good internet connection. We shall see; stick with me, it's taking its time. While it's loading, let's go over to... oh, there it is. Perfect. Here's everything that we have, plus this README that it generated.
Let's go over to Settings, and we have this github.io name. If we go right down here to GitHub Pages, it says "Pages settings now has its own dedicated tab," so let's check it out. It's currently disabled, but we want it to pull from the main branch. I think it's the /docs folder; we'll see. I'm going to save this. "Your site is ready to be published." Let's open this up. Okay, site not found. Maybe it's from the root. Save. "Your site is having a build problem."
Let me see if I can actually change the name. I already have an AlexTheAnalyst one, but I'm going to see... it's already taken. I'm just going to try this one more time... oh, and now it's working. I have no idea why it didn't work before, but this is fantastic. Maybe I was just reading too much into those messages. I had never tried to create another github.io or GitHub Pages site on this.
Anyway, thanks for sticking with me through all that. So now we have our actual website. The address doesn't look the same up here because of that thing we were just looking at; it should just be this part right here. But this is an actual website now, it's being hosted through GitHub, and it's completely free.
If you want to pay, you can hide this from your GitHub. Something I didn't mention: when you're doing this for free, your repository has to be public. If I change the visibility to private, you will not be able to see the site anymore. If you want to make this repository private, you then have to pay; I think it's like $4 a month or something like that. So if you don't want to display the repository on your GitHub, that's worth looking into.
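The address a GitHub Pages site ends up at depends on how the repository name relates to the account name, which may be why the address in the video is longer than expected. The following is a small sketch of the default rule, ignoring custom domains; `pages_url` is a helper written for this illustration, not part of any GitHub tool, and the account and repository names are placeholders.

```python
def pages_url(owner: str, repo: str) -> str:
    """Return the default GitHub Pages URL for a repository.

    A repository named exactly <owner>.github.io is served at the root of
    that domain; any other repository is served under a path named after it.
    """
    host = f"{owner.lower()}.github.io"
    if repo.lower() == host:
        return f"https://{host}/"
    return f"https://{host}/{repo}/"


# Repository name matches the account: the site lives at the domain root.
print(pages_url("your-username", "your-username.github.io"))
# Any other repository name: the site lives under a path.
print(pages_url("your-username", "portfolio"))
```

So if you want the short address, name the repository after your own GitHub username followed by `.github.io`.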
that on your GitHub worth looking into but this is our final product I mean it
but this is our final product I mean it looks pretty fantastic and you can use
looks pretty fantastic and you can use any of these templates right there are
any of these templates right there are lots of different templates that are
lots of different templates that are fantastic I mean they look amazing they
fantastic I mean they look amazing they look professional um it's really up to
look professional um it's really up to your style like this one looks kind of
your style like this one looks kind of cool a little bit um edgy for for my
cool a little bit um edgy for for my taste but uh this one looks really good
taste but uh this one looks really good too might might be able to add some more
too might might be able to add some more narrative to that one so again go
narrative to that one so again go through it make your make a good choice
through it make your make a good choice in it and then update it how we updated
in it and then update it how we updated it uh I will include the um let's see I
it uh I will include the um let's see I will include everything that's in here
will include everything that's in here and I'll keep this on my on this GitHub
and I'll keep this on my on this GitHub that you can go in there and if you want
that you can go in there and if you want to download these images you can
to download these images you can download the images that I I used um or
download the images that I I used um or you can go find your own just um you
you can go find your own just um you know look for try to get like HD images
know look for try to get like HD images on Google just type in Google Images and
on Google just type in Google Images and search for whatever image you want to
search for whatever image you want to search try to get an HD image with that
search try to get an HD image with that being said that is the entire project I
being said that is the entire project I I I I hope this didn't go too long um
I I I hope this didn't go too long um this may have gone you know this may
this may have gone you know this may have gone like 30 45 minutes but in the
have gone like 30 45 minutes but in the end of it at the at the end which is
end of it at the at the end which is where we are now we have an entire
where we are now we have an entire website it was completely free and I
hope that you can host your projects there and create more projects I will be
coming out with more projects myself that hopefully will be interesting to you
in the future so with that being said thank you guys for joining me for those
of you who stuck it out to the very end you are fantastic post your website on
LinkedIn and tag me in it because I love seeing you guys do these projects so
I'm super excited to see all of the ones that you tag me in on LinkedIn and
whatnot so with that being said this is it I hope you learned something I hope
that it worked for you and I appreciate you watching be sure to like and
subscribe below and I will see you in the next video goodbye
what's going on everybody welcome back to another video today I'm going to
help you create a data analyst resume
now when I say data analyst resume it's not that much different than a regular
resume except that it's going to be tailored to a data analyst job in just a
second we're going to take a look on my screen at a sample resume I'll have the
template in the description so you can just go and download it and fill in
your information it's a fantastic starting place for actually creating your
resume when we're looking at this resume we'll take a look at each section and
kind of dissect each part of it and then at the very end I'll give some extra
tips on what you should include and how to actually write your resume as well
so without further ado let's jump onto my screen take a look at the resume and
see how you can create your own data analyst resume so here's our sample
resume I'm just going to walk through the entire thing super quick and then
we'll break down each section individually I'll give my thoughts and some tips
on each section and remember you can download this exact thing in the
description below I'll have a link I'll probably put it on my GitHub or
somewhere else but it'll be free to download so you can go ahead and do that
but let's zoom in just a little bit so at the very top we have our header we
have just some basic contact information then we have skills then we have
projects and notice the projects are up here at the top we'll get to the order
of where you should be putting your things later then we have work experience
and
then we have education so really quickly I'm going to zoom out and I hope you can
still see it the order is actually quite important now there is one piece that
is not in here right now and that is a summary section I don't have a summary
section on my real resume I just don't think it's useful or helpful you can
include one and it would be right up here at the very top now why do we have
the skills and projects at the top well it's because most people who are
trying to break into data analytics don't have any experience in data
analytics if I am reading this resume as a hiring manager and the first thing
that I see up here is experience and it's not analyst it's a teacher or a
nurse or something I'm going to be like this person doesn't have any
experience
I don't want to hire them the first thing that you want to have in your
resume is something that is good for the hiring manager to see in the first
several things you should put all your best stuff at the top that's what I
believe so I think that these skills are really strong a lot of great skills
and then these projects are all really good projects now this is just a sample
they are real projects they're just not ones that I built myself it's just a
sample so then right here we have our work experience now if you're like I
said a nurse or a teacher or a lawyer or something that's not relevant to data
analytics you want that at the bottom and then you're going to want to tie in
some things in these descriptions and then the education at the bottom my
education was terrible okay I had a bachelor's in recreational therapy which
had nothing to do with data analytics so for a tech job it was not good I
always had mine at the bottom so let's start at
the very top and walk through each section so
at the very top you want to have maybe a title but for sure your full name you
definitely want to include your phone number if you're okay with them calling
you but definitely an email for sure include things like a LinkedIn profile or
a GitHub profile you can also put your portfolio in fact I highly recommend
putting your portfolio because it just looks good and if they check it out
that's a really good thing and then your location because sometimes your job
is going to be location based whether you're in Dallas or another metropolitan
city it's just nice to have that on there this should be the simplest one to
fill out if you haven't built out something like a portfolio you just don't
include it but this one should be the simplest one right you're just putting
contact information maybe a link
to a website next we have the skill section and this one on my own personal
resume I have at the very top I typically recommend that anyone who does not
have experience and is trying to break into data analytics put this at the top
as well and have these skills and know these skills that's important but when
the hiring manager first sees this there's just going to be a mental check
okay they have the skills that we're looking for let's move on to the rest of
the resume you want as many mental checks for what they're looking for at the
beginning I'm going to keep repeating that this is how I
personally write my skills so I write something like SQL and then I'll say SQL
Server MySQL PostgreSQL now I have used all these different types of SQL in my
actual job if you haven't done that and you're just starting out maybe you put
something like subqueries stored procedures joins whatever the actual things
within SQL are I don't recommend that as much because typically if people use
SQL they know what SQL is so they're just going to expect that you know those
things now for something like Python it's different because there are packages
and libraries within it so you can specify I have worked with pandas in my
actual job and I look for people who know pandas as well because we use it so
actually specifying these packages or libraries is really helpful so this is
how I would
put these things on a resume now this is another resume this is our sample two
I'm going to maybe include this one down below although I don't like this
format as much but if you like it you can use it here's another way that you
can show these skills just a different way to do it I want to show you both
ways we have Python and the libraries underneath it I've even seen it where
people will write out almost like let me go down here they'll write out like a
narrative they'll do Python and then they'll have a colon and then they'll say
used to manipulate data in pandas dot dot dot and they'll write it out you can
do that as well again I like bullet points because it's to the point it's
exactly what you need let's get rid of this one real quick so this is the one
that I like so that's the skill
section let's move down to the projects now the project section is almost
primarily for people who are just starting out once you get experience
typically you maybe have one project on there or no projects at all but the
project section is used kind of in lieu of actual experience right I've always
said that you need to build projects not just for your resume but also for the
interviews so then when you get into an interview you can point to these
projects and say yes I've used SQL I did it in this project and they may have
seen it and you can walk them through how you actually used it it gives you
more credibility than just saying you know how to use SQL so within
the project section we're going to have a project like this one says data
science job market exploratory data analysis so this is a personal project and
then within it they did some really great stuff here's usually what I
recommend and this is in here which is you specify what you did you say I used
Python and what did you do to analyze this and gain insights into the job
market then you walk through some of the things that you actually did things
like regex techniques you used pandas matplotlib you built a word cloud these
are keywords that somebody will look for and they even highlighted them which
I personally like and do myself they highlighted these things so that the
hiring manager is actually seeing them making sure that they're bold so that
they are catching their eye so I personally do this and I recommend this
that's all it needs to be it just needs to be I built a Tableau dashboard
doing this from this data set I cleaned it in SQL and you show those skills
something
that's important in both the skill section and the project section is using
and highlighting your skills as much as possible especially if you don't have
any experience if you've never had a job before once you have a job and you
come down to the work experience then it kind of speaks for you but if you
don't you want the projects and the skills to speak to your skills and
credibility so we have this right here now one thing that's not in here that I
actually do recommend is a hyperlink maybe right here or actually this title
being a hyperlink to the project because they might read this and be like we
work with you know data science job market data I don't know and then they'll
click on this link and they can see your work that is the one thing that I
would change in this other than that this is exactly how I would have it very
similar to my own and a lot of what I did I actually took from other resumes
and formatted how I prefer and like it so again some of this is personal
preference and you can change it however you want that's just how I like it so
that is the project section
now we're going to go down to the work experience section now this person does
have a little bit of analyst experience so you know if you don't that's okay
you put your previous experience now here's what I recommend if you've been a
teacher for 15 years or you've been a nurse for 10 years or you've had 10
different jobs don't put all your experience on here maybe put your last two
jobs going back maybe three years I don't recommend filling it up because it's
not going to be super relevant unless you're applying for a healthcare data
analyst position and you have a nursing degree then it's relevant and that
experience is super helpful because it's domain experience right then you may
go back five years just use your discretion but what you need to include is
of course your title where you worked your location and the times that's
standard for almost any resume but within here what you really want to do is
highlight again the skills if you can if you can't that'll change but in here
he says implemented new reporting using Excel pivot and VBA which reduced
processing time by 50% this type of quantitative information I reduced time I
saved the company money I did something quantitative putting that in here is
always helpful always highly recommended although it can be tough to measure
these things right typically what I recommend especially if you're first
starting out is to highlight skills if you're a teacher you've probably used
Excel and you've probably used Excel for something closer to data analytics
than you think just in a teacher way and not a data analytics way but you can
reword these things and make them sound good if you are a nurse like I was
saying you've used Excel you've used a health information system you've used
some type of database speak to that include that in here and it can be hard to
write these out and I'm going to show you a way in just a little bit to think
about these things or have something help you write them or give you ideas
we'll
get to that in a second lastly we have the education piece this is again really
simple at the very bottom education what your degree was where you went and if
you have some helpful things to include you can do that and then when you
actually went now you can include other things in here as well like boot camps
if you went to a boot camp or you could also include things like a GPA
although I don't personally recommend it GPA has never been anything that I've
ever cared about or that I've seen anyone care about ever so you don't
normally have to include it one other thing that you can include at the very
bottom is something like certifications I personally don't put a lot of stock
in certifications unless it is one that I have recommended in a previous video
like the Tableau Desktop certification if you're applying to a job that uses
Tableau that actually could be really good so definitely include that but ones
on Udemy ones on Coursera or like my Alex the Analyst boot camp that I have on
my channel I wouldn't really include that in your resume it's mostly for
learning if you get something like the Tableau one or the AWS cloud one or the
Azure cloud one those are all actual certifications that can help you and give
you credibility towards a
certain skill now really quickly let's just take a glance at the other resume
this is resume 2 so we have the education at the top it doesn't have to be at
the top unless it's relevant in which case you could put it at the top we have
a skill section then again the projects the same projects and then work
experience this is just a little bit different order so you can do it like
this as well it's a different way you can write the skills and you can also
include a summary section as well so that's the meat and potatoes of how I
would create a data analyst resume now writing it is actually a different
beast right you have to actually write it out get something on the resume and
then apply using that resume but it can be hard to come up with these ideas so
I just
want to show you something that a lot of people have been using I personally
haven't written a resume in a little while so I haven't used it for my own
resume but I will and that's using ChatGPT or some variation whether it's on
Bing or you get some different version or some new product that's out there at
the moment I'm just going to show you how to do it in ChatGPT some of the
things that you can prompt it to do and that'll be it I'm just going to show
you some ideas that it can generate for you to help you write these things all
right so here on my screen we're on ChatGPT if you haven't used it I'll leave
a link in the description I also have a whole video on how to use ChatGPT for
data analysis so I like ChatGPT now I've already written out these questions
because I don't want to wait for the responses but here's what I asked it to
do and you can do some variation of this whether you're a nurse or a lawyer or
a teacher or whatever I said I'm a high school math teacher trying to become a
data analyst how can I use my experience on my resume to help me get a
job this is just to help provoke some ideas and it says you know you most
ideas and it says you know you most likely have some skills emphasize your
likely have some skills emphasize your quantitative skills so those are some of
quantitative skills so those are some of the things you can focus on showcase
the things you can focus on showcase your ability to commute complex Concepts
your ability to commute complex Concepts which is really important in data
which is really important in data analytics being able to present
analytics being able to present information which teachers have
information which teachers have highlight your experience with
highlight your experience with technology hopefully you're using some
technology hopefully you're using some type of uh you know database for
type of uh you know database for students or you know Excel or something
students or you know Excel or something like that you can highlight that and
like that you can highlight that and showcase your ability to solve problems
now the next thing that I asked it was I
built a covid Tableau dashboard using
Tableau how can I add this to my resume
and then it's going to tell you exactly
how you can do that it's going to say
include the link to your dashboard which
I also recommend provide a brief
description highlight your data
visualization skills include screenshots
or images which is what I would be
putting in the project itself not on
your resume then provide context for the
data all really good stuff really great
now the last thing is kind of what I'm
trying to get at as a whole it can help
you write things so I said
write two sentences highlighting my
covid Tableau dashboard to add to my
resume and it's going to say developed a
covid Tableau dashboard to visualize
pandemic trends using real-time data
sources demonstrating strong data
visualization and analysis skills so
this can help you generate those
descriptions in your work experience it
can help you generate the descriptions
in your projects and this can be really
helpful to just generate some ideas cuz
I personally really struggle with like
highlighting my skills and descriptions
within those things this can be a way to
kind of help you do that so don't you
know just copy and paste but let it
prompt you let it give you ideas now the
last thing that I want to mention is
just your overall resume as a whole the
template that I use the template that I
recommend is very friendly
to these automated systems that check
your resume if you did not know most
companies especially big companies use
these automated systems that scan your
resume see if it has what they're looking
for and then that resume if it gets
through that system gets passed on to a
recruiter or hiring manager typically
most companies don't go straight to the
hiring manager so you need a resume that
can pass through those initial systems
and pass those tests the resumes that I've
shown you today will do that they have
bullet points they have the keywords
they have everything you need that's why
I recommend or partially why I recommend
this type of resume other ones that have
images and different fonts and different
stylings can cause issues with these
automated systems where it just doesn't
read it properly or you know it doesn't
read the right words that you want it to
read
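To make the keyword idea above a bit more concrete, here is a minimal Python sketch of the kind of check an automated screening system might run on plain resume text. This is only an illustration under stated assumptions: real screening products are not described in the video and may work quite differently, and the function name `keyword_coverage`, the sample resume text, and the keyword list are all hypothetical.

```python
import re


def keyword_coverage(resume_text: str, keywords: list[str]) -> dict[str, bool]:
    """Report which keywords appear in the resume text.

    Matching is case-insensitive and whole-word, so a short keyword
    does not match inside a longer, unrelated word.
    """
    lowered = resume_text.lower()
    return {
        kw: re.search(r"(?<!\w)" + re.escape(kw.lower()) + r"(?!\w)", lowered) is not None
        for kw in keywords
    }


# Hypothetical plain-text resume excerpt (assumption, not from the video)
resume = """
Data Analyst Projects
- Built a COVID dashboard in Tableau using public data
- Cleaned housing data with SQL and Excel
"""

# Hypothetical keywords a job posting might ask for
wanted = ["SQL", "Tableau", "Python", "Excel"]

coverage = keyword_coverage(resume, wanted)
missing = [kw for kw, found in coverage.items() if not found]

print(coverage)
print("missing:", missing)
```

A check like this only works when the text can be extracted cleanly, which is one reason a simple text-based template with bullet points tends to be safer than one built around images and unusual styling.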
so just know that these types of
resumes have different uses right you're
not just handing it off to somebody
where they can read it and it needs to
be visually stimulating really what you
need is for it to get through those
initial systems which these resumes
if you write them well and you have good you
know skills and the right things on your
resume they will pass through that first
layer to get to those hiring managers so
again be sure to download those those
are completely free I highly
recommend using them I think they're
really good so be sure to download those
use those just put in your own
information be sure to build out your
own projects don't just keep the ones
that are on there because you'll need to
be able to speak to them sometimes
recruiters or hiring managers are going
to ask you about them how you built it
what you did and you can also point to
those projects in your actual interview
so I hope that this was helpful I hope
that your resume is ready to go I hope
that you're ready to start applying for
those data analyst jobs thank you guys
so much for watching I really appreciate
it if you like this video be sure to
like and subscribe below and I'll see
you in the next video
what's going on everybody my name
is Alex freeberg and today we're going
to be walking through my top three tips
on how to use LinkedIn to land a job
LinkedIn is a fantastic place to look
for a job it's its own little ecosystem
where career-driven people can connect
and talk with one another and help each
other find jobs I personally have landed
jobs through LinkedIn and so I know how
effective it can be let's jump over to
my screen and I'm going to show you my
top three strategies that I have found
to be the most successful for actually
finding a job so I'm logged into my
completely anonymous account here and
I'm going to show you the very first tip
which is you shouldn't be just applying
to a position you should be actually
reaching out to the recruiter and I'm
going to show you exactly how to do that
so the first thing that we have to do is
actually find a job that we want to
apply to so let's go to the jobs section
right over here and let's search for
data analyst and let's do that
in Chicago because why not so it's going
to search for data analyst positions in
Chicago we have one right here let's see
what it looks like cuz you know I don't
want to apply to jobs that I'm not
extremely qualified for so this is a job
that I want to apply for and before I
actually go and apply to the job I
want to see if I can reach out to a
recruiter and talk to them beforehand so
let me show you how to do that so what
we're going to do is actually click on
the company right here it's going to
take us to basically their LinkedIn
profile page for their entire company
and we're going to scroll down we're
going to go over to
people and then we're going to search
for recruiter
so if we scroll down all the way to the
bottom we can see that there are
recruiters that actually work in-house
for this company and so now would be a
time where I actually reach out to some
of these recruiters and I say hey I see
a job that I really like I think I'm
really qualified for it and I would love to
talk more about it with you you can ask
them things about the job to make sure
that it is a good fit for you and then I
highly recommend asking them what
they think is the best way to apply for
this job to make sure that your resume
gets noticed and you get an interview
since they are a recruiter who works at
this company they may be the one
who's actually going to be looking at
these resumes and so they may give you a
tip on the best way to actually apply
they may also just ask you to send them
your resume directly so that they can look at
it or maybe later on down the line this
actually is a person who is reviewing
resumes and so if they come across your
resume they may be able to put a face to
the name and that may give you bonus
points I'm going to leave a template
script in the description in case you
don't know exactly what you want to say
to this recruiter and it'll give you
just a baseline of some of the things
that you might want to say number two is
to actually ask for a referral now if
you don't know what a referral is it
is where somebody who already works at
the company can refer you to a specific
job and that might get you a little bit
higher on the list for interviews so I
highly recommend reaching out to
somebody who already works at that
company and asking if they're willing to be
a referral for you I get people reaching
out to me all the time asking me to be a
referral for them for my company and
nine times out of 10 I say yes I always
ask to see their resume first just to
make sure that their resume aligns with
the position at least a little bit but
there's basically no harm in me being a
referral for somebody in fact I may
actually get a bonus if that person ends
up getting hired and so for the most
part there's almost no risk for the
employee in actually being a referral
and so a lot of times they will say yes
now let me show you how to do that and
it is very similar to finding a
recruiter so we're going to stay on this
people section but instead of searching
for a recruiter we're going to search
for a job title that is similar to yours
so let's actually see if they do already
have any data analysts and if they do
that is the person that we're going to
reach out to because that is the person
we'll probably have the best connection
with so it looks like we have six
employees and let's scroll down and so it
looks like all these people have data
related jobs and so I would reach out to
these people and say I saw an open data
analyst position at your company I would
love to know more about your company as
a whole and then you can talk to them a
little bit and then in the end your goal
is to ask them for a referral and if
that happens that is fantastic and then
you can go ahead and apply for the job
and mark them as a referral for you now
my third tip on how to get a job through
LinkedIn is to actually have recruiters
reach out to you so let me show you how
to do that the first thing we're going
to do is actually go over to my profile
here and we'll click view
profile now there are a few things that we
want to make sure that we have on here
so that recruiters can reach out to us
the first thing that I want to do is to
actually come to this section right here
which is show recruiters you're open to
work and when I click on this I can
actually choose some job titles and some
locations where I actually want to apply
and have recruiters reach out to me and
so right now I have data analyst
in the DFW area which is where I live I
can also add titles like business
analyst and then maybe junior data
analyst entry-level data analyst or
things like that that could potentially
have recruiters reach out to me for
positions that I'm interested in and
then you can say that you're immediately
and actively applying and you can also
say that you're only looking for
full-time positions or contract
positions and then you can actually add
this to your profile and I only want
recruiters to see that because I do
currently have a job at McDonald's and
so I don't want McDonald's firing me
because I'm looking for employment
elsewhere so let's save that and it
looks like it was updated and so now
when recruiters are searching for
candidates for a specific position you
will be on that list so that they can
find you and reach out to you something
else I should mention is on your profile
page I would try to have some type of
professional photo so that you look
really good I would also try to include
data analyst somewhere in your title if
you already have a data analyst job and
you're looking for another one you can
just have your previous company but if
you're looking for a data analyst job
you can always put seeking data analyst
position or something like that another
thing I think is really important is
having really good descriptions for your
previous work I don't currently have
this but I would go a little bit into
the work that I actually do make sure
that the experience matches kind of what
you're looking for if you do have
previous experience if not that's
totally fine the next section on your
profile page that I would recommend
looking at and updating is your skills
section and so you want to go in there
and make sure you have all of your
relevant really data analyst heavy
skills on there specifically hard skills
because soft skills aren't going to
translate too much into this section I
would definitely stick to things like
SQL Python Tableau Excel things that
data analysts are going to use because
this is where they're going to actually
look and see if you have the skills that
they are looking for for that position
when I was only
applying to job postings and not using
any of these strategies my success rate
was 0.4% which means out of 1,000
applications that I filled out and sent
my resume to I only heard back from four
of them to actually get an interview but
with these strategies I was able to get
that up to 10% and at my best I was able
to get that up to 15% but that's because
I was applying to a lot fewer positions
and I was targeting jobs that I really
wanted and so I put in more
effort in order to contact people and
work with recruiters in order to get
that job
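The arithmetic behind those rates can be checked with a few lines of Python. This is a small sketch, not anything shown in the video: the function name `response_rate` is made up for illustration, and the targeted batch of 20 applications with 2 interviews is a hypothetical example chosen to land on the 10% figure. Only the 4 interviews out of 1,000 applications comes from the transcript.

```python
def response_rate(interviews: int, applications: int) -> float:
    """Return the share of applications that led to an interview, as a percentage."""
    if applications <= 0:
        raise ValueError("applications must be positive")
    return 100 * interviews / applications


# From the transcript: 4 interviews out of 1,000 cold applications
baseline = response_rate(4, 1000)

# Hypothetical targeted batch (assumption): 2 interviews out of 20 applications
targeted = response_rate(2, 20)

print(f"baseline rate: {baseline:.1f}%")
print(f"targeted rate: {targeted:.1f}%")
print(f"improvement: about {targeted / baseline:.0f}x")
```

Four callbacks out of 1,000 is 0.4%, so moving to a 10% rate is roughly a 25-fold improvement per application, even though the total number of applications is much smaller.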
I genuinely hope that these
strategies can be helpful for you
especially if you're trying to apply for
jobs right now thank you guys so much
for watching I really appreciate it if
you liked this video and got anything
out of it at all be sure to like and
subscribe below and I'll see you in the
next video hello everybody
congratulations if you are watching this
that means that you completed the data
analyst boot camp if you haven't don't
keep watching this is only for people
who have completed the data analyst boot
camp playlist on my YouTube channel woo
all right now that we filtered those
people out I'm going to show you how you
can download your certificate and your
certification now that you've completed
the data analyst boot camp I will leave
a link in the description but let's go
on to my screen I'm going to show you
how to actually access this and download
your certification all right guys don't
go around telling people this or sharing
this but this is our data analytics
boot camp on the Alex the analyst GitHub
right up here I will have this link in
the description what you can go ahead
and do is you can come right here you
can download this you'll just right
click or click download and you just do
something like save image as or you
can come to this one this is the one
that I think is the real money maker
here this is the certificate of
completion for the data analytics boot
camp I have not my signature but my name
as well as my position with a blank
space right here to fill in your name
feel free to put this on LinkedIn or
Twitter or Instagram and tag me in that
because I would love to just say
congratulations because honestly it's a
lot of work to go through all those
videos and learn all of those skills so
congratulations I hope that you learned
something along this journey a new skill
a new thought a new idea and I'm proud
of you I'm proud of you for putting in
the work it's not easy but you did it
and I hope that you came out on the
other side better for it so congrats
I'll see you in the next video