0:01 you are in the right place to learn to
0:04 become a data analyst in this massive
0:07 boot camp Alex the analyst will cover
0:09 all the core topics that data analysts
0:11 need to know and along the way you'll
0:13 build plenty of projects to gain
0:16 hands-on experience hello everybody my
0:17 name is Alex Freberg better known as
0:19 Alex the analyst on YouTube and in this
0:20 video you're going to be taking my
0:22 entire data analyst boot camp this boot
0:24 camp is comprised of videos that I've
0:26 made over the past 3 years and they
0:28 cover a lot of different topics like SQL
0:30 Excel Power BI Tableau and Python
0:31 throughout the boot camp there are a lot
0:33 of Hands-On guided projects that will
0:35 really help you learn these skills well
0:36 and speaking of projects there's an
0:37 entire Part near the end where you can
0:39 build a free portfolio website where you
0:41 can put all of your projects on so that
0:43 hiring managers and recruiters can go
0:44 and look at all these projects that
0:45 you've built if you wanted to go even
0:47 more in depth into the skills that we
0:48 learn in this boot camp I have a data
0:50 analytics learning platform called
0:51 Analyst Builder Analyst Builder was
0:53 designed specifically for data analysts
0:55 so all of the courses and all the
0:56 content are just for you and it has a
0:58 coding section where you can learn and
1:00 practice for technical interviews and
1:01 lastly before we jump into the boot camp
1:02 I want to give a huge shout out to
1:04 freeCodeCamp for putting this all together
1:05 I personally learned a ton from
1:07 freeCodeCamp and so I'm really honored that my
1:08 boot camp is going to be here for you
1:09 guys to learn and I really hope you
1:11 enjoy it what's going on everybody it is
1:14 2023 and in this video I'm going to help
1:16 you become a data analyst
1:22 we're going to start at the very
1:24 beginning assuming you haven't started
1:26 this process at all of becoming a data
1:28 analyst if you already have you can kind
1:31 of identify where you are in
1:33 this process and then go from there now
1:34 before we dive into everything I want to
1:36 warn you I will be mentioning my own
1:38 channel a lot in this video I have
1:40 videos and playlists on just about every
1:41 single topic that we're going to be
1:43 talking about today I'll have all the
1:44 links to those videos in the description
1:46 so you can dive into those topics more
1:48 in depth so I hope that's okay and it's
1:50 all completely free I've been building
1:52 this out for the past 3 years and
1:54 honestly you can probably get 90% of the
1:55 way to learning everything you need for
1:58 data analytics just on my channel so now
1:59 that I've warned you let's jump into
2:01 number one and that is learn the data
2:02 analyst skills now there are literally a
2:03 hundred different things that you can
2:05 learn for data analytics you can learn
2:07 things like Alteryx or a cloud platform
2:09 or different programming languages but
2:10 there are some core skills that I
2:12 recommend you start out with before kind
2:13 of branching into some of those other
2:15 skills the number one skill that I
2:17 always recommend people start with is
2:19 SQL SQL is just one of those fundamental
2:21 skills I think everybody should learn
2:23 even if you don't use SQL you'll use
2:25 some variation of SQL if your company
2:27 has a large enough data set SQL is used
2:29 to actually query and retrieve data from
2:31 a database so if your company collects
2:34 data which every company does they're
2:35 going to put it somewhere to store it's
2:37 usually stored in a database and SQL is
2:39 how you get that data from the database
2:42 I think SQL is also fairly easy to learn
2:43 which makes it really good when you're
2:44 just starting out I have several
2:46 playlists dedicated to SQL starting from
2:48 beginner all the way to Advanced and you
2:50 can learn all of that for free one other
2:51 reason why I think you should learn SQL
2:53 first is that a lot of companies
2:55 interview or have a technical interview
2:57 during the interview process on SQL
2:58 that's something that really caught me
2:59 off guard when I was first starting out
3:00 because I thought it was going to be
3:02 more behavioral I didn't even know what
3:05 a technical interview was so knowing SQL
3:06 actually became a really important part
3:08 of interviewing and getting a job as a
3:10 data analyst the second skill that I
3:11 would learn is a business intelligence
3:14 tool like Tableau or Power BI now there
3:16 are a ton of different BI tools I can
3:17 literally name 10 off the top of my head
3:19 that I've used throughout my career but
3:20 what I will say is that learning
3:22 something like Tableau or Power BI is
3:24 pretty transferable to almost all those
3:26 other BI tools they're all fairly
3:29 similar in how they do things and how
3:31 they display the data you most
3:32 likely won't have a technical interview
3:34 asking you about Tableau or Power BI like
3:36 to build something for them that usually
3:38 does not happen but the combination of
3:40 SQL where you can query your data and
3:42 then taking that data to build something
3:44 that is a really really great
3:46 combination to learn right away I have
3:47 entire series on both Tableau and
3:50 Power BI with projects on my channel the
3:52 third skill that I would learn is Excel
3:54 now most people have used Excel they
3:56 know what Excel is and how it's used but
3:58 it can be used a little bit differently
3:59 for a data analyst for example
4:01 in Excel a lot of people haven't cleaned
4:04 data in Excel or built charts and graphs
4:05 using Excel and those are things that
4:07 data analysts would probably do excel is
4:09 also just a fundamental skill that every
4:11 company is going to expect you to know
4:13 so I have an entire playlist dedicated
4:15 to excel to actually walk you through
4:17 how to use it for data analysis the
4:19 fourth skill that I recommend you learn
4:21 is python now a lot of people will have
4:23 python higher up on their list they only
4:25 use Python they don't use SQL or a bi
4:27 tool they just do everything in Python
4:29 now python is a fantastic tool you can
4:31 use it to manipulate your data to create
4:34 data visualizations and a ton more like
4:35 web scraping and regular expression and
4:38 a hundred different other things but it
4:40 can be kind of hard to learn it took me
4:42 a long time to really learn the basics
4:44 very well that's really the only reason
4:46 why it is farther back I feel like SQL
4:48 and a bi tool are really easy to learn
4:51 and really pack a big punch whereas
4:53 python can be quite tough to learn in my
4:55 experience and you may not use it as
4:57 often as you would something like SQL or
4:59 a bi tool if you're interested in
5:00 learning Python I have an entire
5:02 series dedicated to python as well as
5:04 projects that you can build again I
5:05 warned you there's going to be a lot of
5:07 self-promotion in this video I have
5:08 videos on just about every single one of
5:10 these topics the fifth and the last
5:12 skill that I recommend you learn and
5:14 this is the only one that I don't have a
5:16 series on yet I will make those is
5:19 learning a cloud platform like AWS
5:21 Google Cloud platform or Azure there's
5:23 no denying that these platforms have
5:25 played a huge impact in how we use data
5:27 as a whole in the data analyst industry
5:29 they can be kind of tough to learn
5:31 though if you aren't using it Hands-On
5:33 in an actual job I think that learning a
5:35 cloud platform is already something that
5:36 most people should start working towards
5:38 because in the future it's only going to
5:40 become more prevalent now where can you
5:42 go and actually learn all of these
5:43 skills that you need to become a data
5:45 analyst well the number one place I'd
5:48 recommend of course is my channel I have
5:50 free tutorials on all these skills and a
5:51 lot of other topics and I think it's
5:53 just a really great place to start the
5:55 next place that I recommend you look
5:57 at is Udemy I recommend Udemy especially
5:59 if you're just starting out because it's
6:00 pretty cheap you can buy an
6:03 entire SQL course for $10
6:05 or $15 and they have courses on every
6:07 single one of these skills and I just
6:09 recently made a video called DIY data
6:11 analysts curriculum using udemy for
6:14 under $75 so you can create an entire
6:15 curriculum to learn all of these skills
6:19 for under $75 which is just amazing the
6:20 next place I'm going to recommend you
6:23 look is Coursera now Udemy is fantastic
6:25 they have really good instructors and
6:28 good courses but as a whole I find that
6:30 sometimes Coursera just has more
6:32 professional or better content Coursera
6:34 is a bit more expensive though you're
6:36 looking at $59 per month for all of
6:39 their courses or you can pay upfront an
6:42 annual fee of $399 so again it's just a
6:44 lot more expensive I moved to Coursera
6:46 once I started having a data analyst job
6:48 and had a bit more money but when I was
6:49 first starting out I just couldn't
6:51 afford it so I went to udemy and it was
6:53 a really great place to start there's
6:55 also places like DataCamp and
6:57 Dataquest that kind of gamify learning and
6:59 they're more text based so all these
7:01 other platforms Udemy Coursera and me
7:03 they're all video based but if you like
7:05 reading DataCamp and Dataquest are a
7:07 lot more text based where you can learn it
7:09 by reading it and doing it after you
7:10 learn all of these skills the next thing
7:12 that I recommend you do is actually
7:14 build projects with those skills now
7:16 what does building a project actually mean
7:18 it means taking a skill and then
7:19 building something out of it that you
7:22 can then show a potential employer for
7:23 example if you went through and learned
7:26 Tableau you go and take a data set and
7:27 you could build a visualization and a
7:30 dashboard in tableau and that would be a
7:32 project with these projects you can
7:34 build something called a portfolio and I
7:36 usually call it a portfolio website a
7:38 portfolio website is a website that you
7:39 create where you store all of your
7:41 projects and then you can share that
7:43 with recruiters and hiring managers so
7:45 that they can see all of your work now
7:47 do you absolutely need a portfolio to
7:50 show employers no you don't but it does
7:52 help in two different ways the first
7:53 thing that it may do is actually help
7:55 you land the interview if you have a
7:57 link on your resume and they click on it
7:58 they may see your skills and see your
8:00 projects and be like man this person
8:02 really knows what they're doing this is
8:04 exactly what we need the second reason
8:05 that I recommend building projects is
8:07 because most likely during your
8:08 interview you're going to get asked
8:11 questions like how have you used SQL how
8:13 have you used Tableau and if you don't
8:15 have any experience in that you're just
8:16 going to say well you know I've taken
8:19 courses to learn it but with a project
8:21 you can be a lot more specific you'll be
8:23 able to say well I actually just built
8:25 out this project in Tableau I took the
8:27 data and cleaned it in Excel and then I
8:28 put it in Tableau and built out this
8:30 dashboard and here are the insights
8:32 that I found from this data set it's
8:34 just a much better answer and as a
8:36 hiring manager myself I can tell you
8:37 that it is definitely beneficial to
8:39 build out these projects The Next Step
8:41 that I recommend you take in becoming a
8:43 data analyst is building a data analyst
8:46 resume the resume to say the least is
8:48 extremely important it's what's going to
8:50 actually allow you to land an interview
8:52 to potentially get a job now if you were
8:54 like me when I was first starting out I
8:57 had a resume it just had nothing to do
8:59 with data analytics so how do you make a
9:01 data analyst resume if you don't have
9:04 any experience as a data analyst well
9:06 you are asking the perfect questions
9:07 because the very first things that we
9:09 talked about are what are going to go on
9:11 your resume those skills and those
9:13 projects if you have no experience or
9:15 degree like myself who has a
9:17 recreational therapy degree if you have
9:19 no background in this it can be really
9:21 daunting to kind of display that you
9:23 know what you're doing and that a
9:24 company should hire you so what I
9:26 usually recommend is right beneath your
9:28 contact at the top you put your skills
9:30 and your projects that you built out on
9:32 your resume things like work experience
9:34 and education should go on your resume
9:37 as well but just a little bit lower you
9:39 want them to see those things before
9:40 they see that your last work experience
9:42 was at Domino's and you have a degree in
9:45 Marine Biology it's just not relevant to
9:47 data analysis and if you put those
9:48 things at the top they're probably going
9:50 to rule you out right away the fourth
9:52 step to become a data analyst is
9:54 actually applying you have the skills
9:55 you have the projects you have the
9:56 resume now you're ready to start
9:58 applying for those data analyst jobs now
10:00 there's a lot of different
10:01 opinions on how you need to go about
10:03 applying for data analyst jobs but I'll
10:05 give you my take on it and this has been
10:07 the most successful for me in my career
10:08 the first thing I want to mention is
10:10 actually what I would not do which is
10:13 just blindly apply on Glassdoor Monster
10:14 ZipRecruiter and all these other
10:16 platforms to just any data analyst job
10:18 that you can find now I'm not against
10:20 this I think you should do that but I
10:21 don't think that's the only thing that
10:23 you should do because the chances of you
10:25 getting a call back or actually hearing
10:27 something back are extremely low to
10:29 really increase your chance of becoming
10:31 a data analyst I highly highly highly
10:33 recommend working with a recruiter a
10:35 recruiter is literally someone who is
10:37 there to help you find a job now when I
10:38 first started out I didn't understand
10:41 what a technical recruiter was at all I
10:43 was kind of nervous or scared to work
10:45 with them but it's actually pretty simple
10:47 a company has a position that they want
10:48 to fill and they don't want to spend
10:51 hours and hours and hours to find
10:52 someone to fill that position so they
10:54 hire a recruiter a recruiter is going to
10:56 go out and try to find someone to fill
10:58 that position AKA you and so if you go
11:00 into talk to that recruiter and they
11:01 have a position that opens up they will
11:03 help you get that interview and then if
11:05 you get a job let's say for
11:08 $50,000 the company is going to pay that
11:10 recruiter let's say 10% of your salary
11:12 so they'll give them $5,000 so you don't
11:15 actually lose or have anything to lose
11:17 using a recruiter you can reach out to
11:19 Recruiters in several ways and I've done
11:20 every variation but I'll tell you my
11:22 most successful way which was using
11:25 LinkedIn there are tens of thousands of
11:26 Recruiters on LinkedIn I made an entire
11:28 video of how you can reach out to
11:30 recruiters and what to say to recruiters on
11:32 LinkedIn to help you land a job so be
11:33 sure to check out that video when you
11:35 actually get to that point but you can
11:38 also just cold email and cold call these
11:40 recruiting companies but to me it's just
11:42 not as effective as reaching out
11:44 directly on LinkedIn and this is just a
11:45 bonus one the last thing that you need
11:47 to do is accept a job offer so on step
11:49 number four after you apply to those
11:51 jobs you do actually have to go in
11:53 interview and then get a job offer which
11:55 you will accept I just thought I'd
11:56 mention that just in case that was not
11:59 super clear now that was a lot of stuff
12:01 let's talk about time frames to actually
12:03 complete all of these things now doing
12:05 all of these things from scratch is
12:07 going to take a while but let's break it
12:09 down by each step and see how long I
12:10 generally think it's going to take let's
12:12 start with step number one which is
12:14 actually learning the skills now just to
12:15 be up front this one probably is going
12:17 to take the longest for most people for
12:19 most people to learn all of these skills
12:21 it's going to take around 3 to four
12:23 months now if you don't learn a cloud
12:25 platform and python which are the last
12:27 ones that I recommend and you just focus
12:30 on SQL a BI tool and Excel I think you can do
12:32 that in under 3 months that is very
12:34 dependent though on how much time you
12:36 have to study that time frame is more
12:38 for someone who has several hours per
12:40 day maybe 3 hours at the end of the night
12:42 after you go to work that is someone who
12:44 has quite a bit of time to dedicate to
12:46 learning during their week of course
12:47 that time frame is going to take longer
12:48 if you don't have as much time to
12:50 dedicate to learning now let's look at
12:52 number two which was creating projects
12:54 and a portfolio of projects from my
12:55 experience when you're first starting
12:56 out it takes a lot longer to actually
12:58 create these projects it can take one
13:01 or two weeks per project I usually
13:02 recommend people doing three to five
13:04 projects in their portfolio before they
13:06 start applying and since they can take
13:07 anywhere from 1 to two weeks you're
13:10 looking at anywhere from 3 to 6 weeks
13:12 The Next Step was to create a data
13:13 analyst resume now in my opinion this
13:15 one should take the shortest out of
13:16 every single step here because you're
13:18 really just kind of reformatting a
13:20 resume or creating a resume you're just
13:22 adding skills you're adding your
13:23 projects and then kind of reformatting
13:25 it to make it look nice this should
13:27 hopefully take under a week but if you
13:28 use something like a professional
13:30 service so they help you build a resume
13:32 it could take one to two weeks the two
13:34 last steps which kind of go hand in hand
13:36 are step four and five which is actually
13:38 applying for jobs and then Landing a job
13:40 now this process can take as little as a
13:43 month or it can take as long as 6 months
13:45 or a year it really depends on how
13:47 you're applying where you're applying
13:49 and just the kind of luck that you're
13:51 having with actually Landing interviews
13:52 I've seen people who have never had any
13:54 experience land a job within a month of
13:57 starting to apply and it's incredible
13:58 it's amazing but it doesn't happen too
14:01 often you're usually looking at around 2
14:03 to 4 months on average to land your
14:06 first data analyst job if you put all of
14:07 those together and kind of average
14:09 everything out you're looking at around
14:12 6 months total for the entire process
14:13 now I don't want that to discourage you
14:16 okay 2023 is a long year you have a lot
14:18 of time and it doesn't have to take 6
14:20 months you could do it faster you could
14:22 do it in three months and just prove me
14:23 wrong but if you are really focused and
14:25 you are really driven to become a data
14:27 analyst this year I know that you can do
14:29 it now to maybe boost your spirits and
14:30 make you feel a little bit better I
14:32 didn't know any of these things when I
14:33 first started out I didn't have anyone
14:35 telling me kind of a plan on what to do
14:37 I had to go out and figure all these
14:38 things out by myself and it took me
14:40 almost a year to land my first real data
14:42 analyst job so with all that being said
14:44 I hope that this video is helpful I hope
14:46 you now have a path on how to become a
14:47 data analyst this year and that my
14:49 channel can be a big part of that so
14:51 thank you guys so much for watching I
14:52 really appreciate it if you like this
14:54 video be sure to like and subscribe
15:10 what's going on everybody my name is
15:11 Alex Freberg and in today's video we're
15:13 going to be starting our basics of SQL
15:14 series now in this series we're going to
15:16 be going over everything you need just
15:18 to get started and then in future videos
15:19 we're going to be going over some
15:20 intermediate Concepts and some more
15:23 advanced concepts and then in the final
15:25 series we're going to be going over some
15:26 portfolio projects in this video in
15:27 particular we're going to be downloading
15:29 SQL Server Management Studio we're going to be
15:31 creating our tables inserting data into
15:33 our tables and in future videos we're
15:34 going to actually learn how to query
15:36 those tables if you already have SQL
15:38 Server management Studio downloaded you
15:39 can skip ahead to where we actually
15:40 create the tables and insert the data
15:42 into the tables if you don't care about
15:43 that at all and you're just looking at a
15:45 query I would skip to the next video
15:47 where we actually start querying the data
15:49 that we inserted into those tables so to
15:50 download SQL Server management Studio we
15:52 actually have to download two things and
15:54 I have both links right here I'm going
15:56 to leave those in the description so that
15:58 you guys have those but this one is to
15:59 actually download SQL Server management
16:01 studio so let's go down here I actually
16:03 deleted it off my computer so I can walk
16:05 through this with you guys so we're
16:07 going to download that let's also go
16:10 over here this is actually a server so
16:12 we have to download a SQL server and if
16:15 you go down right here there's a free
16:17 version now I don't need the developer
16:18 version I'm just going to download the
16:20 express version it's actually smaller so
16:22 let's download that as
16:25 well now once this is done running we're
16:27 going to open it up and I'll show you
16:29 what to do next so it just finished
16:32 running let's click on
16:36 it all right so we need to install it
16:38 we're going to click yes and this is
16:40 going to take a little while so this
16:42 popped up I clicked install and it's
16:43 been running for the past couple minutes
16:45 apparently I was not recording so I
16:48 apologize for that but that's all I did
16:50 so now it's been installed I'm actually
16:52 going to pull it up right
16:55 here and let's open it
16:57 up now when it pulls up it's going to
16:59 ask you to connect to a server and
17:01 that's why we downloaded the SQL Express
17:08 that and there you go it's as easy as
17:10 that so now we have SQL Server
17:12 management Studio set up and we are good
17:14 to go so the first thing that we need to
17:17 do is actually create a database so
17:19 let's go over here to databases and
17:21 let's click new
17:25 database and let's just do SQL
17:31 tutorial keep it simple and if we click
17:32 that it's going to create our database
17:34 for us now when you open up the database
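The database above is created through the New Database dialog. As a rough sketch only, the same step could be written in T-SQL; the name SQL Tutorial is taken from what is said in the video, and the exact spelling typed on screen is an assumption.

```sql
-- Sketch of a T-SQL equivalent of the "New Database" dialog (not typed in the video).
-- Square brackets are needed because the assumed name contains a space.
CREATE DATABASE [SQL Tutorial];
GO

-- Point the query window at the new database before creating any tables.
USE [SQL Tutorial];
GO
```

GO is a batch separator understood by SQL Server Management Studio rather than a T-SQL statement, so it may not work unchanged in other clients.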
17:36 there's going to be a lot of stuff you
17:38 really do not need to know all this
17:39 really what we're going to be sticking
17:42 to is this tables right here uh as of
17:45 right now we do not have any tables so
17:47 we need to create tables now there's two
17:49 ways that you can do that you can click
17:50 right here and you can go to new and
17:53 create table we're not actually going to
17:55 do that we're going to create it using a
17:57 script or T-SQL so we're going to go
18:00 over here and do new query and we will
18:03 get started on actually creating uh the
18:04 two tables that we're going to be using
18:06 for all the stuff going forward all
18:08 right so let's get rid of me CU you
18:09 really don't need to be seeing me
18:11 anymore let's get started by doing our
18:12 very first table which is going to be
18:15 our employee demographics table so let's
18:19 start off by saying create table and we
18:22 have to name it so let's do
18:25 employee demographics and enter down we
18:27 want to do an open parenthesis now we
18:28 need to specify what our column names
18:31 are going to be and what the data type
18:35 is for each column so let's start off
18:37 with employee ID and we want that to be
18:40 an integer so that'll be like 1 2 3 4 uh anything
18:44 numeric now we want to
18:49 do first name and let's make that varchar(50)
18:50 if you don't know what these data
18:53 types are that's okay uh that will
18:54 probably be covered in a different video
18:56 that's not really necessary for this
18:59 video uh let's do last name we'll also
19:04 make that varchar(50) let's do age make
19:09 that an integer and very last let's do
19:14 gender and we will make that varchar(50) as
19:18 well so now we have our very first
19:22 table let's run that and we'll see if it
19:24 works we'll go over here we'll refresh our
19:28 tables and there you go so we have our
19:31 very first table let's go up here let's
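The first table described above can be sketched as the statement below. The column names and types follow what is said in the video; the exact identifier spelling (no spaces, mixed case) is an assumption, since the captions only record the spoken words.

```sql
-- Sketch of the first table; identifier spelling is assumed from the spoken names.
CREATE TABLE EmployeeDemographics (
    EmployeeID INT,          -- whole numbers such as 1, 2, 3, 4
    FirstName  VARCHAR(50),  -- text of up to 50 characters
    LastName   VARCHAR(50),
    Age        INT,
    Gender     VARCHAR(50)
);
```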
19:33 get rid of this one and now let's create
19:35 our second table so we're going to do
19:37 basically the exact same thing but we're
19:38 going to have a little bit different
19:39 information in it this is going to be
19:43 our employee salary table so let's do
19:51 it and enter and open parenthesis so now
19:53 we're going to do the same thing we're
19:56 going to do employee ID let's make that
20:01 an integer now we want the job title
20:03 because we want to know what they
20:08 do and this one is going to be varchar(50)
20:12 because we keep it pretty simple
20:14 whoops and then for our very last one
20:16 we're going to do salary and that will
20:19 be integer as
20:22 well and I'll just close the parenthesis here so let's
20:24 create this
20:29 table let's see if it is there and there
20:32 we go so let's open up one of these
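The second table follows the same pattern. A minimal sketch, again assuming the identifier spelling:

```sql
-- Sketch of the second table; identifier spelling is assumed from the spoken names.
CREATE TABLE EmployeeSalary (
    EmployeeID INT,
    JobTitle   VARCHAR(50),
    Salary     INT
);
```

Both tables carry an EmployeeID column, which is what would let rows in one table be matched to rows in the other later on.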
20:33 tables really
20:35 quick see what's in there see what it
20:39 looks like as you can see we do not have
20:41 any information in there uh when you
20:42 create a new table sometimes when you
20:44 open it up you're going to see this if
20:45 you want to get rid of that you just
20:48 need to do a I think it's called A Hard
20:49 refresh or something like that but you
20:52 can do control shift R let's see if it
20:55 works for me I just did it all right it
20:57 goes away so now it recognizes it as a
21:01 table so we're good there let's go back
21:03 here and let's get rid of all this we've
21:05 already created our tables now we want
21:08 to insert the data into our tables so
21:11 let's see what that looks like let's do
21:14 insert into and now we need to specify
21:16 what table we're inserting our data into
21:19 so let's start off with employee
21:21 demographics let's do
21:24 values so now we have to select what
21:27 values we're going to put into um into
21:29 this table
21:31 so now we're going to have to do the
21:34 employee ID so let's do
21:38 1001 then we'll do first name so let's do
21:42 Jim last name
21:46 Halpert and then his age let's say he's
21:49 30 and he is a
21:54 male now just for fun let's execute that
21:57 let's go back to this table right here
21:59 and execute
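The insert described above can be sketched as follows. The values follow the narration; the employee ID is written as 1001 on the assumption that it matches the ID reused later for the salary table, because the captions are not reliable on the exact number.

```sql
-- Values are listed in the same order as the table's columns:
-- EmployeeID, FirstName, LastName, Age, Gender.
INSERT INTO EmployeeDemographics VALUES
(1001, 'Jim', 'Halpert', 30, 'Male');

-- Re-run a select to confirm the row went in.
SELECT * FROM EmployeeDemographics;
```

Text values go in single quotes and numbers do not. Leaving out the column list works here only because a value is supplied for every column, in table order.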
22:00 and as you can see all of our
22:02 information actually went in there so
22:04 now we have his employee ID his first
22:07 name his last name age and gender now we
22:10 need a lot more information uh for this
22:12 table in order to actually learn a lot
22:15 of the concepts of querying the table so
22:17 I'm actually going to go through and add
22:19 a ton more information I'm not going to
22:21 bore you through that but I will show
22:22 you the final product before I actually
22:25 hit execute so stick with me I'm
22:26 actually just going to cut to the end
22:28 where I insert all my stuff down on here
22:30 and then if you want that I'll probably
22:32 leave it in the description or maybe put
22:33 in my GitHub or something so you can
22:35 easily just go copy and paste that if
22:36 that's what you want to do so I'll see
22:38 you in a few
22:40 seconds all right so I have all my
22:42 values right here I'm actually going to
22:44 take this one out cu I already did that
22:46 one but this is our additional
22:49 information let's insert that into our
22:52 table real quick and go back here and
22:54 take a look at it and there you go this
22:56 is going to be our core information that
22:58 we are querying off of
23:00 uh in future videos so that table is
23:03 completely finished let's go back here
23:06 we're going to get rid of this because
23:07 now we want to insert our information to
23:11 our other table so let's do insert into
23:12 and let's do
23:15 employee and now we're going to do
23:19 salary so let's do values to specify
23:22 that we're inserting values into
23:25 there and in this one we have employee
23:28 ID so again let's do 1001 that's
23:33 Jim his job title is
23:42 $45,000 and let's execute that and you
23:43 can't see it but down here it says it's
23:46 done let's go to that
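A sketch of the salary insert narrated above. The captions drop the job title, so 'Salesman' below is a placeholder assumption, and the ID is assumed to be the same one used for Jim in the demographics table.

```sql
-- Column order: EmployeeID, JobTitle, Salary.
-- 'Salesman' is an assumed placeholder; the job title is missing from the captions.
INSERT INTO EmployeeSalary VALUES
(1001, 'Salesman', 45000);

SELECT * FROM EmployeeSalary;
```

The salary is stored as the plain integer 45000, with no dollar sign or comma, because the column type is INT.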
23:48 table and as you can see that is
23:50 inserted I'm going to do the exact same
23:52 thing as I did before I am going to fill
23:54 out all these and in a second it will be
23:57 done uh on your side and then again I
23:59 will leave it in the description or I'm
24:01 going to put it on my GitHub and you
24:03 guys can just copy and paste that if
24:03 that's what you want to do or you can
24:06 write it out whatever you want to do all
24:07 right just like before I'm going to get
24:10 rid of this first one that is Jim he is
24:12 already done now let's insert this
24:15 information and it is finished let's go back
24:19 here and there we go now we have both of
24:20 our tables and we are good to go for
24:22 future videos so thank you so much for
24:24 sticking all the way through this one in
24:26 the next video we're going to actually
24:28 begin uh querying the table and learning
24:31 the select the from the where the group
24:33 by and the order by statement everything
24:36 is in these upcoming videos so stick
24:37 around and we will learn all of that
24:39 together thank you so much for joining
24:40 me if you like this type of content be
24:42 sure to subscribe below and I'll see you
24:45 in the next video what is going on
24:46 everybody my name is Alex Freberg and
24:48 in today's video we're going to be going
24:50 over the select and the from statement
24:52 so if you joined us for our last video
24:55 we went over creating our tables and
24:57 inserting data into those tables and so
24:59 we have this employee demographics table
25:01 and we also have this employee salary
25:03 table and today we're going to be
25:05 walking through the select statement and
25:08 the from statement on these tables so
25:09 here are some of the concepts that we're
25:10 going to be going over today let's just
25:13 get it started by doing select
25:16 everything and let's do this from the
25:18 employee demographics table so let's
25:21 execute this if we wanted to only show
25:25 the first names we can just do first
25:28 name and run that
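The two queries just described can be sketched like this, assuming the table and column names are spelled without spaces:

```sql
-- Every column and every row.
SELECT *
FROM EmployeeDemographics;

-- Only the FirstName column.
SELECT FirstName
FROM EmployeeDemographics;
```

SELECT names the columns to return and FROM names the table to read them from.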
25:30 and if we want first name and last name
25:34 we can just separate that by using a
25:37 comma and it will return those well if
25:39 we want to return all columns and all
25:42 rows then all we have to do is use this
25:44 star so that's what the star does now we
25:47 have nine rows of data here and if we
25:50 only wanted to return let's say the top
25:52 five we can easily do that and we can
25:56 just say top five of everything now the
25:59 reason this could be useful is say you
26:01 have a table that has millions of rows
26:03 in it and you only want a small sample
26:06 you can say select top 1,000 and when
26:08 you do that it will only select the top
26:11 1,000 rows now let's get everything back
26:13 in here really quick because we're going
26:16 to move on to this distinct feature so
26:18 when we use distinct we're actually
26:20 saying that we want the unique values in
26:25 a specific column so if we say distinct
26:28 and then let's do employee ID
26:32 everything should be returned so all
26:34 nine rows should be returned and that's
26:36 because every single one of these are
26:40 unique now let's try gender so there's
26:42 only going to be two results the male
26:43 and the female and that's because
26:46 there's only two distinct values in that
26:49 column now let's look at all of our data
26:52 again so now we want to look at count
26:55 now count is very simple all it is going to
26:58 do is count all the non-null
27:02 values in a column so let's look at last
27:06 name for example if we do count of last
27:08 name all that's going to give us is a
27:10 count of nine because we have nine last
27:13 names if for whatever reason somebody's
27:16 last name was left out and that was null
27:18 then it would have returned maybe eight
27:19 or seven depending on how many were
27:22 actually in there so if an entire column
27:24 was null it would return
27:27 zero and if you notice we are not given
27:29 a column name that's because this is
27:31 derived information based off the last
27:34 name so if we want to actually give this
27:36 a name so that that column does not say
27:40 no column name we can use this AS right
27:43 here so once you put AS you can actually
27:45 name it so since this is the count of
27:49 the last name we'll write last name
27:52 count keep it simple and if we execute
27:55 that as you can see we have last name
27:56 count right there so that's how you use
28:01 that AS let's look at all of our data
28:04 again we want to look at some Max mins
28:07 and averages right now and the only
28:09 column here where it would be useful to
28:12 do it on is age but let's actually go
28:15 over and let's look at our salary table
28:17 and at our salary table we have some
28:18 really interesting salaries that I think
28:20 would be a little bit more useful for
28:23 this information so let's go over
28:26 to employee
28:28 salary all right and let's look at
28:31 this table really quick so we have our
28:34 salary now we want to look at the maximum
28:38 salary that is
28:41 in uh that column and that is going to be
28:44 $65,000 now let's say we wanted to know
28:47 what the minimum salary was let's
28:48 execute this and the person who makes
28:51 the least money is making
28:53 $36,000 now what's the average what is
28:56 the average salary for all employees
28:58 that's going to be $
29:02 48,5 so super easy to use all of
29:05 these things they're extremely useful I
29:07 use them every single day so I know that
29:10 each of these are very very useful and
29:12 are definitely among the basics that you
29:16 have to know let's look real quick at
29:18 everything really quick so we just
29:19 learned the select statement but
29:21 learning this from statement really
29:23 quick is also important up here this
29:25 actually shows us that we're already
29:28 Hitting off the SQL tutorial database
29:30 but let's say we change it to master
29:32 when we try to run this it's going to
29:34 give us an error and that's because now
29:35 we're hitting off this database and this
29:38 database does not have this table in it
29:40 so in order to do this in order to still
29:43 hit off that table while up here we're
29:44 actually hitting off a different database
29:47 we can change this information so the
29:49 from statement you have to specify three
29:51 separate things the first thing that you
29:54 need to specify is the database so let's
29:56 say we want to hit off the SQL tutorial database
30:00 now we want to select what table we're
30:04 going to do this is actually a dbo so
30:06 let's put dbo there's a lot that
30:08 can go into that um it's not worth
30:12 getting into now but dbo dot and let's do employee
30:16 salary when we execute this our
30:18 information comes up even though up here
30:20 we're still hitting off the master
30:23 database when we specify it right here
30:25 then we actually are choosing what
30:28 database and what table to hit off of and
30:30 so it does not matter what it is up here
30:33 so that's how you use the from statement
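
Looking back at the DISTINCT, COUNT and AS steps from earlier in this video, a minimal sketch might look like the following. It uses Python's built-in sqlite3 module rather than SQL Server, and the nine sample rows are an assumption modeled on the names, ages and counts mentioned in the video.

```python
import sqlite3

# In-memory database; table layout and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics "
    "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Jim", "Halpert", 30, "Male"),
        (1002, "Pam", "Beesly", 30, "Female"),
        (1003, "Dwight", "Schrute", 29, "Male"),
        (1004, "Angela", "Martin", 31, "Female"),
        (1005, "Toby", "Flenderson", 32, "Male"),
        (1006, "Michael", "Scott", 35, "Male"),
        (1007, "Meredith", "Palmer", 32, "Female"),
        (1008, "Stanley", "Hudson", 38, "Male"),
        (1009, "Kevin", "Malone", 31, "Male"),
    ],
)

# DISTINCT keeps one copy of each value found in the column.
genders = sorted(
    g for (g,) in conn.execute("SELECT DISTINCT Gender FROM EmployeeDemographics")
)

# COUNT(column) counts the non-null values; AS names the derived column.
cur = conn.execute(
    "SELECT COUNT(LastName) AS LastNameCount FROM EmployeeDemographics"
)
column_name = cur.description[0][0]
(last_name_count,) = cur.fetchone()

print(genders, column_name, last_name_count)
```
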
30:34 in the next video we're going to be
30:36 going over the where statement and then
30:38 after that the group by and order by
30:40 statement and that will be the complete
30:42 basics of SQL tutorial and then we'll
30:43 start getting into a little bit more fun
30:46 stuff some more advanced concepts which
30:48 I think will be really really exciting for
30:49 everybody to learn thank you guys so
30:51 much for joining me I really appreciate it
30:53 I hope this has been helpful if you like
30:55 this type of content subscribe below and
30:56 I'll see you in the next video thanks
30:58 and goodbye
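
The MAX, MIN and AVG calls from this video can be sketched the same way. This again uses Python's built-in sqlite3 module instead of SQL Server; the `EmployeeSalary` layout, job titles and salary figures are assumptions, chosen so the maximum and minimum match the $65,000 and $36,000 mentioned above.

```python
import sqlite3

# In-memory database; table layout and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeSalary (EmployeeID INT, JobTitle TEXT, Salary INT)"
)
conn.executemany(
    "INSERT INTO EmployeeSalary VALUES (?, ?, ?)",
    [
        (1001, "Salesman", 45000),
        (1002, "Receptionist", 36000),
        (1003, "Salesman", 63000),
        (1004, "Accountant", 47000),
        (1005, "HR", 50000),
        (1006, "Regional Manager", 65000),
        (1007, "Supplier Relations", 41000),
        (1008, "Salesman", 48000),
        (1009, "Accountant", 42000),
    ],
)

# One aggregate per expression; each collapses the whole column to one value.
highest, lowest, average = conn.execute(
    "SELECT MAX(Salary), MIN(Salary), AVG(Salary) FROM EmployeeSalary"
).fetchone()

# SQLite returns AVG as a float; other engines may round or truncate.
print(highest, lowest, round(average))
```
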
30:59 what's going on everybody my name is
31:01 Alex Freberg and in this video we're
31:02 going to be going over the where statement
31:04 in SQL in the very first video we
31:07 created our table inserted data into our
31:09 table in the second video we went over
31:10 the select and the from statement and
31:13 now we are on to the where statements now
31:15 what does the where statement do it helps
31:17 limit the amount of data and specify
31:19 what data you want returned we have
31:20 quite a few Concepts that we're going to
31:22 be covering today let's just start out
31:24 with something really easy let's
31:28 do where first name
31:32 equals Jim really simple so we're
31:34 selecting everything where our first
31:36 name equals Jim and this is our output
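
A minimal sketch of that equality filter, together with its not-equal counterpart, using Python's built-in sqlite3 module rather than SQL Server. The table layout and the three sample rows are assumptions for illustration.

```python
import sqlite3

# In-memory database; table layout and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics "
    "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Jim", "Halpert", 30, "Male"),
        (1002, "Pam", "Beesly", 30, "Female"),
        (1006, "Michael", "Scott", 35, "Male"),
    ],
)

# Keep only the rows whose FirstName is exactly 'Jim'.
only_jim = conn.execute(
    "SELECT * FROM EmployeeDemographics WHERE FirstName = 'Jim'"
).fetchall()

# <> means "does not equal": everyone except Jim.
not_jim = conn.execute(
    "SELECT * FROM EmployeeDemographics WHERE FirstName <> 'Jim'"
).fetchall()

print(only_jim)
print(not_jim)
```
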
31:38 so really really simple now let's try
31:40 where it does not equal this right here
31:43 says does not equal Jim and let's
31:46 execute that and as you can see we have
31:49 everybody except Jim Halpert in there so
31:50 now let's look at the greater than or
31:53 less than so in this table I think the
31:55 one that we're going to look at is age
31:58 so let's look at age and let's do where
32:01 it's greater than
32:03 30 and when we execute that we're going
32:05 to get everyone who is over the age of
32:07 30 now as you can see we're not
32:09 including people who are 30 years old if
32:11 we want to include people who actually
32:13 are 30 years old we're going to add the
32:16 equal sign right there so we should be
32:19 seeing people who are now 30 so before
32:20 Pam and Jim were not in there and now
32:23 they are if we do the exact same thing
32:28 let's do less than 32
32:30 here's everyone that's going to be
32:32 included but if we want to include the
32:35 people who are 32 years old then we are just
32:37 going to add that equal sign and now the
32:39 people who are 32 years old like Toby
32:41 and Meredith are now
32:43 included if we want to go even further
32:45 we want people who are less than or
32:47 equal to
32:52 32 and who are male we can say and
32:53 gender equals
32:57 male so now we have two things that
32:59 we are specifying that we need we need
33:02 somebody whose age is less than or equal to 32 and
33:04 we need their gender to be male so let's
33:06 execute that and we have four people who
33:09 meet that criteria so that's what the
33:14 and statement does if we write or then
33:16 only one of these criteria has to be
33:19 correct in order for it to be met so if
33:22 we hit execute now we're saying anybody
33:26 who's under or equal to the age of 32 or
33:28 their gender equals male so if we look
33:30 down here Michael Scott is actually 35
33:33 years old so he's over 32 but since he
33:35 is male he is now included let's get rid
33:37 of everything really quick I want to
33:40 look at this like really quick so let's
33:42 execute just that and if you do that you
33:45 highlight just that hit execute then it
33:46 uh will only run what you have
33:48 highlighted so now let's look at this
33:51 whole table now when you're using like
33:54 you typically are doing this for
33:55 sometimes numerical but most of the time
33:57 you're using it for text
33:59 information so if we're looking at this
34:03 right here if I'm looking at last names
34:06 and let's say I want everybody whose
34:09 last name starts with s you can't really
34:12 do that with anything else so I'm going
34:14 to say where it's like and then I'm
34:17 going to say s and after that I'm going
34:18 to put a percent sign that's actually
34:21 called a wild card and if I close that
34:23 off what this is saying is I want
34:26 every last name where it starts with
34:27 where it's like
34:30 where it only starts with an S so let's
34:33 run this really quick now we have two
34:36 people whose last names start with s now
34:38 if I put a wild card at the beginning we
34:41 are now saying where there's an S
34:44 anywhere in anybody's name so let's
34:47 execute this and see what we get so now
34:49 even if the S is like Flenderson towards
34:52 the end it still counts so you can
34:54 specify multiple things in here as well
34:56 so let's say I want it to start with s
34:59 that would return Schrute and Scott but now
35:02 I want something that also has an o in
35:04 it so so it has an S at the beginning
35:07 and then somewhere in there there's an O
35:08 now let's execute that and there's only
35:10 one person that meets that criteria
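
The three LIKE patterns tried so far can be sketched as below, using Python's built-in sqlite3 module rather than SQL Server. The sample rows are an assumption modeled on the names in the video, and `last_names_like` is a hypothetical helper added only for this example. Note that whether LIKE ignores upper and lower case depends on the database; SQLite ignores it for plain ASCII letters by default.

```python
import sqlite3

# In-memory database; table layout and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics "
    "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Jim", "Halpert", 30, "Male"),
        (1002, "Pam", "Beesly", 30, "Female"),
        (1003, "Dwight", "Schrute", 29, "Male"),
        (1004, "Angela", "Martin", 31, "Female"),
        (1005, "Toby", "Flenderson", 32, "Male"),
        (1006, "Michael", "Scott", 35, "Male"),
        (1007, "Meredith", "Palmer", 32, "Female"),
        (1008, "Stanley", "Hudson", 38, "Male"),
        (1009, "Kevin", "Malone", 31, "Male"),
    ],
)


def last_names_like(pattern):
    """Hypothetical helper: sorted last names matching a LIKE pattern."""
    rows = conn.execute(
        "SELECT LastName FROM EmployeeDemographics WHERE LastName LIKE ?",
        (pattern,),
    )
    return sorted(name for (name,) in rows)


starts_with_s = last_names_like("S%")   # S first, then anything
contains_s = last_names_like("%S%")     # an S anywhere in the name
s_then_o = last_names_like("S%o%")      # S first, an o somewhere after it

print(starts_with_s, contains_s, s_then_o)
```
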
35:12 you can do that for multiple things you
35:15 can even say o
35:18 t t and let's execute that and he's still
35:20 going to be returned and if we put c at
35:23 the back it's not going to be returned
35:27 because it follows it in order so it isn't
35:32 s o t t c the c would actually need to
35:38 go over here so now we have s c o t t
35:39 and although there's a bunch of wild
35:41 cards in here it is going to return
35:44 Scott so that is a little
35:47 hint at how you can use like there is a
35:48 little bit more that goes into it you
35:51 can use it for numerics um there's a lot
35:52 of things that you can use this for but
35:55 this is just the basics how you can use
35:58 it today how you get started on using
36:00 like in a nutshell that is how you use
36:03 like and as I said before you can use
36:06 like with numerical data as well but for
36:08 demonstration purposes I wanted to use
36:10 text Data let's get rid of this really
36:13 quick um let's look at our entire
36:16 table and I wanted to show you how to
36:19 use null and not null I can't really
36:21 show you how to use null because I do
36:24 not have any null Fields I could easily
36:28 update this table and make some null but that's
36:30 in a future video where it's a little
36:31 bit more advanced where you can start
36:34 altering your data but just for purposes
36:36 of showing you what null and not null is
36:39 let's do where first
36:42 name is
36:45 null and if we see that is not going to
36:48 return anything but if we say is not
36:50 null it's going to return everything
36:53 because nothing in here is null nothing
36:55 in this first name column is null so
36:57 that's how you use it um there are a lot
36:59 of use cases where you actually will use
37:02 null and not null that will be in future
37:04 videos probably in the project section
37:06 or the portfolio section we weren't able
37:08 to show really how to use this super
37:10 well but just as a demonstration that's
37:13 really all it does it looks at the whole
37:15 column and whether it is null or not
37:16 null that's really all it's used for
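
As a recap of the WHERE filters covered so far in this video (comparison operators, AND, OR, and the null checks), here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The sample rows are an assumption modeled on the names and ages mentioned in the video, and `count_where` is a hypothetical helper added only for this example.

```python
import sqlite3

# In-memory database; table layout and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics "
    "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Jim", "Halpert", 30, "Male"),
        (1002, "Pam", "Beesly", 30, "Female"),
        (1003, "Dwight", "Schrute", 29, "Male"),
        (1004, "Angela", "Martin", 31, "Female"),
        (1005, "Toby", "Flenderson", 32, "Male"),
        (1006, "Michael", "Scott", 35, "Male"),
        (1007, "Meredith", "Palmer", 32, "Female"),
        (1008, "Stanley", "Hudson", 38, "Male"),
        (1009, "Kevin", "Malone", 31, "Male"),
    ],
)


def count_where(condition):
    """Hypothetical helper: number of rows matching a WHERE condition."""
    sql = f"SELECT COUNT(*) FROM EmployeeDemographics WHERE {condition}"
    return conn.execute(sql).fetchone()[0]


over_30 = count_where("Age > 30")          # excludes the 30-year-olds
at_least_30 = count_where("Age >= 30")     # includes them
young_men = count_where("Age <= 32 AND Gender = 'Male'")     # both must hold
young_or_male = count_where("Age <= 32 OR Gender = 'Male'")  # either may hold
missing_names = count_where("FirstName IS NULL")
present_names = count_where("FirstName IS NOT NULL")

print(over_30, at_least_30, young_men, young_or_male)
print(missing_names, present_names)
```
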
37:19 this is actually super useful and you
37:21 can use it in a ton of situations but
37:22 again for demonstration purposes that's
37:25 really all it does so let's get rid of
37:29 this let's look at in really quick so in
37:32 is kind of like the equal statement but
37:35 it's multiple equal statements so let's
37:41 say we want to say where first name equals
37:44 Jim and then we were like wait we also
37:46 want to include Michael
37:49 Scott so then we would have to write or
37:51 first name equals and then we
37:54 would do Michael and then etc etc for
37:56 anybody that we wanted to include but if
37:59 we said in we could do an open
38:01 parentheses and then we can say
38:05 Jim we can say
38:08 Michael and we can say as many people as
38:09 we want going down the road just
38:11 separating it by commas and if we hit
38:13 execute everything would be returned
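
To illustrate the point that IN is a condensed way of writing several equality checks, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The table layout and the three sample rows are assumptions for illustration.

```python
import sqlite3

# In-memory database; table layout and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics "
    "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Jim", "Halpert", 30, "Male"),
        (1002, "Pam", "Beesly", 30, "Female"),
        (1006, "Michael", "Scott", 35, "Male"),
    ],
)

# IN takes a comma-separated list of values to match.
with_in = sorted(
    name
    for (name,) in conn.execute(
        "SELECT FirstName FROM EmployeeDemographics "
        "WHERE FirstName IN ('Jim', 'Michael')"
    )
)

# The longhand version: one equality test per value, joined with OR.
with_or = sorted(
    name
    for (name,) in conn.execute(
        "SELECT FirstName FROM EmployeeDemographics "
        "WHERE FirstName = 'Jim' OR FirstName = 'Michael'"
    )
)

print(with_in, with_or)
```
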
38:16 it really is just a condensed way to say
38:18 equal for multiple
38:21 things so that is the where statement I
38:22 think the where statement can get
38:25 extremely complex but this really is
38:26 highlighting the basics so if you can
38:29 learn all of these Concepts you will
38:30 absolutely have the basics down and will
38:32 be set to go over some more intermediate
38:34 and more advanced things with the where
38:36 statement later on in the next video
38:37 we're going to be going over the group
38:40 by and the order by and then we are
38:42 done with the SQL Basics and then you
38:44 can practice and work your way up into
38:46 my intermediate level videos which are
38:48 going to be coming out very shortly
38:50 after these videos thank you guys so
38:51 much for joining me if you like this
38:52 tutorial Series be sure to subscribe
38:55 below and I'll see you in the next
38:57 video what's going on everybody my name is Alex
38:58 Freberg and in today's video we're
39:00 going to be going over the group by and
39:02 the order by statements in previous
39:04 videos we created tables we went over the
39:06 select the from and the where and now we
39:10 are at the very end of our SQL basic
39:11 series if you stayed with us for the
39:13 whole time hopefully you have learned a
39:16 lot and learned the basics of SQL in
39:17 future videos we're going to be going
39:19 over intermediate and even more advanced
39:21 concepts and even going through
39:23 portfolio projects that you can use to
39:25 put on your resume if you like this type
39:27 of content be sure to subscribe below
39:29 but let's get into it for today the
39:30 group by statement is similar to
39:32 distinct in the select statement in that
39:35 it's going to show the unique values in
39:39 a column the difference is if we say distinct
39:43 gender what's going to be returned is
39:46 the very first unique value of female
39:48 and the very first unique value of
39:52 male but if we say
39:57 gender and we say Group by
40:00 gender it's only going to return two
40:03 values but in these two values we
40:05 actually have all the males rolled up
40:07 into this one row and all the females
40:10 rolled up into this one row now let me
40:13 further show you what that means if I
40:17 say count of
40:20 gender now you can see that this whole
40:23 time there were six males in this one
40:25 row and there were three females in this
40:27 one row so with a distinct it really is
40:29 only showing us what value is in there
40:31 that's unique but with the group by it's
40:33 showing us what the unique value is but
40:35 it's also rolling them all up into one
40:37 column that we can use it for other
40:39 things now real quick I want to be able
40:42 to see both of these at the same time so
40:46 let's just put this up here and let's
40:49 run this so we can actually see both now
40:52 let's add age to this statement down
40:54 here or this
40:57 query and let's only run this one and I
41:00 want to show you what happens and why it
41:03 happens we're now looking at gender age
41:06 and then the count of gender so if we
41:09 look down here we only have one male who
41:13 is 29 we have one female
41:16 that's age 30 and so on and so forth so
41:19 none of these people are both the same
41:22 gender and the same age if for example
41:24 we had two or three people who were male
41:26 and who were 30 years old then we would
41:29 have a two or a three over here so this
41:31 count is actually being counted at each
41:34 row that's being returned so for our
41:37 data that we have today this isn't a
41:39 fantastic example because it really split it
41:41 out there weren't any that were the same but as
41:44 you can see you can put multiple columns
41:47 as long as you put multiple down here
41:48 now why did we not have to put this
41:51 count gender down here in this group by
41:54 that's because this count gender is
41:56 actually a derived field or derived
42:00 column it's derived based off the gender
42:02 column so it's technically not a real
42:05 column that's in the table it's one that
42:07 we're creating that's fictional uh per
42:11 se so the age and the gender are actual
42:13 fields or actual columns that are in our
42:16 table they have to be down here and like
42:17 I said before it's the comparison to
42:20 that distinct in the select statement
42:22 because we're looking at the distinct of
42:25 gender and age so we're saying distinct
42:27 across multiple columns both gender and
42:30 age now as we had it before we were only
42:33 looking at gender it's going to roll all
42:37 of those up into just male and female
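
A minimal sketch of the two GROUP BY queries just described, using Python's built-in sqlite3 module rather than SQL Server. The nine sample rows are an assumption modeled on the names, ages and counts mentioned in the video.

```python
import sqlite3

# In-memory database; table layout and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics "
    "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Jim", "Halpert", 30, "Male"),
        (1002, "Pam", "Beesly", 30, "Female"),
        (1003, "Dwight", "Schrute", 29, "Male"),
        (1004, "Angela", "Martin", 31, "Female"),
        (1005, "Toby", "Flenderson", 32, "Male"),
        (1006, "Michael", "Scott", 35, "Male"),
        (1007, "Meredith", "Palmer", 32, "Female"),
        (1008, "Stanley", "Hudson", 38, "Male"),
        (1009, "Kevin", "Malone", 31, "Male"),
    ],
)

# One output row per gender, with the rows behind it counted.
by_gender = dict(
    conn.execute(
        "SELECT Gender, COUNT(Gender) FROM EmployeeDemographics GROUP BY Gender"
    ).fetchall()
)

# Grouping by two columns gives one row per (gender, age) combination.
by_gender_age = conn.execute(
    "SELECT Gender, Age, COUNT(Gender) "
    "FROM EmployeeDemographics GROUP BY Gender, Age"
).fetchall()

print(by_gender)
print(by_gender_age)
```
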
42:38 but if we want to add more we can easily
42:40 add more in this group by statement we
42:44 can still do things like where age is
42:46 greater than
42:50 31 we can still do those things so let's
42:52 execute this and our numbers are going
42:54 to change now we're doing it based off
42:56 gender and we're looking at the count of
42:59 people whose age is greater than 31
43:01 which is smaller than before now let's
43:03 look at order by I'll do it down here
43:05 really quick for demonstration but I am
43:06 eventually going to come up here and use
43:08 it because I think it'll be a little bit
43:10 better to completely round out this
43:13 query down here let me give this a name
43:16 let's do count of gender and then let's
43:20 come down here and let's order
43:24 by uh let's order by count
43:27 gender and when we run that it's going
43:30 to do 1 3 and that's because as a
43:34 default SQL has an ascending feature
43:35 which is going to be smallest to largest
43:37 going down if we want to change that we
43:39 can change it to descending that's going
43:43 to be largest to smallest so now we have
43:47 3 1 and if we want to do it based off
43:49 gender and we do it descending now we
43:52 have Z to
43:54 A and so that's going to be male female
43:55 and if we get rid of that it's going to
43:57 do the the default
44:01 ascending and let's see what that brings
44:04 female male now for what we're trying to
44:06 do let's look at this large table so I
44:08 think it's going to be a little bit more
44:10 descriptive or a little bit better
44:15 visually let's do order by and let's do
44:18 age let's run this and it's going to
44:20 order smallest to
44:25 largest if we do
44:28 descending it's going to do largest to
44:31 smallest now you don't only have to do
44:34 just one thing you can do multiple
44:37 columns so if I wanted to do age and
44:39 then gender I can do that as well so
44:41 let's do
44:44 gender and let's run that so now we have
44:47 the age but under the age we also have
44:49 it ordered by female and that's an
44:52 ascending order so AB BC d f so females
44:54 first so it's going to be female first
44:56 and then it's going to be male and again
45:00 female and male now we don't have to
45:03 just let it be ascending for each one if
45:06 I wanted to do it reverse in this column
45:10 I can do descending now let's run that
45:12 and when we have 30 now male is first
45:15 and female second and if I wanted to do
45:17 that over here I can do descending and
45:20 now we have them both descending so it's
45:22 going to go top to bottom and we have 32
45:26 it's going to be male 32 female so you
45:28 can specify lots of different things in
45:30 here and we don't actually have to use
45:32 column names we could just use numbers
45:37 so if I wanted to do 1 2 3 4 5 I could
45:39 but let's try to replicate the exact
45:41 same thing before this would be column 1
45:46 2 3 4 so let's do four
45:48 descending and then let's do
45:52 five descending and if we execute that
45:53 it's going to give us the exact same
45:55 result as if we'd actually put in the
45:57 column name and I do use this a lot
46:00 oftentimes I don't use the column name I
46:02 just if it's a small table I'll just use
46:04 the number so in my actual queries I do
46:06 this a lot where I just use the number
46:08 instead of the column name so that is
46:09 the group by and the order by statement
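
The multi-column ORDER BY from this video, once by column name and once by column position, can be sketched as below. It uses Python's built-in sqlite3 module rather than SQL Server, and the nine sample rows are an assumption modeled on the names and ages mentioned in the video.

```python
import sqlite3

# In-memory database; table layout and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics "
    "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Jim", "Halpert", 30, "Male"),
        (1002, "Pam", "Beesly", 30, "Female"),
        (1003, "Dwight", "Schrute", 29, "Male"),
        (1004, "Angela", "Martin", 31, "Female"),
        (1005, "Toby", "Flenderson", 32, "Male"),
        (1006, "Michael", "Scott", 35, "Male"),
        (1007, "Meredith", "Palmer", 32, "Female"),
        (1008, "Stanley", "Hudson", 38, "Male"),
        (1009, "Kevin", "Malone", 31, "Male"),
    ],
)

# Sort by age, largest first; ties on age are broken by gender, Z to A.
by_name = conn.execute(
    "SELECT * FROM EmployeeDemographics ORDER BY Age DESC, Gender DESC"
).fetchall()

# Same sort using positions: Age is column 4 and Gender is column 5.
by_position = conn.execute(
    "SELECT * FROM EmployeeDemographics ORDER BY 4 DESC, 5 DESC"
).fetchall()

for row in by_name:
    print(row)
```

Positions are shorter to type, but they silently change meaning if the column list in the SELECT changes, so names are the safer choice in longer queries.
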
46:10 and if you have walked through my
46:12 previous videos you should be completely
46:14 done with the basics of SQL so
46:16 congratulations the next thing to do is
46:18 really just practice the basics because
46:20 the basics are what you're going to be
46:22 using day in day out and so what I would
46:24 recommend is create a few more tables
46:26 query those tables try to think of use
46:28 cases and what you would actually want
46:30 to know from that information after that
46:32 I would move on to my intermediate
46:34 videos if those are already out and then
46:35 I would move on to my Advanced videos
46:37 those are going to go over some more
46:39 challenging topics but things that would
46:42 be very useful for anybody to know in my
46:44 next video I'm going to be going over
46:46 intermediate SQL topics things like
46:49 joins and subqueries and a ton more so
46:51 if I already have posted those be sure
46:53 to go check those out on my page and if
46:55 I haven't I hope to have those up soon
46:56 thank you thank you guys so much for
46:58 watching I really appreciate it if you
46:59 learned anything in this basics of
47:01 SQL series be sure to subscribe below
47:03 and I'll see you in the next video
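
Putting the pieces of this video together, the rounded-out query (WHERE, then GROUP BY, then ORDER BY on the named count) might look like this sketch. It uses Python's built-in sqlite3 module rather than SQL Server; the nine sample rows and the alias `CountGender` are assumptions modeled on the video.

```python
import sqlite3

# In-memory database; table layout and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics "
    "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Jim", "Halpert", 30, "Male"),
        (1002, "Pam", "Beesly", 30, "Female"),
        (1003, "Dwight", "Schrute", 29, "Male"),
        (1004, "Angela", "Martin", 31, "Female"),
        (1005, "Toby", "Flenderson", 32, "Male"),
        (1006, "Michael", "Scott", 35, "Male"),
        (1007, "Meredith", "Palmer", 32, "Female"),
        (1008, "Stanley", "Hudson", 38, "Male"),
        (1009, "Kevin", "Malone", 31, "Male"),
    ],
)

query = (
    "SELECT Gender, COUNT(Gender) AS CountGender "
    "FROM EmployeeDemographics "
    "WHERE Age > 31 "          # filter rows first
    "GROUP BY Gender "         # then roll them up per gender
    "ORDER BY CountGender"     # then sort on the derived count
)

ascending = conn.execute(query).fetchall()             # default: smallest first
descending = conn.execute(query + " DESC").fetchall()  # largest first

print(ascending)
print(descending)
```
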
47:05 what's going on everybody my name is
47:06 Alex Freberg and today we're going to
47:08 be starting our intermediate SQL series
47:10 if you joined us for our last series we
47:11 walked through the basics of SQL which
47:13 is everything you needed just to get
47:14 started and in this series we're going
47:16 to be walking through some intermediate
47:17 Concepts to really take your skills up
47:19 to the next level now today we're going
47:21 to be walking through joins but let me
47:23 show you what you can expect from the
47:25 entire series for this intermediate course
47:28 so we're going to be walking through joins
47:30 today and then in future videos we're
47:32 walking through unions case statements
47:35 updating and deleting data Partition by
47:39 data types aliasing creating views
47:41 having versus the group by statement the
47:44 get date function primary key versus
47:46 foreign key and then we're going to have
47:48 an advanced course and this is not set
47:50 in stone yet but these are some of the
47:51 things that I think I will be going
47:53 through or walking through we're going
47:57 through CTEs sys tables or system tables
48:00 subqueries temp tables string functions
48:03 regular expressions stored procedures and
48:05 then importing and exporting data so
48:07 with all that being said let's get into
48:09 it all right now let's get rid of me
48:11 because we do not need to be seeing me
48:13 for the rest of the series at the very
48:14 top here are some of the things that
48:15 we're going to be going through today
48:16 which are inner joins and then outer
48:18 joins and in the outer joins we have a
48:19 few different styles or a few different
48:24 types of outer joins now a join is a way
48:26 to combine multiple tables
48:28 into a single
48:30 output for now we're going to be using
48:31 the employee demographics and the
48:34 employee salary table so let's get a
48:37 look at both of these tables and see
48:39 what's in them in our employee
48:41 demographics table we have employee ID
48:44 first name last name age and gender and
48:46 then down here in our employee salary
48:49 table we have employee ID job title and
48:52 salary if you notice they have a similar
48:54 column and that's going to be the
48:57 employee ID now when you're doing a join
48:59 you have to do this based off a similar
49:02 column and typically you want it to be a
49:04 unique field so we're going to be using
49:06 the employee ID from both tables to join
49:08 these tables together to create one
49:11 output so let's get rid of this real
49:14 quick and let's start building our query
49:16 to join these two tables
49:18 together so the first thing we're going
49:21 to do is an inner join so let's do select
49:26 everything and let's do it from SQL tutorial.
49:31 dbo. employee
49:35 demographics and let's do join we can
49:39 also say inner join but join by default
49:41 is going to say
49:45 inner and we're going to do SQL tutorial.
49:48 dbo. employee
49:51 salary now we have to join them together
49:52 which is what we talked about earlier
49:53 and we're going to be doing that based
49:56 off the employee ID so for that we have
49:59 to say on and then we're going to
50:02 say employee
50:08 demographics dot employee ID is equal to employee
50:15 salary dot employee ID so let's run this
50:19 real quick and take a look at the
50:22 output and let me pull this up real quick
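
A minimal sketch of the inner join just built, using Python's built-in sqlite3 module rather than SQL Server, so the tables are referenced by name only. The table layouts and sample rows are assumptions; each table deliberately holds rows with no partner in the other so the effect of the join is visible.

```python
import sqlite3

# In-memory database; table layouts and rows are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeDemographics "
    "(EmployeeID INT, FirstName TEXT, LastName TEXT, Age INT, Gender TEXT)"
)
conn.execute(
    "CREATE TABLE EmployeeSalary (EmployeeID INT, JobTitle TEXT, Salary INT)"
)
conn.executemany(
    "INSERT INTO EmployeeDemographics VALUES (?, ?, ?, ?, ?)",
    [
        (1001, "Jim", "Halpert", 30, "Male"),
        (1002, "Pam", "Beesly", 30, "Female"),
        (1011, "Ryan", "Howard", 26, "Male"),  # no salary row for this ID
    ],
)
conn.executemany(
    "INSERT INTO EmployeeSalary VALUES (?, ?, ?)",
    [
        (1001, "Salesman", 45000),
        (1002, "Receptionist", 36000),
        (1010, None, 47000),        # no demographics row for this ID
        (None, "Salesman", 43000),  # no employee ID at all
    ],
)

# Only rows whose EmployeeID appears in both tables survive an inner join.
joined = conn.execute(
    "SELECT * "
    "FROM EmployeeDemographics "
    "INNER JOIN EmployeeSalary "
    "ON EmployeeDemographics.EmployeeID = EmployeeSalary.EmployeeID"
).fetchall()

for row in joined:
    print(row)
```
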
50:27 so what we are looking at is actually
50:29 both tables combined we have the
50:31 employee ID first name last name age
50:33 gender and then here's the salary
50:36 employee ID job title salary now an inner
50:38 join is really only going to show
50:41 everything that is the same so in both
50:44 tables there are employee IDs of
50:47 1001 all the way down to
50:50 1009 but if you notice there is data
50:52 that is missing real quick let's go down
50:54 to this graphic and let's look at this
50:56 inner join an inner join is going to
50:59 show everything that is common or
51:02 overlapping between table a and table B
51:04 so what we are looking at here is
51:06 exactly that we're only looking at the
51:09 things that are similar based off this
51:12 employee ID in both tables now let's
51:17 change this join to a full outer
51:23 join and let's run this and see what we
51:27 get now if you notice the output is very
51:30 different so let's take a look at it and
51:33 see why it's so different if you notice
51:35 everything down till here is the exact
51:40 same so employees 1001 down to 1009 are
51:43 exactly the same but once we get down to
51:46 row 10 it starts to get very different
51:47 now we are joining these tables based
51:51 off the employee ID so for example right
51:54 here Ryan Howard has an employee ID of
51:57 1011 but as you can see in this table for
52:01 salaries there is no 1011 employee ID so
52:04 it has nothing to link it to so because
52:06 of that it fills in everything as null
52:08 because it has nothing to match on this
52:12 table and vice versa in the employee
52:14 salary table there's a person in here
52:15 that's a Salesman and there's no
52:18 employee ID at all which means all this
52:20 information is going to be null and we
52:22 can see that in this diagram right here
52:24 so this is the full outer join right
52:27 here and what it is saying is we are
52:29 going to show everything from table a
52:32 and table B regardless of if it has a
52:34 match based on what we were joining them
52:37 on so even if table a has an employee ID
52:40 but there's no employee ID in table B
52:41 we're still going to show it and vice
52:46 versa so now let's look at a left outer
52:49 join a left outer join is going to take
52:52 the left table and say we want
52:54 everything from the left table and
52:56 everything that's overlapping but
52:58 if it's only in the right table we do
53:00 not want it now what is the left and the
53:01 right table the left table is going to
53:03 be our first table that we use our right
53:05 table is going to be the second table
53:07 that we use so we're going to look at
53:08 everything in the employee demographics
53:10 table regardless of whether or not it
53:13 has a match on the employee ID in the
53:15 employee salary table so this is what
53:18 that looks like so as you can see this
53:20 is our entire table for employee
53:24 demographics and down here we have three
53:26 that have information in the employee
53:28 demographics table but have absolutely
53:31 no information in any of the employee
53:33 salary table because there's nothing to
53:36 match it on so this 1011 is not in this
53:39 table this 1013 is not in this table and
53:41 this one does not even have an employee
53:43 ID so we're not going to have a match at
53:47 all and if we change that to the
53:50 right you'll see the exact opposite it's
53:52 going to show us everything in the
53:55 employee salary table so now we have all
53:57 of our information right here from the
53:59 employee salary table and if it doesn't
54:01 match in this table it's just going to
54:04 give nulls so down here we have 1010 and
54:05 obviously there's not going to be
54:07 anything associated with that because
54:09 there's no 1010 in the employee
54:11 demographics table and for this one we
54:14 have a Salesman with no employee ID and
54:16 since there's no employee ID to tie it
54:18 to this demographics table we're going
54:20 to have nothing and we can see that in
54:22 the diagram right here so for the left
54:24 outer join we're looking at everything
54:26 in table a which is our demographics
54:28 table and in our right outer join
54:30 looking at everything at table B which
54:33 is our salary table now let's pull this
54:36 down a little bit so so far we've only
54:38 been using the select star so we've been
54:40 selecting everything and I only did that
54:42 just for demonstration purposes but you
54:44 most likely would not be doing this when
54:46 you actually use these joins what you're
54:48 probably going to want to do is Select
54:51 exactly what columns you want in your
54:56 output so for example let's do employee
55:00 ID let's do first
55:03 name last
55:07 name and let's do job
55:09 title and let's do
55:12 salary and let's try to run that really
55:15 quick and as you can see it is not going
55:18 to work now why is that not working it's
55:21 not working because we have two Fields
55:23 one in each of these tables and we have
55:26 to specify what employee ID we want
55:27 because that is going to drastically
55:29 change what our output is so we have an
55:31 employee ID in this table and in this
55:34 table which one do we want to use so for
55:37 this demonstration let's use employee
55:41 demographics. employee ID and let's
55:44 actually just do an inner join because
55:48 it's easier for the
55:51 output now let's run this and see what
55:54 we get so as you can see we now have the
55:56 employee ID first name last name job
55:59 title and salary now we're doing this
56:01 with an inner join based off the employee ID
56:04 from the employee demographics table but
56:06 if we use the employee salary table it
56:09 should give us the exact same output and
56:10 that's because we're using an inner join and an
56:12 inner join is only going to show us
56:14 everything that overlaps between both
56:19 tables but now let's try a right outer
56:22 join and let's run this now we're using
56:24 this employee ID from our employee
56:26 salary table and since we're doing a
56:29 right outer join we're going to get all
56:31 the information from our employee salary
56:33 table and it does not have to be in our
56:35 left table which is our employee
56:37 demographics table so if you look at the
56:40 information down here this 110 is in the
56:43 employee salary table but it's in this
56:44 position because that's what we're
56:47 looking at in our select statement and
56:49 then over here we have our salary and
56:52 since we have information right here
56:54 which is in our employee salary table
56:55 but there is no employee ID our
56:58 employee ID is null now let's change
57:00 this to look at the employee
57:04 demographics employee ID and execute it
57:07 as you can see that 110 is gone now we
57:09 just have this information right down
57:12 here and we didn't have the employee ID
57:13 for either of these so it's going to
57:16 show it regardless and that's again
57:18 because we have a right outer join and
57:21 that's why we have no employee ID down
57:25 here now let's do a left
57:27 join and it's basically going to do the
57:30 opposite of what we just looked at now
57:31 we're looking at everything from our
57:34 left table regardless of if it's in our
57:36 right table and so our left table is our
57:39 employee demographics table and we are
57:41 looking at our employee demographics ID
57:43 so with the employee demographics ID
57:45 it's going to show us the first name and
57:47 the last name which is everything in our
57:48 left table our employee demographics
57:53 table and since for these IDs or lack of
57:55 IDs it's just going to give us nulls in
57:58 all of these places if I change it right
58:01 up here to the employee salary employee
58:05 ID and I execute it because we're
58:07 showing everything from our left table
58:09 which is our employee demographics table
58:12 we are still going to see our names but
58:14 since we're using the employee ID from
58:17 our right table now we're just going to
58:20 have blanks in this information and this
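The left join behavior described here, where the names always show up but the ID can come back null, can be sketched like this. It uses Python's sqlite3 module so it runs on its own (an assumption, not the video's database), with made-up rows. Only the left join is shown because older SQLite versions do not support RIGHT JOIN; the right join in the narration is the mirror image.

```python
import sqlite3

# Hypothetical rows: Ryan has no salary row, and salary ID 1010 has no demographics row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, LastName TEXT);
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
INSERT INTO EmployeeDemographics VALUES (1001, 'Jim', 'Halpert'), (1011, 'Ryan', 'Howard');
INSERT INTO EmployeeSalary VALUES (1001, 'Salesman', 45000), (1010, 'Salesman', 47000);
""")

left_join_sql = """
SELECT {id_column}, FirstName, LastName, JobTitle, Salary
FROM EmployeeDemographics
LEFT OUTER JOIN EmployeeSalary
    ON EmployeeDemographics.EmployeeID = EmployeeSalary.EmployeeID
ORDER BY FirstName
"""

# ID taken from the left table: every left-table row keeps its own ID.
left_id_rows = conn.execute(
    left_join_sql.format(id_column="EmployeeDemographics.EmployeeID")
).fetchall()

# ID taken from the right table: rows without a salary match get NULL (None) for the ID.
right_id_rows = conn.execute(
    left_join_sql.format(id_column="EmployeeSalary.EmployeeID")
).fetchall()

for row in left_id_rows:
    print("left id: ", row)
for row in right_id_rows:
    print("right id:", row)
```

In both results the salary-only ID 1010 never appears, because a left join keeps only rows that exist in the left table.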
58:22 information now let's look at a use case
58:25 for these joins let's say Robert
58:27 California is pressuring Michael Scott
58:30 to meet his quarterly quota and Michael
58:32 Scott is almost there he needs like a
58:33 thousand more dollars and he comes up
58:36 with the genius idea to deduct pay from
58:38 the highest paid employee at his Branch
58:42 besides himself so how does he go about
58:43 doing this and identifying the person
58:46 that makes the most money well of course
58:49 he's going to come to SQL first so we
58:51 actually want to look at a
58:56 full outer join real quick
58:59 and let's just look at
59:02 everything so here's what we have we
59:05 have the employee ID first name last
59:08 name age gender employee ID job title
59:10 and salary now what information do we
59:13 need to know to get the information that
59:15 Michael Scott needs well we need the
59:17 employee ID we want the first name and
59:20 last name so let's write all that real
59:24 quick so employee ID we need first name
59:26 we
59:30 need last name and then we're also going
59:32 to need the
59:34 salary cuz we need to know who is the
59:35 highest paid
59:38 employee so now let's do an inner join
59:40 because we really only want to look at
59:43 the employee IDs where we know what
59:45 their name is and their salary is and
59:47 let's do this based off the employee
59:48 demographics table really doesn't matter
59:51 for an inner join but let's do that real
59:54 quick so let's look at this so we have
59:56 our employee ID we have our first name
59:58 our last name and our salary and we want
60:01 to do it where it's not Michael Scott
60:02 and that's because Michael doesn't want
60:04 to take away his own money he wants to
60:06 take away his employees' money so let's
60:07 do
60:12 where first name does not equal Michael
60:14 and he knows that he's the only one that
60:18 is named Michael so now we have our
60:21 list and let's do order
60:25 by and let's do
60:28 salary and let's execute
60:32 this and let's do descending so that we
60:33 can get at the very
60:36 top and this is tough tough news for
60:39 Dwight Schrute because it looks like he is
60:41 the highest paid employee besides
60:43 Michael and so it looks like he is going
60:46 to get a cut in his pay this quarter so that Michael can meet his quota
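The Michael Scott use case (inner join, filter out Michael, sort by salary descending) can be sketched as follows. Python's sqlite3 module is used only to make the example runnable, and the rows are made-up sample data, so the names and numbers below are assumptions rather than the video's actual table contents.

```python
import sqlite3

# Hypothetical sample data in the shape the narration describes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, LastName TEXT);
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
INSERT INTO EmployeeDemographics VALUES
    (1001, 'Jim', 'Halpert'), (1003, 'Dwight', 'Schrute'), (1006, 'Michael', 'Scott');
INSERT INTO EmployeeSalary VALUES
    (1001, 'Salesman', 45000), (1003, 'Salesman', 63000), (1006, 'Regional Manager', 65000);
""")

ranked = conn.execute("""
SELECT EmployeeDemographics.EmployeeID, FirstName, LastName, Salary
FROM EmployeeDemographics
INNER JOIN EmployeeSalary
    ON EmployeeDemographics.EmployeeID = EmployeeSalary.EmployeeID
WHERE FirstName <> 'Michael'   -- leave Michael himself out
ORDER BY Salary DESC           -- highest paid first
""").fetchall()

highest_paid = ranked[0]
print(highest_paid)
```

Filtering on the first name only works because nobody else is named Michael; filtering on the employee ID would be the safer choice in a real table.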
60:48 so
60:50 that's just one use case let's look at
60:52 one more use case let's start out by
60:56 getting rid of this and looking at everything
61:03 again so for our next use case Kevin
61:05 Malone who is an accountant thinks that
61:07 he may have made a mistake when looking
61:10 at the average salary for our salesman
61:12 now Angela Martin is very good at SQL
61:14 and so what she is going to do is she
61:17 wants to go in and calculate the average
61:20 salary for our salesman so let's try to
61:22 get that information so all we're going
61:26 to need is the job title and the salary
61:29 so let's come up here and let's get job
61:31 title and let's get
61:34 salary and let's look at
61:38 this and now we only want to look at where the job title is equal to
61:47 salesman now the very last thing we want
61:51 to do is we want to say we want the
61:55 average of salary now since we're going
61:57 to need to do a group by we're going to
61:59 have to get rid of this
62:03 salary and just take job title right
62:07 down here and do group by job title so
62:09 we're going to have job title and then
62:11 the average
62:14 salary and there you go we have the
62:16 salesman and the average salary is
62:19 52,000 so Angela now knows to go back and fix what Kevin made a mistake on
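Angela's check (filter to one job title, then average the salary with a group by) can be sketched like this. The join type is not stated at this point in the narration, so an inner join is assumed; sqlite3 and the sample rows are also assumptions, chosen so that the three salesman salaries average to the 52,000 mentioned.

```python
import sqlite3

# Hypothetical rows; three salesmen and one accountant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, LastName TEXT);
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
INSERT INTO EmployeeDemographics VALUES
    (1001, 'Jim', 'Halpert'), (1003, 'Dwight', 'Schrute'),
    (1004, 'Angela', 'Martin'), (1008, 'Stanley', 'Hudson');
INSERT INTO EmployeeSalary VALUES
    (1001, 'Salesman', 45000), (1003, 'Salesman', 63000),
    (1004, 'Accountant', 47000), (1008, 'Salesman', 48000);
""")

average_rows = conn.execute("""
SELECT JobTitle, AVG(Salary) AS AverageSalary
FROM EmployeeDemographics
INNER JOIN EmployeeSalary
    ON EmployeeDemographics.EmployeeID = EmployeeSalary.EmployeeID
WHERE JobTitle = 'Salesman'   -- keep only the salesmen
GROUP BY JobTitle             -- one output row per job title
""").fetchall()

print(average_rows)
```

Salary had to leave the select list, as the narration says, because a plain column that is not in the group by cannot sit next to an aggregate in most databases.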
62:22 so
62:24 that's how you use joins I will
62:26 include this image in the description so
62:28 you can go and look that up yourself if
62:29 you are curious and want to look at that
62:31 that really helped me out when I was
62:32 first getting started to kind of
62:34 conceptualize and understand what kind
62:37 of data I was pulling based on what join
62:39 I was using so I hope that was useful to
62:41 you as well in the very next video we're
62:43 going to be looking at the union so if
62:45 that is posted be sure to check that out
62:47 next thank you guys so much for joining
62:49 me I really appreciate it if you like
62:50 this type of content or got anything out
62:52 of it today be sure to smash the like
62:54 button smash the Subscribe button and
62:55 I'll see you in the next video what's
62:57 going on everybody my name is Alex freeberg
62:59 in today's video we're going to be
63:01 looking at unions now in the very last
63:03 video we walked through joins and I
63:04 thought it was appropriate to look at
63:07 unions next because unions and joins are
63:10 somewhat similar or closely related and
63:13 that's because in both instances they're
63:15 combining two tables to create one
63:17 output now what's the difference the
63:19 difference is that a join combines both
63:22 tables based off a common column and in
63:25 last video that was the employee ID so
63:28 in both tables we had an employee ID and
63:30 when you're selecting your data you have
63:32 to choose either to only select one
63:35 employee ID or you can choose both
63:37 employee IDs but they're in separate
63:40 columns and with a union you're actually
63:42 able to select all the data from both
63:46 tables and put it into one output where
63:49 all the data is in each column and not
63:50 separated out and you don't have to
63:52 choose which table you're choosing it
63:54 from now that may not have made 100%
63:57 sense but let's look at it real quick in
64:00 stages so let's go down here and let's
64:02 actually join this table
64:05 together and see what we get now the two
64:07 tables that we're looking at is employee
64:10 demographics and warehouse employee
64:12 demographics so over here we have our
64:14 employee demographics information and
64:17 then over here or actually down here we
64:19 have our warehouse employee demographics
64:21 now right now I'm doing a full outer
64:24 join so we're looking at all the data
64:26 and if we were to pull this in to an
64:28 Excel spreadsheet we could just copy
64:30 this and paste it over here and we would
64:32 be good to go and that's because we have
64:35 all the same columns first name last
64:37 name age gender first name last name age
64:39 gender but if we tried to combine this
64:41 in a query where we have this
64:44 information right here it wouldn't work
64:47 we cannot get it in the same column and
64:50 that's where a union comes into play so
64:53 let's go back up here and let's actually
64:55 run both of
64:59 these now as you can see they have the
65:01 exact same columns and that makes it
65:03 super easy for what we're about to do
65:06 all we're going to do is between these
65:08 two queries which are completely
65:11 separate right now all we're going to do
65:12 is write
65:18 Union so let's run just
65:21 this now because of the Union you can
65:22 look down here and the information that
65:25 used to be in the other table which were
65:28 in separate columns are now added Down
65:31 Below in the exact same order now Darryl
65:34 Philbin was actually in both tables and
65:35 the reason he isn't showing up multiple
65:37 times is because this Union is actually
65:40 taking out and removing the duplicates
65:42 kind of like a distinct statement now
65:44 there's actually another thing called
65:46 Union all and if we do Union all it is
65:49 going to show us all of the information
65:51 regardless if it is a duplicate or not
65:54 so let's run that real quick and
65:57 they are both there but let's order
66:01 by and let's do employee
66:05 ID so now let's run it and as you can
66:07 see right here these are exact
66:10 duplicates and so the union got rid of
66:12 it because they were the exact same but
66:15 the union all kept it in because it is showing just the data as is
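The difference between UNION and UNION ALL walked through above can be sketched in a few lines. Python's sqlite3 module is used so the example runs by itself (an assumption, not the video's setup), and the rows are made up, including the age given to the employee who appears in both tables.

```python
import sqlite3

# Hypothetical rows; Darryl appears in both tables with identical values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics
    (EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Age INTEGER, Gender TEXT);
CREATE TABLE WareHouseEmployeeDemographics
    (EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Age INTEGER, Gender TEXT);
INSERT INTO EmployeeDemographics VALUES
    (1001, 'Jim', 'Halpert', 30, 'Male'), (1013, 'Darryl', 'Philbin', 40, 'Male');
INSERT INTO WareHouseEmployeeDemographics VALUES
    (1013, 'Darryl', 'Philbin', 40, 'Male'), (1050, 'Roy', 'Anderson', 31, 'Male');
""")

union_sql = """
SELECT * FROM EmployeeDemographics
{operator}
SELECT * FROM WareHouseEmployeeDemographics
ORDER BY EmployeeID
"""

# UNION removes rows that are exact duplicates across the two selects.
union_ids = [row[0] for row in conn.execute(union_sql.format(operator="UNION"))]

# UNION ALL stacks the rows as they are, duplicates included.
union_all_ids = [row[0] for row in conn.execute(union_sql.format(operator="UNION ALL"))]

print(union_ids)
print(union_all_ids)
```

A row only counts as a duplicate when every selected column matches, so two people sharing an ID but differing in any other column would both be kept by UNION.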
66:18 now let's
66:20 get rid of this Union all because the
66:22 only reason why it works so well is
66:25 because those two tables were the exact same
66:27 they were employee ID first name last
66:29 name age gender so they're basically the
66:30 same tables just with different
66:33 information so it made it really easy
66:36 but we have another table
66:38 employee uh
66:41 salary and let's look at these two
66:45 tables so these two tables are obviously
66:48 very different they hold different
66:50 information now we would still be able
66:55 to combine them so let's do employee
67:00 ID first name and let's do
67:03 age now down here on the employee salary
67:07 table we will do employee ID job title
67:09 and
67:14 salary now let's use a union really
67:17 quick and run this
67:21 one and it is still going to work now
67:23 why does this work well first off
67:25 the reason it's working is because these
67:29 data types are the exact same or at
67:31 least similar so text and text age which
67:34 is an integer salary which is an integer
67:36 it has the same number of columns so
67:40 three and three so we have employee ID
67:42 first name and age and it's taking that
67:44 from the first select statement and it's
67:47 still using a union to take the data
67:49 from the second select statement so it's
67:51 still inserting this information now
67:53 this is not what you want to do because
67:55 right here we have first name and it's
67:57 salesman salesman and then our age we
68:01 have 30 45,000 and 45,000 is obviously
68:03 not an age so you want to be careful
68:05 when you're using a union to combine two
68:07 separate tables and make sure that the data you're selecting is the same
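The warning about unioning two unrelated tables can be seen in a small sketch: the query runs, but job titles land under the first name heading and salaries under age. Python's sqlite3 module and the sample rows are assumptions for illustration. SQLite is also looser about column types than many databases, so the type-compatibility rule mentioned in the narration is not enforced here; only the column-count rule is.

```python
import sqlite3

# Hypothetical one-row tables, enough to show the mix-up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, Age INTEGER);
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
INSERT INTO EmployeeDemographics VALUES (1001, 'Jim', 30);
INSERT INTO EmployeeSalary VALUES (1001, 'Salesman', 45000);
""")

cursor = conn.execute("""
SELECT EmployeeID, FirstName, Age FROM EmployeeDemographics
UNION
SELECT EmployeeID, JobTitle, Salary FROM EmployeeSalary
ORDER BY EmployeeID, Age
""")
# The output headings come from the first select only.
column_names = [column[0] for column in cursor.description]
mixed_rows = cursor.fetchall()

# A different number of columns on each side is rejected outright.
try:
    conn.execute("""
    SELECT EmployeeID, FirstName FROM EmployeeDemographics
    UNION
    SELECT EmployeeID, JobTitle, Salary FROM EmployeeSalary
    """)
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

print(column_names)
print(mixed_rows)
print(mismatch_error)
```

Nothing in the result says that 45000 is a salary rather than an age, which is why the columns on both sides of a union should mean the same thing.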
68:10 in the
68:11 very next video we're going to be
68:12 walking through case statements thank
68:14 you guys so much for joining me I really
68:16 appreciate it if you like this type of
68:18 content be sure to subscribe below and
68:20 I'll see you in the next video what is
68:21 going on everybody my name is Alex
68:23 freeberg and today we're going to be
68:26 walking through case statements in SQL a
68:28 case statement allows you to specify a
68:30 condition and then it also allows you to
68:32 specify what you want returned when that
68:35 condition is met so we're going to be
68:36 using this employee demographics table
68:38 that we're looking at right here we're
68:40 going to walk through the syntax of how
68:41 to create a case statement and then
68:43 we're going to actually go into some use
68:45 cases at the end so let's start off by
68:47 specifying what columns we want let's
68:51 say we want the first name we want the
68:56 last name and we want the age now
68:57 let's just get that
68:59 information now for our case statement
69:01 we're going to be using this age column
69:03 so we actually want the age to be in
69:04 there so let's
69:08 specify where age is not
69:11 null and run that so now we have a
69:13 pretty good look at it and let's just
69:15 order
69:19 by age just to clean it up a little bit so
69:21 now let's start building our case
69:24 statement so we're going to say case and
69:27 then we want to say when now we need to
69:28 specify what condition we want to look
69:33 for so let's do when age is greater than
69:36 30 then then what do we want to be
69:39 returned so we want to return that they
69:42 are old else so that means anything that
69:45 is not over the age of 30 we want to
69:48 return
69:50 young and then you need to specify that
69:52 you're done with the case statement and so
69:55 you will write end at the very bottom so
69:56 this is our first case statement let's
69:59 run it and see what we get so as you can
70:02 see a new column was created and if the
70:04 person is over the age of 30 so 31 and
70:07 up they are given old and if they're not
70:10 over the age of 30 they are given
70:13 young now we can do as many when and
70:15 then statements as we want so if we want
70:19 to we can also do when the age is
70:24 between 27 and 30
70:26 then we want to return
70:29 young and anyone else we're going to
70:31 call a
70:35 baby so now we have Ryan Howard as the
70:38 baby anyone between 27 and 30 they're
70:41 considered young and anyone over the age
70:44 of 30 is old now something to note is
70:46 that the very first condition that is
70:49 met is going to be returned so if there
70:50 are multiple conditions that meet the
70:53 criteria only the very first one is
70:54 going to be returned and let's
70:57 demonstrate that real quick so if the
70:59 age equals
71:05 38 then return Stanley because that is
71:08 Stanley uh and let's execute this real
71:10 quick so right here I'm specifying that
71:13 if it's 38 it should return Stanley but
71:15 he is right here and it still says old
71:17 and that's because this condition was
71:20 already met now if we were to put this
71:22 right
71:26 here it should work correctly and let's
71:28 try it out so now because this condition
71:30 is met first it is going to return Stanley down here
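The rule that the first matching WHEN wins can be checked with a short sketch that runs the same CASE expression with the conditions in two different orders. The helper name `label_ages`, the use of Python's sqlite3 module, and the three sample rows are all assumptions for illustration.

```python
import sqlite3

# Hypothetical rows covering each age bracket from the narration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (FirstName TEXT, LastName TEXT, Age INTEGER);
INSERT INTO EmployeeDemographics VALUES
    ('Stanley', 'Hudson', 38), ('Jim', 'Halpert', 30), ('Ryan', 'Howard', 26);
""")

def label_ages(when_clauses):
    """Run a CASE expression built from the given WHEN clauses, in order."""
    sql = (
        "SELECT FirstName, CASE "
        + " ".join(when_clauses)
        + " ELSE 'Baby' END AS AgeLabel "
        "FROM EmployeeDemographics WHERE Age IS NOT NULL ORDER BY Age"
    )
    return dict(conn.execute(sql).fetchall())

over_30 = "WHEN Age > 30 THEN 'Old'"
young = "WHEN Age BETWEEN 27 AND 30 THEN 'Young'"
stanley = "WHEN Age = 38 THEN 'Stanley'"

# Age = 38 is listed after Age > 30, so it is never reached for Stanley.
general_first = label_ages([over_30, young, stanley])

# Moving the specific condition to the top lets it match before Age > 30.
specific_first = label_ages([stanley, over_30, young])

print(general_first)
print(specific_first)
```

A practical habit that follows from this is to list the most specific conditions first and the broad catch-all ranges last.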
71:33 so now let's get into
71:36 our first use case let's start off by
71:40 copying this and then commenting it out
71:43 I only did that because I don't want to
71:44 rewrite it because I'm
71:47 lazy uh let's get rid of that and let's
71:48 look at this real quick we are going to
71:51 join on another table that we have
71:54 really fast um that's going to be SQL
71:56 tutorial if you watched my other videos
72:00 then you know this table and we're going
72:03 to do that on employee
72:08 demographics.employee ID is equal to
72:10 employee
72:14 salary.employee ID okay so let's just
72:16 look at everything in these tables
72:17 really quick now we are going to be
72:19 focusing on the job title and the salary
72:21 column but we want their first name and
72:22 last name as well so let's start
72:24 building that out
72:25 let's do first
72:27 name last
72:32 name job title and salary and let's look
72:34 at this really quick so now we have our
72:36 employees and here is the situation we
72:38 had a fantastic year this year selling
72:40 paper and corporate has allowed Michael
72:43 Scott to give out a yearly raise to
72:45 every single employee but not every
72:47 employee is going to get the same raise
72:50 because our salesmen are genuinely the
72:52 people who made us our money and they're
72:53 going to get the biggest raises while
72:55 other people really aren't going to get
72:57 that big of a raise so now let's go
72:58 through and create a case statement to
73:00 calculate what their salary will be
73:01 after they get their
73:05 raise so let's start off by saying
73:07 case and
73:10 when and we want it to say when job
73:12 title is equal to
73:15 salesman so when they are a Salesman
73:17 what do we want to happen so this is
73:18 where the calculation occurs so we're
73:20 going to take their
73:24 salary and then we're going to add their
73:27 salary times how much their raise is
73:29 going to be so the salesman did really
73:30 really well and we want to give them a
73:35 10% raise this year now when their job
73:38 title is equal
73:40 to
73:44 accountant then and we'll take their
73:47 salary we will give
73:51 them let's give them a 5% raise still
73:52 very
73:59 generous there we go and when the job
74:03 title is equal to
74:07 HR then it's going to be the salary
74:09 plus the
74:14 salary times and then we're going to do
74:18 01 all right and else we are just going
74:18 to
74:20 do
74:25 salary plus salary oops let's do
74:28 parentheses times and let's just give
74:32 everyone else a 3% raise and then we'll
74:36 write end now let's take a look at our
74:39 results so here's what we have so far we
74:41 have our first name our last name our
74:43 job title and our salary that is our
74:45 current salary and then we're going to
74:47 have our salary after we get our raise
74:49 so I'm going to actually write that up
74:51 here so let's do as
74:54 salary after
74:57 raise and let's execute
74:59 that so let's look at these raises
75:01 really quick so we have 45,000 and since
75:04 he is a Salesman he gets a 10% raise
75:05 which is a raise of
75:08 $4,500 so 45,000 plus
75:12 $4,500 is $49,500 and as you can see
75:15 down here we have HR who is making
75:17 $50,000 and now he is making $5,000 5 so everybody got a raise
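The raise calculation can be sketched with a CASE expression that does arithmetic in each branch. The 10% salesman, 5% accountant, and 3% everyone-else rates come from the narration; the HR branch is left out because its rate is not clear in the transcript. Python's sqlite3 module, the alias `SalaryAfterRaise`, and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical rows: one salesman, one accountant, one "everyone else".
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, LastName TEXT);
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
INSERT INTO EmployeeDemographics VALUES
    (1001, 'Jim', 'Halpert'), (1002, 'Pam', 'Beasley'), (1004, 'Angela', 'Martin');
INSERT INTO EmployeeSalary VALUES
    (1001, 'Salesman', 45000), (1002, 'Receptionist', 36000), (1004, 'Accountant', 47000);
""")

rows = conn.execute("""
SELECT FirstName, LastName, JobTitle, Salary,
       CASE
           WHEN JobTitle = 'Salesman'   THEN Salary + (Salary * 0.10)
           WHEN JobTitle = 'Accountant' THEN Salary + (Salary * 0.05)
           ELSE Salary + (Salary * 0.03)
       END AS SalaryAfterRaise
FROM EmployeeDemographics
JOIN EmployeeSalary
    ON EmployeeDemographics.EmployeeID = EmployeeSalary.EmployeeID
ORDER BY FirstName
""").fetchall()

# Map first name to the new salary, rounded to cents for display.
after_raise = {first: round(new_salary, 2) for first, _, _, _, new_salary in rows}
print(after_raise)
```

The parentheses around the multiplication are not strictly required, since multiplication binds tighter than addition, but they make the intent of each branch easier to read.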
75:22 so
75:23 that is our case statement I hope that
75:25 was helpful I find myself using the case
75:27 statement a lot when I'm wanting to
75:29 categorize things or label things and
75:31 that's kind of what we did in the first
75:33 example and you can even do calculations
75:36 like we did in this use case so I hope
75:38 that was helpful thank you guys so much
75:40 for watching I really appreciate it if
75:41 you learned anything from this video be
75:43 sure to like And subscribe below and
75:45 I'll see you in the next video what is
75:47 going on everybody my name is Alex freeberg
75:48 and today we're going to be looking at
75:51 the having Clause now the having Clause
75:53 I feel is a little bit unappreciated in
75:55 the SQL Community I feel like it doesn't
75:57 get a lot of love and so today I want to
75:58 describe how to use it and what it's
76:00 used for so before we use the having
76:03 Clause I want to set up our query here
76:04 uh we want to use an aggregate function
76:06 in the group by statement and then I
76:08 will show you how to use this having
76:12 Clause so let's look at the job title
76:17 and let's look at the count of job
76:20 titles and then down here we need to do
76:24 group by job title
76:25 and let's execute
76:28 this and here is our job titles and
76:30 here's the count of how many people have
76:32 those job titles so now let's say we
76:34 want to look at all the jobs that have
76:36 more than one person in that specific
76:42 job so let's do where uh the
76:48 count of job title is greater oops is
76:50 greater than one and let's run
76:52 that and as you can see we're going to
76:53 get this this message right here now let's read it an aggregate may not
76:56 let's read it an aggregate may not appear in the wear Clause unless it is
76:58 appear in the wear Clause unless it is in a subquery contained in a having
77:00 in a subquery contained in a having clause or a select list and the column
77:04 clause or a select list and the column being aggregated is an outer
77:06 being aggregated is an outer reference what that is basically saying
77:08 reference what that is basically saying is is we cannot use this aggregate
77:11 is is we cannot use this aggregate function in the wear statement we need
77:12 function in the wear statement we need to use a having Clause so let's get rid
77:16 to use a having Clause so let's get rid of this and let's say
77:18 of this and let's say having the count of job
77:21 having the count of job title greater than one I did the same
77:24 title greater than one I did the same thing again and let's execute this and
77:27 thing again and let's execute this and we're still going to get an error now
77:29 we're still going to get an error now why are we getting that error the reason
77:31 why are we getting that error the reason is is because this having statement is
77:34 is is because this having statement is completely dependent on the group by
77:36 completely dependent on the group by statement because we are performing this
77:38 statement because we are performing this after it has been aggregated so this
77:40 after it has been aggregated so this having statement actually needs to go
77:42 having statement actually needs to go after the group by statement because we
77:45 after the group by statement because we can't look at the aggregated information
77:47 can't look at the aggregated information before it's actually aggregated in that
77:49 before it's actually aggregated in that group by statement so now let's run this
77:52 group by statement so now let's run this and and it worked
77:54 and and it worked perfectly so now we only have the jobs
77:56 perfectly so now we only have the jobs that have more than one employee for
77:58 that have more than one employee for that job
78:00 that job title so now let's look at one more
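The WHERE versus HAVING behaviour described above can be sketched like this. It uses Python's sqlite3 module rather than SQL Server, so the error text differs from the message read out in the video, and the sample rows are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
INSERT INTO EmployeeSalary VALUES
    (1001, 'Salesman', 45000),
    (1003, 'Salesman', 63000),
    (1004, 'Accountant', 47000),
    (1005, 'HR', 50000),
    (1009, 'Accountant', 42000);
""")

# An aggregate in WHERE is rejected: WHERE filters rows before they are grouped.
try:
    conn.execute(
        "SELECT JobTitle, COUNT(JobTitle) FROM EmployeeSalary "
        "WHERE COUNT(JobTitle) > 1 GROUP BY JobTitle"
    )
    where_failed = False
except sqlite3.OperationalError:
    where_failed = True

# HAVING goes after GROUP BY and filters the already aggregated groups.
# ORDER BY is added here only to make the output order predictable.
having_rows = conn.execute("""
    SELECT JobTitle, COUNT(JobTitle) AS TitleCount
    FROM EmployeeSalary
    GROUP BY JobTitle
    HAVING COUNT(JobTitle) > 1
    ORDER BY JobTitle
""").fetchall()
print(where_failed, having_rows)
```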
78:02 so now let's look at one more example let's do the average let's say
78:06 salary and let's get rid of this having
78:08 clause real quick and just to look at
78:12 this
78:13 information uh let's do order by and
78:17 we'll do average
78:21 salary so let's look at this and we have
78:25 36,000 to 65,000 so in the middle we got
78:28 44,500 so let's use this having
78:31 statement and let's
78:34 say the
78:36 average of
78:38 salary where it is greater than
78:44 45,000 and we actually need to put this
78:47 right here right after the group by and
78:50 before the order by so let's run this
78:53 and see what we get and it worked
78:55 perfectly so now we're looking at the
78:56 job titles that have an average salary
78:58 of over
79:00 $45,000 so there you go that is the
79:02 having clause definitely one that is
79:04 good to know and is very useful in
79:06 specific situations thank you guys so
79:08 much for watching I really appreciate it
79:10 if you like this video or learned
79:12 anything today be sure to subscribe
79:13 below and I'll see you in the next
79:15 video
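Here is a small sketch of the clause order from the second example: GROUP BY, then HAVING, then ORDER BY. It again uses Python's sqlite3 module as a stand-in for SQL Server, with sample rows assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
INSERT INTO EmployeeSalary VALUES
    (1001, 'Salesman', 45000),
    (1002, 'Receptionist', 36000),
    (1003, 'Salesman', 63000),
    (1004, 'Accountant', 47000),
    (1006, 'Regional Manager', 65000),
    (1008, 'Salesman', 48000),
    (1009, 'Accountant', 42000);
""")

# HAVING sits between GROUP BY and ORDER BY.
avg_rows = conn.execute("""
    SELECT JobTitle, AVG(Salary) AS AvgSalary
    FROM EmployeeSalary
    GROUP BY JobTitle
    HAVING AVG(Salary) > 45000
    ORDER BY AVG(Salary)
""").fetchall()
print(avg_rows)
```

With these rows the accountants average 44,500, so they drop out and only the job titles averaging over 45,000 remain.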
79:18 what is going on everybody my name is Alex Freberg and today we're going
79:19 to be looking at updating and deleting
79:21 data in a table now what's the
79:23 difference between inserting data into a
79:25 table and updating data insert into is
79:28 going to create a new row in your table
79:31 while updating is going to alter a
79:32 pre-existing row while deleting is going
79:36 to specify what rows you want to remove
79:38 from your table so let's get going with
79:41 the updating so down here Holly Flax
79:44 does not have an employee ID age or
79:47 gender now we want to update this table
79:49 to give her that information so let's do
79:52 update now we need to specify what table
79:54 we are going to be hitting off of so
79:56 let's do SQLTutorial.dbo.EmployeeDemographics
79:59 so now we're going to use
80:01 something called set and set is going to
80:03 specify what column and what value you
80:06 actually want to insert into that cell
80:09 so let's set
80:10 her employee ID equal to and it's going
80:15 to be
80:16 1012 and we have to specify which one to
80:19 do this to because if we ran just this
80:22 it is going to set every single employee ID
80:24 to 1012 because we haven't specified that
80:27 we only want Holly Flax's row to be
80:29 updated so now we have to specify
80:32 where first
80:34 name is equal to
80:38 Holly
80:39 and last name is equal to Flax so now
80:46 let's run this and see what we
80:49 get so one row has been affected
80:53 let's see what we got and there we go as
80:57 you can see the employee ID was updated
80:59 exactly how we specified it right here
81:01 so we also want to update age and gender
81:03 and let's do that in the same
81:05 query so let's set the age equal to 31
81:10 and instead of using and we actually
81:12 need to use a comma so let's say age
81:15 equal to 31 comma gender is going to be
81:19 equal to female and let's run
81:23 this and see what we get there you go
81:26 now let's look at our
81:28 table and as you can see it was updated
81:31 to 31 and
81:33 female so very easy very easy to specify
81:36 what you want often times uh tables like
81:38 this will have a unique key like
81:40 employee ID is our unique key in this
81:42 table so I could easily just say uh
81:45 where the employee ID is equal to and
81:49 then you know
81:50 1012 so it's an easy way to specify
81:53 what employee you're trying to update
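The two updates described above could look roughly like this. This is a sketch using Python's sqlite3 module instead of SQL Server, so the table is referenced without the database and schema prefix, and the sample rows are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (
    EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Age INTEGER, Gender TEXT);
INSERT INTO EmployeeDemographics VALUES
    (1001, 'Jim', 'Halpert', 30, 'Male'),
    (NULL, 'Holly', 'Flax', NULL, NULL);
""")

# Without the WHERE clause this would overwrite EmployeeID on every row.
cur = conn.execute("""
    UPDATE EmployeeDemographics
    SET EmployeeID = 1012
    WHERE FirstName = 'Holly' AND LastName = 'Flax'
""")
first_update_count = cur.rowcount

# Several columns in one SET are separated by commas, not AND.
# Filtering on the unique key is the shortcut mentioned at the end.
conn.execute("""
    UPDATE EmployeeDemographics
    SET Age = 31, Gender = 'Female'
    WHERE EmployeeID = 1012
""")

holly = conn.execute(
    "SELECT * FROM EmployeeDemographics WHERE EmployeeID = 1012"
).fetchone()
jim = conn.execute(
    "SELECT * FROM EmployeeDemographics WHERE EmployeeID = 1001"
).fetchone()
print(first_update_count, holly, jim)
```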
81:55 so now let's look at the delete statement
81:56 the delete statement is going to remove
81:58 an entire row from our table so let's do
82:03 delete and we actually need to say from
82:06 and we have to specify what table we
82:07 want to be removing this information
82:09 from so let's do
82:13 SQLTutorial.dbo.EmployeeDemographics
82:14 and now we need to specify
82:17 what row we want to remove so let's do
82:20 where employee ID is equal to and let's
82:24 choose a completely random employee ID
82:27 1005 so let's run this and see what
82:32 happens so one row is
82:34 affected let's look at our table and as
82:38 you can see 1005 is now gone
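A short sketch of the delete just described, run through Python's sqlite3 module as a stand-in for SQL Server. The sample rows are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (
    EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Age INTEGER, Gender TEXT);
INSERT INTO EmployeeDemographics VALUES
    (1004, 'Angela', 'Martin', 31, 'Female'),
    (1005, 'Toby', 'Flenderson', 32, 'Male'),
    (1006, 'Michael', 'Scott', 35, 'Male');
""")

# The WHERE clause limits the delete to the one matching row.
cur = conn.execute("DELETE FROM EmployeeDemographics WHERE EmployeeID = 1005")
deleted_count = cur.rowcount

remaining_ids = [
    row[0]
    for row in conn.execute(
        "SELECT EmployeeID FROM EmployeeDemographics ORDER BY EmployeeID"
    )
]
print(deleted_count, remaining_ids)
```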
82:42 now you have to be very careful when you use the
82:43 delete statement because once you run it
82:46 you cannot get that data back there's no
82:48 way to reverse a delete statement so if
82:50 I had gotten rid of this where statement
82:52 and I ran this it would delete
82:54 everything from the entire table and you
82:55 could not get that data back so a little
82:57 trick that I use before I actually run a
82:59 delete statement is I make it a select
83:02 statement because you're going to select
83:04 everything where the employee ID is
83:08 equal to let's just do
83:10 1004 and now when you run this you are
83:13 going to see exactly what you will be
83:15 deleting and now we know that Angela
83:17 Martin that entire row is going to be
83:19 gone if I hadn't done that and I just
83:21 went like this and I wrote delete and I
83:24 only had this running I would not know
83:26 that this information is going to be the
83:27 only one that's gone maybe I made a
83:29 mistake down here maybe I accidentally
83:31 put something in there that wasn't
83:32 supposed to be in there and now I'm
83:34 deleting much more than I thought I was
83:35 actually going to
83:37 delete so using the select statement can
83:39 be a very good safeguard against
83:41 accidentally deleting data that you do
83:43 not want to delete
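The select-before-delete safeguard can be sketched as follows. It uses Python's sqlite3 module as a stand-in for SQL Server, the sample rows are assumed for illustration, and sharing one WHERE string between the two statements is my own way of keeping the preview and the delete in sync, not something shown in the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (
    EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Age INTEGER, Gender TEXT);
INSERT INTO EmployeeDemographics VALUES
    (1003, 'Dwight', 'Schrute', 29, 'Male'),
    (1004, 'Angela', 'Martin', 31, 'Female');
""")

# Reuse the exact same filter for the preview and the delete.
where_clause = "WHERE EmployeeID = ?"
target_id = (1004,)

# Step 1: run it as a SELECT to see exactly which rows would be removed.
preview = conn.execute(
    "SELECT * FROM EmployeeDemographics " + where_clause, target_id
).fetchall()
print(preview)

# Step 2: only delete once the preview shows the single row we expect.
if len(preview) == 1:
    conn.execute("DELETE FROM EmployeeDemographics " + where_clause, target_id)

remaining = conn.execute("SELECT * FROM EmployeeDemographics").fetchall()
print(remaining)
```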
83:45 so that is update and delete thank you guys so much for
83:47 watching I really appreciate it if you
83:48 like this video be sure to subscribe
83:50 below and I'll see you in the next video
83:52 what's going on everybody my name
83:53 is Alex Freberg and today we're going to be
83:54 talking about aliasing now all aliasing
83:57 really is is temporarily changing the
83:59 column name or the table name in your
84:01 script and it's not really going to
84:02 impact your output at all aliasing is
84:05 really used for the readability of your
84:07 script so that if you hand this off to
84:08 somebody or somebody comes behind you
84:10 and starts working on this they can more
84:12 easily understand it and it may not
84:14 sound super useful especially for small
84:16 scripts like what we have on the screen
84:18 but when you start getting to larger
84:19 scripts where you have six seven or
84:21 eight joins and you're selecting 10
84:23 different column names it actually is
84:25 very useful and very important so let's
84:27 get into how that actually works and
84:29 then I'll have an example later of how
84:30 we can use aliasing with a little bit of a
84:32 larger query so in this table let's
84:35 select first
84:37 name and
84:39 execute what we want to do is just write
84:42 as and let's do Fname and all that's
84:45 going to do is it's going to rename this
84:47 column from first name which it was
84:49 originally named to Fname now you
84:52 can use as but you can also just get rid
84:54 of that and do it exactly how I have it
84:57 and it's still going to work perfectly
84:59 you can either use the as or you can not
85:01 use it I typically don't I just put a
85:03 space in between the actual column and
85:04 the alias
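A quick sketch of the two alias spellings just described, with and without AS. It runs through Python's sqlite3 module as a stand-in for SQL Server, and the sample rows are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, LastName TEXT);
INSERT INTO EmployeeDemographics VALUES
    (1001, 'Jim', 'Halpert'),
    (1002, 'Pam', 'Beasley');
""")

# With the AS keyword.
cur = conn.execute("SELECT FirstName AS Fname FROM EmployeeDemographics")
names_with_as = [col[0] for col in cur.description]

# Without AS: a space between the column and the alias does the same thing.
cur = conn.execute("SELECT FirstName Fname FROM EmployeeDemographics")
names_without_as = [col[0] for col in cur.description]

print(names_with_as, names_without_as)
```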
85:06 now let's look at an example of how this might actually be useful so
85:08 we have a first name and a last name in
85:10 this table so what we're going to do is
85:11 actually combine those so let's do plus
85:14 and let's add a space in there and let's
85:16 do a plus and let's do last name so this
85:19 is going to take the first name add a
85:21 space and then do the last name and
85:23 we're going to do that as and let's do
85:26 full
85:27 name and let's execute
85:29 this so now we have a column called full
85:32 name which is our alias so we've
85:33 combined the first name and the last
85:35 name column into one single column and
85:37 we've renamed it full name if we had not
85:39 used this alias at all it would have
85:41 just said this which is no column name
85:44 at all we don't typically want that when
85:46 we have an output we want to give this
85:48 column a name so that somebody who's
85:49 actually looking at the script or who's
85:50 looking at the output of the script
85:52 actually understands what is contained
85:54 within this column so for that we're
85:56 just going to keep it as full name
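The combined-name column can be sketched like this. One dialect difference to flag: the video uses plus to join strings in SQL Server, while this sketch runs on SQLite through Python's sqlite3 module, where the string concatenation operator is `||`. The sample rows are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, LastName TEXT);
INSERT INTO EmployeeDemographics VALUES
    (1001, 'Jim', 'Halpert'),
    (1002, 'Pam', 'Beasley');
""")

# SQL Server form from the video: FirstName + ' ' + LastName AS FullName
# SQLite form used here:          FirstName || ' ' || LastName AS FullName
cur = conn.execute("""
    SELECT FirstName || ' ' || LastName AS FullName
    FROM EmployeeDemographics
    ORDER BY EmployeeID
""")
full_name_column = cur.description[0][0]
full_names = [row[0] for row in cur.fetchall()]
print(full_name_column, full_names)
```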
85:59 now another time that you're often going to
86:00 use aliasing in the select statement is
86:02 when you're using aggregate functions so
86:04 in this table we have age so let's pull
86:07 that up really
86:08 quick so we have age right here and
86:11 let's actually just do the average
86:16 age and when we execute this we're going
86:18 to get no column name and 31 so what we want
86:21 to do
86:22 is give it average
86:25 age and when we do that we now have a
86:27 column name and again you want to have a
86:29 column name in case someone comes up
86:31 behind you and is reading the script so
86:32 that they understand what this column is
86:34 being used for
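A small sketch of naming an aggregate column, run through Python's sqlite3 module as a stand-in for SQL Server. The alias spelling `AvgAge` and the sample ages are assumptions for illustration, so the average here is not the 31 seen in the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, Age INTEGER);
INSERT INTO EmployeeDemographics VALUES
    (1001, 'Jim', 30),
    (1002, 'Pam', 30),
    (1003, 'Dwight', 29),
    (1004, 'Angela', 31);
""")

# The alias gives the aggregated column a readable name in the output.
cur = conn.execute("SELECT AVG(Age) AS AvgAge FROM EmployeeDemographics")
avg_column = cur.description[0][0]
avg_age = cur.fetchone()[0]
print(avg_column, avg_age)
```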
86:36 now that we've looked at aliasing column names let's look at
86:37 aliasing table names it basically is the
86:40 exact same thing uh we're just going to
86:42 write as and let's do demo for
86:46 demographics and let's do demo dot and
86:50 it's going to give us all of our options
86:51 and we'll do employee
86:54 ID so when you alias a table name
86:58 when you are selecting in the select
86:59 statement you actually need to preface
87:01 your column name with a table name or
87:04 the table alias dot and then employee ID
87:07 and this is extremely important to do
87:09 especially when you have a lot of joins
87:10 that you're doing or you're selecting a
87:12 lot of columns when you have several
87:14 joins because it can get very very messy
87:16 quick so let's actually join this to
87:21 employee
87:22 salary and let's do that
87:25 on
87:28 demo.EmployeeID is equal
87:36 to s.EmployeeID so now let's do
87:40 demo.EmployeeID comma s dot and let's do
87:43 salary so looking at the script now is
87:46 very clean it is very easy to understand
87:49 and that is what's so important with
87:50 aliasing if for example we took this
87:53 off every time we wanted to reference
87:55 this table we would have to put the
87:57 entire table name and putting the entire
87:59 table name is correct it just is very
88:02 cumbersome and does not look clean at
88:04 all and so using something like demo as
88:06 an alias makes it a lot more easily
88:08 readable and a lot more manageable when
88:10 you're looking at it when you have a
88:12 very long script
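The aliased join described above might look like this. It is a sketch run through Python's sqlite3 module as a stand-in for SQL Server. The alias `Sal` for the salary table is my spelling of the short alias heard in the video, and the sample rows are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT, LastName TEXT);
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT, Salary INTEGER);
INSERT INTO EmployeeDemographics VALUES
    (1001, 'Jim', 'Halpert'),
    (1002, 'Pam', 'Beasley');
INSERT INTO EmployeeSalary VALUES
    (1001, 'Salesman', 45000),
    (1002, 'Receptionist', 36000);
""")

# Each column is prefixed with its table alias, so its source is obvious.
joined_rows = conn.execute("""
    SELECT Demo.EmployeeID, Sal.Salary
    FROM EmployeeDemographics AS Demo
    JOIN EmployeeSalary AS Sal
        ON Demo.EmployeeID = Sal.EmployeeID
    ORDER BY Demo.EmployeeID
""").fetchall()
print(joined_rows)
```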
88:14 let's look at this query where we're joining together three
88:15 separate tables and after each table we
88:17 have an alias for employee demographics
88:19 we have a for employee salary we have b and for
88:22 warehouse employee demographics we have
88:23 c now unfortunately I have seen a lot of
88:25 scripts that look exactly like this and
88:27 this is what you do not want to do you
88:29 do not want to use your aliasing to just
88:31 write an a a b or a c that is very
88:33 frowned upon when writing queries
88:35 because it really doesn't give any
88:36 context to what the table that you're
88:38 referencing is and it gets really
88:40 confusing as this query continues to
88:42 grow and as you add more columns to your
88:44 select statement it makes it more
88:45 difficult to understand where those
88:47 columns are coming from and so when I'm
88:48 reading that I say select a.EmployeeID
88:51 okay what's a a is employee demographics
88:54 so you really do not want to do that now
88:56 let's look at an example of what it
88:57 should look like so for employee
88:59 demographics instead of having an alias
89:01 of a I used demo for demographics for
89:04 employee salary I used s and for
89:06 warehouse employee demographics I used
89:08 ware now this is not perfect by any
89:11 means but in the select statement if
89:12 you're just glancing at it you can
89:14 easily understand which columns are
89:16 coming from which tables so when I look
89:18 at employee ID I know that's coming from
89:20 employee demographics because I have demo as
89:22 the alias so it's a lot easier to
89:24 understand and when you hand this query
89:25 off to somebody it is going to be a lot
89:27 easier for them to read through it and
89:29 understand where those columns and those
89:31 table names are coming from and so they
89:33 will appreciate that in the long run
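To illustrate the descriptive-alias advice, here is a hedged sketch of a three-table join. The on-screen query is not in the transcript, so the selected columns, the LEFT JOINs, the `Sal` and `Ware` alias spellings, and the sample rows are all assumptions. It runs through Python's sqlite3 module as a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeDemographics (EmployeeID INTEGER, FirstName TEXT);
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, JobTitle TEXT);
CREATE TABLE WareHouseEmployeeDemographics (EmployeeID INTEGER, Age INTEGER);
INSERT INTO EmployeeDemographics VALUES (1001, 'Jim'), (1002, 'Pam');
INSERT INTO EmployeeSalary VALUES (1001, 'Salesman'), (1002, 'Receptionist');
INSERT INTO WareHouseEmployeeDemographics VALUES (1001, 30);
""")

# Descriptive aliases (Demo, Sal, Ware) instead of a, b, c: a glance at the
# select list tells you which table each column comes from.
three_table_rows = conn.execute("""
    SELECT Demo.EmployeeID, Demo.FirstName, Sal.JobTitle, Ware.Age
    FROM EmployeeDemographics AS Demo
    LEFT JOIN EmployeeSalary AS Sal
        ON Demo.EmployeeID = Sal.EmployeeID
    LEFT JOIN WareHouseEmployeeDemographics AS Ware
        ON Demo.EmployeeID = Ware.EmployeeID
    ORDER BY Demo.EmployeeID
""").fetchall()
print(three_table_rows)
```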
89:35 will appreciate that in the long run so that is all I got that is aling again
89:37 that is all I got that is aling again not a super tough subject but a really
89:39 not a super tough subject but a really important one to understand especially
89:41 important one to understand especially as you start working in teams and as you
89:43 as you start working in teams and as you start creating more and more complex
89:44 start creating more and more complex queries you want to have it more
89:46 queries you want to have it more organized and more easily readable and
89:48 organized and more easily readable and so it may not come into play with those
89:49 so it may not come into play with those really simple queries but again as as
89:52 really simple queries but again as as you build out those more complex queries
89:54 you build out those more complex queries this becomes very useful I really hope
89:56 this becomes very useful I really hope you enjoyed this video if you did be
89:58 you enjoyed this video if you did be sure to comment and subscribe below
90:00 sure to comment and subscribe below thank you so much for watching and I'll
90:01 thank you so much for watching and I'll see you in the next video what's going
90:03 see you in the next video what's going on everybody welcome back to another
90:04 on everybody welcome back to another intermediate SQL tutorial today we're
90:06 intermediate SQL tutorial today we're going to be covering Partition by now
90:08 going to be covering Partition by now Partition by is often compared to the
90:10 Partition by is often compared to the group by statement the group by
90:12 group by statement the group by statement is a little bit different the
90:14 statement is a little bit different the group by statement is going to reduce
90:15 group by statement is going to reduce the number of rows in our output by
90:17 the number of rows in our output by actually rolling them up and then
90:19 actually rolling them up and then calculating the sums or averages for
90:21 calculating the sums or averages for each group whereas Partition by actually
90:23 each group whereas Partition by actually divides the result set into partitions
90:25 divides the result set into partitions and changes how the window function is
90:27 and changes how the window function is calculated and so the Partition by
90:28 calculated and so the Partition by doesn't actually reduce the number of
90:30 doesn't actually reduce the number of rows returned in our output let's get
90:32 started to look at the actual syntax of
90:33 how to use PARTITION BY and then we'll
90:35 compare it to the GROUP BY statement
90:36 later just to see the differences
90:38 between the two we're going to be using
90:40 these two tables on our left over here
90:41 so I'm going to pull those up really
90:43 quick so let's run this and let's look
90:44 at these two tables side by side well
90:48 one underneath the other really quick so
90:50 what we're going to be using to
90:51 demonstrate PARTITION BY is this
90:53 gender column as well as this salary
90:55 column and so we just need to join these
90:57 two tables together on the employee ID
90:59 and then we'll go from there now I'm not
91:01 going to bore you with that I'm going to
91:02 skip ahead and we'll actually look at
91:03 how to use this PARTITION BY so I've
91:05 joined these two tables together and
91:06 this is our output but we don't want
91:08 every single column I'm going to start
91:09 selecting some of these columns and then
91:11 we'll start using this PARTITION BY and
91:13 see what the output looks like after
91:14 that all right so let's go right up here
91:16 let's choose the first name let's do the
91:19 last name we'll do
91:22 gender and let's do salary and now we
91:26 want to identify how many male and
91:27 female employees we actually have and so
91:29 we're going to say count of
91:33 gender and this is going to be
91:35 over and now we're going to do our
91:37 PARTITION
91:39 BY and we're going to partition
91:41 that by the
91:43 gender as total gender now I'm going to
91:46 come back to why we did each part but I
91:49 want to see the output first and then we'll
91:52 come back to why we wrote it this way so
91:54 let's just run this really
91:57 quick
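The query dictated here could be written roughly as follows. This is a sketch in SQL Server syntax, not a copy of what is on screen: the table names EmployeeDemographics and EmployeeSalary, the aliases, and the exact column names are assumptions inferred from how the tables are described, so adjust them to your own schema.

```sql
-- Assumed schema: EmployeeDemographics(EmployeeID, FirstName, LastName, Gender, ...)
--                 EmployeeSalary(EmployeeID, JobTitle, Salary)
SELECT FirstName,
       LastName,
       Gender,
       Salary,
       -- Count the rows that share this row's Gender, without collapsing rows
       COUNT(Gender) OVER (PARTITION BY Gender) AS TotalGender
FROM EmployeeDemographics AS dem
JOIN EmployeeSalary AS sal
    ON dem.EmployeeID = sal.EmployeeID;
```

Every employee row is kept, and the per-gender count is repeated on each row of that gender.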
92:00 so it's going to be a little bit different than what you typically would
92:02 expect in a GROUP BY statement the GROUP
92:04 BY is going to roll everything up and
92:06 you typically wouldn't have like a first
92:07 name last name in a GROUP BY statement
92:09 because it would be very hard to roll
92:11 all those things up into those
92:13 individual columns and to reduce the
92:14 number of rows that are in your
92:16 output and so in our output we can see
92:18 Pam Beasley she's a female she makes
92:20 $36,000 as a salary and there are three
92:23 total women counting her in
92:26 this employee demographics table and so
92:28 in our total gender column over here
92:30 this is where we use the PARTITION BY
92:31 and if we used a GROUP BY statement to
92:34 get this kind of information all we
92:35 would be able to do
92:37 in a GROUP BY statement is
92:39 say select gender count of gender and
92:42 then GROUP BY the gender down below
92:43 underneath the join
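For comparison, the GROUP BY version described here might look like the sketch below. The table and column names (EmployeeDemographics, EmployeeSalary, EmployeeID, Gender) are assumptions based on the spoken description.

```sql
-- One output row per gender: the individual employee columns are gone
SELECT Gender,
       COUNT(Gender) AS TotalGender
FROM EmployeeDemographics AS dem
JOIN EmployeeSalary AS sal
    ON dem.EmployeeID = sal.EmployeeID
GROUP BY Gender;
```

This returns the same counts as the window function, but only as a summary, with no first name, last name, or salary alongside them.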
92:45 so because we're using the PARTITION BY we're able to
92:47 isolate just one column that we want to
92:49 perform our aggregate function on and so
92:51 we're able to add things like the first
92:52 name and last name columns even though
92:54 we aren't trying to include those in any
92:56 PARTITION BY or GROUP BY statement
92:58 we're still able to add the aggregate
92:59 function to each individual row while
93:01 still maintaining those other columns
93:03 let's take this entire query and let's
93:05 basically just transform it into a GROUP
93:07 BY statement and we'll see kind of what
93:10 that looks like and what the difference
93:11 is so all I'm going to do is get rid of
93:13 all this I'm going
93:16 to copy all of
93:19 this and I'm going to say
93:22 GROUP
93:23 BY and I'm going to do that because we
93:26 have to use all these columns in our
93:28 GROUP BY statement so let's execute this
93:32 and as you can tell we are not able to
93:34 see the output for the aggregate
93:35 function that we were hoping for
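A sketch of the GROUP BY attempt that keeps every column, again with assumed table and column names. Because every selected non-aggregated column has to appear in the GROUP BY, each group is one distinct combination of name, gender, and salary.

```sql
-- Grouping by every selected column: each group is usually a single employee,
-- so the count no longer reflects the per-gender total.
SELECT FirstName,
       LastName,
       Gender,
       Salary,
       COUNT(Gender) AS TotalGender
FROM EmployeeDemographics AS dem
JOIN EmployeeSalary AS sal
    ON dem.EmployeeID = sal.EmployeeID
GROUP BY FirstName, LastName, Gender, Salary;
```

Unless two employees happen to share all four values, TotalGender comes back as 1 on every row, which is why this form cannot reproduce the PARTITION BY output.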
93:37 if we wanted to get the same output that we
93:38 had before where we're showing three for
93:40 females and six for males what we'd have
93:42 to do is get rid of this first and last
93:45 name and the
93:47 salary and do the same thing in the
93:49 GROUP BY
93:50 statement and so let me get rid of these
93:52 really
93:55 quick and run this and so what the
93:59 PARTITION BY is doing is basically
94:01 taking this query right here and
94:03 sticking it on one line in the select
94:05 statement and so I hope now you can see
94:07 how valuable the PARTITION BY can be if
94:09 used correctly thank you guys so much
94:11 for watching I really appreciate it if
94:13 you like this video be sure to like and
94:14 subscribe below and I'll see you in the
94:16 next video
94:17 what's going on everybody welcome back to another SQL tutorial
94:19 today we're going to be talking about
94:20 CTEs
94:22 a CTE is a common table expression and
94:24 it's a named temporary result set which
94:26 is used to manipulate complex
94:28 subquery data now this only exists
94:30 within the scope of the statement that
94:32 we are about to write once we cancel
94:34 out of this query it's like it never
94:35 existed a CTE is also only created in
94:38 memory rather than a tempdb file like a
94:40 temp table would be but in general a CTE
94:43 acts very much like a subquery and so if
94:45 you know how to do subqueries you should
94:47 be able to pick up on CTEs fairly easily
94:49 so let's get started writing our very
94:51 first CTE and we're going to come down
94:52 here and we're going to say WITH and
94:54 we're going to write
94:55 CTE
94:57 employee and we're going to say AS and
95:00 this is where everything's going to
95:01 start now CTEs are sometimes called WITH
95:04 queries I've never personally used that
95:05 but I've seen it called that online
95:07 that's because it uses this WITH
95:09 statement right at the very beginning so
95:11 now we have WITH CTE employee AS then we
95:13 have an open parenthesis and now we have
95:15 to construct our select statement and
95:17 this is kind of where we build out our
95:18 quote unquote subquery and so I'm going
95:20 to take a select statement that I
95:22 actually used in a previous video where
95:24 we were using the PARTITION BY and so I'm
95:26 going to put that in there and I'll kind
95:27 of walk us through what that does and
95:29 how we're going to use this so I'm going
95:31 to paste this down right here and I'm
95:33 actually going to go like this just to
95:36 make it look a little nicer and then I'm
95:38 going to close the parenthesis at the
95:40 end so now we have our CTE in place and
95:43 as you can see it is basically just a
95:44 select statement within the WITH CTE
95:47 employee AS and what this is going to do
95:49 is it's going to take the first name
95:51 last name gender and salary and then
95:53 it's going to take this aggregate
95:54 function with the PARTITION BY and this
95:56 aggregate function with the PARTITION
95:57 BY and it's going to place it to where
95:59 we can now query off of this data so
96:01 it's putting it basically in a temporary
96:03 place where we can then go and grab that
96:05 data so all we're going to do at the
96:07 very bottom is we're going to say select
96:10 everything and we can do that from CTE
96:14 employee so let's run this entire thing
96:17 and see what we
96:19 get
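Put together, the CTE described here might look like the sketch below. The name CTE_Employee follows the spoken "CTE employee"; the table names, column names, and the AvgSalary window function are assumptions (the transcript mentions aggregate functions with PARTITION BY and later selects an average salary, but the pasted query itself is not shown).

```sql
WITH CTE_Employee AS
(
    SELECT FirstName,
           LastName,
           Gender,
           Salary,
           COUNT(Gender) OVER (PARTITION BY Gender) AS TotalGender,
           AVG(Salary)   OVER (PARTITION BY Gender) AS AvgSalary  -- assumed second aggregate
    FROM EmployeeDemographics AS dem
    JOIN EmployeeSalary AS sal
        ON dem.EmployeeID = sal.EmployeeID
)
-- The CTE only exists for this one statement
SELECT *
FROM CTE_Employee;
```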
96:22 so as you can see with this select everything from CTE employee we are
96:24 selecting everything from this select
96:27 statement and so this feels a lot like
96:29 we're actually querying off of
96:31 a temp table but it actually acts a lot
96:33 more like a subquery now we don't have
96:35 to select everything we can just do
96:38 first name and let's do average
96:40 salary and when we run this we'll just
96:43 get those two columns and we don't have
96:45 to go through and actually write this
96:46 out each time it's just in this CTE for
96:49 us so it does all the heavy lifting within
96:51 the CTE and then we can just query off
96:53 of what we want now something to note is
96:55 that the CTE is not stored anywhere
96:57 so it's not stored in some temp database
96:59 somewhere if I try to run just this by
97:01 itself it is not going to work so let's
97:03 try that out really quick and we should
97:05 get an error and that's because each
97:07 time we run this query it is actually
97:08 creating the CTE again and so it's not
97:11 being saved anywhere and so each time we
97:12 run it we have to run it with the entire
97:14 CTE another thing to note is you
97:16 actually have to put the select
97:17 statement right after the CTE if I try
97:20 to go down here and say select
97:21 everything from let's do
97:24 CTE employee it doesn't actually work
97:26 it's not going to come up at all and
97:28 that's because it only is going to work
97:30 with the select statement directly after
97:32 the actual CTE that you've created
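The scoping rule can be sketched like this. The CTE body is abbreviated, and all table, column, and alias names are assumptions based on the spoken description rather than the on-screen query.

```sql
WITH CTE_Employee AS
(
    SELECT FirstName,
           AVG(Salary) OVER (PARTITION BY Gender) AS AvgSalary
    FROM EmployeeDemographics AS dem
    JOIN EmployeeSalary AS sal
        ON dem.EmployeeID = sal.EmployeeID
)
-- Works: this SELECT is part of the same statement as the WITH clause
SELECT FirstName, AvgSalary
FROM CTE_Employee;

-- Does not work: by this point the CTE no longer exists, so SQL Server
-- reports an invalid object name. Left commented out so the batch runs.
-- SELECT * FROM CTE_Employee;
```

Running only the final SELECT without the WITH block above it fails for the same reason: the CTE is rebuilt each time the full statement runs and is never saved.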
97:34 I hope this was helpful and I hope that
97:36 you understand how to use a CTE a little
97:38 bit better again you don't have to go
97:40 super complicated with the select
97:41 statement within your CTE it can be very
97:44 very simple I just wanted to demonstrate
97:46 that you can use aggregate functions
97:47 within your CTE and then just query off
97:50 of those without having to do the
97:51 aggregate function again which I find is
97:53 very very useful again thank you for
97:55 watching if you like this video be sure
97:57 to like and subscribe below and I'll see
97:58 you in the next video
98:00 what's going on everybody welcome back to another SQL
98:02 tutorial today we are looking at temp
98:04 tables and if you can guess it based off
98:06 of the name they're kind of like
98:08 temporary tables and we create them very
98:10 much the same way we're going to do
98:12 CREATE TABLE it's just a little bit
98:14 different and you can hit off of this
98:17 temp table multiple times which you
98:19 cannot do with something like a CTE or a
98:23 subquery where you can only use it one
98:25 time or with a subquery you need to
98:26 write it multiple times within a query
98:30 and so these temp tables are extremely
98:32 useful I'm going to kind of talk about
98:34 how you can use them as we're going
98:36 throughout this video but let's get
98:39 started right away with actually
98:40 creating one looking at it inserting
98:43 some data and kind of showing
98:44 you how temp tables work and what we can
98:47 do with them so we're going to start
98:49 off with CREATE TABLE
98:51 much like a regular table is created
98:54 the only difference is we're going to do
98:55 this pound sign and then we're going
98:57 to do
98:59 temp underscore
99:01 employee so literally the only
99:04 difference between a regular table and a
99:07 temp table is this right here at the
99:09 very beginning this pound sign so
99:12 let's just start by doing employee ID
99:16 we'll make that an integer we'll do job
99:20 title
99:22 and we'll make that a varchar
99:26 100 and then we'll do
99:29 salary and let's make that an
99:31 integer and so now we have our temp
99:35 table let's go ahead and create
99:38 it so now we have our temp table created
99:42 and so we can look at it really
99:45 quick so let's select
99:47 everything from and we'll do temp
99:52 employee so let's take a look it's
99:55 completely empty
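Written out, the temp table described here might look like the following in SQL Server syntax. The exact identifier spellings (#temp_Employee, EmployeeID, JobTitle, Salary) are assumptions based on the spoken names.

```sql
-- The leading # is what makes this a temporary table in SQL Server
CREATE TABLE #temp_Employee (
    EmployeeID int,
    JobTitle   varchar(100),
    Salary     int
);

-- Returns the three column headers and no rows yet
SELECT *
FROM #temp_Employee;
```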
99:58 and we can insert data very much the same way we'd insert
100:00 data into a regular table so let's start
100:03 doing that let's do INSERT
100:07 INTO and we'll do temp
100:10 employee and we'll do
100:14 VALUES and let's just do something
100:16 really quick because I'm going to get to
100:18 a little bit more interesting stuff in a
100:20 second
100:27 so we'll make this person HR that's their job title then for
100:29 salary we'll give them
100:32 45,000 and close it off so let's run
100:37 this and let's select everything again
100:40 and see what's in there
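A sketch of the single-row insert, assuming a #temp_Employee table with EmployeeID, JobTitle, and Salary columns already exists in the session. The employee ID typed on screen is not captured in the transcript, so 1001 below is a purely hypothetical placeholder.

```sql
-- 1001 is a hypothetical ID; 'HR' and 45000 are the values mentioned in the walkthrough
INSERT INTO #temp_Employee (EmployeeID, JobTitle, Salary)
VALUES (1001, 'HR', 45000);

SELECT *
FROM #temp_Employee;
```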
100:44 perfect so we were able to insert data into this temp
100:46 table and again we don't have to
100:48 create this every single time
100:51 or we don't have to run this every
100:52 single time we need to hit off of it
100:53 like we did with a CTE if you watched my
100:55 previous video with this one we can just
100:57 run it and it sits there and so again
101:01 it feels very much like a real table and
101:03 I'm going to get to a little bit of the
101:04 nuances and the differences
101:06 between a regular table and a temp table
101:08 in a second but really quickly
101:12 we want more data in there you don't
101:13 have to just do it value by value we
101:19 can also just do
101:22 where we select all of the data
101:25 from a specific table and insert that
101:27 into a temp table and that is
101:30 you know how I do it most of the
101:33 time most of the time I'm not inserting
101:36 values I am you know taking a large
101:41 table and taking a subset of that and
101:44 then sticking it into a temp table so
101:47 let's look at this really
101:49 quick
101:51 and run that so now we took all of the
101:55 data from employee salary and then we
101:59 just stuck it into this table
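Loading the temp table from an existing table could look like this. It assumes #temp_Employee already exists and that the source table is named EmployeeSalary with EmployeeID, JobTitle, and Salary columns; the walkthrough selects everything, while this sketch lists the columns explicitly so it does not depend on column order.

```sql
-- Copy rows from the (assumed) EmployeeSalary table into the temp table
INSERT INTO #temp_Employee (EmployeeID, JobTitle, Salary)
SELECT EmployeeID, JobTitle, Salary
FROM EmployeeSalary;
```

Adding a WHERE clause to the SELECT is one way to load only the subset of a large table that later queries need.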
102:02 and really quickly this is one of the big uses of a
102:06 temp table let's say for
102:08 example that this employee salary table
102:10 had a billion rows or just an
102:13 extremely large number and we were
102:15 trying to you know hit a somewhat
102:18 complex query off of it where we're
102:20 using joins and we're using
102:22 maybe some window functions or different
102:24 things you know it would take a very
102:27 long time to hit off of this but what we
102:29 can do is we could insert that data into
102:34 this temp table and then we can hit off
102:35 the temp table and it already has that
102:37 subsection of data that
102:40 we're wanting to use for all of our
102:42 later queries so really quickly that's
102:45 kind of a use case for that
102:49 so let's go down here we're going to
102:50 kind of create another one and this
102:52 one's going to be a little bit more
102:53 advanced a little bit of how I would
102:55 actually use a temp table above was just
102:58 kind of showing the basic syntax how you
103:00 kind of put data into it you know kind
103:02 of how it's used now I'm going to show
103:04 you kind of how I would actually use it
103:07 so let's do CREATE TABLE let's do
103:10 temp
103:11 oops CREATE
103:13 TABLE let's do
103:16 temp employee
103:19 2 and then let's do open parenthesis and
103:23 we'll do job title and we'll make that a
103:27 varchar
103:29 50 and then we can do
103:33 employees per job we'll make that an
103:37 integer now we need average age make
103:40 that an integer and the very last one
103:43 will be average salary I'll make that an
103:46 integer as well and let's run this
103:52 so we have our second table now we want
103:56 to insert data into this one so we're
103:59 just going to do INSERT
104:01 INTO and we'll do temp employee 2 and
104:06 for this one I'm going to take a query
104:09 that we used in a previous video and so
104:12 I'm just going to copy and paste that to
104:13 save time and then we'll keep on
104:16 moving from there all right so I'm just
104:18 going to paste that in we will run this
104:21 and really all it's doing is from
104:24 these tables it's taking the job title
104:26 we're getting a count on the job title
104:27 average age and average salary
104:30 and that is it
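The second temp table and its aggregated insert might be sketched as follows. The pasted query is not shown in the transcript, so the SELECT below is a reconstruction from the spoken summary (job title, count per job title, average age, average salary), and every table and column name is an assumption.

```sql
CREATE TABLE #temp_Employee2 (
    JobTitle        varchar(50),
    EmployeesPerJob int,
    AvgAge          int,
    AvgSalary       int
);

-- Store the joined, pre-aggregated result once so later queries can reuse it
INSERT INTO #temp_Employee2 (JobTitle, EmployeesPerJob, AvgAge, AvgSalary)
SELECT sal.JobTitle,
       COUNT(sal.JobTitle),
       AVG(dem.Age),
       AVG(sal.Salary)
FROM EmployeeDemographics AS dem
JOIN EmployeeSalary AS sal
    ON dem.EmployeeID = sal.EmployeeID
GROUP BY sal.JobTitle;

SELECT *
FROM #temp_Employee2;
```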
104:34 so let's see if that worked which it looks like it did but you know let's
104:37 actually take a look at the
104:40 data and so now we have this subsection
104:44 of data from this join
104:47 above and what this is going to do is
104:50 whenever we want to run this we don't
104:53 have to run it on these two tables and
104:56 create the join and then do the
104:58 calculations which takes time what it's
105:01 going to do is it's going to take
105:03 these exact values and place them into
105:05 this temporary table and if we want to
105:07 run further calculations on these values
105:10 we can easily do that in a fraction of
105:12 the time instead of having to run this
105:14 every single time which will take up so
105:16 much processing power and it will
105:19 reduce your runtime dramatically when
105:21 you're placing this data in this temp
105:23 table and hitting off of that instead of
105:25 all these joins and everything above
105:28 all these joints and everything above uh a lot of times these temp tables are
105:30 a lot of times these temp tables are used in store procedures now if you
105:31 used in store procedures now if you haven't learned about store procedures
105:33 haven't learned about store procedures or used stor procedures at all you know
105:36 or used stor procedures at all you know that's okay I still want to show you
105:38 that's okay I still want to show you something that might be useful um
105:40 something that might be useful um although this is used a ton in store
105:42 although this is used a ton in store procedures so for example let's say we
105:45 procedures so for example let's say we have a store procedure set up we run the
105:47 have a store procedure set up we run the store procedure and we get an output and
105:49 store procedure and we get an output and you know we for whatever reason want to
105:51 you know we for whatever reason want to run it again and when we run it again uh
105:54 run it again and when we run it again uh we get this error and you know this temp
105:57 we get this error and you know this temp table lives somewhere it it doesn't live
106:01 table lives somewhere it it doesn't live in an actual in the actual database uh
106:03 in an actual in the actual database uh but it lives somewhere and so when we
106:05 but it lives somewhere and so when we run it again we get an error because
106:07 run it again we get an error because there's already a temp table created one
106:10 trick or one little tip that I would
106:12 give is doing something like this saying
106:16 drop table oops I don't know why I did
106:18 so many spaces drop table if
106:23 exists and we'll do temp employee
106:28 2 just like that now what this is going
106:32 to do is when you're running that stored
106:34 procedure over and over and over again
106:35 you're getting an error or for
106:37 whatever reason you need to run it
106:38 multiple times every time that you run
106:42 it it's going to encounter this and so
106:45 if that already exists it is going to
106:48 delete that table and then allow you to
106:51 create it again and this is just a
106:54 really good thing to do so now if you
106:58 see down below I can run this time and
107:00 time and time again and it is going to
107:03 work every single time because it is
107:04 checking to see if that exists and if it
107:07 does it deletes it and then I can create
107:09 it again and so that is just a helpful tip
107:13 if you're going to try to use this I
107:15 highly recommend adding that to your
107:17 query just to make sure things run
107:18 smoothly
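The tip above can be sketched roughly as follows. This is a minimal T-SQL example, assuming SQL Server 2016 or later (where `DROP TABLE IF EXISTS` is available). The transcript only says "temp employee 2", so the table name `#temp_employee2` and its columns are assumptions made for illustration.

```sql
-- Assumed temp table name and columns; adjust to match your own script.
-- Dropping first means the script can be re-run without an
-- "object already exists" error.
DROP TABLE IF EXISTS #temp_employee2;

CREATE TABLE #temp_employee2 (
    JobTitle        VARCHAR(50),
    EmployeesPerJob INT,
    AvgAge          INT,
    AvgSalary       INT
);
```

On older SQL Server versions the same effect is usually achieved with `IF OBJECT_ID('tempdb..#temp_employee2') IS NOT NULL DROP TABLE #temp_employee2;`.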
107:20 I know there is a lot more that can go into temp tables a lot more of
107:22 the technical aspects or the DBA stuff
107:25 obviously I just want to teach you
107:26 how to use it and what you might use it
107:28 for and how to actually write it out but
107:31 you know there are a lot more things
107:32 that you can research about
107:34 processing speed and storage but unless
107:37 you are something like a DBA you
107:39 probably don't need to worry about those
107:41 things and so if you are a DBA I do
107:44 recommend looking into those things
107:46 making sure you understand how that
107:47 works how this data is stored so that
107:50 when people use them or you are using
107:52 them you know what's going on in the
107:54 background but for getting up and
107:55 running with temp tables I hope that
107:57 this was helpful thank you guys so much
107:59 for watching I really appreciate it if
108:01 you like this video be sure to like and
108:03 subscribe below and I'll see you in the
108:04 next
108:17 [Music] video what's going on everybody welcome
108:19 back to another SQL tutorial today we're
108:21 going to be looking at string functions
108:24 some of the things that we're going to
108:25 be looking at are things like trim
108:27 replace substring and upper and lower
108:29 we're going to create a new table insert
108:32 a little bit of bad data into it and
108:34 then we're going to be using that to
108:36 work on our string functions today so I
108:39 already have this set up right here
108:41 I'm going to put this in the GitHub so that
108:43 you can just download this you don't
108:44 have to you know type this out manually
108:46 so go look in the description if you
108:48 just want to get that off the
108:50 GitHub and download that and copy and
108:52 paste it save you a little bit of time
108:54 but let's go ahead and run this really
108:56 quick and as you can see in this table
109:00 we have our data right here give me
109:02 one second so in this employee errors
109:07 table basically what we have actually
109:08 let me pull this back up basically what
109:11 we have is in this first one we have
109:15 here we go we have some basically
109:18 blank spaces on the right side the
109:20 second one some blank spaces on the left
109:22 side we also have Jimbo which is an
109:25 error because his name is Jim and
109:27 Halbert because his name is actually
109:30 Halpert and then for Toby for
109:33 whatever reason that o is capitalized
109:34 and then Michael got in here and
109:38 added this extra part so we're going to
109:40 have to figure out a way to take that
109:41 out when we're doing our query and
109:43 that'll come in a little bit later I
109:44 think in the substring section so let's
109:48 get into it right away
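For readers without the GitHub file, the practice table described above might look something like this. It is a sketch reconstructed from the narration only: the table and column names, the column types, and the exact values (including the third employee ID) are assumptions, not the course's actual script.

```sql
-- Assumed reconstruction of the practice table; values are inferred from
-- the narration, not copied from the course's GitHub file.
-- Note: this drops any existing table with the same name.
DROP TABLE IF EXISTS EmployeeErrors;

CREATE TABLE EmployeeErrors (
    EmployeeID VARCHAR(50),
    FirstName  VARCHAR(50),
    LastName   VARCHAR(50)
);

INSERT INTO EmployeeErrors (EmployeeID, FirstName, LastName) VALUES
    ('101  ', 'Jimbo',  'Halbert'),             -- trailing spaces; wrong first and last name
    ('  102', 'Pamela', 'Beasely'),             -- leading spaces
    ('105',   'TOby',   'Flenderson - Fired');  -- stray capital O; extra "- Fired" text

SELECT *
FROM EmployeeErrors;
```

The ID column is stored as text here because an integer column could not hold the leading and trailing spaces the lesson relies on.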
109:50 let's start using our left trim and
109:54 right trim we're going to kind of go
109:55 through each one pretty quickly
109:57 hopefully I'm not trying to make
109:59 this a super long video because we got a
110:01 lot of things to get through in this one
110:03 video so I'm going to go through the
110:05 trim right trim and left trim let's look
110:07 at the employee ID because that's the
110:10 one where we have some blank spaces on
110:12 the right and the left side the left
110:14 side obviously you're
110:15 going to see that one much easier but
110:18 let's start walking through this so
110:19 let's do
110:20 select employee ID and before we get any
110:24 further let me just get the employee
110:27 errors on here
110:31 so that we can see everything as it
110:33 comes up so we're just going to do trim
110:37 and then type in the column that we want
110:40 to take these blank spaces out of
110:43 that's what the trim does the trim gets
110:45 rid of blank spaces on either the front
110:48 or the back or the left and the right
110:50 side so on both sides that's what trim
110:52 does and we'll say as ID trim so let's
110:56 run this one really quick and as you can
110:59 see this is our regular employee ID and
111:02 so you know you can't visually see it as
111:04 easily on this first one but there are
111:06 blank spaces after this 101 and we got
111:10 rid of those and then there were blank
111:11 spaces before the 102 and we got rid of
111:15 those now I'm just going to copy this
111:18 two times because it's basically the
111:21 exact same thing but I'm going to
111:24 show you them all at the same time so
111:26 it's the exact same thing except ltrim
111:28 and rtrim and let's take a look
111:30 at all these at the same
111:32 time and let me pull it up so let me
111:37 see if I can get these all in here okay
111:39 in the trim it got rid of both the left
111:41 and the right side so all of these were
111:44 fixed in the employee ID for the left
111:48 trim we're only going to be
111:50 getting rid of this one this one still
111:52 has blank spaces on it and when we do
111:55 the right trim we're only going to get
111:57 rid of the stuff on the right side so
111:59 this one doesn't change because this is
112:01 on the left hand side where the blank
112:03 spaces are so this one was fixed again
112:05 not super visual so you can't really see
112:07 it but that one is fixed
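The three trim functions can be compared side by side with a query along these lines. It is a sketch that assumes SQL Server 2017 or later (for `TRIM`) and uses inline sample values in place of the EmployeeErrors table, so the IDs shown are illustrative. The square brackets are an addition for this example so that the otherwise invisible spaces show up in the results.

```sql
-- Inline sample rows stand in for EmployeeErrors (assumed values).
SELECT '[' + EmployeeID        + ']' AS OriginalID,
       '[' + TRIM(EmployeeID)  + ']' AS IDTRIM,   -- removes spaces on both sides
       '[' + LTRIM(EmployeeID) + ']' AS IDLTRIM,  -- removes leading (left) spaces only
       '[' + RTRIM(EmployeeID) + ']' AS IDRTRIM   -- removes trailing (right) spaces only
FROM (VALUES ('101  '), ('  102'), ('105')) AS e(EmployeeID);

-- '101  ' -> TRIM [101], LTRIM [101  ], RTRIM [101]
-- '  102' -> TRIM [102], LTRIM [102],   RTRIM [  102]
```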
112:11 let's move on to the next part which is using
112:16 replace so for this one we're going to
112:18 be looking at the last name so let's go
112:22 back up really quick to the employee
112:25 errors as you can tell the last name
112:29 is the biggest one where we kind of want
112:31 to take something out because we
112:33 don't want that dash fired still
112:37 in there we're going to replace that and
112:40 so let's look at how to do that let
112:43 me
112:44 just copy this real quick and get rid of
112:47 this top part so we're going to do
112:50 the last name so let's just start off
112:51 with our last name just as a
112:54 baseline so we can see what it looks
112:55 like before and then we'll do replace
112:58 and all we're going to specify is the
113:00 column that we want to do the
113:03 replacing in we're going to specify the
113:06 value that we want to replace so in this
113:08 it's going to be dash fire oops got a
113:11 little aggressive on that one dash
113:13 fired and we're going to indicate what
113:16 we want to replace it with now I'm just
113:18 going to replace it with blank
113:20 and we can say as last name
113:24 fixed so let's see what this looks like
113:27 really
113:28 quick and it looks like it worked so in
113:31 this last name it originally had
113:32 Flenderson dash fired and when we
113:36 took that dash fired and
113:38 replaced it with basically nothing it
113:41 then fixed it and so now it looks
113:44 correct
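A rough sketch of the replace step described above, using inline sample values rather than the real table. The exact stored text `'- Fired'` is an assumption based on the narration ("dash fired"); the `RTRIM` variant is an addition for this example, since removing `'- Fired'` on its own leaves the space that preceded the dash.

```sql
-- Inline sample rows stand in for EmployeeErrors.LastName (assumed values).
SELECT LastName,
       REPLACE(LastName, '- Fired', '')        AS LastNameFixed,        -- 'Flenderson ' (trailing space remains)
       RTRIM(REPLACE(LastName, '- Fired', '')) AS LastNameFixedTrimmed  -- 'Flenderson'
FROM (VALUES ('Halbert'), ('Beasely'), ('Flenderson - Fired')) AS e(LastName);
```

Rows that do not contain the search text come back unchanged, which is why the other last names are unaffected.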
113:46 all right let's move on to the next one I think this one might be
113:49 the longest one to write but that is
113:51 the
113:52 substring and let me take this real
113:55 quick trying to save us some time so
113:59 substring is
114:01 very unique you can
114:05 specify in either a number or a
114:10 string the place that
114:12 you want to start and then you can also
114:14 specify how many characters you want to
114:15 go out and it pulls that in
114:19 so just as a really quick example and
114:22 then I'm going to show you kind of a use
114:24 case for this one that I think is pretty
114:25 cool that you know
114:28 maybe let me
114:30 see that maybe you'd find useful
114:34 so I'm going to do first name and then
114:36 I'm just going to do one comma three so
114:39 it's going to take the first name it's
114:40 going to start at the very
114:44 first letter or number and it's going to
114:46 go forward three spaces or three
114:49 spots so let's just take a look at what
114:51 that looks like so for our table it's
114:54 going to take Jim Pam and TOb
114:57 for TOby and so it's only going to
115:00 take the first three because you're
115:03 starting at number one now what if we
115:04 started at three so we do three comma three
115:08 it's going to go to the
115:10 third digit or third letter and
115:15 then it's going to go forward three so
115:17 you kind of get a sense of how this
115:19 works
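The two substring calls walked through above can be sketched like this, again with inline sample first names standing in for the table (the values are assumed from the narration). In T-SQL the arguments are the expression, the 1-based start position, and the number of characters to take.

```sql
-- Inline sample rows stand in for EmployeeErrors.FirstName (assumed values).
SELECT FirstName,
       SUBSTRING(FirstName, 1, 3) AS FirstThree,  -- Jimbo -> Jim, Pamela -> Pam, TOby -> TOb
       SUBSTRING(FirstName, 3, 3) AS FromThird    -- Jimbo -> mbo, Pamela -> mel, TOby -> by
FROM (VALUES ('Jimbo'), ('Pamela'), ('TOby')) AS e(FirstName);
```

When fewer characters remain than the requested length, as with `TOby` starting at position three, the function simply returns what is left.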
115:22 now I'm going to show you something that I think is very
115:24 interesting that I think you guys will
115:27 also find interesting let me fix that
115:30 because I just messed it up so if you've ever
115:32 heard of something called fuzzy matching
115:35 now if you don't know what fuzzy
115:36 matching is I'll give you an example
115:38 let's say in one table my name is Alex
115:40 and in another table my name is
115:42 Alexander if we tried to join those two
115:44 together based off of my name they will
115:47 not join because one is Alex and one is
115:49 Alexander they're not an
115:50 exact match but if I take the
115:53 substring and start at position one and
115:55 move forward four characters it's going
115:57 to take Alex from both and then it will
115:59 match them together and say that they
116:02 are the same so you know it may not
116:05 be perfect that's why it's called a
116:08 fuzzy match because it can work for a
116:10 large majority of the time but it's not
116:12 going to work every single time and so I
116:15 want to show you how we can use this
116:17 here really quick I need to join this
116:21 to the demographics table so I'm
116:25 going to do that really
116:27 quick bear with me for just one
116:31 second let's try to make this at least
116:33 look somewhat good so what I'm going to
116:35 do is I'm going to start off by saying
116:39 let's tie it to the first name
116:42 let's do whoops let's do err.FirstName
116:45 is equal to the demographics table first
116:48 name okay so I want to see and I'm just
116:51 going to do first name for
116:55 err and let's do dem.FirstName so
117:01 let's see what comes up when we do it
117:03 like this so the only one that is going
117:05 to work is Toby and that's because even
117:08 though it has a capital O it's still
117:10 going to take it
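The exact-match join described above might look like the sketch below. The `err` and `dem` aliases follow the narration, but the two common table expressions and their values are stand-ins assumed for this example rather than the real EmployeeErrors and EmployeeDemographics tables.

```sql
-- Inline stand-ins for the two tables (assumed values).
WITH err AS (
    SELECT * FROM (VALUES ('Jimbo'), ('Pamela'), ('TOby')) AS v(FirstName)
),
dem AS (
    SELECT * FROM (VALUES ('Jim'), ('Pam'), ('Toby')) AS v(FirstName)
)
SELECT err.FirstName AS ErrFirstName,
       dem.FirstName AS DemFirstName
FROM err
JOIN dem
    ON err.FirstName = dem.FirstName;  -- exact comparison: Jimbo/Jim and Pamela/Pam do not match
```

`TOby` still pairs with `Toby` only when the database uses a case-insensitive collation, which is the usual SQL Server default; under a case-sensitive collation this join would return no rows at all.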
117:14 so you know we want to get all of them to match and we can
117:17 do that but it's going to be a little
117:19 bit of a different way than maybe is
117:21 perfect but that's why they call it
117:23 fuzzy matching so we're going to use
117:24 substring on this so I'm going to say
117:27 substring oops let me type that right so I'm
117:30 going to say substring and we're going
117:32 to go one three so starting at the first
117:34 position and going forward three
117:36 and we're going to do the exact same
117:38 thing on the oops substring it'd be
117:41 great if I could spell that correctly
117:43 we're going to do the exact same thing
117:44 so one and
117:47 three so we are actually going
117:50 to take this give me a second missed
117:54 that we're going to take this up here
117:57 and we're just going to go like that and
118:01 why did I copy it with the error okay
118:04 so let's run this really
118:06 quickly and as you can see it is now
118:09 going to match all of them and you can
118:11 do this on a lot of different things
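A minimal sketch of the substring-based fuzzy join just described. As before, the `err` and `dem` common table expressions and their values are assumptions standing in for the real tables.

```sql
-- Inline stand-ins for the two tables (assumed values).
WITH err AS (
    SELECT * FROM (VALUES ('Jimbo'), ('Pamela'), ('TOby')) AS v(FirstName)
),
dem AS (
    SELECT * FROM (VALUES ('Jim'), ('Pam'), ('Toby')) AS v(FirstName)
)
SELECT err.FirstName AS ErrFirstName,
       dem.FirstName AS DemFirstName,
       SUBSTRING(err.FirstName, 1, 3) AS MatchKey
FROM err
JOIN dem
    -- Compare only the first three characters of each name.
    ON SUBSTRING(err.FirstName, 1, 3) = SUBSTRING(dem.FirstName, 1, 3);
```

With a case-insensitive collation all three rows pair up (Jimbo with Jim, Pamela with Pam, TOby with Toby). The trade-off is that any two names sharing their first three letters will also pair up, which is why the match is called fuzzy.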
118:14 typically when I'm doing a fuzzy match
118:16 like this I'm not just going to do it on
118:18 a first name right because
118:20 there can be a ton of people named Jim
118:23 you know we want to do it on and
118:27 real quick let me actually show you
118:31 what the originals looked like just
118:35 to make sure I get the point
118:38 across and that is going to be first
118:42 name and comma all right so real quick
118:45 let's actually look at this so it
118:47 originally was Jimbo Pamela and Toby
118:51 in this one it was Jim Pam and Toby and so
118:54 when we just took the first three
118:57 because it was Jimbo it then becomes Jim
119:00 it was Pamela it becomes Pam now it
119:02 matches and so that's kind
119:04 of the example that we're going
119:06 for like I was saying I typically will
119:09 not just filter on a first name because
119:11 there's going to be a ton of people
119:12 named Alex or Jim or you know
119:15 Henry or whatever you're going to do
119:17 this on many different things so I would
119:19 be doing it on things like if I'm
119:21 trying to do a fuzzy match on a person I'd
119:23 do it on their gender to make sure that
119:25 their gender is the same and I
119:27 wouldn't probably need to use a
119:29 substring for that but just to kind
119:31 of give you a little bit more
119:33 information I need to do it on the last
119:35 name so I need to use that substring
119:38 again and I would probably do it on the
119:39 age
119:40 oops what am I doing come on the age
119:45 and the date of birth okay so all of
119:48 those things if you fuzzy match
119:50 on the first name and the last name and
119:53 then the gender the age and the date of
119:55 birth are all the same then you can
119:57 typically get a very high accuracy in
120:01 matching people across
120:04 tables whether or not you have you know
120:06 this is an example if you don't have
120:08 like an employee ID which is what we do
120:10 have but take for example we were not
120:12 given that this is a way to match
120:14 them using substrings
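The multi-column idea above could be sketched as follows. Everything about the data here is hypothetical: the narration lists the fields (first name, last name, gender, age, date of birth) but does not show them, so the column names and all values are invented for illustration.

```sql
-- Hypothetical stand-ins for two tables that share no employee ID.
WITH err AS (
    SELECT * FROM (VALUES
        ('Jimbo',  'Halbert',            'Male',   30, CAST('1993-02-14' AS DATE)),
        ('Pamela', 'Beasely',            'Female', 30, CAST('1993-06-01' AS DATE)),
        ('TOby',   'Flenderson - Fired', 'Male',   32, CAST('1991-09-30' AS DATE))
    ) AS v(FirstName, LastName, Gender, Age, DateOfBirth)
),
dem AS (
    SELECT * FROM (VALUES
        ('Jim',  'Halpert',    'Male',   30, CAST('1993-02-14' AS DATE)),
        ('Pam',  'Beasley',    'Female', 30, CAST('1993-06-01' AS DATE)),
        ('Toby', 'Flenderson', 'Male',   32, CAST('1991-09-30' AS DATE))
    ) AS v(FirstName, LastName, Gender, Age, DateOfBirth)
)
SELECT err.FirstName, err.LastName, dem.FirstName, dem.LastName
FROM err
JOIN dem
    ON  SUBSTRING(err.FirstName, 1, 3) = SUBSTRING(dem.FirstName, 1, 3)  -- fuzzy on first name
    AND SUBSTRING(err.LastName, 1, 3)  = SUBSTRING(dem.LastName, 1, 3)   -- fuzzy on last name
    AND err.Gender      = dem.Gender                                     -- exact on the rest
    AND err.Age         = dem.Age
    AND err.DateOfBirth = dem.DateOfBirth;
```

Each extra exact condition narrows the candidates, so two different people who happen to share the first letters of their names are much less likely to be paired.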
120:18 let's move on to upper and lower all upper and lower is
120:21 going to do is basically take all the
120:24 characters in the text and make them
120:27 either upper or make them lower so it's
120:30 very
120:31 self-explanatory let me copy this up
120:35 here and we will get going on this
120:38 one let's just look at the first name
120:43 specifically we're going to be
120:44 looking at Toby right here so let's do
120:48 first name
120:49 let's do lower and all we have to do
120:53 is put in the column that we want to
120:57 do so this is our original first name
121:01 and it then takes every single string
121:06 that is in here or every single I guess
121:08 character and it makes it lowercase
121:12 that's all it does and it is the
121:14 exact opposite when we do upper so we
121:18 can now take a look at this one and
121:21 now everything's going to be capitalized
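Both case functions can be seen together in a query like this one, which again uses inline sample first names (assumed values) instead of the real table.

```sql
-- Inline sample rows stand in for EmployeeErrors.FirstName (assumed values).
SELECT FirstName,
       LOWER(FirstName) AS FirstNameLower,  -- TOby -> toby
       UPPER(FirstName) AS FirstNameUpper   -- TOby -> TOBY
FROM (VALUES ('Jimbo'), ('Pamela'), ('TOby')) AS e(FirstName);
```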
121:23 so there is a lot that you can do with
121:25 these string functions and this is not
121:26 all the string functions that there are
121:28 there are a lot more but I would say
121:30 that these are the more popular more
121:31 useful ones that I typically use on a
121:34 regular basis and so I hope that this
121:37 has been helpful I hope that you learned
121:38 something from this if you did be sure
121:40 to like and subscribe below I have a lot
121:43 more videos coming out with tutorials on
121:45 everything from SQL python Tableau and
121:48 Excel
121:49 thank you so much for joining me I
121:51 appreciate it and I will see you in the
121:52 next
122:05 [Music] video what's going on everybody welcome
122:07 video what's going on everybody welcome back to another SQL tutorial today we
122:09 back to another SQL tutorial today we are talking about stored procedures now
122:12 are talking about stored procedures now what is a store procedure a store
122:14 what is a store procedure a store procedure is a group of SQL statements
122:16 procedure is a group of SQL statements that has been created and then stored in
122:18 that has been created and then stored in that database a store procedure can
122:19 that database a store procedure can accept input parameters and we will be
122:21 accept input parameters and we will be looking at that today but that means
122:23 looking at that today but that means that a single store procedure can be
122:25 that a single store procedure can be used over the network by several
122:26 used over the network by several different users uh and we can all be
122:28 different users uh and we can all be using different input data a store
122:30 using different input data a store procedure will also reduce Network
122:32 procedure will also reduce Network traffic and increase the performance and
122:34 traffic and increase the performance and lastly if we modify that store procedure
122:36 lastly if we modify that store procedure everyone who uses that store procedure
122:38 everyone who uses that store procedure in the future will also get that update
122:40 in the future will also get that update let's start writing out the store
122:41 let's start writing out the store procedure so we can look at the syntax
122:43 procedure so we can look at the syntax we'll start off very simple and then in
122:44 we'll start off very simple and then in the next one we'll get a little bit more
122:46 the next one we'll get a little bit more complicated so the very first thing that
122:48 complicated so the very first thing that you need to write is create and then
122:51 you need to write is create and then procedure and after that you're going to
122:53 procedure and after that you're going to name it so let's just call this one
122:56 name it so let's just call this one test and all you're going to say is as
122:59 test and all you're going to say is as and then you're going to write your
123:00 and then you're going to write your query and so let's just do select
123:04 query and so let's just do select everything from employee
123:09 everything from employee demographics and that is it we have
123:11 demographics and that is it we have created our very first store procedure
123:12 of course this is super super simple but
123:14 let's execute this really quick and take
123:16 a look at
123:17 it so it says that the commands
123:19 completed successfully let's go over to
123:22 our SQL tutorial we're going to go over
123:24 to
123:25 programmability stored procedures and it
123:27 is not showing up there what we need to
123:29 do is we need to refresh our stored
123:31 procedures we're just going to go right
123:33 here we're going to click refresh and
123:35 then there is our stored procedure now
123:37 how do you actually use the stored
123:38 procedure that we just created so let's
123:40 go right down here and let's say exec which
123:45 means execute and then all we're going
123:47 to say is test and we're going to
123:50 run
123:51 this and there we go so all we put in
123:54 this stored procedure was a select
123:56 statement and so when we actually
123:57 ran the stored procedure it returned
124:00 our select statement
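Calling the procedure is a one-liner. This sketch assumes the procedure was created under the name Test, as in the walkthrough.

```sql
-- EXEC is shorthand for EXECUTE; either keyword runs the stored procedure.
EXEC Test;
```

Because the procedure body is a single SELECT, the call returns whatever that SELECT returns.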
124:02 now let's go down here and we're going to make it a little
124:03 bit more complicated we're going to do
124:05 the exact same thing and create stored
124:09 procedure make sure I spelled that right
124:12 and let's call this
124:14 Temp_Employee so if you remember from
124:17 a previous video we worked on temp
124:19 tables and we created our temp tables
124:21 then inserted data into that we are
124:22 going to add that to this stored procedure
124:24 so we can see the difference between a
124:26 simple query versus a little bit more
124:27 complicated query so I'm going to say as
124:30 and then I'm going to insert that in
124:32 here now what this is doing is I'm
124:34 creating a table and then right down
124:37 here I'm inserting into that table now if I
124:39 create this stored procedure and then
124:41 execute it nothing is actually going to
124:43 be returned it will insert the data into
124:45 that temp table but since I don't have a
124:47 select statement in this
124:48 procedure nothing will be returned so
124:51 let's write
124:52 select everything and we'll just do
124:56 from and this is temp
124:59 employee and right here and so now let's
125:02 create our stored
125:04 procedure so that created successfully
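The pasted temp-table code is never read out, so the following is only a sketch of what such a procedure could look like. The column list, the INSERT query, and the table name EmployeeSalary are assumptions; only the procedure name, the temp table idea, and the final SELECT come from the walkthrough.

```sql
-- Sketch only: columns and the INSERT query are assumed for illustration.
CREATE PROCEDURE Temp_Employee
AS
CREATE TABLE #temp_employee (
    JobTitle        varchar(100),
    EmployeesPerJob int,
    AvgSalary       int
);

INSERT INTO #temp_employee
SELECT JobTitle, COUNT(JobTitle), AVG(Salary)
FROM EmployeeSalary
GROUP BY JobTitle;

-- Without this final SELECT the procedure would still run and fill the
-- temp table, but nothing would be returned to the caller.
SELECT *
FROM #temp_employee;
GO  -- batch separator used by SSMS/sqlcmd, not a T-SQL statement

EXEC Temp_Employee;
```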
125:07 let's refresh over
125:09 here and let's execute this so let's
125:12 just go down right
125:14 here and say execute and it's going to
125:17 be temp
125:19 employee and now we will execute
125:22 this and there is our output now really
125:25 quick let's go into temp employee and we
125:28 actually want to change this stored
125:29 procedure so we're going to go over to
125:31 modify so when we modify it a few things
125:34 are going to show up on your screen the
125:36 first thing that you're going to see is
125:37 it says use SQL tutorial so it's just
125:39 specifying the database the next two
125:41 things you may not be as familiar with
125:43 it's SET ANSI_NULLS and then SET
125:46 QUOTED_IDENTIFIER if you don't know what these
125:47 are it's not super important the first
125:49 one just talks about how to deal with
125:51 nulls when you're using the where
125:52 statement and then the quoted identifier
125:55 just talks about how it uses quotes in
125:57 the actual query itself again not super
125:59 important but they have those
126:00 automatically turned on
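To make those two settings a little more concrete, here is a small sketch. The database name SQLTutorial, the column names Age and FirstName, and the value 'Jim' are assumptions used purely for illustration.

```sql
-- Header similar to what the Modify option generates (database name assumed).
USE SQLTutorial;
GO
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO

-- With ANSI_NULLS ON, "= NULL" never evaluates to true, so this matches nothing.
SELECT * FROM EmployeeDemographics WHERE Age = NULL;

-- IS NULL is the way to test for missing values.
SELECT * FROM EmployeeDemographics WHERE Age IS NULL;

-- With QUOTED_IDENTIFIER ON, double quotes delimit identifiers (column or
-- table names) and single quotes delimit string literals.
SELECT "FirstName" FROM EmployeeDemographics WHERE FirstName = 'Jim';
```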
126:03 let's go down a little bit further and we're going to
126:04 look at the alter procedure so we
126:07 created our stored procedure but now we
126:08 want to alter it so this is the alter
126:12 procedure and we are going to add a
126:14 parameter to this so what the parameter
126:16 is going to allow us to do is when we're
126:18 actually executing the stored procedure
126:20 we can specify an input into that stored
126:23 procedure so that we get a specific
126:24 result back and I'm going to show you
126:26 what I mean by that in just a second but
126:28 let's actually add our input and we're
126:30 going to say @JobTitle
126:34 and we need to specify the data
126:37 type that that is going to be so let's
126:39 just say
126:41 nvarchar(100) I know below it says varchar(100)
126:44 but that's um not extremely important so
126:47 this is going to be our input so we need
126:49 to go down here and say
126:52 where job title is equal to @JobTitle
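A sketch of the altered procedure might look like the following. As before, the temp table columns and the INSERT query are assumptions; the parameter name, its nvarchar(100) type, and the WHERE filter follow the walkthrough. Placing the filter inside the INSERT query is one reasonable choice, not necessarily the exact spot used on screen.

```sql
-- Sketch only: body columns and source table are assumed for illustration.
ALTER PROCEDURE Temp_Employee
    @JobTitle nvarchar(100)   -- input supplied by the caller at EXEC time
AS
CREATE TABLE #temp_employee (
    JobTitle        varchar(100),
    EmployeesPerJob int,
    AvgSalary       int
);

INSERT INTO #temp_employee
SELECT JobTitle, COUNT(JobTitle), AVG(Salary)
FROM EmployeeSalary
WHERE JobTitle = @JobTitle    -- the parameter is used like any other value
GROUP BY JobTitle;

SELECT *
FROM #temp_employee;
```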
126:58 so when we actually are executing this
127:00 and we say the job title is equal to
127:02 let's say accountant this is going to
127:04 become accountant and it's going to give
127:06 us our results based off of it being an
127:08 accountant so let's go over here and we
127:10 are going to click this execute temp
127:12 employee which we just modified and when
127:15 we run it we're going to get an error
127:17 because it is now expecting us to
127:20 include our parameter of job title so
127:23 what we need to do is we need to say
127:27 @JobTitle and let's say it's equal to
127:30 salesman now let's try running this one
127:33 and see what we get and so there is our
127:36 output
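The two calls just described can be sketched like this, assuming the procedure is named Temp_Employee and its parameter is @JobTitle.

```sql
-- This call now fails, because the procedure expects a value for @JobTitle:
-- EXEC Temp_Employee;

-- Supplying the parameter by name makes the call succeed.
EXEC Temp_Employee @JobTitle = 'Salesman';
```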
127:38 if we go back here I just wanted to show you really quick we do not have
127:39 to put this job title right here you can
127:42 put this anywhere in the query and use
127:44 it however you want that's how
127:45 parameters work and that's why
127:47 parameters are so useful and you can use
127:49 multiple parameters for one stored
127:51 procedure so you don't have to just
127:52 limit yourself to one or none you can
127:55 put as many as you really like
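As an illustration of multiple parameters, here is a hypothetical procedure that is not part of the walkthrough. The name Employee_Filter, the @MinSalary parameter, and the value 40000 are all made up for this sketch.

```sql
-- Hypothetical example: two parameters separated by a comma.
CREATE PROCEDURE Employee_Filter
    @JobTitle  nvarchar(100),
    @MinSalary int
AS
SELECT EmployeeID, JobTitle, Salary
FROM EmployeeSalary
WHERE JobTitle = @JobTitle
  AND Salary >= @MinSalary;
GO  -- batch separator used by SSMS/sqlcmd

-- Each parameter is passed by name when executing.
EXEC Employee_Filter @JobTitle = 'Salesman', @MinSalary = 40000;
```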
127:57 so I hope that this video is helpful and that you
127:58 understand stored procedures just a
128:00 little bit better thank you guys so much
128:02 for watching I really appreciate it if
128:04 you like this video be sure to like and
128:06 subscribe below and I'll see you in the
128:07 next
128:09 [Music]
128:17 video
128:22 what's going on everybody welcome back to another SQL tutorial today we are
128:23 going to be talking about subqueries now
128:26 subqueries are often called inner
128:27 queries or nested queries and they're
128:30 basically a query within a query a
128:32 subquery is used to return data that
128:34 will be used in the main query or the
128:36 outer query as a condition to specify
128:38 the data that we want retrieved you can
128:40 use subqueries almost anywhere you can
128:42 use it in the select part of a query the
128:45 from the where you can also use it in
128:47 insert update and delete statements but
128:50 in today's tutorial we're only going to
128:51 be looking at the select the from and the
128:53 where statements and you should get a
128:54 pretty good idea of how to use it in
128:56 those other statements all right now I'm
128:58 going to paste on screen basically what
128:59 we're going to be walking through today
129:01 but really quick let's just take a look
129:03 at the table that we're actually going to be
129:04 working in and that is going to be from
129:06 the employee salary table and I just
129:09 want to show you the data that we're
129:10 going to be working with before we
129:11 actually get into it so we have an
129:13 employee ID we have a job title and then
129:16 we have a salary so really quick I'm
129:19 going to show you what it looks like to
129:21 have a subquery in the select statement
129:24 so let's go down here really
129:25 quick and what we're going to try to do
129:28 is kind of do something like a window
129:30 function but without actually having to
129:32 do the window function um and so we're
129:34 going to do this with a subquery so I'm
129:37 going to select and really quick
129:39 actually let me copy this so we're going
129:42 to do employee
129:45 ID there we go we're going to do salary
129:48 and now we can start building our
129:50 subquery so we need to do an open
129:52 parenthesis and I'm just going to copy
129:54 this really quick because we're going to
129:55 be doing it off of that table so we're
129:57 going to say select and then I'll paste
129:59 that and close it as well but what we
130:02 want to do is we want to say average of
130:05 salary now what this is going to do is
130:08 it is literally going to run this and
130:10 let's run this really quick it is going
130:12 to run this and it is going to show that
130:15 the average salary for all the employees
130:17 is 40 $
130:19 7,99 so we are looking at the average
130:21 salary for every employee so when we run
130:24 this it is going to give us the employee
130:27 ID the salary and then in the very last
130:30 one it is going to show the average salary
130:32 for every employee now it doesn't have a
130:35 column header or a column name so
130:38 let's give it um let's say as all
130:42 average salary and we'll run that one
130:45 more time just to make it look a little
130:47 prettier
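The query built up in this part of the walkthrough can be sketched as follows. The table and column names (EmployeeSalary, EmployeeID, Salary) and the alias AllAvgSalary are assumed spellings of what is said aloud.

```sql
-- Subquery in the SELECT list: the inner query returns one value
-- (the overall average salary), which is repeated on every row.
SELECT EmployeeID,
       Salary,
       (SELECT AVG(Salary) FROM EmployeeSalary) AS AllAvgSalary
FROM EmployeeSalary;
```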
130:50 um you can also do this with partition by I'm going to super quickly
130:52 just really quickly write this out um it
130:55 should take no time at all and then I'm
130:56 going to show you why we can't do this
130:59 without the subquery why you aren't able
131:02 to do this with a group by so really
131:04 quickly let me copy this I'm going to
131:06 put it right down here and we're going
131:08 to say average salary whoops and we can
131:12 get rid of all this and we can say over
131:16 and we're not going to partition it by
131:17 anything
131:18 but let's run both these at the same
131:20 time and you'll see that they're the
131:21 exact same outputs and so it's just a
131:24 different way of doing it in this
131:26 example but it really is just to show a
131:29 comparison of how you might be able to
131:31 use a subquery in the select statement
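For comparison, the window-function version might be written like this, again assuming the table is EmployeeSalary with EmployeeID and Salary columns.

```sql
-- Window function with an empty OVER (): no PARTITION BY, so the
-- average is taken over the whole table and shown on every row.
SELECT EmployeeID,
       Salary,
       AVG(Salary) OVER () AS AllAvgSalary
FROM EmployeeSalary;
```

Both forms keep one output row per employee while still showing a table-wide aggregate next to it.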
131:33 now you might be wondering why group by
131:34 does not work for this uh really quickly
131:36 I'm going to write this out and let's
131:39 get rid of that and we'll say group by
131:43 whoops let me at least try to write it
131:45 correctly group by and we'll do employee
131:49 ID and we also have to do salary and
131:53 then we'll say order
131:56 by one two so let's run
132:00 this and as you can see since we have to
132:03 use the group by it groups by both the
132:05 employee ID and the salary and so we're
132:08 not going to be able to get that all
132:10 average salary that we're looking for
132:12 that we can get with the partition by and
132:14 also the subquery in the select
132:15 statement
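A sketch of the GROUP BY attempt shows why it falls short. Table and column names are assumed as before, and the comment assumes each EmployeeID appears only once in the table.

```sql
-- Every non-aggregated column must appear in GROUP BY, so each group
-- ends up being a single employee. AVG(Salary) is then just that
-- employee's own salary, not the table-wide average.
SELECT EmployeeID,
       Salary,
       AVG(Salary) AS AllAvgSalary
FROM EmployeeSalary
GROUP BY EmployeeID, Salary
ORDER BY 1, 2;   -- order by the first and second output columns
```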
132:17 now I'm going to show you the subquery in the from statement so let's
132:20 just get rid of that really quick and
132:21 let's say select everything let's say
132:24 from and we're going to do an open
132:26 parenthesis here and here is where we're
132:28 going to write our subquery so if you
132:31 have watched previous videos where I've
132:33 done uh tutorials on the CTE or tutorial
132:37 on the temp tables this is one that is
132:40 very much like those except I think a
132:42 little bit less efficient when I'm doing
132:45 something where I'm creating a table and
132:47 then querying off of it which is what
132:48 we're about to do I much prefer a CTE or
132:52 a temp table subqueries tend to be a
132:55 little bit slow compared to a temp table
132:58 or a CTE I tend to use temp tables a lot
133:01 more because you can reuse them over and
133:03 over whereas a subquery you cannot you
133:05 have to write it out each time so really
133:08 quickly I'm going to show you how it's
133:09 done although I don't really recommend
133:11 using this method really quickly let's
133:13 go up here and let's steal this
133:15 partition by really quick this will be
133:17 our
133:18 subquery uh and let's paste this in here
133:21 I'm going to make this look a little nicer
133:23 just so you can visualize it a little
133:25 bit
133:26 easier um so really quick what this is
133:29 going to do is it is first going to run
133:32 this and create this table again much
133:35 like a temp table or a CTE so let's
133:38 execute this really quick it's going to
133:40 create this table and then it's going to
133:41 allow us to query off of it so I can
133:44 actually say um and let me give kind of
133:47 an alias to this a.EmployeeID
133:51 and then let's say all average salary so
133:55 now I can take um columns from this
133:59 inner query if I want to and just select
134:02 those or I can select everything and
134:04 return that entire table again I much
134:06 prefer a temp table or a CTE for this
134:09 type of situation but as an example I
134:11 just wanted to show you how it works
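Here is a sketch of the FROM-clause subquery just described, followed by a CTE version of the same query for comparison. Table, column, and alias names are assumed spellings of what is said aloud.

```sql
-- Subquery in FROM (a derived table). SQL Server requires the alias (a).
SELECT a.EmployeeID, a.AllAvgSalary
FROM (
    SELECT EmployeeID,
           Salary,
           AVG(Salary) OVER () AS AllAvgSalary
    FROM EmployeeSalary
) a;

-- The same result written as a CTE, the style the walkthrough prefers.
WITH a AS (
    SELECT EmployeeID,
           Salary,
           AVG(Salary) OVER () AS AllAvgSalary
    FROM EmployeeSalary
)
SELECT a.EmployeeID, a.AllAvgSalary
FROM a;
```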
134:14 now let's go down to the subquery in the where
134:16 statement but really quick I'll just steal
134:18 this query so I don't have to rewrite
134:21 everything and let's get rid of this
134:24 really quick and add back the job
134:28 title all right so let's look at this
134:31 really quick so we have our table that
134:34 we've been using our employee ID job
134:35 title salary so for this example we only
134:38 want to return employees if they're over
134:40 the age of 30 and as you can see in this
134:42 table there is no age column that is in
134:45 the employee demographics table now if
134:47 we wanted we could join to that table
134:49 and get that information or we could use
134:51 a subquery and so for this example we
134:53 are going to be using a subquery so
134:55 let's go right down here and say where
134:57 employee ID is in and we'll do an open
135:00 parenthesis and now this is where we are
135:02 going to build out the subquery so just
135:05 for visual purposes I'm going to go
135:07 right here I'm going to say select
135:09 everything and we'll do from employee
135:11 demographics and close the parenthesis
135:14 so we're going to try to select
135:15 something in this subquery that will
135:18 then identify the employee IDs that are
135:21 over the age of 30 so really quickly
135:23 let's take a look at this table so right
135:26 now we have the entire table selected so
135:29 we have the employee ID first name last
135:30 name age and gender so in this subquery
135:34 the only thing that should be returned
135:36 is the employee ID and in fact in your
135:39 subquery you can only have one column
135:41 selected so I can't select everything I
135:43 have to specify one column and that's a
135:45 little bit different than how we did it
135:47 in this from statement where we were
135:49 basically able to select the entire
135:51 table and then in the select statement
135:54 specify what columns we wanted in the
135:56 where statement we can't do that so we
135:58 want to return the employee ID and we
136:01 also want to say where the age is
136:04 greater than 30 so let's run this really
136:08 quick and see if it works as you can see
136:10 in the results these are the employees
136:12 who are over the age of 30
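The finished query from this part might look like the sketch below. Table and column names (EmployeeSalary, EmployeeDemographics, EmployeeID, JobTitle, Salary, Age) are assumed spellings of the names mentioned aloud.

```sql
-- Subquery in WHERE: the inner query must return exactly one column
-- when it is used with IN, so SELECT * would raise an error here.
SELECT EmployeeID, JobTitle, Salary
FROM EmployeeSalary
WHERE EmployeeID IN (
    SELECT EmployeeID
    FROM EmployeeDemographics
    WHERE Age > 30
);
```

The outer query never exposes Age itself; it only uses the list of matching IDs as a filter.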
136:14 now if you wanted to display the age as a column in
136:17 this output you would have to join to
136:19 that table and then put that column or
136:21 that field in the select statement but
136:23 in a lot of situations you won't
136:25 actually want or need to do that and so
136:27 a subquery can be a really good option
136:29 in these scenarios with that being said
136:31 this is the last video in the advanced
136:33 SQL tutorials I hope that this series
136:35 has been helpful and that you learned
136:36 something along the way thank you so
136:38 much for joining me I really appreciate
136:40 it if you like this video be sure to
136:42 like and subscribe below and I'll see
136:44 you in the next
136:48 video [Music]
136:57 what is going on everybody welcome back
136:58 to another video today we are starting
137:00 our data analyst portfolio project
137:10 series now before we jump into our first project I wanted to talk with you for
137:12 just a second so that we're all on the
137:13 same page first thing is that there are
137:15 going to be four projects the first one
137:17 is going to be SQL and we'll be doing a lot of
137:18 data exploration and we'll be setting up
137:21 a lot of our data to visualize it in
137:23 Tableau Tableau is going to be our
137:25 second project in our third project
137:27 again we're going back to SQL but we're
137:29 going to be doing a lot more of the ETL
137:31 process so a lot more of the data
137:32 cleaning I did that one as the third
137:35 project because I think it's going to be
137:36 a little bit more advanced than this
137:38 first project I tried to make it as
137:41 beginner friendly as possible so even if
137:43 you are a complete beginner as long as
137:45 you've walked through uh you know the
137:47 tutorial that I have made on my channel
137:49 you should be pretty good and then the
137:51 fourth and the final project will be
137:52 with Python we'll be using a lot of
137:54 pandas doing a little bit of data
137:55 cleaning and then doing visualizations
137:57 as well as I said just a second ago I'm
137:59 trying to make this as beginner friendly
138:01 as I possibly can the whole point of the
138:03 series is that if you are trying to
138:05 apply for a data analyst job by the end
138:08 of the series you should have an entire
138:09 portfolio or at least a really good
138:12 start at a portfolio to show a potential
138:14 employer I give you full permission to
138:16 copy every script every query line for
138:17 line if that is what you want to do and
138:19 create your own portfolio I am totally
138:21 fine with that but I will encourage you
138:23 and I'm sure I'll say this throughout
138:24 the video I encourage you to try to
138:26 think of your own queries try to think
138:28 of your own insights and your own things
138:30 that you can do to make this portfolio
138:32 project unique with that being said I'm
138:34 super excited to get started on this
138:36 with you guys so let's jump over to my
138:38 screen and get started on our very first
138:39 project all right so now that we are on
138:41 project all right so now that we are on my screen we are going to get started on
138:42 my screen we are going to get started on this project we're going to download the
138:44 this project we're going to download the data set we are going to format it just
138:46 data set we are going to format it just a little bit in Excel
138:48 a little bit in Excel and then we're going to get into sequel
138:49 and then we're going to get into sequel where we will start querying it I will
138:52 where we will start querying it I will say that I think this is going to be a
138:53 say that I think this is going to be a very long video I'm hoping to keep it
138:55 very long video I'm hoping to keep it under an hour and a half I may separate
138:58 under an hour and a half I may separate this into two videos depending on how
139:00 this into two videos depending on how long it runs um but you know I I will do
139:04 long it runs um but you know I I will do my best to keep it short but we have a
139:06 my best to keep it short but we have a lot to get through I'm going to
139:08 lot to get through I'm going to basically do no Cuts I'm I'm that's my
139:10 basically do no Cuts I'm I'm that's my goal is to do no cuts um in this because
139:12 goal is to do no cuts um in this because I want to walk you through each step of
139:14 I want to walk you through each step of the process so that you understand
139:16 the process so that you understand everything that's going on and I I you
139:18 everything that's going on and I I you don't get lost at some point um but I
139:21 don't get lost at some point um but I think this is probably the best way to
139:24 think this is probably the best way to do it we'll see uh the very first thing
139:26 do it we'll see uh the very first thing we're going to do is download our data
139:28 set so you know as we're looking at this
139:32 there's an option right here to download
139:33 the data set I don't recommend that one
139:35 um you can it just won't give you all
139:37 the information that I personally want
139:38 which is to go back to like the very
139:40 beginning um if you go down right here
139:42 to the very first graph um you can
139:44 actually push this back and then
139:47 download it and what this will do is it
139:50 will go back to I think January 1st of
139:53 2020 so let's open this one
139:56 up um and when we get in here we're
140:00 going to reformat it just a little bit
140:02 it's nothing too complicated I hope um
140:05 I'm just going to double click here
140:07 actually let me go up here and
140:09 filter just in case we want to filter
140:11 anything so um what we have here is a
140:15 ton of information on COVID I mean just a
140:17 ton and it goes back to early 2020 I
140:20 believe it does go back to the first of
140:22 2020 so really quick a really brief
140:25 introduction of what kind of data is in
140:27 here we have total cases new
140:29 cases um total deaths new deaths we use
140:33 those quite a bit in the queries
140:35 that are coming
140:36 up um if we go way over here we have
140:40 total vaccinations people
140:42 vaccinated um and then over here a
140:44 little bit farther we have population
140:47 that's the main stuff we're going to be
140:48 working with today as you can see
140:50 there's so many other things in here I
140:53 mean you can use this if you want to go
140:55 back and do more stuff on this I highly
140:58 recommend it you know
141:00 there's such unique data in here
141:02 about smokers and diabetes and like all
141:04 this random stuff that I did not do a
141:07 deep dive in I mean I could
141:08 spend you know a month just like looking
141:12 at this data set and getting really
141:14 interesting stuff from it um but I'm not
141:17 going to do that I wanted to do this
141:20 faster than uh two months to
141:23 complete what we're going to do um is
141:26 we're going to go back over here we're
141:27 going to take this
141:29 population and we're going to click on
141:31 this column AS and we're going to press Ctrl
141:34 X and that's going to cut it we're going
141:36 to go back to the very beginning and
141:37 we're going to place it right here and
141:39 we're going to right click and say
141:41 insert cut cells now why are we doing
141:43 this because I've already done this
141:45 entire project um and if you don't do
141:49 this you're going to do a join with
141:51 every single query you do which if you
141:53 want to do that keep it there and then
141:55 just you know change your query for
141:57 that I did it like this because I wanted
141:59 to show joins later on I wanted to keep
142:02 it kind of simple at the beginning um
142:04 and then work my way to a little bit
142:06 more advanced things which you will see
142:08 um it gets you know semi advanced but
142:10 not too much I promise um just stick
142:13 with me let's go back over here we're
142:15 going to go to uh actually column AA and
142:19 then we're going to press Ctrl Shift
142:21 right arrow that's going to select
142:23 everything over here and we're going to
142:25 literally delete it okay this is going
142:27 to be our first table over here so
142:29 everything you see over here is our
142:30 first table um and we're going to save
142:32 that so let's save as I'm just going to
142:34 keep it in my downloads and let's do
142:37 covid deaths so that has our death
142:40 information the next one is going to
142:42 include our um vaccination information
142:46 which is what we're going to join on and
142:47 then um we're going to do that later so
142:50 let's hit Ctrl Z that's going to
142:52 bring it back now let's select on Z and
142:55 go all the way to E and we're going to
142:58 do the same thing we're going to delete
142:59 this looks like there's no data but I
143:01 promise there is later on the
143:04 vaccinations um like total vaccinations
143:06 if we go down um you can see that that
143:09 starts in February the very end
143:13 of February in 2021 that's because
143:15 vaccinations you know didn't come
143:17 out till recently now let's save this
143:20 file and we're going to save as instead
143:23 of covid deaths we'll do covid
143:27 vaccinations all right now let's save
143:29 that so now we have our two excels that
143:33 we want we need to get them into SQL
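The split described above is done by hand in Excel in the video. As a rough, scripted equivalent, here is a minimal Python sketch of the same idea: keep the first four identifying columns in both outputs, put population and the case/death columns in one table and the vaccination columns in the other. This is a swapped-in technique, not what the video does; the function name `split_columns`, the snake_case column names, and the sample row are all assumptions made up for illustration.

```python
import csv
import io


def split_columns(rows, key_cols, deaths_cols, vacc_cols):
    """Return (deaths_rows, vaccination_rows); both keep the key columns."""
    header = rows[0]
    index = {name: i for i, name in enumerate(header)}

    def pick(cols):
        # Rebuild every row (header included) with only the requested columns.
        return [[row[index[c]] for c in cols] for row in rows]

    return pick(key_cols + deaths_cols), pick(key_cols + vacc_cols)


# Placeholder data standing in for the downloaded file (values are made up).
SAMPLE = """iso_code,continent,location,date,total_cases,new_cases,total_deaths,total_vaccinations,population
XXX,Nowhere,Exampleland,2020-01-01,5,5,0,,1000
"""

rows = list(csv.reader(io.StringIO(SAMPLE)))
keys = ["iso_code", "continent", "location", "date"]

# Population goes first among the deaths columns, mirroring the
# "cut population and insert it near the beginning" step.
deaths, vaccinations = split_columns(
    rows,
    keys,
    deaths_cols=["population", "total_cases", "new_cases", "total_deaths"],
    vacc_cols=["total_vaccinations"],
)
```

Each result could then be written out with `csv.writer`. Note that the import shown later in the video expects Excel workbooks rather than CSV files, so the outputs would still need to be saved as workbooks.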
143:36 we're going to go over to SQL and we're
143:38 going to create a portfolio project
143:40 database I've already done this all you
143:42 have to do though is right-click click
143:45 new database type in
143:49 portfolio project and then click okay
143:52 and it will create your database for you
143:55 um if you open up the tables it should
143:57 be empty and that's where we're going to
143:58 put these two Excel files now uh I had a
144:01 ton of trouble actually importing these
144:04 excels um I mean I tried everything and
144:07 I eventually just went down a rabbit
144:09 hole of how to get these in I don't know
144:11 if it's me or what but I could not
144:14 figure out how to do it if you go to
144:17 portfolio project you hit tasks and you
144:19 hit import data that may do it for you
144:23 and it may work um it did not work for
144:25 me uh it just kept giving me errors
144:29 so what I would recommend you do right
144:32 off the bat just to make sure that we're
144:34 doing the same thing um and you can do
144:35 it that way if you want I went over here
144:37 to start um again I'm on Windows and I
144:40 went down to Microsoft SQL Server 2019
144:43 and clicked Import and
144:45 Export looks the same but for whatever
144:48 reason from all the research I did it
144:51 has to do with the 32-bit versus the
144:54 64-bit when you do it this way it goes to
144:57 the 64-bit and it is able to import the
144:59 data if you do it the other way it was
145:01 doing it the 32-bit version and gives
145:03 you an error I don't understand it don't
145:05 ask me I
145:07 mean I went down a huge rabbit hole but
145:09 this one works so let's go over here and
145:13 this is going to be our data source
145:15 where is the data coming from it's an
145:16 Excel file
145:17 so let's do that let's browse and let's
145:20 go over to my
145:26 downloads I thought I saved it in downloads uh maybe because it's an Excel
145:28 workbook what was I saving
145:31 before oh that's a
145:34 CSV okay something important to note is
145:37 we're doing an Excel and not a
145:39 CSV you're going to get the same error
145:42 I'm just doing it live and I'm making
145:43 myself look stupid so um we're going to
145:46 save it but instead of a CSV we're going
145:49 to save it as an Excel workbook so let's
145:51 save that um now we have to go back to
145:56 how it was right
145:57 here um the same way and we're going to
146:00 file save as and let's do this is now
146:04 covid
146:06 deaths and save it as a workbook now we
146:10 have them now let's go back um now we
146:13 have our covid deaths and our covid
146:15 vaccinations let's do our deaths first
146:18 um let me get back right here so it
146:20 looks kind of more
146:21 normal um so we have our Excel file we
146:25 have our covid deaths let's go next and
146:27 now we have to say where we're going to
146:29 place it where's our destination so
146:31 we're going to click over here and go
146:32 down to SQL Server Native Client
146:37 11.0 I want to say this is something
146:39 that I messed up and it took me like 45
146:41 minutes to figure out it was the
146:42 stupidest mistake um it's gonna
146:45 autopopulate a server name
146:47 and I never checked to confirm that this
146:49 was my server name and so I couldn't
146:51 figure out why I wasn't able to insert
146:54 this into my portfolio project uh
146:57 database that's because mine is 01 I
147:00 created two different servers um
147:03 intentionally and for whatever reason I
147:06 forgot that and so all I have to do is
147:08 add 01 over here so just make sure yours
147:11 is the same thing click portfolio
147:13 project click next yes we want to
147:16 copy the data it should autopopulate if
147:18 it doesn't if it gives you like multiple
147:20 you can always uh check mark the one
147:22 that you think is the right one it
147:23 should be the first one we'll click next
147:26 we'll just click finish I'm sure it says
147:28 run immediately we'll click finish and
147:31 finish now while this is running um
147:33 there should be around
147:35 89,000 that's how it was like a week ago
147:38 when I started it maybe a little more
147:39 now because there's extra
147:41 days um with that being said you know
147:45 there's going to be
147:47 a good-sized amount of data um we're
147:49 about to do a lot of different things
147:51 we're going to start at the very basics
147:52 of just like querying the table
147:54 like super simple um and then we're
147:56 going to go into things like joins CTEs
147:58 temp tables creating views um the
148:01 whole purpose of what we're about to do
148:04 is not
148:05 to it's not to keep it too simple um I
148:10 want to showcase to a potential employer
148:13 right that you can do more advanced
148:16 things so I'm going to probably
148:18 do I mean I'm looking at it because I
148:20 have already done this entire project
148:22 individually I mean we've probably got
148:24 like 15 to 20 queries here you don't
148:27 have to do all of them um I'm going to
148:30 walk through all of them and you can
148:31 choose which ones you want but you don't
148:33 have to do all of them it is quite a few so
148:36 just know that so there's 85,000 right
148:39 here that's
148:40 fantastic uh it won't show up
148:42 immediately you need to refresh it uh
148:46 and there we go so that's our covid
148:48 vaccinations uh let's get rid of this so
148:51 we just have covid vaccinations um I
148:53 thought that was our covid deaths one
148:56 but maybe I'm wrong um but let's do the
148:58 exact same thing down
149:02 here and we will import and say
149:07 next we're going to go down to
149:09 Excel and browse and now we want to do
149:12 the covid deaths apparently last time we
149:14 did the vaccinations which
149:18 um actually you know what I
149:20 bet what it did was it took yeah it took
149:23 this right here as covid vaccinations but
149:25 that was the deaths one as it saved so
149:28 uh forget that let's go right
149:32 here let's do the covid vaccinations it
149:36 just has the same sheet
149:37 name uh so sorry for the confusion
149:41 destination is going to be the exact
149:42 same place it's going to be SQL Server
149:44 Native Client let's add that
149:48 01 and let's click refresh portfolio
149:52 project next next um like I said before
149:57 if it does this just click the first one
149:59 it's going to be covid vaccinations it did
150:01 that for the covid deaths that's because
150:03 I made the mistake earlier I
150:06 hope when you're watching this you
150:07 aren't super confused um the whole point is
150:11 make two tables or make two excels one
150:14 should be covid deaths one should be covid
150:15 vaccinations upload them and then rename
150:18 them in a nutshell uh so we have the same
150:22 amount uh let's refresh
150:26 this this one is actually the covid
150:28 vaccinations this one is covid
150:31 deaths I'm telling you this stuff
150:34 confuses me sometimes to be honest
150:37 um but we're going to query this really
150:38 quick to make sure we are actually
150:41 doing um what we're supposed to be doing
150:43 so let's do select
150:45 everything from um and let's do
150:49 portfolio project and you can do
150:53 dbo or you can do dot dot I tend to just
150:57 do that because it's easier um let's
150:59 look at this one make sure it's the
151:00 right table so we have total cases new
151:03 cases
151:04 perfect um and let's order on let's do
151:08 3 comma 4 just to make
151:12 sure or order by of course just to make
151:17 sure that we have everything that
151:18 we're looking for so this looks right
151:20 this looks like our
151:21 Excel let's copy this let's go down here
151:25 we're going to do covid
151:27 vaccinations and let's run this one make
151:30 sure the second one came in correctly as
151:32 well so perfect so we have our two
151:35 tables this is fantastic news um and now
151:39 we can get going um we can keep this one
151:43 I'm gonna comment it out in case you know
151:46 we want to come back to it um I'm going
151:50 to really quick again right here I have
151:54 another laptop I have already done this
151:56 whole project so I'm just using it as a
151:58 guideline to know kind of what I'm doing
152:01 next so that I don't waste everyone's
152:02 time um so really quickly let's just
152:06 select the data that we are going
152:10 to be using you don't have to use these
152:12 comments I will say that I'm going to
152:14 specify I'm going to say hey this
152:16 comment is something I would keep in
152:18 your portfolio project I'm going to add
152:20 a bunch of extra stuff that is not
152:22 needed um just for your purpose but when
152:25 you are creating your portfolio project
152:27 you shouldn't be adding some of the
152:29 things that I'm going to be commenting
152:31 um on so we're going to do um or
152:34 actually let's do really quick let's
152:36 copy this so that it kind of knows what
152:38 we're doing so let's select the
152:42 location uh the date the total
152:48 cases the new
152:51 cases the
152:54 total deaths and then
152:58 population uh now where we're at I'm
153:02 going to turn off my camera because it's
153:04 going to start getting
153:05 in the way to be honest I don't want it
153:06 to interfere with your ability to see
153:09 what we're doing on screen so it's been
153:11 great seeing you guys I'm going to turn
153:13 this off and we will continue from here
153:16 all right that should be turned off so
153:20 let's keep running so this is what we're
153:22 doing let's actually let's keep this
153:24 going because I don't like things not
153:28 being
153:29 organized um so we have our location oh
153:33 no we want to do 1, 2 we want to do
153:36 it based off the location and the date
153:38 makes everything easier I promise
153:40 you so the first one's
153:42 obviously Afghanistan here's our date we
153:45 have our total cases our new cases total
153:48 deaths and population so really quick
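The switch from ordering by 3, 4 to ordering by 1, 2 comes from how positional ORDER BY works: the numbers refer to positions in the selected column list, not in the table. Below is a minimal sketch of that behaviour using Python's built-in sqlite3 module as a stand-in for SQL Server. The video runs these queries in SQL Server against PortfolioProject, so the in-memory database, the snake_case column names, and the placeholder rows here are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE CovidDeaths (
        iso_code TEXT, continent TEXT, location TEXT, date TEXT,
        population REAL, total_cases REAL, new_cases REAL, total_deaths REAL)"""
)
# Placeholder rows, deliberately inserted out of order.
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("BBB", "X", "Beta", "2020-01-02", 500, 3, 1, 0),
        ("AAA", "X", "Alpha", "2020-01-02", 1000, 7, 2, 1),
        ("AAA", "X", "Alpha", "2020-01-01", 1000, 5, 5, 0),
    ],
)

# With SELECT *, location and date are the 3rd and 4th columns.
all_cols = conn.execute("SELECT * FROM CovidDeaths ORDER BY 3, 4").fetchall()

# With an explicit column list, the same two columns are now 1st and 2nd.
picked = conn.execute(
    """SELECT location, date, total_cases, new_cases, total_deaths, population
       FROM CovidDeaths
       ORDER BY 1, 2"""
).fetchall()
```

Both queries return the rows sorted by location and then date; only the position numbers change because the column list changed.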
153:49 I'm just going to scroll down just a
153:51 second um they started having you know
153:54 the total deaths it's um it started
153:58 about a
154:00 month after they got their first case it
154:03 looks like so and then it just like
154:07 ramps up a lot um and we're going to be
154:10 diving into all these numbers what they
154:11 mean how you can do some really
154:13 simple calculations on them um but
154:17 really quickly we're just going to do
154:19 again a super simple calculation um and
154:22 one that we do multiple times for
154:24 different things um so let's go right
154:27 down here and let's say uh we're going
154:30 to be looking at the total
154:34 cases
154:35 versus total deaths so how many cases
154:40 are there in this country and then how
154:42 many deaths do they have per um uh you
154:46 know how many deaths they have for their
154:47 entire cases so let's say they have a
154:50 thousand people who've been
154:51 diagnosed they had 10 people who died
154:54 what's the percentage of people who died
154:57 who had um who had it
155:00 so uh let's go right down here and we're
155:04 gonna I'm just going to copy this really
155:06 quick this is just going to make our life
155:07 easier I think you should do the same as
155:10 well um so we have location date total
155:13 cases um and we're going to get rid of
155:16 our new cases we don't need that one in
155:18 this query right here uh nor do you need
155:21 this population so let's work on our
155:23 calculation really quick it should be
155:24 super super easy let me make sure I'm
155:27 still recording perfect oh man we're
155:30 almost 25 minutes in um or more because
155:34 I have the
155:35 intro so now we're going to do uh we
155:38 want to know the percentage of people
155:40 who are dying who actually get infected
155:42 or who um report being infected
155:46 so we're going to do um total underscore
155:50 deaths we'll go right down here and
155:52 we're going to divide that by the total
155:55 cases total cases and if we do this
155:59 really
156:00 quick um what it's going to have and
156:03 well let's go down to where there's
156:04 actually
156:05 numbers so we have 34 we have one um
156:09 it's showing
156:12 0.029 if you ever try to get a
156:14 percentage of something you have to
156:15 multiply times 100 um so let's do that
156:18 really quick all we have to add is the
156:21 what's that the asterisk sign um times
156:24 100 um and while we're here let's just
156:27 add the um what's it called alias let's
156:31 do let's call this death percentage I
156:35 don't know that works for me and
156:37 let's take a look at this it'll be a
156:39 little bit more accurate
156:41 so when there were 34 there was
156:45 one and that gives us a
156:47 2.94% death rate and we can go down even
156:51 further um and this is still all
156:54 Afghanistan let's go down to the very
156:56 bottom let's go down to the very very
156:58 bottom so as of today yesterday
157:03 there were
157:04 59,745 total cases in Afghanistan and
157:07 there were 2,625 deaths which is roughly 4%
157:11 so you have a 4% chance basically right
157:14 now of dying I mean if you want to
157:16 look at it like that 4% chance of dying
157:18 if you get it and you live in
157:20 if you get it and you live in Afghanistan um let's I mean we you don't
157:23 have to, but really quick, just to look at
157:24 it further, let's look at where the
157:27 location... I think
157:30 it's... let's say "like" real quick, because
157:32 I'm not 100% sure if it's
157:34 "States."
157:35 I think it's "United
157:38 States." But yeah, I live in the
157:41 United States; if you don't, you can look
157:42 at your country. But, you know,
157:46 this is genuine, real
157:49 reported data, so it's really interesting.
157:52 Right at the beginning, I
157:54 don't know if it was the way we were
157:55 reporting or what, but we had really high
157:57 percentage rates. As we go down, we're
157:59 looking at 5%, 6%. I mean, this was the
158:02 peak of it; this got really bad in the US.
158:05 I hope it gets better.
158:09 How many are we at? I'm going to
158:11 go to the end of this year. We're sitting at
158:13 around 2 to
158:14 3%. Yeah, it goes down to under 2%.
158:18 So at the end of the year
158:21 we were looking at over 2 million people,
158:24 that's 2
158:26 million... no wait, 20 million.
158:31 Wait, wait, wait: 20 million people
158:33 who have been
158:35 infected. That's a lot. That's a lot:
158:38 20 million people who have had it, and 35,000,
158:41 or rather 352,000, deaths by the end of the year.
158:43 That's a lot. Let's keep
158:46 going. And at the very end we had over
158:50 32
158:58 million. There's a lot of deaths: 576,000. And I verified this number; I
159:01 Googled it. Google knows all. I Googled
159:04 this number and it's pretty accurate,
159:06 and it's really sad. That's a lot
159:07 of lives. And that's
159:11 1.78%. So as of right now, if you were
159:13 to get it today, an estimate is around a one
159:18 and three-fourths to 2% chance that
159:21 you could die from it. So
159:24 really interesting numbers. This is the
159:25 kind of exploratory stuff that, you
159:27 know, we're going to be doing. We're going
159:28 to get a lot more advanced as we go on,
159:31 but this shows the likelihood...
159:34 I'm going to write that:
159:36 "shows the likeli..." I hope I'm spelling
159:38 this right. I'm not spelling this right.
159:41 "Likelihood." I hope that's right; if it's
159:43 not, I apologize. "Shows likelihood
159:46 of dying if you
159:50 contract covid in your
159:55 country." Again, rough estimates, but, you
159:58 know, just glancing at the data, that's
160:00 kind of what we're looking at.
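[Editor's sketch] The death-percentage calculation described above can be tried out as a minimal example. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of the SQL Server setup used in the video, the table name CovidDeaths and the alias DeathPercentage are assumed spellings, and the dates are placeholders. The two Afghanistan rows use the case and death counts read out above.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the SQL Server table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "location TEXT, date TEXT, total_cases REAL, total_deaths REAL)"
)
# Counts are the ones read out in the video; the dates are placeholders.
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [
        ("Afghanistan", "day 001", 34, 1),
        ("Afghanistan", "day 002", 59745, 2625),
    ],
)

# Deaths divided by cases gives a fraction; multiplying by 100 turns it
# into a percentage, and AS gives the computed column a readable name.
query = """
SELECT location, date, total_cases, total_deaths,
       (total_deaths / total_cases) * 100 AS DeathPercentage
FROM CovidDeaths
ORDER BY location, date
"""
death_rows = conn.execute(query).fetchall()
for location, date, cases, deaths, pct in death_rows:
    print(location, date, int(cases), int(deaths), round(pct, 2))
```

The columns are declared REAL here so the division is not truncated; with integer columns, SQLite would perform integer division and return 0.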
160:03 Now we're going to look at... let's go down
160:06 here. Let's look
160:10 at... "Looking at the total cases versus the
160:17 population." Again, we're going to do a lot
160:20 of this percentage stuff. It's
160:23 pretty simple; that will only last for
160:26 so long, I promise you.
160:28 I'm going to keep it on the
160:29 States, just because I'm going to be
160:31 looking at that one the most, because
160:33 obviously it's pretty relevant to me.
160:37 So if you're in another country, filter
160:39 by your country; you'll be really
160:40 interested in the stats. I know I was
160:43 really shocked by a lot
160:48 of the things that we're going to find
160:49 today. So we're going to keep the
160:51 location, we're going
160:54 to keep the date, keep the
160:56 total cases, but let's change this to
161:02 population. And then instead of the
161:06 total cases being here, we're going to
161:08 put the total cases there and then
161:10 change this to population. So what is
161:13 this going to do for us? This is going to
161:15 show us what percentage of the
161:17 population has gotten covid. So: "shows
161:21 what
161:23 percentage of population
161:26 got covid." Some of these things,
161:29 again, they're good to know.
161:32 The one that I upload to
161:35 GitHub will have the notes that I
161:38 recommend keeping. Again, not
161:40 everything in here... not everything
161:44 in here is
161:46 what you need to have in there;
161:49 this is mostly just what I
161:53 think you guys need to see while we're
161:54 actually typing this out. All right, so
161:57 let's take a look at this. Actually, I
161:59 want to change this; I want to put this
162:01 right here, just as it's easier for me
162:04 visually, because the total
162:06 cases is right here. So our population
162:08 in the US is around
162:11 331
162:13 million.
162:16 So at the beginning, when we had one case,
162:18 I mean, it's like nothing. Let's keep
162:21 scrolling and see where we get to 1%.
162:24 So
162:25 1%, that's
162:29 roughly 3.3
162:31 million people, and that happened in, what
162:35 is that, August. August of last year. So 1%
162:38 of the population. Let's keep going all
162:40 the way down. Again, we're just kind of
162:42 glancing at this. We're at about 10%;
162:44 again, we're at that 32 million. So
162:46 10% of the population has gotten it,
162:48 gotten a test, and it's been confirmed. So
162:51 really interesting.
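[Editor's sketch] The "percent of population infected" query is the same pattern as the death percentage, with total_cases as the numerator and population as the denominator. The sketch below is an assumption-laden illustration: sqlite3 stands in for SQL Server, CovidDeaths and PercentPopulationInfected are assumed names, the dates are placeholders, and the figures are rounded, illustrative values chosen to land on the 1% and 10% marks mentioned above, not the dataset's exact numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "location TEXT, date TEXT, population REAL, total_cases REAL)"
)
# Rounded, illustrative figures (not the dataset's exact values).
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [
        ("United States", "day 001", 331_000_000, 1),
        ("United States", "day 002", 331_000_000, 3_310_000),
        ("United States", "day 003", 331_000_000, 33_100_000),
    ],
)

# LIKE with % wildcards matches any location containing "States",
# which is handy when you are unsure of the exact spelling.
query = """
SELECT location, date, population, total_cases,
       (total_cases / population) * 100 AS PercentPopulationInfected
FROM CovidDeaths
WHERE location LIKE '%States%'
ORDER BY location, date
"""
infected_rows = conn.execute(query).fetchall()
for row in infected_rows:
    print(row[0], row[1], round(row[4], 4))
```

To look at another country, only the pattern in the WHERE clause needs to change.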
162:52 You know, we'll come back
162:54 to that one, I'm sure, in the future.
162:56 We might use this one
162:59 as
162:59 a visualization. Again, I'm
163:04 only looking at the States, or United
163:07 States, right now, but think
163:10 about it in terms of how we're going to
163:11 visualize this in the future, because a lot
163:14 of what we're doing
163:15 we're going to visualize in the future
163:18 in Tableau. I have Tableau even open
163:20 right here; you can see I have a map.
163:22 I threw this
163:24 together in like two seconds. We have
163:27 the location, and so, you
163:30 know, this is like our future. This is
163:31 what you need to be envisioning when
163:33 you're looking at this data. So we have,
163:35 you know, Afghanistan, and let's just
163:38 scroll through: Belarus and Bolivia and
163:41 Bulgaria and Cambodia, every single
163:43 country that is reporting. So
163:47 we're just looking at the States, but
163:48 remember, all of these are going to be
163:50 used. So just something to remember. I
163:55 want to know, and I'm really curious as
163:57 to, what countries have the highest
164:01 infection rates compared to the
164:04 population. So we're just looking at our
164:06 population up here. How are we
164:10 going to do this? Actually, let
164:11 me write it out really
164:14 quick. So: "Looking at
164:18 countries with highest infection rate
164:24 compared to population." So that's what
164:27 this script is going to do, or this query
164:29 is going to do. I'm going to copy
164:32 this. So we're going to keep the
164:34 location. We are not going to keep the
164:36 date; this is not going to be date
164:38 specific, this is just going to be
164:40 overall. And then we're going to look at
164:42 the MAX of the total cases, so we only
164:45 want to look at the highest. So when
164:46 we were looking at the US, we had 32
164:48 million. We don't want to look at every
164:50 single one of the total cases; we
164:54 only look at the very highest one. So
164:56 we'll look at the MAX total
164:59 cases, and right here we'll just
165:03 give it an alias, at least something
165:05 to recognize it. So "highest"... I guess we
165:08 can say "infection count." So we'll say
165:10 "highest infection count." That's the
165:11 highest infection count per country,
165:14 so per location.
165:16 And then we want to also take... because,
165:20 since we
165:22 don't have MAX total cases here, if we
165:24 just kept total cases here, it'll give us
165:26 the same one that we were looking at in
165:28 this above query. What we need to do is
165:30 we need to look at the MAX of this. So
165:33 we're going to look at
165:35 MAX and just add a parenthesis there.
165:39 And this isn't the death
165:41 percentage anymore; I forgot to change it
165:43 in this last one. This
165:45 is, what is this, it's "percent of
165:50 population
165:52 infected." So let's change that for both
165:54 of these, because I don't want to get
165:56 confused when looking at the
165:57 column headers later. So we'll look at
166:00 the percent of population infected. Let's
166:03 run this and see what we
166:06 get. "...list... is not contained in either
166:08 the aggregate..." Oh, I need to add a GROUP
166:12 BY, of course. So let's add GROUP
166:16 BY, and we need to group by both the
166:19 population and the location. So let's try
166:22 that really quick. Let's see if this
166:25 works.
166:28 Awesome. Well, we ordered on location
166:31 and population, but I really want to look
166:33 at the
166:34 highest. So let's just see
166:36 really quick; look at some of these
166:39 numbers. Got like 1%, 4%, 10%. Okay, so
166:44 yeah, what we want to do is order on
166:47 this percent population infected, so
166:51 let's go ahead and do that, and let's
166:53 do that
166:54 descending, so the descending gets the
166:57 highest number
166:58 first. My goodness, 177%. So what
167:02 percentage of your population has gotten
167:05 covid, that's been reported,
167:08 we can see that now. So the very first
167:11 one, small population, so it doesn't
167:12 surprise me. But if you look right down
167:14 here, that's that 32 million that
167:16 we were talking about; that's that MAX of
167:18 total
167:19 cases, which is the highest number
167:22 of our infection count. So we have 33, so
167:25 we're at... I mean, we're right up
167:27 there on the list. Let's look for other
167:28 large countries. I mean, it's us, you know;
167:31 there's Israel, there's
167:34 Belgium, Portugal, France. So, you know,
167:37 we're up almost to about 10% in a lot of
167:39 these countries.
167:40 So some of us, including the United
167:43 States, we are in there as well,
167:46 some of us have really high
167:48 percentage rates. We just did not keep it
167:49 under control, and a large
167:52 amount of the population has gotten it.
167:54 That's what this one shows.
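[Editor's sketch] The "highest infection rate" query combines three ideas from this passage: MAX to keep only the peak value per country, GROUP BY on the non-aggregated columns, and ORDER BY ... DESC to rank the result. The sketch below assumes sqlite3 in place of SQL Server, an assumed table name CovidDeaths, assumed aliases, and two entirely hypothetical countries with made-up counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "location TEXT, date TEXT, population REAL, total_cases REAL)"
)
# Hypothetical countries and counts, two dates each.
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?, ?)",
    [
        ("Smallland", "day 001", 100_000, 5_000),
        ("Smallland", "day 002", 100_000, 17_000),
        ("Bigland", "day 001", 1_000_000, 40_000),
        ("Bigland", "day 002", 1_000_000, 100_000),
    ],
)

# One output row per (location, population) pair; MAX collapses the
# per-date rows to the highest running total for that country.
query = """
SELECT location, population,
       MAX(total_cases) AS HighestInfectionCount,
       MAX(total_cases / population) * 100 AS PercentPopulationInfected
FROM CovidDeaths
GROUP BY location, population
ORDER BY PercentPopulationInfected DESC
"""
ranking = conn.execute(query).fetchall()
for location, population, highest, pct in ranking:
    print(location, int(highest), round(pct, 2))
```

SQL Server rejects a query that selects a non-aggregated column missing from GROUP BY, which is the error read out above. SQLite is more lenient and would not raise it, so this sketch cannot reproduce that error message.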
167:58 Now let's look kind of at the sad side of
167:59 things. We were just looking at how many
168:01 people were infected; let's look at how
168:03 many people actually died. So let's
168:07 comment, and we'll say this is
168:09 going to... this is
168:11 "Showing the
168:13 countries with the
168:17 highest..." Am I spelling that right?
168:19 Yeah. "...highest death count per
168:25 population." Now how are we going to do
168:27 this? Let's copy this off the bat, but I
168:32 don't know if we're going to do it the
168:33 exact same way, because we just need
168:37 location and not much else, honestly.
168:41 So let's get rid of all this stuff. But
168:42 we're looking at the highest
168:44 death count, so like we did up here with
168:47 the MAX total cases, we're going to do
168:50 MAX, and then we'll do
168:53 total_deaths. I hope it's like this:
168:56 total_deaths. And then we'll do AS total...
169:01 oops, "total death
169:03 count," and we'll order that by the
169:07 total death count. See, I don't need
169:11 this... I think, yeah, I need the GROUP BY
169:13 because there's an aggregate function.
169:15 And let's try this really
169:16 quick. Okay, so if you're getting this,
169:19 there's a simple-slash-
169:22 confusing explanation to
169:24 this. total_deaths: right now, let's go
169:27 into our covid deaths
169:31 columns. Okay, let's show the total_deaths,
169:35 which is right
169:38 here. It's an nvarchar(255). It's an
169:41 issue with the data type. Oh wait,
169:44 total_deaths, no, total_deaths is right here.
169:47 It's an issue with the data type. It
169:49 just has to do with how the data type is
169:51 read when you use this aggregate
169:53 function. We need to convert it, or
169:57 cast it is what we're actually going to do. We
169:58 need to cast this as an integer so
170:00 that it's read as a numeric. Why, I cannot
170:05 100% give you a perfect explanation for
170:07 it, but this happens all the time. You
170:09 just need to look at the data and
170:10 realize, oh, it's probably because of this
170:12 data type; let's try something else,
170:14 and then it'll work. So let's cast this,
170:17 and casting it, I find, is just
170:19 easier, so just "as int." Boom, there you go.
170:22 So now we're taking this nvarchar(255)
170:26 over here and then we are converting it
170:28 to an
170:29 integer.
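[Editor's sketch] A likely reason the uncast MAX looked wrong: when a column holds text, MAX compares values character by character, so "99" sorts above "576000" because "9" comes after "5". Casting to an integer makes the comparison numeric. The sketch below demonstrates this with sqlite3 standing in for SQL Server (SQLite spells the type INTEGER where the video uses int); the table name CovidDeaths, the aliases, the placeholder dates, the country "Exampleland", and all counts other than the 576,000 figure quoted in the video are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# total_deaths is deliberately stored as text, mimicking the
# nvarchar(255) column described in the video.
conn.execute(
    "CREATE TABLE CovidDeaths (location TEXT, date TEXT, total_deaths TEXT)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?)",
    [
        ("United States", "day 001", "99"),
        ("United States", "day 002", "576000"),
        ("Exampleland", "day 001", "8"),      # hypothetical country
        ("Exampleland", "day 002", "1200"),
    ],
)

# TextMax compares strings; TotalDeathCount compares numbers.
query = """
SELECT location,
       MAX(total_deaths) AS TextMax,
       MAX(CAST(total_deaths AS INTEGER)) AS TotalDeathCount
FROM CovidDeaths
GROUP BY location
ORDER BY TotalDeathCount DESC
"""
cast_rows = conn.execute(query).fetchall()
for location, text_max, total in cast_rows:
    print(location, repr(text_max), total)
```

Checking the column's data type first, as done above, is the quickest way to spot this kind of problem.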
170:33 Now let's run this, and let's get rid of this just for visual
170:36 purposes. Now we are much more
170:39 accurate, but we have a slight issue, or
170:43 we're now seeing a slight issue
170:46 with our
170:47 data. In our data, in the location section,
170:51 we have a few ones that really shouldn't
170:54 be there, ones like World or
170:58 Africa or South America. These are
171:04 grouping entire
171:06 continents. So let's go back up to our...
171:11 let's go back up here and...
171:14 actually, let's pull it up really quick,
171:16 because this is just part of exploring
171:17 the data and figuring it out. So if we
171:20 scroll down, we're
171:22 going to see one like... where is it...
171:24 right here. This location is all of
171:27 Asia, whereas in other ones the continent
171:30 is Asia. If I can pull one up real quick...
171:33 so like right here, the continent is Asia,
171:35 whereas before the location is Asia. But
171:38 if you also notice, the continent is
171:41 null here. So what we need to do is say
171:44 WHERE
171:47 continent IS NOT NULL, because when it is
171:51 null, that means that this location is
171:53 actually an entire continent, and we
171:54 don't want that. That may be helpful
171:56 for us later on, but it is not helpful
172:00 now. So now this right here will get rid
172:03 of that. And just knowing that,
172:06 figuring that out now, we can add that to
172:09 every
172:11 script. And, you know, you
172:14 don't have to do this; I'm just doing
172:15 this for, you know, visual purposes. I'm
172:17 not going to do that for every one. So
172:20 let's say WHERE continent IS NOT NULL,
172:22 and now let's look at this. And now you
172:25 can see that the United States is number
172:29 one.
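[Editor's sketch] The effect of the WHERE continent IS NOT NULL filter can be seen on a tiny table. In this dataset, rows whose location is an aggregate such as World carry a null continent, so the filter keeps only real countries. The sketch assumes sqlite3 in place of SQL Server, an assumed table name CovidDeaths, and integer storage (so no cast is needed here); every count except the 576,000 quoted in the video is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "continent TEXT, location TEXT, total_deaths INTEGER)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?)",
    [
        ("North America", "United States", 576_000),
        ("North America", "Canada", 24_000),   # hypothetical count
        (None, "North America", 800_000),      # aggregate row, hypothetical
        (None, "World", 3_000_000),            # aggregate row, hypothetical
    ],
)

base = """
SELECT location, MAX(total_deaths) AS TotalDeathCount
FROM CovidDeaths
{where}
GROUP BY location
ORDER BY TotalDeathCount DESC
"""
# Without the filter, aggregate rows such as World top the list.
unfiltered = conn.execute(base.format(where="")).fetchall()
# With the filter, only rows that belong to a continent remain.
filtered = conn.execute(
    base.format(where="WHERE continent IS NOT NULL")
).fetchall()
print(unfiltered)
print(filtered)
```

Note that NULL must be tested with IS NULL or IS NOT NULL; comparing with = or <> never matches a null value.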
172:30 And so number one is not the best thing to
172:33 be number one in, but we have a death
172:35 count of 576,000, and again, I Googled
172:38 this earlier; these numbers are pretty
172:40 accurate. Some of them are like a
172:42 day or two behind. Give me a second, I'm
172:43 going to take a
172:49 drink of water. They're like a couple of days behind. This number is actually higher, and
172:53 as we continue to have more
172:57 people die, unfortunately, that number
172:58 just continues to go up. So the data
173:00 that you download may be a lot
173:03 higher. As of right now, we've been
173:06 breaking everything out by location,
173:09 right? Really quickly, let's just do this
173:13 by something we kind of saw earlier.
173:16 I'm just going to do this for
173:18 breaking-it-up purposes, but I'm going to
173:20 say, in caps lock, "LET'S BREAK
173:24 THINGS DOWN BY CONTINENT." How do you spell
173:28 continent?
173:30 "Contin..." Jeez, is that even how you spell
173:32 it? I don't even know. Let's keep going.
173:37 But now we can do continent right
173:40 here, and we'll just copy and paste that.
173:44 Let's get that back up here. And now
173:47 we can see, where continent is not
173:52 null... let's see if that... yeah,
173:55 okay, so now it's breaking it out by
173:57 continents, with North America, South
174:00 America, Asia, Europe, Africa,
174:04 Oceania. Is this
174:07 perfect? No, it's not perfect. North
174:10 America looks like it's only including
174:13 the numbers from the United States and
174:15 not Canada, so we have some small
174:18 issues in here.
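[Editor's sketch] The North America oddity noted above follows from how MAX works with GROUP BY continent: MAX returns the single largest row in each group, which is the largest country's count, not the sum over the continent's countries. The sketch below reproduces that behaviour with sqlite3 standing in for SQL Server. The table name CovidDeaths is assumed, and the Canada and Brazil counts are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CovidDeaths ("
    "continent TEXT, location TEXT, total_deaths INTEGER)"
)
conn.executemany(
    "INSERT INTO CovidDeaths VALUES (?, ?, ?)",
    [
        ("North America", "United States", 576_000),
        ("North America", "Canada", 24_000),   # hypothetical count
        ("South America", "Brazil", 400_000),  # hypothetical count
    ],
)

# MAX picks the largest single row per continent, so North America
# shows only the United States figure and ignores Canada.
query = """
SELECT continent, MAX(total_deaths) AS TotalDeathCount
FROM CovidDeaths
WHERE continent IS NOT NULL
GROUP BY continent
ORDER BY TotalDeathCount DESC
"""
by_continent = dict(conn.execute(query).fetchall())
print(by_continent)
```

With these sample rows, a true continent total for North America would be 600,000, which this query does not produce.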
174:23 issues in here um but for the purposes of what we're trying to do which I don't
174:26 of what we're trying to do which I don't think anyone's going going to come in
174:28 think anyone's going going to come in here and fact check us or check the data
174:30 here and fact check us or check the data they may and then you're I don't know
174:32 they may and then you're I don't know you might be screwed but for the
174:34 you might be screwed but for the purposes of
174:36 purposes of hierarchy um and you know drill that
174:40 hierarchy um and you know drill that drill down effect in Tableau which is
174:42 drill down effect in Tableau which is something we are going to do we want
174:44 something we are going to do we want want to start including this continent
174:45 want to start including this continent in our in our queries so that we can
174:48 in our in our queries so that we can drill down um further into these things
174:53 um we can also do where just wait I'm
174:54 going to do where
174:57 is null um actually let me see so before
175:00 we were doing where continent is not null
175:02 but let's do location I'm just I'm
175:05 doing this on the fly I haven't done
175:06 this before I just kind of
175:07 am doing
175:10 this um this actually is the correct
175:13 numbers
175:14 and I don't know why I didn't do this
175:17 before when I was actually creating this
175:18 project but now this is a wonderful
175:20 beautiful thing I believe this is the
175:21 correct numbers um I could verify but I
175:23 don't want to do that live because I
175:26 might look stupid but I think this is
175:27 accurate um remember before we were
175:29 looking at the location and the location
175:33 um and it was actually the countries
175:34 itself and then there were ones where we
175:36 did where is not null to get rid of all
175:39 the ones that were like world and all
175:40 those other things well now I'm just
175:42 filtering on those instead of deleting
175:45 them before we were looking at
175:47 everything but these now we're only
175:48 looking at these and these numbers look
175:50 a lot more accurate so with that being
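A minimal sketch of the two filters compared above, run through Python's built-in sqlite3 so it can be tried anywhere (the video appears to use SQL Server). The table name covid_deaths, the toy numbers, and the use of MAX(total_deaths) per group are assumptions for illustration, not the actual query or dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_deaths (continent TEXT, location TEXT, total_deaths INTEGER)"
)
# Hypothetical rows: country rows carry a continent, while aggregate rows
# (continents, World) have a NULL continent and their name in location.
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?)",
    [
        ("North America", "United States", 500),
        ("North America", "Canada", 40),
        ("Europe", "France", 100),
        (None, "North America", 540),
        (None, "Europe", 100),
        (None, "World", 640),
    ],
)

# Country rows only: MAX per continent picks the single largest country,
# which is why North America looked like the United States alone.
by_country_max = conn.execute(
    "SELECT continent, MAX(total_deaths) FROM covid_deaths "
    "WHERE continent IS NOT NULL GROUP BY continent ORDER BY continent"
).fetchall()

# Aggregate rows only: location already holds the continent-level total.
by_aggregate_row = conn.execute(
    "SELECT location, MAX(total_deaths) FROM covid_deaths "
    "WHERE continent IS NULL GROUP BY location ORDER BY location"
).fetchall()

print(by_country_max)
print(by_aggregate_row)
```

In this toy data the first query reports 500 for North America (the United States only), while the second reports 540, which includes Canada.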
175:53 said um I'm going to use this going
175:55 forward in my script so I'm going to
175:56 kind of change things up from
175:58 what I originally had um let me see
176:02 though because if that is the case it
176:06 may screw up our drill down effect um
176:09 which is highly unfortunate I may I
176:12 honestly might just revert back to it
176:13 for the pure fact that we want the
176:15 visualizations to look correct um just
176:17 know that this is the right way and if
176:18 you want to go back and do that I highly
176:21 encourage that I didn't figure that out
176:23 my first time around um but I'm willing
176:25 to admit when I'm wrong let me see what
176:28 let me do a time check oh we're running
176:30 like 50 minutes or so I think we're
176:33 gonna we're just going to keep going all
176:34 the way through I don't think we're
176:35 going to stop um I don't think we're
176:38 going to stop in this project so we want
176:41 to do some of the above queries were
176:45 kind of what we were going for nothing
176:47 crazy difficult right nothing crazy hard
176:51 um and now we want to start
176:55 breaking this out by um continent as
176:58 well I'm going to go back
177:01 and is this correct let me look no so is
177:06 not
177:07 null
177:09 um so we want to start doing some of the
177:12 above queries but adding that continent in
177:14 there you can even go back and add that
177:16 as well um if you want to that's totally
177:19 fine I'm going to do some more queries
177:22 down
177:23 here um or at least one more one or two
177:26 more and then we're going to start
177:27 getting I think into some a little bit
177:29 more advanced things we're going to
177:30 start getting into some temp tables uh
177:33 stuff like that because we're going to
177:36 eventually set these up in um views so
177:40 that we have these views to um use for
177:44 Tableau
177:45 later um and again it shows you know how
177:48 to create a view so that's important so
177:51 we've done this first one this
177:53 next one is going to let me go down one
177:57 more this is
178:00 showing the continents with the highest
178:04 death count so almost the exact same as
178:08 we did before but now we're looking at
178:10 the continents um we can even go up and
178:12 look at uh just wait we literally just
178:16 did that
178:18 um so that's what this one is
178:21 actually looking at my notes wrong
178:26 idiot okay perfect um now we you know we
178:31 want to start looking at this from a
178:33 viewpoint of I'm going to visualize this
178:36 so how do we do that what we want to
178:39 look at let's look at some global
178:42 numbers um you can do as many of
178:44 these as you want anything up here just
178:46 add continent to it um anything with
178:49 like GROUP BY just replace it with
178:50 continent and you got it um so I
178:52 don't want to go through and do every
178:53 single one of those but that is kind of
178:55 the gist of what you might want to do
178:57 especially if you want that drill down
178:58 effect and if you don't know what that
178:59 is um you know it's like clicking on
179:02 North America and then when you bring up
179:04 North America then it shows all the
179:07 countries in North America so Canada uh
179:09 and the United States and so it's a
179:11 drill down so you click on Africa and
179:13 then there's all the African countries
179:15 that's what drilling down does and
179:16 that's what you can do when you have um
179:19 those layers so you have the continent
179:22 then you have the location um so you
179:25 know I'm not going to we'll look at
179:27 that when we actually get to Tableau but
179:29 I don't want to actually spend all the
179:30 time writing that
179:32 out um but what we now want to do is we
179:38 want to calculate everything
179:40 across the entire
179:41 world so
179:45 let's do this let's
179:48 say um breaking let's do global let's
179:53 just say global numbers easier
179:56 easier than
179:58 nothing
180:01 um all
180:03 right uh let me really quick find the
180:07 I think it's probably the first one the
180:09 death percentage let me see if
180:12 this is the one that we want
180:17 [Music] okay let me
180:25 see all right so let's take this one I'm sorry that took me a while to find again
180:28 I'm not cutting any of this stuff out
180:29 you just got to stick with me if
180:31 you're sticking with me this long I know
180:33 you care I know you're not
180:35 cutting away because I'm trying to
180:36 figure things out on my side so um let
180:40 me get rid of this so this is the exact
180:43 same SC what well let's say where just
180:46 so we can get the right
180:48 numbers um so we are now going to look
180:52 at the global numbers uh so we're not
180:56 going
180:58 to we're not going to uh include any
181:02 location any continent or anything like
181:04 that but we do want to make sure that
181:06 we're only looking at all of the um
181:09 countries and we're not looking at the
181:11 world numbers plus all the countries
181:12 because then the numbers would get
181:15 astronomical so instead of now we
181:17 can't do so let's try running this
181:19 really
181:20 quick so now
181:23 we really can't do this um because now
181:26 it's breaking everything out by
181:30 um by you know that uh which is the date
181:33 it's breaking everything out by the date
181:35 because um these total cases numbers
181:37 are different right so really quick
181:40 let's GROUP BY date
181:44 and now let's see what it looks
181:46 like uh it's going to give us an error
181:49 obviously that's because we're looking
181:52 at
181:54 um that's because when we're looking at
181:56 this we're looking at multiple things
181:59 and we can't GROUP BY just the dates
182:01 obviously if we wanted to group by
182:03 something which we need to
182:05 do we then need to um start using
182:08 aggregate functions on everything
182:11 else um so really
182:14 quickly let's do some aggregate
182:16 functions I'm looking at my notes for
182:18 just a second um to see what I
182:22 did basically what we want to do and I
182:24 think what'll make things
182:26 easier is I mean I could try to do the
182:30 SUM of MAX total cases I don't think
182:33 that's possible um let me comment this
182:36 out really
182:41 quick yeah um it's because there's an aggregate function within an aggregate
182:43 function and we can't really do that
182:48 um if we go back to the data and we
182:52 kind of looked at this earlier there's
182:53 one called new
182:55 cases um let's use this because instead
182:58 of doing MAX we can just sum it or
183:03 do a sum on it and that's going to give
183:05 us the sum of all the new cases which
183:07 adds up to the total cases so if we do
183:11 this let's see this will give us on each
183:15 day the total across the world because
183:17 we're not filtering by any continent or
183:20 we're filtering out um like the world
183:22 and the actual continents we're not
183:24 filtering by location or continent or
183:25 anything it's just by date so we're
183:27 looking at the sum of the new
183:29 cases so now let's
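The step above can be sketched as follows, again using Python's sqlite3 as a stand-in for the SQL Server session in the video. The table layout and the numbers are made up for illustration. The sketch shows that nesting one aggregate inside another is rejected, and that summing the daily new_cases per date gives the worldwide daily figure instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_deaths ("
    "continent TEXT, location TEXT, date TEXT, new_cases REAL, total_cases REAL)"
)
# Hypothetical data: two countries plus a World aggregate row per day.
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?, ?, ?)",
    [
        ("Asia", "China", "2020-01-23", 90.0, 90.0),
        ("Asia", "Japan", "2020-01-23", 8.0, 8.0),
        ("Asia", "China", "2020-01-24", 200.0, 290.0),
        ("Asia", "Japan", "2020-01-24", 2.0, 10.0),
        (None, "World", "2020-01-23", 98.0, 98.0),
        (None, "World", "2020-01-24", 202.0, 300.0),
    ],
)

# An aggregate inside an aggregate is not allowed (SQLite rejects it too).
nested_failed = False
try:
    conn.execute(
        "SELECT date, SUM(MAX(total_cases)) FROM covid_deaths GROUP BY date"
    ).fetchall()
except sqlite3.OperationalError:
    nested_failed = True

# Summing the per-day new cases over country rows only gives the daily
# worldwide figure without double counting the World row.
daily_new_cases = conn.execute(
    "SELECT date, SUM(new_cases) FROM covid_deaths "
    "WHERE continent IS NOT NULL GROUP BY date ORDER BY date"
).fetchall()

print(nested_failed)
print(daily_new_cases)
```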
183:32 do uh let's do the
183:34 [Music]
183:36 sum
183:38 of uh new underscore
183:42 deaths and we can run that
183:50 one um operand data type nvarchar is invalid for the sum operator so
183:53 going back um and this is something I
183:54 encountered a lot when I was doing this
183:57 is these new cases is a float which is
184:00 why it's working in the sum but the new
184:03 deaths is an nvarchar so what we need to do
184:05 again is cast that as an integer it's
184:10 just the easiest thing to
184:11 do um and now that one should
184:15 work so um let's get rid of the well
184:19 let's get rid of down to here so we're
184:21 about to do another one and that's
184:23 going to be our death percentage
184:24 globally across um the I guess
184:28 the
184:29 world so we need to do the sum of I
184:34 think it's we need to do new
184:41 deaths all right divided by the sum
184:44 of new
184:46 [Music]
184:48 cases all right times
184:52 100 let's see what this takes
184:55 us um okay of course we're getting the
184:58 same thing let me
184:59 um let me put this right
185:03 here and see if this
185:07 works um
185:09 invalid data oh that's because this was
185:12 new cases
185:14 the new deaths one is right
185:18 here and let's run
185:21 this and now we
185:24 are looking good um and as you can see
185:27 the death percentage is right here we
185:30 have
185:31 91 um and let me give these I don't we
185:35 can't let me go back real quick and just
185:37 say as total
185:42 cases as
185:45 total
185:47 deaths um and let's run that
185:59 again okay and so across the world these are our numbers so we have total cases
186:02 on that very first day that cases were
186:04 starting to be
186:05 reported there were 98 total cases there
186:08 was one total death that gives us a
186:10 death percentage of 1% across the
186:12 country
186:13 or across the world and as we scroll
186:16 down it gets lower and lower and that's
186:18 cuz we have a lot of people who have
186:21 gotten infected are the total cases um
186:24 and again that's per day right so if we
186:27 remove this altogether that date
186:30 altogether which we can do right
186:37 now this will give us the total
186:39 cases which is oh gosh let me read this
186:42 through one two 150
186:44 million um versus
186:47 3,180 26 so overall across the world we
186:51 are looking at a um a death percentage
186:54 of a little over
187:02 2% so interesting numbers you can keep both of those queries separate if you'd
187:04 like um you know they might come in
187:06 handy
187:07 later but let's do
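A hedged sketch of the global death percentage described above, with toy numbers rather than the real dataset. Here new_deaths is stored as text to mirror the nvarchar column in the video. Note that SQLite, used through Python's sqlite3, is more lenient than SQL Server and would not raise the "operand data type" error, so the CAST is kept only to mirror the fix. The column aliases are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid_deaths ("
    "continent TEXT, location TEXT, new_cases REAL, new_deaths TEXT)"
)
# Hypothetical rows; the World row has a NULL continent and is filtered out
# so the country numbers are not counted twice.
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?, ?)",
    [
        ("Europe", "France", 150.0, "3"),
        ("Asia", "Japan", 50.0, "1"),
        (None, "World", 200.0, "4"),
    ],
)

global_numbers = conn.execute(
    """
    SELECT SUM(new_cases)                    AS total_cases,
           SUM(CAST(new_deaths AS INTEGER))  AS total_deaths,
           SUM(CAST(new_deaths AS INTEGER)) * 100.0 / SUM(new_cases)
                                             AS death_percentage
    FROM covid_deaths
    WHERE continent IS NOT NULL
    """
).fetchone()

print(global_numbers)
```

Adding date to the SELECT list and a GROUP BY date would give the per-day version of the same numbers.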
187:12 this
187:14 so we
187:15 have
187:18 um give me one second check on my notes
187:21 again because I just want to make sure
187:23 I'm not doing something
187:35 right all right so again we have a whole other table that we haven't used yet
187:38 uh it's this covid
187:40 vaccinations um and just to you know
187:42 refresh your memory let's do um let's
187:47 look at the table from portfolio
187:50 project dot covid vaccinations let's jog our
187:53 memory on what we got
187:55 here so we have
188:00 um we have these tests we have um
188:04 vaccinations over here which is what
188:06 we're actually going to be
188:09 using
188:11 um excuse me
188:13 uh that's what we are going to be
188:17 using so let's join these two tables
188:21 together uh and let's actually
188:24 just
188:25 do from actually let's just do this
188:28 whole
188:29 thing from let's do covid deaths and
188:35 here's how we're going to join it so
188:36 we're going to say join and we're going
188:38 to say oops wait that is wrong join and
188:43 we're going to say on so what are we
188:45 going to join them on um we're going to
188:48 join them on two things we're going to
188:50 join them on location because that's
188:52 much more specific than the continent
188:55 we're going to join them on location and
188:57 we're going to join them on date let's
188:59 call this one dea let's call this one
189:02 vac so a little alias for these
189:06 so that we don't have to type out this
189:07 entire table name each time so let's do
189:11 dea.location
189:14 is equal to
189:16 vac.
189:17 location and dea dot and we'll say date is
189:23 equal to
189:24 vac.date and let's just see what we get
189:28 really quick so we'll have all of these
189:31 things and let's look at Grenada
189:34 0717 let's go all the way over
189:39 here and it should
189:41 have Grenada
189:44 0717 so just making sure that they were
189:47 joined
189:48 correctly um for this query what we're
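The join described above can be sketched like this. The aliases dea and vac come from the video; the table names, columns, and rows are hypothetical stand-ins, and Python's sqlite3 is used so the snippet is runnable without SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE covid_deaths (location TEXT, date TEXT, population INTEGER)")
conn.execute(
    "CREATE TABLE covid_vaccinations (location TEXT, date TEXT, new_vaccinations INTEGER)"
)
conn.executemany(
    "INSERT INTO covid_deaths VALUES (?, ?, ?)",
    [
        ("Canada", "2020-12-15", 100),
        ("Canada", "2020-12-16", 100),
        ("Grenada", "2020-07-17", 10),
    ],
)
conn.executemany(
    "INSERT INTO covid_vaccinations VALUES (?, ?, ?)",
    [
        ("Canada", "2020-12-15", 5),
        ("Canada", "2020-12-16", 7),
        ("Grenada", "2020-07-17", None),
    ],
)

# Join on BOTH location and date so each country-day pairs with exactly
# one row from the other table.
joined = conn.execute(
    """
    SELECT dea.location, dea.date, dea.population, vac.new_vaccinations
    FROM covid_deaths AS dea
    JOIN covid_vaccinations AS vac
      ON dea.location = vac.location
     AND dea.date = vac.date
    ORDER BY dea.location, dea.date
    """
).fetchall()

for row in joined:
    print(row)
```

Spot-checking one row on both sides, as done with Grenada above, is a quick way to confirm that the join keys line up.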
189:51 going to do is look at the total
189:53 population and let's do that right here
189:56 so looking at total population versus
190:01 vaccination so how many what is the
190:04 total amount of people in the world that
190:07 have been vaccinated that is
190:09 what we're going to do in this query
190:12 right here so
190:13 let's do
190:15 dea.
190:20 continent dea.location uh dea.date again these are
190:23 going to be the same in either one but
190:25 we have to specify um let me just for
190:29 example if we do population
190:32 oh actually that's a terrible example um
190:35 because population's only in one let me
190:36 go back real quick let me say I only
190:38 write date that's going to give me an
190:41 error because there's date in both of
190:43 them in fact we joined it on them so we
190:45 know there's date in both of them so
190:46 it's going to give us an error we just
190:47 have to specify what table we want to
190:49 pull it from so we're going to do
190:52 dea um and dea.population just to keep
190:55 it consistent um and now we're going to
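To illustrate the point about specifying the table, here is a small sketch (hypothetical tables and data, run through Python's sqlite3) showing that an unqualified date is rejected when both joined tables have that column, and that prefixing it with an alias resolves it. The exact error wording differs between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE covid_deaths (location TEXT, date TEXT, population INTEGER)")
conn.execute(
    "CREATE TABLE covid_vaccinations (location TEXT, date TEXT, new_vaccinations INTEGER)"
)
conn.execute("INSERT INTO covid_deaths VALUES ('Canada', '2020-12-15', 100)")
conn.execute("INSERT INTO covid_vaccinations VALUES ('Canada', '2020-12-15', 5)")

join_clause = (
    "FROM covid_deaths AS dea "
    "JOIN covid_vaccinations AS vac "
    "ON dea.location = vac.location AND dea.date = vac.date"
)

# Unqualified: the engine cannot tell which table's date is meant.
error_message = None
try:
    conn.execute("SELECT date " + join_clause).fetchall()
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Qualified with the alias: works. population exists in only one table,
# so the prefix there is optional and kept only for consistency.
qualified = conn.execute(
    "SELECT dea.location, dea.date, dea.population " + join_clause
).fetchall()

print(error_message)
print(qualified)
```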
190:58 add the next one dea dot and let's do new
191:02 vaccinations um and really quick let's
191:05 just look at
191:11 this um and let me get my ORDER BY because I want it to be organized I
191:17 actually one let's do one two three I
191:20 don't like it when it's not organized it
191:22 bothers
191:24 me so we're looking at oh no I also need
191:29 to add where continent is not
191:31 [Music]
191:32 null there we
191:35 go uh
191:37 dea perfect now let's run this this
191:40 should look much better there we go all
191:45 right we are in fact if we want to look
191:48 at Afghanistan like we have normally
191:50 been doing um in previous ones we do two
191:54 slash 3 so there's our population here's
191:58 our new
191:59 vaccinations
192:01 now let's
192:04 see we're going to go back go down and
192:07 let's see they have vaccinations
192:08 starting on
192:10 218 um if we go even further down let's
192:12 just go
192:13 to who's this Canada oh yeah Canada
192:16 would be a good one to look at they
192:18 started doing vaccinations
192:20 on right here so 12/15 I mean they
192:25 started very
192:27 early and their numbers only increased
192:31 and now they're you know doing this is
192:33 per day right so this is 288,000 in one
192:36 day um so that's you know really high
192:39 numbers but this is the number of new
192:41 vaccinations um there is a column called
192:45 total vaccinations in this table but
192:47 we're going to do something pretty just
192:50 to display again this whole portfolio
192:52 project is to show potential employers
192:54 that you know how to do certain things
192:56 so I want to set up opportunities to do
192:58 that we're not going to use the total
193:00 vaccinations we're going to use this new
193:01 vaccinations which is new vaccinations
193:03 per
193:05 day um so we want to know or
193:10 day um so we want to we want to know or do kind of like a rolling count um out
193:15 do kind of like a rolling count um out here so as this number let me go back to
193:18 here so as this number let me go back to the beginning as this number increases
193:19 the beginning as this number increases 718 2300 4179 we want it to add up over
193:24 718 2300 4179 we want it to add up over here it's a pretty cool thing I mean you
193:26 here it's a pretty cool thing I mean you know it's once you see it you'll be like
193:28 know it's once you see it you'll be like oh that's pretty easy but you know we're
193:31 oh that's pretty easy but you know we're going to be using partition bu we're
193:32 going to be using partition bu we're going to be using
193:34 going to be using um uh this a Windows function so it's
193:38 um uh this a Windows function so it's really good to to Showcase I think so
193:41 really good to to Showcase I think so we're going to do
193:47 um and let's do um we need to do the sum because
193:50 do um we need to do the sum because we're going to be adding these together
193:53 we're going to be adding these together so we need to do the sum of new
193:57 so we need to do the sum of new vaccinations
193:59 vaccinations oops do the sum of new
194:03 oops do the sum of new vaccinations let's do
194:06 vaccinations let's do over and we're going to say partition oh
194:09 over and we're going to say partition oh gosh Partition by and
194:13 gosh Partition by and we need to Partition by the location
194:17 we need to Partition by the location first and foremost because we're
194:19 first and foremost because we're breaking it up by if we do it by
194:22 breaking it up by if we do it by continent the numbers are going to be
194:24 continent the numbers are going to be completely off we need to do it by
194:27 completely off we need to do it by location location and and also partly
194:30 location location and and also partly the date but you'll see that in just a
194:31 the date but you'll see that in just a second but we need to partition it by
194:33 second but we need to partition it by breaking it up by um
194:37 breaking it up by um location and why is that because every
194:40 location and why is that because every time it gets to a new location we want
194:41 time it gets to a new location we want the count to start over we we don't want
194:43 the count to start over we we don't want this aggregate function to just keep
194:45 this aggregate function to just keep running and running running it'll ruin
194:46 running and running running it'll ruin all of our numbers we only want the this
194:49 all of our numbers we only want the this part a partition on the the location so
194:51 part a partition on the the location so that it runs only through Canada and
194:53 that it runs only through Canada and then when it gets to the next country it
194:55 then when it gets to the next country it doesn't keep going um and if we only did
194:58 doesn't keep going um and if we only did that by the way let's look at what this
195:01 that by the way let's look at what this looks like uh okay real quick I need to
195:04 looks like uh okay real quick I need to
195:06 CAST
195:09 this um as an integer like we've been
195:13 doing in the past you can also do um
195:15 real quick I want to show you another
195:18 one CONVERT and I think
195:20 it's comma
195:24 integer um or is it integer comma let me
195:29 try integer comma I think it's that way
195:31 actually um and you can do it this way
195:35 as well that is up to you um you know
195:36 either one is totally fine if you want
195:38 to use both that's even better because
195:40 then it kind of shows you can do both um
195:41 but they basically do the exact same
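As a rough illustration of the cast step, here is a minimal sketch in Python using the standard-library sqlite3 module rather than the SQL Server setup used in the video. The table name `vaccinations` and its two sample rows are made up for the example, and the sketch assumes the column is stored as text, which is one common reason a cast is needed before summing. SQLite has CAST but no CONVERT, so the SQL Server type-first form appears only as a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: new_vaccinations is assumed to be stored as text.
conn.execute("CREATE TABLE vaccinations (location TEXT, new_vaccinations TEXT)")
conn.executemany(
    "INSERT INTO vaccinations VALUES (?, ?)",
    [("Albania", "60"), ("Albania", "78")],
)

# CAST(expr AS type) works in both SQLite and SQL Server.
# SQL Server also accepts the type-first form: CONVERT(int, new_vaccinations).
total = conn.execute(
    "SELECT SUM(CAST(new_vaccinations AS INTEGER)) FROM vaccinations"
).fetchone()[0]

print(total)
```

Either spelling does the same job in SQL Server, so which one to use is a matter of taste.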
195:44 thing so let's go
195:48 down and let's see what's happening
195:51 here so it goes down to Albania and
195:53 since we're partitioning on Albania
195:55 their total amount of
195:58 vaccinations is 347,000 I know that
196:00 going into it because it has it on every
196:03 single stinking row but down here they
196:05 started to add they started to add up
196:08 right but we didn't do that we only
196:10 partitioned on location so it added it
196:13 did the sum of all the new vaccinations by that location
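The behaviour just described, where every row shows the same per-location total, can be reproduced with a small sketch. It uses Python's sqlite3 module instead of SQL Server (window functions need SQLite 3.25 or newer), a single hypothetical table named `covid` in place of the joined tables from the video, and made-up dates and Canada figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single table standing in for the joined data in the video.
conn.execute("CREATE TABLE covid (location TEXT, date TEXT, new_vaccinations INTEGER)")
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?)",
    [
        ("Albania", "2021-01-01", 60),
        ("Albania", "2021-01-02", 78),
        ("Albania", "2021-01-03", 42),
        ("Canada", "2021-01-01", 500),
        ("Canada", "2021-01-02", 700),
    ],
)

# PARTITION BY without ORDER BY: every row gets the full total for its location.
rows = conn.execute(
    """
    SELECT location, date, new_vaccinations,
           SUM(new_vaccinations) OVER (PARTITION BY location) AS location_total
    FROM covid
    ORDER BY location, date
    """
).fetchall()

for row in rows:
    print(row)
```

The total restarts when the location changes, but within a location it does not accumulate row by row.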
196:16 so what we need to do
196:19 is go over here and say order by and we
196:22 need to order it by both the location
196:24 oops dea.location
196:28 and the date that is very
196:30 important uh the date is what's going to
196:34 separate it out um and you'll see in
196:36 just a second what I mean so now let's
196:38 run this and let's go back down to
196:41 Albania I think it was so here's Albania
196:44 let's go to our first one so here's what
196:48 we have 60 and it gives us 60
196:53 then we add 78 so 60 + 78 = 138
197:00 then 60 + 78 + 42 = 180
197:03 then 60 + 78 + 42 + 61 = 241 so you get
197:06 the point it adds up every single uh
197:08 consecutive one and when there's nulls
197:12 or there's zeros it's not going to add
197:14 anything it's just going to keep it uh
197:17 going and then you can see it's
197:19 a rolling count so we're going to name
197:21 this let's do
197:24 as
197:29 um let's do as um RollingPeopleVaccinated let's call
197:40 that um I think that's good
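The rolling count described here (60, 138, 180, 241) can be sketched by adding ORDER BY inside the OVER clause. As before, this is an illustration in Python with sqlite3 rather than the SQL Server query from the video; the `covid` table, the dates and the Canada figures are assumptions, while the four Albania values are the ones read out above. A row with a NULL is included to show the total carrying forward unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single table; dates and Canada values are made up.
conn.execute("CREATE TABLE covid (location TEXT, date TEXT, new_vaccinations INTEGER)")
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?)",
    [
        ("Albania", "2021-01-01", 60),
        ("Albania", "2021-01-02", 78),
        ("Albania", "2021-01-03", 42),
        ("Albania", "2021-01-04", 61),
        ("Albania", "2021-01-05", None),  # NULL adds nothing to the running total
        ("Canada", "2021-01-01", 500),
        ("Canada", "2021-01-02", 700),
    ],
)

# ORDER BY inside OVER turns the per-location total into a running total.
rows = conn.execute(
    """
    SELECT location, date, new_vaccinations,
           SUM(new_vaccinations) OVER (
               PARTITION BY location
               ORDER BY location, date
           ) AS RollingPeopleVaccinated
    FROM covid
    ORDER BY location, date
    """
).fetchall()

for row in rows:
    print(row)
```

The count starts over at Canada because the partition changes, which is the reset behaviour the partition was added for.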
197:42 now what we want to do is actually
197:45 look at the total population versus the
197:47 vaccinations um and really what we want
197:50 to do is use this rolling people
197:52 vaccinated we want to use the max number
197:54 because at the very bottom is our max
197:57 number this is how many people in
198:00 Albania um we want to use that number
198:02 and divide it by the population to know
198:05 how many people in that country are
198:08 vaccinated so what we want to do is
198:10 we'll do this we'll do rolling people
198:12 vaccinated divided by
198:15 population times 100 and as you can see
198:18 we're getting an error you can't use a
198:21 column that you just created to then use in
198:24 the next one so what we need to do is we
198:27 need to create either a CTE or a temp
198:31 table um this is the time of
198:34 the show of this tutorial whatever
198:35 you want to call it where I'm going to
198:38 give you some options you can do one you
198:39 can do
198:41 both you know there's no preference to
198:42 me
198:45 um but we're going to take this and
198:47 we're going to at least for this first
198:50 one we're going to use a
198:55 CTE so we're going to say excuse me we're
198:59 going to say with and let's call
199:05 it um PopvsVac I don't
199:08 know population versus
199:12 vaccination and then all we need to do
199:15 is specify the um basically the columns
199:18 that we're going to input um so let's
199:21 put as and let's insert that down here
199:22 because what we need to do is we want to
199:24 say
199:26 um we do
199:31 continent oh gosh I'm so bad at spelling
199:34 continent uh
199:37 location date
199:41 population um and then we'll have this RollingPeopleVaccinated that should be
199:51 it um and let's see we just
199:53 need to close this parenthesis so this
199:56 is our CTE it should be
200:00 working um actually that's not true I
200:02 need an open parenthesis here that's why
200:04 it's giving me that
200:07 error um let's see I'm still
200:09 getting an error so let me see if I'm
200:10 doing something wrong do
200:22 parentheses there and there I say
200:27 with PopvsVac there continent location
200:28 date
200:30 population
200:33 ah I believe that is the
200:35 issue so we just need to
200:36 add that last
200:40 column new
200:44 vaccinations um if the number of columns
200:46 in the CTE is different than the number
200:47 of columns here it's going to give you
200:50 an error so you got to make sure um and
200:52 then let's just say for right
200:56 now select everything from and we'll do
200:59 and we can even say PopvsVac it'll
201:02 come up right away so really quickly
201:04 let's run this and see what
201:06 happens uh the order by clause can't be
201:09 in there I knew that but
201:13 whoops let's comment that out
201:15 let's get that all the way up here let's
201:17 run this so now that query that we were
201:20 looking at before is now in here but now
201:23 we can actually use it to perform
201:25 further calculations um so we'll just do
201:27 everything
201:31 comma and then we'll do rolling people
201:34 vaccinated uh divided
201:37 by and that needs to be
201:40 population times 100 I'm pretty sure this
201:42 is incorrect give me a
201:46 second um invalid object oh it's because
201:48 I have to run it with the
201:51 CTE my bad
201:54 um so let's look at this percentage
201:56 really quick
201:58 um it's not wrong and it's actually
202:00 going to give us a rolling number and
202:02 this may actually be what we want
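Putting the CTE steps together, a sketch might look like the following. It runs on SQLite through Python's sqlite3 module, not SQL Server; the `covid` table, the dates, the populations and the Canada figures are all made up, and `percent_vaccinated` is a column name chosen for this example. The CTE lists six column names to match the six selected columns, and the ORDER BY sits in the outer query because SQL Server rejects it inside a CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table; populations are round made-up numbers.
conn.execute(
    "CREATE TABLE covid (continent TEXT, location TEXT, date TEXT, "
    "population INTEGER, new_vaccinations INTEGER)"
)
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?, ?, ?)",
    [
        ("Europe", "Albania", "2021-01-01", 1000, 60),
        ("Europe", "Albania", "2021-01-02", 1000, 78),
        ("Europe", "Albania", "2021-01-03", 1000, 42),
        ("Europe", "Albania", "2021-01-04", 1000, 61),
        ("North America", "Canada", "2021-01-01", 10000, 500),
        ("North America", "Canada", "2021-01-02", 10000, 700),
    ],
)

# The CTE names the rolling column so the outer query can reuse it.
# The column list must have as many names as the inner SELECT has columns.
query = """
WITH PopvsVac (continent, location, date, population,
               new_vaccinations, RollingPeopleVaccinated) AS (
    SELECT continent, location, date, population, new_vaccinations,
           SUM(new_vaccinations) OVER (PARTITION BY location ORDER BY location, date)
    FROM covid
)
SELECT *, 100.0 * RollingPeopleVaccinated / population AS percent_vaccinated
FROM PopvsVac
ORDER BY location, date
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

The CTE and the SELECT that reads from it have to be run together as one statement, which is what the "invalid object" error above was about.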
202:10 um so basically what it's doing is it's
202:12 taking this column
202:15 and doing it versus this column and so
202:17 this number should only increase because
202:19 as this number increases this number
202:21 will increase because the population
202:24 stays stagnant um again I'm kind of
202:27 looking at this as we go so right now
202:31 12% of the population in um
202:35 Albania is vaccinated so that you know
202:37 that's all we know I don't think
202:39 we need to go any further than that I
202:42 think um if you want to
202:44 you can look at the max
202:47 one um but you'll have to get rid of
202:50 date and just keep the location um
202:52 population etc because the date is going
202:53 to throw everything
202:55 off so if that's something you want to
202:57 do absolutely do that
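The optional "max one" variation mentioned here is not built in the video, so the following is only one way it could be done: drop the date from the output and group by location and population. It is a sketch in Python with sqlite3, and the `covid` table, its rows and the `percent_vaccinated` name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with made-up populations and dates.
conn.execute(
    "CREATE TABLE covid (location TEXT, date TEXT, population INTEGER, new_vaccinations INTEGER)"
)
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?, ?)",
    [
        ("Albania", "2021-01-01", 1000, 60),
        ("Albania", "2021-01-02", 1000, 78),
        ("Albania", "2021-01-03", 1000, 42),
        ("Albania", "2021-01-04", 1000, 61),
        ("Canada", "2021-01-01", 10000, 500),
        ("Canada", "2021-01-02", 10000, 700),
    ],
)

# Date is left out of the final SELECT so each location collapses to one row.
query = """
WITH PopvsVac AS (
    SELECT location, population,
           SUM(new_vaccinations) OVER (PARTITION BY location ORDER BY location, date)
               AS RollingPeopleVaccinated
    FROM covid
)
SELECT location, population,
       MAX(RollingPeopleVaccinated) AS total_vaccinated,
       100.0 * MAX(RollingPeopleVaccinated) / population AS percent_vaccinated
FROM PopvsVac
GROUP BY location, population
ORDER BY location
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

Keeping the date in the SELECT would force it into the GROUP BY and bring back one row per day, which is why it has to go.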
203:01 um you can use a temp table here uh
203:03 we can look at how to do
203:08 that really quickly I think um so that
203:10 you guys know how to do that again I
203:12 recommend throwing in one or two of
203:14 these um like even up
203:18 here you can do different um different
203:21 counts and then do one for each um so
203:23 let's do temp
203:27 table all right so it's going to be a
203:29 lot of the same stuff we're going to
203:30 keep
203:32 this
203:36 and this is going to be what we insert
203:39 so let's say insert into and we need to
203:41 write where we're inserting it into but
203:44 let's say uh again I'm only doing this
203:46 for it's going to be basically the same
203:48 it's going to have the same effect but
203:51 um with a temp table so uh we're going
203:53 to do temp table and let's look
203:58 at um let's call it PercentPopulationVaccinated
204:03 and we need to specify our
204:06 columns so let's go down here excuse me
204:08 let's go down here and let's do the
204:10 basically the exact same thing so
204:12 continent
204:14 I think I spelled that
204:18 right no I didn't spell that right I
204:20 almost did I got really confident we'll
204:22 do and just so you know for these
204:24 we have to specify the data type as well
204:26 um because we're basically creating like
204:29 a genuine table it's just a temporary one
204:34 so let's do nvarchar(255) we'll do um
204:36 location we'll do the same thing nvarchar
204:40 oops
204:44 255 we need to do date and we'll do that
204:48 as datetime we'll do
204:51 population and we can do I mean there's
204:52 lots of different ones we can do but
204:54 we'll do numeric for this
204:58 example there's new_vaccinations
205:02 and let's do that one as
205:05 numeric again you can use different
205:07 things um and then we'll do rolling
205:08 people
205:11 vaccinated um this can be numeric as
205:13 well
205:16 um and then we need to insert that into
205:19 here okay so we're inserting the data
205:21 and then down here we can actually
205:25 select it and let's take
205:28 this and do right here except we're
205:30 going to be doing this
205:33 by this right here but it hasn't been
205:34 created yet but it will be created in
205:36 just a
205:40 second okay so you let me see if
205:43 yeah so these were the rows that were
205:46 affected um and then we got our
205:49 actual output from this right here
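The temp table route can be sketched the same way. This version runs on SQLite via Python's sqlite3 module, so the syntax differs from the video in two labeled ways: SQLite writes CREATE TEMP TABLE where SQL Server uses a table name starting with #, and the column types here are SQLite's TEXT and NUMERIC rather than nvarchar(255), datetime and numeric. The `covid` source table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table with made-up populations and dates.
conn.execute(
    "CREATE TABLE covid (continent TEXT, location TEXT, date TEXT, "
    "population INTEGER, new_vaccinations INTEGER)"
)
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?, ?, ?)",
    [
        ("Europe", "Albania", "2021-01-01", 1000, 60),
        ("Europe", "Albania", "2021-01-02", 1000, 78),
        ("North America", "Canada", "2021-01-01", 10000, 500),
        ("North America", "Canada", "2021-01-02", 10000, 700),
    ],
)

# Unlike a CTE, a temp table needs its columns and types declared up front.
# SQL Server equivalent: CREATE TABLE #PercentPopulationVaccinated (...).
conn.execute(
    """
    CREATE TEMP TABLE PercentPopulationVaccinated (
        continent TEXT,
        location TEXT,
        date TEXT,
        population NUMERIC,
        new_vaccinations NUMERIC,
        RollingPeopleVaccinated NUMERIC
    )
    """
)

# Fill it with the same rolling-total query used for the CTE.
conn.execute(
    """
    INSERT INTO PercentPopulationVaccinated
    SELECT continent, location, date, population, new_vaccinations,
           SUM(new_vaccinations) OVER (PARTITION BY location ORDER BY location, date)
    FROM covid
    """
)

# The stored column can now be used in further calculations.
rows = conn.execute(
    """
    SELECT *, 100.0 * RollingPeopleVaccinated / population
    FROM PercentPopulationVaccinated
    ORDER BY location, date
    """
).fetchall()

for row in rows:
    print(row)
```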
205:50 now let's say you wanted to change something
205:52 in here you're like oh you know I
205:53 don't want to do it with this let me
205:57 comment that out and then let me do this
206:02 and um create that table again oh no we
206:04 got an error um how can we get around
206:06 this very simple I've done this in a I
206:08 should do this in a different one you
206:12 can do DROP TABLE IF EXISTS
206:14 and then do this right
206:17 here um and when we run this it
206:20 should give us our output I highly
206:22 recommend just adding this especially if
206:25 you plan on making any alterations so
206:29 that when you um run it multiple times
206:30 you don't have to you know go and then
206:33 delete the view or delete the temp
206:36 table or drop temp table or you know
206:38 it's just built in it's at the top it's
206:40 easy to maintain and it looks good
206:42 it's something that a lot of people
206:44 do and so if you have that at the top of
206:47 your query and somebody you know
206:48 somebody who wants to hire you looks at
206:51 this like oh okay that makes sense I'm
206:52 glad they included that they know what
206:53 they're doing this guy's smart I should
206:58 hire them
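A small sketch of why the DROP TABLE IF EXISTS line helps: with it, the same build step can be run twice without error, and without it the second CREATE fails. This again uses Python's sqlite3 module as a stand-in for SQL Server, and `build_temp_table` is a hypothetical helper written for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def build_temp_table(connection):
    """Recreate the temp table from scratch so the script can be rerun safely."""
    connection.execute("DROP TABLE IF EXISTS PercentPopulationVaccinated")
    connection.execute(
        "CREATE TEMP TABLE PercentPopulationVaccinated "
        "(location TEXT, RollingPeopleVaccinated NUMERIC)"
    )
    connection.execute(
        "INSERT INTO PercentPopulationVaccinated VALUES ('Albania', 241)"
    )


build_temp_table(conn)
build_temp_table(conn)  # second run succeeds because the old table is dropped first

# Without the DROP, creating the same table again raises an error.
try:
    conn.execute("CREATE TEMP TABLE PercentPopulationVaccinated (location TEXT)")
    recreate_failed = False
except sqlite3.OperationalError:
    recreate_failed = True

row_count = conn.execute(
    "SELECT COUNT(*) FROM PercentPopulationVaccinated"
).fetchone()[0]
print(recreate_failed, row_count)
```

Because the table is rebuilt each time, rerunning does not pile up duplicate rows either.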
207:01 um I feel like I've shown you as
207:03 much as I can show you um with the
207:05 limited data that we've looked at again
207:07 I could have done this for six hours
207:09 straight if I had used all the data at
207:11 least I mean there's just so much data
207:14 but let's create a view you know I'm
207:15 only going to show you how to create one
207:17 view but I want you to go back and
207:19 create multiple views you know if this
207:21 is one that you want to look at these
207:23 global numbers um let's look at this one
207:26 really quick if you want to look at this
207:28 number right here toss it in a view I
207:29 mean that one doesn't make sense to toss
207:30 in a view but this
207:34 one toss these numbers in a view um and
207:36 we're going to um look at it in
207:40 Tableau later but for right now let's
207:42 just create our
207:45 view um so like let's just
207:47 say
207:52 creating view to store data for
207:54 later
207:57 visualizations all right so let's say
208:01 CREATE VIEW um and I'm just going
208:06 to keep the same thing um like that um
208:09 and for views it's so easy I mean I'm
208:11 literally just going to and I can even
208:14 take um the order by I believe we'll see
208:16 if I'm
208:18 correct um actually let's get rid of
208:20 both of these
208:25 things so it says CREATE VIEW percent
208:29 uh percent populate oops
208:32 PercentPopulationVaccinated
208:34 um and let's see am I doing
208:35 anything wrong
208:39 here let me
208:42 see the order by clause
208:44 I was completely wrong I was wondering
208:46 why I was getting that now let's try
208:50 running it okay so it ran successfully
208:52 um let's look at our views it's not
208:54 going to be in there let's refresh it
208:57 hey look we got our very first view we
208:59 can open that up like a table if we want
209:01 to um I mean it's gorgeous um
209:04 if you want to get rid of that select or
209:07 sorry Ctrl+Shift+R that's a
209:10 refresh um and now it basically
209:13 recognizes it but let's go back here
209:14 for a
209:18 second um and you know we can now query
209:22 off of that it's a view now so you know
209:25 it's
209:27 permanent you know you have to go in and
209:28 actually delete it it's not like a temp
209:30 table this is now permanent and this
209:32 could be something that we now use for a
209:35 visualization later so do some of these
209:36 look at some of the queries that we've
209:38 looked at and create a few of these
209:42 views um and we will use them later
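Creating the view and then querying it like a table can be sketched as follows. The example uses Python's sqlite3 module in place of SQL Server, with a made-up `covid` table; the sqlite_master lookup stands in for refreshing the Views folder in the object explorer. There is no ORDER BY in the view body, in line with the error hit above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table with made-up populations and dates.
conn.execute(
    "CREATE TABLE covid (continent TEXT, location TEXT, date TEXT, "
    "population INTEGER, new_vaccinations INTEGER)"
)
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?, ?, ?)",
    [
        ("Europe", "Albania", "2021-01-01", 1000, 60),
        ("Europe", "Albania", "2021-01-02", 1000, 78),
        ("Europe", "Albania", "2021-01-03", 1000, 42),
        ("Europe", "Albania", "2021-01-04", 1000, 61),
        ("North America", "Canada", "2021-01-01", 10000, 500),
        ("North America", "Canada", "2021-01-02", 10000, 700),
    ],
)

# A view stores the query, not the rows; sorting is left to whoever queries it.
conn.execute(
    """
    CREATE VIEW PercentPopulationVaccinated AS
    SELECT continent, location, date, population, new_vaccinations,
           SUM(new_vaccinations) OVER (PARTITION BY location ORDER BY location, date)
               AS RollingPeopleVaccinated
    FROM covid
    """
)

# List the views in the database, then query the view like a table.
view_names = [
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'view'")
]
rows = conn.execute(
    """
    SELECT location, MAX(RollingPeopleVaccinated)
    FROM PercentPopulationVaccinated
    GROUP BY location
    ORDER BY location
    """
).fetchall()

print(view_names)
print(rows)
```

In a file-backed database the view would persist between sessions until it is dropped, which is the difference from a temp table pointed out above.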
209:46 um normally in like a normal setting uh
209:49 if I was actually working I would put
209:51 some of these in actual like I would
209:53 call them like a work view or a work
209:56 table or something set aside so that I
209:59 can use them consistently um but I would
210:01 also set them aside so that I could
210:04 connect Tableau to that view now we're
210:05 going to be using something called
210:07 Tableau Public that'll be in the very
210:11 next tutorial unfortunately um
210:14 let me see if I can show you I can't
210:17 show you Tableau Public does not connect
210:19 to SQL databases um and that's because
210:22 it's free and I totally get it you have
210:23 to pay for the upgraded version but I am
210:27 not a billionaire okay I cannot afford
210:29 uh the real version of Tableau I'm also
210:31 not like a student or like something
210:33 where I can get it cheap so I'm not
210:34 paying for that so we're going to use
210:36 Tableau Public and I recommend this
210:38 anyways because anybody can access it
210:40 it's free for anybody so we're
210:41 going to be using Tableau in the next
210:44 one to actually visualize a lot of these
210:45 things I want to get at least five
210:47 visualizations we're going to create a
210:49 dashboard it's going to be a beautiful
210:51 beautiful thing all right so the very
210:53 last thing that we are going to do is we
210:55 are going to actually save this and then
210:57 put it into GitHub and I just want to
210:59 show you how to do that that's where
211:00 we're going to be storing our code at
211:03 least for now um so let's go up here
211:08 let's click File let's click Save As
211:09 I already have multiple versions of
211:12 this let's just put
211:15 B2 we're going to save that so we have
211:18 this saved now I'm going to go over here
211:21 I'm going to go to my GitHub now if you
211:23 don't have an account I highly recommend
211:25 getting an account so you can start
211:28 putting your portfolio projects in here
211:29 of course we're not going to put our
211:31 Tableau one in here but our SQL ones and
211:33 our Python ones you can put in here
211:36 again I'll talk a lot more about how we
211:39 actually want to display this in GitHub
211:42 or other places but what we're going to
211:43 do for this is we're going to create a
211:44 new
211:48 repository let's call this one
211:51 portfolio
211:54 projects make it public we'll create the
211:56 repository we'll do all that extra stuff
211:58 later so what we now want to do is
212:01 upload an existing file we'll click
212:04 right there go to choose files and we'll
212:07 click this latest one that we saved uh
212:10 and we'll open it and we can always
212:12 change the name of it later on and you
212:14 can add notes if you'd like but we'll
212:16 commit that change so we'll actually
212:20 upload this uh this
212:22 file um but let's look at it really
212:24 quick and I'm going to go back and I'm
212:27 going to use the real one which has the
212:29 formatting and the notes that I have
212:31 that I wanted to add in there but as you
212:35 can see you know you can see all of the
212:36 queries that we wrote and this is
212:38 fantastic so if somebody comes in here
212:40 you know we'll have more notes and kind
212:44 of better comments on what they do um
212:46 and what the takeaway is from this for a
212:49 hiring manager you know when they
212:51 actually look at this so this is a
212:53 really really good place to start again
212:56 uh this may not be your optimal place to
212:58 put this I'll give you a few different
213:00 options in a later video about how we
213:03 can actually uh potentially improve upon
213:06 this I'm really looking forward to
213:09 getting more portfolio projects done so
213:10 we can actually start building a
213:12 complete
213:14 portfolio uh if you've stuck around all
213:16 this way I just want to say
213:19 congratulations I mean I know this was a
213:22 long video I know that it took a long
213:24 time but you stuck with me uh you
213:26 put in the hard work and that is
213:28 fantastic and I really hope that it pays
213:29 off and I hope that this has been
213:31 helpful thank you for watching we'll
213:34 have a lot more uh videos in the future
213:36 on these portfolio projects and I'm
213:38 just really really looking forward to
213:40 doing them to be honest so thank you for
213:42 sticking with me uh thank you for
213:44 watching I really appreciate it if you
213:46 like this video be sure to like and
213:48 subscribe below and I will see you in
213:50 the next
214:00 video
214:02 video what's going on everybody welcome back to another video today we will be
214:04 back to another video today we will be heading back in a sequel for our third
214:06 heading back in a sequel for our third portfolio
214:15 now I am extremely excited for this project in particular for a few reasons
214:17 project in particular for a few reasons one we're getting back into SQL and I
214:19 one we're getting back into SQL and I really like SQL and two we are finally
214:22 really like SQL and two we are finally focusing on data cleaning and I have
214:24 focusing on data cleaning and I have talked so much about why data cleaning
214:26 talked so much about why data cleaning is important and that you really need to
214:28 is important and that you really need to learn how to clean data and that that's
214:30 learn how to clean data and that that's a big part of what a data analyst does
214:32 a big part of what a data analyst does but I haven't actually showed you how to
214:34 but I haven't actually showed you how to do it yet and so that is what this whole
214:36 do it yet and so that is what this whole project is going to be and then at the
214:38 project is going to be and then at the end you'll get to add it to your
214:39 end you'll get to add it to your portfolio so it's really a win-win now
214:41 portfolio so it's really a win-win now before we start I just want to say that
214:42 before we start I just want to say that I think it's going to be a little bit
214:44 I think it's going to be a little bit more advanced than our very first video
214:45 more advanced than our very first video in Sequel where we walk through data
214:47 in Sequel where we walk through data exploration if you see something that
214:49 exploration if you see something that you have never seen before I will do my
214:51 you have never seen before I will do my best to explain it while we're walking
214:53 best to explain it while we're walking through it but if you get confused or it
214:55 through it but if you get confused or it seems a little complicated please pause
214:57 seems a little complicated please pause it Google it do a little bit of research
214:59 it Google it do a little bit of research and then come back and I think that will
215:00 and then come back and I think that will be very helpful with that being said
215:02 be very helpful with that being said let's jump over to my screen and we'll
215:03 let's jump over to my screen and we'll get started on the project so we're
215:04 so we're going to start over here on GitHub and
215:06 this is where I've actually put the data
215:07 set that we are going to be using so I
215:09 will put this link in the description uh
215:11 we're going to go right over here to the
215:12 Nashville housing data for data cleaning
215:16 all you have to do is click download and
215:19 it's going to download it and you can
215:20 open it up if you want to we're not
215:22 going to do anything to this data at all
215:25 but really quick I'm just going to show
215:26 you what it does look like um and we'll
215:29 of course look at this in SQL in just a
215:30 little bit but we have a unique ID
215:32 parcel ID uh we have this
215:35 address a sales date uh the price of the
215:38 home so this is housing data if you
215:41 didn't pick up on that already uh who
215:42 actually owns the home the owner address
215:45 and then some information about land
215:48 value um bedrooms bathrooms things like
215:50 that again not super important um
215:53 because we're going to be doing all of
215:55 this in uh SQL
215:59 so let's actually get this data into SQL we're going to import
216:01 it the exact same way that we did uh in
216:04 the very first video so we're going to
216:06 come right over here going to go all the
216:08 way down to Microsoft SQL Server 2019
216:12 Import and Export we'll click next our
216:16 data source is like last time a
216:18 Microsoft Excel and let's take a
216:21 look and we'll take that first one this
216:24 is the most recent one I've downloaded
216:26 but I just wanted to make sure so I
216:28 downloaded a few times um for the
216:32 destination we're going to click SQL
216:34 Server Native Client
216:36 11.0 and this is my client or my server
216:40 right here
216:41 and I'm going to go down here and I want
216:42 to put it in this portfolio project so
216:45 you know just configure this to what
216:47 your server is um again if you haven't
216:49 done this before you've never set up SQL
216:51 Server or a server um to go on SQL
216:54 Server I will leave a link hopefully
216:57 right here also in the description uh
216:59 like I did for the first project so um
217:02 you know be sure to go through that
217:03 video so that you know how to download
217:05 this and have everything we're going to
217:07 copy the data we're going to take sheet
217:09 one um we could have renamed sheet one to
217:12 something else but uh we didn't and then
217:14 we're going to finish this and finish
217:16 and it should run successfully
217:20 hopefully it's looking good perfect so
217:23 we have
217:25 56477 so let's head over to
217:28 SQL
217:32 all right let's go to our database portfolio project uh and here
217:36 is our sheet one now I'm going to rename
217:38 this um let's rename it what is it
217:42 Nashville let's just do Nashville
217:44 housing that's what I'm going to rename
217:46 it as um at least so when I post these
217:50 queries um to the GitHub and you see
217:52 them this is what they will be so if you
217:54 want to have them the exact same or be
217:56 able to copy and paste them um you know
217:58 you should do that as well so
218:00 let's take a look really quick let's
218:01 select the top
218:03 1,000 but there's about 56,000 rows
218:06 there's a lot of data in here um and a
218:09 lot of things so
218:11 uh I'm about to open up a saved thing
218:13 and we'll walk through the exact things
218:14 that we're going to be working on in
218:15 just a little bit but um yeah this is
218:18 what the data looks like in here there's
218:19 lots of columns uh lots of data so
218:22 really excited about this um let me pull
218:24 this open really fast it's going to be
218:26 this project walkthrough here are the
218:30 things and I'm going to show you this
218:31 really quickly here are the things that
218:33 we're going to be walking through
218:34 so we're going to standardize the date
218:35 format we're going to populate the
218:37 property address data um that's
218:39 referring to this right here if you
218:41 notice there's the address and there's
218:43 also the city that it's in so we want to
218:47 be able to separate that out um and that
218:50 is actually right over here we're going
218:52 to be doing the same thing to the
218:54 owner address except that has an address
218:56 a city and the state um which makes it a
218:59 little bit more complicated and so um
219:01 that one should be really really cool to
219:03 show you um oh whoops I messed up
219:07 that's what this one is breaking out
219:09 into individual columns that's what we're
219:10 going to do for that this populate the
219:12 property address um you know if you
219:14 notice and we'll go into this a little
219:16 bit there's actually some values in the
219:17 property address that are blank but I'm
219:19 going to show you how you can actually
219:21 populate that um which you know
219:24 it's just a cool trick that I've used a
219:26 few times and it does work I think
219:29 you'll find that one interesting um in
219:31 the sold as vacant field we're going to
219:33 be doing some um some case statements if
219:37 then um then we're going to be removing
219:39 duplicates and then deleting
219:40 unused columns so we have a lot to get
219:43 through this could be potentially the
219:44 longest video and I'm okay with that um
219:47 because I love SQL
219:52 down here and I will say that in the
219:53 very first video I said it was going to
219:54 be an ETL video um and I fully intended
219:58 on doing that but I ran into not issues
220:00 on my side but issues in the fact that
220:03 the vast majority of people who are
220:05 going to be watching this are not going
220:06 to be able to do what I did to configure
220:08 my server um but I left it in here
220:11 anyways when I think ETL it's an automated
220:13 process in order to uh extract the data
220:16 from somewhere we're going to transform
220:18 it and then put it somewhere this was
220:20 going to be the extraction method um and
220:22 I was going to put it in a stored
220:23 procedure so that you could um you know
220:25 run the stored procedure run the
220:27 job import the data it was going to be
220:29 really cool but I know that if I was
220:33 having trouble with it me trying to
220:35 explain it to you and you being able to
220:36 figure it out on your side was going to
220:38 be very tough I left this anyways
220:41 because I was able to get it to work on my
220:43 computer um but it is tough and it took
220:47 a lot of research um and I did this for
220:49 a previous server like a year or two ago
220:51 and I remember it being crazy hard but I
220:54 was able to figure it out on my computer
220:55 so if you want to try it out um try it
220:57 out and look into the stuff so I'm
220:59 going to leave this here this is just
221:01 for if you want to try it it's a little
221:03 more advanced um and so you don't have
221:07 to just import it and this will be a
221:08 data cleaning project
221:10 instead of an ETL project but data
221:12 cleaning is what 90% of it was going to be
221:14 anyways
221:17 um anyways let's go back up to the very top really quickly I have a
221:20 whole other laptop right here as I did
221:23 in the first video I didn't show it to
221:24 you last time but um I have all of my
221:27 queries written out over here I'm going
221:29 to try to do this as quickly as possible
221:30 we have a lot to get through now before
221:32 we start writing our queries I am going
221:34 to turn off my camera so I do not get in
221:36 the way all right you should still be
221:38 hearing my voice let's get started
221:40 let's just start with select everything
221:43 and we'll do from uh and it is
221:47 PortfolioProject.dbo.NashvilleHousing
221:49 so let's just get
221:52 this pulled up on
221:55 screen awesome so this is exactly what
221:58 we were looking at
222:00 before
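
For reference, the first query dictated here might look like the sketch below. The database, schema, and table names (PortfolioProject, dbo, NashvilleHousing) are inferred from what is said in the video; the exact spelling on your server depends on what you typed during the import and rename, so treat them as assumptions.

```sql
-- Assumed names: PortfolioProject (database), dbo (schema), NashvilleHousing (table).
-- Adjust them to match whatever you chose during the import and rename.

-- Pull up every column and every row of the imported table.
SELECT *
FROM PortfolioProject.dbo.NashvilleHousing;

-- Optional sanity check: count the rows and compare the result
-- with the number the import wizard reported when it finished.
SELECT COUNT(*) AS TotalRows
FROM PortfolioProject.dbo.NashvilleHousing;
```

Writing the full three-part name means the query works no matter which database is currently selected in the query window.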
222:02 and the very first thing that we're going to be looking at is this
222:04 sale date now uh I wrote standardize
222:06 sale date but I'm really just going to
222:08 change the sale date um so let's copy
222:12 this really
222:13 quick and let's look at just sale
222:22 date and it has this time on the end and it serves absolutely no purpose and
222:24 it just annoys me I want to take that off
222:26 and so right now it's a datetime
222:28 format but we're going to convert
222:32 and we're going to do date and we're
222:34 going to take sale
222:37 date and we're going to go
222:41 like that and let's run this really
222:44 quick and this is what we want it to
222:46 look like
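
The conversion described here can be sketched as follows. To keep the example runnable on its own it uses a small temporary table with made-up rows; the column names (UniqueID, SaleDate) are assumptions based on how the columns are spoken in the video.

```sql
-- Hypothetical stand-in for the imported table (names and rows are illustrative only).
-- DROP TABLE IF EXISTS needs SQL Server 2016 or later.
DROP TABLE IF EXISTS #NashvilleHousing;

CREATE TABLE #NashvilleHousing (
    UniqueID int      NOT NULL PRIMARY KEY,
    SaleDate datetime NOT NULL
);

-- 'YYYYMMDD' literals are read the same way regardless of language settings.
INSERT INTO #NashvilleHousing (UniqueID, SaleDate)
VALUES (1, '20130409'),
       (2, '20140610');

-- Show the stored datetime next to the same value converted to a plain date.
SELECT SaleDate,                                  -- datetime: date plus a midnight time part
       CONVERT(date, SaleDate) AS SaleDateOnly    -- date: the time part is gone
FROM #NashvilleHousing;
```

Nothing is changed in the table at this point; the SELECT only previews what the converted value would look like.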
222:51 all right so let's say update and we have portfolio project specified
222:53 up here so we can just say Nashville
222:55 housing and we are going to set sale
222:59 date equal to and we're just going to
223:01 copy this now I will say before we do
223:02 this um I had some issues when I
223:06 was initially doing it whether or not it
223:09 made the update and I'm not sure
223:11 why or why not it was doing it um so yeah
223:14 it's not doing it right now you try it
223:17 out on yours it may or may not be
223:19 working I'm not exactly sure why that is
223:21 because I would say like 80% of the time
223:23 it's doing it 10 20% it's not I don't
223:26 know why um no logical explanation of
223:28 that but uh most the time when I
223:32 did it they would then be the same
223:33 column something we can do I just
223:36 thought of we can do alter can't
223:38 even say that word alter
223:41 table and we can say um I think it's new
223:46 or it's add um give me one
223:50 second yeah so add and we'll just do
223:53 sale date
223:56 converted um and let's make that a date
224:00 format and boom just like this and then
224:04 we can
224:06 say like this and say
224:10 sale date
224:12 converted um let's try this and see what
224:14 happens so I'm going to add this
224:17 column and then I'm going to update this
224:20 and it says it's affected let's see what
224:22 happened uh so let's write
224:26 sale date
224:29 converted
224:32 let's see what happened let's
224:34 see if it actually
224:35 worked and it worked okay so we now
224:38 have a column um and maybe at the end
224:41 we'll remove that sale date column so
224:44 that we just have that sale date
224:45 converted but we know what that is you
224:47 don't have to name it that you can name
224:48 it sale date 2 or something like that
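
The two attempts in this part can be sketched as below, again on a small made-up temporary table with assumed column names. The video gives no reason for the in-place update appearing to do nothing; one likely explanation is that the column itself is still of type datetime, so a date written into it is converted straight back and the time part reappears.

```sql
-- Hypothetical stand-in for the imported table (names and rows are illustrative only).
DROP TABLE IF EXISTS #NashvilleHousing;

CREATE TABLE #NashvilleHousing (
    UniqueID int      NOT NULL PRIMARY KEY,
    SaleDate datetime NOT NULL
);

INSERT INTO #NashvilleHousing (UniqueID, SaleDate)
VALUES (1, '20130409'),
       (2, '20140610');

-- Attempt 1: update in place. The statement runs, but the column type is
-- still datetime, so the values keep their midnight time part.
UPDATE #NashvilleHousing
SET SaleDate = CONVERT(date, SaleDate);

SELECT SaleDate FROM #NashvilleHousing;

-- Attempt 2 (the route the video ends up taking): add a new column of type date.
ALTER TABLE #NashvilleHousing
ADD SaleDateConverted date;
GO
-- GO is a batch separator understood by SSMS and sqlcmd; starting a new batch
-- here makes sure the statements below can see the new column.

UPDATE #NashvilleHousing
SET SaleDateConverted = CONVERT(date, SaleDate);

-- Compare the original and the converted column side by side.
SELECT SaleDate, SaleDateConverted
FROM #NashvilleHousing;
```

Another option, not used in the video, would be to change the type of the existing column with ALTER TABLE ... ALTER COLUMN SaleDate date, which avoids carrying two date columns around.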
224:52 um cool well let's go down to the
224:55 property address and let's get just a
224:58 really quick look at it uh let's copy
225:01 this up here I hate rewriting this stuff
225:04 so I'm always copying and pasting um but
225:07 we're going to be working with the property
225:15 address there we go so let's take a look at this really
225:16 quick
225:22 um so let's look at sorry I was looking at my notes we need to look at where the
225:26 property address is null so what you'll
225:28 see really quick when we run this is
225:30 that there are null values um why there
225:34 are null values yeah I really don't know
225:38 um I really am not sure but let's look
225:42 at
225:44 everything where this
225:46 is um where it's null so we have this
225:50 property address we have a sale date a
225:52 price legal reference um there's this
225:54 parcel ID and there's this unique ID um
225:57 so we have a lot of information and when
226:01 you have something like this something
226:02 like an address an address is you
226:06 know the address isn't going to change
226:08 the address is the address the owner the
226:10 owner's address might change but the
226:13 property itself the address 99.9% of the
226:16 time is not going to change so you can
226:18 say with almost certainty that you know
226:21 this property address could be populated
226:23 if we had a reference point um to base
226:27 that off of
226:32 so really quickly um let's look at just
226:34 everything and let's look at and we'll
226:37 just order by
226:40 let's
226:41 do property not property address uh
226:44 let's do parcel ID and let's take a look
226:47 at this so we have to do a little bit of
226:49 research on this um but I'm going
226:53 to show you something really quick let's
226:54 see if I can find
226:56 an example um in not too
226:59 long okay so here's an example here's
227:01 the same ID so 015 boom and that's the
227:06 exact same address and we'll find this a
227:09 lot of times and I looked through the data
227:11 and it is pretty much accurate um
227:15 when it does have it it is the exact
227:18 same address so rows with the same parcel ID are going
227:21 to have the same property address
227:25 um so something that we can do is
227:28 basically say if this parcel ID has an
227:33 address and this parcel ID does not have
227:35 an address let's populate it with this
227:39 address that's already populated because
227:41 we know these are going to be the same
227:43 that is basically what we are about to
227:44 do um and it's not super complicated um
227:50 but let's get started writing
227:53 it
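
The two exploratory queries described above can be sketched like this. The temporary table, its rows, and the column names (UniqueID, ParcelID, PropertyAddress) are made up for illustration and only mirror the situation discussed in the video: two rows share a parcel ID, and one of them is missing its address.

```sql
-- Hypothetical stand-in for the imported table (names and rows are illustrative only).
DROP TABLE IF EXISTS #NashvilleHousing;

CREATE TABLE #NashvilleHousing (
    UniqueID        int          NOT NULL PRIMARY KEY,
    ParcelID        varchar(20)  NOT NULL,
    PropertyAddress varchar(100) NULL
);

INSERT INTO #NashvilleHousing (UniqueID, ParcelID, PropertyAddress)
VALUES (1, 'P-001', '100 EXAMPLE ST, NASHVILLE'),
       (2, 'P-001', NULL),
       (3, 'P-002', '200 SAMPLE AVE, NASHVILLE');

-- Step 1: find the rows that have no property address.
SELECT *
FROM #NashvilleHousing
WHERE PropertyAddress IS NULL;

-- Step 2: sort by parcel ID so rows for the same parcel sit next to each other,
-- which makes it easy to spot a populated address that could fill the gap.
SELECT *
FROM #NashvilleHousing
ORDER BY ParcelID;
```

Note that the comparison has to be written as IS NULL; a condition such as = NULL does not match null values under SQL Server's default settings.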
227:57 let's copy that down there um one thing we are going to have to do with
227:59 this is do a self-join so we have to
228:01 join the table to itself to look
228:05 at if this is equal to this then this
228:09 needs to be equal to this that kind of
228:11 thing um so real quick let's just write
228:13 that join part out and we'll go from
228:16 there I don't know why I sounded
228:17 Canadian right there we'll go from
228:20 there uh so we'll join on this and we'll
228:24 say
228:25 on a dot oh wait let's label them
228:30 I'm gonna do this in a really lazy way
228:32 I'm just going to do a and b a.ParcelID
228:35 is equal to b.ParcelID
228:40 and um let's see really
228:43 quick so we need to find a way to
228:45 distinguish these the sale date could be
228:47 the same um one thing this unique ID is
228:50 unique so we need these to be
228:52 different so let's use this and let's
228:54 say um let's say and a.UniqueID is not
229:00 equal to b.UniqueID so all we have
229:05 done here is we've joined the same
229:08 exact table to itself and we said where
229:10 the parcel ID is the same but it's not
229:13 the same row right because this is a
229:15 unique ID unique means
229:18 these will never repeat themselves so
229:20 we'll never get the same one so if
229:22 this is equal to this but these are
229:25 different we want to then
229:29 populate the other one so let's do a.ParcelID
229:32 and we'll say a.PropertyAddress
229:37 b.ParcelID comma b.PropertyAddress
229:42 um and let's take a look at this
229:45 really
229:46 quick and let's
229:49 do let me see if this works where
229:53 a.PropertyAddress is null and let's see
229:59 what comes up
230:00 here
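
A minimal sketch of the self-join built up in this part, run against a made-up temporary table. The aliases a and b follow the video; the table contents and the exact column names are assumptions.

```sql
-- Hypothetical stand-in for the imported table (names and rows are illustrative only).
DROP TABLE IF EXISTS #NashvilleHousing;

CREATE TABLE #NashvilleHousing (
    UniqueID        int          NOT NULL PRIMARY KEY,
    ParcelID        varchar(20)  NOT NULL,
    PropertyAddress varchar(100) NULL
);

INSERT INTO #NashvilleHousing (UniqueID, ParcelID, PropertyAddress)
VALUES (1, 'P-001', '100 EXAMPLE ST, NASHVILLE'),
       (2, 'P-001', NULL),
       (3, 'P-002', '200 SAMPLE AVE, NASHVILLE');

-- Join the table to itself: "a" is the row missing an address,
-- "b" is a different row that belongs to the same parcel.
SELECT a.ParcelID        AS MissingParcelID,
       a.PropertyAddress AS MissingAddress,
       b.ParcelID        AS DonorParcelID,
       b.PropertyAddress AS DonorAddress
FROM #NashvilleHousing AS a
JOIN #NashvilleHousing AS b
    ON  a.ParcelID = b.ParcelID       -- same parcel
    AND a.UniqueID <> b.UniqueID      -- but not the very same row
WHERE a.PropertyAddress IS NULL;
```

With these three sample rows the query returns a single row: the record with UniqueID 2 paired with the address stored on UniqueID 1. Without the UniqueID condition every row would also match itself, which is why the video adds it.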
230:03 okay so this is perfect this is exactly what I wanted to see so we have
230:06 this parcel ID we have this parcel ID
230:08 and here is our address and it's blank
230:11 in all 35 of these so we have an address
230:13 for all of these but we're not
230:15 populating it so what we want to do is
230:18 use this thing called
230:20 ISNULL so ISNULL is basically saying
230:23 the first thing is what do we want
230:25 to check to see if it's null so we want
230:27 to check a.PropertyAddress this whole
230:31 thing now if it is null what do we want
230:35 to populate um we want to put in there
230:38 this b.PropertyAddress because
230:43 we want to take that property address
230:46 and stick it in there so um let's run
230:50 this really quick so this row is what is
230:53 eventually going to be stuck into this
230:55 row so this is perfect um it's literally
230:58 saying when it's null take this and
231:02 put it there and so that's what this um
231:04 this part is doing
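
To make the ISNULL step concrete, here is a small sketch. It first shows the function on two literal values and then adds it as an extra column to the self-join, using the same kind of made-up temporary table and assumed column names as before.

```sql
-- ISNULL(check_expression, replacement_value):
-- returns the first argument unless it is NULL, in which case it returns the second.
SELECT ISNULL(CAST(NULL AS varchar(20)), 'fallback') AS WhenFirstIsNull,    -- 'fallback'
       ISNULL('original', 'fallback')                AS WhenFirstHasValue;  -- 'original'

-- Hypothetical stand-in for the imported table (names and rows are illustrative only).
DROP TABLE IF EXISTS #NashvilleHousing;

CREATE TABLE #NashvilleHousing (
    UniqueID        int          NOT NULL PRIMARY KEY,
    ParcelID        varchar(20)  NOT NULL,
    PropertyAddress varchar(100) NULL
);

INSERT INTO #NashvilleHousing (UniqueID, ParcelID, PropertyAddress)
VALUES (1, 'P-001', '100 EXAMPLE ST, NASHVILLE'),
       (2, 'P-001', NULL),
       (3, 'P-002', '200 SAMPLE AVE, NASHVILLE');

-- Same self-join as before, plus a preview of the value that would be written
-- into the empty address.
SELECT a.ParcelID,
       a.PropertyAddress,
       b.ParcelID,
       b.PropertyAddress,
       ISNULL(a.PropertyAddress, b.PropertyAddress) AS AddressToUse
FROM #NashvilleHousing AS a
JOIN #NashvilleHousing AS b
    ON  a.ParcelID = b.ParcelID
    AND a.UniqueID <> b.UniqueID
WHERE a.PropertyAddress IS NULL;
```

Previewing the replacement in a SELECT first is a cheap way to check the logic before running an UPDATE that actually changes data.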
231:07 so let's go in here and write our update
231:10 uh so we want to update and let's take
231:13 this whole thing from here
231:21 up and this will be the set oops um so we're going to set
231:24 um
231:31 property okay we need to specify um and just so you know when you're doing joins
231:33 in an update statement you're not going
231:34 to say Nashville housing okay that's
231:37 going to give you an error you need to
231:38 use it by its alias so let's put a so
231:42 now we're going to say property address
231:44 is going to be equal to and now we're
231:46 just going to copy this
231:48 ISNULL and put it right
231:51 here and we only want to update let's
231:54 see if it does take this so I think
231:57 this should be correct let's test
231:59 it out really quick and we're going to
232:00 run this above query and see if it made
232:01 that
232:07 update all right so there you go um as you can see there are now none that have
232:09 null in there otherwise it'd be giving
232:11 us an output right now so that one is
232:14 fixed we can go back and check it if you
232:15 want to please go back and double
232:17 check that um but that is what we did
232:21 and it worked perfectly
232:23 and it worked perfectly so that's what that is null does it checks to see if
232:27 that is null does it checks to see if this is null if it is null it it it can
232:29 this is null if it is null it it it can populate with a value you can also do
232:31 populate with a value you can also do like a string and what we I mean you can
232:33 like a string and what we I mean you can write you know no address if you wanted
232:37 write you know no address if you wanted to do something like that we don't want
232:38 to do something like that we don't want to do that we're going to keep it how it
232:40 to do that we're going to keep it how it is let's keep moving on we do not have
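The update described above might look roughly like this in SQL Server. As before, `NashvilleHousing`, `UniqueID`, and the join condition are assumed rather than shown in this part of the video. The point being illustrated is that the `UPDATE` targets the alias `a`, not the table name, because the table appears twice in the join.

```sql
-- Fill in missing addresses from another row that shares the same ParcelID.
-- Assumed names: NashvilleHousing, ParcelID, UniqueID, PropertyAddress.
UPDATE a
SET PropertyAddress = ISNULL(a.PropertyAddress, b.PropertyAddress)
FROM NashvilleHousing AS a
JOIN NashvilleHousing AS b
    ON a.ParcelID = b.ParcelID
   AND a.UniqueID <> b.UniqueID
WHERE a.PropertyAddress IS NULL;

-- The alternative mentioned in passing (a fixed string instead of a lookup)
-- would swap the second argument for a literal:
--     SET PropertyAddress = ISNULL(a.PropertyAddress, 'No Address')
```

Re-running the earlier preview query afterwards should return no rows if every NULL address had a matching parcel.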
232:43 we do not have unlimited time here
232:45 I'm going to try to keep this
232:47 under two hours stretching the rules
232:49 for my love of SQL and that is
232:51 the only reason and this I think is
232:53 going to take a little longer so let's
232:56 take a look and let's copy this real
233:06 quick and let's take a look at the property address
233:09 and we can get rid
233:11 of this as
233:13 well so if you notice we have two things
233:18 here we have both the address and then
233:21 there's this comma after all of them and
233:23 there is the
233:26 city now you may not know this or
233:28 maybe you haven't looked into this
233:30 but I have and there are no other commas
233:34 anywhere except for in between these
233:36 things as a separator as a delimiter
233:40 if you've never heard that term
233:43 delimiter a delimiter is something
233:45 that separates different columns or
233:47 different values so for us the delimiter
233:50 is a comma and for this first one
233:54 because we're going to be separating
233:55 this one out and then we're going to be
233:56 doing the owner
233:57 address for this one we're going to
234:00 be using something called SUBSTRING
234:03 and we're also going to be using
234:04 something called a character index or
234:07 CHARINDEX so let's start writing
234:10 that out and let's do
234:13 SELECT and let's say SUBSTRING now the
234:17 substring that we want to take we of
234:19 course want to be looking at oops let me
234:21 put this down here so it helps us out
234:23 a little
234:24 bit and I'll do it like that so
234:27 SUBSTRING and of course we're looking at
234:31 PropertyAddress
234:33 and we want to look at position
234:36 one so we're going to start at position
234:38 one now this next part
234:42 is something that you may have never
234:45 seen before and if you
234:47 haven't that's totally okay
234:49 the character index is basically going
234:51 to be
234:56 searching for a specific value okay
234:59 that's all it's doing and you
235:00 can look into this a little bit more if
235:02 you want so it's going to be CHARINDEX
235:04 that's how it's spelled and then
235:06 an open parenthesis and we want to
235:08 specify what we're looking for so it can
235:10 be anything you can even do you know if
235:12 you wanted to things like 'Tom'
235:16 well you do it like this
235:21 you can look for Tom or if you're
235:22 looking for a specific word like John
235:25 you can search that that's what this is
235:27 for but we're going to do a comma
235:29 where are we looking that's what this
235:31 next one is so we're looking in PropertyAddress
235:34 and then we're going to close
235:36 the
235:37 parenthesis and we'd also close it
235:40 again to complete that SUBSTRING and
235:42 we'll say as
235:45 Address and let's just take a look
235:47 really quick at
235:48 this so right now it is
235:54 basically looking at PropertyAddress
235:56 it's going to the very first
235:58 value or starting at the first value and
236:01 then it's going until the comma now the
236:03 unfortunate thing is we're actually
236:05 getting this comma in this output and we
236:07 don't want that you don't want a
236:09 comma at the end of every address we can
236:12 change that because
236:15 this is specifying a position if we just
236:19 look at this CHARINDEX which we can do
236:21 really
236:27 quick it is going to give us a number it is saying at position 19 that is
236:29 where the comma is right so it's not
236:32 a value or
236:34 a string it's a
236:36 number so we can say minus
236:38 one and if we do
236:41 that and now we run
236:43 it now that comma is gone because we're
236:46 going to the comma
236:48 and then going back one
236:50 behind the comma so that's how you get
236:52 rid of that comma right there
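A small sketch of the expression built up so far, assuming the table is called `NashvilleHousing` (the name is taken from how the speaker refers to it and may differ in your database). `CHARINDEX` returns the position of the comma as a number, so subtracting one gives the length of the text before it.

```sql
-- CHARINDEX(',', PropertyAddress) is the 1-based position of the first comma.
-- Using that position minus one as the SUBSTRING length stops just before it.
SELECT
    PropertyAddress,
    CHARINDEX(',', PropertyAddress) AS CommaPosition,
    SUBSTRING(PropertyAddress, 1, CHARINDEX(',', PropertyAddress) - 1) AS Address
FROM NashvilleHousing;
```

One caveat worth keeping in mind: if a value had no comma at all, `CHARINDEX` would return 0 and the length would become -1, which SQL Server rejects as an invalid length. The approach relies on the observation above that every address contains exactly one comma.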
236:55 the next one's a little bit more tricky
236:57 because we're not starting well it's not
236:59 super tricky but we're not starting at
237:00 that first position anymore so let's put
237:03 a comma then we have our SUBSTRING now
237:05 where we want to start is at
237:09 where the comma is so instead of
237:10 position one we want it to be where that
237:12 character
237:14 index is I don't want it to look like
237:16 this this whole time why is it like this
237:18 what am I doing it doesn't
237:21 matter let's just get rid of this and
237:24 see if that fixes
237:27 it what am I doing here oh it's just
237:31 because this is wrong and we'll just
237:34 do comma parentheses that might fix it
237:38 ah doesn't matter okay I'm wasting time
237:40 I'm going to keep going we want to start
237:42 in this position okay but we
237:45 actually don't want to start at minus
237:47 one we need to start at plus one because
237:48 we want to go to the actual comma itself
237:52 then once we get to the comma we want to
237:54 add one if we just left
237:56 it the same again it would include the
237:57 comma at the beginning then we need
238:00 to specify where it needs to go to where
238:03 does it need to finish now every single
238:05 thing is going to be different every
238:07 single address has a different length
238:10 but we can use that to our advantage in
238:12 this one and we can literally say the
238:14 length
238:16 of PropertyAddress you guessed it right
238:20 and then we can close this off let's see
238:23 if that
238:24 works okay what's messing up so we have
238:27 SUBSTRING PropertyAddress
238:30 comma CHARINDEX
238:33 and then we're specifying in it
238:36 the
238:37 comma we have the PropertyAddress
238:40 plus one okay we can't have that right
238:42 there I don't know why I had that
238:45 finally figured it out at the end so
238:48 let's see what we're doing here let's
238:50 see if it worked it works perfectly and
238:53 again this was one that I'm guessing a
238:55 lot of people haven't used before so I
238:57 was trying to explain it a little bit
238:58 more than other ones but if we
239:00 take out that plus one
239:02 you're going to see the comma at the
239:04 beginning right here so that's what that
239:06 is so plus one and that's what we're
239:10 going to keep
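Putting both halves together, the query at this point is probably close to the following. The table name `NashvilleHousing` is assumed. The second `SUBSTRING` starts one character after the comma and uses `LEN(PropertyAddress)` as a generous upper bound for the length, since `SUBSTRING` simply stops at the end of the string.

```sql
SELECT
    -- Everything before the comma
    SUBSTRING(PropertyAddress, 1, CHARINDEX(',', PropertyAddress) - 1) AS Address,
    -- Everything after the comma; LEN(...) is longer than needed, which is fine
    SUBSTRING(PropertyAddress, CHARINDEX(',', PropertyAddress) + 1, LEN(PropertyAddress)) AS City
FROM NashvilleHousing;
```

If the raw values have a space after the comma, the city will begin with that space. Wrapping the second expression in `LTRIM(...)` would remove it, though that is an extra step not taken in the video.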
239:14 now we can't separate two values from one column without
239:17 creating two other columns so just like
239:21 we added this column up here
239:24 I'm
239:25 just going to copy this down here really
239:27 quick we're going to create two new
239:34 columns and add that value in so we're going
239:36 to call this because
239:39 it's property address let's do
239:42 PropertySplit and this is
239:46 the
239:47 Address and then this
239:50 next one is going to be PropertySplit
239:52 and this is the
239:55 City and this isn't going to be a
239:58 date of course this is going to be let's
240:00 do nvarchar and let's make it 255 just in
240:03 case it is
240:06 a large string a large text so then we
240:10 can say update
240:15 that and now we need to insert
240:19 what we did for it so this first one is
240:21 the address so we're going to say that
240:23 equals the address and we're going to
240:25 take this whole thing this whole
240:28 SUBSTRING and copy that and that's
240:31 going to equal this and then at the
240:34 end we'll look at it really quick
240:37 so first let's add this column I'm going
240:39 to do this one at a time really quick so
240:41 you can see it so it adds the
240:44 column now it adds the results and again
240:48 adds the column for city and sets that
240:52 city to that
240:54 SUBSTRING and now let's
240:58 take this and just do
241:00 SELECT everything from this and you
241:03 should see at the very end because when
241:05 you add it it goes to the end we should
241:07 have two new columns and here we are so
241:10 PropertySplitAddress and
241:12 PropertySplitCity it's much more usable than
241:16 this I mean this would be a nightmare
241:18 not a nightmare it'd just be annoying to
241:20 use this column I mean now that it's
241:22 separated into the address and the city
241:23 it's so much more usable data it
241:27 really really is
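The add-then-populate steps above could be sketched like this. The table name `NashvilleHousing` is assumed, and the column names follow the ones spoken in the video. The statements are meant to be run one at a time, as in the video, because SQL Server will not compile an `UPDATE` that references a column added earlier in the same batch. The `GO` lines are batch separators for SSMS or sqlcmd, not T-SQL statements.

```sql
-- 1. Add a column for the street address and fill it.
ALTER TABLE NashvilleHousing
ADD PropertySplitAddress NVARCHAR(255);
GO

UPDATE NashvilleHousing
SET PropertySplitAddress =
    SUBSTRING(PropertyAddress, 1, CHARINDEX(',', PropertyAddress) - 1);
GO

-- 2. Add a column for the city and fill it.
ALTER TABLE NashvilleHousing
ADD PropertySplitCity NVARCHAR(255);
GO

UPDATE NashvilleHousing
SET PropertySplitCity =
    SUBSTRING(PropertyAddress, CHARINDEX(',', PropertyAddress) + 1, LEN(PropertyAddress));
GO

-- 3. Check the result; new columns appear at the far right.
SELECT *
FROM NashvilleHousing;
```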
241:29 the next thing we're going to be looking at is this owner
241:31 address
241:32 now it was hard enough or it was tough
241:36 enough to do this but I want to show
241:40 you maybe even a simpler way to do it
241:42 even though this one is more complicated so
241:45 let's go down
241:46 here and let's get rid of
241:50 this so let's get this and
241:55 let's just say property oops no we're
241:57 doing OwnerAddress here we go
242:02 let's just take a look at this let's see
242:03 what we got so what
242:07 we have in here is the address the city
242:10 and the state so what we need to do is
242:13 split all of those out and again I
242:16 don't want to use substrings again that
242:18 was a pain I want to use something a
242:23 little different something again that
242:24 you may have never seen it's called
242:27 PARSENAME and PARSENAME is super
242:30 useful especially for delimited
242:34 stuff stuff that's delimited by a
242:35 specific value so let me just show
242:38 you what it is and then we'll go from
242:41 there so we can say
242:43 PARSENAME and we're going to
242:46 be doing this on the OwnerAddress
242:50 okay let me see
242:55 yeah I mean it's because I don't have
242:57 this of course I do that all the time so
243:00 annoying so on the OwnerAddress and
243:04 then let's do
243:05 one and let's just see what happens
243:12 nothing changed of course because PARSENAME is only useful with periods
243:16 that's what it looks for that's what
243:18 PARSENAME looks for and these are commas
243:21 so something we can just do is we can
243:23 replace those commas instead
243:27 of a comma we replace it with a period
243:29 so super easy we're just going to do REPLACE on
243:31 OwnerAddress
243:33 comma and we'll look for the comma in
243:37 there then we need to specify what we
243:39 need to change it to we'll change it to
243:40 a period and let's close
243:43 that and now let's run
243:47 it and it's taking Tennessee so
243:51 something odd at least to me
243:54 about PARSENAME is that it kind of does
243:55 things backwards from what you would
243:57 expect it to do let's really quick
244:00 add the other things
244:03 you'll get a kick out well you won't get
244:05 a kick out of this as much as I do
244:07 here's one two
244:08 three let's execute this and it
244:11 separates everything for us but it's
244:14 backwards so with one two three you would
244:16 imagine it'd be address city state but no it's
244:18 state city address so all we need to do is go
244:22 three two
244:25 one and run
244:28 this and there we go so now we have it
244:31 broken out this is now our address this
244:34 is our city and this is our state so
244:38 what I would consider super easy a
244:40 lot easier than the substring but I
244:41 didn't want to show you the easy one
244:42 first and then give you the hard one
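A sketch of the `PARSENAME` trick just described, again assuming the table is named `NashvilleHousing`. `PARSENAME` splits on periods and numbers the parts from the right, so the commas are first swapped for periods with `REPLACE`, and the parts are requested as 3, 2, 1 to get them in reading order.

```sql
SELECT
    OwnerAddress,
    PARSENAME(REPLACE(OwnerAddress, ',', '.'), 3) AS Address,  -- leftmost part
    PARSENAME(REPLACE(OwnerAddress, ',', '.'), 2) AS City,     -- middle part
    PARSENAME(REPLACE(OwnerAddress, ',', '.'), 1) AS State     -- rightmost part
FROM NashvilleHousing;
```

`PARSENAME` was designed for object names such as `server.database.schema.object`, so it handles at most four parts and returns NULL when a value does not fit that shape. An address that already contains a period of its own would therefore be split in the wrong place, so it may be worth checking the data for stray periods before relying on this.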
244:45 so now we just need to add those columns
244:48 and then we need to add the values so
244:51 let's do
244:53 this let's make some room and I need
244:57 to get rid of one of these I think oh did
245:00 I do that right what did I
245:07 do I have my ALTER TABLE UPDATE ALTER TABLE UPDATE what is this doing here
245:10 I don't even know what this
245:12 is we'll just go like that so now we
245:14 have three perfect so from NashvilleHousing
245:17 we're going
245:19 to say this is the
245:21 OwnerSplit
245:25 Address actually let me just copy the
245:27 owner part to make it easier so we have
245:29 OwnerSplitAddress
245:34 OwnerSplitCity
245:36 and let's do OwnerSplit and then
245:40 State and copy there
245:47 OwnerSplitCity there we go
245:51 OwnerSplitAddress so I'm putting all
245:52 the sets equal to what we're about to
245:54 add so now this first one this three
245:58 is the address we'll paste it there the
246:02 second one is the city so we'll put that
246:07 oh I see what happened here that's what
246:10 happened got to get rid of
246:12 that I set OwnerSplitCity equal
246:17 to that middle one and then of course
246:19 the third one is the
246:21 state so let's go do
246:24 that and that should be done so let's do
246:27 it two at a
246:29 time oops OwnerSplitAddress what's
246:33 wrong with that oh I probably just got
246:35 to run this first let's try that
246:39 tried to get good too quick you can
246:42 do this a much more efficient way I'm
246:44 just doing this for visual purposes I
246:46 would add
246:48 all the columns first and then
246:50 do all the updating at the end that's
246:52 normally how I do it but again for
246:55 visual purposes this is what we're
246:57 doing so let's go get this actually
247:00 let's get this bring this down
247:03 here don't keep this in your final
247:06 queries it's a lot of extra selecting
247:09 everything you don't need to do that
247:11 so here we go so OwnerSplitAddress
247:14 OwnerSplitCity OwnerSplitState again
247:18 so much more usable than when it's all
247:20 in one column I mean it is ten a hundred times
247:24 more useful data now you know that
247:27 one to me gets used a lot
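The video adds and fills the owner columns one pair at a time for visibility. The "more efficient way" mentioned (add all the columns first, then do the updating at the end) might look like the sketch below. The table name `NashvilleHousing` is assumed, and grouping everything into a single `ALTER TABLE` and a single `UPDATE` is one possible reading of that remark, not something shown on screen.

```sql
-- Add all three columns in one statement.
ALTER TABLE NashvilleHousing
ADD OwnerSplitAddress NVARCHAR(255),
    OwnerSplitCity    NVARCHAR(255),
    OwnerSplitState   NVARCHAR(255);
GO  -- batch separator (SSMS / sqlcmd) so the UPDATE below can see the new columns

-- Populate all three in one pass over the table.
UPDATE NashvilleHousing
SET OwnerSplitAddress = PARSENAME(REPLACE(OwnerAddress, ',', '.'), 3),
    OwnerSplitCity    = PARSENAME(REPLACE(OwnerAddress, ',', '.'), 2),
    OwnerSplitState   = PARSENAME(REPLACE(OwnerAddress, ',', '.'), 1);
```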
247:30 let's keep it going I feel like we're making
247:32 fantastic time I don't even know I'm not
247:33 even keeping track of time time is not
247:35 even relative anymore it could be three hours and
247:38 I wouldn't care let's keep going
247:42 let's take a look at this column right
247:45 here SoldAsVacant right now it has No
247:49 but let's do SELECT
247:52 DISTINCT oh gosh I hate when I do this I
247:56 do this all the time am I the only one I
247:58 don't think I'm the only one and we'll
248:01 do what is it SoldAs okay SoldAsVacant
248:05 let's do a distinct on
248:09 these so right now we have Yes No N and Y
248:13 which I'm guessing are no and yes
248:14 so let's look just
248:19 because I'm curious let's look at a
248:23 count
248:25 of let me just do
248:27 SoldAsVacant let me do a COUNT of this
248:30 and we'll GROUP
248:33 BY SoldAsVacant okay let's run this
248:37 and see what we get oh gosh let me order
248:42 by okay here we go now we're
248:46 moving that's not what I wanted at all
248:48 ORDER BY 2 here's what I wanted okay
248:52 so for No we have 51,000 Yes 4,000 almost
248:56 5,000 and then for the others just a few so let's
248:58 change them to Yes and No because
249:01 these are obviously the vastly more
249:03 populated ones
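The two exploratory queries described here are roughly the following, with `NashvilleHousing` as the assumed table name and `Occurrences` as an illustrative alias. `ORDER BY 2` sorts by the second column in the select list, which is the count.

```sql
-- Which spellings exist in the column?
SELECT DISTINCT SoldAsVacant
FROM NashvilleHousing;

-- How many rows use each spelling, smallest group first?
SELECT
    SoldAsVacant,
    COUNT(SoldAsVacant) AS Occurrences
FROM NashvilleHousing
GROUP BY SoldAsVacant
ORDER BY 2;
```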
249:05 populated ones um and we're just going to do this through a case statement so
249:08 to do this through a case statement so we're going to say oh yeah let me get
249:10 we're going to say oh yeah let me get this ready before we start oh yeah I'm
249:12 this ready before we start oh yeah I'm ahead of the game now let's do select
249:15 ahead of the game now let's do select and we'll do sold as vacant and then
249:18 and we'll do sold as vacant and then we'll start our case
249:19 we'll start our case statement um yeah let's do right here so
249:22 statement um yeah let's do right here so we'll do case when sold as vacant is
249:28 we'll do case when sold as vacant is equal to yes all we want to do is say
249:33 equal to yes all we want to do is say then we want to make it
249:36 then we want to make it no oh
249:38 no oh won't make a yes what am I doing geez
249:41 won't make a yes what am I doing geez I'm losing it when and I'm just oops
249:44 I'm losing it when and I'm just oops oops oops oops ignore that pretend that
249:46 oops oops oops ignore that pretend that didn't
249:47 didn't happen
249:50 happen when sold as vacant is equal to
249:54 when sold as vacant is equal to n
249:56 n then
249:58 then no and then else we want to say if it's
250:01 no and then else we want to say if it's already if it's not one of those values
250:03 already if it's not one of those values it means it's already a yes or no so
250:04 it means it's already a yes or no so we're just going to say just keep it as
250:07 we're just going to say just keep it as sold as vacant and then we'll end it so
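The case statement dictated above can be sketched roughly as follows. The identifiers NashvilleHousing and SoldAsVacant are assumed spellings of the spoken names, and the alias SoldAsVacantCleaned is hypothetical, added only so the two columns can be compared side by side.

```sql
-- Preview the normalization without changing any data.
SELECT SoldAsVacant,
       CASE
           WHEN SoldAsVacant = 'Y' THEN 'Yes'
           WHEN SoldAsVacant = 'N' THEN 'No'
           ELSE SoldAsVacant          -- already 'Yes' or 'No': keep the value
       END AS SoldAsVacantCleaned     -- hypothetical alias
FROM NashvilleHousing;
```

Running it as a plain SELECT first is a cheap way to confirm the mapping before anything is written back to the table.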
250:11 let's take a
250:12 look okay so let's scroll through here
250:16 and see if we get any that we can see oh
250:19 I just went by some didn't
250:21 I oh I just went by some I know I
250:24 did um let's see okay here we go so
250:28 here's an N it's now a no so this this
250:31 sold as vacant is this column the newly
250:33 uh the case statement right here is
250:35 changing it so the N is no so this
250:38 should work all and this will be a
250:41 unique update statement um and I hope it
250:43 works unlike the first update statement
250:45 that we we did that was a that was a
250:48 travesty um let's do update Nashville
250:53 housing um and we'll
250:56 say
250:57 set sorry I'm talking faster than I'm
251:01 going set sold as vacant equal to and we
251:04 can just literally put in this case
251:05 statement um it's not but let's try
251:09 it okay now let's go look at this again
251:11 and see if it made the update there we
251:14 go the update statement worked oh
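A sketch of the update just described: the same case expression is assigned straight back to the column. Names are assumed spellings of the spoken identifiers, and the sketch assumes the column is a character type wide enough to hold 'Yes'.

```sql
-- Overwrite the column in place with the normalized value.
UPDATE NashvilleHousing
SET SoldAsVacant = CASE
                       WHEN SoldAsVacant = 'Y' THEN 'Yes'
                       WHEN SoldAsVacant = 'N' THEN 'No'
                       ELSE SoldAsVacant
                   END;

-- Re-run the count to confirm only 'Yes' and 'No' remain.
SELECT SoldAsVacant, COUNT(*) AS value_count
FROM NashvilleHousing
GROUP BY SoldAsVacant
ORDER BY 2;
```

With no WHERE clause the update touches every row, which is fine here because the ELSE branch writes unchanged values back as they were.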
251:16 fantastic it's a beautiful
251:18 thing okay great I'm glad that one
251:21 worked I was worried for a second that
251:23 uh my update had broken in um in SQL
251:28 Server now now we're going to do
251:30 something um these next two things is
251:31 we're going to remove the duplicates and
251:32 then we're going to get rid of unused
251:34 columns um this removing duplicate I got
251:38 to be honest I don't do it a ton in SQL
251:40 but I have done it um especially for
251:43 like
251:44 queries you know when I'm looking at
251:47 full tables I I will write some sort of
251:49 temp table and like put the remove
251:51 duplicates in there I normally don't
251:53 delete actual data we are we're going to
251:56 do that um but it's not a standard
251:58 practice to delete data that's in um
252:00 that's in your database so just for
252:03 future purposes don't blame me if you
252:05 delete all the all the duplicates by
252:07 accident in your uh table at work so you
252:11 can do this a few different ways but the
252:12 way I'm going to show you is we're going
252:14 to write a
252:15 CTE and we're going to do some window
252:17 functions to find where there are
252:21 duplicate values okay so excuse me so
252:26 let's start writing out our CTE and or
252:30 you know even we can write out the query
252:33 first then put it into a CTE that might
252:35 be a little bit better so let's do
252:38 select everything and oh my gosh I was
252:41 about to do it somebody's out there just
252:45 like waiting for me to make that mistake
252:48 again so we want to partition our
252:54 data um when you're doing removing
252:57 duplicates we're going to have duplicate
252:59 rows and we need to be able to have a
253:01 way to identify those rows right so you
253:04 can use things like rank order rank
253:07 um row number there are a few different
253:09 options we're going to be using row
253:11 number um and you know if you want to
253:14 look into how rank and rank uh like
253:17 dense rank and all those ones work
253:19 please do that so you know why we're
253:20 doing it um but we're using row number
253:22 because it's the I think the simplest um
253:24 and it's going to do what we need
253:26 exactly so I'm going to get this over
253:29 here we'll say select everything because
253:31 we're selecting everything then we're
253:32 going to add this row number on here so
253:35 row number and we're going to do these
253:37 parentheses right here we're going to
253:38 say over and an open parenthesis now we
253:41 need to write our partition because
253:43 we're going to partition this data so
253:45 we're going to say um
253:49 partition by cool um now really quickly
253:54 while we're here we need to actually
253:56 know what we're partitioning on that's
253:58 helpful so let me write this so while
254:00 we're writing it we can see what we're
254:02 doing we need to partition it on things
254:05 that should be unique
254:08 um
254:10 to basically to each row um if in I
254:15 guess for the sake of what we're doing
254:17 we're we're going to pretend this unique
254:18 ID isn't here um although you know you
254:21 could say I'm cheating it doesn't matter
254:22 but I'm going to say you know if things
254:25 like the parcel ID are the same if the
254:28 sale date is the same um the property
254:32 address is the same the sales price is
254:35 the same this legal reference which I'm
254:37 guessing is some type of legal document
254:39 saying it's like somebody's uh property
254:41 if all of those are the exact same then
254:44 to me that is the same data it's it's
254:47 unusable just for example I mean this
254:49 may I don't I mean this data is just
254:51 some random data set I found online
254:52 right so that's what we're going to be
254:54 going with that's what we're going to be
254:55 running with and pretend that lie that I
254:57 just told you is completely true so what
254:59 we want to partition by let's start with
255:02 the
255:03 parcel um can I is this not right here
255:08 why is it saying this why is it not
255:09 giving
255:10 me okay doesn't even matter I'm just
255:13 going to say parcel
255:15 ID um we can
255:18 say
255:20 property we'll do a property address
255:23 stick with me we're getting somewhere
255:24 we'll do sale
255:27 price um what do we say sale date I mean
255:32 there shouldn't be two of this they
255:33 didn't sell twice on the same day come
255:35 on and then legal
255:41 reference and oh I know why it's not working or my
255:44 autocomplete isn't working which I love
255:48 um it's because we're creating our own
255:51 partition so it's its own column of
255:53 course I don't know why I'm it's late as
255:56 you can see down here it's 11:15 it's
255:59 getting late for me but hey I I this is
256:02 an adrenaline rush for me um now we need
256:05 to order it now we want to order it on
256:07 something that should
256:08 be um not necessar I guess unique so
256:13 we're going to order it on this unique
256:14 ID we'll see if that actually does what
256:16 we want it to do um oops what am I doing
256:19 order by come on and we'll say uh
256:24 unique oops unique ID perfect and we
256:30 should be able to close that off and
256:32 we're going to call this row num I mean
256:33 that's just that just makes sense so now
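Put together, the query built up over the last few minutes looks roughly like this. It is a sketch, not a copy of what is on screen: the identifier spellings (NashvilleHousing, ParcelID, PropertyAddress, SalePrice, SaleDate, LegalReference, UniqueID, row_num) are assumed from the spoken names, and the final ORDER BY ParcelID is the sort that gets added in the narration right after this point.

```sql
-- Number the rows inside each group of "same sale" records.
-- The first row in a group gets 1; any repeat gets 2, 3, ...
SELECT *,
       ROW_NUMBER() OVER (
           PARTITION BY ParcelID,
                        PropertyAddress,
                        SalePrice,
                        SaleDate,
                        LegalReference
           ORDER BY UniqueID          -- decides which copy counts as row 1
       ) AS row_num
FROM NashvilleHousing
ORDER BY ParcelID;
```

ROW_NUMBER is a reasonable choice here because it never produces ties: within a partition the numbers are always 1, 2, 3 and so on, so "greater than one" cleanly marks every extra copy.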
256:36 we have this and let's run this really
256:39 quick and see what happens so um and
256:43 maybe we should order this as well but
256:46 we'll maybe we'll do that
256:48 later yeah let's order this on parcel
256:53 ID um order by parcel ID let's just see
256:57 what happens because this I think that
256:59 should be pretty
257:04 accurate um let's scroll down and see if we get
257:05 any this is all
257:08 ones maybe should be doing it on unique
257:11 ID I don't know let's see if we get any
257:13 hits okay there's a two in
257:16 there let's let's look at this really
257:18 quick because I want to see
257:20 it maybe I did something wrong I don't
257:23 know it is absolutely
257:29 possible somebody play some Jeopardy music for me real
257:31 quick yeah I don't know I don't know why
257:34 it's um okay so let's see let's let's
257:36 look at these
257:38 two um and let's see if I
257:41 did something wrong
257:45 oops don't need to pull that
257:48 up I was doing some research when I when
257:50 that convert by wasn't working um okay
257:54 so this one and this one it's giving
257:56 different row
257:58 numbers so let's look at the actual data
258:00 ignore the unique ID but the data itself
258:04 so the the sale date is the same the
258:06 sale price is the same the legal
258:09 reference is the same the owner is the
258:15 same this is the same I mean literally every single thing
258:19 in here is the same so this is a good
258:21 example so we're going to in this query
258:24 that we're about to write that that will
258:26 be that second one will be deleted
258:27 because we don't need it now there
258:29 there's only one so it looks like this
258:32 is working as intended um I can also
258:35 do um
258:37 let's do where row num is greater
258:42 than one let's see if that I don't think
258:45 it will work
258:46 actually yeah that's because uh it is
258:50 that is in a window function of course
258:52 we can't do that what am I thinking
258:53 that's why we need to put it into a
258:55 CTE oh of course it all comes back so
258:59 let's call this all comes back to the CTE
259:01 those things are amazing um let's call
259:04 this um row num
259:07 num
259:09 CTE and we'll say as and then open
259:13 parentheses and I don't think we can
259:16 have an order by in here let's do it
259:18 like this and let's just do select
259:22 everything from row number
259:25 CTE so again if you haven't watched my
259:28 like CTE CTE video or you've never used
259:31 a CTE before um this is now basically
259:33 almost like a temp table so we're going
259:35 to be able to this query down here is
259:36 querying off of this table that we quote
259:39 unquote
259:40 created so um it looks like it's working
259:44 so all we're going to do is select um
259:47 everything from that and we want to say
259:52 where row num because that's now a row
259:55 is greater than one and let's order that
259:58 by I don't know property address let's
260:02 see if that
260:03 works and let's see what happens
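The reason for the detour through a CTE: a WHERE clause cannot filter on a window function or on its alias in the same SELECT, because the window function is computed after WHERE has already run. Wrapping the query in a CTE turns row_num into an ordinary column that the outer query can filter. A sketch, with RowNumCTE and the other identifiers assumed from the spoken names:

```sql
WITH RowNumCTE AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY ParcelID,
                            PropertyAddress,
                            SalePrice,
                            SaleDate,
                            LegalReference
               ORDER BY UniqueID
           ) AS row_num
    FROM NashvilleHousing
    -- No ORDER BY in here: SQL Server rejects it inside a CTE
    -- unless TOP, OFFSET or FOR XML is also specified.
)
SELECT *
FROM RowNumCTE
WHERE row_num > 1            -- keep only the extra copies
ORDER BY PropertyAddress;
```

Every row this returns is a second (or later) copy of a sale that already appears once with row_num equal to 1.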
260:08 okay so all of these are duplicates we
260:11 have 104 of them it looks like so
260:13 there's not many but it there's twos any
260:17 threes no no threes so there's multiple
260:21 of these rows or columns that are
260:24 basically duplicates um and we want to
260:26 delete them so all we're going to say is
260:30 we're going to select instead of saying
260:32 select everything from row we're just
260:34 going to say delete
260:38 and uh yeah I got to get rid of that
260:40 order by that doesn't work and let's do
260:47 this there's 104 let's see if it worked um so now let's do let's go back and
260:50 we'll say select everything and let's
260:52 see if there's any more duplicates in
260:53 there there are none that is fantastic
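The delete step swaps the outer SELECT for a DELETE against the same CTE; SQL Server allows this because the CTE reads from a single base table, so the rows are removed from that table. In the sketch below the identifiers are assumed spellings of the spoken names, and the transaction wrapper is an addition that is not part of the walkthrough, included because of the warning about deleting real data.

```sql
BEGIN TRANSACTION;   -- safety net, not in the original walkthrough

WITH RowNumCTE AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY ParcelID,
                            PropertyAddress,
                            SalePrice,
                            SaleDate,
                            LegalReference
               ORDER BY UniqueID
           ) AS row_num
    FROM NashvilleHousing
)
DELETE
FROM RowNumCTE
WHERE row_num > 1;   -- removes the extra copies from NashvilleHousing

-- Check the reported row count (104 in the walkthrough) first.
ROLLBACK TRANSACTION;   -- swap for COMMIT TRANSACTION once the count looks right
```

As written this is a dry run: the delete is undone by the rollback, and nothing is permanently removed until the last line is changed to a commit.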
260:56 every I'm like biting my nails now to
260:58 see if each one of these works um
261:00 because I that first one didn't work um
261:04 so yeah so it worked we got rid of the
261:06 duplicates that is fantastic um and now
261:08 it's smooth sailing from here because
261:09 we're just going to delete some um
261:12 unused columns that we don't care about
261:14 this doesn't happen often um this I
261:18 would say actually happens more in like
261:20 views when I'm creating views I have a
261:22 view and I'm like oh I didn't mean to
261:23 add that column let me just remove it
261:25 because it's a I don't need it you don't
261:28 do this to um like the raw data that you
261:31 import usually this is I mean again best
261:34 practices please don't do this to your
261:35 raw data that comes into your database
261:38 um talk to somebody before you do this
261:40 that's just my my legal advice for the
261:42 day I'm not legally bound or legally
261:44 held responsible for any mistakes you
261:45 make so let's keep going um we're
261:48 literally just going to delete some
261:49 columns it could be any columns that we
261:51 want um but for example we got have
261:56 these property split address and owner
261:58 split address um in city and state and
262:00 city and these are perfect and much more
262:03 useful than these owner um these owner
262:06 address because this is really unusable
262:09 to be honest so we're going to delete
262:11 those um and maybe we'll also get rid of
262:13 like I don't know maybe the land that
262:14 land use might be useful this tax tax
262:17 district who cares about that um so it's
262:19 going to be super easy we're just going
262:20 to write alter table alter table did I
262:24 say that right
262:26 geez um and we're going to say alter
262:30 this
262:31 table and we're going to
262:34 drop a column and you can do as many as
262:37 many as we want so we're going to say
262:40 owner um
262:42 address we're going to do tax
262:48 district and let's also do the property
262:59 address all right and let's try this and let's see if it
263:01 works I'm
263:08 nervous all right so as you can see that the property address is gone the owner
263:10 address is gone the tax what was it tax
263:13 district is gone and now we are left
263:16 with this um now remember the whole
263:18 point of everything we were doing was to
263:20 clean up the data right we wanted to
263:23 clean the data and actually now well now
263:25 that we're here we have this sale date
263:27 as well uh and we have the sale date
263:29 converted over here let's get rid I
263:31 forgot let's get rid of this oh that was
263:33 my dog Max excuse them let's get rid of
263:37 oops let's get rid of that sale price
263:39 that that or the um sale date that made
263:41 me look like an idiot this is sweet
263:44 revenge sale
263:45 date sweet sweet
263:49 revenge all right and it is gone so it's
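The two drops just performed can be sketched as below. Column and table spellings are assumed from the spoken names; T-SQL accepts a comma-separated list of columns after DROP COLUMN, which is what "you can do as many as we want" refers to.

```sql
-- First pass: the raw address columns (already split out earlier) and the tax district.
ALTER TABLE NashvilleHousing
DROP COLUMN OwnerAddress, TaxDistrict, PropertyAddress;

-- Second pass: the original sale date, now that a converted copy exists.
ALTER TABLE NashvilleHousing
DROP COLUMN SaleDate;
```

Dropping a column cannot be undone with a simple query, so on anything other than a practice table it is worth keeping a backup or working on a copy of the table first.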
263:52 as easy as that now remember like I was
263:55 saying before the whole point of this
263:56 project is to clean the data and make it
263:59 more usable um and it may not have felt
264:02 like that as we were going through cuz I
264:04 wasn't you know really looking at the
264:05 clean cleaning data uh uh we were
264:09 cleaning it but you know what was the
264:11 purpose of it I may not have highlighted
264:13 that too much all these other columns
264:16 that we created um are just it's much
264:18 more usable much more friendly um this
264:21 is standardized now and you know we we
264:26 did that through quite a few various
264:29 methods um so let's go back up the top
264:32 we're going to recap what we did really
264:34 quick so using this convert we tried to
264:38 standardize the date format or change
264:39 the date format may or may not have
264:40 worked for you didn't work for me we
264:42 populated this property address um which
264:47 we did that
264:48 before we broke this out because if we
264:51 reversed it if we broke these addresses
264:53 out into individual columns and then we
264:56 populated the this thing um we would
264:59 have because then we went and
265:01 deleted uh we went and deleted this
265:03 column oops sorry we went and deleted uh
265:07 this property address so we wouldn't
265:09 have actually gotten any of that data so
265:10 there was a reason it was in that order
265:12 uh don't mess that up that's happened um
265:15 so we broke it out we did that to to
265:17 using um SUBSTRING CHARINDEX as well
265:20 as PARSENAME and
265:22 REPLACE then we went through and we
265:25 changed yes to no or Y's and N's to yeses
265:27 and nos using case
265:30 statements um then we use we removed
265:32 duplicates using a row number a CTE and
265:36 window function of partition by and
265:39 then at the end we deleted a few useless
265:41 columns that we no longer want to see
265:42 because um they are horrible and
265:45 terrible and um you know we don't want
265:48 to see them anymore that is the entire
265:51 project that was everything and you did
265:54 it and I'm honestly super proud of you
265:56 for sticking around this long it this
265:59 this was not necessarily an easy project
266:01 we used quite a few new things that I
266:03 may have not talked about or showed you
266:05 before this to me is just the beginning
266:09 right this is just a a glimpse into all
266:11 the things that you need to do you need
266:13 to look for um in order to clean data so
266:17 you know I really do think this is a
266:19 good portfolio project because it will
266:21 show that you understand and know how to
266:24 clean the data although this is not an
266:26 end-to-end project right that could that
266:27 would take a long time and a lot more
266:30 exploratory analysis looking into the
266:32 data to to figure out what we need to
266:35 change but for all intents and purposes
266:37 I mean this is a a pretty good project
266:40 for cleaning data and I hope that you
266:41 learned something I also hope that you
266:43 worked on this hard um if you want to
266:45 make any improvements please do that
266:48 this is not perfect by any means there's
266:50 other things that you could change um
266:52 you could you know I don't even know I'm
266:55 not even going to try to guess you could
266:56 do other things to this data though um
266:58 and and create your own queries create
267:00 your own um data cleaning uh part of
267:04 this and so um you do that if you were
267:07 able to get this um the ETL part of it
267:10 done do that I think it'd be really
267:12 really cool um again I was able to get
267:15 it to work but I don't think 90% of
267:17 people out there would be able to get it
267:19 to work um it's just every computer is
267:22 different every server is configured
267:24 differently um and so it would just be a
267:27 huge pain so I decided to cut that out
267:29 and I'm sorry um but hopefully this will
267:32 suffice um with that being said this is
267:35 it you made it all the way to the end
267:37 again I'm super proud you guys are doing
267:39 fantastic you guys are the ones putting
267:41 in the hard work to build the portfolio
267:43 for your future job I mean it's not easy
267:46 but you're putting in the work and so
267:48 and so kudos to you um in our next video
267:51 we're going to be going into Python for
267:53 the very first time really excited about
267:55 that one because um I think the only
267:58 Python video that I have up right now is
267:59 on one where I was scraping data from
268:01 Twitter so um you know this will be a
268:04 nice change of pace or a little bit
268:07 different content than I normally put
268:08 out and so I'm really excited about it
268:10 and I hope you are as well with that
268:13 being said I am done with the video I'm
268:16 going to be stopping it soon thank you
268:17 for joining me if you like this video be
268:19 sure to subscribe be sure to like this
268:22 video leave a comment below um telling
268:25 me how it changed your life uh and I
268:28 will see you in the next video
268:31 [Music]
268:36 goodbye [Music]
268:42 what's going on everybody today we are
268:43 starting our Excel tutorial
268:46 [Music]
268:50 series now there are so many things that
268:52 you can do in Excel so I don't know how
268:54 long this series is going to be it could
268:56 be 15 or even 20 videos but what I do
268:58 know is that I'm going to be covering
269:00 just about every single thing that I've
269:01 used since I became a data analyst and I
269:03 want to show you how to do it so it
269:05 won't just be the more concrete things
269:07 you know like pivot tables charts
269:10 VLOOKUPs things like that it'll also be
269:11 some of the more nuanced things like how
269:13 to deal with missing data or how to deal
269:15 with dirty data and how to clean that up
269:17 within Excel and so those are things
269:19 that you may not be able to do you know
269:21 if somebody wasn't showing you how to do
269:22 it and so that's what I'm going to try
269:24 to help you with because I know that that is
269:26 something that you will need to do or
269:27 learn how to do in Excel
269:30 now before we get into it I want to give a huge shout
269:31 out to the sponsor of this Excel series
269:33 and that is Udemy I took so many Excel
269:35 courses on Udemy when I was first
269:36 starting out as a data analyst and there
269:38 was this one course that I kept going
269:40 back to over and over again because as I
269:42 got into it in my job I realized that
269:44 there were so many things that were in
269:46 that course that I really needed to know
269:48 but I didn't realize I needed to know them
269:51 and so I'm going to put the links to
269:52 those courses in the description in case
269:53 you want to take those again huge shout
269:55 out to Udemy without further ado
269:57 let's jump on my screen and get started
269:58 with our very first Excel tutorial
269:59 all right so I'm going to go ahead and get
270:00 rid of myself we are going to be looking
270:02 at something absolutely pivotal in your
270:04 data analytics career and that is pivot
270:06 tables and I think that's really
270:08 appropriate it is probably one of the
270:10 most commonly used things I think that
270:13 data analysts use to convey information
270:15 in Excel it's super easy to group things
270:17 together to display information in a
270:19 very easily understandable way
270:22 especially for people who are not data
270:25 analysts right I use this a lot for
270:26 other managers or for higher-ups who
270:29 don't want to get into SQL or you
270:31 know aren't super tech savvy in like
270:33 Python or Tableau they just want it in
270:35 Excel and so I use it all the
270:37 time for that reason
270:39 and so we're going to be using this data set right here
270:41 bike store sales in Europe I will
270:42 include this link in the description
270:44 we're not going to look at the columns
270:45 just yet we're going to download it
270:47 I've already downloaded it a few times
270:50 but we are going to go
270:52 to our downloads we're going to open
270:55 it up and we're going to open up this
270:57 sales right
270:59 here and give it a
271:01 second all right perfect and so here's
271:04 what it looks like at least on my
271:06 screen I'm going to spread it out
271:08 just a little
271:09 bit and really quickly let's take a
271:12 very quick glance at this so we have a
271:15 date a day a month a year so
271:18 some date
271:20 information then we have some
271:22 customer age information so how old was
271:25 the customer again this is bike sales so
271:28 you know what did they buy
271:31 and they have some demographic
271:32 information so this is their age group
271:34 we have the gender country state the
271:38 product category the subcategory the
271:41 actual product that was purchased and
271:43 then we have things like you know how
271:46 much these things cost the quantity that
271:48 was ordered so we have order
271:50 quantity unit cost unit price then
271:53 we have the profit cost and revenue
271:56 almost everything in here
271:59 we can in some way put into a pivot
272:02 table now I'm not going to go through
272:03 every single variation of that but we
272:05 are going to be looking at a lot of
272:08 this revenue over here because I
272:10 think it's pretty easy to show the
272:12 value of a pivot table especially
272:14 with you know currency or money
272:18 so what we're going to do to get started is
272:20 we're going to go up to insert and we're
272:23 going to click on insert and then we are
272:24 going to click on pivot table now really
272:26 quick there is a recommended pivot
272:28 tables option and if you click on that what
272:30 will come up is some recommendations
272:32 that Excel gives based on the data that
272:34 you have and it can kind of give you
272:37 some ideas of what you can do with
272:40 pivot tables it's going to generate it
272:42 for you we're not going to do that we're
272:43 going to build our
272:45 own but let's click on pivot table
272:48 and it's going to auto select basically
272:51 everything and that's fantastic but
272:53 what if it doesn't come like that I
272:55 just erased that if it doesn't come like
272:57 that you can click right here you can
272:59 press Ctrl
273:02 Shift and then the right arrow and then
273:04 the down arrow and it's going to select
273:06 all of our data and you have right
273:08 here a new worksheet or an existing
273:10 worksheet we're going to create a new
273:12 worksheet it just tends to get too clogged
273:14 up if we put it on the same worksheet
273:16 that already has a lot of data in it
273:19 so right over here are pivot table fields
273:22 and these are all of our columns that we
273:23 just looked at and we're going to be
273:25 able to select those and kind of drag
273:27 and drop now if you just took the
273:29 Tableau tutorial series that I just
273:32 finished doing last week then this is
273:34 going to be pretty familiar
273:37 you're going to start seeing a little
273:38 bit of hopefully some patterns about
273:42 how the data is kind of displayed and so
273:44 we have our filters down here we have
273:46 columns rows
273:48 values all these things we will be
273:51 using I'll show you how to use today as
273:53 well as some additional things one
273:56 thing that we want to start with for
273:58 this demonstration is we're going to be
274:00 looking at these bottom
274:02 ones right here profit cost and revenue
274:05 and we're going to be doing that per
274:07 country per country and state and
274:09 we'll kind of do some drill downs and
274:12 I'll show you how those work
274:14 so just to start out we're going to take the
274:15 country right here and you'll see it
274:18 populate right over here in fact let
274:20 me zoom in maybe once yeah that
274:24 should be fine I
274:25 might zoom in again in just a little
274:27 bit so we have our country and
274:29 it's just like this very very simple
274:32 now I'm going to include the
274:34 state now I'm going to drag this all
274:37 the way and I'm going to put it under
274:38 you can put it above or you can put it
274:39 below I'm going to put it
274:41 below it definitely makes the most
274:43 sense there now when you do that it
274:48 kind of populates it in an expanded
274:50 way but you can collapse this very
274:53 easily we're going to go right here
274:54 we're going to right click we're going
274:56 to go down to expand and collapse and
274:58 we're going to collapse the entire field
275:00 and so now here are all of
275:03 our countries as they were before now
275:05 each of them has this plus sign to the
275:07 left and if you click on it now we can
275:09 go and we see this state that we
275:11 added to these rows
275:14 and what this is going to do is it kind of is like a
275:15 rollup or it's like a grouping and so
275:18 if you have taken the SQL
275:21 tutorial series and you've done things
275:23 with GROUP BY this is very similar to
275:26 that and if you've done the Tableau
275:29 tutorial series it's kind of like a
275:31 drill down it's very very similar so you
275:34 can drill into the information so we
275:36 can put some values in here and
275:40 what that's going to do is that's
275:42 going to kind of create some
275:44 context to what we're grouping
275:46 by so just for visual purposes let's
275:51 add this revenue so this is the revenue
275:54 that is bike sales revenue right
275:57 that's what we're looking at so this is
275:59 the sum of the revenue for these bike
276:03 sales per country now if we drop down
276:06 right here we can see that in Australia
276:09 New South Wales had about
276:15 9.2 million Queensland had 5 million you
276:19 know etc etc so now we can break it down
276:22 we don't just have to look
276:24 at Australia we can now drill down even
276:26 further to the actual state is what
276:29 they're calling it the actual state
276:31 within Australia and so it's super super
276:34 useful and you can do that for every
276:35 single one and so we can look at Canada
276:38 we can look at France and we can really
276:39 drill down into the revenue for each
276:42 of these countries as well as the states
276:45 within them
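For anyone who thinks in code, the country and state rollup described here is the same idea as a group by. The sketch below is only an illustration in plain Python: the column names (`Country`, `State`, `Revenue`) and the few rows are assumptions made up for this example, not the real bike sales file.

```python
from collections import defaultdict

# Made-up rows for illustration only; the real file has many more columns.
sales = [
    {"Country": "Australia", "State": "New South Wales", "Revenue": 100},
    {"Country": "Australia", "State": "Queensland", "Revenue": 50},
    {"Country": "Australia", "State": "New South Wales", "Revenue": 25},
    {"Country": "Canada", "State": "British Columbia", "Revenue": 40},
    {"Country": "Canada", "State": "Alberta", "Revenue": 10},
]

# Collapsed view: one total per country.
revenue_by_country = defaultdict(int)
# Expanded view: one total per (country, state) pair, like clicking the plus sign.
revenue_by_state = defaultdict(int)

for row in sales:
    revenue_by_country[row["Country"]] += row["Revenue"]
    revenue_by_state[(row["Country"], row["State"])] += row["Revenue"]

# Print each country followed by its indented state subtotals.
for country, total in sorted(revenue_by_country.items()):
    print(country, total)
    for (c, state), subtotal in sorted(revenue_by_state.items()):
        if c == country:
            print("   ", state, subtotal)
```

The state subtotals always add back up to the country total, which is why the pivot table can collapse and expand without changing any numbers.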
276:48 now over here this is not the most pretty it just says sum
276:51 of revenue and then it has some numbers
276:54 not the most pretty thing I've ever
276:55 seen really quick we
276:58 can kind of highlight over these and
277:01 we can go back to home you can do it in
277:02 a couple different ways we can go to
277:03 home and we'll type currency now it has
277:07 these two zeros at the end you can get rid
277:09 of those really easily by going like
277:11 that already this looks quite a bit
277:14 better just visually especially if
277:16 you're looking at it in you know
277:18 dollars you can change the currency
277:21 to different currencies if you want to
277:23 do that now we don't just have to do
277:27 the sum of revenue we can do a lot of
277:30 different things so let's go to the
277:31 value field settings so we can customize
277:34 this name so we can do revenue oops
277:40 good if I could spell revenue per
277:44 country that's fine you know it's
277:47 just a placeholder trying to show you
277:49 but we don't have to just do that you
277:51 know we could do the count the average
277:53 the max the min we can do just about
277:55 anything we want but let's keep it
277:58 the sum right now
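The summarize options in value field settings (sum, count, average, max, min) are just different functions applied to the same group of values. Here is a small hedged sketch of that idea; the revenue numbers and the `summarize` helper are hypothetical and exist only for this example.

```python
from statistics import mean

# Made-up revenue values per country, for illustration only.
revenue_rows = {
    "Australia": [100, 50, 25],
    "Canada": [40, 10],
}

# Each name maps to the function used to collapse a group into one number.
summaries = {"sum": sum, "count": len, "average": mean, "max": max, "min": min}


def summarize(values, how="sum"):
    """Collapse a list of values with the chosen summary (hypothetical helper)."""
    return summaries[how](values)


for country, values in revenue_rows.items():
    print(country, {how: summarize(values, how) for how in summaries})
```

Switching the summary changes the number shown in each cell but not the grouping itself.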
278:04 and if we want to we can show this value as different
278:06 things so the percentage
278:09 of column total percentage of row total
278:11 let's do really quick just for
278:12 demonstration purposes the percentage of
278:14 grand total so when we do that we can
278:17 see that for
278:20 revenue per country the United States has
278:23 32% just between these you know these
278:27 countries and Australia has the next one
278:30 so you know it might be kind of hard to
278:32 glance at this really quickly to know
278:34 who has the highest but what we can
278:37 do is we can go right here and we can go
278:39 to sort and we can do largest to
278:41 smallest and there we have the United
278:44 States on top now when you expand it right
278:46 here it's not sorted largest to
278:49 smallest you'd have to go in again click
278:51 sort and do largest to smallest and so
278:54 now we can see that California has the
278:56 biggest percentage
278:59 they're pulling in 20% of that 32% of
279:02 revenue
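Showing a value as a percentage of the grand total and then sorting largest to smallest comes down to a division and a sort. The sketch below assumes made-up country totals chosen so the arithmetic is easy to check; they are not the figures from the video.

```python
# Made-up totals for illustration only.
revenue_by_country = {"Canada": 10, "United States": 60, "Australia": 30}

# Percentage of grand total: each country's revenue divided by the overall sum.
grand_total = sum(revenue_by_country.values())
share = {
    country: revenue / grand_total * 100
    for country, revenue in revenue_by_country.items()
}

# Sort largest to smallest, like the sort option on the pivot table.
ranked = sorted(share.items(), key=lambda item: item[1], reverse=True)

for country, pct in ranked:
    print(f"{country}: {pct:.1f}%")
```

Because every share is measured against the same grand total, a state's percentage is a slice of its country's percentage, which is how California can be 20% out of the United States' 32%.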
279:04 so I'm just going to press Ctrl+Z a few times and get us back to
279:07 where we just were and what I want to
279:11 do is I want to show you a few different
279:13 things pretty quickly so we want to
279:16 pull in this profit and this cost and
279:19 so I'm going to pull in this cost next
279:21 and then I'm going to pull in this
279:23 profit again I'm going to
279:26 change the currency on
279:29 this and I'm not going to change the
279:31 names right now but you know you
279:34 absolutely can do that now the revenue
279:37 is how much is actually being sold
279:40 so you know for the United States it was
279:42 27 million now the cost is how much did
279:46 it cost to manufacture or store or
279:50 distribute all of these products so that
279:52 was 16 million and the profit is
279:54 actually how much money is being made at
279:57 the end of the day after you know all
280:00 their costs after all their employee
280:02 costs after everything
280:03 the United States is still making
280:05 $11
280:06 million now you might look at this and
280:09 you might say well you know I can kind
280:11 of glance at it and know that this
280:13 profit is correct based off these two
280:15 numbers but we can do a calculated
280:19 field if you remember what calculated
280:21 fields are that's something from Tableau
280:23 basically the exact same thing
280:26 and so we can create an additional
280:28 column right here that is a calculated
280:29 field that can add and subtract these
280:31 things to make sure that our numbers are
280:33 adding up correctly
280:35 so let's do that really quickly let's
280:38 go to pivot table analyze we're going to
280:41 go over to fields items and sets and go
280:43 to calculated field now we can name this
280:46 anything and I'm just going to for
280:49 demo purposes I'm going to say
280:53 calculated field demo I'm sure yours
280:57 will be different now if you want to
281:01 you can go in here and this is the
281:02 formula it's almost like you know we
281:04 haven't looked at formulas yet this is
281:06 our first tutorial but you know when we
281:07 look at formulas it's basically the same
281:09 thing as writing it inside of a cell
281:12 but here it gives us kind of this
281:14 open text to do what we
281:17 want with it now what we're going to do
281:20 is we're going to do revenue I'm going
281:22 to insert that I'm going to get rid of
281:25 this I'm going to do revenue and so
281:28 that's the very large number and
281:31 then we're going to
281:32 subtract
281:35 our cost we're going to insert that and
281:39 let's do this and click OK so this is
281:42 our calculated field demo column that we
281:45 just created and as you can see it
281:47 matches our sum of profit column
281:50 exactly and that's exactly what we want
281:52 to see we want to kind of check to make
281:54 sure that this revenue and cost
281:56 fields are generating the correct profit
281:59 and sometimes those are off and so it's
282:00 really good to kind of check those and
282:02 have that additional column
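The calculated field check is simply revenue minus cost compared against the profit column. Here is a minimal sketch of that sanity check, assuming made-up totals; the Canada row is deliberately wrong so the check has something to catch, and `calculated_field` is a hypothetical helper name.

```python
# Made-up summed totals per country, for illustration only.
totals = {
    "United States": {"Revenue": 270, "Cost": 160, "Profit": 110},
    "Canada": {"Revenue": 80, "Cost": 50, "Profit": 35},  # deliberately off by 5
}


def calculated_field(row):
    """Same formula as the demo calculated field: revenue minus cost."""
    return row["Revenue"] - row["Cost"]


# Keep only the countries where the calculated value disagrees with Profit,
# storing how far off it is (calculated minus reported).
mismatches = {
    country: calculated_field(row) - row["Profit"]
    for country, row in totals.items()
    if calculated_field(row) != row["Profit"]
}

print(mismatches)
```

An empty result means the profit column agrees with revenue minus cost everywhere, which is what the matching columns in the pivot table show.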
282:04 you probably wouldn't have this if you were
282:06 you know going to submit this to
282:08 somebody just so you know now that
282:10 this is an actual column you can't go
282:12 here and do something like cut and
282:16 paste it over here you know
282:18 it won't let you do that what it is
282:20 is now an actual column and so we can
282:23 go and remove that and we can add it
282:25 back at any moment so if we want to go
282:27 back and add that down
282:30 here we can do that because we've
282:32 created that column it's now permanently
282:34 there unless we go and delete all of
282:37 that data and so we can just click
282:39 this check mark and it will get rid of
282:40 it for us
282:42 all right now the last thing that we have not used down here is the
282:44 filters now the filters area is exactly what
282:47 it sounds like it's going to allow you
282:49 to filter on certain things but
282:52 probably not things that you already
282:54 have included in your pivot table so if
282:57 you add something like the country down
282:59 here it's going to kind of expand
283:02 everything and then if you then go and
283:04 filter on it it kind of breaks it down
283:08 that's really not what the filter is
283:10 kind of used for or meant for for
283:13 example right up here we have
283:16 customer gender okay so let's take the
283:18 customer gender and we'll put it in
283:19 filters now we can see all of the
283:22 revenue all of the cost all the profit
283:25 and we can do that based off of the
283:27 gender so we can filter by a gender not
283:30 really having to change anything about
283:32 our pivot table and so at a super quick
283:34 glance we can see that
283:39 Glance we can see that uh the males are the profit from the males is
283:49 16.48% so at a super uh basic level at a really quick glance we can see that the
283:51 really quick glance we can see that the men or the males are you know spending a
283:54 men or the males are you know spending a little bit more than the females by
283:57 little bit more than the females by about about
283:58 about about $700,000 now let's go ahead and create
284:01 one more pivot table. We are going to
284:03 create a pivot table right over here.
284:05 Let's go back to the
284:06 sales right here. Again, Ctrl+Shift+Right,
284:10 Down, and it's going to select all of
284:12 our data, and we'll click OK. So one
284:16 thing that we're going to look at is,
284:18 we're going to use some of this date
284:20 information right here. So let's select
284:22 our country just like we did before,
284:25 and what we want to do is see
284:27 what year we were performing our best,
284:30 when we were doing our absolute best
284:36 with our sales. So I'm going to
284:39 select the year and put that in our
284:41 columns, and so now we have 2011 through
284:45 2016. And we want to look at our revenue,
284:49 so let's put our revenue right down here,
284:51 and now we have all of our revenue. Now
284:54 let's again make this into a
284:58 currency, just like that. And super
285:01 quickly now we can get a really quick
285:03 glance at how Australia was doing
285:06 each year, and we can see that there was
285:08 a huge uptick in 2013 and a huge uptick
285:11 in 2015. It didn't happen for every
285:14 single country. It did go up for
285:17 most countries, very slightly for some,
285:19 but we can see on a large scale, from
285:23 year to year, what that's like.
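For readers who want to see the same idea outside of Excel, here is a minimal Python sketch of what this pivot table is doing: country in the rows, year in the columns, revenue summed in the values, and an optional gender filter on top. The rows below are made-up sample data, not the tutorial's sales sheet, and the `pivot_revenue` helper is a hypothetical name used only for illustration.

```python
from collections import defaultdict

# Made-up sample rows for illustration only; not the tutorial's real data.
sales = [
    {"country": "Australia", "year": 2013, "gender": "M", "revenue": 1200.0},
    {"country": "Australia", "year": 2013, "gender": "F", "revenue": 800.0},
    {"country": "Australia", "year": 2015, "gender": "M", "revenue": 1500.0},
    {"country": "Canada", "year": 2013, "gender": "F", "revenue": 400.0},
    {"country": "Canada", "year": 2015, "gender": "M", "revenue": 450.0},
]


def pivot_revenue(rows, gender=None):
    """Sum revenue by country (rows) and year (columns).

    If gender is given, only rows for that gender are included,
    which plays the role of the pivot table's filter area.
    """
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        if gender is not None and row["gender"] != gender:
            continue  # filtered out, like unticking a value in the filter
        table[row["country"]][row["year"]] += row["revenue"]
    # Convert nested defaultdicts to plain dicts for easy printing
    return {country: dict(years) for country, years in table.items()}


all_customers = pivot_revenue(sales)
males_only = pivot_revenue(sales, gender="M")
print(all_customers["Australia"])
print(males_only["Australia"])
```

Note that the filter changes which rows feed the sums without changing the row and column layout, which is the point made above about filtering on a field that is not already in the table.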
285:26 And so within just a few minutes we're able to
285:27 create some really useful pivot tables
285:30 that anybody could look at and
285:32 understand. And that's really the biggest
285:33 use of these pivot tables, is that
285:35 you can group these things
285:36 together, show some information and
285:39 data at a broad, larger scale,
285:42 and make it to where anybody who's
285:43 looking at it can understand it. That is
285:45 why pivot tables are so useful. And so I
285:48 hope that this video was helpful. I hope
285:49 that I was able to walk through it and
285:51 help you better understand how pivot
285:52 tables work and how you can use them
285:54 when you are working within Excel. Thank
285:57 you guys so much for watching. I really
285:58 appreciate it. If you like this video, be
286:00 sure to like and subscribe below, and
286:02 I'll see you in the next video.
286:04 [Music]
286:15 What's going on everybody, today we're
286:17 going to be looking at formulas in
286:23 Excel. Now, I know what you're thinking:
286:25 there's absolutely no way that you're
286:27 going to be able to show us every single
286:29 formula in Excel. And you're absolutely
286:31 right, but I am going to show you some of
286:32 my favorites and the ones that I found
286:34 the most useful, and then you can go
286:36 ahead and practice those and try those
286:38 out. And if there are ones that you
286:40 really want me to do and you think that
286:41 I missed, put it in the comments below
286:44 and I will see those, and I'll try to
286:45 make a list of those and make another
286:47 video on formulas and include all of
286:49 those as well. And now before we jump
286:51 into the actual tutorial, I want to give
286:53 a huge shout out to the sponsor of the
286:54 series, and that is Udemy. You guys
286:57 already know, if you have watched any of
286:59 my videos, that I absolutely love Udemy.
287:01 I mean, honestly, they were the ones who
287:02 got me started and were able to give
287:04 me affordable courses for me to get
287:06 started as a data analyst. I learned SQL
287:08 and Excel and Python all through Udemy
287:11 courses, and so if you are looking for a
287:12 platform to take a course, I absolutely
287:15 recommend you look at Udemy. They have
287:17 fantastic sales going on right now,
287:18 especially during the holiday season in
287:20 this new year, and so if you're looking
287:22 to take a full-fledged Excel course, I
287:24 have some of my favorites in the
287:25 description below. And now, without
287:26 further ado, let's jump onto my screen
287:28 and get started with the tutorial.
287:30 All right, now before we start I want to say
287:31 that this is not like every other
287:33 tutorial that I have created.
287:34 This one is very streamlined, okay? So I
287:37 already know exactly what I'm going to
287:38 do. There's not going to be much messing
287:40 around. I left little notes here and
287:42 there, and I'm going to try to get
287:45 through it, because there's a lot of them
287:46 to get through. So all these ones at
287:48 the bottom, these are ones that I use
287:50 a lot that I think are useful. Again, if
287:53 you know other ones that you use a lot
287:55 that you think I should be using, which
287:56 I know there are ones that I left out of
287:58 here, put it in the comments.
288:00 I'll see the ones that people are liking
288:01 and I will create more videos on
288:03 them, because I know there are so many. I
288:06 also will save this Excel file on
288:10 GitHub so you can go and download it. It
288:11 will be exactly what you're looking at
288:13 right now. I highly recommend trying
288:15 these formulas out for yourself so you
288:17 can get a feel for how they work and how
288:18 they're actually used, and you can mess
288:20 around with it yourself. So, as you can
288:23 see at the bottom, we're going to start
288:24 with MAX and MIN, and then we're going to go
288:26 on to some, I think, a little bit
288:28 more difficult things. And all
288:32 these things are super useful. I'll try
288:33 to talk about how you can actually use
288:35 it as we go through it. Some are super
288:37 self-explanatory, but some may not be.
288:41 So this one I think is super
288:41 self-explanatory, but again, it's one that
288:43 you're going to use all the time.
288:46 What we can do is we can say equals,
288:49 and that's how you start off
288:50 saying this is going to be a formula in
288:52 this cell. Equals means I am now
288:54 creating a formula. And we're going to
288:56 say
288:57 MAX, and I'll hit Tab, and so it'll
289:00 populate it. And right here, if you've
289:02 never seen a formula before, it'll
289:04 give you what the inputs need to be. So
289:06 it's going to say MAX of number1,
289:08 number2, et cetera. What we're going to
289:10 do is we're going to give a range, so
289:12 we're going to go from here down to here.
289:14 You don't have to close the parenthesis,
289:16 but you can. I'm going to, and then you
289:18 hit Enter. And so for this date, it's
289:21 going to give us the max date. Now, these
289:23 are the start dates for these people
289:26 right here, and so if we just
289:28 glance through here, we can see that 2013
289:31 was the last year, and this one is
289:33 actually the latest in that year, and so
289:35 it gave us the correct one. MIN is
289:37 going to do the exact opposite: it's
289:40 going to give us the smallest. And so
289:42 we'll give it the same range, we'll close
289:44 the parenthesis, and it's going to say
289:47 December 7th of 1995, and we can see that
289:50 that is correct: Michael Scott started
289:53 in 1995, the earliest of all the
289:55 employees. And you can do the exact
289:57 same thing for really any of these
289:59 columns. We can see who's making
290:02 the most money, or at least what the
290:04 highest salary is, so we'll do MAX and
290:08 then we'll do the salary range. And so
290:11 this one, again... whoops, what
290:13 did I do? Oh, I did the wrong range, didn't
290:16 I? No, I didn't do the wrong range, it's
290:19 just... there it goes. This column was a
290:24 date column for whatever
290:25 reason. Let me get rid of that, and
290:28 then we can do equals MIN, and
290:32 again we'll do the salary. And at a quick
290:34 glance we can see that Pam Beesly is
290:36 making the least, and 65,000 is what Michael
290:41 Scott is making, the most. So, super
290:44 simple: it shows the max, it shows the min,
290:46 you can select a range. There you go.
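As a rough cross-check of what MAX and MIN are doing over a range, here is a small Python sketch. In Excel the formulas would look like `=MAX(range)` and `=MIN(range)` over the start-date or salary column; the exact cell ranges are not given in the walkthrough, so none are shown. In the data below, only Michael Scott's December 7, 1995 start date and 65,000 salary come from the walkthrough; every other value is invented for illustration.

```python
from datetime import date

# Hypothetical rows. Only Michael Scott's start date and salary are taken
# from the walkthrough; the other values are invented for illustration.
employees = [
    {"name": "Michael Scott", "start": date(1995, 12, 7), "salary": 65000},
    {"name": "Pam Beesly", "start": date(2001, 7, 5), "salary": 36000},
    {"name": "Jim Halpert", "start": date(2013, 9, 6), "salary": 45000},
]

# Equivalent of =MAX(...) and =MIN(...) over the start-date column
latest_start = max(e["start"] for e in employees)
earliest_start = min(e["start"] for e in employees)

# Equivalent of =MAX(...) and =MIN(...) over the salary column
highest_salary = max(e["salary"] for e in employees)
lowest_salary = min(e["salary"] for e in employees)

# Excel's MAX only returns the value; finding who it belongs to needs a lookup.
# In Python, min/max with a key function returns the whole row.
lowest_paid = min(employees, key=lambda e: e["salary"])["name"]

print(earliest_start, latest_start, highest_salary, lowest_salary, lowest_paid)
```

Dates compare correctly here for the same reason they do in Excel: underneath, a date is an ordered value, so "max date" simply means the latest one.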
290:48 Let's move on to IF and IFS. Now, IF is,
290:53 I think, pretty straightforward. All
290:55 you're going to do is you're going to
290:56 say if this, then that. IFS is a little
291:01 bit different: with IFS you can
291:03 put multiple conditions, and as we're
291:05 writing it I'll show you
291:08 the conditions that need to be met. All
291:10 right, so we're going to click right here,
291:11 we're going to say equals, we're going to
291:12 do IF, hit Tab, and we need a logical test.
291:16 So we're going to give it a range
291:18 or something, and we're going to say if
291:19 it's equal to, greater than, something like
291:22 that. Then we're going to say, if the
291:23 value is true, what is going
291:25 to be the output, or if the value is
291:27 false, what's going to be the output. So
291:29 let's do this right
291:32 here. We'll do this age range, and so if
291:35 they are greater than, let's say, let's do
291:40 30. If they're greater than 30, we're
291:43 going to do a comma, and so if the value
291:45 is true, what should be the output?
291:47 If they're greater than 30, we're
291:48 going to call them old. And then if it is
291:52 false, so if they're younger than 30, what
291:55 should it say? We're going to say
291:59 young, and we'll close the
292:01 parenthesis. And there you go: if
292:04 they're over 30 then they are going to
292:06 have old, or if they're younger than 30
292:09 they're going to have young. Now, this is
292:11 something where you need to specify if
292:13 you want 30 and over or over 30. We chose
292:16 over 30, so 30 is not included in that,
292:20 so they're going to be
292:22 young. Now, we don't actually
292:25 need two of these; that's pretty
292:26 self-explanatory.
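The IF logic described here is a single test with one output for true and one for false. A minimal Python sketch of the same rule follows; the Excel form would be along the lines of `=IF(age>30,"Old","Young")`, where `age` stands in for whichever cell holds the age. The sample ages are made up, chosen to show the boundary case at exactly 30.

```python
def age_label(age):
    """Mirror of IF(age > 30, "Old", "Young").

    The test is strictly greater than 30, so exactly 30 is labeled "Young".
    """
    return "Old" if age > 30 else "Young"


# Made-up ages chosen to show the boundary at exactly 30
ages = [29, 30, 31]
labels = [age_label(a) for a in ages]
print(labels)
```

Switching the test to `>=` is the one-character change that would make it "30 and over" instead of "over 30".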
292:28 The IFS is a little bit different, right? You can have multiple
292:30 conditions, so let's open that up real
292:32 quick. So IFS, and now we have a logical
292:36 test and a value if that's true, then you
292:40 can do logical test 2 and a value if that's
292:42 true, so you can have multiple
292:45 things. Now, this one is
292:47 a little bit different. In this one (oops,
292:50 let me get out of this), in this one you
292:53 had a value if true and a value if false. IFS
292:56 does not have that. IFS is going to give
292:59 you different ranges and different
293:02 specific conditions,
293:04 and you can't say "if this one's false";
293:06 you're just going to have multiple
293:07 conditions. So let's do equals and IFS,
293:11 Tab, and we'll do our first logical test.
293:14 So let's do:
293:19 if that equals
293:24 Salesman, we're going
293:27 to respond with
293:30 Sales. So that's, if the value is true,
293:33 that's what we want the output to be. Now
293:35 we're going to go on to our logical test
293:37 two. So you're going to see this pattern,
293:39 right? This is our conditional or
293:42 logical test, so if this is true, this is
293:45 what's going to be returned. So you'll
293:47 notice that's just a pretty simple
293:48 pattern. We can just do random things. So
293:51 if it's equal to... and we'll just
293:54 do the same one, if that is equal
293:58 to, say, HR, we can say "Fire
294:05 Immediately". And now we're going to
294:07 say if it's equal
294:16 to Regional
294:18 Manager, and we
294:20 say "Give
294:23 Christmas Bonus", and we'll close the
294:25 parenthesis and let's see what we get. So
294:29 as you can see, there's no default value
294:31 for true or false like in this one, where
294:34 there was a logical test, and if it was
294:37 true there was a value, and if it was
294:38 false there was a value, so for every
294:40 single one you'll get a value. For this
294:42 one, that's not exactly going to happen.
294:44 As you can see, there are these
294:45 #N/A values. Now, when that happens it just means
294:48 nothing met that condition. So we never
294:50 said anything about supplier relations,
294:52 we never said anything about accountants,
294:55 but if it was part of that IFS statement
294:57 then it got something. And so that is
295:00 how the IFS works.
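To make the IFS behavior concrete, here is a small Python sketch that imitates it: conditions and values come in pairs, the first true condition wins, and if nothing matches the result is `#N/A`. The `ifs` helper and the `action_for` and `action_with_default` names are hypothetical, written only for this illustration. The `"No Action"` default is also an invented example; it shows the common Excel pattern of adding a final `TRUE` test as a catch-all, which the walkthrough itself does not use.

```python
NA = "#N/A"  # stand-in for Excel's #N/A error value


def ifs(*pairs):
    """Return the value paired with the first true condition, like Excel's IFS.

    Arguments alternate: condition1, value1, condition2, value2, ...
    If no condition is true, the result is #N/A.
    """
    if len(pairs) % 2 != 0:
        raise ValueError("ifs expects condition/value pairs")
    for condition, value in zip(pairs[0::2], pairs[1::2]):
        if condition:
            return value
    return NA


def action_for(title):
    # Same three pairs as in the walkthrough
    return ifs(
        title == "Salesman", "Sales",
        title == "HR", "Fire Immediately",
        title == "Regional Manager", "Give Christmas Bonus",
    )


def action_with_default(title):
    # A final always-true test acts as a default (assumed example value)
    return ifs(
        title == "Salesman", "Sales",
        True, "No Action",
    )


print(action_for("HR"))
print(action_for("Accountant"))
print(action_with_default("Accountant"))
```

Titles that were never mentioned, such as accountants, fall through every pair, which is exactly why they show up as `#N/A` in the sheet.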
295:04 Now let's move on to length. This is exactly what we're
295:06 going to do, but you know, some of the
295:08 uses for this: for the length, I've used
295:10 it for a lot of different things.
295:13 MAX and IFS, you know, you
295:17 can use for almost anything, and for length
295:20 there are a lot of different use cases too. One:
295:22 I used to work with a lot of customer
295:25 data or patient data. They had, like,
295:26 Social Security numbers, and if
295:28 there were bad Social Security numbers, we
295:30 didn't want to include them. And so we'd take
295:32 the length of that, and if a Social
295:34 Security number was, let's say, 10 digits
295:37 or 11 digits where it should only be
295:39 nine, or however many there
295:42 are, I think it's nine, then we know that
295:43 that Social Security number is incorrect,
295:46 and then we can get rid of that or
295:47 discard it from our results. That's just
295:49 an example, right? So for this (oops,
295:52 what did I do there? I did Ctrl+Z to undo
295:55 that, if you didn't know how to do that),
295:56 we're going to do equals LEN, which
295:59 is length, and again, if you didn't see
296:02 that, it returns the number of characters
296:03 in a text string. So let's go right here
296:08 and let's go to their
296:11 last name, and we'll give it a range, so
296:14 it's going to tell us how many
296:17 characters are in that string. So for
296:19 Halpert it's seven characters, for
296:21 Flenderson it's 10 characters, and we're
296:24 able to see a length. And so again, there
296:26 are a lot of different use cases for
296:28 this. The Social Security number was
296:30 one; another one is phone numbers, right?
296:32 If you look at the length of the phone
296:33 numbers and there are ones that are, like,
296:35 12 digits long, those might not
296:38 be ones that are accurate, and you need
296:39 to go look at them and see if you want
296:41 to include them in your results or your
296:42 output. So that is how length is done.
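The length check described above can be sketched in a few lines of Python, where `len` plays the role of Excel's `LEN`. The last names are the two mentioned in the walkthrough; the ID strings are made-up placeholders, not real Social Security numbers, and the nine-digit rule is the one stated above.

```python
# Last names mentioned in the walkthrough
last_names = ["Halpert", "Flenderson"]
lengths = {name: len(name) for name in last_names}

# Made-up ID strings standing in for Social Security numbers
ids = ["123456789", "1234567890", "12345678"]

EXPECTED_LENGTH = 9  # a Social Security number has nine digits

# Keep anything whose length is wrong so it can be reviewed or discarded
suspect_ids = [value for value in ids if len(value) != EXPECTED_LENGTH]

print(lengths)
print(suspect_ids)
```

One design note: this only works cleanly when the IDs are stored as text. If they are stored as numbers, leading zeros are lost and a valid ID can look too short.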
296:45 Let's move right over to LEFT and
296:48 RIGHT. I might be going a little
296:51 fast, but you know, I'm
296:53 keeping it live, I'm keeping us on our
296:55 feet, so let's keep going. LEFT and
296:58 RIGHT are kind of like substrings, if
297:02 you've taken the SQL tutorial
297:04 series that I've done. Substrings are
297:07 where you can choose a certain part of
297:09 the text string and you can extract data
297:11 from that, and you usually have to
297:14 reference a certain number, so a certain
297:15 amount of characters. That's the exact
297:17 same thing, except unfortunately
297:20 there's no SUBSTRING. There's SUBSTITUTE,
297:22 but there's no SUBSTRING. LEFT and RIGHT
297:24 are really the closest thing that we have,
297:26 so let's take a look real quick
297:29 and see what we can do. So we're going to
297:31 do LEFT, and it's going to say: returns
297:33 the specified number of characters from
297:35 the start of a text string. So we're
297:37 starting from the very far left, and we
297:39 need to choose our text and then choose
297:41 the number of characters that we're
297:43 going to be looking over. So let's go
297:46 over here and let's just, you know,
297:49 start simple, and we'll get a little bit
297:51 more advanced. So this is our
297:53 text range, so these are the ones
297:55 that we want to look at, and then how
297:56 many characters do we want to look
297:58 forward? We'll just choose three as
298:00 an example. And so you can see that it
298:03 takes the first three characters from
298:06 every single thing. Now, you can also do
298:08 this with numbers. It doesn't just have
298:10 to be, you know, names with actual
298:13 words or letters. You can do the exact
298:15 same thing. So you can say
298:18 RIGHT, and we're going to choose
298:20 our string, and let's do this one. So,
298:22 you know, all of them start with 100,
298:25 and we'll just say we want to take the
298:27 last one. So this one is going to start
298:30 from the very far right and go over one
298:32 character.
298:33 So right here you can see this is our
298:35 range and I just chose one, so starting
298:37 from the very far right we go over one
298:39 character and that's what we take. And so
298:41 character and that's what we take and so that can definitely be useful another
298:43 another one that you can do and this one is one
298:44 that I have used so many times I mean
298:46 honestly countless times in actually
298:49 using this in my job so we're going
298:51 to go from the right and we're going to
298:53 look at a date so you know sometimes you
298:55 have these date structures month month
298:58 day day year year year year or you
299:01 know day month year all these different formats
299:03 and sometimes you just want to extract
299:06 either the month or the year or
299:08 something like that the day and so we
299:10 want to come in here we're just going to
299:12 extract the oops I wanted to make that
299:14 a range we want to extract the year of
299:16 the start dates so we're going to do
299:18 that and then we're going to go over
299:20 four so we want to take the first four
299:22 characters from the right to give us the
299:24 entire year let's do that and now we can
299:27 see exactly the year and this can be
299:30 just super super useful this is again
299:32 one that I've used a lot and so
299:33 that is one that you might want to
299:34 remember in case you're ever doing
299:36 analysis on you know start and end dates
299:37 or anything with date data
299:41 again one that I highly recommend
299:43 remembering
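As a rough illustration of what RIGHT does in the walkthrough above, here is a minimal Python sketch. It is not Excel itself: `excel_right` is a hypothetical helper written for this note, and the ID and date values are made-up samples shaped like the ones described.

```python
def excel_right(text, num_chars=1):
    """Mimic Excel's RIGHT: return the last num_chars characters of text."""
    if num_chars < 0:
        raise ValueError("num_chars must be zero or greater")
    text = str(text)                 # RIGHT works on the text form of a value
    if num_chars == 0:
        return ""                    # guard: text[-0:] would return everything
    return text[-num_chars:]


# Hypothetical sample values (assumptions, not taken from the worksheet).
employee_id = 1001
start_date_text = "11/2/2001"        # a date stored as text, not as a real date

last_digit = excel_right(employee_id, 1)       # one character from the far right
start_year = excel_right(start_date_text, 4)   # four characters from the far right
print(last_digit, start_year)
```

The second call only gives the year because the sample date is text; a real date cell behaves differently, as the transcript goes on to explain.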
299:45 let's go over to date to text I actually probably should have
299:47 included that before because I
299:50 actually used it in this one if you
299:53 notice right here this is text so
299:56 in this one we just did that was text
299:58 you can't do this right on start and
300:01 end dates when it's in a date format and
300:05 let me show you so this is a date now if
300:08 I do equals and you know we just did
300:12 this let's do it on the end date and
300:16 I'll do the whole range give me a second
300:18 and we'll do
300:19 four it's giving us completely random
300:21 numbers why is that because underneath
300:23 the date range there are numbers
300:27 right so if I go right here and I make
300:30 this General it's going to have the
300:33 numbers and look these are the first
300:34 four characters from the right and so
300:36 it's doing what it's supposed to do but
300:39 it's not doing what we actually want
300:40 and that's the issue
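The "random numbers" come from the way Excel stores dates as serial day counts. The sketch below shows the idea in Python under two assumptions: the workbook uses Excel's default 1900 date system, and the end date shown is a made-up sample. `excel_serial` is a hypothetical helper, not an Excel or library function.

```python
from datetime import date


def excel_serial(d):
    """Serial number the 1900 date system uses for dates after 1 March 1900."""
    # Counting days from 1899-12-30 reproduces Excel's numbering for modern dates.
    return (d - date(1899, 12, 30)).days


end_date = date(2015, 9, 6)          # hypothetical end date
serial = excel_serial(end_date)      # what the cell shows in General format
last_four = str(serial)[-4:]         # what RIGHT(cell, 4) actually works on
print(serial, last_four)
```

The last four characters are digits of the serial number, not the year, which is why the date has to be turned into text first.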
300:43 so how can we convert this now there are a ton of
300:45 different ways but the quickest
300:48 probably the easiest besides actually
300:51 writing it out like this like
300:53 11-2-2001
300:55 which then converts it to a date
300:58 format but what you can do
301:01 just so you know you can create it as
301:03 text you can do 11-2-2001
301:07 and now it will stay a text string
301:11 and as you can tell these are a little
301:12 bit different because this one is
301:14 formatted or situated on the right and
301:16 this one's on the left that's how you
301:17 can tell the difference now if you don't
301:20 want to do it by hand completely
301:22 manually and waste hours of your time
301:25 you can do it in a very simple way so
301:28 we're going to do TEXT so this is the
301:31 exact formula that we're going to
301:33 use so let's get rid of that one there
301:36 we go so we're going to do equals we're
301:39 going to do TEXT it says
301:42 converts a value to text in a specific
301:45 number format so for a date format we
301:48 can choose a date format and then it'll
301:51 convert it to text for us which saves
301:54 so much time I promise you let's do
301:57 all of these just like we did and then
301:59 we need to tell it what the format is
302:02 if we tell it something
302:04 incorrect it's going to give us a
302:05 completely terrible output or just give
302:07 us an error altogether so this is a
302:09 day day month month year year year year
302:13 format and that is what we're going to
302:14 do so we're going to do
302:17 dd mm yyyy and close that up and there
302:22 you go and now well because it's in a
302:25 formula what we need to do
302:28 is copy
302:31 this and paste it right over here
302:34 and now you can see that is General
302:36 this is something that we can use as a
302:38 string and let's just check it just to
302:40 make sure we're going to do RIGHT we're
302:43 going to do this one let's do all of
302:45 them and we'll do
302:48 four and there you go so now it works
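The date-to-text step can be sketched in Python as follows. This is an analogue of the idea, not Excel's TEXT function: `date_to_text` is a hypothetical helper, the end date is a made-up sample, and the slash separator in the format is an assumption because the transcript only spells out day, month and year.

```python
from datetime import date


def date_to_text(d, fmt="%d/%m/%Y"):
    """Format a date as a day/month/year string, similar in spirit to TEXT with a date format."""
    return d.strftime(fmt)


end_date = date(2015, 9, 6)          # hypothetical end date
as_text = date_to_text(end_date)     # now an ordinary string
year = as_text[-4:]                  # taking four characters from the right works again
print(as_text, year)
```

Once the value is a string, taking characters from the right returns the year rather than digits of the underlying serial number.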
302:51 that is what we are looking for and
302:53 you can do that imagine doing that with
302:55 millions of rows or you know let's say
302:57 10,000 rows it's going to be a breeze
303:00 right it's going to take you two minutes
303:02 or a minute
303:03 to do everything that you want to do
303:04 instead of having to just do a bunch of
303:06 messy work to convert it to a string which I
303:09 promise you I've done it just takes
303:10 forever it's terrible so that is
303:14 date to text super helpful formula
303:17 let's go over to TRIM now I purposefully
303:20 messed up this column now why did I
303:24 mess it up like this because when you're
303:26 working with real data you're going to
303:28 get data like this it's messy it's
303:30 dirty it just has random spaces at the
303:34 end for no reason because sometimes
303:38 you're going to be working with data
303:40 that is input by a user it's not like
303:43 a drop-down option so imagine
303:45 somebody's typing this in they
303:46 accidentally put a space or they
303:47 actually put an enter or something and
303:50 then they submit it and this is how it's
303:51 going to look in the database and if
303:54 you're a data engineer or you know
303:55 you're working with the raw data if they
303:57 don't clean that up then you're going to
303:59 be working with that dirty data and
304:01 I guarantee you if you're working as a
304:03 data analyst you're going to see stuff
304:04 like this maybe not with a last name but
304:07 with all sorts of data so we're going to go
304:09 right here we're going to say equals
304:11 TRIM do an open parenthesis actually this
304:14 says removes all spaces from a text
304:15 string except for single spaces between
304:18 words so like you know if it said
304:20 Halpert space or Jim space Halpert it
304:24 won't take the space in between there
304:26 because it kind of understands that
304:28 in normal language that space is supposed
304:30 to be there so it won't do that but
304:33 we'll take that we'll give it this
304:35 range close that up and there you go now
304:39 it is nice and clean much more usable
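A small Python sketch of the TRIM behaviour described above, for illustration only. `excel_trim` is a hypothetical helper and the names are sample strings. Like Excel's TRIM, it only deals with the ordinary space character, so a stray line break or tab would need separate handling.

```python
def excel_trim(text):
    """Approximate TRIM: drop leading and trailing spaces, keep one space between words."""
    # Splitting on a single space yields empty strings for every extra space;
    # filtering those out and re-joining leaves exactly one space between words.
    return " ".join(word for word in text.split(" ") if word)


messy_last_name = "Halpert   "        # hypothetical user-typed value with trailing spaces
messy_full_name = "  Jim   Halpert "  # extra spaces at both ends and in the middle

print(repr(excel_trim(messy_last_name)))
print(repr(excel_trim(messy_full_name)))
```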
304:42 now let's look at CONCATENATE one that I
304:45 have used just way way way too many
304:49 times and something that I've used
304:51 CONCATENATE for and you'll see this one
304:53 in a lot of demonstrations for a good
304:55 reason because a lot of people use it
304:58 for this so what you can do is you
305:02 can say equals and well let me tell
305:05 you what CONCATENATE does real
305:07 quick so what CONCATENATE does oops I'm
305:11 totally messing up here but it joins
305:14 two or more text strings into one string
305:16 it basically joins things together and
305:19 adds them together so let's do
305:21 CONCATENATE and we're going to add this
305:23 first and last name again one that gets
305:25 used all the time but that's because
305:28 it really is useful so you can do this
305:31 and you can say now I want to
305:33 include this so concatenating this and
305:35 this and let's take a look so it says
305:37 Jim Halpert but it's all connected and
305:41 that's typically not how people write
305:42 their names so what we can do is we can
305:45 go back in here and we can do what my
305:47 demonstration up here already tells us
305:48 to do which is we're just going to add
305:50 another thing in here and if we add two
305:53 quotation marks we can include anything in
305:55 here we can include a dash we can
305:57 include an exclamation point or we can
305:59 just include a space so let's just
306:01 include a space really quick and just
306:04 like that it works perfectly and so now
306:07 we have the full name now something that
306:10 you could use it for is something like
306:12 generating an email this is something
306:15 that you absolutely could do and it's
306:19 you know pretty simple so I'm going to
306:21 do it like this I'm going to say oops what
306:24 did I do I'm going to say dot and then at
306:30 the end I'm going to say at
306:33 oops comma
306:36 quotation mark
306:38 @gmail.com and now I've created emails
306:41 for all of these people so just
306:44 something that you can do with this
306:46 and something that it absolutely is
306:48 used for and you'll see that
306:49 demonstration almost everywhere because
306:51 honestly it gets used a lot by data
306:54 analysts and so you know just a good
306:56 one to know understanding how
306:58 concatenation works let's go over to
307:01 the next one
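The CONCATENATE steps above (joined name, name with a space, generated email) can be sketched in Python like this. `concatenate` is a hypothetical stand-in for the Excel function; the name comes from the transcript and the email pattern follows what is typed in the demonstration.

```python
def concatenate(*parts):
    """Mimic CONCATENATE: join the text form of every argument, adding nothing in between."""
    return "".join(str(part) for part in parts)


first, last = "Jim", "Halpert"

joined = concatenate(first, last)                    # runs together, no separator
full_name = concatenate(first, " ", last)            # a quoted space as its own argument
email = concatenate(first, ".", last, "@gmail.com")  # swap the space for a dot, add a domain
print(joined, full_name, email)
```

Any literal piece (a space, a dash, a domain) is just another argument in quotation marks, which is what makes the function flexible.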
307:03 so we are going to do SUBSTITUTE now
307:06 SUBSTITUTE's really interesting there
307:08 are different ways you can do it I'm
307:10 going to show it to you on these dates
307:12 real quick that's what we're going to
307:14 look at so changing a date format
307:17 changing what it's supposed to look
307:19 like is absolutely something that
307:21 happens all the time and you know
307:24 sometimes you'll even get it like
307:26 this where it'll look like it'll be
307:28 messy it'll be a different
307:31 I guess format so this one all the
307:35 other ones have slashes where these ones
307:37 have
307:38 dashes and you know what you can do is
307:43 if you want to well let me actually go
307:46 with the no instances real quick because
307:47 this one actually makes the most
307:49 sense so we'll do equals and we're
307:53 going to say
307:54 SUBSTITUTE and oops and let me say
307:56 SUBSTITUTE replaces existing text with
307:59 new text in a text string so if we do an
308:03 open parenthesis it says we take the
308:05 text we have the old text we have the new
308:08 text and then we have what instance
308:10 or how many times or what
308:13 instance we're looking at and I'll
308:15 explain that in a little
308:17 bit so the text that we're going to be
308:19 looking at is this one right here so
308:20 let's take this
308:22 range and the old is we're going to take
308:25 this dash and so let's take the
308:30 dash and then what do we want to
308:32 replace it with we want to replace it
308:34 with this slash right here I think it's
308:36 a forward slash isn't that what it's
308:37 called it's called a forward slash am I
308:39 crazy and we're not going to put an
308:41 instance notice that that's in brackets
308:43 that means it's optional we're going to
308:44 do none of that and what it's going
308:47 to do is it's going to fix this so this
308:49 one is now in the correct format that we
308:52 want and that's fantastic that's you
308:54 know that's what we tried to accomplish
308:56 given what we had now let's fix that if
308:59 we want to do the exact same thing we
309:01 can say
309:03 what are we doing SUBSTITUTE we can
309:05 do SUBSTITUTE we can do open parenthesis
309:08 we'll give the range and now let's say
309:10 we want to change all of them to a
309:12 different format so instead of the
309:16 forward slash I'm going to keep calling
309:17 it that if that's correct we want to
309:20 give it a dash and so then we close that
309:22 and now all of them are in this new
309:24 format so it's able to substitute a
309:27 specific value for a new value and if
309:30 you don't include an instance
309:32 then it'll do it to every single one in
309:35 there so let's go over here and we're
309:38 going to actually use the
309:42 instance num and I'll show you what that
309:45 does and so really quick we'll do the
309:47 exact same thing that we just did we'll
309:49 do the forward slash and we want to
309:54 replace it with this one again this dash
309:58 but we only want to do it on the first
310:00 instance of that forward slash and so as
310:04 you can see all the
310:07 ones that were replaced are the very
310:08 first instance whereas the second
310:11 instance which is the second time it
310:12 appears in this string does not get
310:15 touched so if we take
310:18 this and we put it right over here and
310:22 we move it to
310:23 two it's kind of the opposite so the
310:26 first one wasn't touched the second one
310:28 was so we're choosing which instance or
310:30 which time it shows up in that string
310:33 and then it replaces it if you do not
310:35 choose an instance it chooses all of
310:37 them so this can be super useful if you
310:39 want to do like a bulk replace but
310:42 you only want to do it on a specific
310:44 column and you just want to use a
310:46 formula really quick right and so you
310:48 can use this in a lot of different ways
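To make the optional instance argument concrete, here is a minimal Python sketch of the SUBSTITUTE behaviour walked through above. `substitute` is a hypothetical helper written for this note, and the date strings are made-up samples.

```python
def substitute(text, old, new, instance_num=None):
    """Mimic SUBSTITUTE: replace every occurrence of old, or only the Nth if instance_num is given."""
    if not old:
        return text                      # nothing to look for, leave the text alone
    if instance_num is None:
        return text.replace(old, new)    # no instance given: replace all occurrences
    if instance_num < 1:
        raise ValueError("instance_num must be 1 or greater")
    pos = 0
    for _ in range(instance_num):
        idx = text.find(old, pos)        # walk forward to the Nth occurrence
        if idx == -1:
            return text                  # fewer occurrences than requested: unchanged
        pos = idx + len(old)
    return text[:idx] + new + text[pos:]


fixed = substitute("9-6-2015", "-", "/")            # no instance: both dashes change
first_only = substitute("9/6/2015", "/", "-", 1)    # only the first slash changes
second_only = substitute("9/6/2015", "/", "-", 2)   # only the second slash changes
print(fixed, first_only, second_only)
```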
310:49 so that's how you're able to actually do
310:51 it with the first instance the second
310:52 instance and if you don't include an
310:54 instance at all let's go over to SUM
310:58 this is one I think everyone knows
311:01 how to use but I want to show you two
311:03 other ones as well so let's go to the
311:07 SUM and we're just going to do equals
311:09 SUM and I hope you know what this is
311:10 well not hope if you don't know what
311:12 this is it just adds up all the numbers
311:14 in a range so we're going to add sum means
311:16 add so we're going to take this and it's
311:19 going to give us what all these
311:21 salaries are together so super super
311:23 simple SUM is probably one of the most
311:25 basic formulas that you can do SUMIF
311:29 is a little bit different you can add
311:33 an if statement which we learned right
311:35 back here you can add an if statement
311:38 and then add it if it meets a certain
311:41 criteria all right so we're going to do
311:44 equals SUMIF and then you're going to
311:47 need to give a range and criteria and you
311:50 can include a sum range if you would
311:52 like so we're going to do the salary
311:55 again we're going to do a comma and now
311:57 here's our criteria let's do if they
312:00 have greater than 50,000 for their
312:03 salary and close our parenthesis so now
312:07 it's only going to add up if their
312:09 salary is greater than 50,000 now his is
312:12 50,000 exactly so that won't count but
312:14 we have 63 and 65,000 which does equal
312:18 128,000 so it just gives a specific
312:22 criteria or an if statement then it does
312:25 the addition so super useful on that
312:27 one so that is how you do a SUMIF
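The difference between SUM and SUMIF can be sketched in Python as below. `sumif` is a hypothetical helper, and the salary list is an assumption: the transcript only mentions 50,000, 63,000 and 65,000, so the other figures are sample values picked so that the conditional total matches the 128,000 quoted.

```python
def sumif(criteria_range, test, sum_range=None):
    """Rough analogue of SUMIF: add the values whose matching criteria cell passes the test."""
    if sum_range is None:
        sum_range = criteria_range       # like SUMIF, default to summing the tested range
    return sum(value for cell, value in zip(criteria_range, sum_range) if test(cell))


# Hypothetical salary column (sample data, not the actual worksheet).
salaries = [45000, 36000, 63000, 47000, 50000, 65000, 41000, 48000, 42000]

total = sum(salaries)                              # plain SUM: everything
over_50k = sumif(salaries, lambda s: s > 50000)    # strictly greater, so 50,000 is left out
print(total, over_50k)
```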
312:29 and SUMIFS is kind of the same thing as we did
312:32 back here there's the IF and the IFS so
312:35 the IFS is going to be if it
312:37 meets multiple conditions so let's take
312:39 a look at that one so let's do equals
312:44 SUMIFS now the syntax for
312:49 this one is going to be a little bit
312:50 different you'll see that in just a
312:52 second this adds the cells specified by
312:54 a given set of conditions or criteria so
312:58 let's do an open parenthesis we
313:00 give the sum range so let's do the
313:03 same one as before then we have our
313:05 criteria range so what are we looking at
313:09 this is the area that's going
313:10 to be added after all these if
313:12 statements are done right so we have to
313:16 initially set that now we're going to
313:18 say okay what criteria are we basing
313:20 this off of so let's put a comma and
313:23 we're going to base it off of let's do
313:25 this one we'll say the gender
313:30 so we'll do comma if that's female
313:35 and then we'll give
313:37 another one we can say if they're female
313:41 and let's say they are greater than
313:44 30 and we'll close that up
313:48 and it's going to give us 88,000 so
313:50 female female there's one two right here
313:55 so it's going to be this one and this
313:57 one that equals 88,000 so that's how
314:00 that works you're able to incorporate
314:02 several different
314:04 conditions into the SUM formula so
314:08 again I know this one's super simple but
314:10 you can use it in a much more
314:11 complex way if you use SUMIF and
314:14 SUMIFS
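Here is a minimal Python sketch of the SUMIFS idea: the sum range comes first, then pairs of criteria range and condition, and a row only counts when every condition holds. `sumifs` is a hypothetical helper, and the three columns are sample data (including the assumption that the "greater than 30" condition applies to an age column), chosen so the result matches the 88,000 mentioned.

```python
def sumifs(sum_range, *criteria):
    """Rough analogue of SUMIFS: criteria are (range, test) pairs that must all pass."""
    total = 0
    for i, value in enumerate(sum_range):
        if all(test(cells[i]) for cells, test in criteria):
            total += value
    return total


# Hypothetical columns of equal length (sample data, not the actual worksheet).
gender = ["Male", "Female", "Male", "Female", "Male", "Male", "Female", "Male", "Male"]
age = [30, 30, 29, 31, 32, 35, 32, 38, 31]
salary = [45000, 36000, 63000, 47000, 50000, 65000, 41000, 48000, 42000]

result = sumifs(
    salary,
    (gender, lambda g: g == "Female"),   # first condition
    (age, lambda a: a > 30),             # second condition, both must hold
)
print(result)
```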
314:18 the sum ifs um almost the exact same thing for this count I'm not going to go
314:20 thing for this count I'm not going to go super in depth into this one um I'll
314:23 super in depth into this one um I'll just kind of show you because count is
314:27 just kind of show you because count is um count and sum are kind of on the same
314:30 level of difficulty they're both pretty
314:33 beginner this is just going to give you
314:34 a count of how many cells um are there
314:38 so let's give this range um and so it's
314:41 not going to add it it's just going to
314:42 give us a count so if we do right here
314:44 and scroll over them like highlight them
314:47 this count down here oops this count down
314:49 here is nine and so it's going to give
314:50 us that count but we can do a count with
314:54 conditions exactly how we did it in the
314:57 sumif so if we do countif oops I did not
315:00 spell that right if we do countif we're
315:03 going to give a range and a criteria
315:05 exact same as we did before so let's do
315:08 this I mean you can do this on basically
315:11 any of these it doesn't really for this
315:12 demonstration it doesn't really matter
315:14 um but we'll say if their salary is
315:18 greater than 45,000 so how many people
315:20 this is going to give us how many people
315:22 have a salary over 45,000 and that's
315:25 five so before in the sumif when we did
315:28 that um we did 50,000 it adds everything
315:31 together the countif is just going to
315:33 count the number of cells that meet that
315:35 criteria
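A rough Python sketch of the difference between count, countif and sumif described above. The salary figures are invented for illustration (the worksheet values are not listed in the transcript), and `countif` and `sumif` are hypothetical helpers, not built-ins. In the worksheet the conditional count would be written along the lines of `=COUNTIF(salary_range, ">45000")`, where `salary_range` stands for the highlighted cells.

```python
# Made-up salaries; the real worksheet values are an assumption here.
salaries = [45000, 36000, 63000, 47000, 50000, 65000, 41000, 48000, 42000]


def countif(values, condition):
    """Count the cells that meet the condition (hypothetical COUNTIF stand-in)."""
    return sum(1 for value in values if condition(value))


def sumif(values, condition):
    """Add up the cells that meet the condition (hypothetical SUMIF stand-in)."""
    return sum(value for value in values if condition(value))


total_cells = len(salaries)                              # plain count of the range
over_45k = countif(salaries, lambda s: s > 45000)        # how many cells qualify
sum_over_50k = sumif(salaries, lambda s: s > 50000)      # what those cells add up to

print(total_cells, over_45k, sum_over_50k)
```

The count only says how many cells qualify, while the sum adds their contents together, which is the distinction being made in the video.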
315:39 and again countifs uh we're going to have a criteria
315:41 range and then we will specify what if
315:45 statements we want to occur in
315:48 order to count those cells so let's do
315:52 we want you know we want to count it can
315:54 be any range or it can be any of these
315:56 we'll do the ID this time and now we can
316:00 say you know what we want it to be as our
316:03 criteria one we can say we want it to be
316:06 greater than we want their ID to be greater
316:09 than
316:11 1005 and let's say we want them to
316:19 be male so they have an ID over a certain
316:23 um a certain value and then they are a
316:26 male so there's only three people that
316:27 meet that criteria and so it'll be
316:31 Michael Stanley and Kevin those are our
316:33 three people and so it gives us a count
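The countifs idea (every condition has to hold on the same row) can be sketched in Python as follows. The IDs, first names and genders below are invented so that the result lines up with the three people named in the video; they are not the real worksheet values, and `countifs` is a hypothetical helper. In the worksheet this would look something like `=COUNTIFS(id_range, ">1005", gender_range, "Male")`, with `id_range` and `gender_range` standing for the two columns.

```python
# Invented rows: (employee ID, first name, gender). Assumed data, not the real sheet.
employees = [
    (1001, "Jim", "Male"),
    (1002, "Pam", "Female"),
    (1003, "Dwight", "Male"),
    (1004, "Angela", "Female"),
    (1005, "Toby", "Male"),
    (1006, "Michael", "Male"),
    (1007, "Meredith", "Female"),
    (1008, "Stanley", "Male"),
    (1009, "Kevin", "Male"),
]


def countifs(rows, *conditions):
    """Count the rows where every condition is true (hypothetical COUNTIFS stand-in)."""
    return sum(1 for row in rows if all(check(row) for check in conditions))


def id_over_1005(row):
    return row[0] > 1005


def is_male(row):
    return row[2] == "Male"


count = countifs(employees, id_over_1005, is_male)
names = [row[1] for row in employees if id_over_1005(row) and is_male(row)]

print(count, names)
```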
316:35 very useful to give quick numbers like
316:37 this something I genuinely use a lot
316:41 and I know I've said that a lot during
316:43 this tutorial but that's because
316:45 everything I'm showing you are things
316:47 that I've used a lot so I don't feel
316:48 like um you know I'm speaking out of
316:50 turn here let's look at this one this
316:52 one is very um has some specific use
316:57 cases um notice that this is text
316:59 right now um if you do it when it is uh
317:03 in a date format it actually will not
317:06 work I mean you can test it out
317:08 yourself you just got to trust me it's
317:09 not going to work so what this does is
317:12 it's going to give you the range from
317:15 this day to this day that's what it's
317:16 going to do so let's do uh oops days
317:20 we want to choose our end date
317:22 so this is our end date it's kind of
317:24 backward from what you think end date to
317:26 start date you think start date to end
317:28 date so you have to start with this one
317:30 and then we're going to choose the start
317:31 date and now it's going to tell us how
317:34 many um how many uh days it was from
317:40 here to here and this one it's 5,56
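The argument order is the part that trips people up: the formula is `=DAYS(end_date, start_date)`, end date first. A minimal Python sketch of the same subtraction, using made-up dates rather than the ones on screen (`days` here is a hypothetical helper that mirrors that argument order):

```python
from datetime import date


def days(end_date, start_date):
    """Number of days from start_date to end_date, end date first like the formula."""
    return (end_date - start_date).days


# Made-up example dates, not the employee dates from the video.
start = date(2020, 1, 1)
end = date(2020, 1, 31)

elapsed = days(end, start)      # end date first, then start date
swapped = days(start, end)      # swapping the arguments flips the sign

print(elapsed, swapped)
```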
317:43 so networkdays is extremely
317:46 similar except it takes out holidays and
317:49 it takes out weekends and you can see
317:51 how many working days has this person um
317:55 how many working days or network days
317:57 has this person worked not including you
317:59 know weekends and holidays have they
318:01 actually worked between their start date
318:03 and their end date so let's do networkdays
318:06 and we need our start date our end
318:08 date and you can specify holidays
318:11 if you'd like it only takes out weekends
318:14 on its own so any holidays have to be
318:17 listed um so you know if you want to
318:19 do that you can so we're going to do the
318:21 start date again this one's different
318:23 this one says start date end date and
318:25 then we're going to give the end
318:27 date and if you
318:29 notice they are going to be different
318:31 numbers this one is dramatically lower because
318:33 it's taking out weekends and holidays so
318:36 this is how many days uh calendar days
318:38 they've worked and this is how many days
318:40 they've actually been in the office and
318:42 worked and that is it
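To see why the networkdays number comes out lower than the days number, here is a small Python sketch that counts weekdays between two dates. It assumes the usual Saturday/Sunday weekend and counts both the start and end date; `networkdays` below is a hypothetical stand-in, and it only skips holidays that are passed in, which is also how the worksheet formula `=NETWORKDAYS(start_date, end_date, [holidays])` treats its optional third argument. Unlike Excel, this sketch simply returns 0 when the start date is after the end date.

```python
from datetime import date, timedelta


def networkdays(start_date, end_date, holidays=()):
    """Count Monday-to-Friday dates from start to end inclusive, minus listed holidays."""
    skipped = set(holidays)
    count = 0
    current = start_date
    while current <= end_date:
        # weekday() is 0 for Monday through 4 for Friday
        if current.weekday() < 5 and current not in skipped:
            count += 1
        current += timedelta(days=1)
    return count


# Made-up two-week window: Monday 1 Jan 2024 through Sunday 14 Jan 2024.
start = date(2024, 1, 1)
end = date(2024, 1, 14)

calendar_days = (end - start).days                        # what days would report
working_days = networkdays(start, end)                    # weekends removed
with_holiday = networkdays(start, end, [date(2024, 1, 1)])  # one listed holiday removed

print(calendar_days, working_days, with_holiday)
```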
318:47 again there are so many formulas I mean literally
318:48 hundreds of formulas that you can
318:51 utilize and use and are out there for
318:54 you to try out yourself if there are
318:57 specific ones that I did not cover in
319:00 this video please put them in the
319:02 comments below so that I can you know
319:05 show you how to do these things I
319:07 will say I've probably used a majority
319:08 of the ones that you're going to put in
319:09 the comments already and if I haven't
319:12 used it I'll take a look at it and see
319:13 if it's really useful and I'll show you
319:15 that so thank you guys so much for
319:18 watching I hope that this has been
319:19 helpful I feel like a lot of these
319:22 things are not things that I learned
319:24 before I started almost all these are
319:26 ones that I learned while I was on the
319:28 job and so I'm hoping that you can get
319:30 ahead of the curve and you can learn
319:31 these things before you actually
319:32 start so that when you get in there
319:34 you're just like killing it with the
319:36 formulas and people are like whoa this
319:38 guy is like this guy knows what he's
319:39 doing in Excel give him all the Excel
319:41 work and then you become like you know
319:43 just the Excel guy um and everyone you
319:45 know loves you for it so with that being
319:48 said thank you so much for watching I
319:49 really do hope this helped if you like
319:51 this video be sure to like and subscribe
319:53 below I'll see you in the next video
319:56 [Music]
320:08 what's going on everybody welcome back to another video in this Excel tutorial
320:10 we'll be looking at
320:17 [Music] xlookup now if you don't already know what
320:19 xlookup is it is a new feature in Excel
320:21 to kind of replace vlookup or to be a
320:24 much better option at least in my mind
320:26 it is a much better option than vlookup
320:28 and so if you're someone who's either
320:30 used vlookup a lot and you're trying
320:32 to you know learn this new option or if
320:33 you've never used it before this video
320:35 will be super helpful because I'll walk
320:37 you through kind of the options and what
320:38 xlookup can do as well as the
320:40 difference between xlookup and vlookup
320:42 but before we get into the tutorial I
320:43 want to give a huge shout out to today's
320:45 sponsor and that is udemy udemy is the
320:47 go-to place if you want a full-fledged
320:48 course in Excel I have three options of
320:51 courses that I have taken on udemy so
320:53 I'd highly recommend checking those out
320:55 they are having a huge sale on all their
320:56 courses during this time and so if you
320:58 are in the market for a course I highly
321:00 recommend checking out udemy and
321:02 getting one there now without further
321:03 ado let's jump onto my screen and start the
321:05 tutorial all right so let's get me off
321:07 the screen because we all know why we're
321:08 here so I didn't include this in the
321:10 formulas video last week because I knew
321:14 this was going to be a large one and a
321:15 lot of people are going to want to know
321:16 how to do this what the difference
321:18 between vlookup and xlookup is so it
321:20 has its own dedicated video so
321:23 let's get started it is a formula so
321:24 we're going to come in here in this cell
321:26 we're going to hit equals and then we're
321:28 going to start typing xlookup now I'm
321:30 going to hit tab in just a second but let's
321:33 read what this says it says searches a
321:34 range or an array for a match and
321:37 returns the corresponding item from a
321:39 second range or array by default an
321:41 exact match is used so really useful to
321:44 know um we'll talk a little bit more
321:45 about that in just a second let's hit
321:47 tab and it's going to complete it and
321:50 it's going to start giving us or it's
321:51 going to tell us what our input values
321:53 need to be we're going to have our
321:55 lookup value we're going to have our
321:57 lookup array our return array and then
322:00 some options things like if not found so
322:03 if your value isn't found you know what
322:05 will be um you know the output
322:08 that it gives us a match mode and a
322:11 search mode and I'm going to show you um
322:12 kind of how to use every single one of
322:14 these things as you can see at the very
322:16 bottom I've kind of already set up all
322:18 of the instructional um instructional
322:21 content for this video and so we'll kind
322:24 of get through all these different
322:25 scenarios so let's just start really
322:28 quickly with um how to use it very
322:31 simply with the lookup value lookup array and
322:33 return array so we're going to come in
322:36 here and we're going to give it our
322:37 lookup value now Toby flenderson right
322:40 over here in A3 is going to be our
322:42 lookup value so that's who we're going
322:44 to be searching for now we're going to
322:46 hit comma and now we're going to be
322:48 needing to look up uh or to input our
322:50 lookup array now an array is just uh you
322:52 know a range basically so we're going to
322:55 do this is where it's going to be
322:57 searching for um that value this is
323:00 where it searches for A3 so here's
323:02 Toby flenderson here's Toby flenderson so
323:04 it will find it in this array right here
323:08 then we're going to hit comma and now we
323:10 need to give it the return array what
323:11 it's going to return on that row when it
323:14 finds it so we're going to return his
323:16 email keep it really simple so what it
323:18 should do and let's close parentheses
323:21 what it should do is it should take Toby
323:22 flenderson it's going to search in this
323:25 column or in this array and then it's
323:28 going to return the email when it finds
323:31 Toby flenderson so Toby flenderson
323:33 is on row six so it's going to find Toby
323:37 flenderson it's going to come over here
323:38 and it's going to return Toby flenderson
323:41 dundermifflin corporate.com that's what
323:43 it should do let's see what it actually
323:45 does hit enter and it returns it now if
323:49 we drag it down like this it'll apply it
323:52 to all of these names right here and it
323:54 works exactly how it's supposed to um
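The basic lookup just walked through (lookup value, lookup array, return array) can be sketched in Python like this. The names and email addresses are placeholders, not the worksheet's real values, and `xlookup` here is a simplified hypothetical stand-in that only does the default exact match; it is also case-sensitive, whereas Excel's text matching is not. The worksheet version would be along the lines of `=XLOOKUP(A3, full_name_range, email_range)`, with the two range names standing for the selected columns.

```python
# Placeholder columns; the real full-name and email columns are assumptions.
full_names = ["Jim Halpert", "Pam Beasley", "Dwight Schrute", "Toby Flenderson"]
emails = ["jim@example.com", "pam@example.com", "dwight@example.com", "toby@example.com"]


def xlookup(lookup_value, lookup_array, return_array):
    """Find lookup_value in lookup_array and return the item on the same row."""
    for key, result in zip(lookup_array, return_array):
        if key == lookup_value:      # exact match only, like the default match mode
            return result
    return "#N/A"                    # roughly what the cell shows when nothing matches


toby_email = xlookup("Toby Flenderson", full_names, emails)

# Dragging the formula down is the same lookup repeated for each name.
all_emails = [xlookup(name, full_names, emails) for name in full_names]

print(toby_email)
```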
323:57 again if you have never used vlookup you
324:00 don't know how good you have it okay
324:02 vlookup um was extremely useful but just
324:05 uh a bit complicated and I'll talk about
324:06 that near the end of the video when we
324:08 compare vlookup to xlookup but just
324:11 know that if you're using xlookup for
324:13 the first time and you're just getting
324:14 into using Excel you guys have it good
324:17 okay so just know that um now let's go
324:20 over here to xlookup multiple rows
324:23 because you can return more than one
324:26 output with um with xlookup so let's go
324:31 right in here and we're going to
324:33 basically write the exact same thing as
324:36 we did before so let's write xlookup
324:39 we're going to do Toby flenderson as our
324:41 value we're going to search here and
324:44 we're going to do something a little bit
324:45 different this time we want to include
324:47 our end date and the email so what we're
324:50 going to do is we're going to start here
324:52 we're going to go down all the way to
324:53 the bottom of end date and then we're
324:55 also going to include the email and when
324:58 we do that it will uh in the output give
325:01 us a column for end date and a
325:04 column for email so an output for both
325:07 so let's hit enter and now we can see
325:09 that we have the end date here and the
325:12 email here now one of the downsides or
325:15 something that I'm not a huge
325:18 fan of is well first off I love that you
325:20 can do this that's fantastic um but they
325:24 have to be right next to each other so
325:26 you're only going to get that output
325:28 exactly how it is in the columns so if I
325:31 went and did this range um I would
325:33 include all of that um so you know
325:37 let's just for example let's pull that
325:39 down here so let's take
325:41 this and put it right here if I did
325:45 instead of O2 to P10 if I
325:50 included age to email this whole range
325:53 and I hit enter it's all going to be
325:56 included so you know that's one of the
325:58 small downsides of that functionality
326:02 when you use multiple rows is
326:04 that it's going to use the columns exactly
326:06 as they are you can't really customize
326:09 it within the formula you can move
326:12 around um these columns to how you want
326:14 it um so that is something to note and
326:18 again you can pull this down and it'll
326:20 be applied to all of those names
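A hedged Python sketch of the multi-column return: the return array is a block of adjacent columns, and the result is one value per column from the matching row, in the same left-to-right order as the sheet. All values below are invented placeholders, and `xlookup_columns` is a hypothetical helper written only for this illustration.

```python
# Placeholder columns in worksheet order; none of these values come from the video.
full_names = ["Jim Halpert", "Pam Beasley", "Toby Flenderson"]
ages = [30, 30, 32]
end_dates = ["9/6/2015", "10/10/2015", "9/11/2017"]
emails = ["jim@example.com", "pam@example.com", "toby@example.com"]


def xlookup_columns(lookup_value, lookup_array, return_columns):
    """Return one value per return column from the row where lookup_value is found."""
    for row, key in enumerate(lookup_array):
        if key == lookup_value:
            return [column[row] for column in return_columns]
    return None


# Two neighbouring columns selected as the return array: end date, then email.
narrow = xlookup_columns("Toby Flenderson", full_names, [end_dates, emails])

# Selecting a wider block brings back every column inside it, in sheet order.
wide = xlookup_columns("Toby Flenderson", full_names, [ages, end_dates, emails])

print(narrow)
print(wide)
```

The output order always follows the column order of the selection, which is the limitation mentioned above: to change it you rearrange the columns, not the formula.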
326:23 let's go over to xlookup exact match so let's
326:27 open this up we're going to do equals
326:29 xlookup as we've been doing and we're
326:30 actually going to be looking at the if
326:32 not found and the match mode both you
326:34 know on this tab right here so let's do
326:38 what we've been doing before we take our
326:40 value that we're looking up we take the
326:44 array that we're looking in and we're going
326:46 to return the email and you know as you can
326:50 see this says Toby flender and not Toby
326:53 flenderson so what we are going to do is
326:56 we're going to hit comma and if it's not
326:57 found you can return um a value or a
327:01 string that you want to return now for
327:04 simple purposes or for simple
327:06 instructional purposes we're going to do
327:08 not
327:10 found and then we're going to close that
327:13 off so let's do this and Toby flender
327:16 was not found and so it returned not
327:19 found if Toby flender was actually in
327:22 this full name column then it would have
327:25 returned the email and then if along the
327:27 way you know one of these was not part
327:28 of it then you know we would have uh we
327:32 would have had the not found
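The if not found argument is just a fallback value that comes back when the exact match fails. A minimal Python sketch, with placeholder names and emails and a hypothetical `xlookup` helper; in the worksheet it would be something like `=XLOOKUP(A3, full_name_range, email_range, "Not Found")`.

```python
# Placeholder columns; assumed data for illustration only.
full_names = ["Pam Beasley", "Toby Flenderson", "Kevin Malone"]
emails = ["pam@example.com", "toby@example.com", "kevin@example.com"]


def xlookup(lookup_value, lookup_array, return_array, if_not_found="#N/A"):
    """Exact-match lookup that hands back if_not_found when there is no match."""
    for key, result in zip(lookup_array, return_array):
        if key == lookup_value:
            return result
    return if_not_found


misspelled = xlookup("Toby Flender", full_names, emails, "Not Found")
correct = xlookup("Toby Flenderson", full_names, emails, "Not Found")

print(misspelled, correct)
```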
327:34 all right so let's go right up here we're actually
327:36 just going to copy this uh because I
327:38 want to reuse it um and then we're going
327:41 to go right here and we hit a comma now
327:43 this is our match mode option and so we
327:47 have four different options that we can
327:48 choose from a zero is an exact match and
327:50 that is by default that is what we have
327:53 or what we use then there's a minus one
327:56 that's an exact match or next smaller
327:58 item then there's a one which is an
328:00 exact match or next larger item and then
328:02 there's a two which is a wildcard
328:04 character match now we're going to do
328:06 that and we are going to um you know try
328:09 this out and it's not going to work and
328:12 not just because I forgot to put A4 um
328:14 it's doing it because it's searching for
328:16 Beasley but if there's not a wildcard
328:20 option already put in here um it doesn't
328:23 recognize it so we need to indicate
328:25 where that wildcard needs to be so
328:27 we're going to do a double quote or
328:29 quotation marks we're going to put
328:30 an asterisk right here and then do
328:32 another one and we're going to hit an
328:35 ampersand so we're going to have an
328:36 ampersand right here and what that's
328:38 going to say is anything that comes
328:40 before A4 anything that comes before
328:43 Beasley is okay doesn't matter what it
328:45 is as long as it has Beasley at the end
328:48 that is going to be okay so we're going
328:49 to have Pam that comes before Beasley
328:52 and that's going to tell it and it's
328:53 going to say okay I know that anything
328:55 that comes before Beasley is all right
328:57 and so when we hit enter it's now going to
328:59 return the output that we are looking
329:01 for and we can include that on these as
329:04 well now this one is Meredith um and so
329:08 Meredith is at the beginning so we have
329:11 Meredith Palmer so we can actually take
329:13 this and we're going to put this at the
329:17 end put the ampersand right here and
329:20 now it'll work and the exact same thing
329:24 for Kevin Malo right here Kevin Malone
329:27 so it just didn't include uh the ne at
329:30 the end and so it's still going to work
329:33 if we include that asterisk at the end
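Here is a rough Python sketch of the wildcard match mode, using the standard library's `fnmatch` patterns as a stand-in for Excel's wildcards. That substitution is an assumption made for illustration: `*` behaves the same way in both, but `fnmatch` also treats `[` specially and Excel uses `~` as its escape character, so the two are not identical. The names and emails are placeholders and `xlookup_wildcard` is a hypothetical helper. In the worksheet the formula would be along the lines of `=XLOOKUP("*"&A4, full_name_range, email_range, "Not Found", 2)`.

```python
from fnmatch import fnmatchcase

# Placeholder columns; assumed data for illustration only.
full_names = ["Pam Beasley", "Meredith Palmer", "Kevin Malone"]
emails = ["pam@example.com", "meredith@example.com", "kevin@example.com"]


def xlookup_wildcard(pattern, lookup_array, return_array, if_not_found="Not Found"):
    """Return the first row whose key matches the wildcard pattern, ignoring case."""
    for key, result in zip(lookup_array, return_array):
        if fnmatchcase(key.lower(), pattern.lower()):
            return result
    return if_not_found


no_wildcard = xlookup_wildcard("Beasley", full_names, emails)        # no asterisk, so no match
ends_with = xlookup_wildcard("*" + "Beasley", full_names, emails)    # like "*"&A4
starts_with = xlookup_wildcard("Meredith" + "*", full_names, emails)  # like A5&"*"
partial = xlookup_wildcard("Kevin Malo" + "*", full_names, emails)

print(no_wildcard, ends_with, starts_with, partial)
```

Where the asterisk goes decides which part of the cell is allowed to vary: in front for a known ending, behind for a known beginning.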
329:36 now I know I said we were looking at
329:37 search order but I'm actually going to
329:38 kind of give you an exact match uh first
329:40 and then search order it's just kind
329:42 of easier to show it over here so I'm
329:44 going to do xlookup I'm going to look
329:47 up this value do a comma here's the
329:50 range this is our start date that it's
329:52 going to be looking in and I want to
329:54 return the full name now no value in
329:58 here has 1/1/2000 but what we can do
330:02 is we can do comma and then a comma for
330:05 the match mode and do an exact match or
330:07 next
330:09 larger and I know this is in the exact
330:11 match part but it you know kind of
330:14 relates to search order a little bit um
330:16 where it searches for the next largest
330:18 value that's what that number one
330:20 represents the next larger value so we
330:22 have 1/1/2000 and if we look right here
330:24 the next value above 1/1/2000 is
330:28 1/5/2000 and so it should return
330:31 Angela Martin let's see if that works
330:34 and there it is
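Match mode 1 (exact match or next larger item) can be sketched in Python as below. The start dates are invented, apart from reading the spoken dates as 1/1/2000 and 1/5/2000, which is itself an interpretation of the captions; `xlookup_next_larger` is a hypothetical helper. The worksheet formula would look something like `=XLOOKUP(lookup_cell, start_date_range, full_name_range, , 1)`, leaving if not found empty.

```python
from datetime import date

# Assumed data: only Angela Martin's 5 Jan 2000 start date is taken from the captions.
full_names = ["Jim Halpert", "Pam Beasley", "Dwight Schrute", "Angela Martin"]
start_dates = [date(2001, 11, 2), date(1999, 10, 3), date(2000, 7, 4), date(2000, 1, 5)]


def xlookup_next_larger(lookup_value, lookup_array, return_array, if_not_found="#N/A"):
    """Exact match if there is one, otherwise the smallest key above lookup_value."""
    best = None
    for key, result in zip(lookup_array, return_array):
        if key == lookup_value:
            return result
        if key > lookup_value and (best is None or key < best[0]):
            best = (key, result)
    return best[1] if best is not None else if_not_found


# Nobody started on 1 Jan 2000, so the next larger start date wins.
next_larger = xlookup_next_larger(date(2000, 1, 1), start_dates, full_names)

print(next_larger)
```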
330:36 now let's look at the actual search order um so let's do
330:38 equals
330:40 xlookup this is the value that we want to
330:42 be searching for and we're going to be
330:44 looking in this start date and comma and
330:49 we want to return the name now let's get
330:52 over to search mode now the search mode
330:55 performs a search starting at the first
330:57 item so at the very top going down so by
331:00 default it searches from first to last
331:02 but you can reverse that and do search
331:05 from last to first or you can do a
331:07 binary search which relies on the data being sorted in
331:09 ascending order or sorted in descending
331:11 order um and that's with the actual
331:14 value and so we won't be able to show
331:17 this binary search on ascending or
331:20 descending because our values are the
331:22 same but if we had different values and
331:25 we were looking up um using this um next
331:29 largest we would be able to show that
331:31 but I'm going to show you the search
331:32 from first to last and last to first so
331:34 let's put in by default and this is what
331:36 it would be search from first to last
331:38 what the default would be so it starts
331:40 at the very top it goes down and finds
331:42 the first 5/6/2001 and returns Toby
331:46 flenderson now if we go in here and we
331:49 hit minus one that is going to search
331:51 from last to first so it's going to
331:53 start at the bottom and go to the top
331:54 and the first one that it finds is
331:56 Michael Scott so that's that first one
331:58 starting from the bottom and then the
332:01 Michael Scott right there
332:04 Michael Scott right there so these two the exact match and the search order can
332:05 the exact match and the search order can kind of be combined into um this one
332:08 kind of be combined into um this one right here we're using this
332:09 right here we're using this one um which is you know exact match or
332:12 one um which is you know exact match or next larger and you can include that in
332:15 next larger and you can include that in this binary search in this one as well
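The first-to-last and last-to-first search modes can be sketched in Python as below. This is a minimal illustration under assumptions: `xlookup_exact` is a made-up helper, the binary search modes are left out, and the rows other than the two sharing the 5/6/2001 start date are hypothetical.

```python
def xlookup_exact(lookup_value, lookup_array, return_array, search_mode=1, if_not_found=None):
    """Exact-match lookup that scans top-down (1) or bottom-up (-1)."""
    if search_mode == 1:
        indices = range(len(lookup_array))
    elif search_mode == -1:
        indices = reversed(range(len(lookup_array)))
    else:
        raise ValueError("this sketch only covers search_mode 1 and -1")
    for i in indices:
        if lookup_array[i] == lookup_value:
            return return_array[i]  # first hit in the chosen direction
    return if_not_found


# Two rows share the same start date, as in the walkthrough; the rest is hypothetical.
start_dates = ["11/2/2001", "5/6/2001", "10/3/1999", "5/6/2001"]
names = ["Jim Halpert", "Toby Flenderson", "Dwight Schrute", "Michael Scott"]

first_to_last = xlookup_exact("5/6/2001", start_dates, names, search_mode=1)
last_to_first = xlookup_exact("5/6/2001", start_dates, names, search_mode=-1)
print(first_to_last, "|", last_to_first)
```

The only thing that changes between the two calls is the direction of the scan, which is why duplicate lookup values are needed to see any difference.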
332:17 all right now let's head over to the
332:19 XLOOKUP horizontal I think we only
332:22 have a few left yep XLOOKUP horizontal
332:24 then we'll do XLOOKUP with SUM and then
332:26 I'm going to show you the VLOOKUP at
332:27 the end so let's go right here let's say
332:29 equals XLOOKUP the value that we want
332:32 to be searching for is February that's
332:33 what we're looking for hit comma and
332:35 where do we want to search to find
332:37 February we want to search in these
332:40 calendar months and then we hit another
332:42 comma and now we're going to be
332:43 returning paper so let's select paper
332:47 and we'll hit enter and it found
332:50 February and it returned paper right here
332:53 and we can do that for paper printer and
332:56 manila folders and so it's going to give
332:58 us the 310 the 40 and the 118 from
333:01 February now let's go right over here to
333:03 XLOOKUP with SUM actually it's
333:05 basically a carbon copy of this let's
333:09 take this over here real
333:12 quick and place it right there because
333:15 it's the exact same thing except at the
333:17 end I'm going to show
333:19 you how to use SUM with XLOOKUP at
333:22 the same time now we're going to be
333:26 using the formula SUM and
333:30 so we're going to do SUM and then within
333:32 the SUM our first number is going to be
333:34 an XLOOKUP and then our next value is
333:37 also going to be an XLOOKUP so let's do
333:41 XLOOKUP and now we're going to search
333:44 for our very
333:47 first lookup value so we're going to go
333:49 to
333:51 I1 and then we're going to search this
333:55 again and we want whatever value
334:00 goes into that so let's close that
334:02 parenthesis and now we're going to do a
334:04 colon and another
334:07 XLOOKUP and now let's do March so now
334:11 we're going to search for March we're
334:13 going to do our search range where we're
334:16 searching for that March and we want the
334:18 paper as
334:20 well and let's close that and then we
334:23 also need to close that parenthesis so
334:26 now we are basically adding this
334:28 February and this March so it's
334:30 going to be 310 plus 150 it's adding
334:33 those two values and it should be
334:36 460 so let's see if that is our
334:39 output and it is so you can do this with
334:43 a lot of things not just SUM you're
334:44 able to use XLOOKUP within different
334:47 formulas if you're searching for a
334:48 specific value and another specific value
334:51 in another cell you can add those
334:54 together using XLOOKUP which is
334:56 honestly pretty great
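The colon between the two XLOOKUP calls builds a range that runs from the first returned cell to the second, so SUM adds everything in between, inclusive. For two neighboring months that is the same as adding the two values. A rough Python sketch of that idea follows; `sum_between` is a hypothetical helper, and the January and April figures are made up (only 310 for February and 150 for March come from the walkthrough).

```python
months = ["January", "February", "March", "April"]
paper = [250, 310, 150, 275]  # January and April values are hypothetical


def sum_between(first_key, last_key, lookup_array, return_array):
    """Sum return_array from the position of first_key through last_key, inclusive."""
    start = lookup_array.index(first_key)  # where the first lookup lands
    end = lookup_array.index(last_key)     # where the second lookup lands
    low, high = sorted((start, end))       # a range works in either direction
    return sum(return_array[low:high + 1])


total = sum_between("February", "March", months, paper)
print(total)
```

Widening the range, for example from January through March, would also pull in the month in between, which is the practical difference between a range and a plain addition of two lookups.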
334:58 so let's go over to VLOOKUP I wanted to show you
335:01 this because I wanted to show you where
335:02 XLOOKUP came from and what we used to do
335:06 unless you are continuing to use
335:07 VLOOKUP and what we can do now so
335:09 XLOOKUP I just showed you kind of
335:11 everything but super quickly I'm
335:13 going to show you how VLOOKUP used to
335:15 work in a super short way so that you
335:18 can understand how it used to be used
335:20 and how XLOOKUP is
335:22 used now so let's go in here and we're
335:25 going to say equals and we're going to
335:26 do a VLOOKUP and so we have a lookup
335:29 value and so we're going to click
335:31 this we're going to hit comma just like
335:34 we did before and now we're going to do
335:35 a table array and the table array is a
335:39 little different in that you're
335:40 searching an entire area so let's do
335:46 H2 all the way through O10 so
335:52 that's what our table array
335:55 is going to be then we're going to do a
335:57 comma and now we have to do a column
335:59 index number which column number are we
336:03 going to be returning which
336:06 value are we going to be returning
336:08 from here and so we want
336:10 eight because this is 1 2 3 4 5 6 7
336:14 8 we want to return that email and
336:16 we're searching for the name right here
336:19 in this very first column so we have
336:21 that comma and we're going to do eight
336:23 and then in the range lookup you can do
336:25 TRUE which is an approximate match or
336:27 FALSE which is an exact match and we'll
336:29 do
336:30 FALSE I don't know why it's not
336:32 autocompleting it but there we go and now we
336:35 will run it and it's going to return it
336:38 just as we had it a lot of people
336:42 I guess not everybody but some people
336:45 didn't like VLOOKUP and the reason why they
336:47 created XLOOKUP is you had to do those
336:49 ranges and if you ever went in here and
336:52 let's say we added another
336:56 column which happens to data it
336:59 gives completely different
337:02 data so let's say for whatever reason we
337:04 added address so now we have these
337:07 people's address well now it's going to
337:08 give us a different value it's going
337:11 to have these end dates because if we go
337:12 in here now the eighth
337:16 is this end date and the ninth is this
337:18 email so if you have a VLOOKUP that you
337:21 use for a calculation or a
337:25 table that you've created or different
337:26 things in Excel you then have to go
337:28 through here and manually change this
337:30 and so a lot of people didn't like that
337:31 because if you needed to change data
337:33 or you needed to change something or add
337:35 an additional column you'd have to go
337:37 back and fix all of your VLOOKUPs they
337:40 wouldn't just automatically move with
337:42 it which is what happens with XLOOKUP
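The fragility of a hard-coded column index can be sketched in Python. This is an analogy rather than Excel's actual mechanics: `vlookup` counts columns by position, while `lookup_by_header` stands in for XLOOKUP following its return column wherever it moves. The column names, the single sample row, and the email address are all hypothetical.

```python
def vlookup(lookup_value, table, col_index):
    """Position-based lookup: match on the first column, return the col_index-th column."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]  # col_index is 1-based, like VLOOKUP
    return None


def lookup_by_header(lookup_value, header, table, return_header):
    """Lookup tied to a named column instead of a fixed position."""
    col = header.index(return_header)
    for row in table:
        if row[0] == lookup_value:
            return row[col]
    return None


# Hypothetical eight-column layout with email in the eighth column.
header = ["Name", "Age", "Gender", "Job Title", "Salary", "Start Date", "End Date", "Email"]
table = [["Jim Halpert", 30, "Male", "Salesman", 45000, "11/2/2001", "9/6/2015", "jim@example.com"]]

before = vlookup("Jim Halpert", table, 8)

# Insert an Address column, as in the walkthrough.
header.insert(1, "Address")
for row in table:
    row.insert(1, "Scranton, PA")

after_by_position = vlookup("Jim Halpert", table, 8)                    # now the end date
after_by_header = lookup_by_header("Jim Halpert", header, table, "Email")
print(before, "|", after_by_position, "|", after_by_header)
```

After the insert, the position-based lookup silently returns the end date, while the lookup tied to the column itself still returns the email.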
337:45 and just to prove this let's go back
337:48 to the very first one which is the
337:50 XLOOKUP and right now the email is
337:52 looking at O2 through O10 we're
337:56 just going to insert right here and that
337:58 would be our new column we'll do address
338:01 and notice that the result hasn't
338:04 changed and why is that because it automatically
338:07 changed the range for us to P2 through P10
338:10 understanding that
338:11 when something was inserted here it
338:13 wanted to stick with the original data
338:15 the original array that was selected and
338:18 so XLOOKUP does that work for you and it
338:20 makes it a little bit easier to automate
338:23 things and create these processes in
338:26 Excel without having to go fix it later
338:28 which you had to do with VLOOKUP so that
338:30 is it for today I hope that you know how
338:32 to use XLOOKUP a little bit better now
338:34 that you have watched this if you
338:35 enjoyed this video be sure to like and
338:37 subscribe below and I will see you in
338:39 the next
338:41 [Music]
338:51 video what's going on everybody welcome
338:53 back to another Excel tutorial today
338:55 we'll be looking at conditional
338:58 formatting
339:00 [Music]
339:03 now if you've never heard of conditional
339:04 formatting before that's okay I had
339:06 never heard of it before I became a data
339:08 analyst and so now that I've been using
339:10 Excel a lot of course I use it quite a
339:12 bit and so I want to show you how to use
339:14 it conditional formatting is basically
339:16 just a way to see patterns and trends
339:17 in data and that's a super simple way
339:20 of putting it but it's very easy to
339:22 use and so hopefully I can show you how
339:24 to use it really easily in a lot of
339:27 the things that I use the most and some
339:28 of the things that I use it for so that
339:30 you can also know how to use conditional
339:32 formatting now before we jump into the
339:33 tutorial I want to give a huge shout out
339:35 to the sponsor of this Excel series and
339:36 that is Udemy you guys know by now that
339:39 I absolutely love Udemy I've been using
339:41 them for years and I've taken literally
339:42 hundreds of courses on Udemy and I've
339:45 learned so much especially when I was
339:46 first starting out as a data analyst
339:49 I learned a lot through their Excel
339:50 courses on Udemy and so I have actually
339:53 put the ones that I really like and I
339:55 have taken and enjoyed and think you
339:56 would as well in the description so if
339:58 you want to take those be sure to check
339:59 those out again huge shout out to Udemy
340:02 for sponsoring the series now without
340:04 further ado let's jump onto my screen
340:05 and get started with the tutorial all
340:06 right so let's jump right into it on
340:08 this Home tab right here if we go all
340:10 the way over to the right there is
340:12 conditional formatting and the
340:13 description that it gives us is easily
340:15 spot trends and patterns in your data
340:17 using bars colors and icons to visually
340:19 highlight important values and that is
340:22 exactly how I would have defined it a
340:24 really good job Microsoft exactly how I
340:26 would have done it so what you'll see
340:27 right away is there's nothing too
340:29 complex so we have some highlight cells
340:31 rules we have some top/bottom rules
340:34 data bars color scales icon sets and
340:37 then at the bottom we can create a rule
340:39 we can clear rules and we can manage
340:41 our rules so if you create a rule then
340:43 you can manage it so we're going to
340:45 start with these icon sets and I'm going
340:47 to show you how to use those and we'll
340:49 work our way to the top and then I'll
340:50 show you how to create some rules
340:51 yourself and how that all works so let's
340:55 start off with the icon sets I'm going
340:57 to go over here to sales and for this
341:01 data we kind of have this
341:03 trend or pattern that you can kind of
341:06 see over time so over the months so
341:09 if we go right here let's use that
341:13 conditional formatting let's use those icon
341:14 sets and right here we can use these
341:17 directional ones so we have this
341:19 kind of time series each month that
341:21 shows us how much paper they're selling
341:23 and if we do this right here it's going
341:26 to show us if it's kind of average or if
341:28 it's below average or if it's above
341:31 average so at a
341:34 really quick glance you can kind of see
341:35 the pattern of this data set it's kind
341:38 of mostly yellow and red there's
341:40 only two months where it's going up
341:42 significantly now we don't have to only
341:45 do that for one row or one column you
341:48 can apply it to all of them but as you can
341:51 see all of these are red now why are
341:53 they all red it's because it's using
341:55 the numbers from every row so it's
341:57 comparing these 24s these 50s and 65s
342:00 against these 450s and 750s and so
342:03 they're all going to be red but if we do
342:06 it individually if we do it for each row if
342:08 we take it just like this and then we go
342:11 to icon sets and do it it's going to be
342:14 much more representative of the actual
342:16 printers not of all the numbers as a
342:18 whole
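Why the printer row turns all red when the rule covers the whole block can be sketched in Python. The thresholds here are an assumption: the sketch places each value on a 0 to 1 scale between the smallest and largest number in the selected range and uses 33% and 67% cut-offs for a three-icon set, which may not match the rule settings in a given workbook. The monthly figures are hypothetical apart from the rough magnitudes mentioned in the walkthrough.

```python
def icon_for(value, low, high):
    """Pick one of three icons from where value sits between low and high."""
    if high == low:
        return "yellow"  # no spread to compare against
    position = (value - low) / (high - low)
    if position >= 0.67:
        return "green"
    if position >= 0.33:
        return "yellow"
    return "red"


def icons(values, scope):
    """Rate each value against the min and max of the range the rule was applied to."""
    low, high = min(scope), max(scope)
    return [icon_for(v, low, high) for v in values]


paper = [450, 310, 150, 750]  # hypothetical monthly paper sales
printers = [24, 40, 50, 65]   # hypothetical monthly printer sales

whole_block = icons(printers, paper + printers)  # one rule across both rows
per_row = icons(printers, printers)              # a rule applied to the printer row only
print(whole_block, per_row)
```

Rated against the paper numbers, every printer figure sits near the bottom of the scale; rated against its own row, the same figures spread across all three icons.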
342:20 and you can do other things the arrows are ones that you'll probably see
342:22 the most often that's the one I've used
342:24 if I ever do use them but you can
342:28 do ones like this where they have
342:30 kind of a trend upward or a
342:32 trend downward and so there's just
342:35 several more arrows this one only gives
342:36 you three as you can see this one gives
342:38 you five and you can do
342:41 colors or shapes or different
342:43 indicators and all these different
342:45 things and honestly it's kind of
342:47 whatever you want to use whatever makes
342:49 sense for your data but I've
342:51 really only ever seen these colors
342:53 being used I've never really seen these
342:55 flags or anything like that but again it
342:57 just depends on what industry you work
342:58 in you might see that let's go
343:00 right over here to the demographics
343:03 and let's look at our color scales now
343:07 color scales along with
343:08 data bars are probably
343:10 going to be the most
343:11 obvious things in here if you go
343:14 right here and you look at this
343:16 color scale if it's high if it's among
343:19 the top ones it's green the lowest it's
343:22 red and you can change that to really
343:25 any colors you want any colors that they
343:27 offer you and it does exactly what
343:31 it says it's a color scale a gradient of
343:33 the colors from high to low or low to
343:36 high and so with any colors that you pick you'll
343:38 be able to kind of see
343:40 what's good and what's not good that
343:43 really is color scales in a nutshell
343:47 data bars are again super
343:51 straightforward it's going to be either
343:53 a gradient fill or a solid fill so let's
343:54 look at the gradient fill if we do a
343:57 blue gradient fill actually let's
343:59 get rid of our color scale let's go over here
344:02 and go to clear rules from selected
344:04 cells we haven't looked at that yet but
344:06 that's how you clear it let's go to data
344:08 bars and we'll use this blue gradient so
344:12 with this blue gradient this
344:13 one is the highest
344:16 one so it's going to be completely
344:17 filled and this one is 36,000 almost
344:20 half of this pretty close and so
344:22 it's almost half this one again
344:25 it's not used very often
344:29 you don't see these a lot to be honest
344:31 you just don't but if you do see it
344:34 that's how you use it that's how it can
344:36 be done again pretty easy
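The way a data bar fills its cell can be approximated with a short Python sketch. It assumes the bar length grows in proportion to the value, starting from zero and reaching a full cell at the largest value; the 70,000 maximum is hypothetical, since the walkthrough only says that 36,000 is about half of the highest salary.

```python
def data_bar(value, maximum, width=20):
    """Draw a text bar whose filled part is proportional to value / maximum."""
    filled = round(width * value / maximum)
    return "#" * filled + "." * (width - filled)


highest = 70000  # hypothetical top salary
print(data_bar(highest, highest))  # completely filled
print(data_bar(36000, highest))    # roughly half filled
```

The sketch only handles positive values and a positive maximum, which is enough for a salary column.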
344:40 as I just showed a second ago if you want to clear
344:41 the rules you can clear from the
344:42 selected cells that's what we're doing
344:44 so I have column G selected and I'm
344:46 going to clear that if you
344:47 want to clear the rules for the entire
344:49 sheet you can do that as well so it
344:50 would affect every single column and row
344:53 we'll just do this for now so now let's
344:56 go look at the top/bottom rules so
344:59 this is the top 10 items top 10% bottom
345:02 10 items bottom 10% above average and
345:05 below average and they're going to do
345:06 exactly what you think they are going to
345:08 do if you select above average it is
345:10 going to highlight the cells
345:13 that are above the average in column G
345:15 so let's look at the salaries that are
345:17 above average all right and so the
345:20 ones that are at the very top are
345:22 Michael Scott Toby Flenderson and
345:25 Dwight Schrute no shock there I
345:28 believe the average is somewhere around
345:30 48,500 or something
345:31 so I think this one
345:34 is just below it and so all these
345:36 other ones are below average and that's
345:38 just because Michael Scott and
345:41 Dwight Schrute and Toby are kind of
345:43 bringing up that average quite a bit so
345:45 everyone else is going to fall beneath
345:46 that so at a super quick glance you're
345:49 able to just highlight the cells and
345:52 see who is above average
345:54 and you can do this in a lot of
345:56 different ways in Excel but this is just
345:58 a really simple fast way to do that
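The above average rule boils down to comparing each cell with the mean of the selected column. A small Python sketch of that comparison follows; the salary figures are hypothetical, chosen only so that the average lands near the 48,500 mentioned and the same three people come out on top.

```python
from statistics import mean

# Hypothetical salaries for column G.
salaries = {
    "Jim Halpert": 45000,
    "Pam Beesly": 36000,
    "Dwight Schrute": 63000,
    "Angela Martin": 47000,
    "Toby Flenderson": 50000,
    "Michael Scott": 65000,
    "Meredith Palmer": 41000,
    "Stanley Hudson": 48000,
    "Kevin Malone": 42000,
}

average = mean(salaries.values())
above_average = [name for name, salary in salaries.items() if salary > average]
below_average = [name for name, salary in salaries.items() if salary < average]

print(round(average), above_average)
```

A few high salaries pull the mean up, so most of the list lands on the below average side, which is the pattern the highlighting makes visible at a glance.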
346:01 a really simple fast way to do that um let's get rid of that real quick and
346:04 let's get rid of that real quick and let's go back up here and now we can
346:06 let's go back up here and now we can oops let's go to top bottom rules and
346:08 oops let's go to top bottom rules and now we can see the below average and
346:10 now we can see the below average and it's going to highlight all the other
346:11 it's going to highlight all the other ones and so it works exactly how you
346:13 ones and so it works exactly how you think it is going to work and this is
346:15 think it is going to work and this is the default way that it highlights these
346:17 the default way that it highlights these cells so it highlights them this kind of
346:20 cells so it highlights them this kind of um seeth through red and then it
346:22 um seeth through red and then it highlights the actual text or or the um
346:25 highlights the actual text or or the um characters in there red as well now I'm
346:27 now I'm not going to go through and show you
346:29 every single one of these Top/Bottom
346:31 Rules I think they're pretty
346:32 self-explanatory I just kind of wanted
346:33 to show you what happens when you do use
346:36 one of them it's going to highlight that
346:37 cell so let's go up here to the
346:39 Highlight Cells Rules and honestly these
346:42 are the ones that I use by far the most
346:45 all these other ones combined I do
346:48 not use more than this Highlight Cells
346:50 Rules and the one in here that I use
346:52 more than any other conditional
346:53 formatting rule is this Duplicate Values
346:56 so I'll start with that really quick and
346:57 I'll kind of show you a few of these
346:59 other ones but this Duplicate Values to
347:01 me is one of the most useful ones and
347:05 so let's kind of show you how that works
347:08 if we go to the start date you can see
347:10 that we have a duplicate value right
347:12 here and if we go over here to
347:15 Conditional Formatting Highlight Cells
347:17 Rules and Duplicate Values it is going
347:19 to highlight the duplicate and
347:23 that says duplicate right here now we
347:24 can go through here and click on unique
347:27 and then it would highlight all the
347:29 ones that are not duplicates so you
347:32 can use it you know kind of in a similar
347:34 inverse way it's just different
347:37 but I use the duplicate almost
347:39 always another thing that you can do
347:42 is go over here and you can change the
347:44 color or you can even do a custom
347:47 which I never do it's not
347:51 something I spend a lot of time doing I
347:52 typically just stick with this one so
347:54 you can do that and it's going to
347:55 highlight you know something that has
347:58 a duplicate value in there
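The duplicate and unique options described above amount to counting how often each value occurs. Here is a small Python sketch of that idea; the start dates are made up for illustration, and the list positions stand in for worksheet cells.

```python
from collections import Counter

# Hypothetical start dates; one value appears twice.
start_dates = ["11/2/2001", "10/3/1999", "7/4/2000", "1/5/2000", "11/2/2001", "5/6/2001"]

counts = Counter(start_dates)

# "Duplicate": every cell whose value occurs more than once (first occurrence included).
duplicate_cells = [i for i, value in enumerate(start_dates) if counts[value] > 1]

# "Unique": the inverse, cells whose value occurs exactly once.
unique_cells = [i for i, value in enumerate(start_dates) if counts[value] == 1]

print("duplicate positions:", duplicate_cells)
print("unique positions:", unique_cells)
```

Every position falls into exactly one of the two lists, which is why the unique option behaves as the inverse of the duplicate option.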
348:01 now why do I use this so much well I work with a lot
348:04 of different types of data sets but one
348:07 thing that you'll find in almost all of
348:09 them is they have some type of ID and
348:12 they're going to have some type of
348:15 personal information whether that's a
348:17 Social Security number or an address or
348:21 you know
348:22 a cell phone number or something
348:25 like that there is going to be data that
348:28 is going to identify that person now
348:30 I work a lot with pharmaceutical data a
348:32 lot with pharmacy data as well as
348:36 healthcare data so like names Social
348:38 Security numbers addresses phone numbers
348:40 all those things all that customer or
348:42 client information and oftentimes when I
348:44 get a new data set and I have it in
348:46 Excel or I convert it to Excel I will
348:48 start using these duplicates to try to
348:50 find issues with the data and I find
348:53 them all the time either there's an
348:55 employee ID or some type of customer ID
348:57 or client ID that has a duplicate in
348:59 there that should not be in there or
349:01 there's multiple Social Security numbers
349:03 or there's an issue in some other way
349:04 and I'm able to find those things and
349:06 spot those patterns using these
349:08 duplicates and I promise you I use this
349:11 one almost every single time I open a
349:13 new data set or I work with a new
349:14 client working with their data and
349:17 so I wanted to show you this one I
349:18 wanted to really impress upon you that
349:20 this one is a really really good
349:22 one to know and learn how to use it's
349:25 not complicated it's not hard it just
349:26 shows you
349:29 if there's a duplicate value but I
349:30 wanted you to know how I use it and how
349:32 often I use it so that you can you know
349:35 pick that up and put that in your tool
349:36 kit in your back pocket so that you can
349:38 use that later on if you
349:40 have a similar need or if you're trying
349:42 to do something similar to what I was
349:43 just talking about so that is how
349:46 duplicates work again super great it's
349:49 obviously not super useful when you're
349:51 only using 10 rows but when you have
349:53 you know 50,000 100,000 and there should
349:56 be zero duplicates in there and you
349:58 highlight it and then you come right
350:00 here use the
350:03 filter and
350:05 we're going to sort by the color and it
350:08 allows you to sort by the color and if you
350:10 have duplicates in there then that's a
350:11 problem and you identified a problem
350:13 super quickly and you know some of
350:16 those things they slip by because nobody
350:18 checks it and so that's something that I
350:20 often check and if you go here and you
350:22 sort by color and there isn't an option
350:24 to do this pink red color then
350:26 that means there aren't any duplicates
350:28 and that's a really good thing most of the
350:30 time that's a really good thing so let's
350:31 go ahead and we're going to clear that
350:34 as well
350:36 as get rid of our conditional formatting
350:40 rules
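The workflow above (flag duplicate IDs, then sort by color so the flagged rows come to the top) can be approximated in Python as follows. This is a rough sketch under stated assumptions: the `employee_id` column name, the sample rows, and the helper `duplicates_first` are all hypothetical.

```python
from collections import Counter

def duplicates_first(rows, key):
    """Flag rows whose `key` value repeats, then bring flagged rows to the top."""
    counts = Counter(row[key] for row in rows)
    flagged = [dict(row, is_duplicate=counts[row[key]] > 1) for row in rows]
    # sorted() is stable, so rows keep their original order within each group.
    return sorted(flagged, key=lambda row: not row["is_duplicate"])

# Hypothetical rows; employee_id 1002 appears twice, which should not happen.
rows = [
    {"employee_id": 1001, "name": "Jim"},
    {"employee_id": 1002, "name": "Pam"},
    {"employee_id": 1003, "name": "Dwight"},
    {"employee_id": 1002, "name": "Pamela"},
]

ordered = duplicates_first(rows, "employee_id")

# The equivalent of "is there a pink/red color to sort by at all?"
has_duplicates = any(row["is_duplicate"] for row in ordered)

print([row["name"] for row in ordered])
print("duplicates found:", has_duplicates)
```

When `has_duplicates` is false, that corresponds to the case in the video where the color option is missing from the sort menu.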
350:43 now another one that I use a lot is this one right here which is the Text
350:47 that Contains honestly this one comes in
350:51 handy a lot especially when you're
350:52 looking for like a specific keyword in
350:55 my case a lot of times I was using
350:59 this when I was going through drug names
351:01 I am not a doctor I do not pretend to be
351:03 a doctor and so when I was looking for
351:05 lorazepam or something like that I
351:08 would just search for like loraz or
351:10 something not Lorax but loraz
351:12 you know I would just search for it
351:15 and then all the ones that contain that
351:17 would pop up I can bring them to the top
351:18 and I can see them and to me that's
351:21 super super useful and I would do that
351:23 all the time and so in this case we're
351:25 looking at emails and let's say we
351:27 only wanted to pull all the ones that
351:29 are Gmail and so now we can go through
351:32 and we can you know click okay and
351:33 that's going to pop up or we want all
351:35 the ones that have Dunder
351:40 Mifflin and if we click on that all the
351:42 ones that are Dunder Mifflin come up or
351:44 have Dunder Mifflin in it and again we
351:46 can sort and so we
351:50 can sort by right here and we can bring
351:52 all those to the top and so super super
351:55 useful and another use for it that
351:58 you may not think of is something like
352:00 if you know there's some incorrect
352:03 data in there this happens often with
352:05 phone numbers addresses start dates
352:09 or dates in general date formats
352:12 where you can go in here and you can say
352:15 Text that Contains and if you know you
352:17 put in a dash and it has it in
352:21 there then you know that that is
352:23 wrong
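The text-contains checks above boil down to a substring test. The Python sketch below assumes the match ignores letter case, which is how I understand Excel's rule to behave; the helper name `text_contains` and all sample values are hypothetical.

```python
def text_contains(cell, fragment):
    """Return True when `fragment` appears anywhere in `cell`, ignoring case (assumed)."""
    return fragment.lower() in str(cell).lower()

# Searching on a partial drug name when the full spelling is uncertain.
drug_names = ["Lorazepam 1mg", "Loratadine 10mg", "LORAZEPAM 2MG", "Ibuprofen 200mg"]
loraz_matches = [name for name in drug_names if text_contains(name, "loraz")]

# Pulling out one email provider.
emails = ["jim@gmail.com", "pam@dundermifflin.com", "toby@Gmail.com"]
gmail_matches = [email for email in emails if text_contains(email, "gmail")]

# Spotting dates typed with dashes when the column is expected to use slashes.
start_dates = ["11/2/2001", "10-3-1999", "7/4/2000"]
suspect_dates = [date for date in start_dates if text_contains(date, "-")]

print(loraz_matches)
print(gmail_matches)
print(suspect_dates)
```

The last check is the data-quality use mentioned in the video: searching for a character that should never appear in a correctly formatted value.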
352:25 now that is really all I wanted to show you in the Highlight Cells Rules
352:27 the Duplicate Values and the Text
352:28 that Contains are by far the ones that I use
352:30 the most all the other ones I have used
352:33 these ones not so much but in these
352:36 Highlight Cells Rules I use you know
352:38 these two all the time sometimes I
352:40 use this Between I don't really use
352:42 these other ones as much although I have
352:44 used them and so if you got nothing else
352:46 from this video I just wanted you to
352:48 know that these two are super useful and
352:51 if you haven't used them before to maybe
352:52 try them out and see if you can apply
352:53 them to your own data sets now we've
352:56 looked at all of these preset ones in
352:58 conditional formatting but you can also
353:01 do a new rule and so if we click on New
353:03 Rule right here and we go down to Use a
353:05 formula to determine which cells to
353:07 format we can add our own formula in
353:10 here that will then highlight exactly
353:13 what we want and so if there isn't a
353:15 preset rule that you like and it doesn't
353:18 have the option that you want you can do
353:20 almost any formula that you want from our
353:22 formulas video that we did a few weeks
353:24 ago and you can put it in here and then
353:26 you can format what you want the cell
353:28 to look like if it meets that criteria
353:31 so let's take this right over here
353:33 and before we start this formula I just
353:35 want you to note that you know I have
353:38 H11 highlighted that's going to come
353:39 into play in just a little bit but I
353:41 want you to be aware that H11 is the
353:43 cell that we have highlighted so what
353:45 we're going to do is we are going to
353:47 create our formula now if you've never
353:50 created a formula I highly recommend
353:52 watching my formulas tutorial because
353:54 that is going to show you how to do this
353:56 but all we're going to do is
353:58 we're going to do equals that's how you
354:00 actually create a
354:02 formula and we're going to give it this
354:04 range right here and so it's going to
354:07 take everything from G2 to G10 now these
354:10 dollar signs are super important if you
354:13 don't know how to use them or you don't
354:14 know what they do you're going to
354:16 mess up this formula a lot and so
354:19 what this dollar sign basically does is
354:21 it's basically hardcoding it in there it
354:24 is only going to look at G2 and is only
354:26 going to look at G10 or through G10
354:28 because of that colon and this can come
354:31 into play because if you have something
354:33 selected like the H11 it's going to mess
354:36 it up because now if you have H11
354:39 selected like we do you'll see this in a
354:40 second it's not going to be applied to
354:43 this and again I'll show you that in
354:46 just a minute but we don't want this
354:47 hardcoded in there okay but we do have
354:51 to select the proper range in a second
354:54 so we're going to get rid of this
354:55 we're going to get rid of the dollar
354:56 signs because we want it to be pretty fluid
354:59 and be able to be applied
355:01 basically anywhere we want let's go into
355:03 this
355:03 formula if it meets our criteria
355:07 let's give it a border
355:10 and we'll give it some
355:13 color we're going to say if this is
355:15 greater than
355:16 50,000 so let's hit okay and nothing
355:21 happened so let's go back and see why so
355:24 if we go to our Manage Rules you can see
355:26 that the G2 to G10 is greater
355:28 than 50,000 but it only is being applied
355:30 to this H11 cell which really makes no
355:34 sense so if we had wanted to get it
355:36 done the first time we needed to have
355:38 basically selected that G2 to G10 right
355:41 away but we can do that now so let's
355:43 get rid of this and we're going to say
355:47 G2 to
355:48 G10 and that is hardcoded in there
355:51 that should be fine still but let's
355:54 see what it
355:55 does and so now every single thing
355:58 is highlighted and why is that that's
356:01 because when we changed it it also
356:04 changed the format of it because we
356:06 changed the cell that we were looking at
356:09 so we need to come back here and that's
356:11 why again you want to do this the right
356:12 way the first time we're going to come
356:13 back here we're going to give it this
356:15 range and we're going to get rid of
356:17 these dollar
356:19 signs
356:21 and now we're going to hit okay
356:25 and so now it's being applied G2 to G10
356:28 and G2 to G10 and we'll keep it like
356:30 that and we'll apply it and now it works
356:33 properly so now everything that's above
356:35 50,000 is being highlighted again if
356:37 that was confusing it is confusing
356:40 it genuinely is and so if you wanted to
356:42 do this right the first time without
356:43 having to make a bunch of changes you'd
356:45 want to highlight these before you start
356:48 and then you want to go in and create
356:50 the rule we'll do this really quick just
356:52 to kind of show you what I'm talking
356:53 about we'll say equals we'll give it
356:56 this range
356:59 get rid of these real quick because
357:02 again I don't want this hardcoded in
357:05 there it will ruin our formula and then
357:07 we'll say greater than 30 and we'll
357:10 give this a nice green and so now if
357:14 they're over the age of 30 it will be
357:16 highlighted and we didn't have to go
357:17 back and change anything we didn't have
357:19 to go back and fix anything like we did
357:20 in the first one that was all for
357:23 demonstration purposes but again you
357:25 need to really be aware of that that is
357:27 something that I think almost
357:28 everybody's going to mess up at some
357:30 point if you don't already know about it
357:32 then you definitely are going to make
357:34 that mistake
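The dollar-sign behavior discussed above is easier to see in a toy model. The Python sketch below is a simplified simulation, not Excel's actual engine: it assumes a single-column rule whose formula refers to one cell (written `G2` or `$G$2`) rather than the `G2:G10` range typed in the video, and the salary values and the helper `highlighted_rows` are invented. It shows one plausible reason a hardcoded reference can light up every cell at once.

```python
def highlighted_rows(values, applies_to_rows, ref_row, absolute, threshold):
    """Simulate a 'greater than threshold' formula rule on one column.

    values: {row_number: cell_value} for the column being tested
    applies_to_rows: rows the rule is applied to; the first is the top cell
    ref_row: the row written in the formula (2 for G2 or $G$2)
    absolute: True for $G$2 (hardcoded), False for G2 (shifts with each cell)
    """
    first_row = applies_to_rows[0]
    hits = []
    for row in applies_to_rows:
        # A relative reference moves by the same offset as the cell being formatted;
        # an absolute reference always points at the same cell.
        looked_up = ref_row if absolute else ref_row + (row - first_row)
        if values.get(looked_up, 0) > threshold:
            hits.append(row)
    return hits

# Hypothetical salaries in G2:G10.
column_g = {2: 65000, 3: 36000, 4: 63000, 5: 47000, 6: 56000,
            7: 41000, 8: 48000, 9: 42000, 10: 50000}
rows = list(range(2, 11))

relative_hits = highlighted_rows(column_g, rows, 2, absolute=False, threshold=50000)
absolute_hits = highlighted_rows(column_g, rows, 2, absolute=True, threshold=50000)

print("relative (G2):", relative_hits)
print("absolute ($G$2):", absolute_hits)
```

In this model the relative version tests each row against its own value, while the hardcoded version tests every row against G2 alone, so all rows are highlighted or none are. Selecting the intended range before creating the rule, as recommended in the video, keeps the top cell of the range and the formula's reference lined up.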
357:36 now if we come over here in this area we go to our Manage Rules
357:39 and not just the current selection but
357:40 this whole worksheet then you can see
357:42 that we have these two formulas now you
357:44 can go in and edit any of these by
357:45 double clicking or clicking on it and
357:47 then hitting Edit Rule you can also
357:49 delete these rules or duplicate these
357:51 rules I just wanted to show you what
357:53 you are able to do with them but if we
357:55 go ahead and we get rid of this so
357:58 let's say we delete that rule and we hit
358:01 apply you know the rule is going to
358:02 go away I mean it's as
358:04 simple as that so that is how you can
358:07 create your own rule I want to be again
358:10 very specific in the fact that that is a
358:12 confusing piece and if you mess that up
358:15 you're going to be you know fixing a
358:17 bunch of different stuff and not
358:19 understanding why your rule is not
358:21 working properly it's just because it's
358:23 confusing those dollar signs are
358:25 really important to watch out for and
358:27 that is all there is to it with
358:28 conditional formatting again conditional
358:30 formatting is you know it's not
358:33 anything super confusing we've looked at
358:35 more complicated things but it's a
358:37 really really useful tool to use to look
358:40 at these patterns and trends super
358:41 quickly and to find these outliers or
358:44 these specific values that you're
358:45 looking for very quickly and if you're
358:47 looking at just thousands and tens of
358:50 thousands or hundreds of thousands of
358:51 rows this is one of the fastest ways to
358:54 find these things without having to kind
358:56 of wait and filter and use
358:59 these filters right here because
359:00 again this can just take forever and
359:03 so if you haven't or if you've never
359:05 worked with a ton of data and tried to
359:06 use this before it can take honestly
359:09 like 10 minutes for something simple
359:11 that you could do with conditional
359:12 formatting in like 10 seconds so
359:14 definitely something to mess with and
359:15 use when you are working with your own
359:17 data sets I hope this was helpful I
359:19 mean honestly I use this all the time so
359:21 you know I hope that somebody out there
359:23 can use this for their own work
359:26 that they're currently doing thank you
359:28 guys so much for watching I really
359:29 appreciate it again huge shout out to
359:31 Udemy for sponsoring this Excel series
359:33 if you like this video be sure to like
359:34 and subscribe below I'll see you in the
359:36 next video
359:38 [Music]
359:48 what's going on everybody welcome
359:50 back to another Excel tutorial today we
359:52 will be looking at
359:55 [Music]
359:56 charts
360:01 now if you have data in Excel and you want to visually show that with bars or
360:04 graphs or anything like that you can do
360:06 that really simply and I'm going to show
360:07 you how to do that today and a lot of
360:09 people are a little bit intimidated
360:11 because they think it's a little bit
360:12 complicated but I promise you by the end
360:14 of this video you will know how to do it
360:16 like a pro it's not that difficult it's
360:19 just you need to know where to look
360:21 where to click and how to actually
360:22 filter through things to make sure that
360:24 you're visually showing the things that
360:25 you want to show but before we actually
360:27 jump into the tutorial I want to
360:28 give a huge shout out to the sponsor of
360:29 this Excel series and that is Udemy
360:32 you may not know this but I probably get
360:33 at least 15 to 50 companies every single
360:36 month reaching out to me wanting to
360:38 sponsor the channel and promote their
360:39 product and I turn down almost every
360:41 single one because I either don't know
360:43 their product or I don't believe in
360:44 their product and so I'm not going to
360:46 you know go and promote that on my
360:47 channel but Udemy is one that I have
360:49 consistently promoted over the past year
360:51 and that's because I truly believe in
360:53 their product I've been taking courses
360:54 off their platform for years and I've
360:56 honestly learned so much and I cannot
360:58 recommend them enough so if you want to
361:00 take a full-fledged Excel course I have
361:02 my recommendations in the description if
361:04 you want to check those out thank you
361:06 again to Udemy for sponsoring this Excel
361:08 series so without further ado let's jump
361:10 onto my screen and get started with the
361:11 tutorial all right so let's jump right
361:13 into it right here we have the Dunder
361:15 Mifflin sales report and over here we
361:17 have all the products that they were
361:18 selling along with the months that they
361:20 were sold in and so in January they sold
361:23 450 reams of paper down here we have the
361:26 total items per month and so in
361:29 January they sold 898 units of
361:32 products or things that they sold at
361:34 the very end we have the year-end total
361:36 so this is the total amount of paper
361:38 that they sold throughout the year now
361:40 we're going to use this data right here
361:42 for all of our charts now you may not
361:45 have data exactly like this it can come
361:47 in lots of different flavors but you're
361:49 going to get the basic gist of how to
361:51 use charts how to edit them how to
361:54 customize them to fit what you need and
361:56 then we're going to kind of put it right
361:57 over here and kind of create its own
361:59 sheet where we can kind of visualize all
362:02 the things that we want to
362:04 show so let's jump right back over here
362:07 into sales and the first thing we need to do
362:10 is kind of highlight the data that we're
362:12 going to be working with now I'm going
362:13 to start with everything but you know
362:15 I'll show you along the way we don't
362:17 actually want everything but we can
362:19 filter that stuff out as we go so let's
362:22 go right here and we're going to Insert
362:24 and we're going to go over to Charts now
362:27 this is the chart section there's lots
362:28 of different types of charts but the
362:31 first thing that we're going to be
362:32 looking at is right here this is a 2D
362:35 column or kind of like a bar chart and
362:37 we're just going to click right here and
362:39 we're going to pull this
362:40 down so now that we have this down here
362:44 there are a few things that I want to
362:45 show you before we actually really get
362:47 into it I kind of want to show you the
362:49 options that you have so if you go up
362:51 here we have different chart styles
362:55 and so if I hover over them you can see
362:57 that each one kind of looks a little bit
363:00 different and it really doesn't matter
363:03 it doesn't really change the data in
363:05 any way just how you visualize it and so
363:08 if that is important if that is
363:10 something where you want to stick
363:12 with a certain theme or a certain look
363:14 then go for that the other thing
363:16 that's really nice to have over here is
363:18 this Switch Row/Column button so right down
363:21 here you can see this purple and you can
363:23 see this red those are our rows and
363:25 columns and we can switch them right
363:27 here so if we go like this now instead
363:30 of the months being right here the
363:32 months are the colors and the actual
363:34 product is right here let's click it
363:36 again and it'll go back and so now we
363:38 have this kind of time series now we
363:40 have January through the year-end
363:42 total now this one is one that I think
363:45 is super helpful you know you can do
363:48 it down here as well if you go to this
363:49 filter but both of these are super
363:53 helpful because you sometimes just want
363:55 to select all the data and then kind of
363:56 get in there and mess with it
363:58 something that we want to get rid of is
363:59 this total items per month so we want to
364:02 remove that and then we also want to
364:04 remove this year-end total because both
364:06 of those are kind of the end result
364:10 they're not the actual data per month or
364:13 per product so we're going to get rid
364:14 of those and we're going to apply that
364:16 and as you can see just right off the
364:18 bat our chart has changed dramatically
364:20 and that's because we aren't including
364:22 these large numbers that
364:24 were kind of throwing off the
364:26 visualization for us so this one right
364:29 here as is is already pretty good
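A minimal Python sketch of the two steps just shown: unticking the total row and total column in the chart filter, then pressing Switch Row/Column. It does not describe how Excel works internally; it only mirrors the effect on a small made-up table. The dictionary layout, the helper names drop_totals and switch_row_column, and every number except the 450 reams of paper and the 898 January total are assumptions for illustration.

```python
# Hypothetical table shaped like the sales sheet in the video.
# Only Paper/January = 450 and the January total of 898 come from the
# tutorial; every other number is invented for illustration.
TOTAL_ROW = "Total Items Per Month"
TOTAL_COLUMN = "Year End Total"

headers = ["Jan", "Feb", "Mar", TOTAL_COLUMN]
sheet = {
    "Paper": [450, 430, 470, 1350],
    "Manila Folders": [250, 240, 260, 750],
    "Pens": [198, 190, 205, 593],
    TOTAL_ROW: [898, 860, 935, 2693],
}


def drop_totals(sheet, headers):
    """Mimic unticking the total row and total column in the chart filter."""
    keep = [i for i, name in enumerate(headers) if name != TOTAL_COLUMN]
    series = {
        product: [values[i] for i in keep]
        for product, values in sheet.items()
        if product != TOTAL_ROW
    }
    return series, [headers[i] for i in keep]


def switch_row_column(series, categories):
    """Mimic Switch Row/Column: each category becomes a series and vice versa."""
    names = list(series)
    flipped = {
        category: [series[name][i] for name in names]
        for i, category in enumerate(categories)
    }
    return flipped, names


series, months = drop_totals(sheet, headers)
by_month, products = switch_row_column(series, months)

print(max(max(v) for v in sheet.values()))   # 2693: the totals dominate the axis
print(max(max(v) for v in series.values()))  # 470: the axis now fits monthly data
print(by_month["Jan"])                       # [450, 250, 198]
```

With the totals left in, the largest value on the axis is the grand total, so every monthly bar is squashed; with them removed, the axis only has to reach the largest monthly figure. Switching rows and columns twice gives back the original series, which matches clicking the button a second time in the video.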
364:34 what we can do right here is we can
364:35 change this title and we're just going to
364:37 say products
364:40 sold per
364:43 month now what we can do if we want to
364:45 move it to another sheet
364:48 is we can actually move the chart and we
364:50 can select where we want to move it we
364:53 can move it to the chart sheet and we can do
364:54 that or something that I do almost
364:57 99% of the time I just copy and I come
365:00 over here and I'm going to paste it and
365:03 so now we have this chart right
365:06 over here as well as back here and so I
365:11 typically tend to do that because now we
365:13 can still go over here and change this
365:15 one as much as we want so if we want to
365:17 go in here we can alter this one and it
365:18 won't affect the other one so we just
365:20 have basically two copies so we're going
365:22 to keep this one right here this is
365:23 going to be our first
365:25 visualization and as I said it's
365:27 fairly straightforward if you've
365:29 ever done any types of charts or graphs
365:31 before right here it's January
365:33 February March April May and if you
365:35 hover over these you can see that that's
365:38 the paper and if we just glance you
365:40 know the paper is their biggest product
365:42 by far and so that blue which is
365:45 their paper is going to be the biggest
365:47 every single month so that makes perfect
365:49 sense now what if we want to change up
365:52 the kind so what if we want to
365:55 change up the kind of visualization that
365:57 it offers us well we have a lot of
366:00 different options let's go right over
366:02 here to Change Chart Type now this is
366:05 going to offer you just about everything
366:08 you could possibly imagine or want and
366:11 even things that you absolutely would
366:12 never ever want and so I'm going
366:15 to show you some of the good ones and
366:17 I'm going to show you some just
366:18 absolutely insane ones that Excel
366:21 came up with I could not
366:23 imagine a scenario where these are ever
366:25 used but within these columns you can
366:28 do what are called clustered columns
366:31 these stacked columns so they would look just
366:33 like this those are often used as
366:36 well and then we have ones that
366:39 are just not used often
366:41 let's take a look at this one right
366:43 here I mean it's tough it's tough to
366:46 look at but let's put it right
366:48 here this is basically the same thing
366:51 that we just had except visualized in a
366:54 different we'll call it more unique
366:56 way
366:57 and for the sake of it let's
366:59 put it over here these two things
367:02 show the same information they show the
367:04 same data just one is shown well and one
367:08 is not shown well I'm not a fan of
367:11 these 3D types of
367:13 visualizations I just don't like them
367:16 but maybe you do and you want to use
367:19 that that's fantastic let's go back
367:22 something else that you'll probably use
367:24 a lot are things like these
367:27 line graphs okay so these are line
367:29 graphs and there are different types so
367:31 there are these stacked 100% stacked
367:35 line lines with markers different
367:37 flavors for this type of line graph
367:40 and so you can go in here and take a
367:42 look again not my favorite but they
367:47 have it as an option if you so choose
367:49 to do this but I'm kind of
367:52 a simple guy but I'm going to go in
367:54 here and it's pretty cluttered
367:57 I want to kind of take the ones that
367:59 have the highest
368:01 sales or the highest total amount sold
368:04 so that would be paper manila folders
368:07 and three-ring binders so let's go in
368:10 here we want to keep paper we want to
368:14 keep manila
368:16 folders and we want to keep three-ring
368:19 binders and let's apply that and so now
368:22 it's a lot cleaner and we're just going
368:24 to copy this and we're going to put it
368:27 over here and I'm just putting these all
368:29 over here for you because we'll look
368:30 at this at the end and just kind of see
368:32 different options and ways to do
368:34 things as we have gone through this
368:36 tutorial so let's go back here now
368:39 something else that we haven't looked at
368:41 is the actual colors and color schemes
368:44 that you can use so let's go right here
368:46 to these chart styles and we can go to
368:48 color now color is something that
368:52 probably is quite overlooked in
368:55 actual charts and graphs there are some terrible
368:57 colors like this or this where
369:00 they're really close together especially
369:02 when you have a lot of them for
369:04 example let's just pretend we put all of
369:07 them back really quickly it is near
369:11 impossible to distinguish these colors
369:14 we
369:15 wouldn't want that let's go
369:17 back to this color you know when you
369:19 have it in some of these colors
369:21 it at least distinguishes them
369:24 so you can kind of see what you're
369:26 working with but when you have
369:28 it in these monochromatic options
369:31 sometimes they're just impossible to
369:32 distinguish so be sure to choose the
369:34 right colors so that
369:37 if somebody who's never seen this data
369:38 before looks at it they can easily
369:41 distinguish the product and the month
369:44 that they are looking at but let's go
369:46 just back up here we'll choose this
369:48 default option well let's choose this
369:50 one right here this one's nice although
369:52 there's lots of yellows and oranges
369:54 let's see this one this one's not bad
369:56 greens blues and like yellows so
370:00 that's nice other things that we want
370:03 to look at are these chart
370:05 elements right here other things that we
370:07 can add are things like data labels
370:11 and right here it's super messy but
370:14 if we went back and we got rid of some
370:16 of these things like the printer staples
370:20 highlighters pens and total and we apply
370:23 that it's a little bit easier to
370:25 distinguish
370:27 and that's you know something that you
370:29 may be interested in doing you can also
370:31 add this data table at the bottom which
370:33 is the actual columns and rows that you
370:36 have for this visualization right here
370:38 now let's expand this quite a bit I'm
370:40 going to make this extremely large if
370:42 you have something like this it actually
370:44 can be pretty nice you know maybe we
370:47 get rid of these data labels but it can
370:50 be easy because you're putting it all in
370:51 one place you can also make this two
370:53 separate visualizations so you can have
370:55 one visualization just like this and
370:56 right underneath it you can have the
370:58 actual rows and columns but this option
371:01 allows you to put it all in one so let's
371:03 put this back down because that is way
371:06 too big and wait let's expand it a
371:09 little bit now if you notice right here
371:11 we have our legend up top it is
371:14 possible to actually change that you can
371:16 go right here and you can move this
371:19 kind of wherever you want but it's
371:22 not exactly easy to place based on how we
371:25 have it right here if we go into this
371:28 chart elements we go down to Legend and
371:30 we hit this little arrow right here we
371:32 can place it on the right the top the
371:35 left or the bottom or we can just go to
371:38 more options which allows us to push
371:40 it anywhere but let's say I want to
371:42 do it just like this I'm going to put it on
371:44 the right and I actually want to bring
371:46 it down right here and you know that's
371:50 just an option if you want to kind of
371:52 customize it a little further makes it a
371:53 little cleaner you can do that with
371:55 almost any of these things so if you
371:56 click on this oops if you click on this
371:59 you can move this anywhere as well so if
372:01 you want to move this over here on top
372:03 of it you can and make it look terrible
372:04 or you can move it right back over
372:06 here you know this is something that you
372:08 can move around you just kind of want
372:10 to make sure you're doing it the right
372:12 way so let's get this back where it was
372:14 there we go now before we go any further
372:16 let's copy that and put it right over
372:19 here with our other charts and graphs
372:23 and if you see over here on this side we
372:26 have this Format Chart Area notice
372:27 I haven't shown you this at all yet
372:30 that is because I genuinely just don't
372:32 use this almost at all there is some
372:35 good stuff in here and I'm sure that
372:37 you know if you are someone who really
372:39 wants to go in there and super customize
372:41 it you can do that but honestly I
372:44 just never get in here and I never you
372:46 know change the glow or the shadows
372:50 just not something I use and some of
372:52 these are only for this 3D
372:53 formatting which I never use and so I'm
372:56 not going to show you and walk through
372:57 these things again I really don't use
373:00 it and so if you want to go in there and
373:02 mess with it you know by all means go
373:04 for it it's just not something that I
373:06 want to take the time to show you and
373:08 with that being said let's go back over
373:10 to this chart sheet that we have and it
373:12 was super super easy to get these
373:16 charts and graphs and whatnot
373:18 there are lots of different options
373:20 again if we go back here and we go up
373:23 here to Chart Design and go to the
373:25 Change Chart Type and again there are a
373:28 ton of different options like a pie
373:30 chart like this it's you know
373:34 you can try to figure this out and use
373:36 these but you know I wanted to show
373:39 you the ones that you'll probably use
373:40 the most which are these column and
373:42 line charts and they all kind of are
373:45 similar in their own way this bar chart
373:47 is basically you know this column chart
373:50 just on its side and so they all have
373:52 their different flavor they all have
373:54 their different way of visualizing the
373:55 data but in essence they're using
373:57 the data in a similar way to
373:59 visualize it and represent the data
374:00 itself especially things like these box
374:02 and whisker plots or these waterfall
374:04 charts you know these are things that
374:07 usually require specific data to kind of
374:10 use and so I'm just using data
374:13 that you'll probably see the most of
374:15 like this sales data so I hope that
374:18 this has given you a pretty good you know
374:20 quick understanding of how to use these
374:22 how to customize them how to copy and
374:25 paste them over to a different sheet
374:27 to create some type of little chart
374:30 and visualization sheet that you can use
374:32 to show your employers and visualize
374:34 the data that you are working with thank
374:36 you guys so much for watching I really
374:38 appreciate it again huge shout out to
374:40 Udemy for sponsoring this Excel
374:41 series if you like this video be sure to
374:44 like and subscribe below and I'll see
374:45 you in the next
374:47 [Music]
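In the charts video above, the line chart was cleaned up by keeping only the three products with the highest total sold (paper, manila folders and three-ring binders). The selection can be sketched in Python as a rough illustration of that filtering step. The helper name top_n_series and all of the monthly figures are hypothetical; the video does not give the real numbers for these products.

```python
# Hypothetical monthly unit sales. Every number here is invented for
# illustration; only the product names come from the video.
sales = {
    "Paper": [450, 430, 470],
    "Printer": [20, 25, 15],
    "Manila Folders": [250, 240, 260],
    "Pens": [60, 70, 65],
    "Three-Ring Binders": [120, 110, 130],
}


def top_n_series(series, n):
    """Return the n series with the largest totals, largest first."""
    ranked = sorted(series, key=lambda name: sum(series[name]), reverse=True)
    return {name: series[name] for name in ranked[:n]}


top3 = top_n_series(sales, 3)
print(list(top3))  # ['Paper', 'Manila Folders', 'Three-Ring Binders']
```

Ranking by the row total is the same idea as reading the year-end total column on the sheet and ticking only the biggest products in the chart filter.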
375:00 what's going on everybody welcome back to the Excel tutorial series today we'll
375:02 be looking at how to clean data in
375:09 [Music] Excel now knowing how to clean data in
375:11 Excel is actually extremely useful and
375:13 there are a ton of techniques to do this
375:15 I'm going to be showing you the ones
375:16 that I probably use the most and that I feel
375:18 are the most helpful to kind of do the
375:21 bulk or the majority of the data
375:23 cleaning that you're going to do in
375:24 Excel like I said there are so many
375:26 different ways and very specific things
375:28 that you can do but I'm going to
375:29 highlight some of the bigger ones that I
375:31 find the most useful and some of you may
375:33 be thinking well I'll just do my data
375:34 cleaning in SQL or Python or when I get
375:36 it ready to put it in Tableau but
375:39 honestly a lot of the data cleaning at
375:41 least a lot of the big stuff I tend to
375:42 do in Excel if the data set is small
375:45 enough to fit in Excel and so I think
375:46 it's actually really really useful to
375:48 know how to do this because you'll most
375:50 likely be doing it more than you think
375:52 now before we jump into the tutorial I
375:54 now before we jump into the tutorial I want to give a shout out to the sponsor
375:55 want to give a shout out to the sponsor of this video and is brand new sponsor
375:57 of this video and is brand new sponsor it is unlocked by Z by HP unlocked is a
376:00 it is unlocked by Z by HP unlocked is a movie that's actually broken up into
376:01 movie that's actually broken up into four parts and each of them have a
376:03 four parts and each of them have a unique data science challenge associated
376:05 unique data science challenge associated with it now I'm going to read this next
376:07 with it now I'm going to read this next part because it's extremely interesting
376:09 part because it's extremely interesting each challenge represents a different
376:10 each challenge represents a different topic so there's data visualization text
376:12 topic so there's data visualization text analysis audio signal processing and
376:14 analysis audio signal processing and computer vision and you can submit your
376:16 computer vision and you can submit your answers in your work on their website
376:18 answers in your work on their website for a chance to win one of 10 zbook
376:20 for a chance to win one of 10 zbook Studio laptops or a free trip to the
376:22 Studio laptops or a free trip to the kaggle World Championships so I'll leave
376:24 kaggle World Championships so I'll leave a link in the description where you can
376:25 a link in the description where you can go watch the movie and then do the
376:27 go watch the movie and then do the challenges and then submit your answers
376:28 challenges and then submit your answers for a chance to win you should also go
376:30 for a chance to win you should also go check out their hackathon where you can
376:31 check out their hackathon where you can do these projects with other people just
376:33 do these projects with other people just like you who are trying to figure out
376:34 like you who are trying to figure out these answers and submit them to win as
376:36 these answers and submit them to win as well so go check that out thank you
376:38 well so go check that out thank you again to the sponsor of this video
376:40 again to the sponsor of this video unlocked by Z by HP now without further
376:43 unlocked by Z by HP now without further Ado let's jump onto my screen and get
376:44 Ado let's jump onto my screen and get started with the tutorial all right so
376:46 started with the tutorial all right so
376:47 let's jump right into it I have this US
376:50 president data set I got the base data
376:52 set from Kaggle but I added some of
376:54 my own data and then I messed some stuff
376:56 up as well just to
376:57 demonstrate some of these things that
376:59 we're going to be looking at today this
377:02 is not a full project so you know we're
377:03 not actually going to be using this to
377:04 create any visualizations or anything
377:06 like that so all this is just
377:09 for demonstration purposes but we will
377:12 be doing a full project in about two or
377:15 three videos in this Excel series
377:16 where we're going to be going from start
377:18 to finish with a real data set so you
377:20 know if that's something that you're
377:22 wanting then we will absolutely be doing
377:23 that now something that you may be
377:25 wondering is how do you actually
377:26 identify what you need to clean in
377:29 the data how do you know what to look for
377:30 well some of the obvious things are
377:32 things like formatting and
377:34 standardization so things like you know
377:36 this James Monroe is in all caps that
377:39 happens all the time within real data
377:40 and so you know you want to
377:42 standardize that or this all lowercase
377:43 you want to standardize that you want
377:46 that all to be the same there's also
377:49 things like right here where we have
377:51 this Whig and this Whig with a bunch of
377:54 random stuff after it this happens all
377:56 the time where it's not completely
377:59 standardized and you may even notice
378:01 you know there are some spelling
378:02 errors in here and we'll kind of
378:04 look through that in a little bit and
378:07 then you know there are things like
378:09 additional spaces where there shouldn't
378:10 be spaces there are things like
378:12 currencies that you need to be aware of
378:14 if you were importing this into or going
378:16 to be importing this into a SQL database
378:20 things like currencies can be just a
378:24 problem or be really unnecessary it
378:25 may actually cause more issues in the
378:27 long run so you may just want to you
378:30 know take that to the base value and
378:32 then dates are always an issue always
378:34 always always so always look at your
378:36 dates make sure they're
378:37 formatted correctly make sure they're
378:39 all the same these are the types of
378:41 things that right when I glance at this
378:43 data set these are things that I'm
378:46 looking for one other thing that is
378:47 actually the first thing that we're
378:48 going to start out with is you want to
378:51 make sure that your data is not
378:53 duplicated because if your data has
378:56 duplicate data in it and you don't want
378:57 that it's not supposed to be there there
379:00 are some specific use cases where
379:03 duplicated data is okay you know you
379:05 want to get rid of that and it's very
379:07 easy to do in Excel the first thing
379:09 we're going to do we're going to go
379:11 to this Data tab we're going to go right
379:12 over here and we're going to see if
379:14 there are any duplicates in our data so
379:16 we're just going to go up to Remove
379:18 Duplicates it's going to automatically
379:20 choose all of your columns to check
379:23 against so it's going to go from A all
379:25 the way through I it's going to see is
379:27 the exact same data in all these rows
379:29 and if it is it's going to get rid of it
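
For readers who end up doing this step outside Excel, here is a minimal Python sketch of the same idea: a row counts as a duplicate only when every column matches an earlier row, and the first copy is kept. The function name and the sample rows are hypothetical, not the video's file, and this is an analogue of Remove Duplicates rather than a description of how Excel implements it.

```python
def remove_duplicate_rows(rows):
    """Return rows with exact duplicates dropped, keeping the first occurrence."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row)  # every column takes part in the comparison
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample rows (president, party); not the video's actual file.
rows = [
    ("Barack Obama", "Democratic"),
    ("Abraham Lincoln", "Republican"),
    ("Barack Obama", "Democratic"),
]
deduped = remove_duplicate_rows(rows)
removed = len(rows) - len(deduped)
print(f"{removed} duplicate row(s) removed, {len(deduped)} remain")
```

Because the whole row is the comparison key, two rows that differ in even one column are both kept, which matches checking against all columns at once.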
379:32 and so we're going to click OK and
379:33 it did find one duplicate and I'll show
379:36 you that one real quick because you
379:39 know it was right here so Barack Obama
379:40 was here twice and then I'm going to hit
379:43 Ctrl+Z to go back and I'm
379:46 going to hit Ctrl+Y to go forward and it
379:50 removed that row completely now
379:51 in this example you may be able to spot
379:53 that with your eye but in a real data
379:56 set where you have 10,000 or 100,000 rows
379:57 there's absolutely no way you're going
380:00 to see that or it's very unlikely that
380:01 you are going to see that there's
380:03 duplicated data in there so just running
380:06 a quick removal of
380:09 duplicates is really important to
380:11 make sure that you have gotten rid of
380:13 those things so that's one of the first
380:15 things that I do we're going to go
380:18 into a lot of these different columns
380:19 and I'm going to kind of show you
380:21 different techniques or things that I do
380:23 when I look at actual data so I'm going
380:26 to come right over here I'm going to
380:28 insert and this is what I actually do
380:29 I usually create a separate column
380:30 especially when I'm working with this
380:33 because I don't want to change this one
380:35 I don't want to go in here and you
380:39 know say =UPPER or =PROPER
380:41 etc there's a lot of different ways that
380:44 you can change names or not a lot but
380:45 the main ones that you can change names
380:48 and all of them are completely okay so
380:51 for example I'm going to hit =UPPER
380:53 and I'm going to go like this
380:55 and close my parentheses so I selected
380:58 this cell I closed my parentheses hit enter
381:00 there it is and I'm going to in the
381:01 bottom right I'm going to double click
381:03 this and it's going to apply to all of
381:05 them it is completely okay to have your
381:07 data like this if you want it to be like
381:09 that if you want it to be all lower
381:11 you can do that if you want it to be in
381:14 proper case you can do that there are
381:18 different uses for all
381:20 of them and honestly as long as it's all
381:23 the same typically it's okay but if
381:24 you know for example if you're selling
381:26 this to like a third party company or
381:28 something like that they may have
381:30 what they want for their ingestion
381:33 process when they take your file in if
381:34 you send you know a weekly file or a
381:37 monthly file they may want it exactly
381:39 how they want it and you can change that
381:42 to what they want but as long as
381:43 it's standardized for you it's all the
381:46 same for you that is a good thing so now
381:49 we have all of these in the proper
381:52 case that's typically what I do or I
381:55 use upper those are the ones I use the
381:57 most I don't usually use lower and if
382:00 you go in here and you type in
382:02 lower you know it changes it to all
382:04 lower I don't typically do that and
382:07 I'm going to say
382:12 President-Fixed and so now all of
382:15 these names all of these different
382:16 uppercase and lowercase these are all
382:19 fixed and it just makes it so much
382:21 easier to read and you don't have
382:23 different uppercase and lowercase
382:25 issues it's all the same so I'm going to
382:26 keep that right there
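
As a rough illustration of the same standardization outside Excel, the sketch below uses Python's built-in string methods, which behave much like UPPER, LOWER and PROPER for plain names. The sample names are hypothetical, and `str.title()` is only an approximation of PROPER, so edge cases may differ. The fixed values go into new lists, mirroring the separate "fixed" column so the original stays untouched.

```python
# Hypothetical sample names with inconsistent casing.
names = ["JAMES MONROE", "john adams", "Andrew Jackson"]

upper_names = [name.upper() for name in names]   # like =UPPER(cell)
lower_names = [name.lower() for name in names]   # like =LOWER(cell)
proper_names = [name.title() for name in names]  # roughly like =PROPER(cell)

for original, fixed in zip(names, proper_names):
    print(f"{original!r} -> {fixed!r}")
```

Whichever casing is chosen, the point made above still holds: what matters most is that every value in the column follows the same convention.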
382:30 if we move a little bit to the
382:34 right if you look at this prior now this
382:38 prior is a mess it has stuff all over
382:40 and to be honest this is not really
382:43 something that I would probably be using
382:45 like in a real data set I would look
382:46 at this column and I would say this is
382:50 pretty useless if I had a very
382:52 specific use case for the data in
382:54 this column I might try to you know
382:56 parse it out and do something but I
382:58 don't this is a completely
382:59 useless column to me so I'm actually going
383:01 to skip this one I'm going to go to this
383:04 party one and this party one to me
383:05 looks pretty important because this is
383:07 something that I know I can group by
383:10 and I can create visualizations with
383:13 and kind of break that out and if you
383:15 look right here we're going to
383:17 add a filter so now let's
383:21 open up party and take a look so if we
383:23 look right here we have Democratic
383:24 Democratic-Republican Federalist
383:27 Nonpartisan Republican Republicans
383:29 Whig and Whig with a date and some
383:31 information in the back of it and then
383:34 some blanks and it's really important
383:36 when we're looking at these
383:38 ones that we think we might group by
383:42 that we have these properly grouped
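
The filter dropdown is effectively a list of the distinct values in the column. A hedged Python sketch of the same check is below: count each distinct label so near-duplicates stand out, then apply an explicit correction. The sample values and the `corrections` mapping are assumptions for illustration; which label is the typo depends on knowing the data.

```python
from collections import Counter

# Hypothetical party column; the blank string stands in for an empty cell.
parties = ["Republican", "Democratic", "Republicans", "Whig", "Republican", ""]

# Count each distinct label, similar to scanning the filter dropdown.
label_counts = Counter(parties)
for label, count in sorted(label_counts.items()):
    print(f"{label or '(blank)'}: {count}")

# Assumed correction: treat "Republicans" as a typo for "Republican".
corrections = {"Republicans": "Republican"}
cleaned = [corrections.get(label, label) for label in parties]
print(Counter(cleaned)["Republican"], "rows now grouped under Republican")
```

A label that appears only once next to a very similar label that appears many times is usually worth a second look before grouping.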
383:44 so Republican and Republicans to me
383:45 right off the bat looks like a spelling
383:48 error and so I'm just going to deselect
383:51 all I'm going to go to Republican and
383:54 Republicans and it's literally
383:55 Republican all the way down except
383:58 for this last one and to me that's just
383:59 something that I would update so I would
384:02 just go right here and do that if I didn't
384:04 do that and then I tried to create let's
384:06 say a pivot table on here I'd have a
384:08 separate group of Republicans and it wouldn't
384:10 be added to Republican and maybe that's
384:13 on purpose but let's just presume that
384:15 we know this data extremely well and
384:16 that's not supposed to be like that
384:18 right again that just comes back to
384:20 knowing your data really well
384:22 understanding what
384:24 it should look like and we know that it
384:25 should not be like that so we're going
384:27 to fix that the next thing that we're
384:29 going to fix and as you can see it
384:31 got rid of it next thing we're going to
384:33 fix is this Whig
384:36 that's just like an error that's
384:41 some issue on the data side
384:43 and we're just going to fix that by
384:46 updating it and that's it I would always
384:49 be keeping a copy of this with the
384:52 raw data somewhere else because this
384:54 is presumably like a working document
384:55 this is not
385:00 you know you aren't saving over
385:01 your original file let's just say that
385:02 and then let's take a look at these
385:07 blanks real quick okay so there are
385:09 these rows right here that have nothing
385:12 I think we're okay but if we see
385:15 anything different 47 48 okay so yeah
385:16 it's just these ones right here that
385:18 have no data in them anyway it's just
385:21 seeing it in the filter so not an issue
385:24 at all so okay we're looking good we've
385:26 gone all the way over we fixed this
385:29 President we skipped this one we
385:31 cleaned up this party and I kept this
385:33 one in here because I'm not exactly sure
385:35 if that's Democratic or Republican so
385:37 I'm going to keep it as its own thing
385:40 I'm not a huge history buff in that
385:45 aspect the next one right here
385:47 is really easy
385:49 this is something that happens all the
385:52 time actually most often
385:56 it happens on numerical data so like
385:59 you know there'll be a number of 1,1
386:00 and then there'll be a space after it
386:02 for absolutely no reason and it
386:05 happens all the time it does happen like
386:07 this as well where you'll see this
386:10 and all you've got to do is type =TRIM and
386:12 select the cell we're going to close
386:14 that parenthesis and we're going to
386:16 apply that all the way down what is so
386:18 fantastic about TRIM is that it's
386:21 really intuitive and it knows basically
386:24 everything it needs to do for example
386:28 it gets rid of the spaces before
386:30 it gets rid of extra spaces in the
386:33 middle and it'll get rid of extra
386:36 spaces at the end which you wouldn't
386:37 be able to see but they are there and
386:40 they absolutely can cause issues if
386:41 you have spaces at the end that you
386:44 cannot see let's take this one for
386:46 example if I had spaces at the end
386:49 that can cause issues when you insert or
386:51 put that into a database that
386:53 happens a lot with numbers you know
386:55 when you're putting that into SQL
386:57 that can cause issues and so
387:00 it is important to actually do that trim
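
To make the three behaviours concrete (leading spaces, repeated spaces in the middle, trailing spaces), here is a small Python sketch that imitates them. The helper name `excel_like_trim` and the sample cell are hypothetical; it handles the ordinary space character only and is an approximation, not Excel's implementation.

```python
def excel_like_trim(text):
    """Strip leading/trailing spaces and collapse runs of spaces to one."""
    # Splitting on a single space yields empty strings for every extra
    # space; dropping those and re-joining leaves single separators only.
    return " ".join(part for part in text.split(" ") if part)


raw_value = "  John   Adams "  # hypothetical cell with stray spaces
trimmed = excel_like_trim(raw_value)
print(repr(raw_value), "->", repr(trimmed))

# An invisible trailing space is enough to break an exact match.
print("matches without trimming:", "Adams " == "Adams")
print("matches after trimming:", excel_like_trim("Adams ") == "Adams")
```

The last two lines show why trailing spaces matter downstream: the values look identical on screen but do not compare as equal until they are trimmed.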
387:02 and you can do that on all of your
387:03 columns or just ones that you know
387:05 you're having issues with but once you
387:07 import that data into SQL you will know
387:08 if there's an issue or not when you
387:11 actually try to start using it so we're
387:14 going to say Vice and we're going to say
387:19 Fixed there we go this next one
387:21 is one that you'll run into a lot when
387:24 you're working with numerical data you
387:27 will encounter so many different issues
387:30 one that I run into a lot is I've
387:32 worked with a lot of cost data or
387:35 pricing data and when it's in Excel
387:38 it sometimes comes in with these
387:40 currency symbols like a dollar sign a pound
387:43 sign things like that and when you put
387:44 that into
387:47 SQL it just is a nuisance right you're
387:51 not going to be able to run um it's
387:52 going to go in as text or it's going
387:55 to be like a string right because it has
387:56 that special character and you don't
387:58 want that you don't want to have to then
388:00 go in and then change things around you
388:02 just want to be able to start you
388:05 know doing calculations on those numbers
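
The problem described here, a currency symbol turning a number into text, can be sketched in Python as follows. The function name, the symbol list and the sample amounts are all assumptions, and the sketch assumes a comma thousands separator with a period as the decimal point; other locales would need different handling.

```python
from decimal import Decimal


def currency_to_number(text, symbols="$£€"):
    """Strip currency symbols and thousands separators, return a Decimal."""
    cleaned = text.strip()
    for char in symbols + ",":
        cleaned = cleaned.replace(char, "")
    return Decimal(cleaned)


# Hypothetical amounts as they might arrive from a spreadsheet export.
amounts = ["$25,000.00", "£1,250.50", "400000"]
numbers = [currency_to_number(amount) for amount in amounts]
total = sum(numbers)
print("total:", total)
```

Once the values are real numbers, sums and other calculations work directly, which is the reason for taking the column back to its base value.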
388:07 so what you can do is sometimes it'll
388:09 come in as text sometimes it'll come
388:12 in as currency which I think this
388:14 one's a currency we are just going to
388:17 change that to be a number and then
388:19 we're going to get rid of these
388:23 and get rid of those it
388:25 doesn't look as pretty but that is much
388:28 more useful than actually having the
388:31 currency on there with the decimals this
388:33 actually is so much easier when
388:34 you want to use it for almost anything
388:38 because you're able to add and do
388:39 things properly in other systems in
388:42 Excel I think it does understand it
388:45 but you know that can cause issues so
388:47 that is how you do that the next thing
388:48 that we're going to look at is these
388:51 dates and just notoriously whenever I
388:53 see a date field I know there's going to
388:55 be an issue with it it's very rare
388:59 that I get a date field that is perfect
389:02 it genuinely is a
389:05 novelty when that happens and most of
389:08 the time it has to do with let's say
389:10 a date comes into Excel and it's in a
389:12 text format or dates come into Excel and
389:14 they're not the same in this example
389:17 they are not the same and we just
389:19 want them to all be similar they say
389:21 date if you look right here it says
389:24 date it says date it looks like it
389:28 should be the same but if we go
389:31 like this it all looks the same right
389:34 there's no issues at all if we were to
389:38 try to use that it may or may not be
389:39 an issue but we don't want to leave that
389:41 to chance later on if you're using this
389:43 with Python or something like that it
389:45 can cause issues maybe not in SQL
389:48 because it may see the underlying
389:50 what's in the underlying cell not just
389:53 what we see but some systems won't and
389:54 so you want to make sure that they're
389:55 all the same
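
Since the transcript mentions that inconsistent dates can cause trouble later in Python, here is a minimal sketch of forcing mixed date strings into one format. The list of accepted formats and the sample values are assumptions; anything that matches none of them comes back as `None` so it can be reviewed by hand instead of being guessed at.

```python
from datetime import datetime

# Assumed input formats; extend this tuple to match the real data.
KNOWN_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%B %d, %Y")


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


# Hypothetical cells: three spellings of one date plus a stray bare number.
raw_dates = ["1/20/2009", "2009-01-20", "January 20, 2009", "39833"]
for raw in raw_dates:
    print(f"{raw!r} -> {normalize_date(raw)!r}")
```

Returning `None` for the bare number is a deliberate choice in this sketch: a value like that may be a date stored as a serial number or may be plain bad data, and the code has no way to tell which.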
389:57 and so you know what we were doing back here with um oops with the party and we
390:00 here with um oops with the party and we were looking at this uh this filter and
390:02 were looking at this uh this filter and identifying the issues I usually do that
390:05 identifying the issues I usually do that on date fields as well and and
390:06 on date fields as well and and oftentimes um I know just for just for
390:09 oftentimes um I know just for just for demonstration purposes ofttimes I will
390:12 demonstration purposes ofttimes I will get something like that and then I'll
390:14 get something like that and then I'll come up here and I'll notice that
390:16 come up here and I'll notice that there's this one random number that
390:18 there's this one random number that happens all the time all the time um and
390:22 happens all the time all the time um and so you know you want to make sure that
390:25 so you know you want to make sure that you um that you look at these things and
390:28 you um that you look at these things and just just do at least a quick glance if
390:30 just just do at least a quick glance if not kind of doing a kind of a deep dive
390:32 not kind of doing a kind of a deep dive into it but all we're going to do is
390:34 into it but all we're going to do is we're going to do both of these and
390:36 we're going to do both of these and we're going to do a short date and let's
390:38 we're going to do a short date and let's take a look and see if that fixed it and
390:40 take a look and see if that fixed it and so now they are all the same format and
390:43 so now they are all the same format and that is fantastic that is exactly what
390:45 that is fantastic that is exactly what we want we're going to go back through
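The date check described here can also be sketched outside Excel. This is a minimal Python sketch, not the method used in the video: the list of candidate formats and the sample values are assumptions made up for illustration, and the output pattern `%m/%d/%Y` only approximates a US-style short date. Anything that does not parse, such as a stray number sitting in a date column, is flagged for review instead of being silently converted.

```python
from datetime import datetime

# Hypothetical input formats; adjust to whatever the real column contains.
CANDIDATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%m-%d-%y")


def to_short_date(text):
    """Return the value as mm/dd/yyyy, or None if no candidate format matches."""
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # try the next candidate format
        return parsed.strftime("%m/%d/%Y")
    return None


# Made-up sample column, including one stray number among the dates.
raw_dates = ["3/4/2021", "2021-03-05", "03-06-21", "44262"]
normalized = [to_short_date(value) for value in raw_dates]

# Values that could not be parsed are kept aside for a manual look.
needs_review = [
    value for value, result in zip(raw_dates, normalized) if result is None
]
```

Collecting the unparsed values separately mirrors the quick glance through the filter list: the goal is to notice the odd value, not to guess what it should have been.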
390:47 we're going to go back through here and get rid of these
390:50 again this is a
390:52 working document
390:56 so I'm going to do
391:00 control shift down let me go
391:04 back up do control shift down and copy
391:07 and what I'm going to do right now is
391:09 I'm actually going to copy let me do it
391:12 right here I'll show you sometimes I do
391:13 this it just depends I'm going to go
391:15 right here I'm going to right click
391:17 and I'm going to paste as a value which
391:20 means it's not going to take the
391:23 calculation or the formula that I just
391:24 did
391:25 it's going to actually paste it as
391:27 that value so we just replaced it
391:29 right here you can see up here it says
391:31 equals TRIM of
391:33 G2 now that I copied and pasted
391:36 it over as a value it got rid of that
391:41 calculation and now it is actually a
391:43 string so we don't need this anymore and
391:46 I'll do the same thing over here as
391:49 well I'm going to control shift down
391:54 copy and I just hit the left key
391:58 now I'm going to
391:59 right click and I'm going to do paste as
392:02 a value and again it had this PROPER and
392:06 now it doesn't have the PROPER it's
392:07 actually the value that was here so
392:09 that's really important to note and
392:12 we're going to get rid of that one and
392:14 so now what we have is already looking much better
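For readers who later move this cleanup into code, the TRIM, PROPER, and paste-as-value steps can be approximated in Python. This is a rough sketch under stated assumptions: `trim_like` and `proper_like` are hypothetical helper names, the sample names are made up, and the Python string methods only approximate the Excel functions (for example, `split()` treats tabs and newlines as whitespace too). Storing the results in a plain list plays the role of pasting as values: only the cleaned text is kept, not the calculation that produced it.

```python
def trim_like(text):
    """Approximate Excel TRIM: strip the ends and collapse runs of whitespace."""
    return " ".join(text.split())


def proper_like(text):
    """Approximate Excel PROPER: capitalize the first letter of each word."""
    return text.title()


# Made-up sample values with stray spaces and inconsistent capitalization.
raw_names = ["  george   washington ", "JOHN adams"]

# The list holds finished strings, much like cells pasted as values.
cleaned_names = [proper_like(trim_like(name)) for name in raw_names]
```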
392:16 now one of the last
392:18 things we're going to look at is
392:19 deleting columns that we are not going
392:21 to use and this is why it's so important
392:23 to keep a backup or keep the raw data
392:26 not in this file because if you start
392:28 saving over this file and this is your
392:29 raw file that can mess up a lot of
392:31 things and that has happened to me before and
392:34 it's terrible and then you have to
392:36 request another file or you have to go
392:38 back and find it or something like that
392:40 so this is our
392:42 working document so we can mess with
392:44 this and do whatever we want for our
392:45 purposes now for us I can already
392:49 tell you that this prior column is a bunch of
392:51 nonsense and we do not need it we're not
392:53 going to use it for anything
392:55 and this is a very small
392:58 data set this only has let's say
393:00 one two three four five six
393:03 seven eight we have like eight columns
393:05 that we're using that
393:06 have data eight or nine now that's a
393:09 small data set I've had ones with literally
393:11 hundreds and when it has so many
393:15 columns and so much data sometimes
393:18 it's good to just trim it back to the
393:20 things you know you're going to use this
393:21 to me is absolutely useless we're
393:23 going to delete that
393:25 and then right over here it's pretty
393:26 redundant it's just one number off
393:30 but if we scroll down just a little bit
393:32 it basically just counts
393:34 you could even call it a unique
393:37 identifier if you want sure why not
393:40 but we don't need both so we're going
393:41 to get rid of this first one and now we
393:43 have more of the useful and relevant
393:45 data rather than the stuff that we
393:46 absolutely know that we are not going to
393:48 use these date updated and date
393:50 created columns we may never use them but we
393:53 might so it doesn't hurt to keep them
393:56 on hand those other ones are ones that
393:57 we are almost certain we will never use
394:00 again keep a backup just in case you
394:02 need it you can always go back and get it
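The same habit, dropping unused columns from a working copy while leaving the raw data alone, can be sketched in Python. The column names and the single sample row below are hypothetical stand-ins, not the actual data set from the video, and `drop_columns` is an illustrative helper rather than a library function.

```python
def drop_columns(rows, unwanted):
    """Return new row dicts without the unwanted keys; the input is not modified."""
    unwanted = set(unwanted)
    return [
        {key: value for key, value in row.items() if key not in unwanted}
        for row in rows
    ]


# Hypothetical raw data: a redundant counter, a column we never use, and two dates.
raw_rows = [
    {
        "row_number": 1,
        "id": 2,
        "name": "Example Person",
        "notes": "n/a",
        "date_updated": "3/4/2021",
        "date_created": "3/4/2021",
    },
]

# The working copy loses the columns we are sure we will not need.
working_rows = drop_columns(raw_rows, ["row_number", "notes"])
```

Because `drop_columns` builds new dictionaries, `raw_rows` still holds every original column, which is the code equivalent of keeping the raw file as a backup.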
394:03 so if you go back to what we
394:06 started with and you look at what we
394:07 have now it is much cleaner it's much
394:10 more usable and these are small subtle
394:13 changes especially with this very
394:15 small data set of only like 50 rows
394:17 or 46 rows but you're going to be
394:19 working with data sets that are
394:21 thousands tens of thousands hundreds of
394:22 thousands of rows and you need to know
394:25 how to look at this data
394:27 standardize it and format it properly for
394:30 what you're going to be using it for if
394:31 you're keeping it in Excel there are
394:33 different things that you may do than if
394:34 you're putting it into a database or
394:36 going to be
394:40 using Python to access it so you need
394:43 to know your use case but these
394:45 are some things that I do all the time
394:48 to clean up the data before I
394:50 use it for something whether I'm
394:52 creating pivot tables or
394:54 I'm putting it into SQL these
394:57 are things I do all the time and so
394:58 hopefully that helps give you an
395:00 idea of some of the things that you
395:02 should be looking for when you're
395:03 actually cleaning data and it's really
395:05 important to understand why you're
395:06 actually making these changes
395:08 and the reason for them
395:09 because some of the things that I did
395:11 today may not be things you want to do
395:13 on a different data set that has
395:14 different uses and different purposes
395:16 so take everything that
395:19 I've said and apply it with a
395:21 little grain of salt to your data set
395:23 because your specific needs may
395:25 be different than what I wanted when I
395:27 was cleaning my data set so I hope this
395:30 was helpful I hope this gave you a
395:31 small glimpse of some of the things that
395:33 I'm looking for when I clean a data set
395:35 or I get a new data set in and I'm
395:36 analyzing it figuring out
395:38 what I need to fix in it
395:41 with that being said
395:43 thank you so much for watching I really
395:45 appreciate it if you like this video be
395:47 sure to like and subscribe below and
395:48 I'll see you in the next
395:51 [Music]
395:53 video
396:02 [Music] what's going on everybody welcome back
396:03 to the Excel tutorial series today we're
396:05 going to create an entire project in
396:08 [Music]
396:12 Excel now if you've never done a
396:14 complete project in Excel where you take
396:16 the data you clean it then you create an
396:18 actual dashboard where people can click
396:20 on things and filter things this is
396:22 going to be a really great learning
396:23 opportunity as well as potentially
396:25 a simple project that you can use
396:27 for your portfolio or you can spice
396:29 things up and go a little farther than
396:30 what we're going to be doing in today's
396:32 video I will walk you through every
396:33 single step of the way and hopefully we
396:35 learn something together and without
396:37 further ado let's jump right into it
396:39 let's jump onto my screen and get
396:40 started with the project all right so
396:42 this is the data set that we're going to
396:43 be working with I will leave a link in
396:45 the description to my GitHub where you
396:46 can go and download it so you can be
396:47 working with the exact same data set
396:49 that I am using now before we actually
396:51 get into this data and start looking at
396:53 it I'm going to show you what the final
396:55 dashboard is going to look like we're
396:57 going to create a few different types of
396:58 visualizations nothing too crazy and
397:01 then we'll create some filters as well
397:03 so we can have some
397:04 interactive filters with our data so
397:06 let's go right on over to our data set
397:10 now I'm going to hide this because we
397:13 are not going to use that but what I am
397:14 going to do before we do anything is I'm
397:16 going to create a
397:18 dashboard and I'm going to create a
397:21 pivot table
397:24 and I'm going to create a working sheet
397:30 all these things have different
397:33 uses and I'll explain that as we go
397:35 along so this is our data set I'm
397:38 going to copy this over to our working
397:40 sheet when I go into an Excel file
397:44 and I'm working on something I don't
397:45 like to use just the one that I
397:48 started with in case I mess something up
397:49 and it saves over or there's some issue I like
397:51 to create a working sheet and keep the
397:53 raw data right over here it just makes
397:55 my life easier I don't have to save it
397:57 and then open up a different
397:58 Excel file to compare them so we have our
398:00 bike buyers this is our working sheet
398:02 this is our raw data this is the one
398:04 we'll actually be working on today so
398:06 let's start looking at it
398:08 really quick and just glance and
398:10 see what data we're working with and
398:12 then we'll start cleaning it up making
398:14 it more useful for what we are going to
398:16 be using it for and then we'll start
398:18 building out the dashboard so right here
398:21 we have an ID that should be a unique
398:24 ID for each person this is their
398:27 marital status so married or single this
398:30 is their gender male or female we have
398:32 their income children their education
398:35 their occupation do they own a home how
398:38 many cars they own how long their
398:40 commute is the region where they live
398:43 their age and if they purchased a bike
398:45 and this column right here is extremely
398:47 important this is going to tell us
398:48 whether they did or did not buy a bike
398:50 so we got their information they were
398:52 looking for a bike but they either
398:54 decided not to buy a bike or they did
398:55 buy a bike and we're going to be using
398:57 that one a lot in this video and so
399:01 this is basically the data
399:04 set that we're working with some of
399:05 the demographics and information
399:07 behind the person
399:11 so what we want to do when we are cleaning the data before we
399:13 do anything I like to see if there
399:15 are any duplicates in here what we're
399:17 going to do is come right up here we can
399:20 go to
399:22 where is it right here we've got remove
399:26 duplicates so we're going to click on
399:27 that it selects every single column we just
399:30 want to see if there's any useless
399:32 duplicated data that we do not need
399:34 and the data has a header so we're going
399:36 to click
399:37 okay all right so we had a ton of
399:38 duplicates in there for whatever
399:40 reason
399:42 so I'm glad we did that otherwise
399:44 we would have not good data
399:48 and we don't want that
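The duplicate check can be sketched in Python as well. This is a minimal illustration, not Excel's implementation: it treats two rows as duplicates only when every column matches, keeps the first occurrence, and uses made-up sample records with a hypothetical three-column layout.

```python
def remove_duplicate_rows(rows):
    """Keep the first occurrence of each row, comparing all columns."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row)  # every column takes part in the comparison
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical header and records; the header is kept apart from the data rows.
header = ("ID", "Marital Status", "Gender")
records = [
    (12496, "M", "F"),
    (24107, "M", "M"),
    (12496, "M", "F"),  # exact repeat of the first record
]

unique_records = remove_duplicate_rows(records)
removed_count = len(records) - len(unique_records)
```

Reporting `removed_count` gives the same feedback as the dialog that appears after clicking OK: how many rows were dropped and how many remain.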
399:50 let's start right over here the ID of course we're not
399:53 going to change the marital status and
399:54 gender are M's and S's and F's and M's this
400:00 isn't inherently a bad thing to have it
400:02 like this but we have to think
400:04 about it from the perspective of someone
400:05 who's going to be using this dashboard
400:07 do they know what M and S are do they know
400:09 what M and F are and if they don't
400:13 it's better to just spell it out for the
400:14 most part so let's just do that
400:18 we're going to click on column B
400:20 we're going to hit control H that's
400:22 going to bring up our find and replace
400:24 now there's an M in both of these
400:26 columns and they mean different things one
400:28 means married and one means male so what we're
400:31 going to do is we're going to search by
400:33 columns and we'll have match case I
400:36 don't think that's going to change
400:37 anything but that just means the capitalization has to
400:38 match and we're going to find M
400:42 and we're going to replace it with
400:43 married and we'll replace all awesome
400:47 and then we do S is
400:49 single this one is super easy we're
400:52 going to do the exact same thing right
400:54 here so column C hit control H
400:58 it still has by columns so we'll do M is
401:03 male we'll replace all of those and F is
401:09 female and replace all those that's
401:12 great
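Because the letter M means something different in each column, the replacement has to be scoped to one column at a time. A minimal Python sketch of that idea follows; the column names, the sample rows, and the `spell_out` helper are assumptions for illustration. It maps whole cell values rather than substrings, so a value that is not in the mapping is left unchanged.

```python
# One mapping per column, since "M" is Married in one and Male in the other.
MARITAL_LABELS = {"M": "Married", "S": "Single"}
GENDER_LABELS = {"M": "Male", "F": "Female"}


def spell_out(rows, column, labels):
    """Return new rows with the given column's codes replaced by full labels."""
    return [
        {**row, column: labels.get(row[column], row[column])}
        for row in rows
    ]


# Made-up sample rows using the single-letter codes.
coded_rows = [
    {"Marital Status": "M", "Gender": "F"},
    {"Marital Status": "S", "Gender": "M"},
]

labeled_rows = spell_out(coded_rows, "Marital Status", MARITAL_LABELS)
labeled_rows = spell_out(labeled_rows, "Gender", GENDER_LABELS)
```

Matching the whole cell also sidesteps a risk of substring replacement: once a cell reads Married, a later replace of M can no longer touch it.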
401:15 the next column right here is income and in a previous
401:18 video I talked about how I don't
401:19 typically like it in this format and
401:22 that's true if you're doing calculations
401:23 on it or any other thing it
401:25 can mess it up sometimes having the
401:27 dollar sign or it being a currency we're
401:30 not really going to mess with it too
401:31 much right now what we can do is just
401:36 make sure all of it's currency
401:39 we'll just go like that to make it a
401:41 little simpler but we're not going to
401:42 change it to a
401:44 numeric format we will use this in the
401:47 visualization we'll see how it looks and
401:49 if we need to we'll come back and change
401:50 it if not we'll keep it how it is
401:53 so that's all we're going to do to that one
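The concern about dollar signs mostly shows up once the data leaves Excel, for example when income arrives in another tool as text. The sketch below assumes a text format like `$40,000.00`, which is a guess at how such a column might export and not something shown in the video; `parse_currency` is a hypothetical helper. It strips the symbol and thousands separators so the value can be used in calculations.

```python
from decimal import Decimal, InvalidOperation


def parse_currency(text):
    """Convert text such as "$40,000.00" to a Decimal, or None if it is not numeric."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


# Made-up sample values, including one that is not a number at all.
raw_incomes = ["$40,000.00", "$30,000.00", "n/a"]
incomes = [parse_currency(value) for value in raw_incomes]

# Only successfully parsed values take part in the total.
total_income = sum(value for value in incomes if value is not None)
```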
401:55 the children column those look good we
401:58 have
401:59 education partial college partial high
402:02 school this looks fine to me if
402:04 there are any spelling errors or anything
402:06 like that of course we need to clean
402:07 that up it doesn't look like there
402:09 are
402:11 occupation skilled manual manual okay
402:14 those should be separate are they a
402:17 homeowner should just be yes or no all
402:20 right we have cars 1 2 3 4 good night
402:23 who owns four cars and then we have
402:25 the commute distance and
402:27 there's nothing terrible about this it's
402:29 giving you ranges which can be a good
402:32 thing I say let's keep it for now but I
402:36 have a feeling when we get further and
402:37 we start using it in the visualization we
402:39 may want to change this so let's just
402:41 hold off for now but if needed we
402:44 will come back to this and we'll change
402:46 it and then we have our region and
402:51 that looks totally fine
402:52 and we have our age now when you're using ages typically
402:56 you have some type of age bracket
402:59 or age range and you do that because
403:02 there are so many ages in here right
403:04 it's 25 all the way to 89 and if
403:07 you're using that in some type of
403:08 visualization it could just get really
403:10 messy and so you'll create
403:13 brackets around these so that
403:15 you can condense it and make it
403:17 a little bit easier to understand so
403:20 let's do that and just create a new
403:22 column and then we can use that for
403:24 our dashboard so let's go right up here
403:26 we're just going to create a new column
403:29 we'll call this age
403:32 brackets and what we can do is we can
403:35 use an IF statement to say if
403:40 it's older than or younger than a certain age and
403:42 give them these ranges that's
403:45 one way to do it and that's the way
403:46 we're going to do it right now so let's
403:49 go up here and what we want to do is
403:51 we're going to
403:53 say equals and we're going to do IF and
403:56 we're going to add that parenthesis
403:58 now what we're going to say is if
404:01 this we'll go right back up here if this
404:04 is less than so we're going to do 31
404:09 and we're going to say comma so if they
404:11 are less than 31 what do we want to call
404:15 them what do we want their
404:18 name to be we'll call them
404:22 adolescent oops that's not how you spell
404:24 adolescent and then if
404:27 they're not what we're going to do is
404:29 we're going to say it's
404:32 invalid okay and let's just see if this
404:34 one works
404:35 first all right it's not working at all
404:39 okay so basically what we did was
404:42 incorrect we did it backward
404:44 I said L2 is greater than 31 no
404:48 we want to do it like this so let's do that
404:52 now all right and it should pull up
404:55 where if they're under the age of 31 so
404:58 if they're 30 or below is basically what
405:00 it's saying so if they're 31 they'll be
405:03 invalid but if they're 30 or below it's
405:05 adolescent so it is working properly
405:08 and let's see what it says
405:10 perfect so this one is working
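The first version of the formula has a single condition: ages under 31 get one label and everything else falls through to a placeholder. Written as a small Python function (a sketch that mirrors the narrated logic, with a hypothetical function name), the boundary behavior is easy to check: 30 is still Adolescent, while 31 lands on Invalid until the next condition is added.

```python
def age_bracket_first_pass(age):
    """Mirror a single IF: under 31 is "Adolescent", anything else is "Invalid"."""
    return "Adolescent" if age < 31 else "Invalid"


# Made-up sample ages around the boundary.
sample_ages = [25, 30, 31, 89]
first_pass_labels = [age_bracket_first_pass(age) for age in sample_ages]
```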
405:13 perfect so this one is working and and now what we want to do is we actually
405:14 now what we want to do is we actually want to build on this and make it uh
405:17 want to build on this and make it uh kind of like a nested if statement if
405:19 kind of like a nested if statement if you've ever heard of that or done that
405:21 you've ever heard of that or done that before so this is our first first if
405:23 before so this is our first first if statement and this is going to be this
405:26 statement and this is going to be this is invalid this is our value if false
405:28 is invalid this is our value if false statement this whole statement is going
405:30 statement this whole statement is going to become our value if false for a
405:34 to become our value if false for a different if statement um so let let me
405:38 different if statement um so let let me write it out and hopefully that'll make
405:40 write it out and hopefully that'll make sense but we're going to say if do open
405:43 sense but we're going to say if do open parentheses and we're going to do it
405:44 parentheses and we're going to do it like this and let's just get rid of this
405:45 like this and let's just get rid of this for a
405:48 for a second all right uh what did I do and
405:52 second all right uh what did I do and let me do
405:54 let me do oops give me a
405:56 oops give me a second okay we have our if let me just
405:59 second okay we have our if let me just write that out again we have our if
406:01 write that out again we have our if there we go so now what we're going to
406:04 there we go so now what we're going to do is we're going to write basically the
406:06 do is we're going to write basically the next part of it so we're going to say if
406:09 next part of it so we're going to say if that L2 is and we're going to do this
406:12 that L2 is and we're going to do this time we're going to do greater than or
406:13 time we're going to do greater than or equal to 31 so now it's going to include
406:16 equal to 31 so now it's going to include that 31 so right here we did anything
406:19 that 31 so right here we did anything less than 31 so it's 30 and below this
406:22 less than 31 so it's 30 and below this one is going to be 31 and above so we're
406:25 one is going to be 31 and above so we're going to say these people are middle
406:29 going to say these people are middle Ag and if not then it's going to go to
406:33 Ag and if not then it's going to go to this if statement and then we need to
406:34 this if statement and then we need to close it I believe so now let's try this
406:38 close it I believe so now let's try this all
406:39 right fantastic now if um everybody
406:42 should be in one of these areas right
406:45 everyone should either be an adolescent
406:47 or middle age because basically all
406:48 we're saying is if they're 31 and above
406:50 or 30 or below that's all these two
406:52 statements do so we have um you know our
406:56 next group now we can add and go even
406:59 further into this and now we can use
407:02 this entire thing as the um what was it
407:05 called the value if false section so
407:09 that's what we're going to do we're
407:10 going to do one more so we'll have three
407:11 different categories so we're going to
407:13 say if and do uh an open parenthesis and
407:16 we're going to say if oh actually let's
407:19 do it
407:20 um let's not do it to this one
407:24 let's do it to this top one just
407:26 easier uh so we're going to say if open
407:29 parenthesis we're going to say L2 and
407:33 this time we're going to say anybody
407:35 over the age of 50 uh or we can do 55
407:39 let's do 55 so we'll do 55 and we're
407:42 going to call them
407:44 old and we'll do a comma and this is the
407:48 value if false statement and we need to
407:49 close our parenthesis so let's try this
407:52 anybody over the age of 55 should have
407:55 old um you know maybe we'll do 54 so
407:59 anybody who is 55 is considered old I
408:01 think that's fair I think that's fair
408:04 guys oops I should have
408:05 done I should have done that to this one
408:07 let me get out of this and we'll do
408:10 54 my dad is 55 that's why I'm doing it
408:14 like this this is for
408:16 dad 'cause he should be in this old
408:18 category to be fair so now we have
408:20 adolescent middle age and old
408:23 these are three categories so we can now
408:25 have these buckets these different
408:27 groups of ages and it's much more usable
408:30 than these individual ages um and so we
408:33 will be using this in our
408:35 dashboard for sure
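For anyone who finds the nested IF easier to follow as code, here is a minimal Python sketch of the same bucketing logic. It assumes the finished formula looks roughly like =IF(L2>54,"Old",IF(L2>=31,"Middle Age",IF(L2<31,"Adolescent","Invalid"))); the function name age_bracket and the exact label spellings are made up for illustration.

```python
def age_bracket(age):
    """Bucket an age the way the nested IF does.

    Each 'else' branch plays the role of Excel's value_if_false,
    so the checks run from the outermost IF to the innermost one.
    """
    return (
        "Old" if age > 54                    # outer IF: 55 and above
        else "Middle Age" if age >= 31       # second IF: 31 through 54
        else "Adolescent" if age < 31        # original IF: 30 and below
        else "Invalid"                       # kept to mirror the formula; numeric ages never reach it
    )


# Quick check on a few boundary ages
for sample_age in (30, 31, 54, 55):
    print(sample_age, age_bracket(sample_age))
```

Checking the most restrictive condition first (over 54) is what lets the later, broader conditions stay simple.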
408:38 now our next one is the purchased bike uh and we're not
408:40 going to do anything with that so you
408:42 know that is that one and you
408:46 know there wasn't a ton to clean up here
408:48 we removed some duplicates um I don't
408:51 know why it says that what did I do
408:58 married married what does this even mean did I write that did I mess this
409:01 up guys
409:04 oh when I did the M and the S uh
409:08 replacement in there it replaced it with
409:11 married and single it's supposed to say
409:13 marital status
409:16 oops thanks for catching that guys
409:18 thanks for catching that I hope that's
409:19 how you spell marital uh we'll see so
409:23 uh we are going to keep it just like
409:25 this
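The header got mangled because a plain find and replace matches M and S anywhere in the text, including inside "Marital Status". Excel's Find and Replace has a "Match entire cell contents" option for this situation. Below is a small Python sketch of the difference, assuming a column that holds the header followed by M/S codes; the names CODES and expand_code are hypothetical.

```python
# Hypothetical lookup for the one-letter codes used in the column
CODES = {"M": "Married", "S": "Single"}


def expand_code(cell):
    """Replace a cell only when the WHOLE cell is a known code."""
    return CODES.get(cell, cell)


column = ["Marital Status", "M", "S", "M"]

# Substring replacement also rewrites the letters inside the header
naive = [c.replace("M", "Married").replace("S", "Single") for c in column]

# Whole-cell matching leaves the header alone
fixed = [expand_code(c) for c in column]

print(naive[0])  # header is corrupted
print(fixed)     # header is intact, codes are expanded
```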
409:32 now what we are going to do is build pivot tables with this data so we had
409:35 our raw data we have our working sheet
409:38 and now we want to create pivot tables
409:40 and pivot tables are how you actually
409:42 help build your dashboards or help build
409:44 your visualizations so we're going to go
409:46 right here we're going to hit whoops get
409:49 rid of that we're going to go right here
409:52 we're going to insert and we're going to
409:53 say pivot table and it's going to ask us
409:55 what
409:56 range so we're going to go back to the
409:58 working sheet and we'll just click here
410:00 and hit
410:03 Control+A this is going to select all of our
410:05 data for us so it's really easy and
410:08 we're going to hit okay and so now we
410:11 have all of
410:12 our pivot I don't need I don't need to
410:15 pull it out that far that was way too
410:16 far and now we have all of our pivot
410:17 table information over here and so that
410:20 should make it really easy to you know
410:23 actually build out so what we're going
410:24 to do is start selecting what columns
410:27 and what data we actually want to work
410:28 with so the first one that we're going
410:30 to build out is a dashboard that is
410:33 basically looking at the average income
410:35 of somebody who either bought or did not
410:37 buy a bike so we need in this one we're
410:41 going to need their income that's
410:43 definitely going to be a value right
410:44 here um but we want to break it out by
410:48 male and female so let's look at their
410:49 gender we're going to pull that down into
410:51 the rows so
410:53 um this is basically a sum and no let's
410:56 look
410:57 at let's make this an average so I just
410:59 went to the um I clicked right here I
411:02 went to the value field settings and
411:04 we're just going to do an
411:05 average all right and then we are going
411:08 to make these um and as you can see
411:12 there's four decimal points um we'll
411:14 keep it as is right now but we may need
411:16 to go back and change something then
411:17 we're going to look at if they purchased
411:19 a bike or not and we're going to put that
411:21 right here so we can see that uh
411:24 right here for the people who did not
411:26 buy a bike the females their
411:28 average salary was 53,000 the average
411:31 salary for males
411:33 was 56,000 for yes the ones who did buy
411:36 a bike the average salary was 55 for
411:39 female and 60 for male so the people who
411:42 had a little bit more money are buying
411:44 bikes and you can also see that uh the
411:46 men are making more money in this data
411:48 set just overall in general
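What this pivot table computes is a grouped average: income averaged for each combination of gender (rows) and purchased bike (columns). Here is a rough Python sketch of that calculation using only the standard library. The sample rows are invented for illustration and are not values from the bike sales data; the function name average_income is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows, NOT taken from the actual worksheet
rows = [
    {"gender": "Female", "income": 40000, "purchased": "No"},
    {"gender": "Female", "income": 60000, "purchased": "No"},
    {"gender": "Male", "income": 70000, "purchased": "Yes"},
    {"gender": "Male", "income": 50000, "purchased": "No"},
]


def average_income(records):
    """Average income per (gender, purchased) pair, like the pivot's value cells."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["gender"], record["purchased"])].append(record["income"])
    return {key: mean(values) for key, values in groups.items()}


pivot = average_income(rows)
for (gender, purchased), avg in sorted(pivot.items()):
    print(gender, purchased, avg)
```

Switching the value field from sum to average in Excel corresponds to using mean here instead of sum.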
411:53 um so let's make the visualization really quick but
411:55 you know I don't know I'm not a huge fan
411:57 of these decimal points and maybe we can
411:59 just change that in the visualization
412:01 we'll see um oops that's not what I
412:05 meant to
412:06 do um let's do that so what we are going
412:10 to do is we're going to click into here
412:12 we're going to click insert and we're
412:14 going to go to these recommended charts
412:15 and it's going to bring up basically
412:17 every single type that we would want um
412:20 and we can just click in here and see
412:22 which one looks good uh oh yeah I love
412:25 those 3D ones those are my favorite you
412:27 guys know that uh let's use this
412:29 one right here pretty simple um whoops
412:32 let's pull this right over here and as
412:35 is it looks pretty good um you know it
412:39 shows male female we have the average or
412:42 the incomes right here whether they did
412:44 or did not purchase it um and so at a
412:47 glance it's pretty easy to see let's see
412:50 if there's anything um you know if you
412:54 want to change up style-wise go for it
412:55 I'm just going to keep it as is um but
412:58 let's see if there's anything we need to
412:59 add right do we want to add these axis
413:02 titles uh for the most part I tend to
413:06 do that um it makes it pretty easy to
413:09 see so we can go in here and we can just
413:11 click it like this and we'll say
413:13 income and we'll say oops and we'll do
413:19 gender so that's what that
413:21 is and let's go back in here do we
413:24 want to add a chart title we definitely
413:27 want to add a chart title uh for most of
413:29 these we'll add a chart title for sure
413:30 so we'll say average
413:32 income per
413:35 purchase um I don't know if that's 100%
413:37 right but we'll use it uh if
413:39 we need to change it to be you know by
413:41 gender or something we can but um for
413:44 now let's see do we want to add data
413:46 labels uh definitely not uh a data table
413:50 um we can do this it may make it a
413:52 little easier to read I will say that
413:54 again these numbers are just these
413:55 decimal points are really throwing me
413:57 off let's go see if um we can change it
413:59 in here let's go
414:02 to see if we can just make these numbers
414:05 okay and um we can keep it like that or
414:08 we can even do something like this add
414:13 commas yeah I'm going to keep it just
414:15 like this I think this just looks the
414:16 best um again I'm adding
414:19 commas here I'm changing the um decimal
414:22 place right here it just makes it look a
414:25 little nicer a little cleaner
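The same cleanup (thousands separators, no decimal places) can be expressed with a format specifier in code. This is only a small sketch; the sample numbers are invented, and the helper name format_income is hypothetical.

```python
def format_income(value):
    """Format a number with comma separators and zero decimal places."""
    return f"{value:,.0f}"


# Invented averages with the kind of long decimal tail a pivot table shows
for raw in (53440.0, 56208.1781):
    print(raw, "->", format_income(raw))
```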
414:25 um so let's keep this exactly how it is um we
414:28 can always change things if we want to
414:32 uh if we want to come back to it so we
414:34 created our pivot table and then we
414:36 created our visualization basically
414:38 exactly what we're going to do for all
414:40 of these because again all of these need
414:42 um you know all of these need pivot
414:44 tables in order to create the
414:46 visualization so let's um get out of
414:47 here we're going to scroll down and
414:50 we're going to create our next pivot
414:53 table and once we get done with all of
414:54 the pivot tables that we need all the
414:57 visualizations that we need then we will
414:58 um we will start so we're going to do
415:01 Control+A we're going to do okay and
415:03 basically do the exact same thing that
415:06 we did um this time we're going to look
415:08 at the distance so for this one I wanted
415:10 to see you know I try to you know I
415:13 created this already I've already done
415:15 this entire project through but I
415:16 haven't really talked about why or what
415:18 we're going to look at for this one you
415:20 know what we're looking at is their
415:22 income does it change whether they
415:25 bought or didn't buy one um so if they
415:27 said yes you know is there a reason are
415:30 they making more money is you know are
415:32 price points are the customers do they
415:34 make more money so do we cater to them
415:36 or not uh that's a good question uh
415:38 another thing is you know we sell
415:41 bikes or this person sells bikes so
415:43 commuting distance definitely makes a
415:45 difference you know does the person who
415:48 is buying a bike live one mile away from
415:50 where they work or 20 miles away uh this
415:53 will help us determine this next
415:55 visualization will help us determine you
415:56 know who is doing that or who's
415:59 buying it so what we are going to do is
416:04 buying it so what we are going to do is we are going to look at the um that one
416:08 we are going to look at the um that one that we were looking at earlier the
416:09 that we were looking at earlier the commute distance so we're going to bring
416:11 commute distance so we're going to bring that right over here so we have these
416:13 that right over here so we have these you know one mile 10 Mile 1.2
416:17 you know one mile 10 Mile 1.2 Etc now we are going to uh again we're
416:20 Etc now we are going to uh again we're going to look at if they purchased a
416:22 going to look at if they purchased a bike
416:23 bike that's really important and let's make
416:25 that's really important and let's make that the column as well so now what we
416:27 that the column as well so now what we have is a count of these Nos and yeses
416:29 have is a count of these Nos and yeses whether they did or did not buy a bike
416:32 whether they did or did not buy a bike um one of the issues I already see and
416:34 um one of the issues I already see and we'll I'm going to visualize it and then
416:35 we'll I'm going to visualize it and then I'll show you that this 10 miles you
416:38 I'll show you that this 10 miles you know it's right next to the 0.1 so it's
416:40 know it's right next to the 0.1 so it's not an order um and that could be that
416:44 not an order um and that could be that could be an issue um so we may have to
416:47 could be an issue um so we may have to revise that somehow to put it at the
416:49 revise that somehow to put it at the very bottom because we can either do
416:51 very bottom because we can either do ascending
416:53 ascending or descending uh either one I don't
416:56 or descending uh either one I don't think is going to work so we may have to
416:57 think is going to work so we may have to work through that in just a second um I
416:59 work through that in just a second um I don't know if I did that my I plan for
417:02 don't know if I did that my I plan for that um yeah so it has this big dip
417:06 that um yeah so it has this big dip um yeah so let's let's create it um
417:09 um yeah so let's let's create it um that's okay we're going to figure this
417:11 that's okay we're going to figure this one out together because I honestly um I
417:14 one out together because I honestly um I didn't plan for this one so okay we have
417:14 0-1 miles that's exactly where it needs
417:16 to be the one the two the five that's
417:18 exactly where it needs to be this 10
417:21 miles is not and let's see if I change
417:22 that 10 plus miles to 10 miles plus
417:26 let's see if that'll put it down here
417:32 because I don't know if it's looking
417:34 at I don't know if it's reading it weird
417:35 um but let's go into this working sheet
417:38 and let's go right here and we're going
417:41 to do Control+H and we'll do oops not
417:43 this
417:47 one um 10 miles plus let's get that in
417:48 there and we're going to do
417:52 10 uh
417:54 miles plus I don't know if that's
417:57 actually going to work um we will see so
417:59 let's go back to the pivot
418:03 table let's go to the data let's
418:04 refresh uh no it didn't change
418:08 it um okay so let's think about this
418:10 maybe if we change it to like a letter
418:14 it might change down here so start it
418:17 with uh miles that could work um let's
418:18 try it okay it's already
418:21 selected let's do the 10 plus miles okay
418:25 so let's
418:29 do
418:31 um M uh more than 10
418:34 miles and we'll replace all let's get
418:39 rid of
418:42 this let's go to the pivot and refresh
418:44 all right okay so it's not perfect but
418:48 it works
418:51 um and for what we're doing I think
418:53 we'll keep it how it is
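The ordering problem comes from the commute distance labels being text, so they sort character by character rather than by the number they start with. Here is a small Python sketch of the effect and of the two workarounds (renaming the label so it starts with a letter, or sorting on the leading number). The exact label spellings are assumed, and Python's character ordering is not identical to Excel's, so the misplaced position can differ; the underlying issue is the same.

```python
import re

# Assumed label spellings, listed in the intended order
labels = ["0-1 Miles", "1-2 Miles", "2-5 Miles", "5-10 Miles", "10+ Miles"]

# Plain text sort: "10+ Miles" lands among the labels starting with 1
text_sorted = sorted(labels)

# Workaround from the video: rename so the label starts with a letter,
# which sorts after every label that starts with a digit
renamed = ["More than 10 Miles" if lbl == "10+ Miles" else lbl for lbl in labels]
renamed_sorted = sorted(renamed)


def leading_number(label):
    """Sort key: the first run of digits, or infinity if there is none."""
    match = re.match(r"\d+", label)
    return int(match.group()) if match else float("inf")


# Alternative: keep the labels and sort by the number they start with
numeric_sorted = sorted(labels, key=leading_number)

print(text_sorted)
print(renamed_sorted)
print(numeric_sorted)
```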
418:58 we'll keep it how it is so we have our second one uh and you know there are
419:02 second one uh and you know there are different ways you can kind of change
419:04 different ways you can kind of change this one um you know on the last one we
419:06 this one um you know on the last one we did a ton of different stuff we can
419:09 did a ton of different stuff we can do just
419:12 do just do commute
419:15 do commute distance and we can
419:17 distance and we can say what do we want to say on this one
419:20 say what do we want to say on this one what is this oh this is the count um do
419:23 what is this oh this is the count um do we have to do we have to keep this
419:26 we have to do we have to keep this one um no there we go I'm just going to
419:30 one um no there we go I'm just going to do um just one and
419:35 do um just one and say commute
419:38 say commute distance and let's add a
419:41 distance and let's add a title chart title we can make this one
419:45 title chart title we can make this one um let's
419:46 um let's say
419:49 say distance per customer uh that's not 100%
419:53 distance per customer uh that's not 100% true because it's no or yes um that's
419:55 true because it's no or yes um that's that's the important part of this it's
419:58 that's the important part of this it's distance um average distance uh let's
420:02 distance um average distance uh let's see we'll just say customer
420:10 commute all right and we'll keep it just like that all right perfect I don't
420:13 like that all right perfect I don't think um let me see I don't think
420:16 think um let me see I don't think there's anything else we need to add on
420:17 there's anything else we need to add on that one all right now let's go right
420:19 that one all right now let's go right down here we're going to create our very
420:21 down here we're going to create our very last one uh we only had three so you
420:24 last one uh we only had three so you know sometimes you'll have a ton
420:26 know sometimes you'll have a ton sometimes you'll have like one on each
420:28 sometimes you'll have like one on each sheet and you'll create multiple sheets
420:29 sheet and you'll create multiple sheets but um do contr a um now we have our
420:34 thing now this one we're going to be
420:36 looking at these age brackets that we
420:39 were looking at that we created um
420:41 something that I do honestly a lot is
420:45 kind of bracket things into groups
420:47 like this and you know for this I just
420:50 kind of made them up but you know it's
420:53 good to know how to do this because I
420:57 promise you this one happens a lot or I
420:59 use this one a ton and then we just want
421:01 to look at who purchased a bike uh so
421:04 the same thing as we did before so like
421:05 purchased bike count of the purchase um
421:08 you know pretty easy so we just have the
421:10 count of either no or yes for these age
421:12 ranges um and let's go to the insert
421:17 we'll go to
421:18 recommended um I personally like a
421:21 good line for this one um so
421:24 let's this is already interesting we
421:27 could do something like
421:29 this that's nice see this one versus
421:33 this it just adds a dot it looks nice
421:35 we'll keep that one
421:37 um so just really quick at a glance
421:40 really interesting people under the age
421:42 of 30 are not buying that many bikes um
421:45 age 30 to
421:47 54 uh 31 to 54 buying a ton of bikes uh
421:52 they buy more bikes or look at bikes
421:54 more than anybody really interesting um
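This third pivot table is a count rather than an average: how many rows fall into each age bracket and purchased bike combination. A minimal Python sketch of that tally is below. The sample rows are invented and do not reflect the real counts in the data set, and the helper name purchase_counts is hypothetical.

```python
from collections import Counter

# Invented (age bracket, purchased bike) pairs, NOT the real data
rows = [
    ("Adolescent", "No"),
    ("Adolescent", "No"),
    ("Adolescent", "Yes"),
    ("Middle Age", "Yes"),
    ("Middle Age", "Yes"),
    ("Middle Age", "No"),
    ("Old", "No"),
]


def purchase_counts(records):
    """Count rows per (bracket, purchased) pair, like the pivot's count cells."""
    return Counter(records)


counts = purchase_counts(rows)
for bracket in ("Adolescent", "Middle Age", "Old"):
    print(bracket, "No:", counts[(bracket, "No")], "Yes:", counts[(bracket, "Yes")])
```

A Counter returns 0 for combinations that never occur, which matches an empty cell in the pivot table.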
421:56 but yeah we'll make the dashboard in a
421:58 little bit um let's make these chart
422:00 titles we'll
422:02 do vert oops the
422:05 horizontal we'll just call
422:07 this age
422:14 bracket um and then we'll add a chart title um again you can add some extra
422:18 stuff if you want
422:19 to um but you don't need to uh none of
422:22 this other stuff we really need I'm just
422:23 kind of looking at the stuff we do need
422:26 or do want uh so what do we want to call
422:28 this one let's call it customer
422:31 age brackets um and it's not perfect but
422:36 we'll keep it as is for comparison um
422:39 let me see if I can copy
422:42 um or use this um real quick instead
422:46 of the age brackets I'm going to get rid
422:49 of this and use the age
422:53 and then let's
422:55 use um let's insert
422:58 recommended we'll use a
423:00 line and we'll use this
423:04 so this compared to this just think of
423:08 it like if a customer or consumer or
423:12 not a customer if somebody you're
423:13 working with is trying to use this
423:15 dashboard to understand this dashboard
423:17 this is going to be just it's going to I
423:20 don't know it might melt their brain
423:21 just makes no sense it makes sense it's
423:23 just all over the place it's really hard
423:24 to make sense of this it really is I
423:27 mean you can kind of see a pattern going
423:29 up around like the mid-30s and then it
423:31 trends downward but it's hard to see um
423:34 It really is. So using these brackets really helps,
423:38 and you can even add the range, like "Adolescent 0 to 30", underneath each label.
423:44 In fact, why not? Let's do that.
423:53 Let's go back to the data. I'm doing this on the fly.
424:00 This column is all calculated, but let's do "Adolescent 0 to 30",
424:12 "Middle Aged 31 through 54", and then "Old 55 plus".
424:19 Let's see if this breaks anything. I hope it doesn't.
424:23 We'll go back to our pivot table and refresh the data.
424:33 Okay, it did mess with stuff. Never mind, guys, that was a terrible idea. Don't do that.
424:40 Let's get rid of that. I'm glad we tested it out, though; I like to see if something is going to work.
424:46 It messed with the order of things.
424:52 I intentionally named them Adolescent, Middle Aged and Old because that makes sense for the visualization,
425:01 and if changing something messes with it, I'm not going to mess with it.
425:05 It was just an idea on the fly.
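The bracket idea above can be sketched outside Excel as well. This is a minimal Python sketch, not the formula used in the video: the cut-offs (0 to 30, 31 to 54, 55 plus) come from the transcript, while the exact label wording and the function names `age_bracket` and `count_by_bracket` are assumptions made for illustration. It keeps the display order in an explicit list, so relabeling a bracket cannot reshuffle the chart the way it did after the pivot table refresh.

```python
# Age brackets as described in the video. The label text is an assumption.
BRACKETS = [
    (0, 30, "Adolescent 0-30"),
    (31, 54, "Middle Aged 31-54"),
    (55, None, "Old 55+"),  # None means "no upper limit"
]

# Explicit display order, independent of how the labels sort as text.
BRACKET_ORDER = {label: position for position, (_, _, label) in enumerate(BRACKETS)}


def age_bracket(age):
    """Return the bracket label for a single age."""
    for low, high, label in BRACKETS:
        if age >= low and (high is None or age <= high):
            return label
    raise ValueError(f"age out of range: {age}")


def count_by_bracket(ages):
    """Count people per bracket and return the counts in display order."""
    counts = {label: 0 for _, _, label in BRACKETS}
    for age in ages:
        counts[age_bracket(age)] += 1
    return sorted(counts.items(), key=lambda item: BRACKET_ORDER[item[0]])
```

Calling `count_by_bracket([25, 42, 60, 33])` groups the four ages into the three brackets, always listed youngest to oldest regardless of what the labels say.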
425:07 Guys, come on. All right, let's start building out our dashboard.
425:09 When I'm building a dashboard, what I personally like to do is keep this pivot table sheet,
425:16 copy the charts over, and later hide the other sheets.
425:21 I'll explain that in a bit, but I like to keep this one for us.
425:25 So we're going to copy this: I just click on it, hit Ctrl+C, and paste it right over here.
425:33 Let's make them small for now. Oh gosh, no, let's not do that. These look terrible. Okay, anyways.
425:41 Let's copy this one over.
425:47 Oops, what did I just do? Oh, I didn't copy this one. It's not copying.
425:56 Okay, we're going to copy, hit paste. Fantastic.
426:04 Guys, look away. This is tough to watch, and I'm the one doing it.
426:10 All right, let's go to this last one. I'm going to try it again.
426:15 All right, it worked this time.
426:18 So now we have our three visualizations. This is perfect,
426:22 but now we actually want to create a dashboard. How do you make it look nice?
426:26 Then we're going to add some filters and things like that.
426:33 What happened here? What changed? What did we do? Oh my goodness gracious.
426:39 All right, let's copy this, paste this, and get rid of this.
426:46 I don't even know how that happened. I've never seen that before. That was wild.
426:49 Excel is trying to destroy my whole video. I'm doing this for you, Excel! Good night.
426:58 Okay, no problem at all. Here's how you make this at least look nice.
427:03 First off, we can get rid of these gridlines pretty easily, and I recommend doing that
427:07 whenever you make a dashboard. It looks cleaner, like an actual dashboard.
427:10 Let's go to View and Gridlines so we can get rid of them.
427:17 We can choose any color here; I'm just going to choose a color I like.
427:24 We're basically creating a header, like you would if you were using Tableau or something.
427:31 We're going to Merge and Center, which takes every cell we have highlighted and turns them into one.
427:35 Let's call this "Bike Sales Dashboard".
427:49 Let's make the text white and much larger than it is.
427:57 Sure, that doesn't look bad.
428:02 Let's center that. It's not perfect, but we're going to use it.
428:10 All right, so now we want to organize these. Everybody has their own way of doing it;
428:18 I'm just going to start building it out myself, see how it looks, and go from there.
428:26 I like this one there. This one is kind of a longer one, so I'll probably put it at the bottom.
428:36 We'll put this one right here and try to line it up. Let's zoom in a little bit.
428:52 Let's extend it to the end. That doesn't look too bad. It needs to move up just a hair,
428:57 and I'll show you how to align these in a second.
429:04 Let me zoom out and extend this one the length of the others, just to make it look nice.
429:12 Now, something that's pretty simple: you can select both of these,
429:19 go to Shape Format, and align them.
429:23 It's really nice to align things, especially along the top and maybe left to right.
429:29 We're going to align these to the top, and they line themselves up along the very top.
429:33 Now these look much better. This one is a larger visualization, so I'm going to keep it how it is,
429:40 and I'm going to keep this one how it is, so it will be a little bit smaller, as you can tell.
429:52 This is going to bother me if I don't align these, so let me do this: Shape Format, Align Right.
430:02 And that's not exactly what I wanted to happen.
430:13 I actually wanted this one to align with this one, and it did the opposite, so let me just scoot this back.
430:19 All right, visually it looks fine. That's how you do it if you want to,
430:23 but if you have multiple charts like this, it can make things look bad.
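The surprise with Align Right is easier to see with numbers. The sketch below is an assumption about how the alignment behaves, not a description of Excel's internals: it treats each chart as a rectangle and assumes every selected object moves to the outermost edge of the selection (the highest top for Align Top, the farthest right edge for Align Right). The `Box` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A chart on the sheet, described by its position and size."""
    name: str
    left: float
    top: float
    width: float
    height: float

    @property
    def right(self):
        return self.left + self.width


def align_top(boxes):
    """Move every box up to the top edge of the highest box."""
    top = min(box.top for box in boxes)
    for box in boxes:
        box.top = top


def align_right(boxes):
    """Move every box so its right edge matches the rightmost box."""
    right = max(box.right for box in boxes)
    for box in boxes:
        box.left = right - box.width
```

Under this assumption, the object that already sits farthest to the right stays put and the others move to meet it, which would explain a result that looks like "the opposite" when you expected that object to be the one that moves.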
430:30 So we have our dashboard. This is already looking really good.
430:32 I like how this looks: the colors are coordinated and we have a theme throughout.
430:38 I actually kind of want to change this one. Let's see.
430:52 Maybe if I did it like that, it would look nicer. Yeah, this does look nicer, and it doesn't change much either.
430:56 Should I do it? All right, we're going for it. We're changing the design on the fly.
431:01 Should I do it for all of them? Let's see. It doesn't fit.
431:11 All right, guys, just ignore what I'm doing. Don't do any of this; I'm just messing around at this point.
431:18 So this is really great to have, it really is.
431:20 But there are other things people would like to filter by and look at that aren't in these visualizations.
431:28 To be more specific, one field that could be really interesting is married versus single:
431:34 are single people buying more, or are married people buying more? It'd be nice to filter on it.
431:40 So we're going to click on any of these charts, go up to PivotChart Analyze, and click Insert Slicer.
431:46 Now we can choose which fields we want to filter on, all at the same time or one at a time.
431:53 I'm just going to do the first one by itself, and then I'll show you how to do the others.
431:57 This one is Marital Status, the married or single field we were just talking about,
432:03 and we can drag it right over here and bring it in a little bit.
432:09 We don't need all that space, so we're going to shrink it, boop boop boop, all the way up.
432:16 Now, because we had this one visualization selected, the slicer is only working on that one right now.
432:24 We of course want it to apply to all of them, which is not hard to do.
432:29 Make sure the slicer is selected, go up to Slicer, and hit Report Connections.
432:35 If you remember, we have this pivot table sheet that we're working with, and this is where all of our pivots are coming from,
432:43 so we're going to apply it to all of them. This is our sheet, and this is the name of the pivot table.
432:49 Again, we created that fourth one and we're not using it, but we're going to apply it to all of them.
432:54 So now when we click on it, it applies to all of them.
432:58 At a quick glance, let's see what single people are doing.
433:05 Interesting. When I look at just these numbers right here,
433:10 married people are making a lot more, sometimes 8,000 to 10,000 more on average than their single counterparts.
433:23 Again, that's a rough estimate, but it's interesting.
433:27 Now let's create more of these. We're going to go to PivotChart Analyze and Insert Slicer.
433:33 We already did Marital Status, but what if we want to look at things like Region and maybe Education?
433:41 Let's bring up both of those, and look, now two of them come up.
433:47 Let's put Region right here and bring it in just a little bit to see if we can match it. Nailed it.
433:56 We'll bring this one down, just like this, and bring it over to see if I can match it again.
434:07 Almost nailed it. I don't know if I nailed it, but it's close.
434:15 And we have to do the exact same thing we did with the first one, because right now each slicer only applies to that one chart.
434:21 So we go to Slicer, Report Connections, and add it to all of them. Do the same thing with Education: Report Connections.
434:30 Bada bing, bada boom. We are looking good.
434:38 Now let's clear all of the filters, so it's just everybody.
434:40 Now we can slice and dice and choose what we want.
434:45 Say we want to look at people who have a bachelor's degree, live in Europe, and are single:
434:51 this is the information we have on those people.
434:53 So we can narrow it down by certain demographics even further and still look at this key information.
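Connecting one slicer to every pivot table amounts to applying the same filter before each summary is computed. The following is a rough Python sketch of that idea under stated assumptions: the field names (`marital_status`, `region`, `education`, `income`, `purchased`), the sample rows, and the function names are all invented for illustration and are not taken from the workbook in the video.

```python
from statistics import mean

# Invented sample rows; the real workbook has many more rows and columns.
ROWS = [
    {"marital_status": "Married", "region": "Europe", "education": "Bachelors", "income": 40000, "purchased": "Yes"},
    {"marital_status": "Single", "region": "Europe", "education": "Bachelors", "income": 30000, "purchased": "No"},
    {"marital_status": "Single", "region": "Europe", "education": "Bachelors", "income": 50000, "purchased": "Yes"},
    {"marital_status": "Single", "region": "Pacific", "education": "Graduate Degree", "income": 80000, "purchased": "Yes"},
    {"marital_status": "Married", "region": "North America", "education": "Bachelors", "income": 70000, "purchased": "No"},
]


def apply_slicers(rows, selections):
    """Keep rows that match every slicer. An empty selection means 'show all'."""
    active = {field: allowed for field, allowed in selections.items() if allowed}
    return [row for row in rows if all(row[f] in allowed for f, allowed in active.items())]


def average_income_by_purchase(rows):
    """First summary: average income for buyers and non-buyers."""
    groups = {}
    for row in rows:
        groups.setdefault(row["purchased"], []).append(row["income"])
    return {key: mean(values) for key, values in groups.items()}


def purchase_counts(rows):
    """Second summary: how many people bought and how many did not."""
    counts = {}
    for row in rows:
        counts[row["purchased"]] = counts.get(row["purchased"], 0) + 1
    return counts


# One set of slicer selections, shared by every summary.
selections = {"marital_status": {"Single"}, "region": {"Europe"}, "education": {"Bachelors"}}
filtered = apply_slicers(ROWS, selections)
```

Both summaries are then computed from `filtered`, so changing a selection changes every chart at once. Passing an empty dictionary as `selections` plays the role of clearing all the slicers.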
435:00 We may not be charting counts and averages of these fields, but we're able to filter on them, and that's really great to know.
435:06 So people with bachelor's degrees are making, on average, somewhere in the 60,000s to 70,000.
435:12 Let's look at graduate degrees. Okay, a little more.
435:19 Again, I'm just looking at random stuff, but you can mess around with this and take a look.
435:26 I want to make this color darker; I feel like it would look nicer. There we go. Oh yeah, that's way better.
435:34 This, to me, is a good dashboard. You have the key information you're looking at, nice visualizations,
435:43 it's color coordinated, and you have these slicers on the side. To me this is a fantastic, simple dashboard.
435:52 There are so many other things you can do with this data. You can make it unique and add your own spin on it,
435:58 and I highly recommend that you do. Push yourself, go past what we did today, and add your own stuff.
436:04 Then you can add this to your portfolio website and show people that you know how to use Excel,
436:12 which is a fantastic thing to know how to use and show off.
436:14 With that being said, I hope this project was helpful and that you learned something along the way. I know I did.
436:22 I was learning things as we were going, and I hope you didn't mind that I took some detours, for your amusement as well as my learning.
436:30 Thank you so much for joining me. I really appreciate it. I hope you have a good day, and
436:37 [Music]
436:48 goodbye.
436:50 What's going on, everybody? Welcome back to another video. Today we are starting our Tableau tutorial series.
436:59 [Music]
437:02 This series is for absolute beginners, so if you have never used Tableau before, you are in the perfect place.
437:05 I'm going to take you all the way from installing it and understanding what Tableau is and how you can use it
437:11 to creating dashboards and sharing them.
437:14 Personally, I hate those videos that are three hours long and just expect you to go through it all.
437:18 I like to break my videos up into chunks. If you have ever done my SQL tutorials, you'll know that I like to break things up
437:26 so you have time to try things out yourself before you move on to the next video.
437:29 So I'm going to break this up into five separate videos.
437:32 In this video I'm going to show you how to install Tableau for free, walk through the user interface,
437:38 download a data set that you can find on Kaggle, and then we'll build our first visualization together.
437:43 With that being said, let's jump over to my screen and get started.
437:46 All right, the very first thing you need to do is download Tableau.
437:50 We're not going to be using regular Tableau; we're going to be using a free version called Tableau Public.
437:54 It has a lot of the same features, though not every feature that regular Tableau has,
438:00 but it is absolutely perfect for learning and using it, and you can even build dashboards and share them for your portfolio.
438:09 I'm going to put this link in the description so you can just click on it.
438:14 All you have to do is enter your email right here and click Download the App.
438:20 It should start to download; save it, and then open it up.
438:24 I'm going to open it, though I don't know what it's going to do since I already have it downloaded,
438:30 but it should look like what you're seeing on my screen in just a second.
438:38 I hope you can see this: it says Tableau Public. It says I already have it set up, but you're going to click Install and go through all the setup steps.
438:46 I'm going to exit out of here, go over here, and type in "Tableau Public".
438:54 It's 2021.3; that's the current version they have out. If you're doing this in the future, there may be a different version.
439:01 You should be able to pull this up right here.
439:07 Now I'm going to go and get the data set we're going to be using, and show you how to get it as well,
439:12 and then we'll jump into Tableau and start using it.
439:17 I'm going to get a data set from Kaggle. I wanted something pretty generic to show you.
439:20 In future videos I'm going to show you some different visualizations you might use, and we'll get different data sets for those,
439:31 because of course no single data set covers all those types of visualizations.
439:34 So we're starting off pretty simple: we're getting one called Video Game Sales.
439:40 We can take a really quick look at it. Here are some of the fields you'll have: rank, name, platform, year, genre, and then some sales data.
439:49 This is what it actually looks like. It's called VG sales, for video game sales, and it's a CSV.
439:55 Here are the fields, and we have our data. All we're going to do is download that and save it.
440:07 Now, when you download it, it's going to be saved in a zip file,
440:09 so we need to go to our downloads. Let's refresh this. Here's our archive.
440:16 We need to go in here, and you can just copy the file and paste it right back into the folder.
440:20 And just so you know, that is a CSV, so be aware of that.
440:27 know that is a uh a CSV so be aware of that so what we want to do is we want to
440:29 that so what we want to do is we want to come in here now since it is a CSV this
440:31 come in here now since it is a CSV this is not we're not going to be using
440:33 is not we're not going to be using Microsoft Excel we're going to be using
440:35 Microsoft Excel we're going to be using the text file so we'll come in here
440:37 the text file so we'll come in here we'll take VG sales now uh one thing I
440:40 we'll take VG sales now uh one thing I want to do before I do that is I'm going
440:42 want to do before I do that is I'm going to rename mine uh VGC
440:46 to rename mine uh VGC sales1 um I've already prepared for this
440:49 sales1 um I've already prepared for this and so I already have that in there um
440:51 and so I already have that in there um but so I want to make a distinct one for
440:53 but so I want to make a distinct one for myself you do not have to do that so
440:55 myself you do not have to do that so we'll come back here um and then we're
440:57 we'll come back here um and then we're going to do text file and VG sales we're
441:02 going to do text file and VG sales we're going to open that
441:03 going to open that up
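For anyone following along outside Tableau, the download step just described (a zip archive that contains one CSV) can be sketched in plain Python. This is only a rough illustration and not something shown in the video: the member name `vgsales.csv`, the column names, and the two sample rows are assumptions made up for the example, and the archive is built in memory so the snippet runs on its own.

```python
import csv
import io
import zipfile

# Made-up sample rows; the column names are an assumption about the download.
SAMPLE_CSV = (
    "Rank,Name,Platform,Year,Genre,Global_Sales\n"
    "1,Game A,Wii,2006,Sports,10.5\n"
    "2,Game B,PS2,2004,Action,7.25\n"
)


def build_archive() -> bytes:
    """Create an in-memory zip that stands in for the downloaded archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("vgsales.csv", SAMPLE_CSV)
    return buffer.getvalue()


def read_csv_from_zip(zip_bytes, member):
    """Open one CSV member of a zip archive and return its rows as dicts."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member) as raw:
            # The member is opened as bytes, so wrap it to read text lines.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


rows = read_csv_from_zip(build_archive(), "vgsales.csv")
print(len(rows), rows[0]["Name"])
```

With a real download you would pass the bytes of the archive from your downloads folder instead of `build_archive()`.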
441:05 and when it pulls up right here um you
441:09 can bring in other tables and then you
441:11 can start to join them together and
441:13 create those relationships we are not
441:15 going to be doing that in this video
441:16 we'll do that in a separate one um as
441:19 for you know just getting started you
441:21 know we're not going to be using that
441:22 but you can see some of these things or
441:27 some of these fields and if you notice
441:30 they um they're either ABC or
441:34 they're a number so it starts to
441:36 categorize what this field type is so is
441:40 it a string is it numeric it starts to
441:43 automatically do that and that's all
441:44 done within
441:45 Tableau and so it just kind of reads it
441:48 and that's what it does um what we're going
441:50 to do is I'm going to click right down
441:52 here it's called Go to Worksheet um the
441:54 worksheets are where you're going to
441:55 actually start being able to build your
441:57 visualizations your charts your graphs
441:59 all these things um and so you know we
442:02 have this in here now and so we're just
442:04 going to click right here on Go to
442:06 Worksheet as you can see here is VG
442:10 sales_1 you will not have the underscore
442:12 one if you did not add that like I did
442:14 uh but right down here you can see all
442:16 the fields that we just imported from
442:18 that data set and they even created one
442:19 right here for us uh they just generated
442:22 that field based on the file so it's a
442:25 count of all the rows really so what I'm
442:28 going to do is I'm just going to walk
442:30 you through uh basically what we're
442:33 looking at some of the things that we're
442:34 going to be using today there will be
442:35 things that I don't talk about but I'm
442:37 going to highlight those in future
442:40 videos when we start using those or
442:42 going over them um and so let's just
442:45 start with the most obvious one it's way
442:47 over here I'm sure you saw it when uh
442:49 this first came up on the screen because
442:52 it has all these different charts and
442:53 visualizations and graphs and uh these
442:57 will become available as you start
442:59 dragging and dropping our data into this
443:03 sheet and so if I go right here it says
443:05 for scatter plots try zero or more
443:07 dimensions two to four measures so what
443:10 our dimensions are are right here what
443:13 our measures are are right down here and
443:15 so typically uh things like say
443:18 genre or names or strings like that
443:21 are going to be these uh dimensions and
443:24 then a lot of times the numerical
443:27 fields are going to be measures
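As a rough way to picture the split just described (string fields tend to land under dimensions, numeric fields under measures), here is a small Python sketch. It is not how Tableau classifies fields internally; the `infer_role` helper, the column names, and the sample values are all made up for illustration.

```python
def infer_role(values):
    """Guess 'measure' if every non-empty value parses as a number, else 'dimension'."""
    non_empty = [value for value in values if value.strip() != ""]
    if not non_empty:
        return "dimension"
    for value in non_empty:
        try:
            float(value)
        except ValueError:
            # At least one value is text, so treat the field as a dimension.
            return "dimension"
    return "measure"


# Hypothetical columns, held as raw CSV text the way a file reader would see them.
sample_columns = {
    "Name": ["Game A", "Game B"],
    "Genre": ["Sports", "Action"],
    "Rank": ["1", "2"],
    "Global_Sales": ["10.5", "7.25"],
}

roles = {name: infer_role(values) for name, values in sample_columns.items()}
for name, role in roles.items():
    print(f"{name}: {role}")
```

A rule this simple will misplace some fields (an ID number is numeric but is rarely something you want to sum), which is why the wording in the video is "typically" rather than "always".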
443:29 next what I want to show you is right
443:32 here so you can take something like
443:35 Global Sales and you can drag it right
443:37 here into your rows and then it takes
443:41 your rows and so it automatically
443:43 created a sum of Global Sales now if we
443:46 take that away and let's say we drag it
443:48 right here it's going to give us a
443:49 column
443:50 now you can also do it right up here you
443:53 don't have to um drag it on screen you
443:56 can
443:57 also just add it to the column or the
444:00 row that's typically what I do it's
444:02 just more intuitive to me um or you can
444:05 drop it in this section right here and
444:07 it does its best to assign it some type
444:10 of um visualization and so
444:13 that's what it always is trying to do it
444:15 is trying to say okay this is what
444:18 you're trying to do let me try to get
444:20 the best visualization for the data that
444:22 you're giving me now while we are here
444:26 um it went down here into Marks and
444:29 Marks is a very important area it's
444:32 where you can add color size text detail
444:35 and tooltip and I'm not going to go
444:36 into what all those are cuz I'm just
444:38 going to show you so let's start pulling
444:40 some fields in here and creating a
444:41 visualization and then I'm going to show
444:43 you how all of that works including
444:45 filters as well so the first thing that
444:47 we are going to look at is Global Sales
444:50 and let's put that in the rows and then
444:54 I'm going to take year and I'm going to
444:56 make that the column and this is
444:59 basically exactly what uh I wanted to do
445:02 now as of right now it has only the year
445:05 and it's looking at Global Sales for
445:08 everything but we want to break that out
445:10 a little bit better I want to break it
445:12 out by let's do genre so different genres
445:16 of games now if I add that right here to
445:19 the columns it is going to break it up
445:22 by year and genre if I add it right here
445:27 it is going to break it out by the year of
445:29 course but then each individual row
445:32 has a different genre that's not what
445:34 we want we want to keep this type of
445:38 line graph uh and what we're going to do
445:41 is we're going to add it to
445:43 Marks and you can't really see it based
445:45 off of these colors but they're all
445:47 different so we have the action genre we
445:50 have the sports genre racing uh role
445:53 playing all these different genres
445:55 within it now we can get rid of that cuz
445:57 we don't need it
445:58 anymore
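What the sheet is doing at this point (a sum of Global Sales for every year, split into one line per genre) can be sketched as a small aggregation in Python. This is only an illustration under assumptions: the field names and the four sample rows are invented, and Tableau performs this aggregation for you when you drop the fields on the shelves.

```python
from collections import defaultdict

# Invented sample rows; the values are chosen so the sums are easy to check.
records = [
    {"Year": 2006, "Genre": "Sports", "Global_Sales": 10.5},
    {"Year": 2006, "Genre": "Action", "Global_Sales": 4.0},
    {"Year": 2006, "Genre": "Sports", "Global_Sales": 2.5},
    {"Year": 2007, "Genre": "Action", "Global_Sales": 6.0},
]


def sum_sales_by_genre_and_year(rows):
    """Add up Global_Sales for each (genre, year) pair: one point on one line."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["Genre"], row["Year"])] += row["Global_Sales"]
    return dict(totals)


totals = sum_sales_by_genre_and_year(records)
for (genre, year), total in sorted(totals.items()):
    print(genre, year, total)
```

Each genre key corresponds to one line in the chart, and each year is a point along that line.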
446:02 uh and this is where
446:04 these marks really come in handy because
446:07 you can start basically doing what you
446:09 want with them so for the genre I want
446:11 to be able to see all these different
446:13 genres with different colors to me that
446:14 just makes the most sense so I'm going
446:16 to put color right here and
446:19 automatically it assigns every single
446:21 genre its own color and gives us
446:24 this legend right over here and so it's
446:27 really easy to see well when you have
446:29 smaller numbers it's much easier but I
446:31 know that red is sports and I can go
446:33 right here and find red and that is
446:35 sports so it makes it a lot easier than
446:39 when it is all the same color blue so
446:41 what you can do after that is you can
446:44 also add things like uh a label to it so
446:47 if we take genre and
446:50 put it on label you can click right here and
446:52 you can get rid of the labels that you
446:53 have and you can see them right down
446:57 here or you can also change uh the font
446:59 so if you want to make it orange or
447:01 whatever color you can do all those same
447:04 things and you can also do things like
447:07 changing where you see these things so
447:09 for action you're going to see it a ton
447:11 because for each year action is
447:14 on the higher end and so you're
447:17 seeing those in those mins and maxes you
447:18 can also do it for a selected area so if
447:21 I come in here and I select it it's then
447:23 going to show me what those are so label
447:26 is really really uh useful really
447:28 helpful let me get rid of that really
447:31 quick uh you can also do it where the
447:33 lines end so line ends is at the
447:35 beginning and the end and you can also
447:38 take that away or put that back on so
447:40 labels are really important labels
447:42 aren't very helpful at
447:43 least I don't find that they're super
447:44 helpful when you're doing things like
447:46 genre so when you're doing your
447:48 dimensions so I'm going to get rid of
447:50 that and I'm actually going to bring
447:53 our Global Sales over here and let's label
447:54 that and right now I think it's labeling
447:57 the uh line ends we want to do the min
448:00 and max now if we do min and max on the
448:03 table it's just going to give us the max
448:05 and the min which is zero and then
448:09 139.0 it's a little bit more useful if
448:11 we do it for each line uh this at least
448:14 gives us some context I probably
448:15 wouldn't do this in an actual
448:17 visualization but to give you some um
448:20 understanding of just how it works so now I
448:22 know that um right over here the min and
448:24 the max or sorry the max for
448:27 these for action and for sports is right
448:31 around 138 139 so it's pretty easy to
448:33 see um and you can again go in here and
448:38 you can remove the max or remove the
448:40 mins whichever one you feel is best uh
448:43 you'll probably keep the maximums in
448:45 there for each category and so this is
448:48 really quickly becoming uh a pretty
448:50 usable visualization and that's not the
448:53 only label that you can add we still are
448:55 using year over here so we can always
448:57 drop year in there as well we'll create
449:00 a label and so now we have let's see for
449:02 this one is the puzzle genre so we also
449:06 have the year that it had the maximum uh
449:10 sales and so you know just some things
449:13 that you can do you don't have to add
449:15 that now let's go up here and we're
449:17 going to take a look at filters because
449:19 filters are really important you know if
449:21 you are making this for a client or you're
449:24 making this for somebody you want them
449:26 to be able to filter down uh to very
449:29 specific information that they want to
449:31 see so let's take uh the platform lots
449:34 of different
449:35 platforms um as you can see you know PS4
449:38 Xbox um if you're familiar with these
449:41 we'll click all of these um and we'll
449:44 click okay so now this is an option as a
449:48 filter and all we're going to do is
449:49 we're going to click on this arrow right
449:51 here and we're going to say show
449:53 filter now right now all of them are
449:56 selected so every single one is being
449:59 taken into account for this
450:01 visualization but let's say we come down
450:03 here and we say okay I don't want to see
450:05 sales for any of these PS the original
450:09 PlayStation 2 3 or 4 so I'm going
450:10 to get rid of this one this one this one
450:14 and this one and you could immediately
450:16 see the changes that were happening
450:18 so now none of the numbers none of those
450:21 sales are being accounted for and
450:23 being added to the sum of Global Sales
450:26 right here at
450:27 all so that is just how a filter uh can
450:32 work and you can also do that and you
450:37 can get rid of all of them and you can
450:39 go in and actually just pick very
450:41 specific sales so if you only want to
450:43 see the PlayStation sales you can go in
450:45 there and do that as well so really
450:48 really handy filters are things that you
450:51 at least want to have as an option for
450:53 most of your visualizations at
450:56 least that's what I found especially
450:57 when you're doing client facing work
450:58 they like to uh get in there and mess
451:00 around and look at it
451:02 in different ways and so that's one that
451:05 I think is really useful to
451:09 have the very last thing that we want to
451:12 do is we want to actually add this to a
451:16 dashboard now let's say we come
451:19 right down here and we add a new
451:20 worksheet and actually we might change
451:22 one more thing on that last one but
451:24 we'll just make a really simple one um
451:26 we'll just give it genre and we'll give
451:29 it Global Sales as the
451:31 rows um and this nifty button right up
451:34 here which is a sorting button so I'm
451:37 going to sort like that I'm going to add
451:40 the genre in just as we did I'll give it
451:43 different colors perfect now we have two
451:46 really quick different visualizations
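The second sheet (one total per genre, then the sort button) amounts to a group, sum, and sort. The Python sketch below illustrates that idea only. The sample rows and field names are invented, and descending order is an assumption, since the video does not say which direction the sort button applied.

```python
# Invented sample rows, not figures from the real data set.
records = [
    {"Genre": "Sports", "Global_Sales": 10.5},
    {"Genre": "Action", "Global_Sales": 4.0},
    {"Genre": "Sports", "Global_Sales": 2.5},
    {"Genre": "Action", "Global_Sales": 6.0},
    {"Genre": "Puzzle", "Global_Sales": 1.5},
]


def genre_totals_sorted(rows):
    """Sum Global_Sales per genre and return (genre, total) pairs, largest first."""
    totals = {}
    for row in rows:
        totals[row["Genre"]] = totals.get(row["Genre"], 0.0) + row["Global_Sales"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = genre_totals_sorted(records)
for genre, total in ranking:
    print(genre, total)
```

Each pair in `ranking` corresponds to one bar in the sheet, in the order the bars would appear after sorting.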
451:48 right what I want to do is just show you
451:50 how to combine those because what you
451:52 are going to do is you're going to
451:55 actually come in here and you're going
451:57 to do new dashboard that's what this
451:59 button is right here now when we come in
452:01 here the size is extremely small it's
452:04 very easy to fix that all we're going to
452:06 do is click right here we're going to go
452:08 to this range or this dropdown and we're
452:10 going to click automatic so now it is a
452:13 much larger size for us to actually drop
452:15 our visualizations
452:16 into uh and let's put sheet one
452:20 and we'll put uh let's put it up top so
452:23 now it looks a little bit like this uh
452:26 not perfect but again if I wanted to
452:28 make this look a lot better I definitely
452:30 would and then you can go over here and
452:33 you can rename these things you can also
452:34 do that back when we were in our actual
452:36 worksheets but you can also do it here
452:38 as well and then start um you know
452:41 customizing it and building it out
452:43 that's not what this video is for that
452:44 is for the last video we're going to build
452:46 an entire dashboard it'll be kind of
452:48 like a small project you can put that in
452:49 your portfolio um if you have gotten
452:52 this far and you want to jump straight
452:55 into it and you don't want to wait for
452:56 these other videos to come out or
452:58 you just want to jump straight
453:00 into creating an entire portfolio
453:01 project I have an entire portfolio
453:04 project series that covers SQL Python
453:07 and Tableau and so go check out that
453:09 series I have one video dedicated to
453:12 Tableau it's like 45 minutes or an hour
453:14 long and it covers a lot of the things
453:17 that we're going to cover in here as well
453:19 as a few other things but I appreciate
453:22 you checking out this video in future
453:24 videos we're going to be going over things
453:26 like creating bins calculated fields
453:28 doing joins and then creating a final
453:30 project and putting it all together so
453:33 thank you so much for joining me I
453:34 really appreciate it if you like this
453:36 video be sure to like and subscribe
453:38 below and I will see you in the next video
453:42 [Music]
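As a recap of the platform filter from this video (unchecking the PlayStation platforms so their sales drop out of the sum of Global Sales), here is a minimal Python sketch. The platform codes `PS`, `PS2`, `PS3`, `PS4`, the other field names, and the sample rows are assumptions made for illustration; they are not read from the actual file.

```python
# Invented sample rows; platform codes are assumed, not taken from the real file.
records = [
    {"Platform": "Wii", "Global_Sales": 10.5},
    {"Platform": "PS2", "Global_Sales": 7.25},
    {"Platform": "X360", "Global_Sales": 3.0},
    {"Platform": "PS4", "Global_Sales": 2.25},
    {"Platform": "PS", "Global_Sales": 1.0},
]


def total_sales(rows, excluded_platforms=frozenset()):
    """Sum Global_Sales over rows whose platform is still checked in the filter."""
    return sum(
        row["Global_Sales"]
        for row in rows
        if row["Platform"] not in excluded_platforms
    )


everything = total_sales(records)
without_playstation = total_sales(records, {"PS", "PS2", "PS3", "PS4"})
print(everything, without_playstation)
```

Passing an empty set corresponds to every box being checked, which is the state the filter starts in.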
453:54 what's going on everybody welcome back to the Tableau tutorial series in this
453:56 video we're going to be going over bins
453:58 and calculated fields
454:04 [Music] all right so let's jump right
454:06 into it the first thing that we're going
454:08 to look at are bins and bins are
454:10 basically just groupings or ranges of
454:12 numerical values so we cannot create
454:15 bins uh for genre name platform or
454:18 anything like that we have to do
454:19 something with this sign right here
454:21 which means that it is a numeric so year
454:24 or all this sales data or this ranking
454:26 data and we're going to use what we
454:28 worked on in our very first tutorial and
454:31 so what we're going to be using to kind
454:33 of demonstrate how bins work is this
454:35 year right down here so right now we
454:37 have a range of 1993 all the way up to
454:39 2018 and we're going to create some bins
454:42 to group and create ranges for these
454:45 years and it's pretty simple all we're
454:47 going to do is I'm going to come right
454:49 over here to year and this little drop
454:51 down on the side and we're going to go
454:52 down to create and go down to
454:56 bins now it's going to say size of
454:58 bins and it's going to give you a
455:01 Bin and it's going to give you a recommendation based off of the
455:03 recommendation based off of the information that is already provided the
455:04 information that is already provided the Min and the max the ranges of these
455:07 Min and the max the ranges of these values you know you don't have to do
455:09 values you know you don't have to do this but usually um it it does give some
455:12 this but usually um it it does give some good estimation on what you might be
455:14 good estimation on what you might be considering if you were thinking hey
455:16 considering if you were thinking hey maybe do a bit of like 20 and they're
455:17 maybe do a bit of like 20 and they're recommending two think about why they
455:19 recommending two think about why they might be doing that we're going to
455:21 might be doing that we're going to change ours to five and you can always
455:24 change ours to five and you can always change what this field is going to be
455:25 change what this field is going to be I'm just going to give it an old
455:27 I'm just going to give it an old exclamation point just to um really
455:29 exclamation point just to um really spice things up here so we're going to
455:31 spice things up here so we're going to click okay and as you can see it adds it
455:35 click okay and as you can see it adds it right up here is no longer um it is no
455:38 right up here is no longer um it is no longer a numeric now it is a categorical
455:41 longer a numeric now it is a categorical so it now it's this is no longer just uh
455:44 so it now it's this is no longer just uh 1 2 3 4 five its ranges its groups and
455:48 1 2 3 4 five its ranges its groups and we're going to get rid of this year
455:49 we're going to get rid of this year really quick actually let's keep it up
455:51 really quick actually let's keep it up there for a second uh see what happens
455:53 there for a second uh see what happens but we're going to bring this up and
455:55 but we're going to bring this up and we'll get rid of this year and this is
455:59 we'll get rid of this year and this is is what kind of it spits out for us now
456:01 Now, I did look at the data when I was
456:03 prepping for this. There are some nulls
456:05 in the years, and so all we're going
456:07 to do for this is just go
456:08 like this and exclude the
456:11 nulls. Probably not something you
456:14 should be doing if you're doing this
456:15 for work, but this is for demonstration
456:17 purposes, so we can do it however we want.
456:20 But as you can see, we now have these
456:23 ranges. So this range starts at
456:26 1990 and includes 1990 all the way up
456:30 to 1994, and then it's 1995 to
456:33 1999. And so, just really quickly, we can
456:37 tell that the years 2000 to 2004 were a
456:40 huge, huge group of
456:44 years for game sales. These are the
456:46 global sales for these video games.
456:50 And so it is really helpful. It's very
456:53 useful. You can do this on a lot of
456:55 different information. We could do this
456:56 on the sales data, you can do this on age,
456:59 you can do it on years like we did, and
457:01 it can be very, very useful. And so,
457:03 really quickly, that is how bins work. I
457:07 would say it's pretty straightforward.
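For readers who want to see the idea behind bins outside of Tableau, here is a minimal Python sketch. It assumes fixed-width bins aligned to multiples of the bin size, which matches the 1990 to 1994, 1995 to 1999, and 2000 to 2004 ranges in the demo. The rows and the `bin_start` helper are made up for illustration and are not taken from the video's data set.

```python
def bin_start(year, size=5):
    """Return the first year of the fixed-width bin that contains `year`."""
    return (year // size) * size

# Hypothetical (year, global sales in millions) rows; None stands in for a null year.
rows = [(1990, 10.0), (1994, 5.0), (1995, 7.5), (2001, 20.0), (2004, 12.5), (None, 3.0)]

totals = {}
for year, sales in rows:
    if year is None:
        # Mirror the demo by excluding nulls (usually not advisable for real work).
        continue
    start = bin_start(year)
    totals[start] = totals.get(start, 0.0) + sales

# Print one line per bin, labelled with its inclusive year range.
for start in sorted(totals):
    print(f"{start}-{start + 4}: {totals[start]}")
```

Changing `size` to 2 or 20 shows how the grouping gets finer or coarser, which is the trade-off behind the suggested bin size.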
457:09 Now this is a perfect time to segue into
457:11 the next part of the video, which is
457:13 calculated fields. Right over here on
457:16 this left-hand side, we see that the
457:18 global sales, which are in millions, go
457:19 all the way up to 900 million, and we
457:22 created these beautiful bins right down
457:24 here. But let's look within these, from
457:27 1999 to 2015; let's see which of these
457:31 has the highest percentage. Of course
457:32 it's going to be this one, but we can do
457:35 something called a quick table
457:37 calculation. We'll create our own
457:39 calculation later; I'll show you how to
457:40 do that. But we're going to do a quick
457:42 table calculation, and we're going to do
457:44 the percent of total. And so now we have
457:47 these bins, and instead of just seeing
457:49 the total amount of sales that they had,
457:51 we see the actual percentages based off
457:53 these year ranges, which is really useful,
457:56 something that you could absolutely put
457:58 in some real work that you do for a
458:00 client.
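As a rough illustration of what a percent-of-total calculation does, the Python sketch below divides each bin's total by the grand total. The bin labels and sales figures are hypothetical placeholders, not numbers from the video.

```python
# Hypothetical global sales totals (in millions) per five-year bin.
sales_by_bin = {"1995-1999": 200.0, "2000-2004": 500.0, "2005-2009": 300.0}

# Percent of total: each bin's share of the sum across all bins.
grand_total = sum(sales_by_bin.values())
percent_of_total = {
    label: 100.0 * sales / grand_total for label, sales in sales_by_bin.items()
}

for label, pct in percent_of_total.items():
    print(f"{label}: {pct:.1f}%")
```

The shares always add up to 100, which is why this view makes it easy to compare bins regardless of the absolute sales numbers.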
458:02 Now, really quick, just to show you something you can do: if you hold
458:04 Control and drag this over here, you
458:07 can actually save that calculation. So we
458:09 can say
458:11 percentage of global sales, and that
458:15 actually saves it as a
458:17 measure for us.
458:19 So that was a quick table calculation, but let's look at how to
458:20 actually create a calculated field. If
458:24 we do this right here, what is going to
458:26 come up is just the global sales, and you
458:28 can do a lot of what you would basically
458:30 do in Excel: multiplication, division,
458:32 subtraction, a few other things. But we're
458:34 going to keep it super simple
458:36 today. All I'm going to do is
458:38 take Global Sales and
458:40 subtract; I'm going to do an open bracket
458:42 and say EU Sales, and it
458:45 autocompletes for me. I'm going to click
458:47 OK, and it created Calculation2. I'm going
458:50 to come in here and just
458:52 call it Global Sales
458:56 minus EU
458:58 Sales. And let's drag this over. These are
459:02 different: one is a percentage, one is in
459:07 terms of sum, and so I'm just going to
459:10 bring this in right here. And so now we
459:12 are comparing against the same thing. And
459:15 if we look at the global sales, we have
459:16 probably right around 950 million-ish
459:20 in this 2000 to 2004 bin, and for Global
459:24 Sales minus the EU Sales we're looking
459:26 at, you know, 650 million. So there is a
459:29 noticeable difference. And this is just
459:32 one of the ways that you can use
459:34 calculated fields, to just show
459:36 the difference between two numbers, or
459:38 you can do more advanced calculations
459:40 depending on the data that you actually
459:41 have.
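The calculated field described above subtracts EU Sales from Global Sales. A minimal Python sketch of the same idea follows: a new column is computed for every row and then summed like any other measure. The game names, values, and key names (`global_sales`, `eu_sales`, `global_minus_eu`) are assumptions made for this example.

```python
# Hypothetical rows; sales are in millions.
games = [
    {"name": "Game A", "global_sales": 30.0, "eu_sales": 9.0},
    {"name": "Game B", "global_sales": 12.5, "eu_sales": 2.5},
]

# Row-level calculation: the new field is computed for every record.
for game in games:
    game["global_minus_eu"] = game["global_sales"] - game["eu_sales"]

# Aggregating the new field afterwards, the way a SUM would on a chart.
total_global = sum(game["global_sales"] for game in games)
total_minus_eu = sum(game["global_minus_eu"] for game in games)
print(total_global, total_minus_eu)
```

Computing per row and then summing gives the same result here as subtracting the two sums, but the distinction starts to matter for calculations such as ratios.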
459:43 So that's it for this video. I hope you learned a little bit more about bins
459:45 and calculated fields. In the next video,
459:48 we're going to be looking at a ton of
459:49 different visualizations and graphs and
459:51 charts, and just exploring what options
459:54 really are out there for visualizing our
459:56 data. Thank you guys so much for joining
459:58 me, I really appreciate it. If you like
460:01 this video, be sure to like and subscribe
460:02 below, and I will see you in the next
460:06 [Music]
460:16 video. What's going on, everybody? Welcome
460:18 back to the Tableau tutorial series. In
460:20 this video we're going to be looking at
460:21 lots of different visualizations,
460:23 including the scatter plot and density
460:31 [Music] maps. Now, before we jump into the
460:33 tutorial, I have some very exciting news.
460:35 In just two days, on October 7th, I'm going
460:37 to be partnering with Alteryx to host a
460:39 webinar. This webinar is completely for
460:42 people who are wanting to change
460:43 careers to become a data analyst. Now, you
460:46 did hear that right: I will be the host
460:47 of the event, but we will be bringing
460:49 on guests as well who are industry
460:51 experts who actually changed careers to
460:52 become data analysts, much like myself.
460:54 They'll be sharing their stories of how
460:56 they actually transitioned careers, along
460:58 with the tools that they found extremely
460:59 useful and helpful to make that switch,
461:01 and they'll be giving lots of advice
461:03 along the way. So if you are somebody who
461:05 is wanting to change careers to become a
461:07 data analyst, or just wanting to learn
461:09 about data analytics, this is an absolutely
461:11 fantastic place to learn a lot more
461:13 about that. I will leave a link in the
461:14 description, so be sure to go and sign up
461:16 for that. Again, I'm going to be there,
461:18 so it should be really fun. Without
461:20 further ado, let's jump onto my screen
461:21 and start the tutorial.
461:23 Now we are about to look at a ton of different
461:25 visualizations. Over here you can see
461:28 just an array of them, but not all of
461:31 them are ones that I actually think are
461:34 useful or ones that I would actually
461:35 recommend using. And so I'm going to take
461:37 you through some of the ones that I
461:39 absolutely think are worth learning and
461:41 using and trying out, and I'm just
461:44 going to show you how I
461:47 might use them, how they might look, how
461:49 you can navigate them a little bit. Now,
461:51 before we do that, we do need to go
461:52 download one data set. It's this
461:55 Starbucks Locations Worldwide. Yes, we're
461:57 going to do a little bit of longitude and
461:59 latitude here. All we have to do is
462:02 click this download button and it will
462:05 download. We're going to do that into
462:07 Downloads; we'll save that. Yeah, I've
462:10 already done that, but you know, I'm doing
462:13 this with you guys, I'm doing it for you.
462:15 So let's go to our
462:17 downloads. Now that we have it here, we want
462:19 to come in here; we're going to copy it,
462:22 or you can cut it, and then we're
462:25 going to paste it here. Yeah, replace it.
462:28 Perfect, and now we have it ready to go.
462:32 We'll come in here. Let's do a new sheet,
462:34 and I already have it in there, but
462:36 I'm just going to show you what I would
462:37 do: New Data Source, we'll do Text File,
462:41 we'll do directory, and we will open
462:48 it. And let's see what data we have in here before we actually begin. Just
462:50 super quickly, we have the brand, so
462:55 whatever company has it, and then a bunch
462:57 of location information: street
463:00 address, city, the state. This is all in
463:04 the United States. So that's basically it.
463:07 And what we are going to do is
463:09 go over to this Sheet
463:11 3, and we have this directory2;
463:14 that's the one I just pulled in, the exact
463:16 same thing as directory.
463:18 So the first visualization that we are going to
463:19 look at is a bar and line graph. So what
463:21 we're going to take is the year right
463:23 here, take these Global Sales and these
463:27 NA
463:28 Sales, and we're going to be doing this
463:30 one right here. So this has a combination
463:33 of two separate types of
463:36 visualizations. Sometimes you just
463:38 have lines, sometimes you just have these
463:40 bar graphs or bar charts,
463:43 and we're combining the two. It's
463:45 very nice; I like how this looks. Now, if
463:48 you notice, if I put this NA Sales behind
463:51 it, it kind of cuts off, so now this
463:53 Global Sales is in front. We're going to
463:56 put that back; I just wanted to
463:58 show you that.
464:01 Right here there's All, SUM of Global Sales, and SUM of NA Sales. So
464:03 if we go into this All and click this
464:05 drop-down, we can change it to a line;
464:08 we can change it to basically whatever we
464:10 want. I just hit Ctrl+Z to reverse that.
464:13 But what we can do is go in here
464:15 and change this color. Let's
464:18 see if we can just make it red. Is that
464:21 [Music]
464:23 possible? See what I did: I made it orange.
464:25 That works for me, just something to
464:27 stick out a little bit more. Choose
464:29 whatever color you want. And this is a
464:31 really nice visualization; this is one
464:33 that I have used in the past. We're
464:35 looking at Global Sales versus the NA
464:36 Sales, and so it's very easy to see the
464:39 distinction between the two, and how one
464:41 was doing in a specific year versus how the
464:44 other one was doing in that same year.
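Behind a bar-and-line view like this one sit two measures aggregated over the same dimension: SUM of Global Sales and SUM of NA Sales per year. The Python sketch below builds those two series from made-up rows. It only prepares the numbers and prints a crude text bar, since the actual drawing is left to the charting tool.

```python
from collections import defaultdict

# Hypothetical rows: (year, global sales, NA sales), all in millions.
rows = [(2000, 10.0, 4.0), (2000, 6.0, 3.0), (2001, 8.0, 5.0), (2001, 2.0, 0.5)]

global_by_year = defaultdict(float)
na_by_year = defaultdict(float)
for year, global_sales, na_sales in rows:
    global_by_year[year] += global_sales  # series drawn as bars
    na_by_year[year] += na_sales          # series drawn as a line

for year in sorted(global_by_year):
    bar = "#" * int(global_by_year[year])  # one '#' per million, for illustration
    print(f"{year} {bar} global={global_by_year[year]} na={na_by_year[year]}")
```

Because both series share the year axis, it is easy to compare how each measure did in the same year, which is the point made in the video.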
464:46 So I really like this. If you want to do
464:48 something like keeping it consistent,
464:50 you can do two bars. I don't really like
464:53 this one as much. And again,
464:56 you can really change it up; there are
464:58 lots of different ones that you can do.
465:00 Again, I prefer the line, but, you know, do
465:03 whatever you think is best. I'm going to
465:05 change it back because this is not how I
465:06 want to keep it. But there you go. So that
465:09 is the first one that we are going to
465:11 look at. Let's move on to the second one,
465:14 and we actually will be using our
465:16 Starbucks data here.
465:18 Starbucks data here now when you bring in data that has um
465:22 now when you bring in data that has um any type of map or or um address or
465:26 any type of map or or um address or postal code or things like that or or
465:28 postal code or things like that or or country it's typically going to create
465:30 country it's typically going to create this latitude and longitude it's going
465:32 this latitude and longitude it's going to generate that now what we want to do
465:35 to generate that now what we want to do is bring this longitude right up here
465:37 is bring this longitude right up here and this latitude right
465:40 and this latitude right there and if you do the show me right
465:43 there and if you do the show me right now it's giving us this but what we want
465:45 now it's giving us this but what we want to do is add what we're looking for so
465:48 to do is add what we're looking for so what will we actually be trying to
465:51 what will we actually be trying to search for on this map you can do
465:53 search for on this map you can do anything from like a postal code um and
465:57 anything from like a postal code um and it will drag us right here let's come
466:00 it will drag us right here let's come over to this this allows us to kind of
466:03 over to this this allows us to kind of scroll around a little bit um we're
466:05 scroll around a little bit um we're going to mess around with this one for
466:07 going to mess around with this one for just a little bit and me see if I
466:11 just a little bit and me see if I can that's nice that might be too big
466:14 can that's nice that might be too big let me back up one so at least in the
466:18 let me back up one so at least in the Continental us a little bit down here
466:20 Continental us a little bit down here this these are the postal codes so right
466:22 this these are the postal codes so right now we're looking at post codes uh
466:25 now we're looking at post codes uh and there are a lot that you can do with
466:28 and there are a lot that you can do with this um really color will make almost no
466:31 this um really color will make almost no difference it just becomes this mess so
466:33 difference it just becomes this mess so you don't typically want to do something
466:34 you don't typically want to do something like that at least not for this let's go
466:38 like that at least not for this let's go to size and if we make it really small
466:43 to size and if we make it really small you can kind of see these groupings
466:45 you can kind of see these groupings these pairings um typically of like
466:47 these pairings um typically of like larger cities or major major
466:49 larger cities or major major metropolitan areas and so you can do
466:53 metropolitan areas and so you can do this and it's and it's really really
466:54 this and it's and it's really really easy I don't recommend uh labeling this
466:57 easy I don't recommend uh labeling this I don't even know if it'll do it um it
466:59 I don't even know if it'll do it um it would be an absolute mess to try to
467:01 would be an absolute mess to try to label all these
467:02 label all these postcodes well let's bring this out and
467:05 Well, let's bring this out, and let's bring this State/Province field in.
467:08 Now, right now we have these tiny
467:10 dots on here, and I think what we
467:14 want to do is not increase the size,
467:19 but over here we want to actually do
467:21 this and make it a map. So now it's going
467:23 to fill in all the states. We can, you
467:26 know, why not, add some color here.
467:29 But we
467:30 can... It has them numbered. I didn't think
467:33 they were numbered.
467:35 Oh, that's interesting. I haven't seen
467:37 that; I didn't look at that before. I
467:39 just found that interesting. But now we
467:41 can see what states Starbucks is
467:45 in, and as you can see, they're in all 50
467:47 states. It's something interesting to
467:50 look at, to think about.
467:52 Now, if we go right up here, we can again choose a
467:54 different type, and we're going to go to
467:56 Density. Right now it's just
467:59 doing a density on the state.
468:01 We're going to get rid of that and
468:02 bring back postal code. I'm just
468:04 switching it up on you a little bit. And
468:07 you can do it as small or as big as
468:09 you'd like. I like to do
468:11 somewhere in the middle; probably
468:14 right about there is fine. I
468:18 don't think it's going to make sense to
468:19 really add any color here. Again, all
468:20 these postal codes are different, so it's
468:22 just going to be a complete mishmash. But
468:24 this is how you can use a
468:26 density map, and you can do this with
468:29 countries, you can do this with postal
468:31 codes, you can do this with any type of
468:33 address or location-based
468:36 data. So that is how you can use a map.
468:39 Again, there are lots of different ways to
468:41 use a map, and so I'm not going to show
468:44 you every single way, but in a really
468:45 brief way, this is how you can use a map
468:47 to visualize your data that
468:49 does have location-based information
468:52 in it.
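To get a feel for what a density view summarizes, the Python sketch below counts how many points fall into each one-degree latitude/longitude cell. This grid count is a simplified stand-in chosen for illustration and is not how Tableau renders density marks. The coordinates and the `grid_cell` helper are hypothetical.

```python
import math
from collections import Counter

def grid_cell(lat, lon, cell_deg=1.0):
    """Map a coordinate to the (row, column) index of its grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

# Hypothetical store coordinates as (latitude, longitude).
stores = [(47.61, -122.33), (47.62, -122.35), (47.25, -122.44), (40.71, -74.01)]

# Count stores per cell; busier cells correspond to "hotter" areas on a density map.
density = Counter(grid_cell(lat, lon) for lat, lon in stores)

for cell, count in density.most_common():
    print(cell, count)
```

A smaller `cell_deg` plays a role similar to shrinking the mark size in the demo: clusters around large metropolitan areas become easier to tell apart.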
468:55 So let's go over to Sheet 3, and this data that we have over here
468:57 just allows for a lot of different types
468:59 of visualizations, so we're going to use
469:01 this one. There are lots of other
469:04 ones that you might see out there, like
469:06 this one right here. We obviously
469:08 wouldn't be using this; we might do
469:10 something like this: change the
469:15 label, and maybe add... Why have both of
469:18 these in here? Let's get rid of this.
469:21 Oops, that's not what I meant. Let's
469:23 actually add that: let's do the SUM of
469:25 Global Sales, and we'll just make that
469:27 into a label as well.
469:30 So that's what you can do with these and
469:33 how you're able to use them and
469:35 visualize them. Again,
469:38 you'll see these often, but these are not
469:40 often ones that I would recommend you
469:42 use. That's very similar to these packed
469:44 bubbles. You can add Global Sales
469:48 in here, again add the label. It's just that
469:53 sometimes the information that it's trying to tell
469:56 you is not as straightforward,
469:58 right? You kind of have to search for
470:00 it a little bit, you kind of have to look
470:02 around. But you can find some good
470:05 visualizations in here for very specific
470:08 types of data, and so these are just ones
470:10 to consider.
470:13 One that you'll see all the time is this guy right here.
470:18 Let me see if I can expand this a
470:20 little bit, because this
470:22 is very small. Let's see, we have the... I
470:28 just want Global
470:30 Sales, and let's label
470:33 that. The
470:36 size... How do I expand this? I haven't done
470:40 this in a while. Let me just expand this.
470:42 I don't use pie charts. What is
470:46 happening? This is an incredibly large pie
470:50 chart. Oh my gosh,
470:53 this is becoming a problem. There we go.
470:56 And what I actually wanted to do was
470:57 label the genre as well, as I've been
471:00 doing in all the other
471:02 ones, and we'll label this. Now look,
471:06 whether you are a fan of pie charts or
471:09 not, you have to understand that people
471:11 use them. Some people just like how
471:14 they look, and for certain data they can do
471:17 well. For things that have a lot of
471:20 different groupings or categories, it
471:23 usually isn't super great, but it does
471:26 give you some type of order of things,
471:29 give you a quick glance, and people use
471:31 them, right? So let's not pretend like
471:35 it's the hideous stepchild.
471:37 All right, people use it, people have it
471:40 in their dashboards and their
471:41 visualizations all over, so it's best to
471:43 just know what they look like, know how
471:45 to do them, know how to use them best.
471:48 Again, I'm not a super huge fan of
471:50 it myself; I've used it once or twice.
471:54 it myself I've used it once or twice but one to look out for and again you can
471:57 one to look out for and again you can come over to here and use is called a
471:59 come over to here and use is called a box and a whisker plot um it's good for
472:02 box and a whisker plot um it's good for these large um distributions you know
472:06 these large um distributions you know this is
472:07 this is like the median upper upper lower lower
472:11 like the median upper upper lower lower I don't use these a lot but I know a lot
472:12 I don't use these a lot but I know a lot of people who love them something to
472:16 of people who love them something to just look at and or mess around with it
472:18 just look at and or mess around with it a little bit it's pretty I think
472:21 a little bit it's pretty I think straightforward and it does give you
472:22 straightforward and it does give you some good insight into your data if you
472:25 some good insight into your data if you know how to use it now there is one last
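The numbers a box and whisker plot draws (median, upper and lower box edges, upper and lower whiskers) can be worked out by hand. The sketch below uses pandas rather than Tableau, the values are made up for illustration, and the whisker rule (furthest points within 1.5 times the interquartile range) is a common convention assumed here; Tableau's own setting may differ.

```python
import pandas as pd

# Made-up values purely for illustration; they are not from the video's dataset.
values = pd.Series([2, 4, 4, 5, 7, 9, 30])

# The box: lower quartile, median, upper quartile.
q1 = values.quantile(0.25)
median = values.quantile(0.50)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Assumed whisker rule: reach to the furthest points within 1.5 * IQR of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = values[(values >= lower_fence) & (values <= upper_fence)]
outliers = values[(values < lower_fence) | (values > upper_fence)]

print("box:", q1, median, q3)
print("whiskers:", inside.min(), inside.max())
print("outliers:", outliers.tolist())
```

Points outside the fences are the ones a box plot typically draws as separate dots.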
472:27 one that I want to show you I'm just
472:29 going to create it on a new sheet make
472:31 it easy uh we'll do year here we'll do
472:35 sum of let's do NA sales why
472:39 not and we are going to make this like
472:43 this now it's very similar to a line
472:45 chart but when we break it out by the
472:48 genre and we add some color you know
472:52 it's just a different way to visualize
472:54 this information you can uh you know
472:58 potentially add some stuff in here like
473:00 some labels if you uh want to depending
473:03 on how it looks for you but this is just
473:06 another way to visualize the data so
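Behind a chart like this one sits a simple aggregation: the sum of NA sales for each year, split by genre. Here is a rough pandas sketch of that shape. The column names `year`, `genre`, and `na_sales` and all of the rows are assumptions made up for illustration, not the video's dataset.

```python
import pandas as pd

# Made-up rows; the real workbook has many more years and genres.
sales = pd.DataFrame({
    "year": [2006, 2006, 2006, 2007, 2007, 2007],
    "genre": ["Sports", "Sports", "Action", "Sports", "Action", "Action"],
    "na_sales": [10.0, 5.0, 3.0, 4.0, 6.0, 1.0],
})

# One row per year, one column per genre: the series that each get their own color.
by_year_genre = sales.groupby(["year", "genre"])["na_sales"].sum().unstack("genre")
print(by_year_genre)
```

Each column of the result corresponds to one colored band or line in the chart.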
473:09 wanting to give you guys some options
473:11 wanting to give you some things that you
473:13 might want to look at if you haven't
473:16 already used these before these are
473:18 ones every single one that I've
473:19 showed you are ones that I've at least
473:21 used once um this one I maybe have
473:24 literally only used once but the first
473:27 ones that I showed you the ones I
473:28 pointed out as the ones that I really
473:30 wanted you to know are great
473:33 visualizations to learn how to use and
473:36 learn how to make useful for the data
473:38 that you have with that being said that
473:39 is all that we are looking at in this
473:41 video again I tried to keep it super
473:43 easy just wanted to show you some
473:44 different visualizations the data that
473:46 you can use to get those visualizations
473:48 and just some other options in case you
473:50 wanted to get a little bit uh
473:52 spontaneous a little bit out there a
473:54 little bit funky uh to show your boss or
473:56 something like that thank you guys so
473:58 much for watching I really appreciate it
474:00 if you like this video be sure to like
474:02 and subscribe below and I will see you
474:04 in the next
474:17 video [Music]
474:21 what's going on everybody welcome back to another video today we're looking at
474:23 joins in
474:29 [Music] Tableau now before we get into the
474:31 tutorial I want to give a huge shout out
474:32 to today's sponsor and that is Udemy
474:35 they're having a massive Black Friday
474:37 sale and so everything is about 85% off
474:39 so if you've been looking at a course
474:41 now is the time to buy it if you are
474:43 looking at learning and taking an actual
474:45 full Tableau course there are fantastic
474:48 ones on Udemy that I have taken myself
474:50 so be sure to go and check out Udemy
474:52 while they're having this huge sale I
474:53 will include a link in the description
474:54 if you want to check them out now let's
474:56 get into the tutorial all right let's
474:58 get started and first we're going to
474:59 start off in Excel I'm going to kind of
475:01 walk you through the data that we're
475:02 working with and then we're going to put
475:04 it into Tableau and I'm going to show
475:05 you how to do all those joins in Tableau
475:08 so the first table that we have is this
475:09 demographics table we have employee ID
475:12 name of employee employee age and
475:14 employee gender now look right here
475:16 because this will be important uh going
475:18 forward in the demographics table we
475:21 have 10 uh individuals and they each
475:24 have an employee ID now when we go to
475:26 the job title we have our employee ID
475:29 employee name and the job title but this
475:32 one is missing Ryan Howard is missing
475:34 his employee ID and then the very last
475:37 one there are only seven employee IDs
475:41 and no names um and so we're going to
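For anyone who wants to poke at the same shape of data outside Excel, here is a minimal sketch of the three sheets as pandas DataFrames. pandas is a stand-in and not part of the video. Apart from Ryan Howard's name and the 1010 ID, every value is a made-up placeholder, and the tables are cut down to a few rows.

```python
import pandas as pd

# Placeholder rows: only "Ryan Howard" and ID 1010 come from the video.
demographics = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, 1010],
    "name_of_employee": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
})
job_titles = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, None],  # Ryan Howard's ID is blank here
    "employee_name": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
    "job_title": ["Title A", "Title B", "Title C", "Title D"],
})
salary = pd.DataFrame({
    "employee_id": [1001, 1002, 1010],  # fewer IDs than demographics, no names
    "salary": [45000, 36000, 42000],
})

# Row counts and blank IDs are worth checking before any join.
summary = {
    name: (len(table), int(table["employee_id"].isna().sum()))
    for name, table in [
        ("demographics", demographics),
        ("job_titles", job_titles),
        ("salary", salary),
    ]
}
print(summary)
```

The tuple for each table is (row count, number of blank employee IDs), which is exactly the detail that matters once the joins start.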
475:43 use all of that and I'm going to show
475:44 you how to actually do the joins in
475:47 Tableau Tableau does a really fantastic
475:49 job of visualizing for you so it takes a
475:52 lot of the guesswork out um I am going
475:54 to include a link to my joins video in
475:56 SQL because these two are very closely
475:58 connected and if you understand how
476:00 the joins work in SQL you'll
476:03 understand how the joins work in Tableau
476:05 it's almost the exact same thing so with
476:09 that being said let's jump over to
476:11 Tableau so I'm going to pull this up
476:13 going to go right over here and now we have
476:16 uh where we can connect to our
476:18 data and so we're going to click
476:20 Microsoft Excel I'm going to scroll down
476:22 here to Tableau joins file I'm going to
476:25 open this up and I have it open so I
476:26 can't use it so let me get rid of that
476:29 and let's open it again perfect so now
476:33 what we're going to do and I'm going to
476:34 show you how to actually open up the
476:35 joins um in a second but what you need
476:37 to understand is when you first come
476:39 here Tableau doesn't automatically allow
476:43 you to use the joins they use
476:45 something called relationships and there
476:47 are joins on the back end but they call
476:49 it relationships because they are
476:51 inferring all of these things they're
476:52 trying to go in and make that inference
476:54 for you so it takes a lot of the work
476:57 off of you and most of the time that
476:58 works and you know you just plug
477:00 these two things in here like
477:02 demographics and the job title and it is
477:06 going to you know help you build those
477:09 what they call relationships and you can
477:11 click on this and learn how the
477:12 relationships differ from joins again
477:14 there's not a huge difference but it's
477:16 not as customizable and you can't
477:18 as easily do left joins or full joins or
477:21 all these things that we're about to
477:22 look at so uh I'm going to take this one
477:25 off and what we're going to do to
477:27 actually be able to look at the joins
477:30 and choose what joins we want to use
477:32 is we're going to do this dropdown we're
477:33 going to click open and so now we are in
477:37 a place where we can actually create the
477:40 joins uh and again it's just much more
477:43 customizable and so um back when I was
477:46 using Tableau
477:47 regularly I would use the relationships
477:50 when it was pretty simple and
477:52 straightforward cuz they almost
477:53 always got it right but uh you know the
477:57 joins it just makes more sense in the
478:00 way it visualizes it for me so most of
478:01 the time I'd be using the joins so let's
478:05 pull over this job title right here and
478:08 it's going to make this connection now
478:10 before if you remember just about you
478:12 know 30 seconds ago when it connected
478:14 them it was just a line and so it
478:16 gave us this option down here to
478:18 kind of edit the relationship but now
478:20 it's giving us this visualization and so
478:22 let's click on it really quick and what
478:24 is going to come up is the different
478:26 types of joins that you can do you can
478:28 do an inner join a left join a right
478:30 join and a full outer join and then you
478:33 can actually choose the different uh
478:35 data sources and how you're connecting
478:38 them so again um I'm going to walk
478:40 through a little bit of this but I think
478:42 the SQL video that I did on this
478:44 shows it so well um I would highly
478:47 recommend using that um and I recommend
478:50 learning SQL too so you know two birds
478:52 one stone so I'm going to get into each
478:55 of the joins how they work what data is
478:58 going to be displayed um and these
479:00 visualizations are really going to be
479:02 helpful and I think that it's just
479:05 nice that they have it because it's a
479:06 little reminder okay um you know this is
479:09 what this join is or this is what that
479:10 join is so super super simple so right
479:13 now we have the demographics table and
479:15 we have the job title table and so what
479:18 it's doing right now and let's get rid
479:20 of this what it's doing right now is
479:21 it's doing an inner join and so it's
479:23 pulling everything that overlaps if it
479:26 matches on the employee ID and the
479:29 employee ID and so right now you only
479:32 see 1001 through 1009 but if you remember in
479:34 the demographics table we had uh 1001
479:38 all the way through 1010 so where's that
479:40 10th one well the 10th one is not there
479:42 and that is because in this job title
479:45 employee ID it only went up
479:48 to 1009 and then Ryan Howard just didn't
479:51 have an employee ID in there for
479:52 whatever reason so that data is going to
479:54 be missing now when you are using actual
479:57 data sets very large data sets which we
480:00 will use in the next video when we walk
480:02 through an entire
480:03 project um when you use large data sets
480:07 this can be the difference between clean
480:10 data and very wrong data and
480:12 visualizing it correctly and showing
480:15 completely wrong numbers and so you
480:17 really need to be sure you understand
480:18 how your data works together when you're
480:20 doing these joins so how can we fix this
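The inner join behavior described here can be reproduced outside Tableau. This is a small pandas sketch, not what Tableau runs internally; the rows are made-up placeholders apart from Ryan Howard and the 1010 ID, and the column names are assumed spellings of the sheet headers.

```python
import pandas as pd

# Placeholder rows: only "Ryan Howard" and ID 1010 come from the video.
demographics = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, 1010],
    "name_of_employee": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
})
job_titles = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, None],  # Ryan Howard's ID is blank
    "employee_name": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
    "job_title": ["Title A", "Title B", "Title C", "Title D"],
})

# Inner join: keep only rows whose employee_id appears in both tables.
inner = demographics.merge(job_titles, on="employee_id", how="inner")
print(inner[["employee_id", "name_of_employee", "job_title"]])
```

Ryan Howard drops out on both sides: his 1010 has no partner in the job title table, and his blank ID there matches nothing in demographics.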
480:23 how can we um make it to where we can
480:27 see all of the data well right now we're
480:29 only making it to where if the employee
480:31 ID is equal to the employee ID so we
480:33 only are going to see through 1009 and
480:35 through 1009 we're never going to see
480:37 Ryan so there are two different types of
480:39 joins that we could do to make it see it
480:42 and then there's something else that we
480:43 can join on to where we can see that
480:45 data the first that we can look at is
480:47 the right uh join and what this does is
480:51 it's going to take everything that is
480:52 the same but also everything from this
480:55 job title table regardless of if it has
480:58 a match in the demographics table so
481:00 it's pretty you know this visualization
481:01 does it all it's going to show
481:03 everything in the right table regardless
481:05 and it's only going to show things from
481:07 this table if there's a match so let's
481:09 try this one and we should see Ryan
481:11 Howard in the job title table so let's
481:14 click on it and if we scroll down there's
481:16 going to be null null null null null until we get to
481:20 over here where we now have the data
481:24 that we had in that actual table but
481:27 again this wasn't a match and so we
481:29 weren't able to see that data so this
481:31 gives us a way to where we can see all
481:34 of it um everything from that right
481:36 table this job title table and now we're
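A right join can be sketched the same way in pandas, again as a stand-in for the Tableau dialog. The rows are made-up placeholders apart from Ryan Howard and the 1010 ID.

```python
import pandas as pd

# Placeholder rows: only "Ryan Howard" and ID 1010 come from the video.
demographics = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, 1010],
    "name_of_employee": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
})
job_titles = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, None],  # Ryan Howard's ID is blank
    "employee_name": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
    "job_title": ["Title A", "Title B", "Title C", "Title D"],
})

# Right join: every job_titles row survives; demographics columns are filled
# only where the employee_id matches, and left empty (NaN) otherwise.
right = demographics.merge(job_titles, on="employee_id", how="right")
ryan = right[right["employee_name"] == "Ryan Howard"]
print(right)
```

Ryan Howard's job title row is now present, with empty demographics columns next to it, which corresponds to the nulls in the Tableau preview.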
481:38 going to click on the full outer now the
481:41 full outer is going to take everything
481:43 from both regardless of if there is a
481:45 match at all and so right here you're
481:48 going to see Ryan Howard and Ryan Howard
481:49 now why are there two different rows for
481:51 it well because in the demographics
481:53 table there was an employee ID so we're
481:56 seeing the employee ID Ryan Howard his
481:58 age and his gender and over here there
482:02 was no match right but in the job title
482:05 table again this one didn't have an
482:07 employee ID and so we are going to be
482:10 able to see this data but over here it
482:14 has no match and so that's why it's showing
482:16 us two different rows is because there
482:19 was no connection there was no match
482:21 there that's what a full outer join is
482:23 going to do now just for uh the purposes
482:27 of seeing what this one does as well we
482:28 have the left-hand table um and now we are
482:31 able to see the 1010 that we
482:35 didn't see before um and it's putting in
482:38 nulls over here because there's no
482:40 match so that is um what we have
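The two Ryan Howard rows from the full outer join, and the nulls from the left join, can be seen in a short pandas sketch. This is an illustration with made-up placeholder rows (only Ryan Howard and the 1010 ID come from the video); the `indicator=True` option is a pandas feature that labels where each row came from, not something shown in Tableau.

```python
import pandas as pd

# Placeholder rows: only "Ryan Howard" and ID 1010 come from the video.
demographics = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, 1010],
    "name_of_employee": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
})
job_titles = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, None],  # Ryan Howard's ID is blank
    "employee_name": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
    "job_title": ["Title A", "Title B", "Title C", "Title D"],
})

# Full outer join: all rows from both tables; _merge says which side each came from.
outer = demographics.merge(job_titles, on="employee_id", how="outer", indicator=True)

# Left join: every demographics row survives; job title columns are empty without a match.
left = demographics.merge(job_titles, on="employee_id", how="left")

print(outer)
print(left)
```

In `outer`, Ryan Howard shows up once as `left_only` (ID 1010, no job title) and once as `right_only` (job title, no ID), because nothing ties the two rows together.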
482:43 so far now like I said just a second
482:46 ago there is a way that we can
482:48 do this without using the employee IDs
482:51 we're allowed to use a different join
482:53 clause now there is the name of the
482:55 employee in both of them this one is
482:57 called name of employee and in the job
482:59 title it's called employee name they
483:01 don't have to have the same column name
483:03 in order to join it you can do whatever
483:05 you want so I'm going to get rid of this
483:10 one and now we are only tying it on the
483:13 employee name and let's do an inner
483:17 join and it should be basically
483:20 everything um except the only piece of
483:22 data that wasn't filled in which is that
483:25 1010 over on the job title table and so
483:29 this way was a slightly different maybe
483:32 uh less thought of way because normally
483:34 you do it if there's an ID you go on the
483:36 IDs but because we had a lack of data
483:41 in one of the tables in the job
483:43 title table we decided to use a
483:45 different column to join on and now
483:48 we're able to look at all the data
483:50 together so super quickly that is an
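Joining on two differently named columns looks like this in pandas, as a rough equivalent of swapping the join clause in Tableau. The rows are made-up placeholders apart from Ryan Howard and the 1010 ID, and the suffix names are arbitrary choices for this sketch.

```python
import pandas as pd

# Placeholder rows: only "Ryan Howard" and ID 1010 come from the video.
demographics = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, 1010],
    "name_of_employee": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
})
job_titles = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, None],  # Ryan Howard's ID is blank
    "employee_name": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
    "job_title": ["Title A", "Title B", "Title C", "Title D"],
})

# Inner join on the name columns; the two key columns do not need the same name.
by_name = demographics.merge(
    job_titles,
    left_on="name_of_employee",
    right_on="employee_name",
    how="inner",
    suffixes=("_demographics", "_job_title"),  # keeps both employee_id columns apart
)
ryan = by_name[by_name["name_of_employee"] == "Ryan Howard"]
print(by_name)
```

Every employee now matches, and the only gap left is Ryan Howard's blank ID on the job title side. Joining on names is only safe when names are unique and spelled identically in both tables, which is why IDs are the usual choice.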
483:54 inner join a left join a right join and
483:56 a full outer join and it's pretty easily
483:59 visualized here and you're able to uh
484:02 change what you're joining on right here
484:05 but you can also do multiple so
484:07 if we want to do the employee ID and the
484:09 employee ID you can do that as well and
484:11 you can keep going as many as you'd
484:14 like um and
484:17 right here you can change some of
484:19 these things uh there aren't a
484:22 lot of use cases for this um but you
484:25 know you can absolutely do this um and
484:27 mess around with this as seen I'm not
484:28 going to go through it in the tutorial
484:30 because again 95 plus percent of the joins
484:34 you're doing you're going to want to do
484:35 it to where this equals this um and if
484:37 you want to get into where it doesn't
484:39 equal or all these other things which
484:41 is more complicated I think it's much
484:44 better to learn that in SQL uh that's my
484:46 personal preference and so um again all
484:49 in the SQL tutorial if you want to check
484:50 that one out so you're able to join on
484:52 multiple things now let's get rid of
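Joining on more than one clause means a row only matches when every clause holds. Here is a hedged pandas sketch of joining on both the ID and the name, with made-up placeholder rows apart from Ryan Howard and the 1010 ID.

```python
import pandas as pd

# Placeholder rows: only "Ryan Howard" and ID 1010 come from the video.
demographics = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, 1010],
    "name_of_employee": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
})
job_titles = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, None],  # Ryan Howard's ID is blank
    "employee_name": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
    "job_title": ["Title A", "Title B", "Title C", "Title D"],
})

# Two join clauses: employee_id must match AND the name must match.
both_keys = demographics.merge(
    job_titles,
    left_on=["employee_id", "name_of_employee"],
    right_on=["employee_id", "employee_name"],
    how="inner",
)
print(both_keys)
```

Ryan Howard's names agree but his IDs do not, so the stricter two-clause join drops him again.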
484:54 that one because we can actually bring
484:56 in this salary one as well and what
485:00 you'll see right down
485:01 here is that we have our employee ID and
485:05 this is all coming from the demographics
485:07 so employee ID name of employee employee
485:09 age employee gender then right over here
485:12 we have the job title table so employee
485:16 ID job title employee name job title and
485:19 then right over here was or is our
485:23 salary table and so we have employee ID
485:25 salary and employee salary so again this
485:28 is a way that you can put all of this
485:30 data into one place and in just a
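Bringing in a third table is just one more join chained onto the first. The pandas sketch below uses left joins so that no demographics row is lost; that join type is a choice made for this example, not necessarily what the video's data source ends up using. All values other than Ryan Howard and the 1010 ID are made up.

```python
import pandas as pd

# Placeholder rows: only "Ryan Howard" and ID 1010 come from the video.
demographics = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, 1010],
    "name_of_employee": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
})
job_titles = pd.DataFrame({
    "employee_id": [1001, 1002, 1003, None],  # Ryan Howard's ID is blank
    "employee_name": ["Employee A", "Employee B", "Employee C", "Ryan Howard"],
    "job_title": ["Title A", "Title B", "Title C", "Title D"],
})
salary = pd.DataFrame({
    "employee_id": [1001, 1002, 1010],  # fewer IDs than demographics, no names
    "salary": [45000, 36000, 42000],
})

# Chain the joins: demographics -> job titles -> salary, all on employee_id.
combined = (
    demographics
    .merge(job_titles, on="employee_id", how="left")
    .merge(salary, on="employee_id", how="left")
)
print(combined)
```

The result has one row per demographics employee, with empty cells wherever the job title or salary table had no matching ID.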
485:32 second we'll go into the
485:33 worksheet right down here I'm going to
485:35 show you kind of how it looks because it
485:37 looks a little bit different um than
485:39 previous tutorials and so I want to show
485:42 you how that actually all works together
485:45 um but again you can create these joins
485:48 um as well and do the exact same thing
485:50 that we just looked at and customize the
485:52 joins customize what you're
485:54 um uh joining on and then you have your
485:58 finished product and so right now we
486:00 have our demographics plus Tableau joins
486:03 file and we can rename that if we want
486:06 I'm going to call this um demographics
486:09 plus joins
486:11 demo and click enter and so now that is
486:15 saved so now let's go down to the go
486:17 to worksheet we're going to click on
486:19 that and so up here on our left side
486:21 this may look a little bit different
486:22 than it normally does um because it's
486:24 broken out um on the measure names and
486:27 the measure values it's broken out by
486:29 the tables that they were joined on so
486:31 we can pull in the employee gender now
486:34 and we can pull in the employee name now
486:37 um and we can pull in the employee ID
486:39 again if we want to from the job title
486:42 table and we can pull in the employee ID
486:44 from the salary table we could do that
486:46 if we wanted to it makes no sense uh
486:48 for actually creating any visualizations
486:50 but you know you can do that and so
486:52 you wouldn't be able to do that
486:54 if you hadn't joined these together and
486:56 so down here in the measure values the
486:59 values that we have are from the
487:00 demographics table and the salary table
487:02 all of the um all of the stuff from the
487:06 employee title none of those things were
487:10 um values and so we can't use there are
487:12 going to be no values down here and so
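Why only the demographics and salary tables contribute values comes down to which columns are numeric. A loose pandas analogy, not Tableau's actual logic: list the numeric columns of each table, leaving the ID key aside. The tables, ages, and salaries below are made up for illustration.

```python
import pandas as pd

# Placeholder tables; ages and salaries are invented for this sketch.
demographics = pd.DataFrame({
    "employee_id": [1001, 1002, 1010],
    "name_of_employee": ["Employee A", "Employee B", "Ryan Howard"],
    "employee_age": [30, 41, 26],
})
job_titles = pd.DataFrame({
    "employee_id": [1001, 1002, None],
    "employee_name": ["Employee A", "Employee B", "Ryan Howard"],
    "job_title": ["Title A", "Title B", "Title C"],
})
salary = pd.DataFrame({
    "employee_id": [1001, 1002, 1010],
    "salary": [45000, 36000, 42000],
})

# For each table, list the numeric columns other than the ID key.
measures_by_table = {
    name: table.drop(columns="employee_id").select_dtypes(include="number").columns.tolist()
    for name, table in [
        ("demographics", demographics),
        ("job_titles", job_titles),
        ("salary", salary),
    ]
}
print(measures_by_table)
```

The job title table has only text columns besides its ID, so it contributes nothing that can be summed or averaged.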
487:14 going to be no values down here and so really quick let's take the name of the
487:16 really quick let's take the name of the employee let's take their salary sure
487:20 employee let's take their salary sure why not um let's order
487:24 why not um let's order that let's take the employee
487:26 that let's take the employee salary we'll do
487:29 salary we'll do color and uh expan this out a little
487:34 color and uh expan this out a little bit maybe one more time oops just like
487:38 bit maybe one more time oops just like that and there you go so that is how you
487:40 that and there you go so that is how you do joins in Tableau and I think Tableau
487:42 do joins in Tableau and I think Tableau does a really fantastic job of making it
487:44 does a really fantastic job of making it pretty simple they have the different
487:46 pretty simple they have the different types of joins when you click on that
487:48 types of joins when you click on that that join button and it shows you the
487:50 that join button and it shows you the inner and the left and the right and the
487:52 inner and the left and the right and the full outer and they make it pretty
487:54 full outer and they make it pretty simple um and and and it's just really
487:56 simple um and and and it's just really useful to be able to see that while
487:59 useful to be able to see that while you're creating it and see the output
488:01 you're creating it and see the output below like we just did a second ago it
488:03 below like we just did a second ago it it just makes it so simple to create
488:05 it just makes it so simple to create those joins and then just keep going
488:07 those joins and then just keep going because you already know what your
488:08 because you already know what your output is going to be and you can kind
488:09 output is going to be and you can kind of mess around with it and make sure
488:11 of mess around with it and make sure you're getting the data that you need in
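
A rough sketch of what the four join types on that join button do, written in plain Python rather than Tableau. The two tiny tables, the `employee_id` column, and the `join` helper are all made up for illustration; they are not the data used in the video.

```python
# Hypothetical rows standing in for a demographics table and a salary table.
demographics = [
    {"employee_id": 1, "name": "Jim"},
    {"employee_id": 2, "name": "Pam"},
    {"employee_id": 3, "name": "Dwight"},
]
salary = [
    {"employee_id": 2, "salary": 36000},
    {"employee_id": 3, "salary": 63000},
    {"employee_id": 4, "salary": 47000},
]


def join(left, right, key, how="inner"):
    """Join two lists of dicts on a shared key column.

    how is one of "inner", "left", "right", "full".
    Unmatched rows keep only their own columns.
    """
    # Index the right table by key so each left row can find its matches.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    result = []
    matched_keys = set()
    for l_row in left:
        matches = right_by_key.get(l_row[key], [])
        if matches:
            matched_keys.add(l_row[key])
            for r_row in matches:
                result.append({**l_row, **r_row})
        elif how in ("left", "full"):
            # Keep left rows that found no partner.
            result.append(dict(l_row))

    if how in ("right", "full"):
        # Keep right rows that found no partner.
        for r_row in right:
            if r_row[key] not in matched_keys:
                result.append(dict(r_row))
    return result


for how in ("inner", "left", "right", "full"):
    ids = sorted(row["employee_id"] for row in join(demographics, salary, "employee_id", how))
    print(how, ids)
# inner [2, 3]
# left [1, 2, 3]
# right [2, 3, 4]
# full [1, 2, 3, 4]
```

The inner join keeps only the ids present in both tables, while the left, right, and full outer joins add back the unmatched rows from one or both sides.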
488:13 the very next video we're going to be
488:14 doing an entire project in Tableau we're
488:16 going to be using a lot more data and
488:18 it's going to be a complete project
488:20 that you can add to your portfolio and
488:22 it's going to be a really good time so I
488:24 hope that you join me for that one I
488:26 appreciate your time I hope that this
488:27 was helpful thank you guys so much for
488:29 watching I really appreciate it if you
488:31 like this video be sure to like and
488:33 subscribe below and I'll see you in the
488:34 next
488:36 [Music]
488:47 video what's going on everybody welcome back
488:49 to the Tableau tutorial series this is
488:50 our very last video in the series and
488:53 today we'll be doing an entire
488:59 [Music] project now if you're watching this
489:01 video I hope that you watched the other
489:03 four videos in this series just so you
489:04 can get the basics down you kind of know
489:06 what you're doing uh this won't be a
489:08 crazy hard project this is a beginner
489:10 tutorial series so I'm trying to make
489:12 this super easy so you can follow along
489:15 nothing super complicated I promise
489:17 and if you were wanting to go above and
489:18 beyond and just make a lot of different
489:20 dashboards or try a lot of different
489:21 things there's a ton of data in here and
489:24 so I'll show you some of the things that
489:25 I would do you know as we go through it
489:27 some of the things that I would be looking at
489:29 and some of the different visualizations
489:30 that I might do as well but again in
489:32 this video we're going to be sticking to
489:33 a lot of the basics but I'll switch over
489:35 my screen in just a second I will show
489:37 you the final product and then we will
489:38 actually walk through step by step
489:40 how to do the entire dashboard and at
489:42 the end you should have a completed
489:44 project that you can add to your
489:45 portfolio or you know just share on
489:47 LinkedIn if you want to do that as well
489:50 with that being said let's jump over to my screen and let's get started all
489:52 right so let's get me off screen and
489:53 show you what we're going to be working
489:54 on today this is the final dashboard
489:56 that we're actually going to be building
489:58 and so it's nothing crazy right I'm sure
490:00 you have seen all of these things before
490:03 um and I'm just going to help you kind
490:04 of build it out show you what to do the
490:07 buttons to click um and it's really
490:09 going to be a simple walkthrough by the
490:11 end of this you should be able to do all
490:12 these things very easily and I highly
490:15 encourage looking at the data and
490:17 looking at these visualizations and
490:18 seeing what else you can do with it
490:20 there's a lot of different colors a lot
490:22 of different visualizations um that you
490:24 can do with this data I'm just showing
490:26 you this today and so the more you go
490:29 out there and the more you do this on
490:30 your own and you mess around with stuff
490:33 and choose different things and see
490:34 how it all works the better you're going
490:36 to get and so I highly highly encourage
490:38 doing that uh so what we are going to be
490:40 working with today is an Airbnb data set
490:43 I'm going to show you that in just a
490:45 second and I'm going to show you the
490:49 data and we're going to just jump right into it all right so this is the data
490:51 set that we are going to be using this
490:52 is the Seattle Airbnb open data set and
490:56 let's scroll down really quick um
490:58 there's three different CSVs in here and
491:00 so this is some of the data that we're
491:02 going to be working with um some data on
491:05 listings and some pricing and then
491:07 there's the actual listing that shows um
491:10 the actual street address the location
491:13 the price the bedrooms all of this good
491:15 stuff and then there's the
491:18 reviews um and it has you know some
491:20 comments and you know talks about some
491:22 of the reviews so this is what we're
491:26 going to be working with but you don't
491:28 have to go in here and download it I
491:30 have already combined all these CSVs
491:32 into one I've put it on the GitHub so
491:36 I'll have a link below so you can just
491:37 click on that and you don't have to do
491:38 all the stuff that I did to get this set
491:40 up um just so you know this is from 2016
491:43 so this data set is a little bit old if
491:45 you want to you can come right here and
491:48 I will leave this link as well and you
491:50 can get the data set from you know what
491:52 is this a couple weeks ago uh
491:55 they are continuing to update this
491:56 this is always updated and so you can go
491:58 ahead and download these but some of
492:00 these are .csv.gz files um so you may
492:03 need to like convert it I don't want to
492:04 go through that process um you know
492:08 in the video and so I am just going to
492:10 go with what is literally in Kaggle um
492:13 and use that but if you want to
492:15 have an updated one for your project I
492:18 just advise you to go in here and grab
492:20 it yourself and that should be perfectly
492:21 good so go ahead and download the data
492:27 set from the GitHub and we should be good to go so this is the Excel that I
492:29 was just talking about this has all of
492:31 our CSVs in one place this is you know
492:33 an Excel workbook so in this reviews
492:36 actually let's start with the listings
492:38 because that's kind of where it all
492:39 stems from uh we have our listing and
492:42 the data in here is um
492:45 you know really extensive there's a lot
492:47 of data in here so let's get over really
492:49 quick um the listing refers to the
492:52 actual home that they're renting out the
492:54 Airbnb so it shows their
492:58 location um and there's a lot more
493:00 location information over here I'm
493:02 getting into it in just a second so
493:04 there's the neighborhood the city state
493:06 um zip code all stuff that you know may
493:09 be useful there's a latitude and
493:11 longitude it shows what type of property
493:14 it is so that's really really good um
493:17 right over here it has you know how many
493:19 bathrooms bedrooms and beds um you know
493:22 sometimes if it's a five bedroom house
493:24 it has seven beds so that's why there's
493:26 those two different um fields I don't
493:28 know if you're familiar with Airbnb
493:31 and you know what they have on there but
493:35 just something to note uh they have the price this is the price per day this is
493:37 a weekly price a monthly price and if
493:40 there's a deposit needed uh and then a
493:42 cleaning fee as well so a bunch of
493:45 financial data that's you know super
493:47 useful we go into it a little bit but
493:49 there's so much you can do with that um
493:52 you know if you want to dig into that
493:53 and that's kind of it the rest of it's
493:55 pretty uh pretty useless um and there's
493:57 a lot so there's so much data in here
493:59 almost you know more than half by far is
494:02 nothing you would put in any type of
494:04 visualization um and this is pretty
494:06 common uh you're not going to
494:08 get data where every column is one you're going
494:11 to be able to use a lot of times it's
494:13 just a lot of useless junk and so you
494:14 have to know what you're looking for and
494:18 know uh you know what's actually useful so that's the listing then we have
494:20 reviews
494:22 now what's really a little bit confusing
494:25 in here and something that you just need
494:26 to kind of understand about the data um
494:28 and something that if you get
494:29 a data analyst job you need to
494:32 understand your data because it's very
494:33 easy to come in here and say okay
494:34 there's an ID field and here's an ID
494:37 field so that means that those are the
494:39 same well not in this case um this ID
494:42 field is actually the review's ID
494:45 not the reviewer ID that refers to like
494:47 the person this is the review's ID this
494:50 listing ID is the actual ID right there
494:56 so really important to
494:58 note um and so then they
495:01 just have their comment there what they
495:04 left as a review and then on the calendar um I don't know why I'm
495:06 scrolled down uh we have this listing
495:09 ID again so again that listing ID is
495:11 equal to the ID in this listing table
495:14 and we have a date and a price so this
495:16 refers to a specific location and on
495:18 this day they got $85 for it somebody
495:22 rented it out um and so then there's
495:24 these like t's and f's um let's try to
495:26 find a blank one really quick here's a
495:29 blank one so there's these t's and f's uh
495:32 the t means that it was taken um the f
495:35 means that it's vacant I don't know
495:37 exactly what it means uh what a t or f means
495:40 but we can deduce that much from
495:42 this and so you can see when and how
495:45 much this person was making or this
495:47 home made uh in that time so really really
495:51 good data in here there's a lot to work
495:53 with um and so we're just going to
495:55 be kind of I'll give you a little bit of
495:58 a use case for it in a second and then
496:00 we're going to start trying to answer
496:01 some of those by building out some of
496:03 the visualizations for that use case uh
496:06 again you could have 20 different use
496:08 cases for this data or more um honestly
496:11 for this data where you can build out
496:13 different dashboards and different
496:14 reports literally with just this data
496:16 but you know we're doing a pretty
496:19 general broad project and so it's hard
496:25 to answer all of them so let's jump over to Tableau we're going to get started on
496:27 this and we are going to build out
496:30 everything all right so let's come right
496:32 here uh this is a Microsoft Excel file we'll
496:36 open that up do this one we will open
496:41 it and give it just a second says it's
496:44 executing the query it's pulling the
496:46 data in all right so we have our
496:50 calendar our listing and our reviews
496:52 those are the different tabs at the
496:53 bottom we're going to start with the
496:55 listing this is kind of the main
496:58 one um I didn't
497:01 show you but there's about
497:03 3,600
497:04 locations that they had in
497:06 there uh let's just have it update
497:10 automatically I don't know why we need
497:12 to click on that but um so we have this
497:14 listings we have our calendar and
497:18 our
497:19 reviews what we're going to do is
497:20 come in here and we're going to open
497:23 it as we did in our very last video uh
497:27 for the joins so now that we've opened it we can kind of go in here and we can
497:30 do the joins as needed and so
497:34 let's go over here and we're going to uh
497:37 let's start with
497:39 calendar put it right there that was
497:41 super slow I
497:44 apologize all right let's wait for it
497:54 to get the data start setting everything up did not think it would take this long
497:56 I
498:03 apologize no take your time so let's click on here and right now it has the
498:06 uh the join based on the price which
498:10 obviously is not going to work um and if
498:12 you remember there is no ID in this
498:14 calendar it's just the listing ID
498:17 um we can actually look right here
498:18 there's just the listing ID so we're
498:19 actually going to put listing ID is
498:22 equal to
498:25 ID and right down here we can see that
498:27 we have a lot of well you can't see
498:30 it um but we show that there is a lot of
498:33 data um and so we know that that is
498:36 correct we know that that is now pulling
498:39 in data correctly because it's showing up down here so that's a good thing now
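
A small sketch of why the default price = price join is wrong and listing ID = ID is right, in plain Python. The rows and the `inner_join` helper are invented for illustration, and the column names `id`, `listing_id`, and `price` are assumed to match the workbook.

```python
# Hypothetical sample rows; the real workbook has thousands of listings.
listings = [
    {"id": 101, "zipcode": "98101", "price": 85.0},
    {"id": 102, "zipcode": "98102", "price": 120.0},
]
calendar = [
    {"listing_id": 101, "date": "2016-01-04", "price": 85.0},
    {"listing_id": 101, "date": "2016-01-05", "price": 90.0},
    {"listing_id": 102, "date": "2016-01-04", "price": 120.0},
    {"listing_id": 102, "date": "2016-01-05", "price": 85.0},
]


def inner_join(left, right, left_key, right_key):
    """Inner join on two differently named key columns.

    Right-side columns get a "calendar_" prefix so they do not
    overwrite left-side columns with the same name.
    """
    index = {}
    for row in left:
        index.setdefault(row[left_key], []).append(row)

    joined = []
    for r_row in right:
        for l_row in index.get(r_row[right_key], []):
            merged = dict(l_row)
            merged.update({f"calendar_{k}": v for k, v in r_row.items()})
            joined.append(merged)
    return joined


by_price = inner_join(listings, calendar, "price", "price")
by_id = inner_join(listings, calendar, "id", "listing_id")

# Rows where a listing was paired with another listing's calendar entry.
mismatched = [r for r in by_price if r["id"] != r["calendar_listing_id"]]

print(len(by_price), len(mismatched))  # 3 1
print(len(by_id))                      # 4
```

Joining on price drops one calendar row and pairs listing 101 with a calendar entry that belongs to listing 102, while joining on the ID columns keeps every calendar row with its own listing.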
498:43 in this listings there are about
498:47 3,600 um about 3,600 listings and
498:52 so all the data that's in listings
498:54 is going to be in there but on the
498:57 calendar because we converted from a CSV
498:59 to an Excel workbook it isn't able to
499:01 store as much information so some of the
499:03 ones in calendar may have gotten cut off
499:05 so we can just keep this as an inner join
499:07 because we know that if it's in listings
499:09 it is going to be in calendar we know
499:11 that um there may be some in
499:14 calendar that aren't in listings so if
499:17 we really um you know if we really
499:20 really wanted to we could do a full
499:21 outer or something like that I haven't
499:24 really thought through this as I'm
499:25 talking through it in my head but we
499:27 know that uh everything that's in
499:30 listing is going to be in calendar and
499:32 so you know we don't really need to do
499:34 anything other than an inner
499:41 join and we can also pull in these reviews and it's going to do the same
499:43 thing as before where it's just kind of
499:44 pulling in the data and it defaults to
499:47 ID equals ID now we know that that is
499:50 not correct um because the ID in here is
499:53 referring to the review ID we need to go
499:55 to the listing ID so we need the ID to be
499:58 able to you know be part of that
500:00 listing ID if we do the
500:03 ID it goes down to
500:06 2,555 rows
500:09 and that's just you know it's
500:10 random luck there happen to be some
500:12 numbers that are in both fields um that
500:15 tie together if we do the correct one
500:17 where we hit the listing ID it bumps it
500:19 up to I think 2,373,000 oh maybe more
500:23 than that uh 23 million rows right a lot
500:27 lot lot more and so it's super important
500:30 to get these joins right to tie them
500:32 together on the right fields if you just
500:34 do it based off what Tableau tells you
500:36 because it has that automated um you
500:38 know it goes into these fields and says
500:41 okay these are the same exact column
500:43 name so they're most likely going to be
500:46 what you're looking for well it was
500:47 incorrect in this case so it's really
500:51 important to check those things and make sure you're pulling in the right data
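
One way to check a join key before trusting it is to count how many rows each candidate key would match. This is a minimal sketch with made-up rows; `match_count` is a hypothetical helper, and the column names are assumed to mirror the reviews and listings sheets.

```python
from collections import Counter

# Hypothetical sample rows: reviews.id is the review's own id,
# reviews.listing_id points at listings.id.
listings = [{"id": 1}, {"id": 2}, {"id": 3}]
reviews = [
    {"id": 3, "listing_id": 1, "comments": "great stay"},
    {"id": 7, "listing_id": 1, "comments": "very clean"},
    {"id": 8, "listing_id": 2, "comments": "nice host"},
    {"id": 9, "listing_id": 2, "comments": "would return"},
]


def match_count(left, left_key, right, right_key):
    """Number of rows an inner join on left_key = right_key would return."""
    left_counts = Counter(row[left_key] for row in left)
    # A Counter returns 0 for keys it has never seen.
    return sum(left_counts[row[right_key]] for row in right)


same_name = match_count(listings, "id", reviews, "id")
real_key = match_count(listings, "id", reviews, "listing_id")

print(same_name)  # 1, review id 3 happens to equal listing id 3
print(real_key)   # 4, every review finds its listing
```

The same-named columns still produce a match, but only by coincidence, which is the same effect as the small row count seen with the default ID equals ID join.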
500:52 again we're going to keep it as that inner
500:54 join um you know if you wanted to you
500:57 know try to see if there's any other
500:58 data that correlates we're keeping it
500:59 simple today but sometimes you need to
501:01 join on multiple things uh so just
501:05 you know a tip so let's get out of here
501:08 um and we are good to go so this is our
501:10 listings plus Tableau full project
501:13 that's what we'll be
501:15 working with um and we were able to
501:17 tie all three of these um you know
501:20 you call them tables or sheets or
501:22 whatever you want to call them we were
501:26 able to tie them together so let's go over here to our first
501:28 worksheet uh let's
501:30 see all right so this says Tableau
501:32 Public only works with less than 15
501:34 million rows of data we have 23 million
501:39 rows of data that is uh that's a problem um and when I did this before it didn't
501:42 do that so you know we're going to
501:44 work through this together so this is
501:46 date reviews I believe this is date for
501:51 um this is date for the calendar which
501:56 is going to be a lot of rows of data and
501:58 so I'm sure that's part of it let's
502:02 see let's do
502:04 years we only want 2016 oops we only
502:09 want
502:16 2016 let's do okay let's see what that does let's see if
502:17 that gets us under what we need um we
502:20 only want 2016 data
502:22 anyways so if it's in 2017 we were going
502:25 to take it out um anyway so we'll see if
502:28 that gets us underneath
502:30 if this ends up taking like
502:32 20 minutes I will just cut it and you
502:36 know you won't have to wait as long as
502:37 I'm waiting so let's see how long it
502:45 takes all right so it took about 20 minutes and it did absolutely nothing
502:49 um one thing I do know is that we don't
502:53 actually use this reviews table at all
502:56 um it was just for demonstration purposes so
502:59 we're going to remove that and let's see
503:01 if that helps us in any
503:08 way if it does we're just going to keep it as is um you know the reviews table
503:11 is really just for demonstrating how to
503:13 do the join
503:15 but we weren't actually using any of the
503:16 data for any of the
503:18 visualizations although you
503:21 could again I'm going to see how long
503:23 this takes uh and I'll cut
503:30 ahead all right so that worked uh perfectly it apparently took out all the
503:34 data that we needed all the rows that we needed to get under that level again I
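
A rough way to see why dropping the reviews table shrinks the result: when calendar and reviews are both joined to listings, each listing contributes (its calendar rows) times (its review rows). The sketch below estimates that count without building the join. The numbers are invented, `joined_row_count` is a hypothetical helper, and listing ids are assumed to be unique; the 15 million figure is the limit quoted in the video.

```python
from collections import Counter

TABLEAU_PUBLIC_ROW_LIMIT = 15_000_000  # limit quoted in the video


def joined_row_count(listing_ids, calendar_listing_ids, review_listing_ids=None):
    """Estimate the rows produced by inner joining listings to calendar,
    and optionally to reviews, on the listing id.

    Assumes each id appears once in listing_ids.
    """
    calendar_rows = Counter(calendar_listing_ids)
    review_rows = Counter(review_listing_ids) if review_listing_ids is not None else None

    total = 0
    for listing_id in listing_ids:
        rows = calendar_rows[listing_id]
        if review_rows is not None:
            # Every calendar row pairs with every review of the same listing.
            rows *= review_rows[listing_id]
        total += rows
    return total


# Hypothetical data: three listings, one calendar row per day of the year.
listing_ids = [1, 2, 3]
calendar_ids = [1] * 365 + [2] * 365 + [3] * 365
review_ids = [1] * 10 + [3] * 25  # listing 2 has no reviews

with_reviews = joined_row_count(listing_ids, calendar_ids, review_ids)
without_reviews = joined_row_count(listing_ids, calendar_ids)

print(with_reviews)     # 12775
print(without_reviews)  # 1095
print(without_reviews < TABLEAU_PUBLIC_ROW_LIMIT)  # True
```

With reviews included the row count is multiplied by the number of reviews per listing, so removing that table brings the total back down to one row per listing per calendar day.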
503:35 needed to get under that level again I was just doing that to show you the that
503:38 was just doing that to show you the that that joins how you needed to change the
503:41 that joins how you needed to change the columns to make sure that it joined
503:43 columns to make sure that it joined properly we don't actually use for any
503:44 properly we don't actually use for any of the visualization so their end
503:46 of the visualization so their end product is going to be totally fine I
503:48 I don't know why this didn't happen to
503:50 me when I created this whole
503:51 thing already so I'm just going to move
503:55 forward because I make mistakes so
503:58 let's keep moving the first one that we
504:00 are going to make is that
504:03 colorful one I'll probably pop it up on
504:05 screen so you can see it well if I
504:07 remember I'm going to pop it up on
504:08 screen it's the colorful one it's the
504:11 price by zip code so we're going to be
504:12 looking at these zip codes and kind of
504:14 see
504:15 you know how
504:17 expensive each zip code is and before
504:21 we actually start I just remembered I
504:23 want to talk to you about the use case
504:25 for this
504:26 data I want you to imagine
504:29 that you're working for somebody they're
504:30 like hey you know I want to start
504:33 an Airbnb business I want to know where
504:35 I should go where should I buy a
504:39 home put it up on Airbnb and start
504:41 renting it out where's the best place
504:43 you know what are some of the factors
504:44 that I should be looking at and so
504:47 that's kind of what our use case is so
504:49 some of the things that
504:51 he cares about are things like bedrooms
504:53 location which is really important
504:56 and what price he's actually going
504:58 to get how much money can he charge and
505:01 so he's trying to optimize that to make
505:03 sure that whatever rental he gets he can
505:05 make the most profit from it instead of
505:07 choosing something that you know he
505:09 thinks would work but you know in the
505:10 end he's actually not making that much
505:11 money so those things are important so
505:14 that's our use case we're trying to help
505:16 this guy out help him find a really good
505:19 Airbnb so let's take a look at these
505:21 zip codes real quick we have quite a
505:23 few of them and there's one that's null
505:27 we'll exclude that or if it
505:28 doesn't have a zip code we'll just
505:29 exclude those because they're not going
505:31 to show up on these visualizations
505:33 anyway and so we want to look at the
505:36 price so we just want to find the
505:39 price which should actually be down
505:41 here and not the sum
505:45 no we want to look at the average
505:49 price and let's order that this is great
505:53 so this is the most expensive one
505:56 zip code 98134 at
505:59 $206
506:01 for the average price but let's
506:04 give that some color really quick let's
506:07 where's the zip code it's up here so
506:09 let's take that zip code we're going to
506:11 put it right over here we're going to do
506:12 color and it's going to give it some
506:15 assorted colors now
506:17 when we do the map in just a
506:20 little bit these colors will match
506:22 what we're doing in there and so you
506:25 know I like to try to color coordinate
506:27 things we're not going too
506:29 crazy with the colors today so this is
506:31 our very first visualization
506:32 congratulations it is complete
506:36 so we can label this one and we can
506:39 just
506:40 do price by zip code and I'll make that
506:47 bold I don't know I usually like it bold
506:49 we'll apply we'll do like that and boom
506:52 first one is done and this is our
506:55 starting place to say hey person
506:58 who's looking to buy this Airbnb here
507:01 are the zip codes where they are able to
507:02 charge the most for their Airbnb
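The price by zip code chart boils down to grouping listings by zip code, dropping rows with no zip code, averaging the price, and sorting from highest to lowest. A minimal Python sketch of that logic, assuming hypothetical rows with zipcode and price fields (the sample values are invented, not the real data):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real listings data has many more columns.
listings = [
    {"zipcode": "98134", "price": 200.0},
    {"zipcode": "98134", "price": 212.0},
    {"zipcode": "98101", "price": 150.0},
    {"zipcode": "98101", "price": 170.0},
    {"zipcode": None, "price": 999.0},  # no zip code: excluded below
]


def average_price_by_zip(rows):
    """Return (zipcode, average price) pairs, most expensive first."""
    prices = defaultdict(list)
    for row in rows:
        if row["zipcode"] is None:
            continue  # a row without a zip code cannot be placed on the chart
        prices[row["zipcode"]].append(row["price"])
    averages = {zipcode: mean(values) for zipcode, values in prices.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = average_price_by_zip(listings)
for zipcode, average in ranking:
    print(zipcode, round(average, 2))
```

Using the average rather than the sum keeps a zip code with many cheap listings from outranking one with a few expensive listings.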
507:07 so let's go over to the second sheet and
507:10 we are going to be doing the map and so
507:12 the map is pretty easy
507:14 well it's pretty easy once you
507:17 actually get the data that you need
507:19 although there's a lot of different data
507:21 that you can use for the actual map
507:25 right here you need something that shows
507:28 the location and there's a lot of
507:30 things that show location in here in
507:32 fact they already provide a latitude
507:34 and longitude and then at the bottom
507:36 they generated a latitude and longitude
507:39 from some different fields and
507:42 then there's just a bunch of different
507:44 ones there's states there's zip
507:47 codes there are I think another one
507:51 yeah like country there's a lot of
507:53 location data in here so which one do we
507:56 want to use we want to stay consistent
507:59 we don't want to deviate from that and
508:00 start using different
508:03 longitude and latitude coordinates
508:05 because that could throw off our
508:07 results completely we want to stay
508:09 consistent with what we're using so we
508:11 actually want to use this zip code but
508:13 when we pull it up here it's going to
508:15 give us basically the same you
508:17 know it's going to show these zip codes
508:18 but right over here
508:20 we're going to click on this one and now
508:22 it's going to separate them out so now
508:24 we have all of these you know kind of
508:27 separated out what you might get when
508:29 you first do this is it might look
508:31 like this you may have to zoom in I
508:34 know that that happened to me the other
508:36 time excuse me go to here that's what
508:39 happened to me just when I first did
508:41 it so know that that may happen
508:45 and we want to change the colors the
508:48 exact same way that we did them before
508:50 so we're just going over here we're
508:51 doing color and these colors
508:57 should match up with the
509:01 other ones let me exclude this let me
509:04 see if it does 98134 that's the
509:08 blue and right over here 98134 that's a
509:13 blue I believe they are
509:14 going to be the same yep and so just
509:17 scrolling back if you look at the zip
509:19 code on the far right they are the
509:21 same so if you're looking at this
509:23 section right over here I'm just
509:25 wanting to make sure I'm not going crazy
509:27 before I get into this and realize
509:29 I'm not correct at all so now what we
509:33 want is you know this doesn't really
509:35 give us any information if I was just to
509:36 glance at this map I would have no idea
509:39 what you're trying to show me I couldn't get any
509:42 information off this so we want to show
509:43 some actual
509:44 information so the first thing that we're
509:46 going to do is we're going to actually
509:48 add the label to this so that you can
509:51 see it you know when you're going over
509:53 here and you see okay here's this zip
509:56 code in the dashboard when we create
509:58 it you can click on this but if you just
510:01 want to do it visually without having to
510:03 click anywhere you'll be able to see
510:04 okay 98134 that's right here so this
510:07 location right here is you know able to
510:09 charge a lot of money it's probably a
510:11 really nice neighborhood so and we
510:14 can back that up by putting the average
510:18 price so these two visualizations
510:20 really go hand in hand
510:23 we're going to add oops not the
510:26 sum this one needs to be the average so
510:28 you go to this measure the sum go to
510:31 average and there you go and these
510:34 should match so this should be
510:42 206.125 206.000 so this all matches and we
510:46 can actually change that size
510:48 a little bit if you want to actually get
510:50 it within each of these
510:53 things you know adjust it as you see
510:56 fit I think that's fine right there
510:59 no need
511:00 to mess with it
511:03 anymore all right so let me see I think
511:05 that is everything for this one I don't
511:07 know if I want to add anything else
511:11 no I'm going to keep it how it is so
511:13 that is our second visualization again
511:15 these ones are directly correlated
511:19 and you know there's just
511:21 different ways to visualize it this one
511:22 you can see actually on the map where it
511:24 is and the average price this one you
511:26 can see from highest to lowest so again
511:28 you know sometimes when you're doing
511:29 these visualizations you're going to
511:31 have these
511:35 accompanying visualizations in your
511:36 dashboard that's very normal so let's
511:40 move over to the third one and for this
511:43 third one you know something that our
511:47 guy was looking at is he's like okay
511:49 well you know I'm thinking about listing
511:51 it on Airbnb but I also want to live in
511:54 it so I want to know the best times to
511:57 actually you know put it on the
511:59 market for people to be able to use and
512:02 so I was like okay man no problem
512:05 let's take a look at when
512:07 people are spending the most money in
512:09 Airbnbs and we actually had that
512:11 calendar if you remember let's look
512:14 let's see this calendar so we have this
512:16 available the date the listing all of
512:19 that stuff and let's look at the date
512:24 in
512:25 here and we obviously don't want it
512:28 like this we want it to be more
512:30 of a time series and we're going to
512:33 be doing that based off of the price
512:37 for the calendar so let's go see if we
512:39 can find that really
512:41 quick okay here's the price
512:48 where is that calendar one let me see okay there's the calendar
512:53 oh
512:55 here I totally forgot where that was
512:57 supposed to be oh that looks
512:59 terrible okay let's see let's
513:04 start working on this because this needs
513:05 some work obviously this is the worst
513:08 visualization I have ever seen so we
513:11 need to work on this a little bit what
513:14 we need to do is we need to change oh
513:16 whoops we need to change the way
513:19 that these dates are seen so right
513:22 here these are two separate things
513:25 so if I go right here and I do by
513:26 quarter it's just going to change the
513:27 quarters here right that isn't
513:30 really helpful we actually want to keep
513:32 the year here what we want to do is
513:34 by year we want to separate it by year
513:37 but we want to separate it let's just
513:40 do I don't know let's try week and see
513:41 what it looks like okay this is great
513:43 this is what we're looking at
513:45 again if we went back and changed this
513:47 to quarter
513:51 and then changed it to week it would show
513:54 the
513:54 quarters but it wouldn't
513:57 show everything right this isn't all the
514:00 data that we need and so you know you
514:02 really need to make sure that you're
514:04 doing this correctly by default it's
514:07 almost always year but if you're looking
514:09 at it via quarter so like let's say
514:11 somebody comes in and says hey
514:13 I want to break these out by
514:15 quarters and not year-over-year
514:18 that's how you would do this but in the
514:20 year we want to break it out by the
514:23 week and you see this huge drop off
514:28 at the end well that is actually because
514:30 the data doesn't go past that there's
514:33 just like one day of data or one
514:36 week of data in here with
514:40 January 2017 data so it just drops
514:42 off because this is the sum
514:44 so it only adds up to like 591,000
514:48 compared to like the 2 million so we
514:50 want to get rid of that and how do we
514:53 do that let's see I think it's
515:00 filter or is it format no it's not format what am I thinking bear with me
515:04 it's a filter well I was looking for
515:06 it I just couldn't find
515:07 it let's bring it back to the 31st
515:12 let's see if that fixes what we need
515:14 perfect that's all you had to do
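What the week view and the date filter are doing can be sketched in Python: sum the calendar price per week, and cap the date range so a trailing partial week does not show up as a drop-off. The rows, the Monday week start, and the December 31, 2016 cutoff are assumptions for illustration, not values read from the actual workbook.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical calendar rows: two full weeks in December 2016 at 100 per night,
# plus a single stray night in January 2017.
calendar_rows = [(date(2016, 12, 12) + timedelta(days=i), 100.0) for i in range(14)]
calendar_rows.append((date(2017, 1, 2), 100.0))


def weekly_revenue(rows, last_day):
    """Sum price per week (weeks keyed by their Monday), ignoring rows after last_day."""
    totals = defaultdict(float)
    for day, price in rows:
        if day > last_day:
            continue  # same idea as pulling the date filter back to the 31st
        week_start = day - timedelta(days=day.weekday())
        totals[week_start] += price
    return dict(sorted(totals.items()))


unfiltered = weekly_revenue(calendar_rows, date.max)
filtered = weekly_revenue(calendar_rows, date(2016, 12, 31))

for week_start, total in unfiltered.items():
    print("unfiltered", week_start, total)
for week_start, total in filtered.items():
    print("filtered  ", week_start, total)
```

Because the chart shows a sum, a week with only one day of data looks like a collapse in revenue even though it only reflects missing data.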
515:18 and the reason that this is helpful and
515:21 oftentimes you'd have several years'
515:23 worth of data in here and then you
515:26 could even do
515:27 something like this like this one
515:29 where it has multiple
515:31 lines the reason that this is helpful is
515:34 because if I'm telling my friend I
515:37 mean I'm just going to say it's a friend
515:39 or business partner whatever you
515:40 want to use this use case
515:41 for I'm going to tell him hey from the beginning
515:44 of January all the way until like you
515:47 know even February it's like really low
515:51 it's half so there's not a lot of people
515:53 traveling because everyone travels
515:55 at the end of the year so in November
515:58 December for the holidays to visit
516:00 family and then in the summer for
516:02 vacations I would tell him just based
516:04 off this one thing I would say hey over
516:07 the summer and then at the end of the
516:09 year and during the holidays that's when
516:11 I would be renting out your Airbnb okay
516:14 so just this one very simple
516:17 visualization can help him understand
516:18 the best times to do that that may be
516:21 intuitive you may have already known
516:22 that but you can prove it with the data
516:25 which is always really helpful and
516:28 let's see is there anything else that we
516:29 need to do with
516:31 this I'm just going to label it and
516:34 I'm going to say
516:39 revenue for
516:42 year
516:44 let's do bold do apply there we go did I
516:47 label this last one I didn't let's label
516:51 that last
516:53 one and we'll
516:55 do price per zip
517:00 code price per zip code we'll just keep
517:02 it at that keep it
517:04 simple and let's do that all right I
517:08 believe we have two more so we have done
517:12 three of them we've got
517:16 the zip codes we've got the you know
517:19 the time of the year now something else
517:22 that he was wanting to know is you
517:25 know just how things affect it and
517:26 something that's going to affect the
517:27 price of the actual Airbnb is going to
517:32 be the number of bedrooms so the
517:34 larger the house the more bedrooms the
517:35 more it's going to cost typically so we
517:39 can take a look at that let's pull in
517:42 these bedrooms
517:44 and that will be our
517:48 columns no it won't what we need to
517:50 do and so I knew this was going to
517:53 happen I just forgot it until
517:55 right now this right now
517:57 is actually a value right so
518:02 it's a number and that's totally
518:04 reasonable because if we go right here
518:07 and we do count distinct
518:09 there's only seven values right it goes
518:11 zero bedrooms 1 2 3 4 5 6 7
518:13 all the way up to seven bedrooms right
518:15 now it has it as a numerical value we
518:17 want to change that to create it as
518:22 these measure names not a value so
518:25 we're going to remove
518:29 this we're going to go right down here
518:31 we're going to click this drop down and
518:33 we're going to say convert to
518:35 dimension and so now we're going to add
518:38 it as a dimension so there that looks
518:41 much more normal really quick I'm
518:43 going to keep these in here
518:45 for a second but we're going to get rid
518:46 of these nulls and zeros because if a
518:47 home has zero bedrooms that's a
518:50 problem and so we want to look at the
518:53 price again let's go down here in the
518:57 listings it should be the price now this
518:59 is the price for the location per day
519:02 if you want to look at monthly or you
519:05 know stuff like that they have that data
519:08 but we're just going to do the price
519:09 the average price not the
519:11 sum although this is helpful so
519:14 just really quick before we change it
519:16 this is going to show you which ones
519:18 are bringing in the
519:20 most money it also may show you which
519:21 ones are the most common those are
519:23 all different visualizations that we can
519:25 do but the one that brings in the most
519:27 money has
519:30 $63 million worth of
519:35 listings so they all add up those one-
519:38 bedrooms are doing phenomenal half of
519:41 that are two bedrooms at 30 million
519:44 three bedrooms at 18 million and so on
519:45 and so forth so there's a ton of
519:48 one-bedroom ones we
519:51 could even keep that in there you
519:53 know if we wanted
519:55 to and then we do something similar
519:58 later but you can keep something like
520:00 this in there what we will do really
520:02 quick though is we're going to do the
520:03 same thing that we've been doing which is
520:04 keeping
520:06 average and we are going to get rid
520:09 of this because if it doesn't have the
520:11 bedrooms you know that's not helpful to
520:13 us and if it has zero bedrooms that's
520:16 genuinely a problem I will not be
520:17 renting an Airbnb with my family that
520:20 has zero bedrooms in it
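A small Python sketch of the bedrooms chart, under assumed field names (bedrooms, price) and invented sample rows: treat the bedroom count as a category label rather than a number to add up, skip listings with null or zero bedrooms, and compare the sum against the average for each group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows, not the real listings data.
listings = [
    {"bedrooms": 1, "price": 90.0},
    {"bedrooms": 1, "price": 110.0},
    {"bedrooms": 1, "price": 100.0},
    {"bedrooms": 3, "price": 240.0},
    {"bedrooms": 3, "price": 260.0},
    {"bedrooms": 0, "price": 80.0},     # zero bedrooms: excluded
    {"bedrooms": None, "price": 75.0},  # missing value: excluded
]


def price_by_bedrooms(rows):
    """Group prices by bedroom count used as a label, returning sum and average."""
    groups = defaultdict(list)
    for row in rows:
        if not row["bedrooms"]:  # filters out both None and 0
            continue
        groups[str(row["bedrooms"])].append(row["price"])
    return {
        label: {"sum": sum(prices), "average": mean(prices)}
        for label, prices in sorted(groups.items())
    }


summary = price_by_bedrooms(listings)
for label, stats in summary.items():
    print(label, "bedrooms:", stats)
```

In this made-up sample the one-bedroom group has the smaller sum but the three-bedroom group has the higher average, which is why the choice of aggregation changes the story the chart tells.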
520:23 so now we have this and it would be really helpful to be
520:25 able to see that in the visualization I
520:27 mean it's just kind
520:28 of hard to see it as is I
520:32 mean it just does not hurt to add that
520:35 right here do a label why is it
520:39 angled like that maybe I just need to
520:43 move it out
520:45 more that looks much better that's
520:49 the average price that cannot be right
520:52 that's the sum that's why so let's go
520:54 over here let's make that average as
520:56 well much better because if the price
520:59 was $3
521:01 million for a three-bedroom I would not
521:04 be going there so this is really really
521:08 useful information for our friend right
521:11 if he wants to you know get into
521:14 that one-bedroom area you know
521:15 you're not going to be making a lot of
521:16 money it may be low cost upfront but
521:19 he's not going to be making a lot of
521:20 money it significantly goes up when you
521:23 reach these five and six bedroom homes
521:25 which makes sense I mean if it has five
521:27 or six bedrooms in it it's probably a
521:29 really large really nice home and you
521:31 can charge a lot more money and our
521:32 friend is extremely wealthy he can
521:34 buy whatever he wants and so he may be
521:36 looking at these larger ones seeing
521:38 that there's a much higher return on
521:41 that there's a much higher return um on his investment the higher and the more
521:43 his investment the higher and the more bedrooms he goes so we're going to keep
521:45 bedrooms he goes so we're going to keep it just as it
521:48 it just as it is um and let me see is there's anything
521:51 is um and let me see is there's anything else that we want to do with this no
521:53 else that we want to do with this no we're going to keep it just like this uh
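The sum-versus-average mix-up above is easy to reproduce outside Tableau. What follows is only a minimal Python sketch of the same idea, not what Tableau does internally: the sample rows and the function name `price_by_bedrooms` are hypothetical, and the prices are made up rather than taken from the Airbnb data. It drops rows with null or zero bedrooms, then reports SUM and AVG of price side by side for each bedroom count.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (listing_id, bedrooms, price). Not the real Airbnb data.
listings = [
    (101, 1, 100.0),
    (102, 1, 120.0),
    (103, 1, 80.0),
    (104, 3, 300.0),
    (105, None, 90.0),   # missing bedroom count
    (106, 0, 60.0),      # zero bedrooms
]

def price_by_bedrooms(rows):
    """Group prices by bedroom count, skipping null and zero bedrooms."""
    groups = defaultdict(list)
    for _listing_id, bedrooms, price in rows:
        if not bedrooms:          # drops both None and 0
            continue
        groups[bedrooms].append(price)
    # Report both aggregations so the difference is visible per bedroom count.
    return {b: {"sum": sum(p), "avg": mean(p)} for b, p in sorted(groups.items())}

summary = price_by_bedrooms(listings)
for bedrooms, stats in summary.items():
    print(bedrooms, stats)
```

With many listings in a group the sum grows with the number of listings, which is why a label showing the sum looks absurd when you expect a nightly price; the average stays on the scale of a single listing.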
521:55 and the last one is by far the easiest
521:56 and we actually just discussed it a
521:58 little bit we want to know you know
522:00 what's his competition look like so um
522:03 for the bedrooms specifically
522:06 so let's go back up to the
522:09 bedrooms we want that one to be right
522:13 here in our rows so we show um these and
522:16 then we just want a count of um how
522:20 many listings there are so we can do
522:22 that via the listings ID so here's our
522:25 listings each ID represents one location
522:28 or one home so we're going to do that
522:30 right here uh that looks absolutely
522:35 terrible that looks terrible what am I
522:38 doing wrong here um let me see
522:43 uh one thing we need to do is we want to
522:45 get rid of these nulls and
522:46 zeros do that really
522:49 quick um and then we don't want to do
522:52 just the ID because I'm realizing now
522:56 uh what I'm doing I need to convert this
523:00 to a numeric so we can do a count on it
523:03 so let's um oops let me see what is
523:06 happening this is terrible all right
523:08 let's put this back let's make let me
523:11 see if I can just um
523:13 do an
523:15 attribute let's
523:18 do the
523:20 [Music]
523:21 count and let's
523:24 do
523:27 text um no it needs to be a distinct
523:30 count because that's basically
523:33 like
523:34 um a count of the numbers themselves not
523:39 each individual ID okay it took figuring
523:43 out I'm going to keep that in there
523:44 because you guys need to see uh a lot of
523:47 you guys like seeing when I make
523:48 mistakes so you know makes it feel like
523:50 when you make mistakes it's okay um and
523:51 I'm all about that so I'm leaving that
523:53 in there you guys can see me fail a
523:55 little bit um I just forgot how to do
523:57 that for a second and this is exactly
523:59 what we're looking for right we want we
524:01 now it showed us in that visualization
524:03 that we were looking at earlier before
524:05 we um switched it to the average price
524:08 this is showing us that there are for
524:10 one bedrooms there's 1,800 one bedroom
524:13 two that have 483 three that have 206 four that
524:17 have 55 only five that have 20 and six
524:19 that have five so the more you go up the
524:21 less and less it is or the less and less
524:23 competition there's going to be now is
524:25 there a lot of demand for four-bedroom
524:27 five-bedroom six-bedroom uh that's for
524:29 our friend to figure out um well maybe
524:31 we'll help him out with that later um
524:34 with the data you know we could
524:35 look at the reviews that we had um
524:38 there's so much data in here and we
524:39 could absolutely figure that out but for
524:42 what it's worth we're giving him this initial
524:44 stuff and he'll have follow-up questions
524:45 for us later that's how it always works
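The distinct count of listing IDs per bedroom count that finally worked above can be sketched in plain Python. This is an illustration under stated assumptions, not the Tableau calculation itself: the rows and the helper name `count_listings` are hypothetical, and the sketch assumes a listing ID can appear on more than one row (for example after joining sheets), which is the situation where a plain row count and a distinct count disagree.

```python
from collections import defaultdict

# Hypothetical rows: (listing_id, bedrooms). Assumption: a listing can show up
# on more than one row, for example after joining to another sheet.
rows = [
    (101, 1), (101, 1), (102, 1),
    (103, 2), (103, 2),
    (104, None),   # missing bedroom count
    (105, 0),      # zero bedrooms
]

def count_listings(rows):
    """Return {bedrooms: (row_count, distinct_listing_count)}."""
    row_counts = defaultdict(int)
    distinct_ids = defaultdict(set)
    for listing_id, bedrooms in rows:
        if not bedrooms:                     # skip nulls and zeros
            continue
        row_counts[bedrooms] += 1
        distinct_ids[bedrooms].add(listing_id)
    return {b: (row_counts[b], len(distinct_ids[b])) for b in sorted(row_counts)}

counts = count_listings(rows)
for bedrooms, (n_rows, n_listings) in counts.items():
    print(f"{bedrooms} bedroom(s): {n_rows} rows, {n_listings} distinct listings")
```

The distinct count is the one that answers the competition question, because each ID stands for one home no matter how many rows it occupies.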
524:47 I promise um so now we're good with this
524:50 one let's label this one did I label the
524:52 last one I will go back and look um
524:57 distinct I'm going to butcher this one
524:59 I'm going to do a distinct count
525:02 of bedroom listings that may
525:07 not make sense at all but we're keeping
525:09 it so we're going to do bedroom apply
525:11 okay let me see if I added the label on
525:14 this one I didn't let me do that real
525:19 quick we do
525:22 average price per
525:25 bedroom again I'm
525:29 oops you didn't see that I'm just going
525:32 with whatever is coming to my head this
525:34 probably wouldn't be what I would keep
525:35 if this were like an actual project but
525:37 it works for now so we have our five
525:41 visualizations one two three four and five
525:44 and let's create our dashboard that's
525:46 going to be this button right here so
525:48 we're going to click that we are going
525:51 to uh go right here and we're going to
525:53 say automatic because we want to use
525:55 this entire area and so now we're just
525:58 going to start um you know pulling them
526:01 over and I'm just going to start from
526:03 the very first one and go to the very
526:05 last one keep it really simple so this
526:08 very first one we'll pull it over it you
526:11 know it's going to take up the entire
526:13 space until you start adding all the
526:14 other ones we'll include this one right
526:17 here um and well let's leave it as it is
526:20 you know we'll adjust it once it gets to
526:22 its final place now we have number three
526:26 we'll add this one on this side it looks
526:29 terrible right now but give it a second
526:31 uh then we have number four we're going
526:33 to add that across the top okay it's
526:36 already starting to look a little
526:38 better and um maybe you don't have
526:43 to keep this in here
526:45 um but you definitely
526:47 can uh let's start to adjust things a
526:50 little
526:54 bit oops okay
526:56 let's see if I can zoom in one more nope
527:00 I'm going to do it just like that
527:02 actually let me
527:07 [Music] see if I can make it even just a little
527:10 bit closer perfect uh that's the
527:12 best you're going to get um if you
527:13 didn't see I used this um magnifying glass and
527:16 then I could click on the area that I
527:17 wanted to see so we're going to keep
527:19 that just like
527:21 that we're going to move this over
527:23 because that is um definitely not as
527:26 important um and then we're going to
527:28 move this way over as well so keep it
527:32 just like that again this is something
527:33 where if you want to you can click on
527:35 this um it didn't I don't know why uh I
527:38 can't remember how to get those
527:39 connected but you definitely can um
527:42 but okay I was just clicking on the
527:44 wrong one that's
527:46 why that is why but you can click over
527:49 here and you know it'll filter um
527:51 based on so if I go to this one oops
527:54 [Music]
527:56 dang oh jeez what am I doing oh this is
527:59 a
528:00 travesty okay let's try to get this
528:03 back all right I'm not touching it guys
528:05 you get the gist you can mess around
528:06 with it yourself I'm not messing this up
528:08 okay so the next thing we need to add is
528:10 the very last one that's going to go
528:12 right up here and then we're just going
528:14 to kind of move it off to the
528:18 side
528:21 and let's
528:24 see going
528:32 add yeah have this caption um if you've never seen something like this
528:33 before um and I actually want to make
528:36 this bigger as
528:38 well oh jeez give me a second it's
528:41 kind of lagging a little
528:43 [Music]
528:52 bit and make this a little bit tall maybe I don't want it as wide but I
528:53 definitely want a little
528:55 [Music]
528:58 taller give it a second yeah let me
529:01 scooch this
529:02 [Music]
529:04 back just like that that's fine uh we
529:09 can keep it like that in my original one
529:11 I didn't have this um you can get rid
529:13 of this if you want you know you can um
529:17 you know just exit out right here if you
529:18 want to do that but there you have it uh
529:22 this is the entire thing so we started
529:25 from the very start um we started with
529:27 this one then this one uh did some um
529:30 and this is you know all of
529:33 our ZIP code work then we took a look at
529:36 the calendar where we looked at the
529:37 price and did some time series
529:39 visualization and then we're looking at
529:41 the bedrooms and the count of
529:43 bedrooms and so this should be really
529:45 helpful for our friend it should be an
529:46 initial dashboard to get him going and
529:49 once he sees this he's going to have a
529:50 million other questions and he's going
529:51 to want another dashboard for different
529:53 data that's in there he's going to ask
529:55 about okay well what if I want to do it
529:57 weekly or you know I want to rent it out
529:59 for the month or you know how many um
530:02 five star reviews are
530:04 people giving on you know one bedroom two
530:06 bedroom three bedroom these are all
530:08 things that you know he may ask and then
530:11 we'd have to build out in the real world
530:13 this is what happens all the time you
530:15 know they make a request and then
530:16 they're like oh this is great but I also
530:18 want this so um you know your friend is
530:22 going to be right in line with just
530:23 about everyone else um that has ever
530:25 gotten a dashboard uh for work or for
530:29 personal use with that being said this
530:31 is it um we have done the entire thing
530:33 now if you want to share this it is
530:36 super super easy to share um and I'm
530:38 going to try to remember how to share it
530:40 uh so we're going to do save to Tableau
530:42 Public
530:43 as and we're going to do this and we're
530:46 going to make it um let's do Air BnB is
530:50 it like is it a capital B is it like
530:52 that no that doesn't look right
530:55 Airbnb uh we'll do full project and
530:58 we'll
531:00 save and that is being created right now
531:03 um and I will save this so if you guys
531:05 want to go look at this you can um and
531:08 I'll provide a link in the description
531:10 as well for that and see if yours looks
531:13 um similar to mine or better than
531:15 mine give it a second cuz it's
531:24 thinking all right so here it is so here's our final project um
531:26 and if you followed step by step then
531:28 you should get this exact or very very
531:31 similar to this one again I encourage
531:33 you to if you want to have the up-to-date
531:36 data to go to that um link in the
531:39 description that has um the most
531:42 recent data and they update that I
531:43 believe monthly so you can go there get
531:45 the most recent data and then you can do
531:47 stuff and you can create a beautiful
531:49 project just like this um but with the
531:51 you know the most recent data again I
531:52 used the Kaggle data just so you guys can
531:54 remember and I encourage you to look at
531:56 the different data points that are in
531:58 the Excel there is so much in there and
532:00 you can use uh honestly like there's
532:03 probably 30 or 40 other fields that you
532:05 could be using in there that we never
532:06 even touched um but for this project
532:09 we're keeping it pretty simple and so
532:12 go do that make completely unique
532:14 dashboards and visualizations and
532:16 create projects and add them to your
532:18 portfolios so that you can create uh a
532:21 fantastic portfolio website and get a
532:24 job and that's what this is all about um
532:25 it's about upskilling and getting
532:28 these skills so that you can you know get a
532:30 job or do better in your job so I
532:32 hope this has been helpful I really
532:34 appreciate you guys joining me and
532:36 doing this entire project with me I have
532:38 no idea how long this is this probably
532:40 this could be like an hour for all I
532:41 know um so thank you so much for
532:43 sticking with me this entire time if you
532:45 like this video be sure to like and
532:47 subscribe below and I will see you in
532:48 the next
533:01 [Music] video what's going on everybody welcome
533:03 back to another video today we're going
533:05 to be starting our Power BI tutorial
533:13 series now I am super excited to start this
533:15 series with you guys we are going to be
533:16 breaking this up into about six or seven
533:18 videos I don't really like those super
533:20 long videos where it's like four hours
533:22 long I like breaking mine up into chunks
533:25 so that's what we're going to do this is
533:26 the beginner series and so we're going
533:27 to start with the very basics and we're
533:29 just going to work our way up and I'm
533:30 going to walk you through every single
533:31 step of the way it'll be very easy to
533:33 follow everything will be provided for
533:35 you so that all you have to do is really
533:37 follow along and by the end of it you
533:39 should know Power BI a lot better you
533:40 should be a lot more comfortable using it now
533:42 before we actually jump onto my screen I
533:44 want to give a huge shout out to the
533:45 sponsor of this video and that is Udemy
533:48 you guys know that I absolutely love
533:49 Udemy I've been using them for years and
533:51 that is no exception when it comes to
533:52 Power BI I have taken some of the best
533:54 Power BI courses ever on Udemy so I
533:58 highly recommend you checking out the
533:59 ones that I have in the description
534:00 these are ones that I actually took and
534:02 I loved the most so if you're looking
534:04 for a full Power BI course I highly
534:06 recommend checking out Udemy thank
534:07 you so much again to our sponsor and now
534:09 without further ado let's jump onto my
534:11 screen and get started with the tutorial
534:12 all right so the first thing I'm going
534:13 to do is download Power BI Desktop I will
534:16 leave this link in the description so
534:17 you can just click on it go to it and
534:19 download it we're going to click this
534:20 download free button and once we click
534:23 it you can go to the Microsoft Store and
534:26 I already have it downloaded so when you
534:28 see it uh it'll already say downloaded
534:30 but um for you you can go in here you
534:33 can click download and it will download
534:35 it for you I'm on Microsoft uh but it
534:37 may look a little bit different for you
534:39 if you're on a different system but once
534:40 that is done we are going to open up
534:43 Power BI so let's go right down here to
534:45 our search let's go to
534:52 powerbi and it is going to open up for us all right so right away this is what
534:54 us all right so right away this is what it's going to look like when you open it
534:56 it's going to look like when you open it and we're going to go right over here to
534:58 and we're going to go right over here to get data and let's click on that it's
535:01 get data and let's click on that it's going to open up this window and it's
535:03 going to open up this window and it's going to give us a lot of different
535:05 going to give us a lot of different options for where we can get data from
535:07 options for where we can get data from now some of these are free and some you
535:09 now some of these are free and some you need to upgrade from but you just taking
535:11 need to upgrade from but you just taking a quick glance through here you have a
535:13 a quick glance through here you have a ton of options there's databases there's
535:17 ton of options there's databases there's um you know blob storages there's post
535:19 um you know blob storages there's post create SQL or different SQL databases um
535:22 create SQL or different SQL databases um there's Google analytics there's a lot
535:24 there's Google analytics there's a lot of places and you can go through the
535:26 of places and you can go through the process to connect to that data and you
535:28 process to connect to that data and you can pull that data in from those data
535:29 can pull that data in from those data sources now for what we are doing we're
535:31 sources now for what we are doing we're just going to be using an Excel I'm
535:34 just going to be using an Excel I'm going to leave the Excel that I'm going
535:35 going to leave the Excel that I'm going to be using in the description you can
535:37 to be using in the description you can go and download it and walk through this
535:39 go and download it and walk through this with me so what we're going to do is
535:40 with me so what we're going to do is click on Excel workbook and we're going
535:42 click on Excel workbook and we're going to click connect so we're going to go
535:44 to click connect so we're going to go right here in our powerbi tutorials
535:46 right here in our powerbi tutorials folder and we're going to click on
535:48 folder and we're going to click on apocalypse food prep so let's click on
535:50 apocalypse food prep so let's click on that and it is going to connect and pull
535:53 that and it is going to connect and pull that data in now right here we have our
535:55 that data in now right here we have our Navigator and so if you had a lot of
535:58 Navigator and so if you had a lot of different sheets you can click on that
535:59 and choose which ones to pull in I just
536:02 clicked on it right over here and we're
536:04 able to preview the data but I can't
536:06 load or transform it yet I need to
536:08 select which sheets I'm bringing in so
536:11 we only have one that's the only one
536:12 we're going to bring in so you can go
536:14 ahead and load the data or you can click
536:16 on transform data it's going to take us
536:17 to Power BI Power Query which is going to
536:20 allow us to transform our data so I'm
536:23 going to have an entire video on how to
536:25 transform the data but I'm going to give
536:26 you a really quick glance at it to kind
536:28 of show you what it is so right up here
536:31 it says our Power Query editor this is
536:34 the window to basically transform your
536:36 data and get it ready for your
536:37 visualizations now you can do this in
536:39 Excel if you want to and do that
536:41 beforehand or you can do it here and there
536:43 are lots of things that we can do in
536:45 here as you can see at the top again
536:47 I'll have an entire video dedicated to
536:50 just Power Query but let's take a quick
536:51 look at the data and see if there's
536:53 anything we want to transform quickly
536:54 before we actually go and start building
536:56 our
536:58 visualizations so over here we have the
537:00 store where we purchased it we have the
537:02 product that we purchased the price that
537:06 we paid and the date that we bought it now the first thing that jumps out to me
537:08 is that this just says date on it um we
537:11 might want to say date
537:14 underscore purchased and we're going to hit
537:17 enter and if you noticed right over here
537:19 on these applied steps it says renamed
537:21 columns everything that you do every
537:24 single step that you apply to transform
537:26 this data is going to be right over here
537:29 and if I want to if I go back and I say
537:31 you know I really didn't want to rename
537:32 that column I can just click X and it is
537:35 going to get rid of that and take it
537:37 back to its original state so again I'm
537:40 just going to say purchase
537:42 and we're going to enter that now this
537:45 is our apocalypse food prep so this is
537:48 food that we are buying for the
537:49 apocalypse um for this example and if we
537:52 look at our products we have bottled
537:54 water canned vegetables dried beans milk
537:56 and rice and all of that stuff makes
537:58 sense except for the milk uh milk will
538:00 not stay or last long in the apocalypse
538:03 so I think what we're going to do is
538:04 we're going to filter that out really
538:05 quickly and we're going to click okay and
538:09 right over here again it says filtered rows
538:11 and so now if we scroll down there's no
538:13 milk so what we are going to do is we
538:16 are going to go over here to close and
538:19 apply and it is going to actually load
538:22 the data into Power BI
538:25 desktop so on this left-hand side it
538:27 immediately takes us to the report tab
538:30 and what we want to do is go right here
538:32 to the data
538:33 tab and take a look at our data so again
538:39 there's our date purchased and as you can see the milk is not in there another
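For anyone who thinks better in code, the two Power Query steps from this part (renaming the date column, then filtering out the milk) can be sketched in plain Python. This is only a rough equivalent and not what Power BI runs internally; the rows, the column names, and the helper functions rename_column and filter_out are made up for illustration.

```python
# Hypothetical sample rows; the real workbook's values are not shown here.
purchases = [
    {"Store": "Costco", "Product": "Rice", "Price": 20.0, "Date": "2022-01-05"},
    {"Store": "Target", "Product": "Milk", "Price": 3.5, "Date": "2022-01-06"},
    {"Store": "Walmart", "Product": "Dried Beans", "Price": 12.0, "Date": "2022-02-01"},
    {"Store": "Costco", "Product": "Milk", "Price": 3.0, "Date": "2022-02-03"},
]


def rename_column(rows, old, new):
    """Return new rows with the key `old` renamed to `new`, keeping column order."""
    return [{(new if key == old else key): value for key, value in row.items()}
            for row in rows]


def filter_out(rows, column, unwanted):
    """Return only the rows whose `column` value is not `unwanted`."""
    return [row for row in rows if row[column] != unwanted]


# Step 1, similar to "Renamed Columns": give the date column a clearer name.
renamed = rename_column(purchases, "Date", "Date Purchased")

# Step 2, similar to "Filtered Rows": drop every milk purchase.
no_milk = filter_out(renamed, "Product", "Milk")

for row in no_milk:
    print(row)
```

Each step returns a new list instead of changing the old one, which is loosely why a step can be removed later without touching the original data.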
538:43 tab that we're going to take a look at
538:44 um and again in this report tab this is
538:46 where we actually build our
538:48 visualizations the data is where we can
538:50 see the data and change it up a
538:52 little bit and change some small things
538:53 about it like sorting the columns or
538:55 even creating a new column and over here
538:58 we have this other tab and it is called
538:59 model and this is especially useful when
539:01 you have multiple tables or multiple
539:03 Excel files and you need to join them to kind
539:05 of connect them together we don't have
539:08 that but in a future video I'm going to
539:10 walk through how to use this entire
539:11 tab so now let's go back to the
539:13 data tab and I want to just look at the
539:15 data really quickly before we go over to
539:17 the report tab and we start building our
539:19 first visualization as you can see I've
539:21 been buying these different products in
539:23 different months so this rice I've been
539:25 purchasing in January February March and
539:27 April and I've been buying it from three
539:29 different locations because I wanted to
539:31 see if I was spending less money at one
539:32 location on all of the products so then
539:35 I would just shop there in the future
539:36 and save a lot of money or if there were
539:38 specific products that were really cheap
539:40 at one location but others they were
539:42 cheaper at a different location so I
539:44 should just buy like the dried beans at
539:46 Costco but everything else I should be
539:48 buying at Walmart and so that's what
539:49 we're going to look at in just a little
539:53 bit so let's go over to the report tab right up here at the top there's this
539:55 data section so you can kind of choose
539:56 if you want to add any more data now
539:58 that we are here we can also write
540:01 queries or transform the data like we
540:02 were looking at in the Power Query
540:04 editor window over here in the insert we
540:06 can add a new visualization or a text
540:08 box and then in the calculation section
540:11 we can create a new measure or a
540:13 quick measure and then over here we have
540:14 share where you can actually publish
540:16 your report or your dashboard online now
540:18 over on the visualization section on
540:20 this far right this is a very important
540:22 area this is where a lot of the actual
540:25 creating of the dashboards happens so
540:27 let's take a look really quick and we'll
540:29 get into a lot of these things as we're
540:31 actually building our dashboard so we're
540:32 not just sitting here looking and
540:34 talking we're going to be actually
540:35 building and doing all right so we're
540:37 going to click right here on this drop
540:38 down on sheet one it's going to show us
540:40 all of our columns now two of the things
540:43 that we wanted to look at were where are
540:45 we spending the least amount of money
540:47 buying the exact same product that'll
540:48 help us determine where we want to shop
540:50 and the second thing was should I be
540:52 buying all my products at the same place
540:54 or are there certain products that
540:56 they're going to be cheaper at a
540:59 specific store and I should buy it there so let's start out with the first one
541:01 which we're just going to see uh with
541:03 the store and the
541:05 price uh where we're spending the least
541:08 amount of money and just at a quick
541:10 glance we can see we're spending the
541:11 least amount of money at Costco at $210
541:13 versus Target 219 and Walmart at 225 and
541:18 that really answers our question but we
541:19 want to visualize it better be able to
541:21 see it in an easier way so we're going
541:23 to go right over here and we can click
541:25 on a lot of these but the one that
541:27 probably makes the most sense is the
541:29 stacked column
541:30 chart and it's going to show Walmart
541:33 Target and Costco now they're all the
541:34 same color let's add a legend so we're
541:37 just going to drag store over here down
541:39 to this legend and let's make this
541:42 larger while we're working on it so now
541:45 we can see we're spending the most
541:46 amount of money at Walmart right in
541:48 between at Target and then at Costco is
541:50 the lowest and so right there we know
541:52 that Costco is the place to go for our
541:54 apocalypse food prep but is it going to
541:57 be that way for every product I don't
542:03 know let's take a look let's put this up in this corner and let's start a new one
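The first chart is really just the price added up per store. As a rough sketch of that same calculation outside Power BI, here is a plain Python version; the purchase rows below are invented for the example (they are not the numbers from the workbook) and total_by_store is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical (store, product, price) rows, invented for this sketch.
purchases = [
    ("Costco", "Rice", 18.0),
    ("Costco", "Dried Beans", 10.0),
    ("Target", "Rice", 16.0),
    ("Target", "Dried Beans", 13.0),
    ("Walmart", "Rice", 21.0),
    ("Walmart", "Dried Beans", 12.0),
]


def total_by_store(rows):
    """Sum the price of every purchase, grouped by store."""
    totals = defaultdict(float)
    for store, _product, price in rows:
        totals[store] += price
    return dict(totals)


totals = total_by_store(purchases)

# The store with the smallest total is the cheapest place overall.
cheapest = min(totals, key=totals.get)

for store, total in sorted(totals.items(), key=lambda item: item[1]):
    print(store, total)
print("cheapest overall:", cheapest)
```

Dragging store and price onto a column chart does this grouping for you; the sketch only makes the sum behind each bar explicit.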
542:06 we're going to need to select the
542:07 product for sure and the price and
542:11 probably additionally the store as well
542:14 and let's click
542:16 on let's not do this one we need a
542:18 clustered column chart that's what we
542:20 need let's bring this over here let's
542:23 expand this quite a bit and so really at
542:26 a glance this is giving us everything
542:28 that we need we can see each product
542:31 right here and we can see how much we're
542:33 paying per store and so for rice we're
542:36 paying it looks like a lot more for our
542:40 rice at Walmart while at Target is
542:42 actually where we are paying the least
542:44 now if we look at all of these it looks
542:46 like for Costco the only one that we're
542:48 really paying a lot more on is on our
542:50 rice but for our dried beans our bottled
542:54 water we're paying quite a bit less and
542:57 really it's pretty negligible for these
542:59 canned vegetables we're paying maybe
543:00 what 60 cents 50 60 cents more per can
543:04 so that's pretty negligible but for the
543:06 big ticket items um we're really
543:08 spending a lot less at Costco if we
543:10 wanted to save just a little bit
543:12 more money we could go to Target for our
543:14 rice now if I want to make this more
543:16 like a dashboard and we're only keeping
543:18 these two things I'm going to kind of
543:20 size them kind of like this whoops going
543:24 to show you that in a little bit I'm
543:28 going to size them a little bit like this so now that we have that looking
543:30 good we want to change the title of both
543:32 of these so what we're going to do is go
543:34 over here in our visualizations and
543:36 format your visual uh and we are going
543:39 to go to this general go to title and
543:41 now we can name it anything we really
543:44 want for this we're going to say best
543:48 store for
543:51 product and while we're in here one
543:53 other thing that I wanted to do is I
543:54 want to go to this visual go right down
543:57 here to these data labels now we haven't
543:59 added any data labels so I'm going to
544:02 click on and you'll see exactly what it
544:04 does uh it just puts the labels and the
544:06 numbers above it so you don't have to
544:07 actually like hover over it and see what
544:09 it is now it is actually rounding these
544:11 numbers so what we're going to do is go
544:13 down here we're going to go down to
544:16 values and we'll go down to display
544:19 units and it's on auto so it's auto
544:21 rounding those numbers and we're just
544:22 going to say none so we can see the
544:24 actual value of these
544:27 numbers and we can do the exact same
544:29 thing over here it probably is a good
544:32 thing to do um and it just is going to
544:35 visualize it a little bit differently in
544:36 here but you can always change that if
544:38 you want to go over here to
544:46 title and we're going to say total by store and now we're going to take a look
544:50 and so in a matter of minutes we were
544:52 able to take our data from an Excel file put
544:54 it into Power BI transform it a little
544:57 bit then we're able to create these
544:59 visualizations that gave us concrete
545:01 answers to some very important topics we
545:04 now know that Costco is the place to go
545:06 for basically every single product
545:08 except if we're buying rice and if we
545:10 want to save just a few dollars we're
545:12 going to head over to Target and that's
545:14 genuinely going to change my shopping
545:15 habits for the next several years until
545:17 the apocalypse happens so in future
545:19 videos we're going to dive into a lot of
545:20 the things that we looked at today but
545:22 just in more detail and then at the very
545:24 end of the series we're going to have an
545:25 entire project where we really use every
545:27 single part of Power BI and create a
545:30 beautiful dashboard and so that's all we
545:31 have for our very first video in our
545:33 Power BI series I hope it was helpful if
545:35 you like this video be sure to like and
545:37 subscribe below and I'll see you in the
545:39 next video
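As a recap of the second question from this video (is one store cheapest for every product, or does it depend on the product), here is a small plain Python sketch of the comparison the clustered column chart shows. The rows are invented so that the outcome mirrors the conclusion above (rice cheaper at Target, dried beans cheaper at Costco); they are not the real workbook values, and cheapest_store_per_product is a hypothetical helper.

```python
# Hypothetical (store, product, price) rows, invented for this sketch.
purchases = [
    ("Costco", "Rice", 18.0),
    ("Target", "Rice", 16.0),
    ("Walmart", "Rice", 21.0),
    ("Costco", "Dried Beans", 10.0),
    ("Target", "Dried Beans", 13.0),
    ("Walmart", "Dried Beans", 12.0),
]


def cheapest_store_per_product(rows):
    """Total the spend per (product, store) pair, then keep the lowest store per product."""
    totals = {}
    for store, product, price in rows:
        key = (product, store)
        totals[key] = totals.get(key, 0.0) + price

    best = {}
    for (product, store), total in totals.items():
        # Keep this store if it is the first seen or cheaper than the current best.
        if product not in best or total < best[product][1]:
            best[product] = (store, total)
    return best


best = cheapest_store_per_product(purchases)
for product, (store, total) in best.items():
    print(product, "->", store, total)
```

The chart groups by product first and store second, which is the same (product, store) pairing used as the dictionary key here.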
545:42 [Music]
545:51 what's going on everybody today we're
545:52 continuing our Power BI tutorial series
545:54 and in this video we're going to be
545:55 looking at Power Query
545:57 [Music]
546:01 now Power Query is really great
546:04 because it allows you to actually
546:05 transform the data before you actually
546:07 get it into Power BI so if you want to
546:09 make any changes like adding or deleting
546:11 a column or changing the data type or a
546:13 ton of other things you can do all of
546:15 that in Power Query now without further
546:17 ado let's jump on my screen and get
546:19 started with the tutorial all right so
546:20 before we jump over to Power BI and start
546:22 using Power Query I wanted to take a
546:24 look at the data and this is the Excel file
546:26 from our last video called apocalypse
546:28 food prep and in that video we went
546:30 through and we bought some rice some
546:32 beans water vegetables and milk all for
546:34 the apocalypse getting prepared for that
546:37 now we decided to buy some additional
546:39 things like rope some flashlights duct
546:42 tape and a water filter several water
546:45 filters and after we purchased those uh
546:48 our boss or whoever we're working with
546:50 or somebody decided to go and make a
546:52 pivot table now in this pivot table they
546:54 kind of broke it out by Costco Target
546:56 and Walmart and had all the items had
546:59 some subtotals as well as some grand
547:07 to kind of copy and paste that into this and you'll see this a lot when you're
547:09 working with uh people who use Excel
547:11 they like to kind of make things like
547:12 this maybe make it into like a table or
547:15 format it a little bit differently but
547:16 you'll see stuff like this a lot so this
547:19 is what we're going to actually pull
547:20 into Power Query and work with now we're
547:23 going to imagine that this is all we
547:25 have this is the only thing we were
547:26 working with and I'll kind of reference
547:28 this pivot table a little bit but we're
547:30 going to pretend this is all we have and
547:32 we want to transform it to make it a lot
547:34 more usable to where we can make
547:35 visualizations with it so let's hop over
547:37 to Power BI and pull this Excel file in so
547:39 what we're going to do is click import
547:41 data from Excel we're going to click
547:43 apocalypse food prep and click open and
547:45 then it's going to bring up this window
547:46 right here now this is where we can
547:48 choose what data to bring in so we can
547:50 take a preview and just click on it real
547:53 quick and this is the pivot table that
547:54 we were looking at so it does have that
547:56 pivot table so we are able to pull in
547:58 just a pivot table and then we have the
548:01 purchase overview where it's kind of
548:02 that formatted um thing that we were just
548:05 looking at with all the colors we're
548:07 going to pull both of those in so we're
548:08 going to pull in the pivot table and the
548:10 purchase overview now we could just load
548:12 it or we could transform it and we're
548:15 going to click transform and that's
548:16 going to bring us to Power Query so
548:17 let's click on transform data so now
548:19 really quick before we actually jump
548:20 into working through this and
548:22 transforming it I want to show you what
548:24 the Power Query editor looks like so if
548:28 we go right over here we have our queries and these are the tables that we
548:30 actually pulled in and we can click on
548:31 those and kind of go back and forth
548:33 between them now up top we have our
548:35 ribbon and the ribbon offers a lot of
548:37 functionality we have things like remove
548:40 columns keep rows remove rows split
548:43 columns these are all things that we're
548:45 likely to use when using this Power
548:46 Query editor there's also another tab
548:49 called transform where there's a lot of
548:51 functionality here as well things like
548:53 unpivoting a column or transposing
548:55 columns and rows and using a first row
548:58 as a header some of the things that
549:00 we'll be looking at today there's also
549:02 another tab called add a column and this
549:04 one's pretty self-explanatory where you
549:05 can add additional columns like duplicating
549:07 a column creating an index column or a
549:10 conditional column those are the three
549:12 main ones there's also view tools and
549:14 help but we're not going to really be
549:18 looking at those today and then on the far right side we have our query
549:19 settings you can do things like change
549:22 the name so we call it pivot table
549:25 2022 and it'll update right over here on
549:28 our query side and we have our applied
549:31 steps now our applied steps are
549:33 extremely important and very very useful
549:35 anytime we make any change to transform
549:37 this data it's going to be documented
549:39 right here and then we can go back and
549:41 look at it or we could even delete that
549:43 change in the future if we want to and
549:46 go back to a previous version of what we
549:48 just did so when we loaded the data into
549:50 Power BI it did a few things for us it
549:52 shows the source the navigation and it
549:54 promoted the headers and then it also
549:55 changed the data type so if we want to
549:58 check we can actually see those things
550:00 or change those things like this source
550:02 right here we can click on this little
550:03 icon and it's going to bring up the
550:05 actual path where we got this file so if
550:08 we wanted to change that or it
550:09 changes in the future we can come
550:13 here and we can change this file path but we're not going to do that right now
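One way to picture the applied steps pane is as an ordered list of small functions that are replayed from the source every time. The sketch below is only a mental model in plain Python, not how Power Query is actually implemented; the step functions, the run_steps helper, and the tiny source table are all hypothetical.

```python
def promote_headers(rows):
    """Treat the first row as column names and turn the rest into records."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def change_types(rows):
    """Convert the Price column from text to a number."""
    return [{**row, "Price": float(row["Price"])} for row in rows]


def run_steps(source, steps):
    """Replay every step in order, starting again from the untouched source."""
    data = source
    for _name, step in steps:
        data = step(data)
    return data


# Hypothetical raw sheet: the first row holds the headers, everything is text.
source = [
    ["Store", "Price"],
    ["Costco", "10"],
    ["Target", "12"],
]

steps = [
    ("Promoted Headers", promote_headers),
    ("Changed Type", change_types),
]

with_all_steps = run_steps(source, steps)

# Deleting the last step is the same as replaying the list without it.
without_last_step = run_steps(source, steps[:-1])

print(with_all_steps)
print(without_last_step)
```

Because the source is never modified, removing a step just means running a shorter list, which is the idea behind clicking X on a step to get the earlier state back.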
550:14 but we're not going to do that right now so let's click on cancel and let's go
550:17 so let's click on cancel and let's go back down to change type so it promoted
550:19 back down to change type so it promoted these headers and obviously these
550:21 these headers and obviously these headers are not correct we're looking at
550:22 headers are not correct we're looking at this pivot table and not the purchase
550:24 this pivot table and not the purchase overview but it changed these column
550:27 overview but it changed these column headers and so in the future if we
550:28 headers and so in the future if we wanted to we could easily change those
550:30 wanted to we could easily change those but it did that for us and it changed
550:32 but it did that for us and it changed the type as well so if you look right
550:35 the type as well so if you look right here it says
550:36 here it says abc123 all the way over here it's where
550:38 abc123 all the way over here it's where it just says ABC ABC means it's only
550:41 it just says ABC ABC means it's only going to be text where abc123 means it
550:44 going to be text where abc123 means it could be basically anything uh text or
550:46 could be basically anything uh text or it could be numeric so now let's go over
550:49 it could be numeric so now let's go over to purchase overview and this is the one
550:51 to purchase overview and this is the one that we're actually going to be working
550:52 that we're actually going to be working on the most but we might be looking at
550:54 on the most but we might be looking at pivot table just a little bit to kind of
550:55 pivot table just a little bit to kind of reference it and see some of the
550:57 reference it and see some of the differences so before we do anything
550:58 let's just take a look at how powerbi
551:01 decided to take this data in so it chose
551:03 this apocalypse food prep overview as
551:06 kind of the first column and that was
551:07 kind of our header or the title of what
551:09 we were looking at before and then all
551:11 these other columns are basically column
551:12 1 2 3 4 5 so that's something that
551:15 we're going to want to change in just a
551:16 little bit there's also all these blank
551:19 rows right here at the top and
551:21 kind of these null values as we go along
551:24 and we'll take a look at those and we
551:25 are kind of going to want to get rid of
551:27 some of this and just clean this up to
551:29 make it more usable for our powerbi
551:31 visualizations this may be perfectly
551:33 fine and acceptable in Excel but when
551:35 you're pulling it into powerbi the real
551:37 reason you're pulling it in is to create
551:39 visualizations not just for it to look good
551:41 in Excel so we're going to need to
551:43 clean this up quite a bit so let's go
551:45 right up top the first thing that I want
551:47 to do is I want to get rid of these top
551:49 rows so we're going to go to this top
551:51 ribbon and we're going to click remove
551:53 rows and we're going to select remove
551:55 top rows and we're going to select two
551:58 because we have one two rows of all
552:00 nulls and those are completely useless
552:02 we just want to get rid of them right
552:03 away so let's click OK and it removed
552:07 those the next thing that we want to do
552:09 is this location product and all
552:12 these dates these are actually the
552:14 column headers that we wanted so what we
552:17 need to do now is go over to
552:20 transform and we want to say use first
552:22 row as
552:24 headers and just like that we have
552:27 location products and these dates as our
552:29 headers exactly how we wanted them
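
For readers who think in code, the two steps just described (remove top rows, then use first row as headers) can be sketched in plain Python. This is only an illustration of the idea, not what powerbi runs internally; the sample rows, the product name and the helper names `remove_top_rows` and `promote_headers` are all assumptions made up for this sketch.

```python
# Hypothetical raw sheet as it might arrive: two all-null rows,
# then the row that should become the header, then the data rows.
raw_rows = [
    [None, None, None],
    [None, None, None],
    ["Location", "Products", "1/1/2022"],
    ["Costco", "Rice", 2.70],
    ["Target", "Rice", 2.50],
]


def remove_top_rows(rows, count):
    """Drop the first `count` rows, like the 'remove top rows' step."""
    return rows[count:]


def promote_headers(rows):
    """Use the first remaining row as column names for every later row."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


table = promote_headers(remove_top_rows(raw_rows, 2))
print(table[0])
```

Each step takes a table and returns a new one, which is roughly how the applied steps list in the editor behaves: removing a step simply means not applying that function.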
552:32 now let's say for whatever reason you know
552:33 we made a mistake and we needed to go
552:35 back we would just select the removed top
552:38 rows step and that would be perfectly fine
552:40 now you can see over here it promoted
552:42 the headers but it's also changed the
552:44 data type so before if we went to before
552:47 we promoted the headers these were all
552:50 ABC123 because it had a lot of
552:53 different data types in there so it just
552:54 kind of made a generic data type but
552:57 when we promoted these headers the first
552:59 thing that it decided to do was also
553:01 change this data type for us giving us
553:03 its best guess as to what this data type
553:06 is and it decided to do this decimal so
553:09 this 1.2 is a decimal but we're
553:11 actually going to change that and all
553:12 you have to do is click on this 1.2
553:15 or the data type that it has right
553:17 here for you and we're going to click on
553:19 fixed decimal number and let's do
553:21 replace
553:22 current and now it's just a little bit
553:25 better so now it's 2.70 2.50 and that's
553:28 normally how we would read values
553:31 like this because this is money so we
553:33 would normally read it to the second
553:34 decimal just like that and if we have it
553:36 on the second decimal for some we should
553:38 probably have it on the second decimal
553:39 for all of them so really quickly
553:41 I'm going to go through and I'm just
553:43 going to change that and it should be
553:45 pretty quick so hang with me for just a
553:48 second all right that is perfect now for
553:51 the purposes of what we're about to do
553:54 we don't actually need these subtotals
553:56 or this Costco total Target total and
553:58 Walmart total as well as the grand total
554:00 really we want to get rid of those and
554:02 so what we're going to do is we're going
554:04 to go right over here we're going to
554:05 click on this drop down and we're going
554:06 to try to filter this data before we
554:08 actually load it into powerbi so we're
554:11 going to filter and we're going to say
554:13 remove empty and let's remove those and
554:17 it's going to take out all of those
554:18 nulls if we wanted to try to filter this
554:20 out by saying something like Costco
554:22 total or Target total we could do that
554:25 by going right here clicking this drop
554:27 down on products going to text filters
554:29 and saying does not contain and let's do
554:33 insert and we're going to say does not
554:35 contain and we want to say total and
554:39 let's click OK and again it
554:41 filtered out all of those things so
554:43 there's a few different options that you
554:44 can do if you want to filter out rows
554:46 that contain either null values or
554:48 specific values now the next thing that
554:50 we're going to do is actually get rid of
554:52 a column this grand total column and so
554:54 what we're going to do is we're going to
554:56 click on the very top part where it says
554:58 grand total we're going to go back over
555:00 here to home and we're going to click on
555:02 remove columns and it says insert that's
555:04 because we're on this filtered rows one
555:06 right here but what we're going to do
555:08 is just insert that and it'll insert
555:10 right there that's totally fine we can
555:11 just move it to the bottom now we got
555:13 rid of this column entirely
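
The filtering and column removal above can also be pictured as three small functions. This is a minimal sketch in plain Python under stated assumptions: the sample rows are invented, the layout (total labels in the location column with an empty product cell) is a guess at the sheet and not confirmed by the video, and `remove_empty`, `does_not_contain` and `remove_column` are hypothetical helper names.

```python
# Hypothetical rows after header promotion; the subtotal row has no product.
rows = [
    {"Location": "Costco", "Products": "Rice", "1/1/2022": 2.70, "Grand Total": 2.70},
    {"Location": "Costco Total", "Products": None, "1/1/2022": 2.70, "Grand Total": 2.70},
    {"Location": "Target", "Products": "Rice", "1/1/2022": 2.50, "Grand Total": 2.50},
    {"Location": "Grand Total", "Products": None, "1/1/2022": 5.20, "Grand Total": 5.20},
]


def remove_empty(rows, column):
    """Keep only rows where `column` holds a value (the 'remove empty' filter)."""
    return [r for r in rows if r[column] not in (None, "")]


def does_not_contain(rows, column, text):
    """Keep only rows whose `column` text lacks `text` (case-insensitive by choice here)."""
    needle = text.lower()
    return [r for r in rows if needle not in str(r[column] or "").lower()]


def remove_column(rows, column):
    """Return copies of the rows without `column`."""
    return [{k: v for k, v in r.items() if k != column} for r in rows]


# Either filter removes the subtotal rows in this sample; both are shown for comparison.
by_null = remove_empty(rows, "Products")
by_text = does_not_contain(rows, "Location", "total")
cleaned = remove_column(by_null, "Grand Total")
print(cleaned)
```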
555:16 now this looks really good visually I like how
555:19 this looks I like how everything is set
555:20 up the biggest thing about this is that
555:23 when you're actually wanting to use this
555:25 for visualizations having these dates as
555:27 columns doesn't really work too well and
555:30 so what we're going to want to do is
555:32 we're going to want to transpose this or
555:35 unpivot this to where these dates are
555:37 actually rows so what we're going to do
555:39 is select the first date which is
555:40 January 1st all the way through April
555:43 1st and we're going to hit shift and
555:44 click on that April 1st right there to
555:46 select all of them at the same time and
555:48 then we're going to go over here to the
555:50 transform tab and we're going to click
555:52 unpivot columns and let's see what this
555:55 does and so now what we've done is we've
555:57 basically recreated our original Excel
556:00 that we had so let's go back and take a
556:01 look really quickly at that so this
556:03 looks almost identical to what we have
556:05 in powerbi right now and this is
556:07 extremely usable and very good for
556:09 visualization
556:10 and is much much better than this but
556:12 again we were pretending that this is
556:14 what we were given at the beginning so
556:16 you have to imagine you know somebody
556:18 just handing you this and you need to
556:19 make it much more usable for
556:21 visualizations in the future which
556:23 happens a lot and we actually wanted to
556:25 create this we just weren't given this
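
The unpivot step is easier to see with a tiny example. The sketch below, in plain Python, turns one row per product with one column per date into one row per product per date. The sample data and the `unpivot` helper are assumptions for illustration; the output column names attribute and value mirror the defaults the editor shows, and skipping empty cells is a choice made in this sketch.

```python
# Hypothetical wide table: one column per date.
wide = [
    {"Location": "Costco", "Products": "Rice", "1/1/2022": 2.70, "2/1/2022": 2.75},
    {"Location": "Target", "Products": "Rice", "1/1/2022": 2.50, "2/1/2022": None},
]


def unpivot(rows, id_columns, attribute_name="Attribute", value_name="Value"):
    """Turn every non-id column into its own (attribute, value) row."""
    long_rows = []
    for row in rows:
        ids = {column: row[column] for column in id_columns}
        for column, value in row.items():
            if column in id_columns or value is None:
                continue  # skip identifier columns and empty cells
            long_rows.append({**ids, attribute_name: column, value_name: value})
    return long_rows


long_table = unpivot(wide, id_columns=["Location", "Products"])
for row in long_table:
    print(row)
```

The long shape is what makes a date axis possible later, because the dates now live in a single column that can be given a date type.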
556:27 now a few last things that we might want
556:29 to do to clean this up just a
556:31 little bit we're going to select the
556:32 data type and change this to date and
556:35 then we're going to select the value and
556:37 I double clicked on the value and I
556:39 actually want to call this cost or
556:42 product cost
556:46 and then for the location I
556:49 actually want this to be called
556:51 store so now this looks really good but
556:54 I want to show you one thing really
556:56 quickly on this pivot table 2022 so
556:58 let's go back here this looks very
557:01 similar to how we had it when we first
557:03 started one thing I wanted to show you
557:06 really quickly I want to click on
557:08 this first one we're going to make
557:10 this our column header and then we're
557:12 going to try to unpivot this
557:14 January February March April so really
557:16 quickly let's do that so we're going to
557:18 transform use first row as
557:21 headers so now we have this January
557:24 February March April now if you notice
557:26 these are not dates these are actually
557:29 text it says January February March and
557:32 April so if we go to do this and we
557:36 click
557:37 unpivot and here's the columns that are
557:39 created when we unpivot it it is January
557:43 February March and April these are not
557:45 dates so we cannot go and change this to
557:47 a date because that would error out
557:50 because it's actually text so it's
557:51 something that you want to look out for
557:53 it's something that you need to be aware
557:54 of and you can change that in the pivot
557:56 table so you want to be aware of how it
557:58 actually sits and looks in Excel or
558:00 whatever data source you're pulling from
558:02 before you actually pull it into power
558:04 query to transform
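
The same pitfall shows up in any tool that parses dates strictly. The sketch below uses Python's standard `datetime` module to show a bare month name failing a day/month/year style format, and one possible repair: attaching a year before parsing. The year 2022 is an assumption taken from the sheet name, and `parse_cell` and `month_name_to_date` are hypothetical helpers, not anything powerbi provides.

```python
from datetime import date, datetime


def parse_cell(text, fmt="%m/%d/%Y"):
    """Return a date if `text` matches `fmt`, otherwise None."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None  # plain text such as a month name does not match


def month_name_to_date(name, year):
    """Build the first day of the named month in an assumed year."""
    return datetime.strptime(f"{name} {year}", "%B %Y").date()


for header in ["1/1/2022", "January"]:
    print(header, "->", parse_cell(header))

print(month_name_to_date("January", 2022))
```

Fixing the header in the source sheet, as the video suggests, avoids needing a repair step like this at all.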
558:07 and now the very last thing that we need to do to finalize all
558:09 of this is go over here to close and
558:11 apply and once we click that everything
558:13 that we've worked on is going to be
558:14 applied to the actual data and it's
558:16 going to load into powerbi to create our
558:18 visualizations so let's go ahead and
558:19 click on that and so now the data has
558:21 been pulled into powerbi let's go right
558:23 down here to data and we can see the
558:25 data right here if we need to transform
558:27 this data again we can bring it back
558:29 into the power query editor window by
558:31 just clicking the transform data button
558:34 and it's going to bring us right back so
558:35 I hope that this was helpful thank you
558:37 so much for watching if you like this
558:39 video like and subscribe below and check
558:41 out all my other videos on everything
558:43 data analyst related I'll see you in the
558:45 next video
558:46 [Music]
558:56 what's going on everybody welcome
558:59 back to the powerbi tutorial series
559:01 today we're going to be taking a look at
559:02 building relationships
559:08 [Music] now when you import
559:10 multiple tables from either the same
559:12 data source or multiple data sources you
559:14 want to tie them together so that when
559:16 you're creating your visualizations
559:17 everything is connected so in this
559:19 tutorial we'll be walking through how to
559:21 create those relationships to make sure
559:22 that all of your tables are connected
559:24 properly and without further ado let's
559:26 jump onto my screen and get started with
559:27 the tutorial all right so before we jump
559:29 over to powerbi and start creating our
559:30 relationships and our model I want to
559:32 take a look at the data in Excel we
559:34 realized we were buying so many products
559:36 for the apocalypse that we decided to
559:38 start our own store and we have several
559:40 customers and some client information
559:42 down here and so I wanted to take a look
559:44 at some of the columns in these tables
559:45 that we're going to be looking at first
559:48 thing we have is the apocalypse store
559:50 these are the things that we are selling
559:51 I know it's a very limited inventory but
559:55 these are the really high sellers these
559:56 are the ones that I wanted to sell so we
559:58 have this product ID our product name
560:01 price and production cost then we have
560:04 this apocalypse sales this is how many
560:06 sales we've actually made to our
560:08 customers so we have this customer ID
560:11 our customer name product ID order ID
560:14 units sold and the date it was purchased
560:17 and then we have our customer
560:18 information right here here are all of
560:20 our clients so we have this customer ID
560:23 customer address city state and zip code
560:26 so now that we've taken a look at our
560:28 data let's go and load it into powerbi
560:30 so we're going to say import data from
560:31 Excel we're going to choose this model
560:33 right here we're going to click open and
560:36 we are going to want all three of these
560:37 so I'm going to click on all of them and
560:39 we're just going to load it we're not
560:40 going to transform the data at
560:46 all so now the data has been loaded let's go right over here on the left
560:48 hand side to our model tab and let's
560:50 scoot this over just a little bit and
560:53 move back and we're going to move these
560:56 tables up to where it's a little bit
560:58 easier to
560:59 see so right off the bat you can already
561:02 see that there are these lines between
561:04 these tables so there are already
561:06 relationships that powerbi has
561:08 automatically detected and created from
561:10 my experience powerbi actually does a
561:12 really good job at creating these
561:14 relationships automatically but we're
561:16 going to go in and take a look at these
561:17 and kind of see what everything means
561:19 and then we're going to go back and
561:20 create these relationships from scratch
561:22 just to make sure that we know how to do
561:24 every single part so to get started
561:26 let's double click on this line
561:27 connecting the customer information
561:29 table to the apocalypse sales
561:31 table and it's going to bring up this
561:33 edit relationship page right here so
561:36 this line right here connecting these
561:37 two tables actually gives us quite a bit
561:39 of information without actually having
561:41 to click into this edit relationship
561:42 page what this is showing is that we
561:45 have a one to many relationship and
561:47 there's only one or a single cross filter
561:50 direction and you can find both of those
561:52 things right down here and I'm going to
561:54 walk through what those mean in just a
561:55 little bit on this page you can also see
561:57 the columns that powerbi decided to
561:59 choose in order to tie these two tables
562:01 together now for our example it
562:04 decided to use the customer and customer
562:06 right here from the customer information
562:08 table as well as the apocalypse sales but I
562:11 don't really want to use those
562:13 specifically because on this apocalypse
562:15 sales table I might remove this customer
562:17 information and just keep the customer
562:19 ID it may have chosen these customer
562:21 columns because they have the exact same
562:23 name and really the same information but
562:26 I want to use this customer ID anyways
562:28 so what I'm going to do is I'm going to
562:29 click on that column and click on this
562:31 column and then I'm going to click OK
562:33 and if we go back into it by double
562:35 clicking again we're going to see that
562:37 it has now saved that and if we did what we
562:39 just did before which is kind of hover
562:41 over it it's going to show us what those
562:42 two tables are joined on so opening this
562:45 back up let's go down here to this
562:46 cardinality and cross filter direction
562:48 cardinality has several different
562:50 options that you can choose from you
562:51 have many to one one to one one to many
562:54 and many to many now for this example
562:57 we're looking at apocalypse sales and
562:58 we're going apocalypse sales down to
563:01 customer information now there are a lot
563:03 of rows in the apocalypse sales but
563:05 there are very few in this customer
563:07 information and there's only one
563:09 row per customer whereas in the
563:11 apocalypse sales up here the customer
563:13 can have several rows for several
563:15 different orders so that's why the
563:17 cardinality is many to one now if we
563:20 flip this and we say we want the
563:21 customer information here and we want
563:24 the apocalypse sales down here and we tie
563:26 that together now it's going to flip and
563:28 it's going to say one to many
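
One way to reason about cardinality is to ask whether the key column is unique on each side of the relationship. The sketch below does that check in plain Python. It is an illustration of the concept only, not how powerbi detects relationships; the sample rows and the `is_unique` and `cardinality` helpers are assumptions for this example.

```python
from collections import Counter

# Hypothetical data: one row per customer, several sales rows per customer.
customer_information = [
    {"Customer ID": 1, "State": "Texas"},
    {"Customer ID": 2, "State": "Minnesota"},
]
apocalypse_sales = [
    {"Order ID": 100, "Customer ID": 1},
    {"Order ID": 101, "Customer ID": 1},
    {"Order ID": 102, "Customer ID": 2},
]


def is_unique(rows, key):
    """True when every value of `key` appears in exactly one row."""
    counts = Counter(row[key] for row in rows)
    return all(n == 1 for n in counts.values())


def cardinality(from_rows, to_rows, key):
    """Describe the relationship read from the first table to the second."""
    left = "one" if is_unique(from_rows, key) else "many"
    right = "one" if is_unique(to_rows, key) else "many"
    return f"{left} to {right}"


print(cardinality(apocalypse_sales, customer_information, "Customer ID"))
print(cardinality(customer_information, apocalypse_sales, "Customer ID"))
```

Swapping the order of the two tables flips the label, which matches what happens in the dialog when the tables are swapped.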
563:30 now let's look at the cross filter direction and
563:32 there are only two options here it's
563:33 either single or both and if we choose
563:35 both and we click OK this now goes
563:38 from a single arrow pointing in one
563:40 direction to two arrows pointing in both
563:42 directions but what does this really
563:44 mean so in order to demonstrate this I'm
563:46 going to put this back to a single
563:48 direction and what we're going to try to
563:49 do is connect the data over here or the
563:51 columns over here to the columns in this
563:54 apocalypse store so let's go over here
563:56 to build a visualization and what we're
563:58 going to do is we're going to take this
564:00 customer information and let's just say
564:02 we want to look at state so I'm going to
564:04 click on state right here and I'm just
564:06 going to make this into a table and the
564:08 customer information table is only tied
564:11 right now to the sales table so we're
564:13 actually going to go over to the
564:14 apocalypse store and we want to see how
564:17 many product IDs are being bought in
564:20 these different states so really quickly
564:22 these different states so really quickly we're going to come up here and create a
564:23 we're going to come up here and create a new measure and all we're going to say
564:26 new measure and all we're going to say is this measure is the count of
564:28 is this measure is the count of Apocalypse store product ID and we're
564:32 Apocalypse store product ID and we're going to create that and now we're going
564:34 going to create that and now we're going to select it so it's added to that table
564:36 to select it so it's added to that table so now what this is showing is that
564:37 so now what this is showing is that there are 10 product s which there are
564:40 there are 10 product s which there are 10 products for each of these states but
564:43 10 products for each of these states but that's not actually technically correct
564:45 that's not actually technically correct because not every state purchased these
564:48 because not every state purchased these 10 different items if we go back to our
564:51 10 different items if we go back to our model and we change both of these
564:54 model and we change both of these to a both
564:56 to a both Direction and then we're going to go
564:58 Direction and then we're going to go back and see what changed in our
565:00 back and see what changed in our numbers so now let's go back to our
565:03 numbers so now let's go back to our visualization and now we can see that
565:05 visualization and now we can see that Minnesota actually only ordered seven
565:07 Minnesota actually only ordered seven different product IDs Miss Miss 8 New
565:09 different product IDs Miss Miss 8 New York 99 and Texas 10 this is actually
565:12 York 99 and Texas 10 this is actually much more accurate than before when you
565:15 much more accurate than before when you use the both option it takes these
565:17 use the both option it takes these tables and treats them as if they are a
565:19 tables and treats them as if they are a single table but the single option is
565:21 single table but the single option is not going to do that and so for our
565:23 not going to do that and so for our example if we're trying to connect this
565:24 example if we're trying to connect this table to this table and one of the last
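
The single versus both behavior described above can be sketched in plain Python. This is only a rough model of the idea, not how Power BI evaluates filters internally: the three tiny tables, their IDs, and the function name `product_count_by_state` are all made up for illustration and are not the tutorial's data.

```python
# Toy model of cross-filter direction. All rows are made up for illustration;
# they are not the tutorial's dataset.
customers = {1: "Minnesota", 2: "Texas"}   # customer ID -> state
store_products = ["P1", "P2", "P3"]         # product IDs in the store table
sales = [(1, "P1"), (1, "P2"), (2, "P1"), (2, "P2"), (2, "P3")]  # (customer ID, product ID)


def product_count_by_state(direction):
    """Count store product IDs visible for each state under a cross-filter direction."""
    counts = {}
    for state in sorted(set(customers.values())):
        if direction == "both":
            # The state filter reaches sales, and sales in turn filters the store table.
            ids = {cid for cid, st in customers.items() if st == state}
            bought = {pid for cid, pid in sales if cid in ids}
            visible = [pid for pid in store_products if pid in bought]
        else:
            # "single": the filter stops at sales, so the store table stays unfiltered.
            visible = store_products
        counts[state] = len(visible)
    return counts


print(product_count_by_state("single"))  # every state shows all 3 products
print(product_count_by_state("both"))    # only the products each state actually bought
```

With the single direction every state reports the full product list, which mirrors the misleading "10 products for each state" result; with both, each state only counts what it bought.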
565:26 and one of the last things that I want to show you is this
565:28 option right down here which says make
565:30 this relationship active now if we don't
565:33 click this and there are other options
565:34 in here that connect these things like
565:36 the customer to the customer then that
565:38 may be the active relationship but if I
565:40 select this as the active relationship
565:42 that means this is going to become the
565:43 default relationship between these two
565:45 tables so now let's come out of here
565:47 we're going to click cancel we're going
565:49 to zoom in just a little bit and bring
565:52 these tables a little bit closer so we
565:54 can zoom in just a little bit more now
565:57 we are going to go ahead and delete
565:59 these so we're going to say delete yes
566:02 and delete yes so just for demonstration
566:06 purposes we're going to build these
566:07 relationships from scratch so we're
566:09 going to come over to the customer
566:10 information table and we're going to
566:12 drag it all the way over here and put it
566:14 on top of this cust ID or the customer
566:16 ID in apocalypse
566:18 sales and it's going to automatically
566:20 create that relationship and we can open
566:23 this up and as you can see it created
566:25 the relationship between this customer
566:26 ID in the apocalypse sales and the
566:28 customer ID in the customer information
566:30 it also defaulted the cardinality to
566:32 many to one and the cross filter
566:34 direction to single so we're going to go
566:36 ahead and change that to both and click
566:38 okay
566:39 and then we're going to come over here
566:40 to the product ID in apocalypse store
566:42 and drag this over the product ID in the
566:44 apocalypse
566:45 sales and again if we open it up it
566:48 created that relationship for us it
566:50 created the cardinality automatically
566:52 and we're going to change this cross
566:53 filter direction to both and click okay
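
As a rough illustration of what "many to one" means for the relationship that was just created, the sketch below classifies a relationship by checking whether the key values on each side are unique. The transcript does not say how Power BI picks its default, so the `cardinality` function and the sample IDs are assumptions made only for this example.

```python
def cardinality(from_keys, to_keys):
    """Classify a relationship by whether each side's key values are unique."""
    from_unique = len(set(from_keys)) == len(from_keys)
    to_unique = len(set(to_keys)) == len(to_keys)
    if from_unique and to_unique:
        return "one to one"
    if to_unique:
        return "many to one"
    if from_unique:
        return "one to many"
    return "many to many"


# Made-up keys: product IDs repeat in sales but appear once in the store table.
sales_product_ids = ["P1", "P1", "P2", "P3", "P3"]
store_product_ids = ["P1", "P2", "P3"]

print(cardinality(sales_product_ids, store_product_ids))  # many to one
```

The sales table repeats a product ID for every order, while the store table lists each product once, which is why the sales side is the "many" side.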
566:56 and so on a really small scale that is
566:58 how it works of course it becomes a
567:00 little bit more complex the more tables
567:03 that you add and the more relationships
567:04 that are created but this is how you're
567:06 going to actually create the
567:07 relationships in the model tab within
567:10 powerbi I hope that this tutorial has
567:11 helped you understand this concept a
567:13 little bit better thank you guys so much
567:15 for watching I really appreciate it if
567:17 you like this video be sure to like and
567:19 subscribe below and I'll see you in the
567:20 next
567:22 [Music]
567:32 video what's going on everybody welcome
567:34 back to the powerbi tutorial series
567:36 today we're going to be taking a look at
567:38 Dax
567:44 [Music] now Dax stands for data analysis
567:46 expressions and it's basically a library
567:48 of functions and operators that help you
567:50 build formulas you can use Dax to create
567:53 measures and calculated columns within
567:55 powerbi which can really give you a lot
567:57 of insight into your data honestly it is
567:59 not super complicated and hopefully by
568:01 the end of this video you'll have a lot
568:03 more confidence actually using Dax in
568:05 powerbi so without further ado let's jump
568:07 onto my screen and get started with the
568:08 tutorial all right so let's take a look
568:10 at our tables and data before we get
568:11 started so we have two tables the
568:13 apocalypse sales and the apocalypse store
568:16 for this apocalypse sales table we have
568:18 the customer product ID order ID units
568:20 sold and the date it was purchased and
568:23 then for the apocalypse store we have
568:25 product ID product name price and
568:28 production cost now these are joined
568:31 together or they do have a relationship
568:33 together via the product ID so what
568:36 we're going to be using are these new
568:37 measures and new columns to create our
568:40 Dax functions so really quickly let's go
568:43 over to this report tab and let's drop
568:45 down our fields over here so we can see
568:48 everything and so to get us started
568:49 we're going to go right up here to
568:50 apocalypse sales we're going to
568:52 right click and click new measure and
568:55 it's going to open up this right here
568:56 which is basically our bar where we can
568:58 create our functions and so right here
569:01 it's automatically given us the name
569:02 measure but we can change that and we're
569:04 going to say count of sales so now we
569:08 can start writing our Dax function
569:10 that's just going to be the name of it
569:12 and what's going to show up right over
569:13 here once we click enter so let's go
569:16 over here and we're going to say count
569:19 and as we're typing it's automatically
569:21 giving us options it has something
569:23 called intellisense if you've ever used
569:24 other Microsoft products intellisense is
569:26 their kind of autocompletion that
569:28 helps you look at other options very
569:30 quickly and so we're just going to click
569:32 on this count and it's prompting us to
569:35 put in a column name and so we can come
569:37 down here and we can select one or we
569:39 can type it out and it'll try to predict
569:41 and help us choose which column to
569:43 select so for us we're going to use this
569:45 order ID but let's just start typing it
569:47 out we'll say order ID and then we can
569:50 click on it and we're going to close
569:53 this parenthesis and click enter or you
569:55 can go over here and click this check
569:56 mark but we're just going to click
569:58 enter and so over on this right side it
570:01 finalized that and saved that and we can
570:03 actually look at that by clicking on
570:04 this box next to
570:06 it and we want to look at this in a
570:09 table so now we can see that there are
570:11 74 sales now for this we want to see
570:14 who's buying our products we want to see
570:17 what our client name is so
570:19 we're going to go over here we're going
570:20 to choose customer and we're going to
570:23 put customer on top of sales and we're
570:26 just going to take a look at it like
570:28 this so now we can see that our number
570:30 one customer is Uncle Joe's Prep shop he
570:32 has 22 orders now they have the most
570:34 orders with us but it doesn't
570:36 necessarily mean that they're spending
570:37 the most money with us but we can take a
570:39 look at that later
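
To make the count of sales measure concrete, here is a small Python sketch of the same idea: count the non-blank order IDs, first over the whole table and then once per customer, which is roughly what happens when customer is added to the table visual. The rows, the shop names, and the helpers `count_column` and `count_by` are invented for illustration; the tutorial's real measure is written in Dax against its own data.

```python
from collections import defaultdict

# Made-up rows standing in for the sales table (not the tutorial's data).
sales = [
    {"customer": "Shop A", "order_id": 101},
    {"customer": "Shop A", "order_id": 102},
    {"customer": "Shop B", "order_id": 103},
    {"customer": "Shop B", "order_id": None},  # a blank value is not counted
]


def count_column(rows, column):
    """Count the rows whose value in `column` is not blank."""
    return sum(1 for row in rows if row[column] is not None)


def count_by(rows, group_column, column):
    """Evaluate the same count once per group, like a measure in a table visual."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(row)
    return {key: count_column(group, column) for key, group in groups.items()}


print(count_column(sales, "order_id"))          # 3
print(count_by(sales, "customer", "order_id"))  # {'Shop A': 2, 'Shop B': 1}
```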
570:41 the next thing that I want to take a look at is how many
570:43 products we're actually selling what are
570:44 our big products that we're selling we
570:46 have 10 different items but I don't know
570:49 exactly which one is selling the best if
570:52 one is doing really poorly and
570:53 getting no orders this is something that
570:55 I want to look into so all we're going
570:57 to do is go right back up here to
570:58 apocalypse sales again right click and
571:01 select new measure and for this one
571:03 we're going to call it the sum of
571:06 products sold
571:09 and all we're going to start out with is
571:11 by doing sum and if this seems familiar
571:15 to something like Excel you're 100%
571:17 correct it is very similar and remember
571:20 these are both Microsoft products so
571:22 there's going to be similar
571:23 functionality in both of them and so
571:26 this Dax is going to have a lot of
571:28 similarities to exactly how it has it in
571:30 Excel so we're going to do an open
571:32 bracket and now what we're going to
571:34 choose is this units sold we want to sum
571:37 up all of these units sold and see how
571:39 many we're actually selling so we're going
571:41 to say units sold I'm going to hit tab
571:45 it's going to autocomplete that I'm
571:46 going to close my parenthesis and I'm
571:48 going to come over here and click this
571:50 checkbox so now it's created that
571:52 measure and we're already selected in
571:54 this table so all we have to do is click
571:56 the check mark and it's going to show us
571:58 that we have 3,000 total products sold
572:02 and we can go through here and see what
572:03 the big sellers are and probably the
572:05 biggest one that I see right off the bat
572:07 is this multi-tool survival knife so
572:09 these Dax functions that you can write
572:11 can be very simple and lead to really
572:13 good insights that you can use for the
572:15 visualizations later on
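
The sum of products sold measure gives one grand total on its own and a separate total per product once a product field is in the visual. A minimal Python sketch of that behavior follows; the rows and the helper `sum_units` are hypothetical and only stand in for the tutorial's table and measure.

```python
# Made-up rows standing in for the sales table (not the tutorial's data).
sales = [
    {"product": "rope", "units_sold": 5},
    {"product": "matches", "units_sold": 10},
    {"product": "rope", "units_sold": 7},
]


def sum_units(rows, product=None):
    """Sum units sold, optionally restricted to one product (the filter)."""
    return sum(
        row["units_sold"]
        for row in rows
        if product is None or row["product"] == product
    )


print(sum_units(sales))                     # 22, the grand total
print(sum_units(sales, product="rope"))     # 12
print(sum_units(sales, product="matches"))  # 10
```

The formula is the same in every call; only the set of rows it sees changes, which is why a single simple measure can drive both the total and the per-product breakdown.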
572:17 now I want to take a look at the difference between
572:18 something like sum which is an
572:20 aggregator function and something like
572:22 sum X which is an iterator function
572:25 because if you add X to some of these
572:26 aggregator functions you can create them
572:29 or make them into an iterator
572:31 function so you can have sum and sum X
572:33 or average and average X adding X onto
572:36 the end of them can make them into an
572:38 iterator function so let's take a look
572:40 and see how that actually works I'm
572:42 going to show you the difference and
572:43 then I'm going to talk through the
572:44 difference at the end so really quickly
572:46 let's go back to our data and let's go
572:49 to the apocalypse store now what we have
572:52 right here is we have the price and we
572:53 have the production cost and we want to
572:55 see how much profit we're getting from
572:57 each of these as well as we can take a
572:58 look at the units sold and see how much
573:01 money we are actually making so what
573:03 we're going to do is we're going to come
573:04 back over here we're going to go to
573:06 apocalypse store we're going to right
573:09 click and create a measure and in just a
573:11 little bit we're going to be creating a
573:12 new column and that'll kind of show the
573:14 difference really well so we're going to
573:16 create this new measure and we're going
573:17 to name it
573:19 profit and we're going to come over here
573:22 and what we're going to do is we're
573:23 going to take the sum oops we're going
573:25 to start with our sums we're going to
573:27 take the sum of the
573:29 price and then we're going to close that
573:31 parenthesis and we're going to subtract
573:33 the sum of the production cost so all
573:38 that does is it says if something cost
573:39 $20 if we sold it for $20 and it only
573:42 cost us $10 that's $10 in profit for
573:45 that item and then what we're going to
573:46 want to do is we're going to actually
573:48 want to encapsulate that in parentheses really quickly
573:50 because we're about to use multiply and
573:54 then we're going to sum and now we're
573:56 going to take the units sold so how many
573:59 units were actually sold at that profit
574:01 that we just made so let's see if that
574:03 works and let's click the check right
574:05 here and so we have the profit so let's
574:08 click on the profit oops that's not what
574:10 I wanted to do let's use a new one or
574:12 let's create a new uh table we're going
574:15 to click
574:16 profit let's make it a table and I'm
574:18 going to pull this right over
574:20 here now we have our profit but what I really
574:23 want to know is which customer is
574:25 spending the most money at my store so
574:28 we're going to come right over here
574:29 we're going to click on customer and I'm going to
574:31 put customer at the top and just at a
574:33 glance we can see that Uncle Joe's Prep
574:35 shop is spending the most money at the
574:37 store now what I want to show you is
574:39 the difference between sum and sum X so
574:41 what I'm going to do is I'm going to go
574:44 back to this profit and I'm going to copy
574:47 this entire thing and we're going
574:50 to go back here to this table now we
574:53 just created a measure and we were able
574:55 to break it down by each customer so
574:58 let's go back over here now let's go up
575:01 here to home and we're going to create a
575:04 new column and we're going to call this
575:07 profit
575:08 profit underscore column and we're going
575:12 to literally paste the exact same thing
575:15 into here and we're going to hit
575:18 enter and each row is the exact same
575:23 thing so what it's doing is it is going
575:25 through the price and it's adding all of
575:27 it up and calculating it at the bottom
575:30 it's adding the production cost it's
575:31 going all the way down and calculating
575:33 it at the bottom and then it's going
575:35 over and looking at how many units it
575:37 sold and then it's performing this
575:39 calculation up here and then it gives us
575:41 the total and it's doing it for every
575:43 single row but that's not really what we
575:46 wanted to show what we wanted to show is
575:48 the profit for each row what we wanted
575:51 to say is here's the price for the rope
575:53 the production cost for the rope and
575:55 then how many units we actually sold and
575:57 then it'll calculate that and give us
576:00 the actual profit for just that row but
576:03 we cannot do it by just using this sum
576:06 what we need to do is use something
576:07 called sum X so let's add another column
576:11 let's go back to home say new
576:14 column and now we're going to
576:17 say profit underscore oops underscore
576:22 column underscore sum
576:26 X and now we're going to use sum X and
576:31 hit tab and we need to choose the table
576:33 that we want to put this in so we're
576:35 going to say apocalypse sales because
576:37 that's the table that we're looking at
576:38 right here we're going to say comma and
576:40 now we need to input an expression which
576:42 it says it returns the sum of an
576:44 expression evaluated for each row in a
576:46 table before when we were just using sum
576:48 it was looking at all of these combined
576:50 now it's taking it row by row so what
576:53 we're going to do is basically input the
576:54 same thing as we did before I'm going to
576:56 copy I'm going to paste that it's not
576:58 going to be correct I need to get rid of
576:59 these
577:00 sums but it's basically the exact same
577:03 equation give me just a second and let's
577:07 get rid of this
577:08 sum and let's see if this works so
577:11 let's click the check
577:14 button and now this looks a lot better
577:18 so what this is now showing us is at a
577:20 row level this nylon rope made us 51,000
577:22 almost
577:23 $52,000 the waterproof matches made us
577:27 $115,000 and we can go down and look at
577:30 each item and see how much that actually
577:33 made us versus this profit column and so
577:37 that is the biggest difference between
577:38 sum and sum X hopefully that made sense
577:41 I know that sum and sum X and the
577:43 difference between an aggregator
577:44 function and an iterator function can be a
577:46 little bit confusing especially if
577:47 you've never done it before but
577:49 hopefully that was a good example for
577:50 you to understand that concept
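
The sum versus sum X difference walked through above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the two products and their numbers are made up, units sold sits in the same row as price and cost (in the tutorial it lives in the related sales table), and the function names are hypothetical.

```python
# Made-up product rows (not the tutorial's data). For simplicity, units sold is
# stored on the same row as price and cost.
products = [
    {"name": "rope", "price": 20, "cost": 10, "units": 3},
    {"name": "matches", "price": 5, "cost": 2, "units": 20},
]


def profit_aggregated(rows):
    """sum style: total each column first, then combine the three totals."""
    total_price = sum(row["price"] for row in rows)
    total_cost = sum(row["cost"] for row in rows)
    total_units = sum(row["units"] for row in rows)
    return (total_price - total_cost) * total_units


def profit_per_row(rows):
    """Evaluate the profit expression separately for each row."""
    return [(row["price"] - row["cost"]) * row["units"] for row in rows]


def profit_iterated(rows):
    """sum X style: evaluate the expression row by row, then add up the results."""
    return sum(profit_per_row(rows))


print(profit_aggregated(products))  # 299, one number built from column totals
print(profit_per_row(products))     # [30, 60], the real profit of each product
print(profit_iterated(products))    # 90
```

Combining the column totals, (25 - 12) * 23 = 299, answers a different question than adding up the per-row profits, 30 + 60 = 90, which is the gap the two calculated columns show in the video.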
577:52 you to understand that concept now let's go back over here to apocalypse sales
577:55 go back over here to apocalypse sales right here we have a date purchase now
577:58 right here we have a date purchase now in the Dax function we have some ways
578:00 in the Dax function we have some ways that we can interact with dates and so I
578:02 that we can interact with dates and so I want to take a look at those really
578:03 want to take a look at those really quickly so we're going to go right up
578:05 quickly so we're going to go right up here and click on new column and we're
578:08 here and click on new column and we're just going to leave that as column but
578:10 what we're going to say is day so
578:13 there's a few different ones we have DAY
578:15 DATESYTD NEXTDAY PREVIOUSDAY and
578:19 WEEKDAY and they all are pretty
578:22 self-explanatory if you click on it
578:24 let's click on WEEKDAY it says it's
578:26 going to return a number from 1 to 7
578:28 identifying the day of the week of a
578:30 date so let's use this really quickly
578:33 and so we're going to say date
578:36 purchased and click tab hit
578:40 comma and it's going to give us three
578:43 different options basically it's a one a
578:44 two and a three right here if you hit
578:47 this button read more you can read more
578:49 on it option one says Sunday is
578:51 equal to one Saturday is equal to seven
578:53 I like option two personally which is
578:54 Monday equals one in my brain it just
578:56 makes more sense so I'm going to click
578:58 on two I'm going to close that
579:00 parenthesis and I guess
579:03 I'll say let's say day of week for
579:07 the column
579:08 let's click that
579:10 checkbox and now Saturdays are equal to
579:13 sixes Mondays are equal to one this
579:16 allows us to see which day of the week
579:18 people are buying the most products on
579:21 or which day of the week somebody is
579:23 submitting their orders on
579:25 and so let's go over to our report let's get rid of
579:28 this we're just going to move this oh jeez
579:32 I hate moving stuff sometimes all right
579:35 really quickly I want to show you the
579:37 difference between what we just did and
579:38 what we already have so we have this
579:41 date purchased and let's make that into
579:46 a bar
579:47 graph and what we're going to be taking
579:49 a look at is actually the units sold so
579:52 right here we have this and obviously
579:55 we don't want 2022 we're going to
579:57 get rid of the year we only have one
579:59 quarter right here we can see January
580:02 February March so we can tell that
580:04 January has the most sales or the most
580:06 units sold in that month if we get rid
580:08 of that and go down to day we do have
580:11 some information but we don't know what
580:13 day of the week it is it could change
580:15 from month to month and it's really hard
580:18 to tell if there's any
580:19 pattern there at all that's where what
580:22 we just created comes in handy so let's
580:24 recreate this exact same thing but
580:26 instead we're going to use day of week
580:28 so we're going to select day of week and
580:30 units sold let's drag that down and move
580:34 this over right here and this day of the
580:37 week should be on the
580:39 x-axis and it's really easy now to see if
580:42 there's a pattern here there's really
580:44 not at least not for this fake data that
580:46 we have but I want these
580:50 data labels on really
580:51 quickly it's not easy to see if
580:54 there's any pattern again Monday has the
580:56 most I mean it goes
580:59 down a little bit and then it picks back
581:00 up so maybe the middle of the week is our lowest
581:03 sales day our Wednesdays and
581:05 Thursdays are a little bit lower than
581:06 the rest and the beginning and the end
581:08 of the week tend to be the highest again
581:10 not a huge pattern but you know it's
581:12 much easier to see if there is a pattern
581:14 from week to week or what day of the
581:16 week now that we use this WEEKDAY
581:18 function and so this can be really
581:20 really useful
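To make the idea behind that chart concrete, here is a minimal sketch of what the bar graph is doing: grouping rows by the weekday number and summing units sold. The rows below are invented for illustration (they are not the apocalypse sales data), and `units_by_weekday` is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (date purchased, units sold) rows, not the course data
sales = [
    (date(2022, 1, 3), 10),   # Monday
    (date(2022, 1, 5), 4),    # Wednesday
    (date(2022, 1, 10), 7),   # Monday
    (date(2022, 2, 2), 3),    # Wednesday
    (date(2022, 2, 5), 6),    # Saturday
]

def units_by_weekday(rows):
    """Sum units sold per weekday number (Monday = 1 ... Sunday = 7)."""
    totals = defaultdict(int)
    for purchased, units in rows:
        totals[purchased.isoweekday()] += units
    return dict(sorted(totals.items()))

weekday_totals = units_by_weekday(sales)
print(weekday_totals)
```

Because the grouping key is the weekday rather than the calendar day, Mondays from different months land in the same bar, which is what makes a weekly pattern visible.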
581:23 let's go back here to our data now we're going to look at our last
581:25 DAX function for this video let's go up
581:27 here and create a new column and we're
581:30 going to be looking at something called
581:31 the IF statement now if you've ever used
581:33 Excel I'm sure you have heard of this
581:35 and you can do the exact same thing here
581:38 in Power BI and so we're going to name
581:40 this one order size order underscore size and
581:44 so all we're going to say is IF we're
581:47 going to click on this one right here we
581:49 need to perform our logical test and
581:51 then we want to say if it's true what's
581:53 our value and if it's false what is our
581:55 value so what we're going to be looking
581:57 at is units sold so we're looking at
582:00 order size so we're going to say if units
582:03 sold is greater than
582:06 25 what's going to happen if it is true
582:09 if the order is larger than 25 we want
582:11 to say it's a big order and if it's not
582:15 we want to say it's a small
582:18 order super simple we'll close that
582:20 parenthesis we'll click okay and now
582:24 really quickly we're able to see if this
582:25 is a big order or a small order
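Written out, the column described here would look roughly like `Order_Size = IF([Units Sold] > 25, "Big Order", "Small Order")` in DAX; the exact label text is an assumption, since the transcript only says "big order" and "small order". The same logical test, value if true, value if false pattern as a small Python sketch, with a hypothetical function name:

```python
def order_size(units_sold: int, threshold: int = 25) -> str:
    """Logical test, then the value if true, otherwise the value if false."""
    return "Big Order" if units_sold > threshold else "Small Order"

# Hypothetical unit counts, for illustration only
for units in (10, 25, 26, 40):
    print(units, order_size(units))
```

Note that the test is strictly greater than, so an order of exactly 25 units counts as a small order.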
582:28 and so that is all I have for you today there
582:30 are a lot of other DAX functions but the
582:32 ones that we looked at today are ones
582:34 that are very common ones that you'll
582:36 see the most and there can be a lot of
582:38 really complex and intricate DAX
582:40 functions that you can create and in our
582:43 project at the end of this series I will
582:45 be sure to include some more complex DAX
582:47 functions but hopefully this gave you a
582:49 good introduction into DAX so you know
582:51 how to use it a little bit better thank
582:53 you guys so much for watching I really
582:55 appreciate it if you like this video be
582:57 sure to like and subscribe and check out
582:59 all of my other videos on everything
583:01 data analyst related I will see you in
583:03 the next
583:07 video [Music]
583:16 [Music]
583:18 what's going on everybody welcome back to the Power BI tutorial series today
583:20 we're going to be looking at how to
583:21 drill down in
583:28 visualizations so when I say drill down
583:30 I mean you're basically adding another
583:32 layer beneath the top layer of the
583:34 visualization and when somebody clicks
583:36 or drills down in that data they can see
583:38 more insights and more information on
583:40 the top level of data when you drill
583:42 down you can also drill up and I will
583:44 show you how to do that in this tutorial
583:46 so without further ado let's jump on my
583:48 screen and get started with the tutorial
583:49 all right so before we get started I
583:50 wanted to remind you that you can find
583:52 the data that we're going to be working
583:53 with in this tutorial in the description
583:55 you can go and download it from my
583:56 GitHub now the two tables I'm going to
583:58 be looking at are apocalypse sales and
584:01 purchase tracker and if you've ever
584:02 created any visualizations you've
584:05 probably seen something like this where
584:06 you'll have the store and the price and
584:09 this is the things that we actually
584:10 bought so this is the total amount of
584:13 apocalypse prepping equipment that we
584:15 bought and we'll put the store in this
584:17 legend right here and you've probably
584:19 seen something like this and if you're
584:21 anything like me you're going to be in a
584:22 meeting and you're going to be
584:23 presenting this and some higher up is
584:24 going to be like hey Alex that looks
584:26 great but I want to you know see what
584:28 things we actually bought at Target and
584:29 how much this cost can you create a
584:31 visualization for that and you're going
584:33 to be like well I could or I could use
584:36 drill down so you could have done this
584:37 in the first place which you should
584:39 have
584:41 so all we're going to do is we're going to
584:43 select the product right
584:44 here and these are going to be the
584:45 actual things and we're going to put it
584:47 right under store now you can't see
584:50 these things right now but there is a
584:52 hierarchy here so once we added this
584:55 these options became available let's
584:57 take it out and all those just
584:59 disappeared and then if we add it back
585:03 right here they came back and so you can
585:06 do this right here which is click to turn
585:08 on drill down you can go to the next
585:10 level in the hierarchy or you can even
585:12 expand all down one level in the
585:14 hierarchy so let's look at each of those
585:15 really quickly so let's click on this
585:17 one it's just going to turn on drill
585:19 down mode so now if I go and I click on
585:21 Target it's going to drill down into
585:24 these and if we want to I can then put
585:26 product under this
585:28 legend and we can see all of those
585:30 things but of course if we go back up
585:33 it's going to be all broken up into this
585:34 clustered column chart which is more
585:36 like
585:37 this which isn't exactly what we were
585:40 going for but it works now let me get
585:43 rid of this I actually want store in the
585:45 legend now if we turn that off and we
585:48 click it doesn't do that anymore so what
585:50 it does now is it just highlights
585:52 Walmart it highlights Costco it
585:54 highlights Target so we're going to keep
585:56 that on but we can also do something
585:59 called going down to the next level of the
586:00 hierarchy so let's click on that and so
586:03 now this is going to go down to the next
586:05 level down to this product level because
586:07 that is the next level and now it's
586:09 going to show us each of those things
586:10 but it's going to have it broken out by
586:12 the store and so it's a completely
586:15 different visualization but all within
586:17 the same realm of the data that we're
586:19 looking at and what we actually care
586:20 about so let's go back up in the
586:22 hierarchy and then let's use this one
586:24 right here which is expand all down one
586:26 level in the hierarchy and so this one
586:27 is again extremely similar except it
586:30 just visualizes it differently and now
586:32 what it's doing is Walmart rice Target
586:35 dried beans Costco rice so instead of
586:37 having it all like this one where
586:40 it's stacked on top of each other it's
586:41 breaking it down individually so this
586:44 one column would become three separate
586:46 columns
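One way to think about the three drill options is as three different groupings of the same rows. The sketch below is only an analogy in Python, not how Power BI implements drill down; the stores are the ones named in the video, but the products and prices in `purchases` are invented, and `total_by` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical purchase rows: (store, product, price); not the course data
purchases = [
    ("Walmart", "Rice", 20.0),
    ("Walmart", "Dried Beans", 15.0),
    ("Target", "Dried Beans", 12.0),
    ("Target", "Rope", 8.0),
    ("Costco", "Rice", 30.0),
]

def total_by(rows, key):
    """Sum the price column, grouped by whatever key(row) returns."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row[2]
    return dict(totals)

# Top of the hierarchy: one total per store
top_level = total_by(purchases, lambda r: r[0])

# Drill down mode, then clicking Target: only that store, split by product
target_only = total_by([r for r in purchases if r[0] == "Target"], lambda r: r[1])

# Go to the next level: every store at once, grouped by product
next_level = total_by(purchases, lambda r: r[1])

# Expand all down one level: one entry per store and product pair
expanded = total_by(purchases, lambda r: (r[0], r[1]))

print(top_level, target_only, next_level, expanded, sep="\n")
```

The expanded view has one entry per store and product combination, which is why a single stacked column turns into several separate columns.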
586:48 now I'm going to minimize this right here I'm actually going to go
586:50 back up in the hierarchy just for visual
586:52 purposes now I'm going to show you one
586:54 more example we're going to use this
586:56 apocalypse sales up here and this is one
586:58 that I actually use all the time so the
587:01 one you've seen you know you'll get
587:03 stuff like that especially if you're
587:04 working with like sales and stuff but I
587:06 work in operations right so I have a lot
587:09 of order IDs product IDs stuff like that
587:13 now this one genuinely I use
587:16 quite often I'll have a customer let's
587:18 make it we'll just go like this we have
587:21 a customer and we have units sold and
587:24 let's use the customer as the legend so
587:27 let's make this one quite a bit
587:31 larger and I'll have something like this
587:34 and they'll say okay well we want to see
587:36 the order IDs that go with it because
587:38 we want to know what orders are actually
587:40 happening for each of these people
587:42 obviously I'm not using this exact data
587:44 but very very very similar and all you
587:47 have to do is take these order IDs and
587:49 slide it right under here under customer
587:52 and this visualization right here is
587:54 something I've done a thousand times
587:56 because what happens is some
587:58 stakeholder in our company is saying hey
588:00 Alex we want this and we want to know we
588:03 want to drill down on this IP address we
588:05 want to drill down on this certain
588:07 database we want to drill down on
588:09 something and we want to see the order
588:10 IDs within them so then all you do is
588:13 you turn on drill mode or drill down
588:15 mode you'll click on it and you can see
588:17 every single order ID that's in there
588:19 and then they can go and look those up
588:20 in their system and resolve them or
588:23 whatever they're trying to do with it
588:24 and it helps a ton and it's very very
588:26 useful this one is extremely applicable
588:29 and that's really all drill down is
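The customer to order ID setup can be pictured as a two-level lookup: totals per customer on top, individual order IDs underneath. A minimal sketch under that assumption follows; the customer names, order IDs, and helper names are all hypothetical and only illustrate the shape of the hierarchy.

```python
from collections import defaultdict

# Hypothetical rows: (customer, order_id, units_sold); not the course data
sales_rows = [
    ("Customer A", 1001, 12),
    ("Customer A", 1002, 30),
    ("Customer B", 1003, 8),
    ("Customer A", 1001, 5),
]

def build_hierarchy(rows):
    """Nest units sold as customer -> order ID -> total units."""
    tree = defaultdict(lambda: defaultdict(int))
    for customer, order_id, units in rows:
        tree[customer][order_id] += units
    return {customer: dict(orders) for customer, orders in tree.items()}

def drill_up(tree):
    """Top level of the hierarchy: total units per customer."""
    return {customer: sum(orders.values()) for customer, orders in tree.items()}

hierarchy = build_hierarchy(sales_rows)
customer_totals = drill_up(hierarchy)
customer_a_orders = hierarchy["Customer A"]  # what drilling into one customer reveals

print(customer_totals)
print(customer_a_orders)
```

The order IDs in the drilled view are what a stakeholder would take back to their own system to look up.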
588:30 again you have these different
588:31 hierarchies as well but for different
588:34 things it's not as useful as you can see
588:37 we also have this hierarchy which again
588:39 is not as useful so it just depends on
588:41 the data that you're using and how you
588:43 want to use this drill down effect but I
588:45 promise you that drill down is used all
588:48 the time especially when you're giving
588:50 presentations where people want to know
588:51 more information than just the
588:54 visualization that you're presenting so
588:55 I hope that this has been helpful I hope
588:57 that you understand drill down a little
588:58 bit better if you like this video be
589:00 sure to like and subscribe and check out
589:02 all my other videos on Power BI thank you
589:04 and I'll see you in the next video
589:07 [Music]
589:18 what's going on everybody welcome back
589:19 to the Power BI tutorial series today
589:21 we're going to be taking a look at
589:22 conditional
589:28 formatting now conditional formatting
589:30 may sound familiar because we looked at
589:32 it in the Excel series and it's very
589:34 similar how you use it in Excel versus
589:36 how you use it in Power BI conditional
589:38 formatting allows you to take a table or
589:39 a matrix within Power BI and use those
589:42 cells to color code them and create
589:44 gradients and different visualizations
589:46 within the actual table or matrix I'm
589:48 excited to start this one so let's jump
589:50 over to my screen and get started with the
589:51 tutorial all right so before we get
589:52 started if you want to use the data that
589:54 we're using in this video you can find
589:55 it in the description on my GitHub now
589:57 conditional formatting is super simple
589:59 and you've most likely used it in Excel
590:01 before but you can also use it in
590:03 Power BI and let me show you how to do
590:05 that so the first thing we're going to
590:06 do is come over to our apocalypse
590:08 store and we're going to pull up our
590:10 product name as well as the price and
590:14 what we can do is come over here and
590:17 we're going to go to price and it has to
590:18 be under the columns so you can't come
590:20 over here and do this we're going to
590:23 come right over here to price and we're
590:24 going to right click and let's go to
590:26 conditional formatting and we have
590:28 background color font color icons and
590:30 web URL let's take a look at background
590:32 color first this is most likely the one
590:34 that we'll look at the most so we're
590:36 going to get this pop up and I'm going
590:37 to slide this over now there's a lot of
590:40 different things we can customize in
590:41 here and the first thing I want to take
590:43 a look at is format style we have the
590:45 gradient and what it's going to say is
590:47 the lowest value will be this color
590:48 highest value will be this color it'll
590:50 give us this gradient color scale and so
590:53 we'll use that in just a little bit but
590:55 we can also create rules kind of like an
590:57 IF statement and if a value is between this
590:59 number and this number we give it a color
591:02 and if it's between a different pair
591:03 of numbers we'll give it a
591:04 different color so we'll also try that
591:06 one and then we have this field value
591:09 and this one is one that honestly I
591:11 don't use that much I've used it maybe
591:13 once and what you can do is select a
591:15 text field like customer and you can do
591:18 some summarizations on the first and
591:20 last and that is it
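The rules style really is a stack of IF statements: each rule has a lower bound, an upper bound, and a color, and the first rule that matches wins. Here is a small sketch of that idea. The thresholds, color names, and the choice of an inclusive lower bound with an exclusive upper bound are assumptions made for illustration, not settings taken from the video.

```python
# Hypothetical rules: (lower bound inclusive, upper bound exclusive, color)
RULES = [
    (0, 10, "green"),
    (10, 50, "yellow"),
    (50, float("inf"), "red"),
]

def color_for(value, rules=RULES):
    """Return the color of the first rule whose range contains value."""
    for low, high, color in rules:
        if low <= value < high:
            return color
    return None  # no rule matched, so the cell keeps its default look

# Hypothetical prices, for illustration only
for price in (4.99, 10, 49.5, 120):
    print(price, color_for(price))
```

A value that falls outside every rule simply gets no formatting, which is worth remembering when a cell unexpectedly stays uncolored.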
591:22 so what we're going to do is we're going to look at gradient
591:24 specifically for not the customer but
591:27 we're going to go back to the apocalypse
591:29 store and we're going to do it on the
591:32 price now what I'm going to do is keep
591:34 it as the count because this is what the
591:36 default is and we're going to go back
591:37 and fix it later but what we want our
591:39 lowest value to be is this bright green
591:42 showing that it's a cheap product
591:44 it's easy to purchase the high value
591:47 ones are going to be this shade of
591:49 red more expensive and we'll do it on
591:51 the count now remember the count is on
591:53 each of these and we're not doing a
591:55 count of how many are sold we're doing a
591:56 count of each product so it's just one
591:58 per row so it all should be the same
592:01 color let's take a look so it is all the
592:03 same color but what we really want to
592:05 show is the actual price not just the
592:08 count of the price so let's go back to
592:10 conditional formatting we're going to
592:11 click the background color again and
592:14 this time we're going to change the
592:16 summarization now you can do sum you can
592:19 do average minimum maximum it really
592:21 doesn't matter for this example the
592:23 number is the same regardless of really
592:25 which one we choose so we can just
592:26 choose the minimum and it's going to
592:28 choose the minimum of each row which is
592:31 the price so we're just going to select
592:32 minimum for this example we'll select
592:34 okay and it should correct it
592:36 accordingly which means the bright green
592:38 is the lowest and it goes all the way up
592:39 to the highest which is the red
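A gradient like this can be sketched as a straight-line blend between two colors based on where each value sits between the minimum and the maximum. The code below is an approximation of the idea, not Power BI's actual color math; the RGB values, sample prices, and function names are assumptions. It also shows why formatting on the count gave one flat color: every row's count is 1, so there is no range to spread the colors over.

```python
def blend(low_rgb, high_rgb, t):
    """Linear blend between two RGB colors, with t running from 0.0 to 1.0."""
    return tuple(round(lo + (hi - lo) * t) for lo, hi in zip(low_rgb, high_rgb))

def gradient_colors(values, low_rgb=(0, 255, 0), high_rgb=(255, 0, 0)):
    """Assign each value a color between low_rgb (minimum) and high_rgb (maximum)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # No spread in the data, so every cell gets the same color
        return [low_rgb for _ in values]
    return [blend(low_rgb, high_rgb, (v - lowest) / (highest - lowest)) for v in values]

prices = [5.0, 20.0, 35.0]            # hypothetical prices, one per product row
by_price = gradient_colors(prices)     # green for the cheapest, red for the priciest
by_count = gradient_colors([1, 1, 1])  # count of price is 1 on every row

print(by_price)
print(by_count)
```

Switching the summarization from count to minimum is what gives the gradient real values to work with.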
592:42 now let's go over here to apocalypse sales
592:44 we'll add in the units
592:46 sold and let's move that out a little
592:50 bit and I'm doing that on purpose
592:52 because we're about to look at something
592:54 within the conditional formatting so
592:55 let's go to units sold and we'll look at
592:58 the conditional formatting for this one
593:00 now if you noticed we now have a new one
593:03 on here called data bars now we're able
593:05 to see data bars on units sold and
593:08 not price because units sold is something
593:10 like a sum an average something that's
593:12 aggregated but let's take a look at
593:14 data bars because I want to show you
593:15 how to use this and then we'll go back
593:17 to the background color so for data bars
593:20 we are going to be taking a look at the
593:22 lowest to the highest value again we're
593:24 going to go from bright green all the
593:28 way
593:29 to this exact red it's going to be from
593:32 left to right and what it's going to
593:34 show you is if it is a positive number
593:35 which all of these are it's going to be a
593:38 green bar basically representing the
593:40 number that you see in here along this
593:42 line so let's click
593:44 okay and we're going to be able to see
593:47 the highest numbers and let's scooch
593:48 this over quite a bit so you can kind of
593:50 get a better understanding and we're
593:52 going to sort it from highest to lowest so
593:55 we sold the most multi-tool survival
593:58 knives at 477 and so this entire bar
594:01 this row is entirely filled up or almost
594:04 all the way filled up while as it gets
594:07 lower and as we sell only 182 solar
594:10 battery flashlights the bar is going to
594:12 represent that and show that
594:14 about to completely mess up this
594:15 visualization on purpose because it's
594:17 about to get very messy to show you that
594:19 you can do a little bit too much uh it
594:21 is possible what we're going to do is
594:23 we're going to go right over here to
594:25 this background color units sold and
594:27 instead of gradient let's look at rules
594:29 now with the price we just did a
594:32 gradient scale but we can do basically
594:34 groups of these and say if a number is
594:37 greater than or equal to this number
594:39 then it's going to be a certain color
594:40 and then if it's in a different range we
594:42 can give it a different color so we're
594:43 going to say if it's greater than or
594:45 equal to zero and we're going to say
594:47 number not percent and if it's less than
594:52 266 because we have 265 right here let's
594:55 make it a nice uh like gold a beautiful
594:58 lovely mustard gold just just great now
595:01 we're going to say if it's greater than
595:03 or equal to we'll do 266 because
595:07 this is less than 266 so it should be
595:09 greater than or equal to 266 number and
595:13 if it is less than we'll say
595:16 500 now we want to do this one and we'll
595:19 give it uh let's do like a peach and
595:21 we'll click okay and now we have another
595:24 conditional formatting on top of that
595:26 that can give us more information now
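The two background color rules set up above can be sketched in Python as a pair of half-open ranges. This is only an illustration of the logic, not how Power BI stores its rules. The name `background_color` and the color labels are made up for this example, and the thresholds 0, 266 and 500 are the ones typed into the dialog.

```python
from typing import Optional

# Rules from the walkthrough: [0, 266) -> gold, [266, 500) -> peach.
# The colour names are plain labels for this sketch, not Power BI colour codes.
RULES = [
    (0, 266, "gold"),
    (266, 500, "peach"),
]


def background_color(units_sold: float) -> Optional[str]:
    """Return the colour of the first rule whose half-open range holds the value."""
    for lower, upper, color in RULES:
        if lower <= units_sold < upper:
            return color
    return None  # no rule matched, so no background colour is applied


# 182, 265 and 477 are unit counts mentioned in the video.
for units in (182, 265, 266, 477):
    print(units, background_color(units))
```

Making the upper bound exclusive is what lets the second rule start at exactly 266 without overlapping the first one.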
595:29 again you should not do this it's just
595:31 too many now let's go one step further
595:33 and make it even more ridiculous and
595:35 show you one more thing before I show
595:36 you how you may actually want to use
595:38 this uh let's go back to units sold we're
595:41 going to right-click go to conditional
595:42 formatting and you can do something
595:44 called icons um font color is the exact
595:47 same thing as background color except it
595:49 changes the font and so I'm not
595:50 really going to look into that one icons
595:52 are very simple extremely similar to
595:55 Excel and how you've seen them and the
595:58 rules that you can apply to them are
596:00 basically the same as if you're doing
596:02 like a gradient and it's these if
596:03 statements that we saw before now it
596:06 auto gives us this right here which
596:08 basically says 0 to 33% 33 to 67 67 to
596:12 100 if it's in the bottom 33% it gives us
596:15 this red the middle is yellow and the
596:17 top is green so we can go through and
596:20 change all of this but honestly this
596:22 looks pretty good so let's click on
596:24 it and so the ones that are our least
596:26 sellers are these red ones right here
596:28 and the top sellers are up here now this
596:31 is just based on units sold and this
596:33 looks absolutely terrible so let's kind
596:35 of take this exact information but make
596:37 it a little bit better so we're going to
596:40 create a new visualization or at least a
596:42 new table so let's click on product name
596:45 and we'll take the price units sold and
596:49 revenue and what I think makes the most
596:51 sense for looking at revenue is these
596:53 data bars right here but there's only
596:55 one problem I can't do that because it's
596:58 not summarized like units sold was but
597:02 what I can do to get those data
597:04 bars is I can come right down here
597:05 instead of saying don't summarize I can
597:08 summarize it and I can just click the
597:10 sum so it now is summarized it's the
597:13 exact same number but if I right click
597:16 on here as sum of revenue I go to
597:18 conditional formatting I can now use
597:19 those data bars and so we're going to
597:21 use those data bars and we're going to
597:23 say for the lowest value and the highest
597:25 value and let's just make it a
597:28 nice maybe a darker green I don't want
597:31 it to well that's that's hideous let's
597:33 make it this color right here a nice
597:34 dark green and there's no negative so it
597:36 doesn't really matter we're going to go
597:37 left to right and you can show the bar
597:40 only but we're going to keep it because
597:42 I want to see it and we're going to go
597:44 just like this we're going to
597:46 order and this is pretty telling um
597:50 honestly I did not think the
597:51 weatherproof jackets were performing so
597:53 well but I mean they are by far the number
597:56 one seller so you know our weatherproof
597:59 jackets multi-tool survival knives and
598:01 the nylon rope are outperforming
598:04 all of our other products so those
598:06 might be the ones that I focus on the
598:07 most while duct tape the N95 masks and
598:10 waterproof matches I mean those are
598:12 those are garbage so I might be looking
598:13 to replace those in the near future with
598:15 some other items that might sell a
598:17 little bit better so that's how you use
598:18 conditional formatting and it's actually
598:20 pretty useful there are a lot of times
598:22 where I've done something like this in
598:23 an actual visualization for work and it
598:25 looks something like this it just
598:27 depends on what you're visualizing but
598:29 this is very much a simple thing that
598:32 you can do to just add a little bit more
598:34 information and an actual visual
598:36 to this little chart or table that
598:38 you're going to create sometimes it's
598:40 just better to have these simple
598:41 visualizations on this table rather than
598:43 just having the numbers themselves makes
598:45 it a little bit easier to read and
598:48 understand so again I hope that this was
598:49 helpful thank you guys so much for
598:51 watching I really appreciate it if you
598:52 like this video be sure to like and
598:54 subscribe and check out all my other
598:55 videos on powerbi and I'll see you in
598:57 the next
598:59 [Music]
599:05 video
599:07 [Music]
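Looking back at the icon rules from the conditional formatting video, the default 0 to 33, 33 to 67 and 67 to 100 percent split can be sketched as below. The sketch assumes "percent" means the value's position between the lowest and highest value in the column, which is an interpretation rather than something the video spells out. The names `percent_of_range` and `icon_for` are hypothetical. The numbers 182, 265 and 477 appear in the video, but treating 182 as the lowest value is an assumption.

```python
def percent_of_range(value: float, lowest: float, highest: float) -> float:
    """Position of value between lowest and highest, as a percentage."""
    if highest == lowest:
        return 100.0  # every value is identical, so treat it as the top
    return (value - lowest) / (highest - lowest) * 100


def icon_for(value: float, lowest: float, highest: float) -> str:
    """Pick an icon colour using the 0-33 / 33-67 / 67-100 percent split."""
    pct = percent_of_range(value, lowest, highest)
    if pct < 33:
        return "red"
    if pct < 67:
        return "yellow"
    return "green"


# Assumed column range: 182 (lowest) to 477 (highest) units sold.
for units in (182, 265, 477):
    print(units, icon_for(units, lowest=182, highest=477))
```

The same percentage could also drive the length of a data bar, although how Power BI scales its bars is not covered here.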
599:10 what's going on everybody welcome back
599:11 to the powerbi tutorial series today
599:14 we're going to be taking a look at bins
599:15 and
599:21 [Music] lists now bins and lists are really
599:23 useful because they allow you to group
599:25 things together to analyze and visualize
599:27 them easier so in this tutorial I'll
599:29 show you how to create your bins and
599:30 lists and then we'll create some
599:31 visualizations to show you how it can be
599:33 helpful so without further ado let's
599:35 jump on my screen and start with the tutorial
599:37 all right so before we get started I
599:38 wanted to let you know you can go and
599:39 download the data that we're going to be
599:41 using in this tutorial in the
599:43 description below it's on my GitHub so we
599:46 are going to be looking at bins and
599:48 lists today um and for this we're going
599:50 to be going over here to this apocalypse
599:53 sales uh and let's open up our data
599:56 right over here and we want to look at
599:58 apocalypse sales really quickly I feel
600:01 like more people would know what a bin
600:02 is so we'll kind of start with a list
600:03 just go a little bit backwards than we
600:05 normally would uh I'm going to use this
600:07 customer or we're going to use this
600:09 customer column right here for a list
600:11 really quickly and you can do that in
600:12 two ways you can come up here and you
600:14 can right click on the customer and go
600:16 to new group or you can come over here
600:19 under this uh the fields section on the
600:21 far right and go to customer right-click
600:24 and click new group so let's click on
600:26 that
600:27 now and right now it's only giving us the
600:31 list type it's not giving us bins
600:33 because bins have to be numeric so we
600:36 really can't do that at the moment um so
600:38 we're going to call this just customer
600:39 groups or we'll actually call it
600:41 list just so it's easier to recognize
600:44 when we create it and so all we're going
600:45 to do is we're going to basically group
600:47 these but it's going to be called a list
600:50 and so what we're going to do is we're
600:52 going to select and we're going to
600:54 select and we're going to say group and
600:56 click on this group button and then it
600:58 creates this Alex the analyst apocalypse
601:00 Preppers and uh this prep for anything
601:03 prepping store so it kind of named
601:05 it for us but if we double click on it
601:09 then we can rename this and we can call
601:11 this the best prepping
601:16 stores and then we have these last two
601:19 and we can click on one and then
601:22 click control and click on the other one
601:25 so we get both of them and then we can
601:27 click group and we can call this and
601:30 we'll double click and we'll call this
601:32 the worst prepping stores
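The list being built in this dialog behaves like a lookup from each customer to a group label. A minimal Python sketch of that idea follows. The store names are the ones spoken in the video, so their spelling and capitalization here are guesses, and `customer_list` is a hypothetical helper name. The second store in the worst group is never named in the video, so it is left out. Values that were never grouped are returned unchanged in this sketch.

```python
# Group label for each customer that was placed in a list.
CUSTOMER_GROUPS = {
    "Alex the Analyst Apocalypse Preppers": "Best Prepping Stores",
    "Prep for Anything Prepping Store": "Best Prepping Stores",
    "Uncle Joe's Prep Shop": "Worst Prepping Stores",
}


def customer_list(customer: str) -> str:
    """Return the group label, or the customer itself if it was never grouped."""
    return CUSTOMER_GROUPS.get(customer, customer)


for name in ("Uncle Joe's Prep Shop", "Alex the Analyst Apocalypse Preppers"):
    print(name, "->", customer_list(name))
```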
601:37 um and then that's it and that's all we
601:39 have to do and what we're then going to
601:41 do and if you want to undo this and you
601:43 want to switch it up and do whatever you
601:45 can click on ungroup but we're not going
601:46 to do that we're going to click
601:48 okay and here is the column that it
601:50 created and it basically tells us what
601:53 list we put it in if it's Uncle Joe's
601:55 Prep shop that's in the worst prepping
601:56 stores list and if it's the Alex the
601:58 analyst apocalypse Preppers that is in
602:00 the best prepping stores so it's kind of
602:03 like an if statement you could even
602:05 create a calculated column do it on this
602:07 customer create an if statement this is
602:10 just a lot faster and a lot easier than
602:13 doing that but it basically would do the
602:15 exact same thing now you can use lists
602:17 as well on things like numeric so let's
602:20 say we have order
602:21 ID and we'll go to new group and it's
602:25 going to auto go to bin because
602:27 typically that's what you'll use but you
602:28 can do list as well and let's say you
602:31 know we want to say we want to call
602:33 these like we'll group these and call
602:36 these the
602:38 first um we'll call this the first
602:40 customers or the first orders because
602:42 we're looking at order IDs look at the
602:44 first orders and then we will go back
602:47 here we're going on the left side we're
602:49 going to click oops we're going to go
602:51 back to the top we're going to hit shift
602:54 group all of these and we'll say the
602:57 latest
602:58 orders and you absolutely can do this um
603:01 again this is kind of like an if
603:03 statement right so you're saying if it
603:05 falls between this range and this range
603:07 then it's called the first orders and if
603:09 it's between this range and this other
603:10 range it's the latest orders um again
603:14 it's just a much simpler version of an
603:16 if statement and so you don't have to
603:17 write it all out you can just have this
603:19 user interface kind of do it for you uh
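The order ID list described above really is the if statement the video compares it to. As a rough Python sketch: the actual order ID ranges are never read out in the video, so the cut-off of 1010 and the sample IDs are invented for illustration, and `order_list` is a hypothetical name.

```python
def order_list(order_id: int, first_max: int) -> str:
    """Label an order ID as an early or a late order."""
    if order_id <= first_max:
        return "First Orders"
    return "Latest Orders"


# 1010 is a made-up cut-off between the two groups.
for order_id in (1001, 1010, 1011):
    print(order_id, order_list(order_id, first_max=1010))
```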
603:22 and it's really really useful so now
603:23 let's talk about bins and by far the
603:25 easiest way to demonstrate this and I'll
603:27 show you one other way uh but by far the
603:29 easiest way to show this is by using
603:32 age and so uh for absolutely no reason
603:35 whatsoever these customer IDs uh who are
603:37 right here in this customer information
603:40 they decided to give us some of their
603:41 buyer information who are actually
603:43 buying their products on their website
603:44 or in their store they just decided to
603:46 give it to us as well as some uh simple
603:49 demographic information I don't know
603:51 why but what we're going to use bins for
603:53 is grouping these age brackets so you
603:57 know you might be interested and say well
604:00 I want to know if my core population who
604:02 are buying my products are within a
604:04 certain range and you don't want to look
604:05 at every single age because then it
604:08 just you know in your visualizations
604:09 it's not going to look right you want to
604:10 kind of group them make it easier to
604:12 visualize so what we're going to do is
604:14 we're going to go through here and we're
604:15 going to basically go by tens so 10 20
604:18 30 40 50 60 and see what age bracket
604:22 these people fall in so we're going to
604:23 go to age we're going to right click and
604:25 we're going to say new group and we're
604:27 going to go to bin and we'll leave it as
604:29 the default age bins um and you can do two
604:32 things you can do the size of the bins
604:34 which splits it by
604:37 this number right here or you can go
604:38 based on the number of bins so if you
604:41 only want to do five different bins
604:44 it'll calculate that for you and it'll
604:46 say okay if you only want five bins
604:49 you're going to have to do it at 12.2 if
604:51 you want 10 bins it can be 6.1 but it is
604:55 completely up to you on how you want to
604:57 do that um you can do the size and we'll
604:59 just say every 10 which is what we're
605:01 going to do or you can go through and
605:03 then you can create you know how
605:05 many bins you actually want so
605:07 let's go ahead and click okay and it's
605:10 going to create those bins for us so if
605:12 somebody is 78 they're going to be in
605:14 the 70 bin if somebody's 41 they'll be
605:17 in the 40 bin if somebody is 29 they'll
605:20 be in the 20 bin and so on and so forth
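The two binning options described above come down to a little arithmetic, sketched here in Python. With a bin size, each value is rounded down to the start of its bin. With a bin count, the size is the range of the data divided by the count. The function names are hypothetical, and the age range of 18 to 79 is an assumption chosen because it reproduces the 12.2 and 6.1 figures shown in the dialog. Power BI's own calculation may differ in detail.

```python
import math


def bin_start(value: float, size: float) -> float:
    """Round a value down to the start of its bin, e.g. 78 -> 70 for size 10."""
    return math.floor(value / size) * size


def size_for_bin_count(lowest: float, highest: float, count: int) -> float:
    """Bin size needed to split the range lowest..highest into count bins."""
    return (highest - lowest) / count


# Ages mentioned in the video, with a bin size of 10.
for age in (78, 41, 29):
    print(age, "->", bin_start(age, 10))

# Assumed age range of 18 to 79, split into 5 and then 10 bins.
print(size_for_bin_count(18, 79, 5))
print(size_for_bin_count(18, 79, 10))
```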
605:23 so when we go to visualize this we don't
605:25 have you know 71 72 73 74 and a lot
605:29 more things on our visualization it'll
605:31 just be the 70 or it'll just be the 20
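To see why this helps a chart, here is a small sketch that counts buyers per bin. The list of ages is invented for illustration and is not the tutorial's dataset. Nine distinct ages collapse into four bins, which is the reduction in categories described above.

```python
from collections import Counter

# Hypothetical buyer ages, not taken from the tutorial's data.
ages = [18, 29, 41, 44, 71, 72, 73, 74, 78]

# Integer-divide by the bin size and multiply back to get each bin's start.
buyers_per_bin = Counter((age // 10) * 10 for age in ages)

for start in sorted(buyers_per_bin):
    print(f"{start} bin: {buyers_per_bin[start]} buyers")
```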
605:33 now we can also use bins on dates as
605:36 well so let's go back to apocalypse
605:38 sales we have this date purchased so we
605:40 can create a bin for this as well so
605:42 let's go to date purchased let's go new
605:45 group now you can also create a list and
605:48 that's totally fine if you would like to
605:50 do that um and it would look kind of
605:52 like this where you can go through and
605:54 you can select it and you can say okay
605:57 this group all these dates you can group
606:00 those and say this is going to be
606:02 January uh and you can do that and
606:04 that's totally okay um but for this one
606:07 we're going to do bins I think it's a
606:08 little bit easier to do bins because
606:10 what we can do is go right here and we
606:12 can specify if we want seconds minutes
606:14 hours days months or years and so um for
606:17 the data that we have it goes January
606:19 February and March so we're going to do
606:22 months and we're going to say the bin
606:24 size is going to be one month so each
606:26 month should have its own bin so it'll
606:27 be three bins total so we're going to
606:29 select
606:31 okay and as you can see on this right
606:33 side we have January of 2022 and that
606:36 correlates to the January over here then
606:38 it goes down to February and then it
606:40 goes down to March and then when we
606:43 visualize this uh we don't have to do
606:45 this hierarchy stuff that we do in
606:47 here where we filter it down to
606:49 months we can just use this right here
606:51 and that will be our months column so
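A one-month date bin can be sketched in Python by snapping every purchase date to the first day of its month. The sample dates are invented. Only the January to March 2022 span comes from the video, and `month_bin` is a hypothetical helper name.

```python
from datetime import date


def month_bin(purchased: date) -> date:
    """Snap a date to the first day of its month, which labels the bin."""
    return purchased.replace(day=1)


# Hypothetical purchase dates within January to March 2022.
purchases = [
    date(2022, 1, 5),
    date(2022, 1, 28),
    date(2022, 2, 14),
    date(2022, 3, 2),
]

# Distinct bin labels, in calendar order.
bins = sorted({month_bin(d) for d in purchases})
for start in bins:
    print(start.isoformat())
```

Three months of data give three bins, matching the "three bins total" expected in the walkthrough.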
606:53 and that will be our month's column so now let's go over to our visualizations
606:55 now let's go over to our visualizations and we'll see how this looks really
606:57 and we'll see how this looks really quickly we're not going to look at all
606:58 quickly we're not going to look at all of them but we will take a look at few
607:00 of them but we will take a look at few of them so the first one that we can
607:01 of them so the first one that we can look at is age so let's look at the
607:04 look at is age so let's look at the buyer ID and then we'll do age as well
607:07 and so let's spread this
607:10 out and we can see our distribution of
607:13 our buyers so it looks like we have very
607:16 few uh who are in the 10 range thank
607:19 goodness and we can even put the age
607:21 right under here under the age bins and
607:23 we have this now we kind of have this
607:25 drill down and so if we go right here
607:27 and we drill down right there this will
607:30 actually give us the breakdown so this
607:31 is what it would have kind of looked
607:33 like our visualization would have looked
607:35 like if we had just kept it as the age cuz
607:37 now we're drilling down into the age and
607:39 so it looks like we have one 18-year-old
607:41 and maybe a 20-year-old as well um let's
607:44 go back up yeah so it looks like we only
607:47 have one buyer ID yes so there's only
607:48 one 18-year-old so of legal age to start
607:51 buying you know all this prepping
607:52 equipment and probably uh buying online
607:55 and stuff like that which makes sense
607:57 right so uh this gives you kind of a
607:59 quick breakdown in the bins rather than
608:01 um doing it the alternative way
608:03 so now let's take a look at the customer list
608:06 as well as the units sold and it looks
608:09 like the best prepping store uh is
608:11 actually performing much worse
608:13 surprisingly uh than the worst prepping
608:16 store and so I hope this gave you a
608:18 really good idea of how to use bins and
608:19 lists within Power BI thank you so much
608:22 for watching if you like this video be
608:23 sure to like and subscribe and check out
608:25 all my other videos on Power BI I'll see
608:27 you in the next video
608:35 [Music]
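The age bins and the drill-down from this video can also be sketched outside Power BI. The snippet below is only an illustration: the buyer IDs and ages are made up, and a bin size of 10 is an assumption based on the "10 range" mentioned above. It counts buyers per bin (the top level) and then lists the individual ages inside each bin (the drill-down level).

```python
from collections import Counter, defaultdict

# Hypothetical buyer ages keyed by buyer ID (not the tutorial's actual data).
buyer_ages = {101: 18, 102: 20, 103: 34, 104: 37, 105: 41, 106: 45, 107: 52}

BIN_SIZE = 10  # assumed bin width: the "10 range", the "20 range", and so on


def age_bin(age: int, size: int = BIN_SIZE) -> int:
    """Return the lower edge of the bin that an age falls into."""
    return (age // size) * size


# Top level: number of buyers in each age bin.
buyers_per_bin = Counter(age_bin(age) for age in buyer_ages.values())

# Drill-down level: the individual ages that sit inside each bin.
ages_in_bin = defaultdict(list)
for buyer_id, age in buyer_ages.items():
    ages_in_bin[age_bin(age)].append(age)

for lower in sorted(buyers_per_bin):
    upper = lower + BIN_SIZE - 1
    print(f"{lower}-{upper}: {buyers_per_bin[lower]} buyer(s), ages {sorted(ages_in_bin[lower])}")
```

With this made-up data the 10 bin holds a single 18-year-old buyer, which mirrors the "only one buyer ID" observation in the video.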
608:40 what's going on everybody welcome back
608:42 to the Power BI tutorial series today
608:44 we're going to be taking a look at all
608:45 types of visualizations
608:52 [Music] now when you're working
608:53 in Power BI there are a lot of different
608:55 options to create visualizations and you
608:57 may not always be sure which one to use
609:00 and so that's what this video is for I'm
609:01 going to walk you through a lot of the
609:03 visualizations that I like and I use a
609:05 lot as well as kind of point out some of
609:07 the ones that I don't like as much so
609:09 that you get kind of a feel for the ones
609:11 that I think are really popular and that
609:13 are used the most so without further ado
609:15 let's jump into Power BI and start taking
609:16 a look all right before we jump into it
609:18 there is a link in the description where
609:19 you can get the data that we're going to
609:21 be using for these visualizations if you
609:23 want to practice them yourself before we
609:25 actually get into it we do need to
609:28 combine this and if you download that
609:30 Excel file and you see this you'll have to do
609:32 the same thing all we have to say is
609:34 that this product ID is the same as this
609:37 product ID purchased and now we are good
609:40 to go do one to many and it's okay if
609:42 it's one way
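The relationship described here (product ID on one table matched to product ID purchased on the other, one to many) behaves roughly like a keyed lookup. The sketch below is an assumption-laden illustration in plain Python: the rows are invented and only the column names come from the video. It shows why the product table has to be the "one" side, since each product ID must point to exactly one product row.

```python
# Hypothetical rows; only the column names follow the ones mentioned in the video.
products = [  # the "one" side: each Product ID appears exactly once
    {"Product ID": 1, "Product Name": "Duct Tape"},
    {"Product ID": 2, "Product Name": "Nylon Rope"},
]
purchases = [  # the "many" side: a product can be purchased many times
    {"Product ID Purchased": 1, "Units Sold": 5},
    {"Product ID Purchased": 2, "Units Sold": 3},
    {"Product ID Purchased": 1, "Units Sold": 7},
]

# Index the "one" side by its key so every purchase finds a single product row.
product_by_id = {p["Product ID"]: p for p in products}
if len(product_by_id) != len(products):
    raise ValueError("Product ID is not unique, so this is not a one-to-many relationship")

# Follow the relationship from each purchase to its product name.
joined = [
    {**row, "Product Name": product_by_id[row["Product ID Purchased"]]["Product Name"]}
    for row in purchases
]

for row in joined:
    print(row["Product Name"], row["Units Sold"])
```

The lookup only runs from purchases to products here, which is the spirit of a one-way filter direction: product attributes describe and slice the purchase rows, not the other way around.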
609:44 so right over here under this visualizations tab there are lots
609:46 of different options and it can be a
609:48 little bit overwhelming you don't really
609:50 know which one to choose there are some
609:52 in here that I have almost never used
609:54 for my job ever so I'll point those out
609:56 as we go through but the main focus is
609:58 going to be on the ones that I
610:00 do use that I have used and showing you
610:03 how to actually create that
610:04 visualization maybe spice it up just a
610:05 little bit but we have a lot of them to
610:08 go through so let's jump right into it
610:10 and the very first one that we're going
610:11 to start with probably the easiest one
610:13 and the one that you'll recognize the
610:14 most is a stacked bar chart and what we're
610:17 going to do is go ahead right over here
610:19 to the product name and we want this
610:22 units sold as well so we're going to
610:24 click product name and it's going to go
610:26 straight into the y-axis for us and then
610:28 we're going to click units sold and that
610:30 will go into the x-axis automatically it
610:33 just kind of intuitively knows but
610:35 sometimes it will make a mistake and
610:36 then you can just fix it or flip it and
610:39 we do want this uh let me make this much
610:42 larger we do want this to be a little
610:44 bit more color-coded that is what this
610:46 legend is down here so what we're going
610:48 to do is drag this product name down to
610:50 the legend and now we have each product
610:53 as its own color
610:54 and in previous videos we have
610:57 gone through and looked at some of these
610:58 visual and general options that you have
611:01 when you're actually creating these
611:02 visualizations but we're going to do
611:04 some of them while we're in here as well
611:06 so we're just going to go down here
611:08 we're going to choose data labels and
611:10 we're going to shrink that and if you go
611:13 higher the higher you go the less you
611:15 see so if you want all of them all the
611:17 way down to the green we're going to go
611:18 right about there and we're going to
611:20 make it smaller so now we can go ahead
611:22 and click anywhere outside of that
611:23 visualization and now we can create a
611:25 new one if we had just kept it like this
611:28 where we were still interacting with
611:29 this visualization and we clicked on a
611:31 different one it would have then changed
611:33 our visualization completely which we
611:35 don't want so let's hit Ctrl+Z click
611:38 out of it and now we can create a new one
611:42 let's go right over here to this 100% stacked column chart I'm going to
611:44 click on it drag it over here and make
611:47 it much larger and we're going to come
611:50 right over here to this customer
611:52 information and we're going to click on
611:54 customer and then we're going to go up
611:56 to units sold and click on units sold and
611:59 we want to break these out and so
612:01 basically what this is doing is it's
612:02 breaking it out by each of these shops
612:05 and we can see the total of what they're
612:07 buying the units sold but we want to see
612:10 exactly what products make up this
612:12 percentage of this 100% so we're going
612:15 to go right over here to product name
612:17 we're going to drag that down to the
612:18 legend and as you can see now we have
612:21 each of these products and each of the
612:23 products is up here so this backpack we
612:25 can see the backpack right here backpack
612:28 right here and right here and we can see
612:29 which customer is buying what percentage
612:31 of their purchases so for this prep for
612:34 anything prep store they have a very
612:36 large percentage 40% is duct tape so
612:39 they're buying a lot of duct tape so
612:41 really quickly we're able to see what
612:42 clients are purchasing or which clients
612:44 are purchasing what products the most so
612:46 just like this Alex analyst apocalypse
612:48 Preppers they're buying a lot of water
612:50 purifiers we like drinking clean water
612:53 um you know that's just what my audience
612:54 likes and so you know we can easily get
612:57 a quick glance of that
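What the 100% stacked column chart computes can be sketched as a per-customer percentage: each product's units sold divided by that customer's total. The numbers below are invented (chosen so that duct tape comes out at 40% for one store, echoing the example above), and the customer names are used only as placeholder labels.

```python
from collections import defaultdict

# Hypothetical (customer, product, units sold) rows, not the tutorial's data.
sales = [
    ("Prep for Anything Prep Store", "Duct Tape", 40),
    ("Prep for Anything Prep Store", "Backpack", 25),
    ("Prep for Anything Prep Store", "Nylon Rope", 35),
    ("Apocalypse Preppers", "Water Purifier", 60),
    ("Apocalypse Preppers", "Backpack", 40),
]

# Total units per customer and product (the plain stacked column).
units = defaultdict(lambda: defaultdict(int))
for customer, product, qty in sales:
    units[customer][product] += qty


def product_shares(customer: str) -> dict:
    """Return each product's share of the customer's units, in percent."""
    total = sum(units[customer].values())
    return {product: 100 * qty / total for product, qty in units[customer].items()}


for customer in units:
    shares = product_shares(customer)
    print(customer, {product: round(pct, 1) for product, pct in shares.items()})
```

Because every column is rescaled to 100%, the chart compares the mix of products between customers, not how much each customer buys in total.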
612:58 again we're going to go in here I tend to like putting
613:00 these data labels on here that's just
613:03 what I prefer
613:05 so you know something like this it looks
613:07 nice it looks clean um we can always go
613:10 back and change these names which we'll
613:12 do for this one so we're going to go
613:13 over here go to title we'll go down to
613:16 the text and we'll do
613:20 customer
613:22 oops customer purchase oh jeez
613:27 breakdown pretend I'm really good at
613:30 spelling and we're going to do it just
613:32 like that we'll get out of there so now
613:34 we have customer purchase breakdown and
613:36 that looks really nice it's a good uh a
613:39 good visualization and we're going to
613:40 bring that right over here we're going
613:43 to have a lot on the screen so I may
613:45 have to uh make them smaller or larger
613:49 to fit everything
613:52 all right so let's go on to our next one another really common
613:54 visualization is this one right here
613:56 which is the line chart and the line
613:59 chart is great especially when you're
614:01 using things like dates I have found
614:03 this one to be the best and a lot
614:05 of people use this as well so we're
614:06 going to go right over here and click on
614:08 date purchased and then units sold and
614:11 on the x-axis you can see it's broken up
614:13 by year quarter month and day so we
614:15 don't want to do it that high level we
614:17 only have three months of data in here
614:18 so we're going to get rid of the year
614:20 we're going to get rid of the quarter
614:22 and then we at least have this and let's
614:25 break it out because right now we're
614:27 looking at all of the units sold so
614:29 we're going to drag the product name
614:30 right down here to the legend and now it
614:32 breaks it out by the actual product and
614:35 for each month in January February or
614:37 March you can follow these products and
614:38 see how they did in each of those months
614:41 and if we wanted to we can come right
614:42 over here to the filter on the product
614:44 name and we could filter it by maybe the
614:46 top three so let's do multi-tool
614:49 survival knife the nylon rope and the
614:53 duct tape and we can have it just like
614:55 this and you know you can do those for
614:57 any product that you want but again we
615:00 just want to do it for those three just
615:01 for an example and that really doesn't
615:03 give us a ton of information we could
615:05 even go down to the day and you know it
615:07 might give us a little bit more
615:09 information and so we'll keep it like
615:11 that and we can go over here change the
615:14 name as well we're not going to do this
615:16 for all of them again we're just looking
615:17 at the different types of visualizations
615:19 I think are really good to know but
615:21 we'll change this one as well to
615:24 products purchased by date
615:28 we'll keep it just like that again
615:31 nothing fancy we're just trying to look
615:33 at a bunch of different stuff so let's
615:34 put this over here down here now let's
615:38 click out of there
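The line chart steps above (drop year and quarter, keep month, split by product name, then filter to a few products) amount to a filtered group-by. A rough sketch follows, assuming made-up purchase rows and a made-up year; the three product names are the ones picked in the video.

```python
from collections import defaultdict
from datetime import date

# Hypothetical purchase rows: (date purchased, product name, units sold).
rows = [
    (date(2022, 1, 5), "Duct Tape", 4),
    (date(2022, 1, 20), "Nylon Rope", 2),
    (date(2022, 2, 3), "Duct Tape", 6),
    (date(2022, 2, 14), "Backpack", 1),
    (date(2022, 3, 9), "Nylon Rope", 5),
    (date(2022, 3, 21), "Multi-Tool Survival Knife", 3),
]

# Stand-in for the product name filter applied in the filters pane.
keep = {"Multi-Tool Survival Knife", "Nylon Rope", "Duct Tape"}

# One line per product, one point per month: sum units by (product, month).
monthly_units = defaultdict(int)
for purchased, product, qty in rows:
    if product in keep:
        monthly_units[(product, purchased.month)] += qty

for (product, month), qty in sorted(monthly_units.items()):
    print(f"{product} month {month}: {qty} units")
```

Going "down to the day" would simply mean grouping by the full date instead of `purchased.month`, which gives more points per line but noisier trends.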
615:40 and there are other ones in here um that are definitely
615:42 useful and you absolutely can use um
615:44 like this one is a stacked bar chart
615:46 this one is a stacked column chart it's
615:48 basically the same thing just a
615:49 different orientation like we went to
615:51 here it's just a different orientation
615:54 it's the same thing um just like this
615:57 clustered bar chart clustered column chart
615:59 it's just its orientation either
616:01 horizontal or
616:02 vertical then we have things like an
616:04 area chart uh stacked area chart not
616:07 really things that I've used too much in
616:10 previous positions
616:12 one that I have used though is a line and clustered column
616:14 chart so it kind of combines a few of
616:18 these with you know you have these bar
616:20 charts as well as line charts into one
616:23 visualization so let's look at this one
616:25 because this is one that I have used
616:26 several times in my actual job so for
616:28 our x-axis we'll use the product name
616:33 then we'll look at something like the
616:34 price and so let's make this a lot
616:37 larger so you can actually see it so now
616:41 we have the price and now we can look at
616:43 something like the production cost and
616:45 that can
616:46 be our line y-axis so now we're looking
616:50 at the price of it how much someone is
616:52 actually paying for it and then we're
616:53 looking at how much it's costing us to
616:55 actually produce that product and so
616:57 really quickly at a glance you can kind
616:59 of see that it's around the halfway to
617:00 2/3 point on most of these you can see
617:03 that the production cost is always lower
617:06 than the actual price because of course
617:08 we're out here to make a profit on these products
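The "halfway to 2/3 point" reading of the line and clustered column chart is just production cost divided by price. A small sketch with invented prices and costs (the real figures are in the downloadable data, not reproduced here):

```python
# Hypothetical price and production cost per product, not the tutorial's data.
products = {
    "Duct Tape": {"price": 10.0, "production_cost": 6.0},
    "Backpack": {"price": 40.0, "production_cost": 22.0},
    "Water Purifier": {"price": 30.0, "production_cost": 19.5},
}


def cost_ratio(item: dict) -> float:
    """Production cost as a fraction of price (where the line sits on the column)."""
    return item["production_cost"] / item["price"]


ratios = {name: cost_ratio(item) for name, item in products.items()}
margins = {name: item["price"] - item["production_cost"] for name, item in products.items()}

for name in products:
    print(f"{name}: cost is {ratios[name]:.0%} of price, margin {margins[name]:.2f}")
```

A ratio above 1.0 would mean the line crosses above the column, which is the case this kind of chart makes easy to spot at a glance.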
617:12 so let's minimize this one we're going to put this one right down
617:13 here let's make it even smaller let's
617:16 click out of that and the next one that
617:18 we're going to take a look at is a
617:20 scatter chart so let's click on that and
617:23 make it much larger
617:26 oops there we go so let's use the price
617:29 and the production cost again and so our
617:32 x-axis is the price our y-axis is the
617:35 production cost but now we need to fill
617:37 in this values field right here so let's go
617:39 over here and click on the product name
617:40 and drag that into values and so now we
617:43 have our values we just don't know what
617:44 they are but we can see it so let's drag
617:47 this down to legend as well and it
617:50 breaks it out and we kind of have this
617:51 scatter plot and you know for this fake
617:54 data that we're using it doesn't really
617:56 show a lot uh but if you're using real
617:58 data you can definitely find outliers
618:00 and trends and patterns using this type
618:02 of visualization let's go ahead and make
618:04 that one small as well drag it right
618:07 down into the corner
618:11 now let's go right over here and we have the dreaded pie charts um
618:13 and donut chart now look I think it's kind
618:15 of a joke in the data analyst community
618:17 about pie charts and donut charts but
618:20 at the same time people use them and
618:21 they request them and so sometimes
618:23 you're going to use it whether you like
618:25 it or not so let's click on the donut
618:27 chart and let's make this one a lot
618:31 larger and let's go over here and let's
618:33 click on
618:35 state and we're also going to click on
618:37 total purchased and that's really all
618:40 you have to do these ones are pretty
618:43 straightforward you can change a few
618:44 different things like where these labels
618:46 are if you want them inside you can also
618:49 do that and that would look totally fine
618:51 um again I'm just not a super huge fan
618:54 but you will get this one requested
618:55 people like this and want to see it and
618:57 the reason a lot of analysts don't like
618:59 using this is because when you start
619:01 glancing at these it's really hard to
619:03 tell the difference between these sizes
619:06 if you look at something like this you
619:08 can easily see that this is larger like
619:10 if you're looking at this one the
619:11 multi-tool survival knife is obviously
619:13 the longest and it gets shorter shorter
619:15 shorter shorter but when you start
619:16 getting in here it's really hard to
619:18 approximate the size I would not be able
619:20 to tell the difference between this 5.63
619:22 5.78 two uh 7.72 I would not be able to
619:26 tell really the difference between these
619:28 or kind of the difference between
619:30 them very easily that's why a lot of
619:33 people don't want to use them in general
619:36 so again I want to show you this one
619:37 because I think it's worth noting and
619:39 worth knowing how to use but I don't
619:42 really push people towards this because
619:44 I don't think it's the best
619:45 visualization available most of the time
619:48 all right the next two are super easy
619:50 but are used all the time uh maybe more
619:53 than some of these even but they're just
619:55 so easy to use so I kind of saved them
619:57 for last this one is the card and all
620:01 the card is is it displays one number or
620:04 multiple numbers if you want to use a
620:05 multi-card but we'll just look at the
620:07 card for now all we're going to look at
620:09 is the total purchased and it's just
620:11 going to display it just like this and
620:13 you can make it as large or as small as
620:15 you'd like and normally it goes on like
620:17 the top and you'll put card here a card
620:19 here um just for example I'll kind of
620:22 show you how this might look so it looks
620:24 something like this right and at the top
620:26 it'll have different usually high
620:28 overarching information and this is
620:31 super common to see and I'm sure if
620:32 you've looked at other people's
620:33 visualizations you'll see something like
620:35 this this is usually totals or averages
620:38 or something like that in here where
620:39 it's super easy to look at so like right
620:42 here this is total purchased and we can
620:44 go in and look at the minimum and then
620:46 we can go over here and this one can be
620:49 a count and so it gives us a lot of
620:52 information just at a really quick
620:53 glance and then we have all of our more
620:55 in-depth colorful visualizations that
620:57 kind of have more information than just
620:59 a single piece like the card does
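Each card boils one column down to a single number through one aggregation. The sketch below mirrors the three cards shown above (sum, minimum, count of total purchased) using invented values; the card labels are only illustrative names.

```python
# Hypothetical "total purchased" values, one per purchase row.
total_purchased = [120.0, 45.5, 300.0, 80.25]

# One aggregation per card, as in the row of cards across the top of the report.
cards = {
    "Sum of Total Purchased": sum(total_purchased),
    "Min of Total Purchased": min(total_purchased),
    "Count of Total Purchased": len(total_purchased),
}

for label, value in cards.items():
    print(f"{label}: {value}")
```

An average card would be the sum divided by the count, which is why totals, averages, and counts tend to sit side by side in that top row.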
621:01 and then the very last one that I'm going to
621:02 show you is this one right here which is
621:05 the table and this one is obviously
621:07 extremely popular it's like a little
621:09 Excel table and we can go in here and we
621:12 can get the customer wherever that is
621:15 and then we'll also get the units sold
621:17 and this is what it looks like and it's
621:19 super easy and oftentimes you'll have it
621:21 like on the side as well uh and all the
621:23 other visualizations over here and so
621:26 you know if we're going to take all
621:27 these visualizations and pretend they
621:28 were like a real thing you know there's
621:31 a lot in here but we'll just kind of
621:34 really quickly do this um you know we
621:36 might have something like this and we'll
621:39 make this larger and make this
621:42 wider and you know we have a lot of
621:45 information just in here and this is not
621:47 a project so don't go put this on your
621:49 portfolio I just threw a ton of random
621:51 visualizations on you know this
621:53 dashboard but you can already see a lot
621:56 of these you most likely have seen in
621:58 other people's work in other people's
622:00 visualizations on LinkedIn or on YouTube
622:02 these are very common very very popular
622:05 these are very common very very popular and again we did not go through all of
622:07 and again we did not go through all of the ones over here there are maps that
622:09 the ones over here there are maps that you can use but I haven't used Maps ever
622:12 you can use but I haven't used Maps ever in my job there are things like gauges
622:14 in my job there are things like gauges and decomposition trees and waterfall
622:18 and decomposition trees and waterfall charts and uh tree maps and all these
622:21 charts and uh tree maps and all these different things but I really have never
622:23 different things but I really have never used those in my actual job and I don't
622:26 used those in my actual job and I don't see them a lot in others people's work
622:28 see them a lot in others people's work either otherwise I would be telling you
622:30 either otherwise I would be telling you to learn these and use these but again
622:32 to learn these and use these but again try them out see which ones you like if
622:34 try them out see which ones you like if you like this video be sure to like And
622:36 you like this video be sure to like And subscribe below and go check out all the
622:37 subscribe below and go check out all the other powerbi tutorial videos that I
622:39 other powerbi tutorial videos that I have on my channel and I will see you in
622:41 have on my channel and I will see you in the
622:41 the [Music]
622:53 [Music] next what's going on everybody welcome
622:55 next what's going on everybody welcome back to the powerbi tutorial Series
622:57 back to the powerbi tutorial Series today we are going to be working on our
622:59 today we are going to be working on our final
623:07 now this is our final project of the Power BI tutorial series so if you have
623:09 not watched all of those videos leading
623:11 up to this I recommend going and
623:13 watching those videos so you can make
623:15 sure that you know all the things that
623:16 we're going to be looking at in today's
623:17 project I am really excited to work on
623:19 this project with you because I think it
623:20 is a really good one and it uses real
623:22 data that we collected about a month ago
623:25 where I took a survey of data
623:26 professionals and this is the raw data
623:28 that we're going to be looking at and so
623:30 I think it's just really interesting
623:31 that we collected our own data and now
623:33 we're using it for a project we're going to
623:34 transform the data using Power Query and
623:36 then we'll actually create the
623:37 visualizations and finalize the
623:39 dashboards as well as create a theme and
623:41 a different color scheme to kind of make
623:43 it a little bit more unique without
623:45 further ado let's jump onto my screen
623:46 and get started with the project all
623:47 right so before we jump into it I wanted
623:49 to let you know that you can get the
623:50 data below it is on my GitHub you can go
623:52 and download this exact file that we're
623:54 going to be looking at now in the past
623:57 several projects we have been using this
623:59 fake apocalypse data set you know it was
624:02 fun it was you know whatever this
624:04 data set is real this is a real data set
624:06 it was a survey that I took from data
624:08 professionals I posted on LinkedIn and
624:10 Twitter and all these other places and
624:12 we had about 600 to 700 people who
624:14 responded to the questions so before we
624:16 actually get into it and start cleaning
624:18 the data and doing all this stuff in
624:20 Power BI I just wanted to show you the
624:23 data all right so this is the CSV that I
624:25 downloaded from the survey website that
624:27 I used and this is completely raw data I
624:29 haven't done anything to it at all let's
624:32 go through the data really quickly and
624:33 we'll kind of see what we have and we
624:35 are not going to make any changes at all
624:37 in Excel we're going to do all of our
624:40 transformations or at least a few
624:41 transformations in Power BI because again
624:44 this is a Power BI tutorial and project
624:47 so I want you to kind of learn how to
624:48 use that and not use Excel because you
624:50 can go through my Excel tutorial if you
624:52 want to do that so let's just look at it
624:54 in Excel and then we'll move it over to
624:56 Power BI and actually start transforming
624:58 the data so we have this unique ID these
625:00 are all the people that actually took it
625:02 oops don't want to do that we have an
625:04 email which this was completely
625:05 anonymous I didn't collect any data or
625:08 user data on this then we have the date
625:10 taken um and let's get into the actual
625:13 good information then we have all of
625:15 these questions so we have question one
625:18 which title fits you best and they can
625:19 choose things now uh let's add a filter
625:22 really quickly so that we can look at this
625:25 now you had the pre-selected ones which
625:28 were like data analyst architect
625:30 engineer but then there was an option
625:31 where you could say other and you could
625:33 specify what that was so if you look
625:36 in here we're going to have all these
625:38 different other please specify with
625:40 different titles right and there were a
625:43 lot of them now typically what you want
625:47 to do is really clean this up and we're
625:50 not going to be doing a ton ton ton of
625:52 data cleaning but we are going to do
625:54 some in Power BI but none in here but
625:57 typically with this amount of data and
625:59 the way that it's formatted we would do
626:01 so much data cleaning um with this one I
626:03 mean there is a lot of work to be
626:05 done um like this current yearly salary
626:08 this is one that I would absolutely be
626:10 cleaning up because it's ranges and it
626:13 has a dash and a k and all these
626:15 numbers this is something that I would
626:17 be cleaning up and using but we're not
626:19 going to be cleaning this up right now
626:21 so anyways let's just get into it let's
626:23 see what questions we asked uh we have
626:25 the yearly salary what industry do you
626:27 work in favorite programming
626:30 language then there were a lot of
626:32 different options this is like one
626:34 question where they picked multiple
626:36 options so it's how happy are you in your
626:38 current position with the following you
626:39 have your salary work life
626:42 balance um then we have co-workers
626:46 management upward mobility learning new
626:49 things um and they could rank it from
626:51 zero to 10 so some people ranked upward
626:53 mobility a 10 some ranked it a zero or a
626:56 one um and again they can answer however
626:59 they want how difficult was it to break
627:02 into data very very difficult very easy
627:06 um if you're looking for a new job we
627:08 have you know what would you be looking
627:09 for remote work better salary etc we
627:12 have male female which country are you from
627:14 and then this is more like demographics
627:16 so if you're a male how old you are and
627:19 this was in a range so this is like
627:22 a sliding bar so you could slide it to
627:23 your exact age there's some
627:26 people who are apparently 92 um which if
627:29 that's true I mean good for you man or
627:31 woman actually really quickly I'm going
627:33 to see just while we're here I'm
627:36 going to see if this is a male or a
627:37 female oh it's a female from India very
627:40 cool um so we have all this information
627:44 and it is a lot of information when you
627:46 have something like this I mean there is
627:49 so much data cleaning that can be done I
627:52 mean I already see like 20 plus
627:56 different things that I would need to do
627:58 to make this a lot better um and we also
628:01 have date taken and the time taken as
628:03 well as how long they took on it like
628:06 the time spent really just really
628:08 interesting data but again this is a
628:11 beginner tutorial series this is the
628:14 beginner project so we're not going to
628:15 do anything too crazy I will be
628:18 using this exact data set in a future
628:20 video doing a lot more data cleaning and
628:24 creating a much more advanced
628:25 visualization with what we have and what
628:28 we're looking at right here but for this
628:29 video we're just going to be doing a
628:31 pretty simple visualization and
628:33 dashboard that you can use uh to
628:35 practice with or put on your portfolio
628:37 if you know that's where you're at right
628:39 now so let's get out of here and let's
628:41 put this into Power BI so let's exit out
628:43 and let's come right over here to import
628:45 data from Excel we'll click on Power BI
628:48 final project and
628:50 open give that a second doing this all
628:53 in real time we only have the one so
628:55 we won't be practicing any
628:57 joins or anything but we're not going to
628:59 load it we're going to transform this
629:01 data so let's put it into the Power Query
629:05 editor and now we have all of our data
629:08 in here and it should look extremely
629:11 familiar now when I'm looking at this
629:14 when I start looking at this information
629:17 I kind of need to know beforehand what I
629:20 want to get out of this do I need to
629:22 clean every single column do I just need
629:24 to clean a few of them do I need to get
629:26 rid of columns that's kind of where my
629:28 head's at and so right off the bat I can
629:31 already tell you that there are columns
629:32 that we can just delete to get out of
629:33 our way so we're going to do that at the
629:36 beginning so that we don't have to do
629:37 that later on or they're just in our way
629:39 so I'm going to click on Browser and
629:41 then I'm going to hit shift and I'm
629:43 going to go over here to
629:44 Referrer and I'm just going to go up here
629:46 to remove columns and everything that we
629:49 do is going to go over here to this
629:50 applied steps if you've been following
629:52 this series um you know we can remove
629:55 things add things but anything we do
629:57 will show up right over here so we can
629:59 track it and go back if we need to now
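For readers who want to see the idea outside the Power BI interface, here is a minimal Python sketch of the two things described above: dropping a contiguous run of columns in one go, and keeping a running log of each transformation so it can be reviewed later. The intermediate column names (OS, City, Country), the sample row, and the helper name `remove_column_range` are assumptions made up for illustration; this is not how Power Query is implemented.

```python
def remove_column_range(rows, first, last, applied_steps):
    """Drop every column from `first` through `last` (inclusive), in header order."""
    if not rows:
        return rows
    header = list(rows[0].keys())
    start, end = header.index(first), header.index(last)
    dropped = header[start:end + 1]
    # Log the step, loosely imitating an "applied steps" list.
    applied_steps.append("Removed Columns: " + ", ".join(dropped))
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


# Hypothetical sample row; the real survey file has many more columns.
rows = [{
    "Unique ID": "abc123",
    "Browser": "Chrome",
    "OS": "Windows",
    "City": "",
    "Country": "",
    "Referrer": "",
    "Which title fits you best in your current role": "Data Analyst",
}]

applied_steps = ["Source", "Promoted Headers"]
cleaned = remove_column_range(rows, "Browser", "Referrer", applied_steps)

print(list(cleaned[0].keys()))
print(applied_steps)
```

Because every change appends to `applied_steps`, the list doubles as an audit trail, which is the same reason the applied steps pane is useful for going back.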
630:02 one column that I know for sure that I'm
630:04 going to be using quite a bit is this
630:06 which title fits you best in your
630:08 current role because I specifically
630:09 wanted to do a breakdown of different
630:11 people's roles and how much they make
630:13 and different stuff like that so I know
630:15 that I want to use this but as we saw
630:18 before the issue is
630:20 it's not very clean right it has data
630:23 analyst data architect engineer
630:25 scientist database developer and then
630:27 like a hundred different options and
630:30 then a student or none of these right
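One quick way to see how messy a column like this is, sketched here in Python rather than with the filter dropdown, is to count how many distinct values it holds. The responses below are invented for illustration, and the "Other (Please Specify):" prefix is an assumption about how the survey tool labels free-text answers.

```python
from collections import Counter

# Hypothetical responses; the free-text answers are made up for this sketch.
titles = [
    "Data Analyst",
    "Data Analyst",
    "Data Engineer",
    "Data Scientist",
    "Other (Please Specify):Software Engineer",
    "Other (Please Specify):Software Engineer, AI",
    "Other (Please Specify):Data Manager",
    "Other (Please Specify):Data manager",
    "Student/Looking/None",
]

counts = Counter(titles)
print(len(counts), "distinct values")
for value, n in counts.most_common():
    print(n, value)
```

Every free-text variation shows up as its own category, which is why the number of options balloons so quickly.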
630:39 um and so for the purpose of this video right here we are not going to take
630:42 every single one of these options
630:43 because this involves a lot more data
630:45 cleaning let me give you an example this
630:47 says software engineer this also says
630:50 software engineer with AI these two
630:53 would typically be combined or
630:56 standardized to software engineer but
630:59 it's not very easy to do that in Power BI
631:02 we could do that in Excel but not really
631:04 in Power BI or even SQL if we pull this
631:06 from a SQL database um and you can find
631:09 lots of different you know examples of
631:11 that we have data manager and data
631:13 manager if we separated these out these
631:15 would be different options when we
631:18 created our visualizations and we don't
631:19 want that so what we are going to do uh
631:22 and this is going to be kind of an
631:24 easy way out to just make sure that this
631:27 is pretty clean and we don't
631:28 have a thousand different options we're
631:30 going to group these into other so we're
631:33 going to simplify this a lot and then we're
631:36 going to use this so we'll have maybe
631:38 six or seven options instead of the you
631:40 know let's say 50 that we would have if
631:43 we actually did the harder work which is to
631:45 just break it out standardize it and
631:47 clean it up that way so what we're going
631:49 to do is we're going to click on this
631:51 right here and we're going to go up here
631:52 to split column in this ribbon up top
631:55 we'll go to split
631:56 column and we want to do it by a
631:59 delimiter and if you notice let me see
632:01 if I can move this over if you notice we
632:03 have other and then we have this
632:05 parenthesis and in no other option or
632:07 way is there a parenthesis so what we're
632:09 going to do is we're going to use a
632:12 custom delimiter and we'll use this open
632:15 parenthesis what that's going to do is
632:17 it's going to separate it by this
632:18 parenthesis it's going to leave the
632:19 other it's going to create separate
632:22 columns um just one separate column for
632:24 each of these and we can do that at each
632:26 occurrence or we can do the leftmost and
632:28 we really only need it for the
632:30 leftmost because there's only one of
632:32 these uh left-handed or left-sided uh
632:35 brackets or what is it whatever this
632:38 is called and then let's go and click
632:40 okay and it should create another column
632:43 so it's going to have .1 and .2 and
632:47 now we have if we click on this now we
632:49 only have these options we have analyst
632:52 architect engineer data scientist
632:54 database developer other and student
632:56 looking or none that is what we want it
632:58 makes it so much simpler and it's not
633:01 perfect but again I'm trying to show you
633:03 what we are able to do in Power BI so now
633:06 we're just going to remove that column
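The difference between splitting at the leftmost delimiter and splitting at each occurrence can be sketched in a few lines of Python. This is only an analogy for the split column dialog, not its implementation; the sample value and the helper names `split_leftmost` and `split_each` are made up for the example.

```python
def split_leftmost(value, delimiter):
    """Split at the first occurrence only; the second part is None if no delimiter."""
    before, found, after = value.partition(delimiter)
    return before, (after if found else None)


def split_each(value, delimiter):
    """Split at every occurrence of the delimiter."""
    return value.split(delimiter)


# Hypothetical free-text answer that happens to contain a second parenthesis.
raw = "Other (Please Specify):Analyst (Finance)"

print(split_leftmost(raw, "("))             # two parts, whatever follows stays together
print(split_each(raw, "("))                 # three parts
print(split_leftmost("Data Analyst", "("))  # no delimiter, nothing to split off

# Keep only the first part, trimmed, as the simplified title.
simplified = split_leftmost(raw, "(")[0].strip()
print(simplified)
```

With leftmost, any extra parentheses inside the free text cannot create additional columns, so the first part is always the simplified label and the remainder can be dropped as a single column.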
633:08 and we're going to go and do the exact
633:10 same thing to this one as well because I
633:13 know that we want to use this and I
633:15 really wanted to use this one as well
633:17 but if we look at this one also um
633:20 there's a lot so I said what is your
633:22 favorite programming language and
633:24 there were pre-selected answers like
633:26 JavaScript Java C++ Python R things like
633:29 that and then there was an other option
633:32 and in this other option I mean it was
633:34 free text so they can fill it in as they
633:36 want I mean there's four five six
633:38 different ways that people put SQL that
633:41 is something I would standardize and you
633:43 know that would be the way I cleaned it
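As a rough idea of what that standardizing could look like outside Power BI, here is a small Python sketch that normalizes case, whitespace, and punctuation before looking the answer up in an alias table. The sample answers and the contents of `ALIASES` are assumptions invented for illustration, not values taken from the survey.

```python
import string
from collections import Counter

# Hypothetical alias table: normalized spelling -> standard label.
ALIASES = {"sql": "SQL", "tsql": "SQL"}


def standardize(answer):
    """Map spelling variants onto one label; leave unknown answers as typed."""
    key = answer.strip().lower()
    key = key.translate(str.maketrans("", "", string.punctuation))
    key = " ".join(key.split())
    return ALIASES.get(key, answer.strip())


# Invented free-text answers showing a few ways the same language gets written.
answers = ["SQL", "sql", " Sql ", "S.Q.L", "T-SQL", "Python"]
cleaned = Counter(standardize(a) for a in answers)
print(cleaned)
```

Whether a dialect such as T-SQL should count as SQL is a judgment call; the alias table is where that decision would be recorded.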
633:46 but that's not how we did it in here so
633:48 we're going to do the same thing we're
633:49 going to keep that other so we're going
633:51 to split this column again we'll use a
633:53 delimiter and for this delimiter though
633:56 we're going to use a colon so we're
633:58 going to say we're going to do a colon
634:00 right there we'll just do the leftmost
634:02 we'll click okay and then we have our
634:06 options and it's much simpler now I
634:08 really would have rather kept all these
634:11 because SQL's in there quite a bit
634:13 but you know a lot of people don't think
634:15 SQL is even a programming language so uh
634:17 we're going to delete that column now
634:19 one that I just skipped and I kind of
634:21 wanted to go back to is this current
634:23 yearly salary I really want to use this
634:27 let's see if we can use it here's what
634:30 I want to do with it and this is not
634:31 perfect um for this video I want to try
634:34 it what I want to do is break up these
634:36 numbers 106 125 and then take the
634:39 average of those numbers so then we'll
634:41 use some DAX in there so we'll take
634:43 106 125 split that into two separate
634:45 columns then we'll create a third column
634:48 that will give us the average of those
634:50 two numbers so we'll do 106 plus 125
634:53 divided by two and then we'll have the
634:56 average of that now that is not perfect
634:58 but it's going to give us at least you
635:00 know an average kind of roundabout
635:02 number because they gave us this range
635:04 they said my salary is between 106 and
635:06 125,000 so if we say that their salary
635:09 was
635:09 115,500 at least it makes it
635:12 usable it's a numeric value instead of
635:14 being this which is text which we really
635:17 could use and I'll show you how
635:19 to do that because we're going to keep
635:20 this column I'll create a copy of this
635:22 and I'll show you the difference between
635:23 this and using the average but
635:28 for this data cleaning portion let's
635:30 just try it let's see what we can do and
635:33 see if we can make it work so first
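The arithmetic behind this plan is just the midpoint of each range: (106 + 125) / 2 works out to 115.5, so a respondent in that band would be treated as earning about 115,500. A minimal Python sketch follows; the second and third bands are hypothetical values added only to show averaging across several respondents.

```python
from statistics import mean


def range_midpoint(low, high):
    """Midpoint of a salary band, in the same units as the inputs (thousands)."""
    return (low + high) / 2


print(range_midpoint(106, 125))

# Hypothetical bands for three respondents, in thousands.
bands = [(106, 125), (41, 65), (66, 85)]
print(mean(range_midpoint(low, high) for low, high in bands))
```

The midpoint is only an estimate, since the real salary could sit anywhere in the band, but it turns a text label into a number that can be averaged and charted.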
635:35 let's create a duplicate so we're going
635:38 to uh duplicate the column so now we
635:42 have this copy at the very very end and
635:45 we can use this one instead of having to
635:47 use the original way way way back here
635:50 so we're going to leave that one how it
635:51 is and we're going to use this one so
635:55 let's go ahead and split this one up
635:57 we're going to click on the column
635:58 header then we're going to click on
636:00 split column and we'll do it by digit to
636:04 non-digit and if you look at it right
636:07 here it's broken it out kind of um in
636:10 the fact that now in this one we just
636:13 have numeric values and in this one we
636:16 have k dash numeric or just dash numeric and
636:21 now this can be easily cleaned whereas
636:24 this one we can just completely get rid
636:25 of because it's only k so we'll just
636:28 remove that column and then in this one
636:30 we're going to right-click we're going to
636:32 click on replace values and so we'll
636:35 just do a k we'll replace it
636:38 with nothing we'll do okay and then for
636:41 the last one we'll go to replace values
636:45 and we'll do the dash or the minus sign
636:47 and we'll replace that with nothing and so
636:49 now we have our values as well oh we
636:52 also have a plus let me get rid of that
636:54 because that's when some people had 250
636:56 or 225,000 plus so for that one the
636:59 average is just going to be 225 we'll
637:02 have to specify that in our DAX I
637:03 forgot but actually if somebody has
637:06 225 let me find this plus really quick
637:10 uh let me filter by it because that's a
637:12 lot faster what we actually want to do
637:15 for the purpose of this one is we want
637:17 to put 225 here so that when we do 225
637:20 plus 225 divided by two it comes out to
637:23 225 that's just what we're going to put
637:25 it as and there's only two people so uh
637:27 I'm actually going to replace this I'm
637:29 going to do replace values I'm going to say
637:31 plus
637:33 with
637:34 225 and we'll click okay awesome we can
637:38 unfilter these select all so we're going
637:41 unfilter these select all so we're going to go right up here to add column we're
637:44 to go right up here to add column we're going to say custom
637:46 going to say custom column and we're going to go right over
637:48 column and we're going to go right over here actually let's make it uh
637:51 here actually let's make it uh average salary let's make it average
637:55 average salary let's make it average salary so we're going to insert this I'm
637:58 salary so we're going to insert this I'm going to
638:00 going to say parentheses and we're going to say
638:05 say parentheses and we're going to say plus this
638:07 plus this insert and close the parenthesis divided
638:09 insert and close the parenthesis divided by two and it says no syntax errors have
638:13 by two and it says no syntax errors have been detected let's click on okay and
638:17 been detected let's click on okay and it's giving us an error so it's saying
638:19 it's giving us an error so it's saying we cannot apply operator plus to types
638:21 we cannot apply operator plus to types text and text which makes perfect sense
638:24 text and text which makes perfect sense these aren't uh numbers so let's make it
638:26 these aren't uh numbers so let's make it a whole number and let's make it a whole
638:29 a whole number and let's make it a whole number and then let's see if this will
638:32 number and then let's see if this will actually work
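For readers who want to see the same cleanup outside Power Query, here is a minimal Python sketch. It assumes the raw answers look like `0-40k`, `41k-65k`, or `225k+`; those exact strings and the function name `parse_salary_range` are illustrative assumptions, not taken from the survey file. It mirrors the steps above: strip the k and the plus, split on the dash, and convert the text to whole numbers before doing any math.

```python
def parse_salary_range(raw: str) -> tuple[int, int]:
    """Turn a range such as '41k-65k' into two whole numbers (in thousands)."""
    # Mirror the "replace values" steps: drop the k and the plus sign.
    cleaned = raw.lower().replace("k", "").replace("+", "").strip()
    # Split on the dash; an open-ended answer such as '225k+' has no upper bound.
    low_text, _, high_text = cleaned.partition("-")
    # Convert text to whole numbers so that later arithmetic is numeric.
    low = int(low_text)
    # For an open-ended answer, reuse the lower bound as the upper bound.
    high = int(high_text) if high_text else low
    return low, high


print(parse_salary_range("0-40k"))   # (0, 40)
print(parse_salary_range("225k+"))   # (225, 225)
```

Reusing the lower bound for the open-ended answer matches the choice made in the video, so that the later average comes out to 225.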
638:35 no or maybe we just need to try a whole
638:37 another one so let's try transform or
638:40 add column custom
638:43 column let's try this all again see if
638:45 uh I can make it
638:47 work
638:48 insert do this
638:51 one
638:53 plus this
638:55 one and we'll do divided by two and let's
638:58 try this one and there we go so now
639:01 let's get rid of this column
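The custom column is just the midpoint of the two bounds: (low + high) / 2. A small Python sketch of that arithmetic follows; the name `average_salary` is hypothetical, and the 41 and 65 bounds are an assumed example of a range whose midpoint is 53.

```python
def average_salary(low: int, high: int) -> float:
    """Midpoint of a salary range, in thousands."""
    # Both bounds must already be numbers; adding two text values is what
    # triggered the "operator + to types text and text" error above.
    return (low + high) / 2


print(average_salary(0, 40))     # 20.0
print(average_salary(225, 225))  # 225.0, the open-ended "225 plus" case
print(average_salary(41, 65))    # 53.0 (bounds assumed for illustration)
```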
639:04 and we can actually remove these
639:06 ones as
639:07 well because now we have this
639:12 um average salary
639:16 column which when we look at this or
639:18 when we use this uh we can let me see if
639:21 I can just move this way way way over
639:23 all right I might cut because this is
639:25 taking forever so if you take the
639:27 average of these two numbers you'll get
639:29 53 if you take the average of 0 and 40
639:31 you'll get 20 so now we have this
639:33 average salary and again when we get to
639:35 the actual visualization part I'll show
639:37 you why this isn't as useful as having
639:40 this average salary and just a reminder
639:42 this is not perfect uh I wouldn't
639:44 typically do this especially if I had it
639:47 in Excel or if I was you know creating
639:49 this survey in a different way I would
639:51 probably have a very specific value
639:53 where they could do it on a slider but
639:55 this is how it is so we've at least made
639:57 it usable or more usable in my mind and
640:00 we have a few other things that we can
640:02 change like what industry do you work in
640:04 where we can break this one out so I'm
640:06 going to go ahead and break this one out
640:07 as well
640:09 as this one right here which country do
640:11 you live in I'm going to break both
640:13 of those out to where it's the country
640:15 or other I'm not going to have these
640:17 other values although there are a lot of
640:19 them because there's a lot of people who
640:20 live in these different countries but we
640:23 can't really do that super well in here
640:25 because again the same issue kept
640:27 happening Argentina Argentina Argentina
640:30 Australia so we can't normalize those
640:33 values unless we spend just a copious
640:36 amount of time doing that so I'm going
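The "country or other" idea can be sketched in Python as a simple bucketing rule. This is only an illustration of the concept, not the Power Query steps used in the video; the set `KNOWN_COUNTRIES` and the function `bucket_country` are hypothetical names, and the four countries listed are the ones that show up later in the dashboard.

```python
# Countries kept as their own category; everything else becomes "Other".
KNOWN_COUNTRIES = {"United States", "India", "United Kingdom", "Canada"}


def bucket_country(answer: str) -> str:
    """Keep a known country as-is and collapse free-text answers into 'Other'."""
    name = answer.strip()
    # Free-text variants (extra spaces, misspellings) are not normalized here;
    # anything that is not an exact known country falls into "Other".
    return name if name in KNOWN_COUNTRIES else "Other"


print(bucket_country("India"))       # India
print(bucket_country("Argentina"))   # Other
print(bucket_country("Argentine"))   # Other
```

The trade-off is the one described above: the write-in answers lose their detail, but no time is spent normalizing every spelling.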
640:38 to go ahead and do these I'm going to
640:40 fast I'm going to speed this up so it
640:42 goes a lot faster so I'm just going to
640:44 go silent and let this happen really
640:46 quick and then we'll get to the end and
640:48 we'll actually start building our
641:00 visualizations all right so we've split them up and as you can see we have all
641:02 these options as well as other and I
641:05 think you know there is let me tell you
641:08 there is so much more that we could do
641:10 with this I mean just so many other
641:13 things but this is like the bare
641:16 minimum of what we need for this project
641:19 so let's go ahead and close and apply
641:22 this and if we need to come back at any
641:24 point and actually fix anything or
641:26 change anything we can so it's not like
641:28 that's permanent um so as you can see we
641:30 have everything over here we have all
641:32 our data as it is transformed in here as
641:35 well and now we can start building out
641:39 our visualization let's go back to our
641:42 report and let's start building
641:44 something out all right so let's add a
641:45 title to our
641:48 dashboard we want to make this right at
641:50 the
641:51 top we'll call this the
641:54 data
641:56 professional
641:58 survey
642:00 breakdown and let's make that quite
642:03 a bit
642:04 larger make it bold why not and we'll
642:08 put that in the
642:09 center and now let's um let's add some
642:13 effects let's change that background to
642:15 something like it's too dark something
642:19 like this and I do not like that bold
642:21 let's take that
642:22 off there we go so something like this
642:25 just as a quick title to what we're
642:28 about to do what we are about to build
642:30 so we're going to start off with the
642:31 most simple visualizations that we're
642:33 going to do and we'll kind of work our
642:35 way towards kind of the harder ones so
642:37 the first one that we're going to start
642:38 off with is a card and the cards are
642:41 obviously like just super super easy
642:44 they usually just display one piece of
642:46 information so we're going to go right
642:48 over here to the very bottom at the
642:50 unique ID and we're going to select it
642:54 and we're going to say a count of
642:56 distinct or a count it doesn't matter um
642:59 it says 630 count of unique ID now we're
643:02 not going to keep that as is we're
643:03 actually going to go right over here
643:05 we're going to say rename for this
643:06 visual and it says count of unique ID
643:08 but we're going to say count
643:11 of survey takers and you can say
643:15 whatever you want here but in general
643:17 that is what it is we're counting
643:19 how many people um you know took this
643:22 survey and that's just a kind of a total
643:24 maybe I should say total amount of
643:27 survey takers but you can say count of
643:29 survey takers how many people took this
643:31 survey so let's click out of there let's
643:33 click on card let's make it about the
643:36 same size we're going to drag it up
643:38 here and try to make them about the same
643:42 we will in a little bit we'll make them
643:43 the same size um but for this one we're
643:46 going to look at age so we're going to
643:48 look at current age so I'm going to click
643:50 on that and we'll say we want the average
643:53 age so our average survey taker is almost
643:56 30 years old so let's go right over here
643:58 we're going to say rename for this
644:00 visual we'll say average age of
644:05 survey oops this might be too
644:08 long average age of survey taker again
644:11 name it whatever you'd like so again
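What the two cards compute can be sketched in a few lines of Python. The rows below are made-up placeholders, not survey data, and the field names `unique_id` and `age` are assumptions. When every ID is unique, a count and a distinct count give the same number, which is why either aggregation works for the first card.

```python
from statistics import mean

# Made-up example rows; the real survey has 630 responses.
responses = [
    {"unique_id": "a1", "age": 26},
    {"unique_id": "b2", "age": 30},
    {"unique_id": "c3", "age": 34},
]

# Card 1: count of survey takers (a distinct count of the ID column).
survey_takers = len({row["unique_id"] for row in responses})

# Card 2: average age of survey taker.
average_age = mean(row["age"] for row in responses)

print(survey_takers, average_age)
```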
644:14 these are meant to be high-level numbers
644:16 so when somebody's looking at your
644:17 dashboard they can just really quickly
644:20 glance at this and know exactly what it
644:21 is instead of like some of these other
644:23 visualizations that we're about to
644:24 create they don't really have to dig
644:26 into it look at the x-axis the y-axis
644:29 the different uh legend colors and
644:31 whatnot they can just see these high-level
644:33 numbers and get a really quick glance of
644:35 the data now let's create our first
644:37 visualization and what we're going to do
644:39 for that one is a clustered bar chart so
644:42 let's go ahead and click on the
644:43 clustered bar chart we can create it as
644:45 small or as large as we'd like and for
644:48 this one we're going to be looking at
644:50 the job titles now remember we kind of
644:53 changed the job titles or you know uh
644:56 transformed those if you want to say that
644:59 so we're going to look at job titles and
645:00 then we're going to look at their
645:02 average salary and if you remember we
645:04 transformed that one as well we have an
645:07 average salary now this one is it looks
645:09 like a text right now so it may not work
645:11 properly and what we're actually going
645:13 to do is go over
645:14 here I want to see the average
645:19 salary so let's click on average salary
645:21 and see if we can change this data type
645:23 from a text to a decimal number let's
645:27 click yes I forgot to do that when we
645:29 were transforming it and there we go
645:31 this is perfect um so now we can go
645:34 back and we can select our average
645:38 salary and as you can see it has this um
645:40 this function symbol and so now we can
645:42 click on it and it'll look a lot better
645:45 and although this says average salary as
645:47 the title it's actually doing a count or
645:49 the sum so we can click average right
645:52 here and what we want to do is actually
645:54 break this down by the job title and so
645:58 now we can see data scientists are
646:00 making the most by far they're
646:02 making an average of 93,000 at least from
646:05 the survey takers that took it then we
646:07 have our data engineers making
646:09 65,000 data architects are making 63 and
646:13 then where are the data analysts data
646:15 analysts are right here making 55 so
646:18 again we had 630 people take this survey
646:21 and so the vast majority of them were
646:24 data analysts so this one's probably the
646:25 most accurate out of all of them and I
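The chart is doing a group-by: average the salary column within each job title. Here is a minimal Python sketch of that aggregation; the rows are made-up placeholders and `average_by_title` is a hypothetical helper. It also returns the number of responses per title, since an average built on many responses is more trustworthy than one built on a handful.

```python
from collections import defaultdict
from statistics import mean

# Made-up (job title, average salary in thousands) rows for illustration.
rows = [
    ("Data Analyst", 53.0),
    ("Data Analyst", 20.0),
    ("Data Scientist", 95.5),
]


def average_by_title(rows):
    """Return {title: (average salary, number of responses)}."""
    grouped = defaultdict(list)
    for title, salary in rows:
        grouped[title].append(salary)
    return {title: (mean(values), len(values)) for title, values in grouped.items()}


result = average_by_title(rows)
print(result["Data Analyst"])  # (36.5, 2)
```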
646:27 actually don't like how this looks as
646:30 the clustered bar chart let's try the
646:32 stacked bar chart and put this as the
646:35 legend that's more what I was going for
646:37 I don't know I didn't want it as skinny
646:40 because when you're doing this one
646:41 typically they have multiple options per
646:44 um uh x-axis and so I think that's why
646:47 it was that little skinny line but this
646:49 one is more what I was looking for but
646:51 let's make that smaller and let's
646:53 definitely change that title because
646:54 good night um this is like incredibly
646:58 long let's go over here to this format
647:01 visual we'll go to general the
647:05 title and we're just going to say
647:08 average salary by job title just like
647:14 that and this looks a lot better now
647:17 we're not going to kind of format
647:19 our whole dashboard yet we're going to
647:21 create our visualizations and then we're
647:23 going to kind of organize everything and
647:25 kind of play Tetris with it to make it
647:27 look the best so we're just going to
647:30 minimize this and put it right up here
647:33 for now um but we will go back and kind
647:36 of make everything look better at the
647:38 end and actually while we're here I also
647:40 want to change this as well so rename
647:44 for this we're going to say job title
647:47 oops why did I do that
647:51 job title and for this one we're just
647:55 going to
647:56 say name average
648:00 salary there we go looks much better
648:03 much cleaner uh took away a lot of the
648:06 anxiety that I was feeling about two
648:08 minutes ago when we first put that up
648:09 there so let's go on to our second
648:11 visualization the next one that I'm
648:13 interested in is actually what
648:15 programming language people were using
648:17 the most so we have salary there's a
648:19 thousand different things we can look at
648:20 in here but I want to know you know what
648:23 is people's favorite programming
648:24 language so let's take a look at that so
648:27 we have favorite programming language
648:29 let's find that so we have our favorite
648:31 programming language and we also have
648:34 how many people actually took it or the
648:36 unique people so right now this is
648:38 columns we don't want that let's um
648:41 let's do a clustered column chart click
648:44 on this right here and it looks
648:48 like here we go that is kind of what
648:50 we're looking for and instead of count
648:52 of unique ID we'll say count
648:55 of let's do count of
648:59 voters and for favorite programming language
649:02 we'll
649:04 say favorite oops favorite programming
649:07 language and get rid of that as well and
649:10 then we're going to go into here also
649:13 and change the title and say favorite
649:18 programming
649:20 languages or favorite programming
649:22 language just like this now let's make
649:24 this a lot bigger so you can see it but
649:27 really quickly at a glance you can see
649:30 Python is by far the most popular R
649:32 other C++ JavaScript Java now all we're
649:34 seeing is the count so it's all the same
649:36 it's just blue we can see how many
649:38 people voted for each one but if we
649:40 wanted to break it out similar to how we
649:41 did with the job titles we could still
649:44 do that so all we'd have to do is break
649:46 it out uh or bring this job title down
649:48 to the legend and now it breaks out like
649:51 this and that's not exactly what I was
649:53 going for I was going more for something
649:54 like this where we can still see the
649:57 whole count but now we can see who is
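Putting job title on the legend turns one count per language into one count per (language, job title) pair, while the stacked total per language stays the same. A small Python sketch with made-up votes shows the idea; the variable names are illustrative only.

```python
from collections import Counter

# Made-up (favorite language, job title) answers for illustration.
votes = [
    ("Python", "Data Analyst"),
    ("Python", "Data Scientist"),
    ("R", "Data Analyst"),
    ("Python", "Data Analyst"),
]

# Without a legend: one bar per language.
by_language = Counter(language for language, _ in votes)

# With job title on the legend: one segment per (language, title) pair.
by_language_and_title = Counter(votes)

# The segments of a stacked column add back up to the whole count.
python_segments = sum(
    count for (language, _), count in by_language_and_title.items()
    if language == "Python"
)
print(by_language["Python"], python_segments)
```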
650:00 actually voting for these things so
650:02 I'm just not a huge fan of the colors
650:04 that are pre-selected here and kind of
650:06 the whole theme of this dashboard at the
650:09 very end we're going to completely
650:11 revamp this change a bunch of colors the
650:13 background and make this look a lot
650:15 nicer rather than just the white
650:17 background like we have it um and so for
650:20 now let's
650:21 just make this a lot smaller and put it
650:25 into this corner these will not be
650:27 staying there but we need
650:29 room to create our next visualizations
650:31 and just a cleaner space to do
650:32 things now the next thing that I really
650:34 want to include is a way to break down
650:37 where they're from their country because
650:39 especially something like salary is very
650:41 dependent on your country whereas the
650:43 average salary in the United States for
650:44 a data analyst may be like 60,000 in
650:48 another country it could be 20,000 that
650:50 could bring down the average quite a bit
650:52 so we need a way to be able to break
650:54 that down now we can do something like a
650:57 filled map and there's no problem with
650:59 that at all um
651:01 but you know for what we're building
651:04 what we're creating it's probably not
651:06 going to work out the best I mean this
651:08 looks okay we could stick it in the
651:11 corner or something um and you can do
651:13 that and that's perfectly fine I think
651:14 what I'm going to do is something like a
651:16 treemap which I don't use a lot but I
651:21 want something where they can just click
651:22 on it they can look at the
651:24 values
651:26 distinct they can look at the values and
651:28 just click on it and it'll be right
651:30 there for them so they don't have to
651:32 filter it out on their own or know
651:33 geography and look at this map they can
651:35 just read Canada other United Kingdom
651:37 India United States and click on that
651:39 and so for example let's click over here
651:41 on United States the numbers change
651:43 quite a bit now the average salary for a
651:45 data scientist is
651:47 139,000 for data analyst it's 80 and if
651:50 we look at India you know the average
651:52 salary for a data scientist is 68 the
651:55 average salary is 26 for a data analyst
651:57 that doesn't mean that they make less
651:59 money in India that just means that the
652:01 cost of living is probably lower in
652:03 India therefore they don't need the
652:05 higher US dollar salary because again
652:07 this was all done in US dollars so just
652:09 something to think about uh let's click
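Clicking a country in the treemap acts as a filter on every other visual: the averages are recomputed using only the rows for that country. A minimal Python sketch of that cross-filtering follows; the rows are made-up placeholders and `average_salary_for` is a hypothetical helper.

```python
from statistics import mean

# Made-up rows; salaries are in thousands of US dollars.
rows = [
    {"country": "United States", "title": "Data Analyst", "salary": 80.0},
    {"country": "India", "title": "Data Analyst", "salary": 26.0},
]


def average_salary_for(rows, title, country=None):
    """Average salary for a job title, optionally limited to one country."""
    selected = [
        row["salary"]
        for row in rows
        if row["title"] == title and (country is None or row["country"] == country)
    ]
    return mean(selected)


print(average_salary_for(rows, "Data Analyst"))                   # 53.0, all countries mixed
print(average_salary_for(rows, "Data Analyst", "United States"))  # 80.0
print(average_salary_for(rows, "Data Analyst", "India"))          # 26.0
```

The unfiltered number blends very different salary levels, which is why a way to break the dashboard down by country matters.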
652:11 out of that so we'll keep that one as
652:13 well so now let's create our next
652:14 visualization and this is one that I do
652:16 not get to use enough in my actual job
652:18 so we're going to use it in this project
652:20 um and it's going to be this gauge right
652:22 here so let's add that one put it right
652:24 over here we're going to add two of
652:27 those let's just go ahead and add
652:29 another one while we're at it because
652:31 we're going to have them kind of like
652:32 right here right next to each other the
652:34 first one and these ones are really good
652:35 for kind of looking at these kind of
652:38 surveys and I don't get to work with
652:39 surveys enough but we can see you know
652:41 how happy are they in terms of work life
652:44 balance so we can add that we're going
652:46 to add work life balance um and right
652:48 now it's doing a count and we don't have
652:51 minimum or maximum values in there yet
652:53 so it's going to look kind of weird but
652:54 we're going to look at the average rate
652:56 or the average score of these then
652:59 we're going to pull this over to the
653:00 minimum value and we want to put that at
653:02 the minimum and pull this over and add
653:05 the maximum value so now it actually has
653:08 zero to 10 and it shows that the average
653:11 person is happy with which one was this
653:14 the average person is happy with their
653:16 work life balance uh they rate about a
653:19 5.74 overall now let's really quickly
653:23 5.74 overall now let's really quickly change the title of this because this is
653:26 change the title of this because this is ridiculous I want to say happy with work
653:30 ridiculous I want to say happy with work life balance
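The gauge described here comes down to three numbers: a minimum, a maximum, and the average score. A minimal Python sketch of that calculation, assuming the ratings run from 0 to 10 as stated in the video. The sample scores and the function name `gauge_summary` are made up for illustration and are not the real survey data.

```python
from statistics import mean


def gauge_summary(scores, minimum=0, maximum=10):
    """Return the values a gauge visual needs: min, max, and the average score."""
    # Ignore anything outside the gauge range (assumption: such values are entry errors).
    in_range = [s for s in scores if minimum <= s <= maximum]
    if not in_range:
        raise ValueError("no scores fall inside the gauge range")
    return {"min": minimum, "max": maximum, "average": round(mean(in_range), 2)}


# Hypothetical work/life balance ratings, not taken from the survey.
sample_scores = [4, 7, 6, 5, 8, 3, 7]
print(gauge_summary(sample_scores))
```

Switching the aggregation from a count to an average is the same step the video performs in the visual's field settings.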
653:31 so this is their rating uh you know
653:33 change it to whatever title you want
653:35 that's what I'm going to do and we'll
653:36 also do happy with their salary let's
653:40 click on salary we'll add that to
653:43 minimum and we'll add the maximum value
653:46 as well to make sure that we know how to
653:48 use
653:49 that and then we'll take the average so
653:52 not many people are happy with their
653:54 salary I'm just finding out I mean this
653:55 is a real survey this is real data so I
653:56 mean it's pretty interesting let's go
653:59 to the title let's go to happy with or
654:04 maybe it's happiness happiness with
654:07 salary maybe that's what we should make
654:09 it and I'm going to change that over
654:11 here as well I think it sounds better
654:14 some of this I've already planned out
654:15 some I haven't this is not something
654:17 I've planned out so uh so we're going to
654:18 say happiness with work life balance
654:20 happiness with salary really interesting
654:23 um we may go back and tweak these just a
654:25 little bit in the future but the very
654:27 last visualization that we're going to
654:28 do is male versus female kind of got to
654:31 have that in there um I don't typically
654:34 like pie charts and donut charts but uh
654:36 you know I'm feeling I'm just feeling it
654:38 so let's try it um and we will
654:42 do let's see let's make this larger so we
654:46 have male
654:47 female and what do we want to look at
654:49 like what do we want to measure so we
654:50 have male versus female we can measure
654:53 anything um but maybe what we'll do is
654:56 the average salary again I mean we've
654:58 kind of only looked at salary once
655:01 in this one right here um and a little
655:03 bit of like how happy they are but we'll
655:05 look at the average salary between males
655:08 and females and then we'll look at not
655:12 the current age oops I meant average
655:16 salary and then we'll look at the
655:19 average and it looks like the average
655:22 salary is actually really close for
655:24 males versus females 55 for female
655:28 versus 53 for male so actually the
655:30 females are a little bit higher
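The chart here is a grouped average: one average salary per category. A rough Python sketch of the same aggregation, assuming each survey response is a dictionary with `sex` and `salary` keys. Those key names, the helper `average_by_group`, and the rows themselves are hypothetical, chosen only so the averages echo the 55 versus 53 mentioned in the video.

```python
from collections import defaultdict
from statistics import mean


def average_by_group(rows, group_key, value_key):
    """Group rows by one field and average another field within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}


# Hypothetical responses, not the real survey data.
rows = [
    {"sex": "Female", "salary": 60},
    {"sex": "Female", "salary": 50},
    {"sex": "Male", "salary": 56},
    {"sex": "Male", "salary": 50},
]
print(average_by_group(rows, "sex", "salary"))
```

Picking the wrong field, as happens briefly in the video with current age, only means passing a different `value_key`.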
655:32 congratulations so they're just a little
655:34 bit higher in terms of pay so now we
655:36 need to start organizing all of this
655:39 cleaning it up making it look a lot
655:41 better than it does right now it looks
655:42 great uh you know but we can do a lot
655:46 more with this so we're
655:47 going to keep these or all these kind of
655:50 over on this left hand side I'm gonna put
655:52 this I want this up here we also need to
655:54 change that title I want this up here um
655:58 and again we're going to kind of change
655:59 the theme as we go
656:02 I just want to format it
656:05 right we'll have it just like this let's
656:07 change the title of
656:10 this let's go to title and we're going
656:13 to say country of survey
656:16 takers uh the survey takers
656:20 I'm not really stuck on that if you find
656:22 something better you think of something
656:23 better I would go with that but um you
656:26 know it definitely doesn't look bad and
656:28 where did this where did my other
656:29 visualization go there goes um I think
656:32 this one I want to make kind of more
656:34 tall um so I might move it this way jeez
656:37 this is such a I hate having a
656:39 lot of visualizations on here it just
656:41 really is annoying to me so what we're
656:43 going to do I think we're
656:45 gonna step this to the side put this to
656:48 the side as
656:50 well I want to make it to where it's
656:54 just okay I didn't want it to cut
656:57 off we'll do that might make these
657:03 um make these a little bigger actually
657:06 so I want it to kind of match the
657:09 size like right there I'll match this
657:13 perfect this one I kind of want to bring
657:16 over
657:17 here and bring it down a little bit
657:20 maybe something like
657:22 this maybe I'm not sure I'm not
657:25 sold on that um I added a few different
657:28 visualizations that I didn't have in my
657:29 original so now I'm kind of having to do
657:31 this on the fly so I might fast forward
657:33 some of the parts where I'm like really
657:35 thinking about it or taking too much
657:36 time on it but I'm going to bring this
657:38 down a little bit actually because I
657:40 don't like how close that is to um the
657:43 text above it but one thing we do
657:46 need to
657:54 do I'm going to put this up kind of like this I think that looks fine I think I'm
657:56 going to put this at the very bottom so
657:58 let's make some room for
657:59 it all right just like that stretch it
658:03 to the side and we'll lower
658:06 it and I think we'll keep that as
658:10 is kind of like this um okay there's a
658:15 lot going on in here and there are some
658:17 things I'm just noticing as we're
658:18 walking through this that I kind of
658:20 missed um like I need to change some
658:22 titles and stuff like that so let me go
658:24 ahead and change some of those things so
658:26 we're going to do
658:28 title do average
658:31 salary by gender or by
658:36 sex do like that average salary by sex I
658:39 also don't like that it's in the middle
658:43 um I don't like that it's on the outside
658:45 I want them on the inside for this so
658:48 let's go to the details let's go to
658:51 inside and see if that looks any better
658:52 oh that looks terrible um let me see if
658:56 I can change that maybe I don't no I
658:59 definitely want it
659:01 um I guess we'll do outside you can't
659:04 even see the information oh the decimal
659:07 is crazy long um let me go and see if I
659:10 can change that decimal to just like a
659:11 whole number or like
659:13 1.1 uh because that's a problem so maybe
659:16 I need to go over here to the
659:20 value all right so I think I want to
659:22 change this one it's just not working
659:24 out exactly how I wanted and you guys
659:26 know if I make mistakes I'm going to
659:28 keep it in here so you guys can see it
659:29 I hoped that this was going to turn out
659:31 better but it didn't um one that I do
659:34 want to add because this is kind of a
659:36 breakdown and a nice visualization I
659:39 want to add this difficulty piece so I
659:41 want to add this how difficult was it
659:43 for you to break into data science let's
659:45 get rid of these and I want to click on
659:47 this really quickly see what it gives us
659:51 um values okay so now this shows us
659:55 percentages um of how easy it was again
659:58 it's neither easy nor difficult
660:00 difficult easy very difficult very easy
660:03 these numbers make absolutely no sense
660:06 we need to kind of order them a little
660:08 better so I'm going to come over here to
660:09 slices we have our colors over here we
660:12 want very difficult to be like the most
660:16 difficult um so we're going to make that
660:19 red and then we want difficult to be
660:22 maybe like an
660:23 orange let's see if we can find an orange
660:25 there we have an orange this does not
660:28 look red enough there we go oh
660:31 no no no very difficult is red difficult
660:33 is orange we have neither easy nor
660:36 difficult and that's kind of a neutral
660:38 um let's see if we have something
660:40 neutral in
660:45 here kind of like this yellow I don't know let's try it out then we have easy
660:48 and very easy and these will be like our
660:50 blues so I'm going to keep that um I'm
660:54 going to keep that kind of like a dark
660:58 bluish and then our blue for super easy
661:02 is just going to be like really blue
661:07 and that doesn't look bad I mean
661:08 look I'm not a color person I'm
661:11 not great with colors and we're going to
661:13 kind of organize this in just a little
661:14 bit but this looks better to me um but
661:18 we need to change up some stuff as well
661:20 like the title need to
661:22 do difficulty to break into
661:28 data there we go
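The difficulty chart shows each answer as a percentage of all responses, and it reads best when the categories run from hardest to easiest. A small Python sketch of that idea, assuming the five answer labels read out in the video. The exact label spelling, the name `difficulty_shares`, and the sample responses are assumptions for illustration only.

```python
from collections import Counter

# Assumed answer labels, ordered from hardest to easiest.
DIFFICULTY_ORDER = [
    "Very Difficult",
    "Difficult",
    "Neither easy nor difficult",
    "Easy",
    "Very Easy",
]


def difficulty_shares(responses):
    """Return (label, percent of responses) pairs in a fixed hardest-to-easiest order."""
    counts = Counter(responses)
    total = sum(counts[label] for label in DIFFICULTY_ORDER)
    if total == 0:
        raise ValueError("no recognised difficulty answers")
    return [(label, round(100 * counts[label] / total, 1)) for label in DIFFICULTY_ORDER]


# Hypothetical answers, not the real survey data.
responses = ["Easy", "Easy", "Difficult", "Very Easy"]
for label, share in difficulty_shares(responses):
    print(f"{label}: {share}%")
```

Keeping the order in one list also gives a natural place to attach a red-to-blue color per category, as is done by hand in the video.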
661:31 and we're also going to
661:33 change this title right here we'll just
661:36 say
661:38 difficulty difficulty difficulty this
661:42 looks better to me um again not perfect
661:45 and there's a thousand different things
661:46 you could have done but that's just what
661:47 we're going to do I need to go through
661:49 here and see what I need to change so
661:51 right off the bat I can see I need to
661:52 change this
661:54 um to let's see right here I'm going to
661:59 rename this job title just like we did
662:02 in this one right here uh count of
662:06 voters that's fine programming language
662:08 breaking into difficulty happiness
662:10 happiness average count okay okay so
662:14 what we have here is very close to a
662:18 finished product now it's not 100%
662:21 complete I mean I do want to make it
662:24 look a little nicer rather than just the
662:26 typical white so what we're gonna do
662:29 we're gonna go up here we'll go to uh what
662:32 is it View and we have all these
662:34 different themes and we're just going
662:35 to play around with it see if we can
662:37 find something that we like um this
662:41 doesn't look too bad it's not really my
662:44 style we can do this one Frontier this
662:47 is pretty neat I kind of am digging this
662:50 we might come back to it I like the
662:52 natural tones I don't know why I said
662:54 tones like that but I did um this one's
662:58 not bad but
663:01 I don't like how dark that
663:02 is um and so maybe it's like you know we
663:08 change like the background color of all
663:10 of these as well as match it with um
663:13 match it with something else whatever
663:16 you want genuinely you customize this
663:18 however you want I kind of like this one
663:20 it's kind of groovy man and um it's not
663:23 perfect by any means but what we can do
663:27 is we can customize this current theme
663:29 we can come in here customize this theme
663:31 however we'd like I personally don't
663:35 want color five which is the data
663:37 analyst color I don't
663:40 want to go and change it because I
663:42 don't like it but I don't really like
663:43 that color per se you know I might want
663:46 to choose a different color um but it
663:48 has to be like this muted like that it
663:50 has a style to it so you can come in
663:52 here and you can customize this and make
663:55 it however you'd like and really
663:58 mess around with it play around
664:00 with it for me uh I'm just going to keep
664:02 it how it is because I don't really want
664:04 to mess with it and break it or anything
664:05 like that so let me just put that up
664:09 just a tiny bit so this is it this is
664:12 the project I hope that it was helpful
664:15 um I am not joking when I say that
664:18 I'm gonna do a different project
664:20 I'm gonna go really in depth in another
664:22 project it's probably gonna be like a
664:23 two-hour project it's going to be crazy
664:25 long um well for a YouTube video but I
664:28 can see doing a thousand different things
664:31 with this data creating a really great
664:33 dashboard really cleaning the data which
664:36 is a large part of actually doing
664:38 this and we didn't do much data cleaning
664:40 at all there's just so much you can do
664:42 with this and so really dig into this
664:44 see what you like see what you don't
664:45 like see what you want to clean what you
664:47 don't want to clean you could put it in
664:48 SQL you could put it in um Excel and
664:51 just standardize the data to
664:55 make it a lot more usable do whatever
664:56 you want with it I mean I took this
664:58 survey for you guys so that we could use it
665:01 so go out and use it and make the best
665:04 dashboard that you can possibly do so I
665:06 hope that this was helpful I hope that
665:08 you enjoyed this thank you so much for
665:10 watching this video
665:13 if you
665:15 like this video be sure to like and
665:17 subscribe below and I'll see you in the
665:18 next
665:20 [Music]
665:29 video
665:31 what's going on everybody welcome back
665:32 to another video today we're going to be
665:34 starting our Python tutorial
665:41 [Music] series now I am extremely excited for
665:44 this series we're going to be walking
665:45 through all the things that you need to
665:46 know to get started in Python we'll be
665:48 looking at variables data types for
665:50 loops while loops operators and a ton more
665:53 after this beginner series we're going
665:55 to be going into another set of series
665:56 where we look at pandas Matplotlib
665:58 Seaborn web scraping and more now in
666:01 this video we're just going to be
666:02 setting up our environment to where we
666:03 can learn Python in future videos in
666:05 this series we're going to be using
666:06 Jupyter notebooks for all of our
666:08 tutorials because I feel like it's a
666:09 really great place to learn the basics
666:11 but then in future videos I'll show you
666:12 different IDEs that you can use for
666:14 your Python code I genuinely cannot wait
666:16 to get started on this series I
666:17 absolutely love Python so without
666:19 further ado let's jump on my screen I'm
666:21 going to show you how to install Jupyter
666:22 notebooks all right so let's get started
666:24 by downloading Anaconda Anaconda is an
666:26 open-source distribution of Python and
666:28 R products so within Anaconda is our
666:31 Jupyter notebooks as well as a lot of
666:33 other things but we're going to be using
666:34 it for our Jupyter notebooks so let's go
666:36 right down here and if I hit download
666:38 it's going to download for me because
666:39 I'm on Windows but if you want
666:42 additional installers if you're running
666:43 on Mac or Linux then you can get those
666:45 all right here now if you are running on
666:48 Windows just make sure to check your
666:50 system to see if it's a 32-bit or a 64-bit
666:52 you can go into About in your
666:54 system settings to find that information
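If some Python interpreter already happens to be on the machine, the same 32-bit versus 64-bit question can be checked from code. This is only a sketch and not part of the installer steps in the video: `interpreter_bits` is a made-up helper name, and it reports the bitness of the running interpreter, which can differ from the operating system (for example a 32-bit Python on 64-bit Windows). The About page in system settings remains the authoritative place to look.

```python
import platform
import struct


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of the running Python."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


# platform.machine() gives the hardware name the OS reports; the exact string varies by system.
print("machine:", platform.machine())
print("interpreter bits:", interpreter_bits())
```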
666:56 I'm going to click on this 64
666:58 bit it's going to pop up on my screen
667:01 right here and I'm going to click
667:03 save now it's going to start downloading
667:05 it it says it could take a little while
667:07 but honestly it's going to take probably
667:09 about 2 to 3 minutes and then we'll
667:10 get going now that it's done I'm just
667:12 going to click on it and it's going to
667:14 pull up this window right here we are
667:16 just going to click next because we want
667:18 to install it this is our license
667:20 agreement you can read through this if
667:22 you would like I will not I'm just going
667:23 to click I agree now we can select our
667:27 installation type and you can either
667:28 select it for just me or if you have
667:30 multiple admin or users on one laptop
667:33 you can do that as well for me it's just
667:35 me so I'm going to use this one as it
667:38 recommends now it's going to show you
667:39 where it's installing it on your
667:41 computer this is the actual file path
667:44 it's going to take about 3.5 gigs of
667:46 space I have plenty of space but make
667:48 sure you have enough space and then once
667:50 you do you can come right over here to
667:52 next and now we can do some advanced
667:55 options we can add Anaconda3 to my PATH
667:58 environment variable
668:00 and when you're using Python you
668:02 typically have a default path with
668:04 whatever Python IDE or notebook that
668:07 you're using I use a lot of Visual
668:09 Studio Code so if I do this I'm worried
668:12 it might mess something up so I am not
668:13 going to do this it also says it doesn't
668:15 recommend it again messing with these
668:17 paths is kind of something that you
668:18 might want to do once you know more
668:19 about Python so I don't really recommend
668:22 you having this checked we can also
668:24 register Anaconda3 as my default Python
668:27 3.9 you can do this one and I'm going to keep
668:30 it this way just so I have the exact
668:31 it this way just so I have the exact same settings as you do so let's go
668:33 same settings as you do so let's go ahead and click install and now it is
668:36 ahead and click install and now it is going to actually install this on your
668:38 going to actually install this on your computer now once that's complete we can
668:40 computer now once that's complete we can hit next and now we're going to hit next
668:43 hit next and now we're going to hit next again and finally we're going to hit
668:45 again and finally we're going to hit finish but if you want to you can have
668:48 finish but if you want to you can have this tutorial and this getting started
668:50 this tutorial and this getting started with Anaconda I don't want either of
668:52 with Anaconda I don't want either of them because I don't need them but if
668:54 them because I don't need them but if you would like to have those keep those
668:56 you would like to have those keep those checked and you can get those let's
668:57 checked and you can get those let's click finish now let's go down and and
668:59 click finish now let's go down and and we're going to search for Anaconda and
669:02 we're going to search for Anaconda and it'll say Anaconda Navigator and we're
669:05 it'll say Anaconda Navigator and we're going to click on that and it should
669:07 going to click on that and it should open up for us so this is what you
669:09 open up for us so this is what you should be seeing on your screen this is
669:11 should be seeing on your screen this is the Anaconda Navigator and this is where
669:14 the Anaconda Navigator and this is where that distribution of python and R is
669:16 that distribution of python and R is going to be so we have a lot of
669:18 going to be so we have a lot of different options in here and some of
669:19 different options in here and some of them may look familiar we have things
669:21 them may look familiar we have things like Visual Studio code spider our
669:24 like Visual Studio code spider our studio and then right up here we have
669:27 studio and then right up here we have our Jupiter notebooks and this is what
669:29 our Jupiter notebooks and this is what work we're going to be using throughout
669:30 work we're going to be using throughout our tutorials so let's go ahead and
669:32 our tutorials so let's go ahead and click on launch and this is what should
669:34 kind of pop up on your screen now I've
669:36 been using this a lot um so I have a ton
669:38 of notebooks and files in here but if
669:42 you are just now seeing this it might be
669:44 completely blank or just have some you
669:46 know default folders in here but this is
669:49 where we're going to open up a new
669:50 Jupyter notebook where we can write code
669:52 and all the things that we're going to
669:53 be learning in future tutorials and you
669:56 can use this area to save things and
669:58 create folders and organize everything
670:00 if you already have some notebooks from
670:02 previous projects or something you can
670:04 upload them here but what we're going to
670:06 do is go right to this new we're going
670:08 to click on the drop down and we're
670:09 going to open up a Python 3 kernel and
670:12 so we're going to open this up right
670:13 here now right here is where we're going
670:15 to be spending 99% of our time in future
670:18 videos this is where we're going to
670:20 write all of our code so right here is a
670:22 cell and this is where we can type
670:24 things so I can say print I can do the
670:27 famous hello world
670:30 and then I'll run that by pressing shift
670:32 enter and this is where all of our code
670:34 is going to go these are called cells so
670:36 each one of these is a cell and we have
670:38 a ton of stuff up here and I'm going to
670:40 get to that in just a second
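A minimal sketch of what that first code cell might look like. The transcript only says "print" and "hello world", so the exact string is an assumption, and the name greeting is hypothetical, added only so the value can be inspected afterwards.

```python
# Type this into a notebook cell and press Shift+Enter to run it.
greeting = "hello world"  # hypothetical name; exact wording assumed
print(greeting)           # the output appears directly below the cell
```

Writing print("hello world") on a single line does the same thing without the extra name.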
670:42 one thing I wanted to show you is that you don't
670:43 only have to write code here you can
670:45 also do something called markdown and so
670:47 markdown is its own kind of you could
670:49 say language but um it's just a
670:51 different way of writing especially
670:52 within a notebook so all we're going to
670:54 do is do this little hashtag and
670:57 actually I think it's a pound sign but
670:58 I'm going to call it hashtag we're going to
671:00 do that and we're going to say first
671:02 notebook and then if I run that we have
671:04 our first notebook and we can make
671:05 little comments and little notes like
671:07 that that don't actually run any code
671:09 they just kind of organize things for us
671:11 and I'm going to do that in a lot of our
671:12 future videos so just want to show you
671:14 how to do that now let's look right up
671:16 here a lot of these things are pretty
671:17 important uh one of the first things
671:19 that's really important is actually
671:20 saving this so let's say we wanted to
671:23 change the title to I'm going to do AAA
671:26 because I want it to be at the beginning
671:27 um so I can show you this I'll do AAA
671:30 new notebook and I'm going to rename it
671:33 and then I'm going to save that so if I
671:35 go right back over here you can see AAA
671:38 new notebook that green means that it's
671:41 currently running and when I say running
671:44 I mean right up here and if we wanted to
671:46 we could go ahead and shut that down which
671:48 means it wouldn't run the code anymore
671:50 and then we'd have to start up a new
671:51 kernel uh so let's go ahead and do that
671:53 I didn't plan on doing that but let's do
671:55 it so we have no notebooks running and
671:58 right here it says we have a dead kernel
671:59 so this was our Python 3 kernel and now
672:02 since I stopped it it's no longer
672:04 processing anything so let's go ahead
672:06 and say try restarting
672:08 now and it says kernel is ready so it's
672:12 back up and running and we're good to go
672:13 the next thing is this button right here
672:15 now this is an insert cell below so if I
672:18 have a lot of code I know I'm going to
672:19 be writing I can click that a lot and
672:22 I often do that because I just don't
672:24 like having to do that all the time so I
672:26 make a bunch of cells just so I can use
672:28 them you can delete cells so say we have
672:30 some code here we'll say here and we
672:34 have code here and then we have this
672:36 empty cell right here we can just get
672:38 rid of that by doing this cut selected
672:40 cells we can also copy selected cells so
672:43 if I hit copy selected cells I can
672:45 go right here and say paste selected
672:48 cells and as you can see it pasted that
672:51 exact same cell you can also move this
672:53 up and down so I can actually take this
672:55 one and say I wanted it in this location
672:58 I can take this cell and move it up or I
673:00 can move it down and that's just an easy
673:02 way to kind of organize it instead of
673:04 having to like copy this and move it
673:06 right down here and paste it you can
673:08 just take this cell and move it up which
673:10 is really nice now earlier when I ran
673:12 this code right here I hit shift enter
673:14 you can also hit run and it'll run the cell
673:17 and move to the one below so you can hit run and it works
673:19 properly if you're running a script and
673:21 it's taking forever and it's not working
673:23 properly or at least you don't think
673:25 it's working properly you can stop that
673:27 by doing this interrupt the kernel right
673:29 here and anything you're trying to do
673:30 within this kernel if it's just not
673:32 working properly it'll stop it you can
673:34 restart it then you can try fixing your
673:36 code you can also hit this button if you
673:38 want to restart your kernel and this
673:40 button if you want to restart the kernel
673:42 and then rerun the entire notebook as we
673:44 talked about just a second ago we have
673:46 our code and our markdown we're not
673:48 going to talk about either of these other two
673:50 because we're not going to use them
673:51 throughout the entire series the next
673:53 thing I want to show you is right up
673:54 here if you open this file menu we can create
673:57 a new notebook we can open an existing
673:59 notebook we can copy it save it rename
674:02 it all that good stuff we can also edit
674:04 it so a lot of these things that we were
674:06 talking about you can cut the cells and
674:07 copy the cells using these shortcuts if
674:09 you would like to we can also go to view and
674:11 you can toggle a lot of these things if
674:13 you would like to which just means it'll
674:14 show it or not show it depending on what
674:16 you want so if we toggle this toolbar
674:18 it'll take away the toolbar for us or if
674:21 we go back and we toggle the toolbar we
674:23 can bring it back we can also insert a
674:26 few different things like inserting a
674:27 cell above or a cell below so instead of
674:29 using this plus button you can just use
674:31 A or B adding above or below we also
674:34 have the cell menu in which we can run our
674:36 cells or run all of them or all above or
674:39 all below and then we have our kernel
674:41 right here which we were talking about
674:42 earlier where we can interrupt it and
674:44 restart it there are widgets we're
674:47 not going to be looking at any widgets
674:48 in this series but if it's something
674:50 you're interested in you can definitely
674:51 do that and then we have help so if you
674:53 are looking for some help on any of
674:55 these things especially some of these
674:56 references which are really nice you can
674:58 use those and you can also edit your own
675:00 keyboard shortcuts and now that we
675:02 walked through all of that you now have
675:03 Anaconda and Jupyter notebooks installed
675:05 on your computer in future videos this
675:07 is where we're going to be writing all
675:08 of our python code so be sure to check
675:10 those out so we can learn python
675:11 together thank you guys so much for
675:12 watching I hope you were able to get
675:14 everything installed correctly I am
675:16 super excited for this series ahead of
675:17 us if you like this video be sure to
675:19 like and subscribe below and I will see
675:21 you in the next
675:23 [Music]
675:28 video
675:34 [Music] hello everybody today we're going to be
675:35 learning about variables in Python a
675:38 variable is basically just a container
675:40 for storing data values so you'll take a
675:42 value like a number or a string you can
675:45 assign it to a variable and then the
675:47 variable will carry and contain whatever
675:50 you put into it so for example let's go
675:52 right over here we're going to say x and
675:54 this is going to be our variable we're
675:56 going to say is equal to now we can
675:58 assign the value to it so let's say I
676:02 want to put
676:03 22 x is now equal to 22 so we won't have
676:07 to write out the number 22 in later
676:09 scripts that we write we can just say x
676:11 because x is equal to 22 it now contains
676:15 that number so now we can hit enter and
676:17 say print we do an open parenthesis and
676:20 we'll say x now I'm going to hit shift
676:23 enter and now it prints out that 22
676:26 because we are printing x and x is equal
676:29 to 22 this is our value and this is our
676:32 variable one really great thing about
676:33 variables is that Python assigns the
676:35 data type it's going to automatically do
676:37 this so we didn't have to go and tell x
676:39 that it's an integer it just
676:41 automatically knew that 22 is a number
676:44 so we can check that by saying type and
676:46 then open parenthesis and writing x and
676:50 we'll do shift enter again and this says
676:52 that x is an integer type now we only
676:55 assigned an integer to x let's try
676:58 assigning a string value or some text to
677:00 a variable so we'll say y is equal to uh
677:04 let's say mint chocolate chip I'm
677:07 feeling some ice cream today so we'll
677:10 say mint chocolate chip now if we print
677:13 that again we'll do print open
677:15 parenthesis y and do shift enter it'll
677:19 print mint chocolate chip and if we look
677:21 at the type we can see that the type is
677:24 a string this time and not an integer
677:27 now again we did not tell it that x was
677:29 an integer and y was a string it just
677:32 automatically knew this
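The steps described so far can be sketched roughly as follows. The values 22 and mint chocolate chip come from the video; the quoting style and the layout into one cell are assumptions.

```python
# Assign an integer to x; no type declaration is needed.
x = 22
print(x)         # prints the value stored in x
print(type(x))   # Python reports the int type

# Assign a string (text in quotes) to y.
y = "mint chocolate chip"
print(y)
print(type(y))   # Python reports the str type
```

The type is taken from the value on the right-hand side, which is why nothing had to be declared up front.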
677:34 let's go up here really quickly we're going to add
677:36 several cells in here because we're about
677:37 to write a lot of different variables
677:40 and really learn in depth how to use
677:42 variables the next thing to know about
677:44 variables is that you can overwrite
677:46 previous variables right now we have
677:48 mint chocolate chip and that is assigned
677:50 to the variable y so if I go down here I
677:53 say print y I hit shift enter it's going
677:56 to print out mint chocolate chip but
677:59 if I go right above it I say y is equal
678:02 to and let's say chocolate if I print
678:06 that out it's now going to say chocolate
678:08 whereas up here where I first assigned y
678:11 it's still going to say mint chocolate
678:13 chip so if I come right down here and I
678:17 copy this and I'm going to paste this
678:20 right here initially it is going to
678:22 assign chocolate to y but then right
678:24 here it will automatically overwrite y
678:27 with mint chocolate chip and when we hit
678:29 shift enter it's going to show mint
678:31 chocolate chip
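A small sketch of the overwrite described above, assuming both assignments sit in the same cell in this order:

```python
y = "chocolate"            # first assignment
y = "mint chocolate chip"  # second assignment replaces the first value
print(y)                   # only the most recent value is kept
```

Statements in a cell run from top to bottom, so the last assignment to a name is the one that sticks.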
678:34 variables are also case sensitive so if I come up here and I say
678:37 a capital Y this is a lowercase y and
678:39 this is a capital Y it is going to print
678:41 out chocolate instead of mint
678:44 chocolate chip and then if I go down
678:46 here to the print and I type the capital
678:49 Y it will give us the mint chocolate
678:51 chip
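Case sensitivity can be sketched like this. Which of the two assignments got the capital letter is inferred from the printed results described in the video, so treat the exact arrangement as an assumption.

```python
y = "chocolate"            # lowercase y
Y = "mint chocolate chip"  # capital Y is a separate variable

print(y)  # the lowercase name still holds chocolate
print(Y)  # the capital name holds mint chocolate chip
```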
678:54 up till now we've only assigned one value to one variable but we can
678:56 actually assign multiple values to
678:58 multiple variables so let's do x comma y
679:03 comma z is equal to and now we can
679:07 assign multiple values to all of those
679:10 so we can say
679:13 chocolate and then we'll do a comma oops
679:16 a comma then we can say vanilla and then
679:21 we'll do another comma and we'll say
679:23 rocky road now this is going to assign
679:27 chocolate to x
679:29 vanilla to y and rocky road to z so what
679:32 we can do is we'll say
679:35 print and we'll go print print print and
679:39 we'll say x y and z so it prints out
679:43 chocolate vanilla and rocky road and
679:46 these are our three different values we
679:48 can also assign one value to multiple
679:51 variables and we can do this by saying x
679:54 is equal to y is equal to z is equal to
679:57 and we can put whatever we would like
679:59 let's do root beer float then we'll come
680:03 back up here we'll copy this and let's
680:07 print off our x our y and z and they are
680:10 all the exact same
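Both forms of assignment from this part can be sketched together. The name first_round is hypothetical and is only there to keep the first set of values around for comparison.

```python
# Assign three values to three variables in one statement.
x, y, z = "chocolate", "vanilla", "rocky road"
print(x)
print(y)
print(z)
first_round = (x, y, z)  # hypothetical name: snapshot of the values so far

# Assign one value to three variables at once.
x = y = z = "root beer float"
print(x)
print(y)
print(z)
```

In the first form the number of names on the left has to match the number of values on the right.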
680:12 now so far we've really only looked at integers and
680:14 strings but you can assign things like
680:16 lists dictionaries tuples and sets all to
680:20 variables as well so let's go right down
680:22 here so let's create our very first list
680:24 I'm going to say ice_cream is equal to
680:28 and that is our variable right there the
680:30 ice_cream is our variable so now we're
680:32 going to do an open bracket like this
680:35 and we're going to come up here and copy
680:37 all of these values and we're going to
680:39 stick them within our list so now within
680:42 ice_cream we have three string values
680:45 chocolate vanilla and rocky road all
680:48 within this list so what we can do is we
680:51 can say x comma y comma z is equal to
680:56 ice_cream so now these three values
680:59 chocolate vanilla and rocky road will be
681:02 assigned to these three variables x y
681:04 and z and we can copy this print up
681:08 here and we'll hit shift enter and now
681:12 x y and z all were assigned these
681:15 values of chocolate vanilla and rocky
681:17 road
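A short sketch of the list example, assuming the variable is spelled ice_cream as it appears to be on screen:

```python
# A list holding three string values.
ice_cream = ["chocolate", "vanilla", "rocky road"]

# Unpack the list: each variable receives one item, in order.
x, y, z = ice_cream
print(x)
print(y)
print(z)
```

Unpacking only works when the number of variables matches the number of items in the list; otherwise Python raises a ValueError.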
681:18 now something that we just did which is really important or something
681:20 that you really need to consider is how
681:22 you name your variables so right here we
681:25 have ice_cream now this to me is exactly
681:28 how I usually write my variables but
681:31 there are many different ways that you
681:32 can write your variables so let's take a
681:34 look at that really quickly and let's
681:36 add just a few more cells because I have a
681:38 feeling we're going to go a little bit
681:39 longer than what we have so there are a
681:41 few best practices for naming variables
681:44 first I'm going to show you kind of what
681:45 a lot of people will do I'll show you
681:48 some good practices and I'm going to
681:49 show you some bad practices as well that
681:51 you should avoid doing the first thing
681:53 that we're going to look at is something
681:54 called camel case and let's say we want
681:57 to name it test variable case
682:02 now if we have a test variable case
682:05 the camel case is going to look like
682:07 this we'll have lowercase test and then
682:09 we'll have Variable with a capital V and
682:12 Case with a capital C is equal to this is what
682:16 this variable is going to look like and
682:18 we can assign it vanilla swirl and this is what your camel case
682:25 will look like it's going to be lowercase and then all the rest of those
682:28 uh compound words or however you want to
682:30 say that these letters are going to be
682:32 capitalized to kind of separate where
682:33 the words end and begin let's go right
682:36 down here we're going to copy this the
682:38 next one is called Pascal case so Pascal
682:42 case is going to look just a little bit
682:43 different instead of the lowercase t in
682:46 test it's going to be a capital T in
682:48 Test so TestVariableCase again this is
682:52 a very similar way of writing it very
682:54 similar to camel case uh but just a
682:56 capital at the beginning now let's look
682:59 at the last one and this one is my
683:01 personal favorite this one is going to
683:02 be the snake case now this one is quite
683:06 a bit different in the fact that you
683:08 don't use any capital letters and you
683:10 separate everything using underscores so
683:13 we're going to write
683:14 test underscore variable underscore case now
683:19 typically let me have them all in there
683:21 typically these are the best practices
683:23 these are what you typically want to do
683:26 but probably the best one to use is
683:29 this snake case right here what a lot of
683:32 people say is that it improves
683:34 readability if you take a look at either
683:36 the camel case or the Pascal case which
683:38 you will see people do it's not as easy
683:41 to distinguish exactly what it says and
683:43 the name of a variable is important
683:46 because you can gain information from it
683:48 if people name them appropriately so
683:50 when I'm naming variables I usually
683:52 write it in snake case because I just
683:53 find it a lot easier to read because
683:56 each word is broken up by this
683:57 underscore
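The three naming styles side by side, as a sketch. The value vanilla swirl is the one used in the video for the camel case example; reusing it for the other two is an assumption.

```python
testVariableCase = "vanilla swirl"    # camel case: first word lowercase, later words capitalized
TestVariableCase = "vanilla swirl"    # Pascal case: every word capitalized
test_variable_case = "vanilla swirl"  # snake case: all lowercase, words joined by underscores

print(testVariableCase)
print(TestVariableCase)
print(test_variable_case)
```

All three are valid Python names and, because names are case sensitive, they are three separate variables. Python's own style guide, PEP 8, recommends lowercase words separated by underscores for variable names, which matches the snake case preference stated here.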
683:59 underscore score so now let's look at some good variable names these are all
684:01 some good variable names these are all ones that you can use or could use let's
684:04 ones that you can use or could use let's do something like test VAR so test VAR
684:07 do something like test VAR so test VAR is completely appropriate we can also do
684:09 is completely appropriate we can also do something like testore VAR oops
684:13 something like testore VAR oops underscore we could do underscore test
684:17 underscore we could do underscore test underscore VAR you'll see that often as
684:20 underscore VAR you'll see that often as well well people will start it with an
684:22 well well people will start it with an underscore you can do test
684:27 underscore you can do test bar capital T oops capital T capital V
684:33 bar capital T oops capital T capital V in test VAR or you could even do
684:35 in test VAR or you could even do something like test VAR two now adding a
684:40 something like test VAR two now adding a number to your variable is not
684:41 number to your variable is not inherently a Bad Thing usually it's
684:43 inherently a Bad Thing usually it's semif fround upon but there are
684:45 semif fround upon but there are definitely some use cases where you can
684:47 definitely some use cases where you can use it but one thing that you cannot do
684:50 use it but one thing that you cannot do is do something
684:52 is do something like putting the two at the front if you
684:55 like putting the two at the front if you put the two at the front it no longer
684:57 put the two at the front it no longer works it won't run properly at all so
684:59 works it won't run properly at all so we're going to take that out so we can't
685:01 we're going to take that out so we can't do that so I'm going to use this as an
685:03 do that so I'm going to use this as an example of what you should not do you
685:05 example of what you should not do you also can't use a dash so something like
685:08 also can't use a dash so something like test- var2 that doesn't work either and
685:12 test- var2 that doesn't work either and you also can't use something like a
685:16 you also can't use something like a space or a comma or really any kind of
685:20 space or a comma or really any kind of symbol like a period or a backslash or
685:23 symbol like a period or a backslash or equal sign none of those things will
685:25 equal sign none of those things will work within your variable now another
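The naming rules above can be checked without running into errors. This is a minimal sketch: the exact names typed in the video are an assumption reconstructed from the audio, and `str.isidentifier()` is used here only as a convenient way to ask Python whether a string would be accepted as a variable name.

```python
# Candidate names reconstructed from the walkthrough (assumed spellings).
candidates = [
    "testvar", "test_var", "_test_var", "TestVar", "testvar2",  # expected to work
    "2testvar", "test-var2", "test var", "test,var", "test.var",  # expected to fail
]

# str.isidentifier() is True when the string follows Python's naming rules.
valid_names = [name for name in candidates if name.isidentifier()]
invalid_names = [name for name in candidates if not name.isidentifier()]

print("valid:  ", valid_names)
print("invalid:", invalid_names)
```

Names that start with a digit or contain a dash, space, comma, or period end up in the invalid group, which matches what the walkthrough describes.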
685:27 now another thing that you can do within your
685:28 variable is use the plus sign so let's
685:31 assign this we'll say x is equal to and
685:35 we'll do a string we'll say ice
685:38 cream is my
685:40 favorite and then we'll do a plus sign
685:44 and we'll say period now what this will
685:47 do is it will literally add these two
685:50 strings together so let's do print and
685:53 we'll do x so now it says ice cream is
685:57 my favorite one thing that we cannot do
686:00 in a variable is we cannot add a string
686:02 and a number or an integer so we can't
686:05 do ice cream is my favorite plus 2 if we
686:08 try to do that it will give us this
686:09 error right here so in this error it's
686:11 saying you can only concatenate a string
686:14 not an integer to a string so only a
686:16 string plus a string for this example
686:18 you can also do and we'll say x is equal
686:22 to or we'll say
686:24 y we'll say y is equal
686:27 to 3 + 2 and it should output five
686:32 because you can also do an integer and
686:34 an integer
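Here is a small sketch of the plus sign behavior just described. The capitalization of the text is an assumption, and the `try`/`except` is added only so the example keeps running after the error instead of stopping the way a notebook cell would.

```python
# String + string joins the two pieces into one string.
x = "Ice cream is my favorite" + "."
print(x)

# String + integer is not allowed and raises a TypeError.
try:
    broken = "Ice cream is my favorite" + 2
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# Integer + integer is ordinary addition.
y = 3 + 2
print(y)
```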
686:35 now so far we've only been outputting one variable in the print
686:38 statement but you can actually add
686:40 multiple variables within a print
686:42 statement so let's go right down here
686:44 we're going to say let's give it some
686:46 more right there so we'll say x is equal
686:49 to ice
686:51 cream and we'll say y is equal
686:56 to is and then the last one z is equal
687:01 to my favorite and we'll do a period at
687:04 the end now we can go to the bottom and
687:06 we can say print x + y + z
687:12 and when we
687:14 run
687:16 that we get ice cream is my favorite now
687:18 we can actually add a space before is a
687:21 space before my and when we hit shift
687:23 enter it says ice cream is my favorite
687:26 you can also do this exact same thing
687:28 with numbers as well so we'll say x is equal to
687:32 1 y is equal to 2 and z is equal to 3 so this
687:36 should equal six now one thing that we
687:39 tried to do was assign to one variable a
687:41 string plus an integer and that did not
687:44 work but what you can do is you can take
687:46 something like this and you can say ice
687:50 cream and we'll get rid of this one and
687:53 we'll get rid of the z now saying plus
687:55 is actually not going to work let's try
687:57 running this
687:58 so again we can't concatenate these but
688:01 what we can do in the print statement is
688:02 we can separate it by a comma so when we
688:05 add this comma it should work properly
688:07 let's hit enter and it says ice cream 2
688:10 again this makes no sense but you are
688:12 able to combine a string and an integer
688:15 by separating them with a comma
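The print examples above might look like the following. This is a sketch: the variable names `a`, `b`, `c`, `sentence`, and `total` are introduced here for illustration (the video reuses x, y, and z), and the leading spaces are the ones added before "is" and "my".

```python
# Three strings joined with + inside one print call.
x = "Ice cream"
y = " is"            # leading space added so the words do not run together
z = " my favorite."
sentence = x + y + z
print(sentence)

# The same pattern with integers adds the numbers.
a, b, c = 1, 2, 3
total = a + b + c
print(total)

# A comma passes separate arguments to print, so a string and an integer
# can be shown together; print puts a space between them.
print("Ice cream", 2)
```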
688:17 now this is the meat and potatoes of variables there are
688:19 some other things as well but some of
688:21 those things are a little bit more
688:22 advanced and not something I wanted to
688:23 cover in this tutorial although we may
688:25 be looking at some of those things in
688:26 future tutorials
688:28 but this is definitely the basics what
688:30 you really really need to know about
688:32 variables I hope that this video was
688:34 helpful if it was be sure to like and
688:36 subscribe below and I will see you in
688:38 the next video
688:39 [Music]
688:50 hello everybody today we're going
688:52 to be talking about data types in Python
688:54 data types are the classification of the
688:56 data that you are storing these
688:58 classifications tell you what operations
689:00 can be performed on your data we're
689:02 going to be looking at the main data
689:03 types within Python including numeric
689:06 sequence type set Boolean and dictionary
689:09 so let's get started actually writing
689:10 some of this out and first let's look at
689:12 numeric there are three different types
689:14 of numeric data types we have integers
689:17 floats and complex numbers let's take a
689:19 look at integers an integer is basically
689:21 just a whole number whether it's
689:23 positive or negative so an integer could
689:25 be a 12 and we can check that by saying
689:28 type we'll do an open parenthesis and a
689:31 close parenthesis and if we say the type
689:34 of 12 it's going to give us an integer
689:36 or if we say -2 that is also an
689:38 integer we can also perform basic
689:40 calculations like -2 + 100 and that'll
689:44 tell us it is also an integer so whether
689:46 it's just a static value or you're
689:48 performing an operation on it it's still
689:50 going to be that data type if those
689:52 numbers are whole numbers whether
689:53 negative or positive now let's take this
689:55 exact one and let's say
689:58 12 and we'll do +
690:01 10.25 when we run this it's no longer
690:03 going to be a whole number it'll now be
690:05 a float so let's check this and now this
690:08 is a float type because it's no longer a
690:10 whole number it's now a decimal number
690:12 and the last data type within the
690:13 numeric data type is called complex
690:16 let's copy this right down here now
690:18 personally this is not one that I've
690:19 used almost ever but it is one just
690:22 worth noting so you can do 12 plus and
690:25 let's say 3j
690:28 and if we do this it's going to give us
690:30 a complex the complex data type is used
690:32 for imaginary numbers for me it's not
690:35 often used but if you do use it j is
690:38 used as that imaginary number if you use
690:41 something like c or any other letter
690:44 it's going to give you an error j is the
690:47 only one that will work with it
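The three numeric types can be checked with `type()` as described. In this sketch the variable names are made up for illustration, and `compile()` is used only so the invalid `3c` literal can be tested without stopping the script.

```python
int_type = type(12)               # whole number
sum_type = type(-2 + 100)         # int + int stays an int
float_type = type(12 + 10.25)     # adding a decimal number gives a float
complex_type = type(12 + 3j)      # the j suffix marks the imaginary part

print(int_type, sum_type, float_type, complex_type)

# A different letter after the number is not valid syntax.
try:
    compile("12 + 3c", "<example>", "eval")
    suffix_c_accepted = True
except SyntaxError:
    suffix_c_accepted = False

print("3c accepted:", suffix_c_accepted)
```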
690:48 now let's take a look at Boolean values so
690:51 we'll say Boolean the Boolean data type
690:54 only has two built-in values either True
690:57 or False so let's go right down here and
690:59 say type
691:01 True and when we run this it'll say bool
691:04 which stands for Boolean we can do the
691:06 exact same thing with False that is also
691:09 Boolean and this can be used with
691:11 something like a comparison operator so
691:13 let's say 1 is greater than 5 and let's
691:18 check this this is giving us a Boolean
691:20 because it's telling us whether one is
691:22 greater than five let's bring that right
691:24 down here this will give us a False so
691:26 it's telling us that one is not greater
691:29 than five and just as we got a False we
691:31 can say 1 is equal to 1 and this
691:33 should give us a True
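A short sketch of the Boolean checks above. Note that "is equal to" in a comparison is written with a double equals sign, since a single equals sign assigns a value; the variable names are only for illustration.

```python
true_type = type(True)
false_type = type(False)
comparison_type = type(1 > 5)   # a comparison also produces a bool

is_greater = 1 > 5              # one is not greater than five
is_equal = 1 == 1               # == compares, = assigns

print(true_type, false_type, comparison_type)
print(is_greater, is_equal)
```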
691:35 so now let's take a look at our sequence type data types
691:38 and that includes strings lists and
691:40 tuples let's start off by looking at
691:42 strings in Python strings are sequences of
691:45 Unicode characters
691:48 when you're using strings you put them
691:49 either in a single quote a double quote
691:51 or a triple quote I call them
691:52 apostrophes it's just what I was raised
691:55 to call them but most people who use
691:56 Python call them quotes so right here we
691:58 have a single quote and that works well
692:02 we can do a double quote and that works
692:06 also and as you can see they are the
692:08 exact same output and then we have a
692:10 triple quote just like this and this is
692:13 called a multi-line string so we can write on
692:15 multiple lines here so let's write a
692:17 nice little poem so we'll say the ice
692:20 cream vanquished my longing for
692:25 sweets upon this diet
692:28 I look
692:29 away it no longer
692:32 exists on this day and then if we run
692:35 that it's going to look a little bit
692:37 weird it's basically giving us the raw
692:40 text which is completely fine but let's
692:42 call this a
692:45 multi-line and we're going to call this
692:47 a variable multi-line and we're going to
692:49 come down here and say
692:51 print and before I run this I have to
692:54 make sure that this cell has been run so now let's
692:57 print out our multi-line and now we have
693:00 our nice little poem right down here
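A sketch of the three quoting styles described above. The variable name `multi_line` and the exact line breaks of the poem are assumptions, since the audio does not say where each line ends.

```python
single = 'ice cream'
double = "ice cream"
same_text = single == double    # single and double quotes make the same string

# Triple quotes let the text run across several lines.
multi_line = """The ice cream vanquished
my longing for sweets.
Upon this diet I look away,
it no longer exists on this day."""

print(same_text)
print(repr(multi_line))   # the raw form, with \n where each line ends
print(multi_line)         # print shows the poem on separate lines
```

The `repr` line is roughly what a notebook shows when the string is the last thing in a cell, which is why the output looks odd until `print` is used.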
693:02 now something to know about these single and
693:03 double quotes is how they're actually
693:05 used so if we use a single quote and we
693:08 say I've always wanted to eat a gallon
693:13 of ice cream and then we do an
693:15 apostrophe at the end obviously
693:17 something went wrong here what went
693:20 wrong is when you use a single quote and
693:22 then within your text within your
693:24 sentence you have another apostrophe
693:26 it's going to give you an error so what
693:28 we want to do is whenever we have a
693:31 single quote within it we need to use a double
693:34 quote these double quotes will negate
693:37 any single quotes that you have within
693:39 your statement they won't however negate
693:41 another double quote so you need to make
693:43 sure you aren't using double quotes
693:45 within your sentence if you want to do
693:46 something like that you need to use the
693:48 triple quotes like we did above so we
693:50 can do double double and then let's
693:54 paste this within
693:56 it
693:58 and anything you do within these triple
694:00 quotes will be completely fine as long
694:02 as you don't do triple quotes within
694:04 your triple quotes we'll say this is
694:06 wrong so even though it's between these
694:08 two triple quotes it doesn't work
694:10 exactly again you just have to
694:12 understand how that works you have to
694:14 use the proper apostrophes or quotes
694:16 within your string and just to check
694:18 this we can always say here's our
694:20 multi-line we can always say type of
694:25 multi-line and that is still a string
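The quoting rules above can be sketched like this. The "He said ... today" wrapper is a made-up sentence for illustration, and `compile()` is used only so the broken single-quoted version can be caught rather than stopping the script.

```python
# Double quotes let an apostrophe appear inside the text.
wish = "I've always wanted to eat a gallon of ice cream"

# Triple quotes allow both apostrophes and double quotes inside the text.
quoted_wish = """He said "I've always wanted to eat a gallon of ice cream" today"""

# With single quotes the apostrophe in I've ends the string early,
# so Python rejects the line with a syntax error.
try:
    compile("'I've always wanted to eat a gallon of ice cream'", "<example>", "eval")
    single_quoted_ok = True
except SyntaxError:
    single_quoted_ok = False

print(wish)
print(quoted_wish)
print("single-quoted version accepted:", single_quoted_ok)
print(type(quoted_wish))   # still a str
```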
694:28 one really important thing to know about
694:30 strings is that they can be indexed
694:33 indexing means that you can search
694:35 within it and that index starts at zero
694:37 so let's go ahead and create a variable
694:39 and we'll just say a is equal to and
694:42 let's do the all popular hello world
694:46 let's run this and now when we print the
694:49 string we can say a and we're going to
694:51 do a bracket and now we can search
694:53 throughout our string using the index so
694:56 all you have to do is do a colon and we
694:58 can say five what this is going to do is
695:01 it's going to say position zero all
695:03 the way up to five which should give us
695:05 the whole hello I believe let's run this
695:08 and it's giving us the first five
695:09 positions of this string we can also get
695:12 rid of the colon and just say something
695:14 like five and then when we run this it's
695:18 actually going to give us position five
695:20 so this is 0 1 2 3 4 and then five is
695:24 the space let's do six so we can see the
695:27 actual letter and that is our W we can
695:29 also use a negative when we're indexing
695:31 through our string so we could say -3
695:35 and it'll give us the L because it's
695:37 -1 -2 and -3 we can also specify a
695:40 range if we don't want to use the
695:41 default of zero so before we did 0 to
695:44 five and it started at zero because that
695:46 was our default but we could also do two
695:48 to five let's run this and now we go
695:51 position 0 1 and then we start at 2 L L O
695:56 now we can also multiply strings
695:58 and we have this a hello world so we can
696:00 do a * 3 and if we run this it'll give
696:05 us hello world three times and we can
696:07 also do a + a and that is hello world
696:12 hello world
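Here is a sketch of the indexing and slicing steps above. The exact string is an assumption: it is written with a trailing exclamation mark because that is the spelling for which position -3 is the letter l, as reported in the walkthrough. The variable names other than `a` are made up for illustration.

```python
a = "Hello World!"   # assumed spelling, see the note above

first_five = a[:5]     # positions 0 up to, but not including, 5
position_five = a[5]   # the space between the two words
position_six = a[6]    # the first letter of the second word
third_from_end = a[-3] # negative positions count back from the end
two_to_five = a[2:5]   # a range that starts at 2 instead of the default 0

print(first_five, repr(position_five), position_six, third_from_end, two_to_five)

print(a * 3)   # the string repeated three times
print(a + a)   # the string joined to itself
```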
696:14 now let's go down here and take a look at lists lists are really
696:16 fantastic because they store multiple
696:18 values the string was stored as one
696:21 value with multiple characters but a list can
696:23 store multiple separate values so let's
696:26 create our very first list we'll
696:28 say list really quickly and then we'll
696:31 put a bracket and a bracket means this
696:33 is going to be a list there are other
696:35 ones like a squiggly bracket and a
696:38 parenthesis these denote that they are
696:40 different types of data types the
696:42 bracket is what makes a list a list so to
696:44 keep it super simple we'll say one two
696:46 three and we'll run this and now we have
696:49 a list that has three separate values in
696:51 it the comma in our list denotes that
696:53 they are separate values and a list is
696:55 indexed just like a string is indexed so
696:58 position zero is this one position one
697:01 is the two and position two is the three
697:04 now when we made this list we didn't
697:05 have to use any quotes because these are
697:07 numbers but if we wanted to create a
697:09 list and we wanted to add string values
697:12 we have to do it with our quotes so
697:14 we'll say quote cookie
697:17 dough then we'll do a comma to separate
697:19 the value and then we'll say
697:22 strawberry and then we'll do one more
697:24 and this will just be chocolate and when
697:27 we run this we have all three of these
697:29 values stored in our list now one of the
697:31 best things about lists is you can have
697:33 any data type within them they don't
697:35 just have to be numbers or strings you
697:37 can basically put anything you want in
697:39 there so let's create a new list and
697:42 let's say
697:44 vanilla and then we'll do three and then
697:47 we'll add a list within a list and we'll
697:50 say
697:52 scoops comma spoon and then we'll get
697:56 out of that list and then we'll add
697:58 another value of True for Boolean and
698:02 now we can hit shift enter and we just
698:04 created a list with several different
698:06 data types within one list
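The three lists built above might look like this. The variable names `numbers`, `flavors`, and `mixed` and the capitalization of the strings are assumptions, since the video creates the lists without naming them at this point.

```python
numbers = [1, 2, 3]                                    # no quotes needed for numbers
flavors = ["Cookie Dough", "Strawberry", "Chocolate"]  # strings need quotes
mixed = ["Vanilla", 3, ["Scoops", "Spoon"], True]      # any data types can be mixed

# Lists are indexed from zero, just like strings.
print(numbers[0], numbers[1], numbers[2])

# Each item keeps its own data type inside the list.
item_types = [type(item) for item in mixed]
print(item_types)
```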
698:09 now let's take this one list right here with all
698:11 of our different ice cream flavors we'll
698:13 say ice_cream is equal to this list
698:17 now one thing that's really great about
698:19 lists is that they are changeable that
698:21 means we can change the data in here we
698:23 can also add and remove items from the
698:25 list after we've already created it so
698:28 let's go and take ice_cream and we'll
698:30 say ice_cream.append and this is going
698:33 to append it to the very end of the list
698:36 we do an open parenthesis and let's say
698:38 salted caramel now when we run this and
698:42 we call it just like this it's going to
698:45 take this list add salted caramel to the
698:48 end and we'll print it off and as you
698:51 can see it was added to the list and
698:53 just like I said before let me go down
698:55 here we can also change things in this
698:57 list so let's say ice_cream and then we
699:00 need to look at the indexed position so
699:02 we're going to say zero and that's going
699:04 to be this cookie dough right here we can
699:06 say that is equal to so we can now
699:08 change that value so let's call that
699:11 butter pecan and now when we call
699:15 it we can now see that the cookie dough
699:18 was changed to butter pecan
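A sketch of the append and replace steps just described, using the `ice_cream` name from the walkthrough. The capitalization of the flavors is an assumption.

```python
ice_cream = ["Cookie Dough", "Strawberry", "Chocolate"]

# append adds one item to the end of the existing list.
ice_cream.append("Salted Caramel")
print(ice_cream)

# Assigning to an index replaces the item at that position.
ice_cream[0] = "Butter Pecan"
print(ice_cream)
```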
699:20 another thing that you saw just a little bit ago
699:22 is something called a list within a list
699:24 basically a nested list so we had scoops
699:28 spoon True let's give this a name and we'll say
699:31 nested_list is equal to that now when
699:35 we run this we now have this nested list
699:38 so if we look at the index and we say
699:41 zero we'll get vanilla if we say two
699:45 we'll get scoops and spoon now since we
699:47 have a list within a list we can also
699:49 look at the index of that nested list so
699:52 let's now say one and that should give
699:55 us just spoon and you can go on and on
699:58 and on with this you can do lists within
700:00 lists within lists and all of them will
700:02 have indexing that you can call
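Indexing into the nested list might look like this. It is a sketch using the `nested_list` name from the walkthrough; the other variable names and the capitalization of the strings are assumptions.

```python
nested_list = ["Vanilla", 3, ["Scoops", "Spoon"], True]

first_item = nested_list[0]      # the first value in the outer list
inner_list = nested_list[2]      # the whole list stored at position 2
inner_item = nested_list[2][1]   # position 1 inside that inner list

print(first_item)
print(inner_list)
print(inner_item)
```

A second pair of brackets is what reaches inside the inner list, and the same pattern continues for deeper levels of nesting.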
700:04 now let's go down here and start taking a
700:06 look at tuples so a list and a tuple are
700:08 actually quite similar but the biggest
700:11 difference between a list and a tuple is
700:13 that a tuple is something called
700:14 immutable it means it cannot be modified
700:16 or changed after it's created let's go
700:19 right up here we're going to say
700:22 Tuple and let's write our very first
700:24 tuple so we'll say tuple_scoops
700:28 is equal to and then we'll do an
700:31 open parenthesis now these
700:33 parentheses you've seen if you do like a
700:34 print statement but that's different
700:36 because that's executing a function this
700:39 is actually creating a tuple which is
700:40 going to store data for us so we'll say
700:42 one 2 3 two and one let's go ahead and
700:47 create that tuple and we can just check
700:50 the data type really quickly and it's a
700:53 tuple and just like we saw before a
700:55 tuple is also indexed so if we go at
700:58 the very first position which is a one
701:01 we will get the output of a one but we
701:03 can't do something like
701:05 .append and then add a value like three if
701:09 we do that it's going to say tuple
701:11 object has no attribute append it's just
701:13 because you cannot change or add
701:15 anything to a tuple just like we were
701:17 talking about before typically people
701:19 will use tuples for when data is never
701:21 going to change an example for this
701:23 might be something like a city name a
701:25 country a location
701:27 something that won't change they
701:29 definitely have their use cases but I
701:30 don't think they're as popular as just
701:31 using a list
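The tuple steps described above can be sketched as a plain Python script instead of notebook cells. The name tuple_scoops and its values follow the transcript; the try/except wrapper is an addition so the script keeps running after the expected error.

```python
# Tuple example following the walkthrough; values are the ones read out in the video.
tuple_scoops = (1, 2, 3, 2, 1)

print(type(tuple_scoops))  # <class 'tuple'>
print(tuple_scoops[0])     # 1 (index 0 is the very first position)

# Tuples are immutable, so they have no append method.
try:
    tuple_scoops.append(3)
except AttributeError as err:
    print(err)             # 'tuple' object has no attribute 'append'
```

If the values do need to change later, a list is the more natural choice, which matches the point made in the video.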
701:34 so now let's scroll down and start taking a look at sets but really
701:36 quickly let me add a few more cells for
701:40 us and let's say
701:43 sets now a set is somewhat similar to a
701:47 list and a tuple but they are a little
701:50 bit different in the fact that they
701:51 don't have any duplicate elements
701:54 another big difference is that the
701:55 values within a set cannot be accessed
701:57 using an index because it doesn't have
701:59 an index because it's actually unordered
702:02 we can still loop through the items in a
702:04 set with something like a for loop but
702:05 we can't access it using the bracket and
702:07 then accessing its index point so let's
702:10 go ahead and create our very first set
702:12 so we're going to say daily_pints
702:16 then we're going to say equal to and to
702:18 create a set we're going to use these
702:19 squiggly brackets I don't know if
702:21 there's an actual name for those if I'm
702:22 being honest I call them squiggly
702:24 brackets and that's what we're going to
702:25 go with we're going to put in a one a two and
702:28 a three so let's go ahead and run
702:30 this and let's look at the type and as
702:34 you can see it is a set now when we
702:36 print this out it's going to show us one
702:39 a two and a three and those are all the
702:41 values within our set but if we copy
702:43 this and we'll say daily_pints_log this
702:46 is going to be every single day maybe I
702:50 had different
702:52 values now when we run this and we do
702:54 the exact same thing now when we print
702:58 this it's going to have just the unique
703:00 values within that set
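A minimal sketch of the set behavior described above. The name daily_pints and its values come from the transcript; the repeated values in daily_pints_log are an assumption, since the video does not read them out, and were picked only to show duplicates being dropped.

```python
# A set literal uses curly braces and keeps only unique values.
daily_pints = {1, 2, 3}
print(type(daily_pints))  # <class 'set'>

# Assumed log values with repeats (not taken from the video).
daily_pints_log = {1, 2, 31, 2, 3, 4, 5, 6, 1, 2, 3}
print(daily_pints_log)    # duplicates are gone; display order is not guaranteed

# Sets are unordered, so indexing raises a TypeError...
try:
    daily_pints[0]
except TypeError as err:
    print("TypeError:", err)

# ...but looping over the items still works.
for pints in daily_pints:
    print(pints)
```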
703:03 now a use case for sets and this is something that I've
703:04 done in the past is comparing two
703:06 separate sets maybe you have a list or a
703:08 tuple and you convert that into a set
703:11 and that will narrow it down to its
703:12 unique values then you can compare the
703:14 unique values of one set to the unique
703:16 values in another set and then we can
703:18 see what's the same and what's different
703:20 so let's go down here and let's say
703:22 wifes_daily
703:24 and just copy this right here
703:28 we'll say is equal to let's do our
703:30 squiggly brackets let's do one two let's do
703:34 just random
703:36 numbers so now this is my daily log and
703:39 this is my wife's daily log and now we
703:41 can compare these values so let's go
703:43 right down here let's say print we'll do
703:48 my daily_pints_log and then we'll do this bar
703:51 right here and this is going to show us
703:53 the combined unique values it's
703:54 basically like putting them all in one
703:56 set and then trimming it down to
703:58 just the unique values so we'll take
704:00 wifes_daily_pints_log and when we run
704:03 this we actually need to run this first
704:05 when we run this we should see all the
704:07 unique values between these two sets and
704:10 so as you can see 0 1 2 3 4 5 6 7 24 31
704:14 so these are all the unique values
704:16 between these two
704:17 sets we can also do another one and
704:21 instead of this bar we're going to do
704:23 this symbol right here which I believe
704:25 is called an ampersand
704:27 don't quote me on that but when we run
704:29 this it's going to show what matches
704:31 that means which ones show up in both
704:33 sets so the only ones that show up in
704:36 both sets are 1 2 3 and five we can also
704:40 do the opposite of that by doing a minus
704:43 sign and this is going to show us what
704:44 doesn't match and so we have four 6 and
704:47 31 now where is our 24 that was in our
704:50 wifes_daily_pints_log it's in this one
704:53 but we're subtracting the values on this
704:55 one so let's reverse this and
704:57 we'll say daily_pints_log
704:59 and let's run it now those are our
705:02 other values so we're taking the values
705:04 of this and then we're subtracting all
705:06 the ones that are the same and getting
705:08 the remaining values and then for our
705:11 last one we can get rid of this and
705:13 we'll do this symbol right here the caret and this
705:17 is going to show if a value is either in
705:19 one or the other but not in both so
705:22 let's run this so these values are
705:25 completely unique only to each of those
705:27 sets
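The four set comparisons walked through above can be sketched as follows. The exact values typed in the video are not read out, so the two sets here are assumptions chosen to reproduce the outputs that are mentioned (the union 0 1 2 3 4 5 6 7 24 31, the matches 1 2 3 5, and the difference 4 6 31).

```python
# Assumed unique values, reverse-engineered from the outputs quoted in the video.
daily_pints_log = {1, 2, 3, 4, 5, 6, 31}
wifes_daily_pints_log = {0, 1, 2, 3, 5, 7, 24}

# | is union: every unique value from either set.
print(daily_pints_log | wifes_daily_pints_log)

# & is intersection: values that show up in both sets.
print(daily_pints_log & wifes_daily_pints_log)

# - is difference: values in the left set that are not in the right set.
print(daily_pints_log - wifes_daily_pints_log)
print(wifes_daily_pints_log - daily_pints_log)  # order matters, so this differs

# ^ is symmetric difference: values in one set or the other, but not both.
print(daily_pints_log ^ wifes_daily_pints_log)
```

The same operations are also available as methods (union, intersection, difference, symmetric_difference), which accept any iterable rather than only sets.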
705:30 now the very last one that we're going to look at in this video is
705:32 dictionaries so let's go right down here
705:35 let's add a few cells and let's say
705:39 dictionaries now I saved dictionary for
705:41 last because this one is probably the
705:42 most different out of all the previous
705:44 data types that we've looked at within a
705:46 dictionary we have something called a
705:50 key value pair that means when we use a
705:53 dictionary it's not like a list where
705:55 you just have a value comma value comma
705:58 value we have a key that indicates what
706:01 that value is attributed to so let's
706:03 write out a dictionary to see how this
706:05 looks so we're going to say
706:07 dictionary_cream and just like a set we
706:11 use squiggly brackets but the thing that
706:13 differentiates it is that in a
706:15 dictionary we'll have that key value
706:17 pair whereas in a set each value is just
706:19 separated by a comma so let's write name
706:23 and this is our key and then we do a
706:25 colon and this is then where we input
706:27 our value so we're going to say Alex
706:30 freeberg and then we separate that key
706:33 value pair by a comma and now we can do
706:36 another key value pair so we'll say
706:39 weekly intake and a colon and we'll say
706:44 five pints of ice cream do a comma and
706:47 then we'll do favorite ice creams and
706:51 now what we're going to do is we're
706:52 going to put in here a list so within
706:54 this dictionary we can also add a list
706:57 we'll do MCC for mint chocolate chip
706:59 and then we'll add chocolate another one
707:02 of my favorites so now we have our very
707:04 first dictionary let's copy this and run
707:08 it and let's just look at the
707:10 type and as you can see it says that
707:13 this is a dictionary let's also print it
707:15 out now if we want to we can take our
707:18 dictionary_cream and say dot values with
707:22 an open parenthesis and when we execute
707:24 this we'll see all of the values within
707:27 this dictionary so here's our values of
707:28 Alex freeberg five mint chocolate chip
707:31 and chocolate we can also say keys and
707:35 when we run this we get all of the keys the
707:36 name weekly intake and favorite ice
707:38 creams and we can also
707:41 say items so this key value pair is one
707:45 item and this key value pair is another
707:48 item
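A minimal sketch of the dictionary built above. The key names and values follow the transcript as closely as the captions allow; the exact key spelling and capitalization typed in the video are assumptions.

```python
# Each entry is a key: value pair; a value can itself be a list.
dictionary_cream = {
    "name": "Alex Freeberg",
    "weekly intake": 5,
    "favorite ice creams": ["MCC", "Chocolate"],
}

print(type(dictionary_cream))     # <class 'dict'>
print(dictionary_cream.values())  # just the values
print(dictionary_cream.keys())    # just the keys
print(dictionary_cream.items())   # (key, value) pairs, one per item
```

The values(), keys(), and items() calls return view objects, so wrapping them in list() is a common way to get an ordinary list back.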
707:50 now one difference between something like a list and a dictionary
707:52 is how you call the index you can't
707:54 call it by doing something like
707:56 this where you just do a bracket oops
707:59 and say zero so this would in theory
708:02 take this very first one right our very
708:05 first key value pair that's going to
708:07 give us an error how you call a
708:08 dictionary is actually by the key so it
708:10 doesn't technically have an index but
708:12 you can specify what you want to call
708:14 and take it out so we're going to say
708:17 name and this is going to call that key
708:19 right here and when we run this we'll
708:21 get the value which is Alex freeberg one
708:25 other thing that you can do is you can
708:26 also update information in a dictionary
708:29 which we can't with some other data
708:31 types so for this for the name it was
708:33 Alex freeberg now let's say Christine freeberg
708:38 and when we update that I'm also going
708:40 to
708:41 print the dictionary get rid of this so
708:45 it's going to update Christine freeberg
708:48 in that value of the name so let's go
708:51 ahead and run this and now it changed
708:53 the name from Alex freeberg to Christine
708:55 freeberg we can also update all of these
708:58 values at one time so let's copy
709:02 this and I'm going to put it right down
709:04 here I'm going to say dictionary_cream
709:07 .update then we're going to put a
709:10 bracket or not a bracket but a
709:12 parenthesis around these so now what
709:14 we're going to do is update this entire
709:16 thing let me take this say print this
709:20 dictionary now we can update this to
709:23 anything we want so instead of here I
709:26 can
709:27 say I'll say
709:29 weight and because of all that ice cream
709:32 I now weigh 300 lb so let's run this and
709:37 as you can see it did not delete our key
709:39 value pair right here instead it just
709:41 added to it when you're using the update
709:44 we can't actually delete that's the
709:46 del statement and I'll show you that
709:47 in just a second but all we did was
709:50 add this new value it also is going to
709:52 check and see if you changed anything
709:54 with your key value pair so we can go in
709:55 here and change this value and
709:57 we'll say 10 so now when we run this the
710:01 value of this key value pair was changed
710:03 but let's say we do want to delete it
710:05 we'll say del that stands for delete
710:08 part of this dictionary_cream and now
710:09 let's specify the key which will also
710:12 delete the value with it well let's
710:14 specify the key that we want to get rid
710:16 of and let's say
710:17 weight and then let's print that
710:22 again and as you can see the weight was
710:25 deleted from that dictionary
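The lookup, update, and delete steps above can be sketched like this. The captions do not say which value was changed to 10, so applying it to "weekly intake" is an assumption, as is passing only the changed pairs to update() rather than the whole dictionary.

```python
dictionary_cream = {
    "name": "Alex Freeberg",
    "weekly intake": 5,
    "favorite ice creams": ["MCC", "Chocolate"],
}

# Dictionaries are looked up by key, not by position, so [0] raises KeyError.
try:
    dictionary_cream[0]
except KeyError as err:
    print("KeyError:", err)

print(dictionary_cream["name"])  # Alex Freeberg

# Assigning to an existing key replaces its value.
dictionary_cream["name"] = "Christine Freeberg"

# update() adds new keys and overwrites existing ones; it never removes keys.
# Assumption: the value changed to 10 in the video is "weekly intake".
dictionary_cream.update({"weekly intake": 10, "weight": 300})
print(dictionary_cream)

# del removes a key together with its value.
del dictionary_cream["weight"]
print(dictionary_cream)
```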
710:27 so that is all we're going to cover in this data
710:28 types video thank you guys so much for
710:30 watching I really appreciate it if you
710:32 like this video be sure to like and
710:34 subscribe below and I'll see you in the
710:35 next
710:37 [Music]
710:47 video hello everybody today we're going
710:50 to be taking a look at comparison
710:51 logical and membership operators in
710:53 Python operators are used to perform
710:55 operations on variables and values for
710:58 example you're often going to want to
710:59 compare two separate values to see if
711:01 they are the same or if they're
711:02 different within Python and that's where
711:04 the comparison operator comes in right
711:06 here you can see our operators you can
711:08 also see what they do so this equal sign
711:11 equal sign stands for equal we have the
711:13 does not equal the greater than less
711:15 than greater than or equal to and less
711:17 than or equal to and honestly I use
711:19 these almost every single time I use
711:21 Python so these are very important to
711:23 know and know how to use so let's get
711:24 rid of that really quickly and actually
711:26 start writing it out and see how these
711:27 comparison operators work in Python the
711:29 very first one that we're going to look
711:31 at is equal to now you can't just say 10
711:33 is equal to 10 with a single equal sign let's try running that
711:35 really quickly by clicking shift enter
711:38 it's going to say cannot assign to
711:40 literal that's because this is like
711:41 assigning a variable we're trying to say
711:43 10 is equal to 10 and then we can call
711:46 that 10 later but that's not how this
711:48 actually works what we're trying to do
711:49 is to determine whether 10 is equal to
711:51 10 so we're going to say equal sign
711:53 equal sign and then if we run that by
711:55 clicking shift enter again it's going to
711:57 say true now if we put something else
711:59 like 50 in there and we try to run this
712:02 it's going to say false so really what
712:03 you're going to get when you use these
712:05 comparison operators is either a true or
712:07 a false if we take this right down here
712:10 we can also say does not equal and we're
712:12 going to use an exclamation point equal
712:14 sign and that says 10 is not equal to 50
712:17 and that should be true you can also
712:18 compare strings and variables so let's
712:21 go right down here and we're going to
712:23 say vanilla equal sign
712:26 equal sign chocolate and when we run this
712:30 it'll say false now if it was the same
712:32 just like when we did our numbers it
712:34 should say true and we can also compare
712:36 variables so we'll say x is equal to
712:39 vanilla and y is equal to chocolate and
712:44 then when we come down here we can say x
712:46 is equal to y and it'll give us a false
712:49 and we say x is not equal to y and it'll
712:53 give us a true the next one that we're
712:55 going to take a look at is the less
712:57 than so let's copy this one right up
712:58 here let's scroll
713:00 down and let's say 10 is less than 50
713:05 now this will come out as true now let's
713:07 say we put a 10 in here before 10 was of
713:11 course less than 50 but is 10 less than
713:14 10 no that's false because they are the
713:17 same so if we want an output that is
713:18 true all we would have to add is an
713:20 equal sign right here and this would say
713:22 10 is less than or it is equal to 10 and
713:26 now it's true of course we can say the
713:29 exact same thing by saying greater than
713:31 so 10 is greater than or equal to 10
713:34 that'll be true because 10 is equal to
713:36 10 we can also say 50 is greater than or
713:39 equal to 10 because 50 is obviously
713:41 greater than 10
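The comparisons described above can be sketched as a short script. The values and the names x and y follow the transcript. The single-equals case is run through compile() here, which is an addition, so the syntax error can be shown without stopping the script.

```python
# == tests equality; a single = is assignment and cannot target a literal.
try:
    compile("10 = 10", "<cell>", "exec")
except SyntaxError as err:
    print("SyntaxError:", err.msg)

print(10 == 10)  # True
print(10 == 50)  # False
print(10 != 50)  # True

# Strings and variables compare the same way.
print("vanilla" == "chocolate")  # False
x = "vanilla"
y = "chocolate"
print(x == y)    # False
print(x != y)    # True

# Ordering comparisons.
print(10 < 50)   # True
print(10 < 10)   # False, the values are equal
print(10 <= 10)  # True
print(10 >= 10)  # True
print(50 >= 10)  # True
```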
713:43 now let's look at logical operators that are often
713:45 combined with comparison operators so
713:47 our operators are and or and not so if
713:50 you have an and that returns true if
713:53 both statements are true if it's or at least
713:56 one of the statements has to be true and
713:58 the not basically reverses the result so
714:00 if it was going to return true it would
714:02 return false I don't use this not one a
714:05 lot but I will show you how it works so
714:07 let's actually test that out so before
714:09 we were saying 10 is greater than 50 and
714:13 of course this returned false so now
714:15 let's add parentheses around this 10
714:17 is greater than 50 and we're going to
714:19 say and we'll do an open parenthesis 50
714:22 is greater than 10 now this statement
714:25 right here is true 50 is greater than 10
714:27 so we have a true statement and a false
714:29 statement but this and is going to look
714:31 at both of them and it's going to say
714:33 they both need to be true in order to
714:35 return a true so let's try running this
714:38 and we still have a false if we want it
714:40 to return true we're going to have to
714:42 change this to make it a true statement
714:44 so 70 is greater than 50 and 50 is
714:46 greater than 10 when we run this it
714:48 should return true now let's look at the
714:50 or so let's copy this and we'll say 10
714:54 is greater than 50 or 50 is greater than
714:58 10 now this is a false statement and
715:01 this is a true statement so if even one
715:03 of them is a true statement the output
715:04 should be true and again we can do this
715:07 even with strings so we can do
715:11 vanilla and
715:13 chocolate there we go and vanilla is
715:17 actually greater than chocolate because
715:18 V is a higher number in the alphabetical
715:21 order so V is like 20 something whereas
715:23 C is three right so it actually
715:25 looks at the spelling for this so if we
715:27 say or here it will come out true and if
715:31 we say and here it should also be true
715:33 because V is greater than C and 50 is
715:36 greater than 10 so this should also be
715:38 true now let's copy this right here and
715:42 we're going to say not so what we had
715:44 before is 50 is greater than 10 that
715:47 returned true but now all we're doing is
715:50 putting not in front of it so instead of
715:51 returning true it's going to return
715:53 false
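A short sketch of the and, or, and not examples above, using the same numbers and strings as the transcript. The variable names are hypothetical and only there to label each result.

```python
# and needs both sides to be True.
false_and_true = (10 > 50) and (50 > 10)
true_and_true = (70 > 50) and (50 > 10)
print(false_and_true)  # False
print(true_and_true)   # True

# or needs at least one side to be True.
false_or_true = (10 > 50) or (50 > 10)
print(false_or_true)   # True

# Strings compare character by character using code points,
# so "vanilla" > "chocolate" because "v" (118) comes after "c" (99).
strings_or = ("vanilla" > "chocolate") or (50 > 10)
strings_and = ("vanilla" > "chocolate") and (50 > 10)
print(strings_or)      # True
print(strings_and)     # True

# not flips the result.
negated = not (50 > 10)
print(negated)         # False
```

Because the comparison uses code points rather than a true alphabetical ranking, uppercase letters sort before lowercase ones, so "Vanilla" > "chocolate" would be False.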
715:55 so now let's take a look at membership operators and we use this to
715:57 check if something whether it's a value
715:59 or a string or something like that is
716:01 within another value or string or
716:03 sequence our operators are in and not in
716:06 so it's pretty simple if it's in it's
716:08 going to return true if the sequence
716:09 with a specified value is present in the
716:11 object just like we were talking about
716:13 and for not in it's basically the exact
716:15 same thing if it's not in that object so
716:17 let's start out by taking a look at a
716:19 string we're going to say ice_cream is
716:22 equal to I love chocolate
716:25 ice
716:26 cream and then we're going to say love
716:29 in ice_cream and that will return
716:33 true so all we're doing is searching if
716:35 the word love or that string is in this
716:38 larger string we could also just do that
716:40 by literally copying this and putting
716:42 this where this is so we can check is
716:44 this string part of this string and
716:46 it'll say true we can also make a list
716:49 so we'll say scoops is equal to and then
716:52 we'll do a bracket and we'll say 1 2 3 4
716:55 4 five and then we'll say two in scoops
717:00 so all we're doing is searching to see
717:01 if two is within this list and that
717:03 should return true now if we put a six
717:06 here and we said not in it will also
717:10 return true because six is not in scoops
717:13 and that is true and just like we did we
717:15 could also say wanted_scoops
717:18 and we'll say eight so I wanted eight
717:21 scoops so we can say wanted_scoops not in
717:24 scoops and this should return true
717:27 because there's not an eight within the
717:29 scoops list and if we said in
717:32 and we said we wanted eight is that
717:35 within our list that we created and
717:37 that's going to return a false
717:39 that's going to return a false so that is a quick breakdown of comparison
717:41 is a quick breakdown of comparison logical and membership operators I hope
717:43 logical and membership operators I hope that this was helpful thank you guys so
717:45 that this was helpful thank you guys so much for watching if you like this video
717:47 much for watching if you like this video be sure to like And subscribe and I will
717:49 be sure to like And subscribe and I will see you in the next
717:52 see you in the next [Music]
717:54 [Music] video
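The membership checks walked through in this video can be sketched as follows. This is a minimal sketch rather than the exact notebook cells: the result variable names (`love_in_string`, `two_in_scoops`, and so on) are illustrative additions, and the exact casing of the string is an assumption.

```python
# Membership operators: `in` and `not in` each evaluate to True or False.
ice_cream = 'I love chocolate ice cream'

# Substring check on a string.
love_in_string = 'love' in ice_cream
print(love_in_string)

# Element checks on a list.
scoops = [1, 2, 3, 4, 5]
two_in_scoops = 2 in scoops
six_not_in_scoops = 6 not in scoops
print(two_in_scoops, six_not_in_scoops)

# The value being searched for can also live in a variable.
wanted_scoops = 8
wanted_missing = wanted_scoops not in scoops   # 8 is absent, so this is True
wanted_present = wanted_scoops in scoops       # the opposite check is False
print(wanted_missing, wanted_present)
```

Storing each result in a variable is only a convenience for inspecting it later; typing the bare expression in a notebook cell, as the video does, shows the same True or False.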
718:03 [Music] hello everybody today we're going to be
718:04 taking a look at the if statement within
718:06 python now it's actually the if elif else
718:08 statement but that's a mouthful so I'm
718:10 just going to call it the if else
718:11 statement now we have this flowchart and
718:13 I apologize for being blurry but this is
718:15 the absolute best one that I could find
718:17 right up top we have our if condition
718:20 now if this if condition is true we're
718:22 going to run a body of code but if that
718:24 condition is false we're going to go
718:26 over here and go to the elif condition the
718:28 elif condition or statement is basically
718:30 saying if the first if statement doesn't
718:32 work let's try this if statement if this
718:34 elif statement is true it goes to this
718:37 body of code if it's false it'll come
718:39 over here to the else and the else is
718:40 basically if all these things don't work
718:43 then run this body of code now you can
718:45 have as many elif statements as you
718:47 want but you can only have one if
718:49 statement and one else statement so
718:51 let's write out some code and see how
718:52 this actually looks let's first start
718:54 off by writing if that is our if
718:56 statement and now we have to write our
718:57 condition which is about to be either
718:59 met or not met so we'll say if 25 is
719:03 greater than 10 which is true we'll say
719:05 colon and then we're going to hit enter
719:08 and it's going to automatically indent
719:09 that line of code for us and this is our
719:11 body of code so if 25 is greater than 10
719:14 our body of code will execute so for us
719:17 we're just going to write print and
719:18 we'll say it worked now if we run this
719:21 it's going to check is 25 greater than
719:23 10 if that is true print this so
719:27 let's hit shift enter and it worked now
719:30 let's take this exact code we'll paste
719:33 it right down here and we'll say is less
719:35 than and right now this if statement is
719:38 not true so it's not actually going to
719:41 work as you can see there's no output
719:43 there's nothing that happened really but
719:45 it did check to see if 25 was less than
719:47 10 but it just wasn't true now we can
719:50 use our else statement so we're going to
719:51 come right down here and we're going to
719:53 say else and we'll do a colon and we'll
719:55 hit enter again automatically indenting
719:57 and we're going to say print and we're
719:59 going to say it did not work dot dot dot
720:04 so what it's going to do is it's going
720:05 to come up here and check is 25 less
720:08 than 10 no it's not so this body of code
720:11 is not going to be executed it's going
720:12 to go right down to this else statement
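The if and else steps described so far can be sketched like this. It is a rough sketch, not the exact cells from the video: the `result` variable is an illustrative addition so the chosen branch can be inspected, where the video calls print inside each branch.

```python
# Plain `if`: the indented body runs only when the condition is True.
if 25 > 10:
    print('It worked')

# With `<` the condition is False, so the `if` body is skipped.
# The `else` branch has no condition of its own and runs instead.
if 25 < 10:
    result = 'It worked'
else:
    result = 'It did not work...'
print(result)
```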
720:14 now this else statement is going to be
720:16 printed there's no condition on this so
720:18 the if statement has a condition 25 is
720:20 less than 10 this has no condition so if
720:22 this doesn't work if this is false it's
720:24 going to come down here and it will run
720:26 this body of code let's run this by
720:28 clicking shift enter and as you can see
720:31 our output is it did not work now let's
720:33 go back up here and put greater than
720:36 because this is now true it's going to
720:38 say if 25 is greater than 10 print it
720:41 worked and then it's going to stop it's
720:43 not going to go to this else statement at
720:45 all so let's run this and our output is
720:48 it worked so what if we have a lot of
720:49 different conditions that we want to try
720:51 let's come right down here this is where
720:53 the elif comes in so really quickly
720:55 let's change this to a not true a false
720:58 statement we're going to go down and say
721:00 elif and we're going to say 25 is less than
721:05 let's say
721:07 30 we'll say elif
721:11 worked so now it's going to check is 25
721:14 less than 10 no it's not let's look at
721:16 the next condition is 25 less than 30
721:19 and if it is we'll print elif worked so
721:22 let's try running this and elif worked
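A minimal sketch of the elif step just described, assuming the same literal numbers as the video. The `outcome` variable is an illustrative addition, and the exact print strings are an assumption based on what is read aloud.

```python
# Conditions are tested top to bottom; the first True one wins.
if 25 < 10:
    outcome = 'It worked'           # skipped: 25 < 10 is False
elif 25 < 30:
    outcome = 'elif worked'         # runs: 25 < 30 is True
else:
    outcome = 'It did not work...'  # only reached if every test above fails
print(outcome)
```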
721:25 now we can do as many of these elif
721:27 statements as we want we can do let's
721:30 just try a few of them right here so
721:33 we'll say elif 25 is less than 20 is less
721:37 than
721:38 21 and let's do 40 and let's do 50 so
721:42 we'll say elif elif 2 elif 3 and elif 4 now if you
721:48 look at this the first one that is
721:50 actually going to work is this 25 less than 40
721:53 right here once this one is checked and
721:55 it comes out as true none of the other
721:58 elif or else statements will run so let's
722:00 try this one it should be
722:02 elif 3 and this one ran properly now within
722:05 our condition so far we've only used a
722:07 comparison operator we can also use a
722:09 logical operator like and or or so we
722:12 can say if 25 is less than 10 which it's
722:16 not and then we'll
722:19 say or 1 is less than three which is
722:22 true if we run this now it will actually
722:26 work so we can use several different
722:27 types of operators within our if
722:29 statement to see if a condition is true
722:31 or not or several conditions are true
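The longer elif chain and the logical operator condition can be sketched as below. Wrapping the chain in a function named `first_match` is an assumption made here so that different values can be tried; the video types the literal 25 into every condition. The return labels mirror the strings read aloud.

```python
def first_match(value):
    """Return the label of the first branch whose condition is True (hypothetical helper)."""
    if value < 10:
        return 'It worked'
    elif value < 20:
        return 'elif'
    elif value < 21:
        return 'elif 2'
    elif value < 40:
        return 'elif 3'   # for 25 this is the first True test, so later branches never run
    elif value < 50:
        return 'elif 4'
    else:
        return 'It did not work...'

print(first_match(25))

# A logical operator combines comparisons: `or` needs only one side to be True.
if (25 < 10) or (1 < 3):
    combined = 'It worked'
else:
    combined = 'It did not work...'
print(combined)
```

Note that 25 is also less than 50, but that branch is never evaluated once an earlier branch has matched.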
722:33 there's also a way to write an if else
722:35 statement in one line if you want to do
722:37 that so we can write print we'll say it
722:41 worked and then we'll come over here and
722:43 say if 10 is greater than 30 and then
722:47 we'll write else print and we'll say it
722:52 did not work just like we had before
722:55 except now it's all occurring on one
722:57 line so let's just try this and see if
722:59 it works so it's saying print it worked
723:02 if 10 is greater than 30 which it wasn't
723:04 so it went to the else statement and then
723:06 it printed out our body right here
723:08 although we didn't have any indentation
723:10 or multiple lines it was all done in one
723:12 line now there's one other thing that we
723:13 haven't looked at yet and I'm going
723:15 to show it to you really quickly and
723:17 that's a nested if statement so when we
723:19 run this it's going to say it worked it
723:22 works because it says 25 is less than 10
723:25 or one is less than three since this is
723:27 true it's going to print out it worked
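The one-line if else form mentioned a moment ago is Python's conditional expression. The first statement below is a sketch of the line as dictated; the second variant, with the hypothetical `message` variable, is not shown in the video and is included only as a commonly used alternative.

```python
# Conditional expression: <value if True> if <condition> else <value if False>
print('It worked') if 10 > 30 else print('It did not work...')

# Variant (an assumption, not from the video): choose the value first, then print once.
message = 'It worked' if 10 > 30 else 'It did not work...'
print(message)
```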
723:30 but we can also do a nested if statement
723:32 so we can do multiple if statements as
723:34 well so we're going to hit enter and
723:36 we'll say if and we'll do a true
723:38 statement here so we'll say if 10 is
723:41 greater than five let's do a colon hit
723:44 enter then we'll say print and then
723:46 we'll type a string saying this nested
723:49 if
723:50 statement
723:52 worked now let's try this out and
723:54 see what we get so it went through the
723:57 first if statement it said it was true
723:59 and it prints out it worked this is
724:01 still the body of code so it goes down
724:03 to this next if statement and it says if
724:05 10 is greater than five we're going to
724:07 print this out and you could do this on
724:09 and on and on it can basically go on
724:12 forever and you can create really
724:13 in-depth logic and that actually happens
724:15 a lot when you start writing more
724:16 advanced code so I hope that this was
724:18 helpful I hope that you understand the
724:20 if else statement better I hope that you
724:22 understand how nested if statements work
724:23 as well thank you guys so much for
724:25 watching if you like this video be sure
724:27 to like and subscribe below and I'll see
724:29 you in the next
724:31 [Music]
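The nested if statement from this video can be sketched as follows. The `messages` list is an assumption added so the order of execution can be checked afterwards; the video simply calls print in each body.

```python
messages = []

if (25 < 10) or (1 < 3):
    messages.append('It worked')
    # This inner `if` sits inside the outer body, so it is only
    # evaluated when the outer condition was True.
    if 10 > 5:
        messages.append('This nested if statement worked')

for line in messages:
    print(line)
```

Each extra level of nesting adds one more level of indentation, which is how Python knows which body a statement belongs to.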
724:41 video hello everybody today we're going
724:44 to be learning about for loops in Python
724:46 the for loop is used to iterate over a
724:48 sequence which could be a list a tuple
724:51 an array a string or even a dictionary
724:53 here's the list that we'll be working
724:54 with throughout this video and I have
724:56 this little diagram right here which
724:58 kind of explains how a for loop works
725:01 the for loop is going to start by
725:02 looking at the very first item in our
725:04 sequence or our list and that's going to
725:06 be our one right here it's going to ask
725:08 is this the last element in our list and
725:11 it is not so it's going to go down to
725:14 this body of the for loop now we can
725:16 have a thousand different things that
725:18 can happen in the body of the for loop
725:19 as we're about to look at in just a
725:21 second then it's going to go up to the
725:23 next element and ask is this the last
725:25 element reached so it'll be no again
725:28 because we'll be going to the two and
725:29 then the three and then the four and the
725:31 five once it reaches the five it'll go
725:34 to the body of the for loop and then when
725:36 it asks if that's the last element the
725:38 answer would be yes because it's
725:39 iterated through all the items within
725:41 the list and then we would exit the loop
725:43 and the for loop would be over now that
725:45 may not have made perfect sense but
725:47 let's actually start writing out the
725:49 syntax of a for loop so we can
725:50 understand this better to start our for
725:52 loop we're going to say for and
725:54 then we're going to give it a temporary
725:56 variable for this for loop so it's a
725:58 variable as it iterates through these
726:00 numbers it's going to assign each
726:02 number to that variable so for this one
726:04 we're just going to say number because
726:06 it's pretty appropriate because these
726:07 are all numbers and then we're going to
726:09 say in integers now right here you can
726:14 put just about anything this could be
726:15 the list this could be a tuple this
726:17 could be a string even but that is what
726:20 we're going to iterate through so we're
726:21 saying for the variable each of these
726:23 numbers within this list of integers and
726:27 then we're going to write a colon this
726:29 is the body of code that's going to
726:31 actually be executed when we run through
726:33 and iterate through our list so for our
726:35 first example we're going to start off
726:37 super simple and all we're going to do
726:39 is say print open parentheses and say
726:42 number as it iterates through the 1 2 3
726:45 4 and five number becomes our variable
726:48 that is going to be printed so during
726:50 that first loop our one will be printed
726:52 because that will be assigned right here
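The basic for loop being dictated here can be sketched as below, assuming the list shown on screen is named `integers` and holds 1 through 5. The `seen` list is an illustrative addition that records the order in which values are assigned to `number`.

```python
integers = [1, 2, 3, 4, 5]

seen = []
for number in integers:
    # On each pass, `number` is bound to the next element of the list.
    print(number)
    seen.append(number)
```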
726:55 then through the next iteration the two
726:57 will be assigned and it'll be put right
726:59 here in each loop until the very end so
727:02 let's hit shift
727:04 enter and as you can see it did exactly
727:07 that now in this body and I'll copy and
727:09 paste this down here in this body we
727:11 really can do just about anything we
727:13 want we don't even have to use this
727:15 variable number right here we can just
727:17 print yep if we wanted to and what it's
727:21 going to do is for each iteration all
727:23 five of those every time it loops
727:25 through it's going to print off yep so
727:27 let's hit shift enter and it printed it
727:30 off for us so really we weren't even
727:33 using the numbers within the list we
727:35 were really just using it as almost a
727:37 counter now let's copy this integers
727:39 once again let's go right up here and
727:42 let's go copy this for loop that we
727:45 wrote now we do not have to call this
727:48 number this can be anything you want any
727:51 variable name that you'd like to name it
727:53 we could call it
727:55 jelly and we can
727:57 do jelly plus
728:00 jelly I think you're getting the picture
728:02 right when it loops through that one
728:04 it's doing 1 plus one when it loops
728:06 through the two it's doing two plus two
728:09 that is basically how a for loop works
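The two variations just described, ignoring the loop variable and renaming it, can be sketched like this. The `yep_count` and `doubled` names are illustrative additions used to capture what each loop did.

```python
integers = [1, 2, 3, 4, 5]

# The body does not have to use the loop variable;
# here the list only controls how many times the body runs.
yep_count = 0
for number in integers:
    print('yep')
    yep_count += 1

# The loop variable can have any valid name.
doubled = []
for jelly in integers:
    print(jelly + jelly)
    doubled.append(jelly + jelly)
```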
728:11 now for a dictionary it's going to
728:12 handle it a little bit differently so
728:15 let's create a dictionary really quickly
728:17 so we'll say
728:19 ice_cream_dict is equal to we're going
728:22 to do curly brackets so we're going
728:24 to say name and we're going to say colon
728:27 we need to assign our value for that
728:29 key so we're going to say Alex freeberg
728:33 we'll do our next one separated by a
728:35 comma and we'll say weekly intake and
728:38 I'll say five scoops per week the next
728:42 one we will do is favorite ice creams
728:46 and for this one we're going to do
728:47 something a little bit different for
728:49 this we're going to have a list within
728:51 this dictionary so we'll say within our
728:53 list of my favorite ice creams we'll say
728:56 mint chocolate chip and I'll just do MCC
728:58 for that and we'll separate that out by
729:02 a comma and we'll say chocolate so now
729:04 we have this dictionary ice_cream_dict
729:07 and within it we have my name my weekly
729:09 intake and my favorite ice creams with a
729:12 list in there as well let's hit shift
729:15 enter and now we're going to start
729:16 writing our for loop now the for loop is
729:18 going to look very similar but to call a
729:20 dictionary it's just a little bit
729:22 different so we're going to say for
729:24 cream in
729:29 ice_cream_dict
729:30 .values and then we're going
729:32 to do parentheses and then a colon now
729:35 we're going to print cream so in
729:39 order to indicate what we actually want
729:40 to pull we have to specify within the
729:43 dictionary what we want are we pulling
729:46 the key are we pulling the value we
729:47 need to specify this so that's why we
729:49 have this .values right here so let's
729:52 run this and see what we get so as you
729:54 can see we are pulling in the values
729:56 right here that's why we're pulling in
729:57 Alex freeberg 5 and mint chocolate chip
730:00 and chocolate now we are able to call
730:03 both of those both the key and the value
730:06 so let's go right down here and we can
730:08 do both the key and the value so we can
730:11 pull two things at one time and we're
730:14 going to do this by saying .items so
730:17 we could also do .keys if we just
730:19 wanted to do a key but we want to do
730:22 items so we're going to do both of them
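The dictionary loops discussed in this part can be sketched as follows. The dictionary name `ice_cream_dict`, the key strings, and the spelling and casing of the values are assumptions reconstructed from what is read aloud; `.values()`, `.keys()`, and `.items()` are the standard dictionary methods.

```python
ice_cream_dict = {
    'name': 'Alex Freeberg',                      # spelling assumed from the captions
    'weekly intake': 5,
    'favorite ice creams': ['MCC', 'Chocolate'],  # a list stored as a value
}

# Values only.
for cream in ice_cream_dict.values():
    print(cream)

# Keys only.
for key in ice_cream_dict.keys():
    print(key)

# Key and value together: each item unpacks into two loop variables.
pairs = []
for key, value in ice_cream_dict.items():
    print(key, '->', value)
    pairs.append((key, value))
```

Looping over the dictionary directly, without calling a method, yields the keys, so `.keys()` is mostly a matter of being explicit.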
730:26 so we're going to go right down here and say for key and value in ice cream
730:28 say for key and value in ice cream dictionary. items print and let's write
730:32 dictionary. items print and let's write key and then we'll do a comma and then
730:35 key and then we'll do a comma and then let's give it a little arrow or
730:36 let's give it a little arrow or something like that uh something like
730:38 something like that uh something like this and then we'll do a comma and we'll
730:40 this and then we'll do a comma and we'll say value and let's print this off and
730:43 say value and let's print this off and see what we get so it's looping through
730:47 see what we get so it's looping through and for each key and value it's saying
730:49 and for each key and value it's saying here is the key so that's the name then
730:51 here is the key so that's the name then we have weekly intake then we have
730:53 we have weekly intake then we have favorite ice creams it's giving us a
730:55 favorite ice creams it's giving us a little arrow and then we're also
730:56 little arrow and then we're also printing off the value so we have name
730:58 printing off the value so we have name Alex freeberg weekly intake five
731:01 Alex freeberg weekly intake five favorite ice creams mint chocolate chip
731:03 favorite ice creams mint chocolate chip and chocolate so now let's talk about
731:05 and chocolate so now let's talk about nested for Loops we've looked at for
731:07 nested for Loops we've looked at for Loops we understand how they work and
731:09 Loops we understand how they work and why they do what they do but what about
731:11 why they do what they do but what about a nested for Loop a for Loop within a
731:13 a nested for Loop a for Loop within a for Loop for this example let's create
731:16 for Loop for this example let's create two separate lists let's create
731:19 two separate lists let's create flavors and let's make that a list by
731:22 flavors and let's make that a list by making it a bracket we'll do vanilla the
731:26 making it a bracket we'll do vanilla the classic
731:29 classic chocolate and then cookie dough all
731:33 chocolate and then cookie dough all great flavors so that's our first list
731:36 great flavors so that's our first list and then we're going to say toppings and
731:38 and then we're going to say toppings and we'll do a bracket for that as well and
731:41 we'll do a bracket for that as well and we'll say hot
731:43 we'll say hot fudge and then we'll do
731:47 fudge and then we'll do Oreos and then we'll do
731:51 Oreos and then we'll do marshmallows is how you spell
731:53 marshmallows is how you spell marshmallows
731:54 marshmallows I think it's an e that looks wrong I
731:57 I think it's an e that looks wrong I might be spelling it wrong but that's
731:59 might be spelling it wrong but that's okay so let's save this by clicking
732:01 okay so let's save this by clicking shift enter and now we have our flavors
732:04 shift enter and now we have our flavors and our toppings so now let's write our
732:06 first for loop we're going to say for one as
732:09 in our number one for loop we're going
732:11 to say in flavors and we'll do a colon
732:15 we'll click enter now we can write our
732:17 second for loop so we're going to say for
732:19 two in toppings and then we'll do a
732:23 colon and enter and then we're going to
732:25 say print and we'll do an open
732:27 parenthesis and then we're going to say
732:29 one so we're printing the one in flavors
732:33 and then we're going to say one comma
732:35 I'm going to say topped with comma two so
732:41 what this is essentially going to do is
732:43 we're going to say for one we're going
732:45 to take the very first one in flavors
732:48 and then we're going to loop through all
732:49 of two as well so we're going to loop
732:52 through hot fudge Oreos
732:54 and marshmallows and once we print that
732:57 off then we will loop all the way back
732:59 to flavors and look at the next
733:02 iteration or the next sequence within
733:04 the first for loop so let's run this
733:06 really quickly and see what we get so as
733:09 you can see it goes vanilla vanilla
733:12 vanilla and vanilla is topped with the
733:14 hot fudge the Oreos and the marshmallows
733:16 and then we start iterating through our
733:18 second one in our first for loop so
733:20 there's that hierarchy so we're
733:21 iterating completely through this one
733:24 before we actually go to the very first
733:25 for loop and start iterating through
733:27 that one again now that is essentially
733:29 how a nested for loop works these nested
733:31 for loops can get very complicated in
733:33 fact for loops in general can get very
733:36 complicated the more you add to it and
733:38 the more you're wanting to do with it
733:39 but that is basically how a for loop and
733:41 a nested for loop works thank you guys
733:43 so much for watching be sure to like and
733:45 subscribe below and I'll see you in the
733:46 next
733:48 [Music]
733:52 video
733:54 [Music]
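A minimal sketch of the nested for loop described above, assuming the list contents and the loop variable names (`one`, `two`) as they are spoken in the video. The `pairs` list is an addition for illustration only, so the iteration order can be inspected afterwards; it is not part of the original notebook.

```python
flavors = ["vanilla", "chocolate", "cookie dough"]
toppings = ["hot fudge", "oreos", "marshmallows"]

pairs = []  # illustrative only: records the order in which combinations appear

for one in flavors:          # outer loop: one pass per flavor
    for two in toppings:     # inner loop: runs through every topping for that flavor
        print(one, "topped with", two)
        pairs.append((one, two))
```

The inner loop finishes all three toppings before the outer loop moves on to the next flavor, which is the hierarchy the video points out.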
733:59 hello everybody today we're going to be
734:01 taking a look at while loops in Python
734:03 the while loop in Python is used to
734:05 iterate over a block of code as long as
734:07 the test condition is true now the
734:09 difference between a for loop and a
734:10 while loop is that a for loop is going
734:12 to iterate over the entire sequence
734:14 regardless of a condition but the while
734:16 loop is only going to iterate over that
734:17 sequence as long as a specific condition
734:19 is met once that condition is not met
734:22 the code is going to stop and it's not
734:23 going to iterate through the rest of the
734:24 sequence so if we take a look at this
734:26 flowchart right here we're going to
734:27 enter this while loop and we have a test
734:30 condition right here the first time that
734:31 this test condition comes back false
734:33 it's going to exit the while loop so
734:34 let's start actually writing out the
734:36 code and see how this while loop works
734:38 so let's create a variable we're just
734:39 going to say number is equal to one and
734:42 then we'll say while and now we need to
734:44 write our condition that needs to be met
734:45 in order for our block of code beneath
734:47 this to run so we're going to say while
734:50 number is less than five and then we'll
734:53 do colon enter and now this is our block
734:55 of code we're going to say print and
734:57 then we'll say number now what we need
734:59 to do is basically create a counter
735:01 we're going to say number equals number
735:04 + 1 if you've never done something like
735:06 this it's kind of like a counter most
735:08 people start it at zero in fact let's
735:09 start it at zero and then each time it
735:11 runs through this while loop it's going
735:13 to add one to this number up here and
735:15 then it's going to become a one a two a
735:18 three each time it iterates through this
735:20 while loop now once this number is no
735:22 longer less than five it'll break out of
735:25 the while loop and it will no longer run
735:27 so let's run this really quick by
735:28 hitting shift enter so it starts at zero
735:31 and it's going to say while the number
735:33 is less than five print number so the
735:35 first time that it runs through it is
735:37 zero and so it prints zero and then it
735:39 adds one to number and then it
735:42 continues that while loop right here and it
735:44 keeps looping through this portion it
735:46 never goes back up here to this line of
735:47 code this is just our variable that we
735:50 start with and then once this condition
735:52 is no longer met once it is false
735:54 then it's going to break out of that
735:55 code now that we basically know how a while
735:57 loop works let's look at something
735:59 called a break statement so let's copy
736:01 this right down here and what we're
736:03 going to say is if number is equal to
736:07 three we're going to break now with the
736:10 break statement we can basically stop
736:11 the loop even if the while condition is
736:13 true so while this number is less than
736:16 five it's going to continue to loop
736:17 through but now we have this break
736:19 statement so it's going to say if the
736:21 number equals three we're going to break
736:23 out of this while loop but if this
736:25 is false we're going to continue adding
736:27 to that number just like normal so let's
736:29 execute this so as you can see it only
736:31 went to three instead of four like
736:33 before because each time it was running
736:35 through this while loop it was checking
736:37 if the number was equal to three and
736:39 once it got to three this became true
736:41 and then we broke out of this while loop
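A short sketch of the two loops discussed so far, assuming the variable name `number` and the starting value of zero from the video. The `printed` and `printed_with_break` lists are additions for illustration, so the values each loop prints can be compared side by side.

```python
# Plain counter loop: runs while the condition stays true
number = 0
printed = []  # illustrative only
while number < 5:
    print(number)
    printed.append(number)
    number = number + 1  # the counter; without it the condition never becomes false

# Same loop with a break statement: stops early even though number < 5 still holds
number = 0
printed_with_break = []  # illustrative only
while number < 5:
    print(number)
    printed_with_break.append(number)
    if number == 3:
        break
    number = number + 1
```

The first loop prints zero through four; the second stops after printing three, because the `if` check runs before the counter is incremented.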
736:43 the next thing that I want to look at
736:44 and we'll copy this right down here is
736:46 an else statement much like an if
736:48 statement but we can use the else
736:50 statement with a while loop which runs
736:52 the block of code and when that
736:53 condition is no longer true then it
736:56 activates the else statement so we'll go
736:58 right down here and we'll say else and
737:00 we'll do a colon and enter and then
737:02 we'll say print and we'll say no
737:07 longer less than five now because this
737:10 if statement is still in there it will
737:12 break so let's say six and then we'll
737:15 run this and so it's going to iterate
737:17 through this block of code and once this
737:19 statement is no longer true once we
737:21 break out of it we're going to go to our
737:22 else statement now as long as this
737:24 statement is true it's going to continue
737:26 to iterate through but once this
737:28 condition is not met then it will go to
737:30 our else statement and we'll run that line
737:31 of code now the else statement is only
737:33 going to trigger if the while loop no longer
737:36 is true if we have something like this
737:38 if statement that causes it to break out
737:40 of the while loop the else statement will
737:41 no longer work so let's say if the
737:44 number is three and we run this the else
737:47 statement is no longer going to trigger
737:48 so this body of code will not be run now
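A sketch of the while/else behavior described above. The wrapper function `count_up` and its parameter `break_at` are hypothetical, introduced here only so both cases (break at six, which is never reached, and break at three) can be run with the same code; the video edits the number in a single notebook cell instead.

```python
def count_up(break_at):
    """Hypothetical helper: collects the lines the loop would print."""
    lines = []
    number = 0
    while number < 5:
        lines.append(str(number))
        if number == break_at:
            break  # leaving via break skips the else block
        number = number + 1
    else:
        # runs only when the while condition itself becomes false
        lines.append("no longer less than 5")
    return lines


print(count_up(6))  # break is never reached, so the else block runs
print(count_up(3))  # break fires at three, so the else block is skipped
```

The `else` on a loop is tied to the loop condition ending normally, not to the `if` inside the body, which is why the break case produces no final message.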
737:50 the next thing that I want to look at is
737:51 the continue statement if the continue
737:53 statement is triggered it basically
737:55 rejects all remaining statements in the
737:57 current iteration of the loop and then
737:59 we'll go to the next iteration now to
738:01 demonstrate this I'm going to change
738:02 this break into a continue so before
738:05 when we had the break if the number was
738:07 equal to three it would stop all the
738:08 code completely but when we change this
738:11 to continue which we'll do right now
738:14 what it's going to do is it's no longer
738:15 going to run through any of the
738:17 subsequent code in this block of code
738:19 it's just going to go straight up to the
738:20 beginning and restart our while loop so
738:23 what's going to happen when we run this
738:25 is it's going to come to three it's
738:27 going to become three it's going to
738:28 continue back into the while loop but
738:30 it's never going to have that number
738:32 changed to be added to one to continue
738:34 with the while loop this will basically
738:36 create an infinite loop let's try this
738:38 really quickly and as you can see it's
738:40 going to stay three forever eventually
738:43 this would time out but I'm just going
738:44 to stop the code really quick so if we
738:46 just change up the order in which we're
738:48 doing things we're going to say there
738:52 and we're going to put this down here
738:54 so what it's going to do now instead of
738:55 printing the number immediately and then
738:57 adding the number later we're going to
738:59 add the number right away and then we're
739:02 going to say if it is three we're going
739:03 to continue and it's going to print the
739:05 number so let's try executing this and
739:06 see what happens so as you can see we no
739:08 longer have the three in our output what
739:11 it did was when we got to the number
739:12 three it continued and didn't execute
739:15 this right here which prints off that
739:17 number so that really is the basics of
739:19 the while loop I hope that this was
739:20 helpful I hope that you learned
739:21 something in this video if you did be
739:23 sure to like and subscribe below and
739:25 I'll see you in the next
739:28 [Music]
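A sketch of the reordered continue example from the end of the while loop section, assuming the same `number` variable as before. The `printed` list is an illustrative addition. The infinite loop variant is described only in a comment, since running it would never finish.

```python
# Pitfall from the video: if continue sits before the increment, then once
# number reaches 3 the loop jumps back to the top without ever adding one,
# and the condition number < 5 stays true forever.

number = 0
printed = []  # illustrative only
while number < 5:
    number = number + 1  # increment first, so continue cannot skip it
    if number == 3:
        continue         # skip the rest of this iteration (the print)
    print(number)
    printed.append(number)
```

Because the increment now happens before the check, the loop still terminates, and three is the only value missing from the output.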
739:38 video hello everybody today we're going
739:40 to be taking a look at functions in
739:41 Python a function is a block of code
739:44 which is only run when you call it so
739:46 right here we're defining our function
739:48 and then this is our body of code that
739:50 when we actually call it is going to be
739:52 run so right here we have our function
739:54 call and all we're doing is putting the
739:56 function with the parentheses that is
739:58 basically us calling that function and
740:00 then we have our output throughout this
740:02 video I'm going to show you how to write
740:03 a function as well as pass arguments to
740:05 that function and then a few other
740:07 things like arbitrary arguments keyword
740:09 arguments and arbitrary keyword
740:11 arguments all of these things are really
740:12 important to know when you are using
740:14 functions so let's get started by
740:16 writing our very first function together
740:18 we're going to start off by saying def
740:20 that is the keyword for defining a
740:21 function then we can actually name our
740:24 function and for this one we're just
740:25 going to do first underscore function
740:28 and then we do an open parenthesis and
740:30 then we'll put a colon we'll hit enter
740:32 and it'll automatically indent for us
740:34 and this is where our body of code is
740:36 going to go now within our body of code
740:37 we can write just about anything and in
740:39 this video I'm not going to get super
740:40 advanced we're just going to walk
740:42 through the basics to make sure that you
740:43 understand how to use functions so for
740:45 right now all we're going to say is
740:47 print we'll do an open parenthesis we'll
740:49 do an apostrophe and we'll say we did it
740:53 and now we're going to hit shift enter
740:55 and this is not going to do anything at
740:56 least you won't see any output from this
740:59 if we want to see the output or we
741:00 actually want to run that function and
741:02 some functions don't have outputs but if
741:04 we want to run that function what we
741:06 have to do is just copy this and put it
741:08 right down here and now we're going to
741:10 actually call our function so let's go
741:12 ahead and click shift enter and now
741:14 we've successfully called our first
741:16 function this function is about as
741:18 simple as it could possibly be but now
741:20 let's take it up a notch and start
741:21 looking at arguments so let's go right
741:23 down here and we're going to say def
741:26 number underscore squared we'll do a
741:30 parenthesis and our colon as well now
741:32 really quickly when you're naming your
741:34 function it's kind of like naming a
741:35 variable you can use something like x or
741:37 y but I tend to like to be a little bit
741:39 more descriptive but now let's take a
741:41 look at passing an argument into a
741:43 function the argument is going to be
741:45 passed right here in the parentheses so
741:47 for us I'm just going to call it
741:49 number and then we're going to hit enter
741:52 and now we'll write our body of code and
741:53 all we're going to do for this is type
741:55 print and open parenthesis and we'll say
741:58 number and we'll do two stars at least
742:00 that's what I call it a star and a two
742:03 and what this is going to do is it's
742:04 going to take the number that we pass
742:06 into our function it's going to put it
742:08 right here in our body of code and then
742:10 for what we're doing it's going to put
742:11 it to the power of two and so when the
742:13 user or you run this and call this
742:16 function this number is something that
742:18 you can specify it's an argument that
742:20 you can input that will then be run in
742:22 this body of code so let's copy this
742:25 right here and then we'll put it right
742:28 down here into this next cell and we'll
742:29 say five and so this five is going to be
742:32 passed through into this function and be
742:34 called right here for this print
742:36 statement let's run it and it should
742:38 come out as I believe 25 that is my
742:40 fault I forgot to actually run this
742:42 block of code so I'm going to hit shift
742:44 enter so now we've defined our function
742:46 up here and now we can actually call it
742:48 so now we'll hit shift enter and we got
742:51 our output of 25 now in this
742:53 function we only passed one argument but
742:55 you can basically pass as many arguments
742:57 as you want you just have to separate
742:59 them by commas so let's copy this and
743:03 we'll put it right down here now we'll
743:05 say number underscore squared underscore custom and
743:09 then we'll do number and then we'll do
743:12 power so now we can specify our number
743:15 as well as the power that we want to
743:16 raise it to so instead of having two
743:18 which is what you call hardcoded we can
743:21 now customize that and we'll have power
743:23 and now when we call this function
743:25 we can specify the number and the power
743:27 and both of those will go into this body
743:29 of code and be run and we can customize
743:31 those numbers so let's copy
743:34 this and we'll
743:36 say 5 to the power of three and let's
743:41 make sure I ran this so let's do shift
743:43 enter and now we will call our function
743:46 and let's hit shift enter and we got 5
743:48 to the power of 3 which is 125 and just one
743:52 last thing to mention is if you have two
743:54 arguments within your function and you
743:56 are calling it right here you have to
743:58 pass in two arguments you can't just
744:00 have one so if we have a five right here
744:02 it's going to error out we have to
744:04 specify both arguments for it to work
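A minimal sketch of the three functions built so far, using the names spoken in the video (`first_function`, `number_squared`, `number_squared_custom`). The final `try`/`except` block and the `missing_argument_error` variable are additions for illustration, so the missing-argument error can be shown without stopping the script; in the notebook the call simply errors out.

```python
def first_function():
    print("we did it")


def number_squared(number):
    print(number ** 2)  # ** raises number to a power; 2 is hardcoded here


def number_squared_custom(number, power):
    print(number ** power)  # both values are supplied by the caller


first_function()             # defining a function prints nothing; calling it does
number_squared(5)
number_squared_custom(5, 3)

# Calling a two-parameter function with one argument raises a TypeError.
missing_argument_error = None  # illustrative only
try:
    number_squared_custom(5)
except TypeError as error:
    missing_argument_error = error
    print("TypeError:", error)
```

Note that a function has to be defined (its cell run) before it can be called, which is the slip the video corrects along the way.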
744:08 now let's take a look at arbitrary
744:10 arguments now arbitrary arguments are
744:13 really interesting because if you don't
744:15 know how many arguments you want to pass
744:16 through if you don't know if it's a one
744:18 a two or a three you can specify that
744:20 later when you're calling the function
744:22 so you don't have to do it upfront and
744:24 know that information ahead of time so
744:26 let's define our function so we're going
744:28 to say def and then we're going to
744:29 say number underscore args and we'll do
744:33 an open parenthesis and a colon now
744:36 within our argument right here typically
744:38 we would just specify here's what our
744:40 argument will be it will be number or it
744:42 will be a word right but what we're
744:44 going to do is something called an
744:45 arbitrary argument so it's unknown so
744:47 we're going to put star and then we'll
744:49 say args now you will see something
744:52 exactly like this typically if you're
744:53 looking at tutorials that'll have star
744:55 args in there or if you're looking at
744:57 just a generic piece of code this is
744:59 what it will look like but for us we're
745:02 going to actually put number so again we
745:04 have the star and then we have our
745:06 arbitrary argument right here and then
745:10 we'll hit enter and we're going to say print open parentheses and this is where
745:13 print open parentheses and this is where it's going to get a little bit different
745:15 so we're going to say number and then
745:16 we're going to do an open bracket and
745:18 let's say zero and then we'll do that
745:21 times and then we'll say number again
745:24 with a bracket of one so in a little bit
745:26 once we run this and then we call this
745:28 number_args function right here we're
745:30 going to need to specify the number zero
745:33 and the number one that's going to be
745:34 called so let's go ahead and run this
745:37 and then we are going to call it and
745:41 let's say 5 comma 6 comma 1 comma 2 comma 8 so right
745:46 up here we did not know how many
745:48 arguments we were going to pass through
745:50 it could be five it could be a thousand
745:53 we could also call in a tuple and that's
745:55 what this is right here we're calling in
745:56 a tuple so what it's going to do now is
745:59 when it calls this number it's going to
746:00 call index zero within that tuple
746:02 which will be that five and then it'll
746:04 also call in this number which will be
746:06 index one which is the six so
746:09 let's hit shift enter and it's going to
746:11 multiply these numbers together so 5 * 6
746:14 is equal to 30
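Here is a minimal sketch of the arbitrary-argument function described above. It assumes the function returns the product instead of printing it, purely so the result is easy to check; the name `number_args` and the values come from the walkthrough.

```python
# Sketch of an arbitrary-argument (*args) function.
# Assumption: we return the product rather than print it inside the function.
def number_args(*number):
    # Inside the function, `number` is a tuple holding every positional argument.
    return number[0] * number[1]

# Five values are passed, but only index 0 and index 1 are used.
result = number_args(5, 6, 1, 2, 8)
print(result)  # 30
```

Extra values past index one are simply ignored here, which is why the call works no matter how many numbers follow the first two.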
746:16 now like I just said this is a tuple so we don't actually have to
746:18 write out these numbers like we just did
746:20 we can pass through a tuple when we are
746:22 actually calling this function let's do
746:25 that right up here let's just create
746:27 let's call it args_tuple and we'll do
746:32 open parentheses and we'll do the same
746:34 numbers let's just copy it make it
746:37 easier and now we've created this tuple
746:40 right here which we can then pass in and
746:43 this is a lot more handy a lot more
746:45 specific and this is most likely how
746:46 someone would do something like this but
746:48 let's now create this and now we can
746:52 copy args_tuple and pass it through now
746:56 really quickly this is going to fail and
746:58 I'm doing that on purpose but I want to
746:59 show you what you need to do in order to
747:01 pass through this
747:02 tuple so right now it's going to say
747:04 tuple index out of range all you have
747:07 to do in order to use this is you have
747:09 to specify a star before it just like
747:12 you did when you're creating your
747:13 argument up here you have to put a star
747:15 in front of our tuple that we just
747:17 passed through and now let's try running
747:19 this and now it works properly
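The failure and the fix described above can be sketched like this. It is a sketch under the assumption that `number_args` returns its product (the walkthrough prints it), and the `try`/`except` is only there so both cases can run in one cell.

```python
# Same *args function as in the walkthrough, returning instead of printing.
def number_args(*number):
    return number[0] * number[1]

args_tuple = (5, 6, 1, 2, 8)

# Without the star, the whole tuple arrives as ONE argument,
# so number == ((5, 6, 1, 2, 8),) and number[1] does not exist.
try:
    number_args(args_tuple)
except IndexError as error:
    failure = str(error)

# With the star, the tuple is unpacked into separate positional arguments.
product = number_args(*args_tuple)

print(failure)  # tuple index out of range
print(product)  # 30
```

The star at the call site mirrors the star in the function definition: one collects arguments into a tuple, the other spreads a tuple back out into arguments.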
747:22 now the last two things that we're going to look
747:23 at are keyword arguments and arbitrary
747:25 keyword arguments there are more things
747:27 that you can learn and do within
747:28 functions but again I'm just trying to
747:30 teach you the basics to make sure that
747:32 you understand how they work so let's go
747:34 right up here and a keyword argument is
747:36 kind of similar to this right here and
747:39 let's actually copy this and put it
747:41 right down here now a keyword argument
747:44 is very similar in that you're going to
747:46 specify your arguments right here but
747:49 what we did up here let me bring this
747:51 down
747:53 when we actually called the function
747:55 what we did was we just put in a five
747:57 and a three and when we did that it
748:00 automatically assigned number to five
748:02 and power to three and that's totally
748:04 fine and you can do that but if you want
748:06 a little bit more control you can use a
748:08 keyword argument so right here we could
748:11 say power is equal to five and number is
748:18 equal to three so I just switched it
748:19 around right number was assigned to five
748:21 and power was assigned to three but I
748:24 just switched it to show you how this
748:25 might work so let's run both of these
748:28 and now it's 3 to the power of 5 which is
748:31 243 so that essentially is a keyword
748:33 argument again it just gives you a
748:35 little bit more control you don't have
748:37 to put them in specific positions like
748:39 you do when you're just passing
748:40 positional arguments
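To make the positional versus keyword difference concrete, here is a small sketch. The function name `number_power` is a hypothetical stand-in, since the original function was defined earlier in the video; the parameter names `number` and `power` are the ones used above.

```python
# Hypothetical stand-in for the two-parameter function from earlier in the video.
def number_power(number, power):
    return number ** power

# Positional call: values are assigned by position, so number=5 and power=3.
positional = number_power(5, 3)

# Keyword call: order no longer matters, each value is assigned by name.
keyword = number_power(power=5, number=3)

print(positional)  # 125
print(keyword)     # 243
```

Keyword arguments tend to make calls easier to read when a function has several parameters of the same type.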
748:42 now let's come right down here we're going to create basically another
748:44 custom function so for this one we're
748:46 going to write def
748:50 number_kwarg and then we'll do an open
748:52 parenthesis a colon and enter and what
748:56 this one takes is an
748:58 arbitrary keyword
749:00 argument now to specify an arbitrary
749:03 argument all we did was a star and then
749:06 we input number but if we're doing an arbitrary
749:08 keyword argument we actually have to
749:10 have two stars right here so let's start
749:13 taking a look and again if you're doing
749:15 arbitrary it means we don't really know
749:17 how many keyword arguments we want to
749:19 pass into our function so we're just
749:21 going to put star star number and then
749:23 later within our body of code and when
749:24 we're calling it we'll be able to
749:26 specify it and just like the arbitrary
749:28 argument before the arbitrary keyword
749:31 argument means we really just don't know
749:33 how many keyword arguments we're going
749:34 to need to pass into our function so to
749:36 demonstrate this let's write print do an
749:39 open parenthesis and we'll say my oops
749:42 need to do an
749:43 apostrophe my number is we'll do just
749:48 like that little space and we'll say
749:50 plus and this is kind of where it gets a
749:52 little interesting or a little bit more
749:54 tricky so we're going to say number
749:56 so this is us calling our number and
749:58 then we're going to do a bracket and
750:01 then I'm actually going to go to calling
750:03 the function it's a little bit backward
750:05 or a little bit different than what you
750:07 might think but when we're calling it
750:09 what I'm going to do is I'm going to say
750:11 integer is equal to let's just do some
750:14 random number now when we're calling
750:16 that keyword within our body of code
750:19 what we're going to do is we're going to
750:20 actually type out integer in quotes just like this
750:23 and this looks a little bit different
750:26 but what this allows us to do is we can
750:28 put as many keyword arguments in here as
750:30 we want later and I'll show you in just
750:31 a second but for us we're just creating
750:33 this key and this value when we are
750:36 calling the function so now
750:38 when we create this and we run
750:41 this oh whoops I forgot this has to be a
750:44 string so let's run this
750:47 again now we will say my number is
750:51 2309 then we're going to add we'll
750:53 say plus and this isn't going to look
750:55 great but we'll say my other number
750:58 because this will all be in the same
750:59 line that's okay my other number and
751:02 then we'll say number and we can specify
751:05 again what we want in there so now we
751:08 can go down here to where we're calling
751:10 it we'll put a comma and we'll say
751:13 integer oops
751:16 integer 2 is equal to we'll do a random
751:20 number and then we'll put in integer 2 right
751:23 here and then we'll add plus right here
751:26 so we don't error out we'll create this
751:28 we'll run this and as you can see both
751:31 numbers were passed through again the
751:33 formatting is terrible but now you can see
751:35 that you have this arbitrary keyword
751:37 argument right here and all we have to
751:39 do is put star star number and we can pass
751:42 through as many of these arbitrary
751:44 keyword arguments as we want as long as
751:46 we just specify them by name when
751:48 we're calling it
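A sketch of the arbitrary keyword argument example might look like this. The value `'2309'` is from the walkthrough; the second value `'4345'` is made up here because the video only says "a random number", and the function returns the text instead of printing it so it can be checked.

```python
# Sketch of an arbitrary keyword argument (**kwargs) function.
def number_kwarg(**number):
    # `number` is a dictionary: each keyword used in the call becomes a key.
    # The values are strings so they can be joined with +.
    return ('My number is: ' + number['integer']
            + ' My other number: ' + number['integer2'])

# '4345' is a placeholder value, not taken from the video.
message = number_kwarg(integer='2309', integer2='4345')
print(message)  # My number is: 2309 My other number: 4345
```

Passing the numbers as integers would raise a `TypeError` on the `+`, which is the "this has to be a string" slip mentioned above.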
751:50 so that's all we're going to look at in today's video on
751:51 functions there are of course other
751:53 things that you can do within functions
751:54 and it can get a little bit more
751:55 advanced but I wanted to show you the
751:57 basics the meat and potatoes of things I
751:59 definitely think you should know in
752:00 order to get started using functions I
752:02 hope that you were able to understand
752:04 functions better because of this video
752:06 if you did be sure to like and subscribe
752:07 below and I will see you in the next
752:11 [Music]
752:21 video hello everybody today we're
752:23 going to be talking about converting
752:24 data types in Python in this video I'm
752:26 going to show you how to convert several
752:28 different data types including strings
752:30 numbers sets tuples and even dictionaries
752:33 so let's start off by creating a
752:34 variable we'll say num_int is equal to
752:37 7 and we can check that data type by
752:40 saying type and then inserting our
752:43 variable num_int and that will
752:46 tell us that our data type for this
752:48 variable is an integer let's go ahead
752:50 and create another one we're going to
752:51 say num_string is equal to
752:54 and for this one we'll also do a seven in quotes
752:56 but let's check the type and we'll do an
752:59 open parentheses and we'll say the type
753:01 of num_string and that one is a string
753:04 now let's say we wanted to add those
753:06 we'll say num_sum so the sum of
753:10 num_int plus num_string now when
753:16 we're adding these two values it is not
753:18 going to work it's going to give us an
753:19 error and it's going to say unsupported
753:21 operand types for int and str so it cannot
753:24 add an integer and a string together what we
753:27 need to do in order to add these two
753:28 numbers is to convert that string into
753:31 an integer so let's go right up here
753:34 let's add another cell and let's say
753:38 num_string_converted is equal to
753:43 and we want to convert it into an
753:44 integer so all we have to do to convert
753:46 it into an integer is type int and then
753:50 we're going to say num_string
753:53 and that is as easy as it's going to get
753:56 all we have to do is say int with
753:58 our num_string inside of it and then
754:01 it's going to convert it and we can even
754:03 check it right after by saying type of
754:06 num_string_converted and let's run this and
754:08 now we can see that it was converted
754:10 into an integer so now let's add that
754:13 num_string_converted right
754:15 here let's copy and replace that string
754:18 with the string
754:20 converted and let's actually print out
754:23 that num_sum and it worked properly
754:28 now we did not specify what type of
754:30 value this num_sum was going to be but
754:33 because it was two integers in here it's
754:36 going to automatically apply that data
754:38 type of integer to that num_sum
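The string-to-integer conversion above can be sketched as follows. The variable names follow the walkthrough; the `try`/`except` is an addition so the failing line and the fixed line can sit in the same cell.

```python
num_int = 7
num_string = '7'

# Adding an int and a str directly raises a TypeError.
try:
    num_sum = num_int + num_string
except TypeError as error:
    message = str(error)

# int() builds an integer from the numeric string.
num_string_converted = int(num_string)
num_sum = num_int + num_string_converted

print(message)        # unsupported operand type(s) for +: 'int' and 'str'
print(type(num_sum))  # <class 'int'>
print(num_sum)        # 14
```

Note that `int()` only works on strings that look like whole numbers; something like `int('seven')` raises a `ValueError`.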
754:41 let's go right down here and now let's look at
754:43 how we can convert lists sets and tuples
754:46 so now let's say we have a list_type
754:50 and that's equal to 1 2 3 and we can
754:53 check it again by saying
754:56 type and that is a list let's say we
754:59 want to convert it to a tuple it's
755:01 fairly easy all we're going to do is
755:03 write tuple and say list_type and the
755:07 result is now going to be a tuple
755:10 and we can check that by saying type and
755:13 wrapping it around this tuple call and it
755:16 shows us that it is converting that list
755:18 into a tuple now we can also convert a
755:21 list into a set but it may change the
755:25 actual values within it let's check that
755:28 out really quickly so let's say we have
755:30 this list and let's add a few more
755:32 values to this just like that now let's
755:36 say we want to convert it to a set so
755:38 we're going to run this and we'll say
755:41 set of list_type and let's try running
755:46 this and see what the output is so this
755:47 is something that you really need to be
755:49 aware of when you are converting data
755:51 types because a set does not act the same
755:53 as a list a set is basically going to
755:55 take the unique values in the list and
755:57 convert them to a set and it drops
756:00 the duplicates and the ordering from that
756:01 original list and just to check the data
756:04 type we can say
756:05 type I'm just doing this for all of them
756:08 and as you can see that is now a set
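Here is a short sketch of the list conversions described above. The extra values added to the list are not spelled out in the video, so the duplicates below are an assumption chosen to make the effect of `set()` visible.

```python
# Assumed values: the video starts with 1, 2, 3 and then "adds a few more".
list_type = [1, 2, 3, 3, 2, 1]

# tuple() and set() return NEW objects; list_type itself is unchanged.
as_tuple = tuple(list_type)
as_set = set(list_type)

print(type(as_tuple), as_tuple)  # <class 'tuple'> (1, 2, 3, 3, 2, 1)
print(type(as_set), as_set)      # <class 'set'> {1, 2, 3}
print(type(list_type))           # <class 'list'>
```

Because a set keeps only unique values and has no guaranteed order, converting a list to a set and back is a common way to deduplicate, but it is not a lossless round trip.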
756:10 now let's go down here and take a look at
756:12 dictionaries now let's say we have a
756:15 dictionary called dictionary_type and
756:18 we'll do a squiggly bracket and we'll
756:21 say name and we'll do a colon and
756:24 we'll say Alex then we'll do age and a
756:29 colon and we'll say
756:31 28 and then we'll do
756:36 hair
756:37 colon and so really quickly let's take
756:40 that dictionary_type and just confirm
756:43 that it is a dictionary and it is and
756:46 now what we're going to do is take a
756:48 look at all of the items within that
756:50 dictionary so we're going to do
756:51 dictionary_type dot items open parenthesis
756:56 and this is going to show us all the
756:57 items within it now we can also take
757:00 this and look at something like the
757:04 values and when we run that these are
757:06 our values so within our dictionary we
757:09 have items and that's what this is right
757:10 here this is one item and then within
757:13 that we have our values which are right
757:15 here so Alex 28 and N/A and then we have
757:19 something called a key and this is the
757:21 key the name age and hair are all keys
757:25 and we can look at those by saying dot
757:28 keys so let's say we want to take all of
757:31 the keys and put them into a list what
757:33 we're going to do is we're going to take
757:34 this right here say
757:36 list we'll do an open parenthesis we'll
757:39 type that in right there so it says a
757:41 list and we're converting these keys
757:43 into a list and let's run that and now
757:46 this is a list and let's just check the
757:49 type as well just to confirm
757:52 and as you can see it was converted
757:54 properly into a list and we can do the
757:57 exact same thing with
758:00 values and the values can also be
758:03 converted into a list
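The dictionary conversions can be sketched like this. The variable name `dictionary_type` is a best guess at the spoken name, and the `'N/A'` hair value is taken from the values read out above.

```python
dictionary_type = {'name': 'Alex', 'age': 28, 'hair': 'N/A'}

# .items(), .keys() and .values() return view objects, not lists.
print(type(dictionary_type.keys()))  # <class 'dict_keys'>

# Wrapping a view in list() converts it into a regular list.
items_list = list(dictionary_type.items())
keys_list = list(dictionary_type.keys())
values_list = list(dictionary_type.values())

print(items_list)   # [('name', 'Alex'), ('age', 28), ('hair', 'N/A')]
print(keys_list)    # ['name', 'age', 'hair']
print(values_list)  # ['Alex', 28, 'N/A']
```

Each item is a `(key, value)` tuple, so the items list is a list of tuples.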
758:05 now we can also convert longer strings that aren't just
758:07 numbers like we did above in our very
758:08 first example so let's do long_string
758:12 and we'll say I like to party now
758:16 we're going to take this string and
758:18 we're going to say list of long_string so
758:22 we're going to convert this string into
758:24 a list and let's see what happens so it
758:27 took every single character in that
758:28 string and put it into a list and we
758:31 could also do a set as well that one's a
758:33 lot shorter because it's only looking at
758:35 unique values
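As a quick sketch of the string conversions just described, using the same example sentence:

```python
long_string = 'I like to party'

# list() splits a string into its individual characters, spaces included.
characters = list(long_string)

# set() keeps one copy of each distinct character, in no particular order.
unique_characters = set(long_string)

print(characters[:6])          # ['I', ' ', 'l', 'i', 'k', 'e']
print(len(characters))         # 15
print(len(unique_characters))  # 12
```

Uppercase `I` and lowercase `i` count as different characters, which is why both survive in the set.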
758:37 so that is how you convert data types in Python thank you guys so
758:39 much for watching I really appreciate it
758:41 if you liked this video be sure to like
758:43 and subscribe below and I'll see you in
758:44 the next
758:46 [Music]
758:52 video [Music]
758:57 hello everybody today we're going to be
758:59 working on building a BMI calculator in
759:01 Python now before we get started I want
759:03 to show you this BMI calculator that I
759:05 found online and it shows you the basic
759:07 calculation that they use and that's the
759:09 one we're going to use in this video and
759:11 they also have this calculator right
759:12 down here and some ranges that we can
759:14 use for our calculator as well so for
759:17 reference I weigh about
759:19 170 pounds I'm about 5 foot 9 let's calculate this
759:23 so I'm about a
759:25 25.1 BMI which falls into the overweight
759:29 category that's unfortunate but we can
759:31 see exactly how this works and how ours
759:34 should work when we actually build it so
759:36 we're going to kind of reference this
759:37 throughout the video so let's go right
759:39 over here to our BMI calculator we need
759:42 to collect weight and height and then
759:45 run this calculation right here so let's
759:47 go ahead and copy
759:48 this and we're going to put it right
759:51 down here
759:53 and so now we have our calculation
759:56 so what we need is we need input from a
759:59 user and there is an input function
760:02 within Python that we're going to be
760:03 using so let me actually add a few
760:06 more cells so the first thing that we
760:07 need to collect is their weight let's
760:10 type out weight right here we'll say
760:11 weight is equal to and this is where
760:13 we'll use our input function so we'll
760:15 say input and when we actually run this
760:17 it's just going to give us this blank
760:18 box where a user can input something
760:21 we'll say Alex so our output is
760:24 what the user actually input and it does
760:27 save it to this variable so if we say
760:29 print weight it will still print out
760:32 Alex now this is where we want the user
760:35 to input their weight just like we did before
760:37 so we want to kind of
760:39 give them a prompt for this we'll put a
760:42 string in here so I'll do a double quote
760:45 and then I'll say
760:47 enter your weight in and we're using
760:51 pounds
760:52 so I'll say pounds colon space so now when we do
760:56 this it'll say enter your weight in
760:58 pounds I'll say 170 and then when we run
761:01 this it does store that now let's do
761:03 print I should have saved it weight again
761:06 oops now it's only storing the value of
761:09 170 it's not actually storing this prompt
761:11 string right here so that's really
761:13 important for when we do our
761:14 calculations
761:15 later I'm going to save
761:18 this right down here because I'm sure
761:19 this right down here because I'm sure I'm going to use that later um so we
761:21 I'm going to use that later um so we have that it's working now we need to
761:24 have that it's working now we need to also do our height so let's copy this
761:28 also do our height so let's copy this and we'll put it right here and we'll do
761:31 and we'll put it right here and we'll do height and enter your height in inches
761:35 height and enter your height in inches so now for this one if we hit
761:38 so now for this one if we hit enter it's actually running let's stop
761:40 enter it's actually running let's stop it really quick and interrupt it let's
761:43 it really quick and interrupt it let's try running this so it's going to say
761:45 try running this so it's going to say enter your weight and pounds that's the
761:46 enter your weight and pounds that's the first input say
761:49 first input say 170 and then when I hit enter it's going
761:52 170 and then when I hit enter it's going to prompt me for that second input and
761:54 to prompt me for that second input and so in inches 59 is 69 in and then I can
762:00 so in inches 59 is 69 in and then I can hit enter again and now we have both of
762:03 hit enter again and now we have both of our inputs now we need this calculation
762:06 our inputs now we need this calculation right down here and just like that so
762:10 right down here and just like that so now we have weight in pounds time 703
762:14 now we have weight in pounds time 703 divided by height in inches by height in
762:16 divided by height in inches by height in inches so we actually have weight and
762:19 inches so we actually have weight and it's already written in there but I'm
762:20 it's already written in there but I'm just going to do like this we'll do
762:22 just going to do like this we'll do weight time 73 so that's pounds there
762:25 weight time 73 so that's pounds there our weight and pounds * 703 divided by
762:28 our weight and pounds * 703 divided by now we have our height in
762:31 now we have our height in inches times the height in inches so
762:34 inches times the height in inches so this is our calculation right here so
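The calculation described here is the imperial BMI formula: weight in pounds times 703, divided by height in inches times height in inches. A minimal sketch of it as a function follows; the name `bmi_imperial` is made up for illustration, since the video writes the expression inline in a notebook cell.

```python
def bmi_imperial(weight_lb, height_in):
    """Return BMI from a weight in pounds and a height in inches."""
    # 703 rescales pounds per square inch to the metric kg per square meter
    return weight_lb * 703 / (height_in * height_in)


# Same numbers used in the video: 170 lb and 69 in
bmi = bmi_imperial(170, 69)
print(round(bmi, 1))
```

Both multiplications need an explicit `*`; writing the word "times" or leaving the operator out is a syntax error.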
762:37 let's do this exact same thing let's run
762:39 this and this times of course is not
762:42 going to work whoops we need to do our
762:44 star for both of these all right now
762:48 this is our calculation so let's run
762:49 this so we have
762:52 170 and that's pounds and inches was 69
762:56 hit
762:57 enter and it says can't multiply
763:00 sequence by non-int of type str
763:02 ah that's because these are being stored
763:04 as strings so right down here I'll go and
763:08 we'll do type of height we run that this
763:12 is actually a string so we want to
763:15 change that because we don't need that
763:17 anymore
763:19 that so we don't want it to be a string
763:22 we need those to be integers or floats
763:24 or really anything besides a string it
763:26 just needs to be numerical uh so integer
763:28 float really so let's do integer and
763:30 we'll wrap that input in it and we'll do
763:33 the same thing for this
763:36 one now we have an integer for our
763:38 weight an integer for our height so now
763:41 when we're running this calculation it
763:42 should work properly let's run this
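The error above comes from `input()` always returning a string, even when the user types digits. The sketch below reproduces it with two string literals standing in for what `input()` would return (the variable names are illustrative), then applies the same fix as the video by wrapping the values in `int()`. In the notebook itself that would look like `int(input(...))`.

```python
# Stand-ins for what input() hands back: always text, never a number
weight_text = "170"
height_text = "69"
print(type(height_text))

try:
    # str * int repeats the string, but str * str is not defined
    bmi = weight_text * 703 / (height_text * height_text)
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# Convert first, then calculate; float() would also accept values like 69.5
weight = int(weight_text)
height = int(height_text)
bmi = weight * 703 / (height * height)
print(round(bmi, 1))
```

`int()` raises a `ValueError` on text such as "sixty nine", so a fuller version would want to handle that case too.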
763:44 again our pounds are
763:47 170 our height is 69 in
763:52 and it's not giving us our output
763:54 because we're not printing anything okay
763:56 so I just need to
763:58 do
763:59 print BMI so let's try this again 170
764:05 69 and there is our BMI 25.1 so it
764:08 worked the exact same as this one so
764:11 they input well we input our height we
764:14 inputted our or we inputted our weight we
764:16 inputted our height and then it
764:17 calculated our BMI the next thing that we
764:20 need to do is we need to kind of give
764:22 the user some context is that good is
764:25 their BMI within a good range a bad
764:28 range we don't know uh so let's go ahead
764:30 and I'm going to see if I can copy this
764:33 I don't know if this will work or
764:34 not let's go ahead and copy this right
764:36 down here perfect so what we now need to
764:40 do is we need to say okay if the user
764:43 has given us this input we want to give
764:45 them or tell them if they are a normal
764:48 weight overweight obese severely obese
764:52 anything like that and we have these
764:53 ranges so that should help us out quite
764:55 a bit so let's just write our if
764:57 statement and then we'll include it up
764:59 here but let's go down here and we'll
765:02 say if and then we'll do BMI and let's
765:05 just say BMI is greater than zero so if
765:11 it's greater than zero if they had any
765:12 input where the BMI was not zero which
765:15 should be every time if they do it
765:16 properly and they don't you know put a
765:18 string in there or something or type out
765:20 40 which maybe we should make a prompt
765:22 for that if that happens then we can say
765:25 if we'll do
765:27 BMI and now we need to give that first
765:30 range so this range right here so if
765:31 it's under 18.5 so we need to do a less
765:35 than so if it's less than
765:39 18.5 and it just says under it doesn't
765:41 say under or equal to so I'll keep it at
765:43 18.5 so if it's under
765:46 18.5 then let's give kind of the output
765:49 we'll say print
765:51 and the output or the basically the
765:54 prompt is underweight so we'll just say
765:57 you are
766:00 under lowercase underweight and just
766:04 like that um then we're going to pass
766:09 several elif statements through here
766:11 well let's just say else so I guess this
766:14 would be like if they are if they don't
766:18 input something properly if something
766:20 messes up
766:21 maybe we could write something like um
766:24 print
766:25 oops I'm thinking all this through we
766:28 can write print
766:30 enter valid
766:33 inputs or something like this or we can
766:36 always change that but let's really
766:38 quickly let's run
766:40 this okay so I'm not in that range uh
766:43 let's make the next one so then I can be
766:47 within a certain range
766:48 oops and we need we should need one one
766:51 more a minimum so we'll say
766:54 elif and
766:56 elif these next two are this 24.9 so it's
767:01 going to check this one first so if it's
767:03 18.5 or below 18.5 it's automatically
767:07 going to print this one so this next one
767:09 we don't have to do like a range or
767:11 anything we can just say if it's below
767:15 if it's between 25 and 29.9 so this one
767:18 actually should be less than or equal to
767:21 um this one is normal oh whoops
767:26 24.9 so this one is
767:29 24.9 this one is going to say you are
767:32 normal weight so let's run this
767:36 now let's see BMI was
767:40 25.1 oh guys I'm just messing up here I
767:43 apologize all right this is the one that
767:46 I was part of so now it's going to be
767:48 I'm part of the overweight crowd now now
767:50 let's run this and now our prompt is you
767:52 are overweight cuz remember the BMI was
767:55 saved right here as
767:57 25.1 down here if we run through this
768:01 it's saying no you're not in
768:04 oops get rid of that no you're not in
768:07 under 18.5 you're not under
768:10 24.9 if you're under
768:12 29.9 you are overweight so that did work
768:15 properly so that's really good and I
768:17 don't think I want this to be our output
768:20 for the person because we're going to add
768:22 this up here it's just going to give us
768:23 the BMI and then the output is going to
768:26 say you are overweight uh let's make it
768:27 a little bit more customized um I'm
768:30 going to say name is equal to input and
768:34 then we'll say
768:36 enter your
768:39 name um so it'll be enter your name
768:41 we'll do Alex
768:44 170 69 there's our BMI now it's going to
768:48 run through this logic or it will run
768:49 through this logic in just a
768:51 second
768:53 when we actually finish this so then we
768:56 have
769:03 34.9 and let's do one more oops and then this one's going to
769:06 be for
769:13 39.9 so this one was overweight this one is
769:15 obese severely obese so we'll say
769:18 severely is that how you spell it severely obese
769:21 and then anything that's over that 40
769:23 and over so if it's not this one
769:26 anything else should be morbidly obese
769:30 so actually this else statement right
769:31 here should
769:33 say uh you
769:40 are you are severely obese this is going to say morbidly morbidly obese now I
769:45 added that name up here because I wanted
769:48 to add that down below actually so we're
769:50 going to say uh name plus and then
769:55 we'll do like
769:59 comma you are underweight so it'll be a
770:01 little bit more personalized uh I think
770:04 it'll I think it'll be a nice touch I
770:06 really do we'll do it like this and
770:08 we'll say you and let's go back and do
770:10 that to all of
770:12 them and let me see how quickly I can do
770:20 this oh whoops what did I do get rid of that
770:21 name plus you like that geez you guys are
770:27 seeing me mess up ah name plus you and
770:33 then name plus you so now let's run this
770:38 and now it's a little more personalized
770:39 it says Alex you are overweight so this
770:42 is all really good now this is an if
770:45 statement um what we had done before I
770:48 think is actually what we should put
770:49 right down here so we'll say else and
770:51 then if that doesn't work we'll say what
770:54 did we say enter valid input we'll just
770:57 put that um and let me see if I can
770:59 test this out I don't know if this
771:03 will error out or if this will even
771:06 work let me just see if I can mess with
771:08 it and see if I can get it to work
771:10 actually let's copy this we're going to
771:13 copy this whole thing we're going to
771:15 include it right
771:17 here and now we have basically our
771:20 entire calculator so um let's run this
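The if/elif chain built up over the last few minutes can be sketched as one function. Because the branches are checked in order, each `elif` only needs an upper bound: a value that reaches `bmi <= 29.9` has already failed `bmi <= 24.9`. The function name `bmi_message` and the exact wording of the strings are assumptions for illustration; the cutoffs and category names are the ones read out in the video.

```python
def bmi_message(name, bmi):
    """Return a personalized message for a BMI value."""
    if bmi > 0:
        # Checked top to bottom, so only the upper bound is needed each time
        if bmi < 18.5:
            return name + ", you are underweight."
        elif bmi <= 24.9:
            return name + ", you are normal weight."
        elif bmi <= 29.9:
            return name + ", you are overweight."
        elif bmi <= 34.9:
            return name + ", you are obese."
        elif bmi <= 39.9:
            return name + ", you are severely obese."
        else:
            return name + ", you are morbidly obese."
    else:
        # Reached only when the BMI is zero or negative
        return "Enter valid input."


print(bmi_message("Alex", 25.1))
```

Note that the outer `else` only catches a BMI of zero or below; text that is not a number fails earlier, at the `int()` conversion.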
771:24 enter your name we'll say Alex enter
771:28 your pounds 170 enter your inches
771:31 69 and then it's going to say
771:34 25.1 Alex you are overweight and that's
771:37 perfect we could even go as far as
771:39 adding like some feedback we say you are
771:42 overweight and then it would be a period
771:45 and we could say um you need to exercise
771:49 more
771:50 stop sitting and writing so many python
771:55 tutorials so now if we run this we'll do
771:59 Alex
772:02 170 69 it says Alex you are overweight
772:04 you need to exercise more and stop
772:05 sitting and writing so many python
772:09 tutorials period and that's it this is
772:13 the entire project um you can go a ton
772:17 farther you can include much more
772:19 complex logic you could even build out a
772:22 UI to create your own you know app just
772:25 like this where it has this input and
772:27 this UI you can build that out within
772:29 jupyter notebooks with python um but
772:32 that's not really what this tutorial is
772:34 for this is just to kind of help you um
772:36 think through some of the logic of
772:38 creating something like this so you know
772:40 I hope that this was helpful I hope that
772:42 this was fun I like creating stuff like
772:43 this we have two other projects that
772:45 we're going to do and maybe I'll include
772:46 more but we have two right now that I
772:48 have planned um and I hope those those
772:50 are helpful this is probably our easiest
772:52 one and they'll get a little bit more
772:54 difficult in the next projects so I hope
772:57 that this was fun I hope that this was
772:58 helpful and that you can now kind of
773:00 utilize those python skills that you've
773:02 been working on if you like this video
773:04 be sure to like and subscribe below and
773:06 I'll see you in the next
773:09 [Music]
773:18 video hello everybody today we're going
773:21 to be creating an automatic file sorter
773:23 for your files in file explorer now out
773:25 of all the projects that we've done in
773:26 this series so far I think this one
773:27 might be the most difficult but I also
773:29 think this one is the most cool because
773:31 it has some real life applications so
773:33 without further ado let's take a look at
773:35 some files that we have right down here
773:37 in my file explorer so I have this
773:39 beautiful picture of Rosie uh right here
773:42 this is a PNG file I have a CSV file and
773:45 a text file and I want to sort all of
773:48 them into their own folders depending on
773:51 what kind of file it is so if I go right
773:53 in here and I click on this one I go to
773:55 properties I can see that this is a PNG
773:58 file um if I go into this one I don't
774:00 need to but if I go into this one it's a
774:02 CSV file and of course this one is a
774:05 text file so I want three separate
774:08 folders in here and I want them to
774:10 automatically go into those folders
774:12 without me having to drag and drop and
774:14 going and clicking now we only have four
774:17 files here but imagine if we have
774:19 thousands of files
774:20 how much time that could save us so
774:23 let's get out of here and let's start
774:25 writing our code so we're going to say
774:28 import os comma and then we're going to
774:32 say shutil now os obviously stands for
774:36 operating system shutil uh I don't know
774:39 what it's actually supposed to stand for
774:40 but what it will allow us to do is do
774:42 some high-level operations on our files
774:45 in file explorer so we're going to go
774:47 ahead and import those and now that we
774:49 have those imported
774:50 uh something that's going to be very
774:52 important for us to have throughout this
774:53 whole thing and this is anytime I'm
774:55 working with like directories or
774:57 something like this we want to get this
774:58 path down so I'm going to go ahead and
775:00 copy this
775:02 path and we're just going to say path is
775:04 equal to and we'll do this right here so
775:08 let's run this and I need to put an r
775:11 right here to make this a raw text um so
775:14 when you don't have the r uh it's going
775:16 to read in these you know these
775:17 backslashes and these colons and
775:19 different stuff if we do r it's just
775:21 going to read it in as the raw string
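A short sketch of what the `r` prefix changes. The path below is hypothetical (the real one from the video is not shown in the transcript) and was picked so that it contains `\t` and `\n`, two sequences Python treats as escape codes in an ordinary string.

```python
# Hypothetical Windows-style path, not the one used in the video
plain = "C:\temp\new"   # \t turns into a tab, \n turns into a newline
raw = r"C:\temp\new"    # the r prefix keeps every backslash as typed

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))
```

Some paths fail outright without the prefix: in an ordinary string, a path containing `\Users` is a syntax error, because `\U` is read as the start of a Unicode escape.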
775:22 and that's what we want so here's what
775:25 we need to do there's a few
775:26 different things that have to happen
775:28 when we are writing this out one thing
775:30 is we need to go in here and we need
775:32 to see this path and we need to see are
775:33 there folders in here already um if not
775:36 we need to create a folder so that's one
775:39 of the first things that we need to do
775:41 the next thing that we need is it needs
775:43 to check each of these files
775:45 individually identify what kind of file
775:47 it is and then put it into the correct
775:50 folder so we have to create the folder
775:53 then check these and then place it into
775:55 the correct folder so let's go right out
775:57 of here so what we're going to start
776:00 doing is we're going to start working
776:02 with these paths and these directories
776:04 and some of these things you may never
776:05 have seen before but that's okay I'll
776:07 try to explain it as I go through so the
776:09 first thing that we're going to write is
776:10 os.listdir uh and what this is
776:14 actually going to do is show us all the
776:15 files in there we're going to say path
776:18 so it should show us all the files
776:20 within path and so here are our results
776:23 so we have the data professional results
776:26 fake text file our image and our other
776:29 image so this is actually showing us
776:31 what files are in that path and that's
776:33 super important because we're probably
776:35 going to have to loop through this in
776:36 some way later um I wrote this all out
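To try `os.listdir` without touching a real folder, the sketch below builds a throwaway directory with `tempfile` and lists it. The three file names are invented for the example; they only stand in for the CSV, text, and PNG files shown in the video.

```python
import os
import tempfile

# Throwaway folder so the example does not depend on any real path
with tempfile.TemporaryDirectory() as path:
    for file_name in ["results.csv", "notes.txt", "photo.png"]:
        # Create an empty file with that name
        open(os.path.join(path, file_name), "w").close()

    # listdir returns plain names, not full paths, in arbitrary order
    files = sorted(os.listdir(path))

print(files)
```

The result is a regular list of strings, which is what makes it easy to loop over later. Any subfolders inside the path show up in the same list as the files.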
776:39 before so I kind of remember but I'm
776:41 doing this all off the top of my head so
776:43 I guarantee you throughout this I'll
776:44 make some mistakes but what we now need
776:46 to do is we need to create folders or
776:49 check if there's a folder and create it
776:51 if it isn't there that's um the next
776:53 step that we need to take so let's go
776:55 right down here and we want to check if
776:57 this path exists already so if that
777:00 folder already exists so we're going to
777:01 say
777:02 os.path.exists so this is going to
777:06 check does this path just like this path
777:08 up here does it already exist and then
777:10 we're going to do an open parenthesis
777:12 we'll say path so that's our path now we
777:15 need to add a folder name to this um we
777:19 could hardcode it so we could do plus we
777:22 could say CSV files and that could work
777:25 so it would say does this path already
777:27 exist and we can try running this and
777:30 it's going to say False so this doesn't
777:31 already exist but the thing is we
777:34 need to create three separate paths so we
777:36 could do this by just hardcoding it in
777:40 by saying CSV files image files um and
777:43 text files or we can just put this all
777:45 in a list and loop through it I think
777:47 it's just going to be easier to do that
777:50 or I don't know visually it's going to
777:51 be easier so we'll do uh folder underscore
777:55 names and we'll say is equal to and
777:58 we'll create a list so I think I want to
778:00 call it CSV files comma um image files
778:05 or PNG files whatever you want to write
778:08 and then we'll do text
778:11 files do text files and then we can go
778:15 right down here um a little for loop uh
778:19 I think what we'll do actually let's
778:20 write
778:21 folder underscore names um then we can
778:25 put something like uh let's write loop
778:30 why not um so a little trick for the for
778:33 loop is you're going to say for and we'll
778:34 say loop in and we'll just do a range
778:37 because we want it to basically go
778:39 through here we don't want it to
778:40 actually give us these file names we
778:42 just want it to count zero one and two so
778:45 if we do range from zero to two zero uh zero
778:50 one two that should work if we do um this
778:53 then when it loops through it's going to
778:54 call folder names and say zero which
778:56 would be CSV files image files and text
778:59 files um so
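A sketch of the index-based loop being described. The stop value passed to `range` is exclusive, so counting 0, 1, 2 takes `range(0, 3)`; `range(0, 2)` would stop after index 1 and skip the last folder. The exact spelling of the folder names is an assumption, and the list `visited` exists only so the result can be inspected.

```python
folder_names = ["csv files", "image files", "text files"]

visited = []
for loop in range(0, 3):
    # loop is 0, then 1, then 2, and is used as an index into the list
    visited.append(folder_names[loop])
    print(loop, folder_names[loop])

# range(0, 2) stops before 2, so it would only cover the first two names
print(list(range(0, 2)))
```

Writing `range(len(folder_names))` avoids hard-coding the 3, so the loop keeps working when a folder name is added to the list.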
779:02 let's uh yeah I need a colon let's run
779:06 through this really quickly uh shouldn't
779:08 do
779:09 anything but what we can do now is we
779:12 can say okay if this does not exist what
779:16 we can do is actually create it so we'll
779:19 say
779:20 if not so if this does not exist then
779:24 what we're going to do is take
779:28 this and we'll say
779:31 os. make directory and then we'll do
779:36 just like that um I think it's
779:40 makedirs I can't I think that's
779:42 correct um so let's test this out really
779:45 quickly let's see if this
779:47 works and invalid syntax I I need a
779:50 colon okay so I just ran this let's see
779:53 if it did actually make those
779:56 folders let's refresh it and it didn't
780:00 so let's just print this off um so if
780:04 not let's just print let's see does this
780:07 actually
780:08 work let's do
780:11 if
780:13 okay ah okay so I think I know what
780:17 might be happening I think it's giving
780:19 us it might actually be let's check this
780:20 really quick go to python
780:23 tutorials oh
780:25 no I think it's
780:27 creating yeah it's creating these python
780:29 tutorial images right here whoops okay
780:32 so I just figured it out um let's go
780:34 back into python tutorials don't take a
780:36 look at any of those notebooks those are
780:38 secret um we were creating them in the
780:41 wrong place um and that's because of
780:44 this right here we need a backslash so
780:46 we need to actually include a backslash
780:49 right here in this path we didn't
780:51 have that um EOL while scanning string
780:57 literal okay so this backslash could
781:00 cause an issue let's see if I can do
781:01 forward slashes on all these just stick
781:04 with me guys I might cut this out I
781:06 might not we'll see if this is important
781:08 just going to keep talking while we're
781:10 doing it um let's run
781:13 this okay so now that we're doing these
781:16 forward slashes we're still checking
781:18 let's make sure we can still check those
781:20 files good now when we loop through this
781:23 I'm not going to well yeah I can print
781:25 it off doesn't matter I'm going to print
781:26 it and we'll see if that name works and
781:29 then we're also going to
781:32 um uh I said if so if it exists then
781:36 make it no no no so if not I think the
781:38 not did make sense we just weren't sure
781:40 we had to do some um checking so if it
781:43 doesn't exist then we're going to create it and
781:45 we'll keep the print in there because it
781:46 doesn't really matter so it's going to
781:48 create the CSV and image but didn't
781:51 create the text let's see okay let's uh
781:56 I don't know why this would work but
781:57 let's run it okay so I think I just had
782:00 the wrong range so now we have our
782:02 images all through or we have our
782:05 folders all three folders now we need to
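A minimal sketch of the folder check described above, assuming the three folder names used in this project (csv files, image files, text files). The helper name ensure_folders and the temporary demo directory are made up for illustration. The video builds its path by string concatenation; this version swaps in os.path.join so a missing slash can't put the folders in the wrong place.

```python
import os
import tempfile


def ensure_folders(base_path, folder_names):
    """Create each folder under base_path if it does not already exist."""
    created = []
    for name in folder_names:
        target = os.path.join(base_path, name)  # join avoids hand-typed slashes
        if not os.path.exists(target):
            os.makedirs(target)
            created.append(name)
    return created


# Demo in a throwaway directory so no real folder is touched.
with tempfile.TemporaryDirectory() as demo_path:
    wanted = ["csv files", "image files", "text files"]
    first = ensure_folders(demo_path, wanted)   # creates all three
    second = ensure_folders(demo_path, wanted)  # nothing left to create
    print(first)
    print(second)
```

Running the helper twice shows why the "if not" check matters: the second call finds every folder already there and creates nothing.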
782:07 write a script that will read in these
782:10 and check and see what kind of file it
782:12 is and place it into the correct
782:15 folder so let's come right down here and
782:19 let's see what we need to do so now I
782:21 think we need to use this right here um
782:25 I think we need to loop through this to
782:27 be able to check each one so we need to
782:28 name this so we'll just do um file name
782:32 is equal to run that so now we have this
782:35 file name um and what we can do is loop
782:39 through this so let's say for
782:44 file in file name so we're going to loop
782:47 through this now when it goes through it
782:49 needs to check the it's going to check
782:53 the file path and in the file path it'll
782:55 say .txt .csv so let's say um if I think
783:01 it should be CSV let's test it on this
783:04 one but if CSV is
783:07 in file name or actually it's file so if
783:12 it's in
783:13 file and not in and oh not in if
783:19 it's also not in this I believe because
783:23 we're going to check we're going to
783:24 check each of those folders so we're
783:27 going to loop through and it's going to
783:29 check and see if the CSV so if that
783:32 string is in the
783:34 file then what we want to do is check
783:39 that it's also not in here that's
783:42 actually just the folder we also need um
783:46 also we're not doing that for loop
783:48 anymore um
783:50 um okay I'm sorry I'm talking this
783:52 through I'm figuring it out as I go
783:54 because I may have forgotten some of
783:56 this so we're going to say this that's
783:59 the CSV files so we need to check this
784:03 one um let's do it like this oops okay
784:09 so it's going to check to see if CSV
784:11 files and I think it needs that in
784:13 between it so it's going to say the path
784:15 so there's our path plus slash CSV
784:19 files um actually no it needs to be like
784:23 this 'cause we're going to check that then I
784:25 got it all right I figured it out now
784:27 then we're going to check if this file
784:29 is in there yeah so that's right so it
784:31 says if the
784:33 CSV is in the
784:35 file um which is right where am I
784:40 looking oh file name so if it's in that
784:43 list of the actual files which is all of
784:45 these if we find CSV in any of these
784:48 files
784:50 and it's not already in here so it's
784:53 going to say path plus CSV files did I
784:56 say files yeah CSV files plus file okay
785:00 that all looks correct so if it's not in
785:03 there we're going to use shutil.move
785:05 now this is how we actually move the
785:07 file it gives us the ability to move
785:09 what we want then we'll say move we need
785:12 to take it from our initial path to our
785:14 new path so we're going to specify we'll
785:17 separate by comma we need to specify
785:19 its original path which it should just
785:22 be
785:24 this without this I think it should be
785:29 file path because this is where it is
785:31 now it's in this path with that
785:33 file name then we need to say we want to
785:37 move it to here that is what we want to
785:39 do
785:42 um yeah so let's check it with just this
785:44 one and see if it works okay it ran
785:47 through it let's go check
785:49 aha now that CSV file is gone perfect
785:52 that is exactly what we want it to
785:54 happen now we can just recreate this
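Here is a rough sketch of the single CSV move walked through above: check the file type, check the file is not already in the destination folder, then call shutil.move with the original path and the new path. The function name move_if_type and the demo files are hypothetical. It also uses endswith instead of the "in" check from the video, which is a slightly stricter swap so a name like my.csv.backup would not match.

```python
import os
import shutil
import tempfile


def move_if_type(base_path, file, extension, folder):
    """Move file into folder when it has the extension and is not already there."""
    source = os.path.join(base_path, file)
    destination = os.path.join(base_path, folder, file)
    if file.endswith(extension) and not os.path.exists(destination):
        shutil.move(source, destination)  # original path first, new path second
        return True
    return False


# Demo in a throwaway directory with one CSV and one text file.
with tempfile.TemporaryDirectory() as demo_path:
    os.makedirs(os.path.join(demo_path, "csv files"))
    for name in ["data.csv", "notes.txt"]:
        with open(os.path.join(demo_path, name), "w") as handle:
            handle.write("demo")

    moved_csv = move_if_type(demo_path, "data.csv", ".csv", "csv files")
    moved_txt = move_if_type(demo_path, "notes.txt", ".csv", "csv files")
    csv_in_folder = os.path.exists(os.path.join(demo_path, "csv files", "data.csv"))
    csv_left_behind = os.path.exists(os.path.join(demo_path, "data.csv"))
    print(moved_csv, moved_txt, csv_in_folder, csv_left_behind)
```

Only the CSV is moved; the text file stays put until a branch for its own type handles it.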
785:58 for um for both our PNG files our image
786:03 files and our text files so we'll say elif
786:06 and
786:08 elif and let's do
786:11 PNG then we'll do image
786:15 files and image files because again
786:18 we're just doing the exact same thing I
786:19 can do text files the next one's going
786:22 to be text files text files so this
786:25 one's going to check for
786:27 txt now do we need anything else um
786:31 we'll just say else and we'll print off
786:35 print this file type is not included or
786:41 if there's multiple files we'll say
786:43 there are files in this
786:48 path that were not
786:51 moved okay so if we run through this
786:56 it's going to catch our CSV catch our
786:58 PNG catch our text and if not it'll say
787:01 there are files in this path that were
787:02 not moved exclamation point all right
787:05 now let's run through
787:07 this
787:09 uh uh that's because if elif elif
787:14 and then it's going to this else
787:17 statement uh I don't know let's
787:19 circle back around to that in a second
787:22 all of them were moved properly that's
787:25 really
787:27 good really quickly I'll check
787:30 and see I just don't I'm going to take that
787:31 out for now so I'm just going to run it
787:34 um we may or may not go back to that
787:36 but let's check and see if everything
787:38 worked properly so let's go into the CSV
787:40 file and we have our CSV file let's go
787:43 into our image files and we have our
787:46 images and let's go into our text file
787:49 and there are our text files now is
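The if / elif / elif / else chain can be sketched like this. It is an illustration under assumptions, not the exact notebook code: the function name sort_files, the demo file names, and the choice to return the leftovers as a list are all made up here. One possible reason the else message showed up unexpectedly, which the video does not confirm, is that os.listdir also returns the three folders themselves and they match none of the extensions, so this sketch skips directories.

```python
import os
import shutil
import tempfile


def sort_files(base_path):
    """Move .csv, .png and .txt files into their folders; return what was left."""
    not_moved = []
    for file in os.listdir(base_path):
        source = os.path.join(base_path, file)
        if os.path.isdir(source):
            continue  # skip the destination folders themselves
        if file.endswith(".csv"):
            folder = "csv files"
        elif file.endswith(".png"):
            folder = "image files"
        elif file.endswith(".txt"):
            folder = "text files"
        else:
            not_moved.append(file)
            continue
        destination_dir = os.path.join(base_path, folder)
        os.makedirs(destination_dir, exist_ok=True)
        destination = os.path.join(destination_dir, file)
        if not os.path.exists(destination):
            shutil.move(source, destination)
    return sorted(not_moved)


# Demo in a throwaway directory with one file of each kind plus one extra.
with tempfile.TemporaryDirectory() as demo_path:
    for name in ["report.csv", "chart.png", "notes.txt", "slides.pptx"]:
        with open(os.path.join(demo_path, name), "w") as handle:
            handle.write("demo")

    leftovers = sort_files(demo_path)
    png_sorted = os.path.exists(os.path.join(demo_path, "image files", "chart.png"))
    if leftovers:
        print("there are files in this path that were not moved!", leftovers)
```

Collecting the leftovers and printing once after the loop avoids repeating the message for every unmatched entry.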
787:53 there anything else that we need to do I
787:56 don't believe so but what I can do is I
788:00 can take all
788:02 this I can include it in
788:04 here and I'm going
788:08 to basically restart
788:15 it just to see if it works properly from scratch right I just want to make sure
788:18 that I didn't miss anything and we'll
788:20 delete
788:21 these so we have our I'm just going to
788:24 rerun everything we
788:26 imported we created our path these are
788:29 our file names and then when we run this
788:31 it should take our folder names check
788:34 through them if they aren't already
788:36 created it's going to create it don't
788:38 need it to print so let's get rid of
788:40 that then for the file within our file
788:44 names it checks each one
788:46 we check if there's a CSV and if it's
788:48 already in that file if it's
788:51 already in that folder I mean if it's in
788:53 that folder then it doesn't do anything
788:54 but if it isn't so and not it's not in
788:58 there it is going to move it to that
789:00 location so it's going to check CSV PNG
789:02 and text I think everything should work
789:05 properly let's run
789:07 this and it looks like it's working good
789:11 good good and perfect it worked exactly
789:15 how I had hoped um that's great so
789:19 this is the automatic file sorter in
789:23 file explorer project uh you can go even
789:25 a step further so I had to come in here
789:27 and manually run this you can go a step
789:29 further and put a timer on this where it
789:32 automatically does this maybe every hour
789:35 every day every 30 minutes you can run
789:38 this in your background especially if
789:39 you create um like an executable for
789:43 this you can run this in your background
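The timer idea is not built in this project, so the following is only a sketch of one way it could look: a small loop that calls a task, sleeps, and repeats. The name run_every and its parameters are hypothetical, and the task here is a stand-in lambda rather than the real sorter. An operating system scheduler (Task Scheduler on Windows, cron on Linux or macOS) is another common option.

```python
import time


def run_every(interval_seconds, task, max_runs):
    """Call task max_runs times, waiting interval_seconds between calls."""
    results = []
    for run in range(max_runs):
        results.append(task())
        if run < max_runs - 1:
            time.sleep(interval_seconds)  # no wait needed after the last run
    return results


# Demo with a zero-second interval and a stand-in task.
# For an hourly sort you might pass 60 * 60 and a function that runs the sorter.
runs = run_every(0, lambda: "sorted", max_runs=3)
print(runs)
```

Capping the loop with max_runs keeps the sketch from running forever; a real background job would more likely loop until it is stopped.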
789:45 um if you are curious on how to do that
789:48 I think I did something
789:49 similar to that in my web scraping
789:51 project um my Amazon web scraping
789:54 project if you want to go check that one
789:55 out but we're not going to do it in this
789:56 project this is all I wanted to show you
789:58 how to do so I hope that this was
790:00 helpful I hope that this project was you
790:02 know interesting and that you liked it I
790:04 hope that you learned something and so
790:05 if you did be sure to like and subscribe
790:08 below and I will see you in the next
790:09 video what's going on everybody welcome
790:11 back to another video today we're going
790:13 to be starting our python web scraping
790:15 tutorial series now this is more of a
790:17 continuation of the Python tutorial
790:18 series but because we're going to
790:20 be focusing on web scraping for three or
790:21 four videos I wanted to just make it its
790:23 own little miniseries in this series I'm
790:25 going to show you the basics of web
790:27 scraping how to actually look at HTML
790:29 how to inspect a web page how to pull
790:31 that data in and then even put it into a
790:33 CSV file so you can save it and use it
790:35 now in this series we're just covering
790:36 the basics which is a fantastic place to
790:38 start but in future series I'll be going
790:40 into some of the more advanced web
790:41 scraping topics as well so without
790:43 further ado let's jump on my screen and
790:44 get started with web scraping now the
790:46 first thing that we need to learn is
790:47 HTML HTML stands for hypertext markup
790:51 language and it's used to describe all
790:53 of the elements on a web page now when
790:56 we actually go to a website and start
790:57 pulling data and information we need to
791:00 know HTML so we can specify exactly what
791:02 we want to take off of that website so
791:04 that's where HTML comes in and we're
791:06 going to look at the basics
791:07 understanding just the basic structure
791:09 of HTML then we'll go look at a real
791:11 website and you'll kind of see that's a
791:13 little bit more difficult than what we
791:14 just have right here but this is the
791:16 basic building blocks to get to what the
791:19 HTML actually looks like on a website
791:21 now this is basically what HTML looks
791:24 like we have these angled brackets with
791:26 things like HTML head title body and
791:30 then you'll notice that at the end we'll
791:33 have a body and then we'll have a body
791:35 at the bottom this forward slash body
791:38 denotes that this is the end of the body
791:40 section in HTML so everything inside of
791:43 this is within this body so there is
791:46 this hierarchy within HTML we have HTML
791:50 and HTML at the bottom which
791:51 encapsulates all the HTML on the website
791:54 then we have things like head and head
791:56 body and body now within these sections
791:59 we usually have things like classes tags
792:01 attributes text and all these other
792:03 things that we'll get to in
792:04 different lessons but one of the easiest
792:06 ones to notice and look at are tags
792:09 things like a P tag or a title tag now
792:12 within these tags because this is a
792:14 super simple example we have these
792:16 strings here my first web page and
792:19 this is what's called a variable string
792:21 and this is actual text that we could
792:22 take out of this web page now that you
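To make that hierarchy concrete, here is a small sketch that holds a simple page in a string and prints its nesting. The exact markup shown on screen is not in the transcript, so SAMPLE_HTML is an assumed reconstruction: the title text comes from the lesson, the paragraph text is invented. It uses Python's built-in html.parser rather than any scraping package, and the class name OutlinePrinter is made up.

```python
from html.parser import HTMLParser

# Assumed reconstruction of the simple example page.
SAMPLE_HTML = """<html>
<head>
<title>My First Web Page</title>
</head>
<body>
<p>This is a paragraph.</p>
</body>
</html>"""


class OutlinePrinter(HTMLParser):
    """Record each opening tag and text string, indented by nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append("  " * self.depth + "<" + tag + ">")
        self.depth += 1

    def handle_endtag(self, tag):
        # A closing tag such as </body> ends that section.
        self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.outline.append("  " * self.depth + repr(text))


parser = OutlinePrinter()
parser.feed(SAMPLE_HTML)
print("\n".join(parser.outline))
```

The indentation in the printed outline mirrors the structure described above: html wraps head and body, and the text strings sit inside the title and p tags.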
792:25 understand the super basics of HTML
792:27 let's actually go to our website and I'm
792:29 going to have a link down below but it's
792:31 going to be this one right here this is
792:32 basically just a website that you can
792:34 you know practice web scraping on it's
792:36 called
792:38 scrapethissite.com and what we're going to do is
792:40 look at the HTML behind this web page
792:42 and you can do this on any website that
792:44 you go on so we're going to right click
792:46 we're going to go down to inspect
792:50 now right off the bat this looks a lot
792:52 more complicated and a lot more complex
792:55 than the very simple illustration that
792:57 we were looking at but let's kind of
792:59 roll this up just a little bit you'll
793:01 notice we have HTML and HTML at the
793:03 bottom we have a head and there is the
793:05 end of the head and then a body and the
793:07 end of the body so in a super simple
793:10 sense it is similar but just the
793:13 information that's within it is a lot
793:15 more difficult now if we look at this
793:17 title right here this is our title tag
793:19 if we click this little arrow this is
793:22 our dropdown you'll notice that here we
793:24 have the string hockey teams forms
793:26 searching and pagination now let's say
793:29 we didn't know we didn't want to click
793:31 on that and go find it there's something
793:33 that's super helpful within this
793:35 inspection page that you can click on
793:36 right here it says select an element in
793:39 the page to inspect it so we're going to
793:40 click on that and as we go through our
793:43 page and let's click on this title it's
793:45 going to take us to exactly where this
793:47 is in our HTML this is extremely
793:50 helpful extremely useful for example
793:53 let's say the data I want is down here I
793:55 want to take in the Boston Bruins I can
793:57 click on it and it's going to take me to
793:59 where that is exactly in the HTML this
794:02 is where we can start writing our web
794:03 scraping script to specify okay I'm
794:05 looking for a TR tag I'm looking for a
794:07 TD tag I'm looking for the class called
794:09 team this is all information and things
794:12 that we can use to specify exactly what
794:14 we want to pull out of our web page now
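As a rough illustration of targeting data by tag and class, the sketch below pulls team names out of a tiny table. The markup in SAMPLE_TABLE is an assumption for demonstration (including the class names team, name and year) and is not copied from the real site. It relies on the standard library's html.parser, not the packages introduced later in the series, and CellCollector is a hypothetical helper.

```python
from html.parser import HTMLParser

# Assumed markup, loosely modelled on a table of hockey teams.
SAMPLE_TABLE = """
<table class="table">
  <tr><th>Team Name</th><th>Year</th></tr>
  <tr class="team"><td class="name">Boston Bruins</td><td class="year">1990</td></tr>
  <tr class="team"><td class="name">Buffalo Sabres</td><td class="year">1990</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every td tag carrying the wanted class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.inside = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and self.wanted_class in classes:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.inside = False

    def handle_data(self, data):
        if self.inside and data.strip():
            self.values.append(data.strip())


collector = CellCollector("name")
collector.feed(SAMPLE_TABLE)
team_names = collector.values
print(team_names)
```

The tag and class you read off the inspect panel are exactly the two pieces of information the collector filters on.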
794:17 there are other things that we didn't
794:18 really look at as well in just our
794:20 simple illustration let's come right
794:23 over here there's things like hrefs now
794:26 these are hyperlinks so if we went and
794:28 then clicked on this this is just
794:30 regular text but inside of it is this
794:32 hyperlink where if we clicked on it it
794:34 would take us to another website and
794:36 typically that's denoted by this href
794:38 right here then you'll typically see
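A short sketch of how an href shows up in the markup and how it could be read out. The sample paragraph and its two links are invented for illustration, LinkCollector is a hypothetical name, and the parsing again uses the built-in html.parser.

```python
from html.parser import HTMLParser

# Invented snippet: regular text with two hyperlinks inside it.
SAMPLE_LINKS = (
    '<p>Data via <a href="https://example.com/source">this source</a>, '
    'see <a href="/pages/">more pages</a>.</p>'
)


class LinkCollector(HTMLParser):
    """Collect the href attribute of every a tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


link_parser = LinkCollector()
link_parser.feed(SAMPLE_LINKS)
links = link_parser.links
print(links)
```

The visible text ("this source") and the destination (the href value) are stored separately, which is why the link target has to be read from the attribute.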
794:40 things like a P tag which usually stands
794:42 for a paragraph now the last thing that
794:44 I want to show you while we're here and
794:46 we're going to learn a lot more in the
794:47 next several lessons
794:49 but if we come right down here there is
794:51 this actual entire table here and let's
794:53 try to find this table and I'm having
794:55 trouble selecting the entire thing but
794:57 let's select this team name and if we
794:59 look at this team name you can see that
795:01 this is encapsulating the table this
795:03 table tag now these are super helpful
795:05 because it takes in the entire table now
795:07 if we wrap this up and we look just at
795:10 this it says class table and then we
795:12 have the end of this table tag now when
795:15 we open it it's going to have all of
795:17 this information so as you can see as
795:18 I'm highlighting over it we have these
795:20 th tags and we have these TR tags and
795:24 even these TD tags which is the
795:27 individual data and this is something
795:29 that we'll look at when we're actually
795:30 scraping all of the data from this table
795:32 in a future lesson so this is how we can
795:34 use HTML how we can inspect the web page
795:37 and see exactly what's going on kind of
795:39 under the hood and then in future
795:40 lessons we'll see how we can use this
795:42 HTML to specify exactly what data we
795:44 want to pull out thank you guys so much
795:46 for watching if you like this video be
795:48 sure to like and subscribe below I
795:49 will see you in the next
795:53 [Music]
796:03 lesson hello everybody in this lesson
796:05 we're going to be taking a look at
796:06 BeautifulSoup and requests now these
796:08 packages in Python are really useful
796:11 these are the two main ones that I used
796:13 when I was first starting out with web
796:14 scraping it can get a lot of what you
796:16 want done in order to get that
796:17 information out now of course there are
796:19 other packages that you can use that may
796:21 be a little bit more advanced but again
796:23 this is just the beginner series in a
796:25 future series we'll look at other
796:26 packages as well that have some more
796:28 advanced functionality so what we're
796:29 going to be doing is we're going to
796:30 import these packages and then we're
796:32 going to get all of the HTML from our
796:34 website and make sure that it's in a
796:36 usable state and then in the next lesson
796:38 we're going to kind of query around in
796:40 the HTML kind of pick and choose exactly
796:43 what we want we'll look at things like tags
796:45 variable strings classes attributes and
796:47 more so let's get started by importing
796:50 our packages what we're going to say is
796:52 from bs4 this is the module that we're
796:55 from bs4 this is the module that we're taking it from we're going to say import
796:58 taking it from we're going to say import and then we'll do
797:00 and then we'll do beautiful soup then we're going to come
797:02 beautiful soup then we're going to come down and we're going to say import
797:05 down and we're going to say import requests now let's go ahead and run this
797:07 requests now let's go ahead and run this I'm going hit shift enter and it works
797:09 I'm going hit shift enter and it works well for me now if this does not work
797:11 well for me now if this does not work for you you may potentially need to
797:13 for you you may potentially need to actually install bs4 so you may have to
797:15 actually install bs4 so you may have to go to your terminal window and say pip
797:17 go to your terminal window and say pip install BS 4 I'll just let you Google
797:19 install BS 4 I'll just let you Google how to do that if you need to do that
797:20 how to do that if you need to do that cuz it's pretty easy but if you're using
797:22 cuz it's pretty easy but if you're using Jupiter notebooks through Anaconda like
797:24 Jupiter notebooks through Anaconda like how we set it up at the beginning of
797:26 how we set it up at the beginning of this python series then you should be
797:28 this python series then you should be totally fine it should be there for you
797:29 The next thing we need to do is specify where we're taking this HTML from. So we need to come over to our web page and get the URL. We're going to copy this URL, and I'm just going to put it right here for a second.
797:43 We're going to be using this URL quite a bit, so we want to assign it to a variable. We'll say url is equal to, and then we'll put it right in here. Now we can get rid of that.
797:55 So this is our URL going forward. This is where we're going to be pulling data from. Let's go ahead and run this.
798:00 Now we're going to use requests. We're going to say requests.get and then put in url. This get function, from the requests library, sends a GET request to that URL and returns a response object. Let's go ahead and run this.
798:19 As you can see here, I got a response of 200, which means the request succeeded. If you got something like a 204 or a 400 or 401 or 404, all of these are potentially bad.
798:31 Something like a 204 would mean there was no content in the response. A 400 means a bad request: it was invalid, so the server couldn't process it, and you don't get the page back.
798:40 If you got a 404, that might be one you're familiar with. That's an error that means the page could not be found on the server.
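To make the status codes above a little more concrete, here is a minimal sketch that uses only Python's standard library. The helper name describe_status is made up for this example and is not part of requests. In the notebook you would read the number from page.status_code and could pass it to a helper like this one.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short, human-readable summary of an HTTP status code.

    describe_status is a hypothetical helper written for this example.
    """
    status = HTTPStatus(code)  # raises ValueError for codes it does not know
    if code == HTTPStatus.NO_CONTENT:
        verdict = "success, but the response has no body"
    elif 200 <= code < 300:
        verdict = "success, a body should be available"
    elif 400 <= code < 500:
        verdict = "client error, check the URL or the request"
    else:
        verdict = "not a plain success, inspect before parsing"
    return f"{code} {status.phrase}: {verdict}"


# The codes mentioned in the lesson.
for code in (200, 204, 400, 401, 404):
    print(describe_status(code))
```

The point is simply that anything other than a 200 deserves a look before you hand the response to Beautiful Soup.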
798:46 The next thing we're going to do is take the HTML. If you remember, when we come back here and inspect this, we have all this HTML right here.
798:52 Now this web page specifically is completely static. It's not a bunch of moving stuff or anything like that. On something like Amazon, the web pages can update, but when you actually pull that into Python, you're basically getting a snapshot of the HTML at that time.
799:10 So what we're going to do is bring in all of this HTML, which is our snapshot of our website, and then we can take a look at it.
799:17 We're going to come right down here and now we'll use the Beautiful Soup library. We need to say BeautifulSoup and do an open parenthesis. There are two parameters we need to put in here.
799:32 First we need to put in this GET request. We actually need to name it, so we'll call it page: we'll say page is equal to requests.get(url), and let's run this.
799:42 Now we're going to put that page in here and say .text. The page is the response we got back from that request, and .text is what retrieves the actual raw HTML that we're going to be using.
799:52 Then we're going to put a comma here, and we need to specify how we're going to parse this information. This is HTML, so we're going to say 'html', just like this. This is standard and already built into the library, so we don't need to go any further. It's basically going to parse the information in an HTML format.
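Here is a small sketch of the two-argument BeautifulSoup call described above. To keep it runnable without a network request, a made-up HTML string stands in for page.text, and the parser is spelled "html.parser" (the one that ships with Python) rather than the 'html' used in the lesson. Both of those are assumptions for the example, as are the team and year in the sample table.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for page.text, so the example runs offline.
html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <table>
      <tr><th>Team Name</th><th>Year</th></tr>
      <tr><td>Boston Bruins</td><td>1990</td></tr>
    </table>
  </body>
</html>
"""

# First argument: the raw HTML string. Second argument: the parser to use.
soup = BeautifulSoup(html, "html.parser")

print(type(soup).__name__)  # the parsed document object
print(soup.title.text)      # text inside the <title> tag
```

With a real page, the only change would be passing page.text as the first argument.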
800:11 Let's go ahead and run this and see what we get. As you can see, we have a lot of information, and as we scroll down I'll try to point out some things that we've already looked at in previous lessons.
800:25 Something like this th tag should look very familiar: that's the title of a column. Then we have these td tags, and if we scroll down even further we'll have things like a tr tag. These are all things we looked at in that first lesson when learning about HTML.
800:40 Now again, we want to assign this to a variable, so we're going to say soup is equal to this information right here.
800:47 I'm not going to go into all the history behind Beautiful Soup. What I will say is that the guy who created the library said that it takes this really messy HTML or XML, which you can also use it for, and makes it into this kind of beautiful soup. I just thought that was kind of funny, but that's why we're calling it soup right here.
801:06 We're going to go ahead and run this, and then we'll come right down here, say print(soup), and run it. Now we have everything in here: we have our html, our head, and some hrefs and links in here. Let's scroll down a little more, and then we have our body right there, and of course we have a bunch of information in here.
801:31 In the next lesson we're going to learn how to query all of this to take specific information out, and basically understand a lot of what's going on in this HTML so we can actually get what we need.
801:41 Now if this looks really messy to you and just doesn't make a lot of sense, there is one more thing I'm going to show you. We'll come right down here and say soup.prettify(). If you've ever used other programming languages, prettify is very common in a lot of them; it just makes the output a little easier to visualize and read.
802:01 You'll notice that it has this hierarchy built in through indentation, whereas if we scroll up there's no hierarchy. It's all just down the left-hand side.
802:09 So if you want to view it and visually see the structure, this does help a lot. It doesn't actually help when you're querying it or using find and find_all, which is what we're going to look at in the next lesson.
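A quick sketch of the difference between printing the soup and calling prettify, using a made-up one-line snippet instead of the real page. The snippet and the "html.parser" choice are assumptions for the example.

```python
from bs4 import BeautifulSoup

# Made-up one-line snippet standing in for the page's HTML.
html = "<div><p>Hello</p><p>World</p></div>"
soup = BeautifulSoup(html, "html.parser")

flat = str(soup)          # what print(soup) shows: the markup as parsed
pretty = soup.prettify()  # one tag or text node per line, indented by depth

print(flat)
print(pretty)
```

prettify returns an ordinary string, so it is for reading only. The querying still happens on the soup object itself.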
802:23 So that is our lesson on Beautiful Soup and Requests. In the next two lessons we're going to be looking at find and find_all, as well as really diving into things like variable strings, tags, classes, and all those things.
802:32 Then in the last lesson we're going to do a mini project where we try to get all the data from the table on this web page we've been using and put it into a pandas DataFrame.
802:40 Thank you guys so much for watching, I really appreciate it. If you like this video, be sure to like and subscribe below, and I will see you in the next lesson.
803:03 Hello everybody. In this lesson we're going to be taking a look at find and find_all. Really, we're going to be looking at a ton of different things in this lesson. This is where we start digging in and seeing how we can extract specific information from our web page.
803:15 But in order to do that, let's set everything up and bring in the HTML like we did in the last lesson. We're going to write all this out one more time, just for practice if nothing else, and then we'll get into actually getting that information from the HTML.
803:28 We're going to start by saying from bs4 import BeautifulSoup, and import requests. We'll go ahead and run this.
803:42 Then we're going to come up here and grab our URL, so we'll say url is equal to, and we'll have that right here.
803:51 Now we need to say page is equal to requests.get, and then we'll put in our url right here, and run this.
804:01 And lastly we'll say soup is equal to BeautifulSoup, and within our parentheses we need to specify page.text, because we need that, and our parser, which is 'html'.
804:18 Let's go ahead and run this and print it out to make sure it's working. And there we go, we have our soup right here. All this should look really similar to our last lesson. So now we've brought in the HTML from our page, and we have a lot of information in here.
804:40 Now really quickly, let's come over and inspect our web page. In here we have a ton of information, right? We have a bunch of different tags and classes and all these other things. But how do we actually use these?
804:53 That's where find and find_all come into play. They're pretty similar, and you'll see that in just a little bit.
804:59 Let's say we want to take one of these tags, and let's say we just want to take this div tag. There are going to be a lot of different div tags in our HTML, but let's go down and call soup, which is all of our information, and say .find.
805:20 Within our parentheses we can specify a lot of different things, but we're going to keep it really simple right now and just say 'div'. Let's go ahead and run this.
805:29 What this brings up is the very first div tag in our HTML, and that's going to be this information right here.
805:40 Now let's copy this and do the exact same thing, except we're going to say find_all. Let's run this, and now we have a ton more information.
805:51 Really, all find and find_all do is find the information. find is only going to return the first match in our HTML; that's the div with class container. Let's go back up to the top, and that's our div class container.
806:04 But find_all is going to find all of them and put them in a list for you. So it has this first one, which goes down to this closing /div right here, and then we have a comma, which separates our next div tag. So that is how we can use it.
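The find versus find_all behavior described above can be sketched on a tiny made-up document. The HTML below only imitates the container and col-md-12 divs from the lesson, and "html.parser" is an assumed parser choice.

```python
from bs4 import BeautifulSoup

# Made-up HTML that imitates the nested div layout from the lesson.
html = """
<div class="container">
  <div class="col-md-12"><p>First block</p></div>
  <div class="col-md-12"><p>Second block</p></div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

first_div = soup.find("div")     # a single Tag (or None if nothing matches)
all_divs = soup.find_all("div")  # a list-like ResultSet of every match

print(first_div["class"])  # class list of the first div in the document
print(len(all_divs))       # how many div tags were found in total
```

Note that the outer div shows up in the find_all result too, and its entry contains the inner divs, which is why the notebook output looks so long.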
806:24 Now what if we want to specify one of these div tags? We pulled in a ton of them, but we want to look for just one. This is where the class comes in handy, because right now we have class equal to container and class equal to col-md-12.
806:38 I don't know what these are off the top of my head, but usually they'll be somewhat unique, and we can use them to help specify what we're looking for.
806:47 For example, just glancing at this, we could also use this a tag if we wanted to. We could say we're looking for these hrefs: we have an href here, and right down here we have this href as well, which, if you remember from a previous lesson, holds a hyperlink.
807:04 Something like the class, the href, or these ids are all attributes, so we can filter down based off of them. Let's try it.
807:15 We can do class first. Within something like find_all, you write it as class_ with a trailing underscore. If we come right back up, we have this div and then here's our class, so again we need both the div and the class.
807:31 If we took this a tag instead, 'a' would go right here, with a class of something like nav-link, and we'd need to specify that down here. But we have our div, so we'll say col-md-12 right here and go ahead and run this.
807:48 Now it's going to pull in just that information. We're still getting a list, because we have multiple of these. This div with class col-md-12 doesn't just happen once; if we scroll down we'll see it multiple times.
808:03 Right here, here's this comma, and then here's our next one. So we have two of these div tags with a class of col-md-12, and in each of them we have different information. This one looks like a paragraph, with this p tag right here.
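As a rough sketch of filtering by class, here is the same idea on made-up HTML. The class names follow the ones read out in the lesson, while the href value, the link text, and the "html.parser" choice are assumptions for the example.

```python
from bs4 import BeautifulSoup

# Made-up HTML; the href and text values are placeholders.
html = """
<div class="container">
  <div class="col-md-12"><h1>Hockey Teams</h1></div>
  <div class="col-md-12"><p class="lead">Some intro text</p></div>
  <a class="nav-link" href="/pages/">Lessons</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# class is a reserved word in Python, so Beautiful Soup uses class_.
matches = soup.find_all("div", class_="col-md-12")
print(len(matches))  # still a list, because the class appears more than once

# The same pattern works for other tags, for example links.
links = soup.find_all("a", class_="nav-link")
print(links[0]["href"])  # attributes are read with dictionary-style access
```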
808:25 Let's scroll back up. I also think we should try something like this p tag. Typically p tags stand for paragraphs, or they have text information in them. Let's try the p tag really quickly and see what we get.
808:35 Let's run this, and it looks like we get multiple p tags. If we come back here, you can see that there's this information, and it's what we're pulling in right here. Then we have this information right here, and it looks like there's one more, with this href: it says "Data via" and then that link right there. So we have three different p tags.
809:03 Just to verify that that's correct, we can come over here and click on this paragraph. It takes us to the p tag where the class is equal to lead.
809:15 Now we have another p tag right over here, where the class is equal to glyphicon glyphicon-education. I have no idea what that means.
809:29 Then we'll go to our last one, right here, where the p tag contains an a tag with an href, a class, and a bunch of other information.
809:39 So let's say we just wanted to pull in this paragraph right here. Let's see how we can specify it. It looks like the class is equal to lead, and that looks like it's unique to just that one.
809:52 So if we come down here, we're going to say comma, then class_ is equal to, and then 'lead'. Let's try running this, and we're pulling in just that information.
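Narrowing three p tags down to the one with class lead might look like the sketch below. The three paragraphs are made up to mirror the ones described in the lesson; the link target, the data-attribution class, and the "html.parser" choice are placeholders assumed for the example.

```python
from bs4 import BeautifulSoup

# Made-up paragraphs that mirror the three p tags described in the lesson.
html = """
<p class="lead">Browse through a database of teams.</p>
<p class="glyphicon glyphicon-education"></p>
<p>Data via <a class="data-attribution" href="https://example.com/">example.com</a></p>
"""
soup = BeautifulSoup(html, "html.parser")

paragraphs = soup.find_all("p")               # every p tag
lead_only = soup.find_all("p", class_="lead")  # only the p tag with class lead

print(len(paragraphs))
print(lead_only)
```

Even with a single match, find_all still hands back a list with one element in it.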
810:10 Now let's say we actually want to pull in this paragraph. We want this text right here, and this is a very real use case: let's say I'm trying to pull in some information or a paragraph of text.
810:21 Let's copy this, and then say .text and run it. Now we're going to get an error right here, and this is a very common error, because we're trying to use find_all. The list that find_all returns does not have a text attribute, so we need to change this to find.
810:39 Typically when I'm working with find and find_all, I'm using find_all most of the time until I want to start extracting text. Then, once I've specified it, I'll change it back to find, just like this.
810:52 Now let's try this, and now we're getting this information back as a string. It's all wonky and definitely needs to be cleaned up a little bit, but if we go back up, it's no longer in a list, and we no longer have things like these p tags or this class attribute in here. We're really just pulling out the text.
811:14 Again, this does not look perfect. We could even try something like .strip(), since it looks like there's some white space. That cleans it up a little bit, and this definitely looks better.
811:26 We could go in here and clean this up more, but just as an example, this is how we can extract that information.
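The error with .text on find_all, and the fix with find plus strip, can be reproduced on a small made-up paragraph. The paragraph text and the "html.parser" choice are assumptions for the example.

```python
from bs4 import BeautifulSoup

# Made-up paragraph with extra whitespace, similar to what the lesson shows.
html = '<p class="lead">\n      Browse through a list of teams.\n   </p>'
soup = BeautifulSoup(html, "html.parser")

error_message = None
try:
    # find_all returns a list-like ResultSet, which has no .text attribute.
    soup.find_all("p", class_="lead").text
except AttributeError as err:
    error_message = str(err)
    print("find_all(...).text failed with", type(err).__name__)

# find returns a single Tag, and a Tag does have .text.
raw_text = soup.find("p", class_="lead").text
clean_text = raw_text.strip()  # drop leading and trailing whitespace

print(repr(raw_text))
print(repr(clean_text))
```

If you do want the text of every match, looping over the find_all result and reading .text on each element is the usual alternative.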
811:36 now let's look at one more example this is some information and
811:38 this is what we're going to do kind of
811:39 our little mini project in the next
811:40 lesson now let's say we wanted to take
811:42 all this information what if we wanted
811:44 to pull in something like the team name
811:47 that's going to be right here in this
811:49 tr tag and each of these tr tags have th
811:53 tags underneath them so if we scroll
811:55 down you'll notice that each row is this
811:58 tr tag so let's go ahead and search for
812:02 let's do th let's just search for that
812:05 first so let's come right back up here
812:08 let's use this find_all
812:10 and we'll get rid of this .text for
812:13 right now and let's just say we want to
812:16 look for the tr is that what we said we
812:20 were looking for no th so let's say
812:22 we're looking for th let's go ahead and
812:25 run this so we're going to have
812:26 underneath this th we have team name
812:28 year wins losses and notice these are
812:31 all the titles so these titles are the
812:34 only ones with these th tags if we go
812:38 down you'll notice that the data is
812:40 actually td tags so now let's go back
812:43 and look for td we'll say
812:47 td and this is going to be a lot longer
812:50 we have a lot of information but these
812:51 are all the rows of data let's see if we
812:54 can just get one piece of this data
812:56 we're going to go back we want just
812:58 this team name that's all we're trying
812:59 to pull in for now and then we'll try
813:02 to get this row and then in the next
813:04 lesson we're going to try to get all of
813:06 this information make it look really
813:08 nice and then we'll put it into a
813:10 pandas DataFrame so let's just get
813:11 this team name right now let's go ahead
813:14 we're going to say th
813:17 let's run this and we have this th and
813:20 now that we know we're getting this
813:22 information in we can
813:25 do find let's run this so there's our
813:29 team name we're just going to say .text
813:33 and again we can do .strip() just like
813:36 that and bam we have our team name
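The th and td searches above can be sketched as follows. The table here is a small hypothetical stand-in for the one on the practice page (the row values are placeholders, not real data), and "html.parser" is an assumed parser choice.

```python
from bs4 import BeautifulSoup

# Hypothetical two-row table: one header row (th) and one data row (td).
html = """
<table class="table">
  <tr><th> Team Name </th><th> Year </th><th> Wins </th><th> Losses </th></tr>
  <tr><td> Example Team </td><td> 2000 </td><td> 10 </td><td> 5 </td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Only the title row uses th tags, so this collects the column titles.
headers = [th.text.strip() for th in soup.find_all("th")]

# The data cells are td tags; on a real page this list is much longer.
cells = soup.find_all("td")

# find returns just the first th, which is the "Team Name" title.
team_name = soup.find("th").text.strip()

print(headers)
print(len(cells))
print(team_name)
```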
813:39 so you can kind of start getting the idea
813:41 of how we're pulling this information
813:43 out we're really just specifying exactly
813:46 what we're seeing in this HTML and
813:48 what's really really helpful and you
813:50 know something that I do all the time is
813:52 I'm inspecting it I'm just kind of
813:54 searching like what do I want what
813:56 piece of information do I want then I go
813:58 ahead and click on it and then I'm
813:59 looking you know where is this sitting
814:01 in the hierarchy it's within the body
814:03 it's within this table with the class of
814:05 table then it's down here within this tr
814:08 tag and then this td tag so I'm looking
814:10 kind of at the hierarchy and I'm
814:12 specifying exactly what I'm looking for
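Walking down the hierarchy described here (body, then the table with class "table", then a tr, then a td) might look like the sketch below. The HTML is a hypothetical miniature of that structure and the cell values are placeholders.

```python
from bs4 import BeautifulSoup

# Hypothetical page with the same nesting: body > table.table > tr > td.
html = """
<html><body>
  <table class="table">
    <tr><th>Team Name</th><th>Year</th></tr>
    <tr><td> Example Team </td><td> 2000 </td></tr>
  </table>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Step 1: narrow the search to the table with class "table".
table = soup.find("table", class_="table")

# Step 2: every row of that table is a tr tag.
rows = table.find_all("tr")

# Step 3: rows[0] is the header row, so rows[1] is the first data row.
first_data_row = rows[1]

# Step 4: the first td in that row holds the team name.
first_cell = first_data_row.find("td").text.strip()

print(first_cell)
```

Each call searches only inside the tag it is called on, which is why narrowing step by step mirrors what the browser inspector shows.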
814:14 so that is what we looked at
814:16 in today's lesson that's how we can use
814:18 find and find_all we were able to look
814:20 at classes and tags and attributes and
814:23 navigable strings which is this right
814:25 here getting that text
814:28 and we looked at find and
814:30 find_all and how they're pulling that
814:32 information in and how we can specify
814:34 exactly what we're looking for now in
814:35 the next lesson which is definitely
814:37 going to be the most exciting one we're
814:39 going to try to pull in all of this
814:40 information so every single thing
814:43 because we'll be able to put all this
814:45 information into a data frame which then
814:47 we can use pandas to really search and
814:49 manipulate that data within that data
814:52 frame so with that being said that is
814:53 the end of this lesson if you like this
814:55 video be sure to like and subscribe I
814:57 will see you in the next lesson
815:00 [Music]
815:10 hello everybody in this lesson we
815:13 are going to be scraping data from a
815:14 real website and putting it into a pandas
815:16 DataFrame and maybe even
815:18 exporting it to CSV if we're feeling a
815:20 bit spicy now in the last several
815:22 lessons we've been looking at this page
815:25 right here and I even promised that we
815:26 were going to be pulling this data but
815:29 as I was building out the project
815:31 I honestly thought it was a little bit
815:32 too easy since in the last lesson we
815:34 kind of already pulled out some
815:35 information from this table and I want
815:37 to kind of throw you guys off so we're
815:39 going to be pulling from a different
815:41 table we're going to be going on to
815:42 Wikipedia and looking at the list of the
815:43 largest companies in the United States
815:45 by revenue and we're going to be pulling
815:47 all of this information so if you
815:49 thought this was going to be an easy
815:50 little mini project it's now a full
815:52 project because why not so let's get
815:56 started what we're going to do is
815:58 we're going to import BeautifulSoup and
815:59 requests we're going to get this
816:01 information and we're going to see how
816:03 we can do this and it's going to get a
816:05 little bit more complicated and a little
816:07 bit more tricky we're going to have to
816:08 you know format things properly to get
816:10 it into our pandas DataFrame to make it
816:12 look good and make it more usable
816:15 so let's go ahead and get rid of
816:16 this easy table we don't want that one
816:18 and we're going to come in here and
816:20 we're just going to start off this
816:21 should look really familiar by now
816:23 we're going to say from bs4 import
816:28 BeautifulSoup I don't know if you've
816:30 noticed but I've messed up spelling
816:32 BeautifulSoup in every single video
816:35 I've noticed let's run this and now
816:38 we need to go ahead and get our URL so
816:40 let's come up here let's get our
816:43 URL say url is equal to and we'll just
816:47 keep it all in the same thing really
816:49 quickly because we know this by heart by
816:51 now right we'll say requests.get and
816:55 then url to make sure that we're getting
816:56 that information it'll give us a response
816:58 object hopefully it'll be 200 that'll
817:01 mean a good response and then we'll say
817:03 soup is equal to and then we'll say
817:05 BeautifulSoup and we'll do our page
817:09 .text now we're pulling in the
817:10 information from this URL and then we
817:12 use our parser which will be oops html
817:17 and let's go ahead and run this looks
817:20 like everything went well let's print
817:21 our soup now this is completely new to
817:24 you it's completely new to me I don't
817:26 know what I'm doing but it looks like
817:28 we're pulling in the information am I
817:30 right so we got a lot of things going
817:32 for us the stuff was imported
817:35 properly we got our URL we got our soup
817:38 which is not beautiful in my opinion
817:41 but let's keep on rolling
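The setup narrated above (import, URL, request, soup) can be sketched roughly as below. Several things here are assumptions rather than the video's exact code: the URL is the presumed address of the Wikipedia page named in the lesson, fetch_soup is a hypothetical helper name, "html.parser" is used as the parser, and the live request is left commented out so the sketch also runs offline against a tiny stand-in page.

```python
from bs4 import BeautifulSoup


def fetch_soup(url):
    """Download a page and parse it (needs network access and the requests package)."""
    import requests  # imported here so the offline demo below runs without it

    page = requests.get(url)
    print(page.status_code)  # 200 means the request succeeded
    return BeautifulSoup(page.text, "html.parser")


# Assumed URL for the Wikipedia list named in the lesson.
url = "https://en.wikipedia.org/wiki/List_of_largest_companies_in_the_United_States_by_revenue"
# soup = fetch_soup(url)  # uncomment to scrape the live page

# Offline stand-in so the parsing step can be tried without a network call.
sample_page_text = "<html><body><h1>Largest companies</h1></body></html>"
soup = BeautifulSoup(sample_page_text, "html.parser")
heading = soup.find("h1").text

print(heading)
```

If the printed status code is not 200, it is worth checking the response before parsing; some sites reject requests that do not send a User-Agent header.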
817:43 let's come right down here now what we need to do
817:45 is we need to specify what data we're
817:47 looking for so let's come and let's
817:50 inspect this web page now the only
817:52 information that we're going to want is
817:54 right in here we're going to want these
817:56 titles or these headers whoops so
817:59 we're going to want rank name industry
818:01 etc and then we are for sure going to
818:04 want all of this information let's just
818:06 scroll down see if there's anything
818:07 tricky in
818:09 here all right that looks pretty good
818:12 and there is another table so there's
818:14 not just one table in here there are two
818:16 tables in this page so that might change
818:20 things for us but let's come right back
818:24 and let's inspect our page by using this
818:26 little button right here and let's
818:28 specify in let's see if I can highlight
818:31 just this table oh it's not going oh
818:34 let's do that right there so now we have
818:37 this wikitable sortable now I'm going
818:40 to actually come right here I'm going to
818:41 copy and I'm just going to say copy the
818:44 outer HTML just going to paste in
818:47 here real quick and that's a ton of
818:49 information I didn't think it was going
818:50 to copy all of it and we're just going
818:52 to delete that I just wanted to keep
818:53 that class because I wanted to then
818:57 come right down here at the bottom and
819:00 just see what this table looks like I
819:03 don't know if it's part of it or if it's
819:04 its own
819:06 table I can't tell let's look at this
819:09 rank and let's come up so it says
819:13 it's under this
819:14 table and it looks like it's its own
819:16 table but it says wikitable
819:18 sortable jquery-tablesorter wikitable
819:22 sortable jquery-tablesorter so it looks
819:25 like there are two tables with the same
819:28 class which shouldn't be a problem if
819:32 we're using find to get our text because
819:34 we should be taking the first one which
819:35 will be this table and this is the table
819:37 we want and if we wanted this one we
819:42 could just use find_all and since it's a
819:44 list we could use indexing to pull this
819:47 table right but I think we're going
819:50 to be okay with just pulling in this one
819:53 so let's go ahead and let's do our find
819:56 so we'll do
819:57 soup.find and we could find_all or we
820:01 could just do find table let's just
820:03 try this and see what we get and if it
820:06 pulls in the right one that we're
820:07 looking for that'd be great now this does
820:10 not look correct at all I don't know
820:14 what table it's pulling in oh maybe it's
820:17 this right here this might be a table
820:20 yeah it is so we have this box more
820:23 citations so actually we are going to
820:24 have to do exactly like what I was
820:26 talking about let's pull
820:29 this and well we could do comma class
820:33 right here and let's do both you know
820:35 what this is a learning opportunity
820:37 let's do both so let me go back up to
820:40 the top because I need these and what
820:44 we're going to do let's come right down
820:46 here I want to add in another thing
820:50 actually I'll just push this one up
820:52 there we go so we're going to say find_all
820:55 let's run this so now we have
820:58 multiple and again we got that weird one
821:00 first but if we scroll down here's our
821:03 comma and then here's our wikitable
821:06 sortable and then we have rank name
821:10 industry all the ones that we were
821:11 hoping to see and I guarantee you if you
821:14 scroll all the way to the bottom
821:16 we're going to
821:18 see potentially Wells Fargo Goldman Sachs
821:22 I'm pretty sure those are
821:25 let's see yeah here we go like Ford
821:27 Motor Wells Fargo Goldman Sachs that's
821:30 this table right here so now we're
821:31 looking at the third table but again
821:33 this is a list so we can use indexing on
821:35 this and we'll just choose not position
821:38 zero because that's this one right here
821:40 which we did not like we'll
821:42 take position one let's run this let's
821:46 go back up to the top and this is our
821:49 table right here rank name industry this
821:52 is the information that we were actually
821:54 wanting just to confirm rank name
821:57 industry etc so this is the information
822:00 we're wanting and we're able to specify
822:02 that with our find_all and this is the
822:04 information we want
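The indexing step above can be sketched as follows. The HTML is a hypothetical miniature of the page layout described in the lesson (a notice box that is itself a table, followed by two data tables); the class name "box" and the cell contents are placeholders.

```python
from bs4 import BeautifulSoup

# Hypothetical page: a notice box table first, then two data tables.
html = """
<table class="box"><tr><td>This article needs more citations</td></tr></table>
<table class="wikitable sortable">
  <tr><th>Rank</th><th>Name</th><th>Industry</th></tr>
</table>
<table class="wikitable sortable">
  <tr><th>Rank</th><th>Name</th><th>Profits</th></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find returns the first table on the page, which here is the notice box.
first_found = soup.find("table")

# find_all returns every table in page order, so it can be indexed like a list.
tables = soup.find_all("table")
table = tables[1]  # position 0 is the notice box, position 1 is the table we want

print(first_found["class"])
print([th.text for th in table.find_all("th")])
```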
822:06 so we now want to make this the only information that
822:08 we're looking at so I'm just going to
822:09 copy this we didn't need to use our
822:12 class for this one you probably
822:13 could have but we could so let's
822:15 actually put this right down here
822:17 this will be our table we'll say equal
822:19 to but then I'll come right here and I'm
822:22 going to say soup.find this is just for
822:27 demonstration purposes we do table comma
822:29 class underscore is equal to and then
822:33 we'll look at this right here whoops let me
822:36 do this and let's see if we get the
822:38 correct
822:39 output and let's run this and looks like
822:42 we're getting a NoneType object if I
822:45 remember looks like the actual
822:47 class is this right here so let's run
822:51 this instead and I got to get rid of the
822:53 index there we go okay so we were able
822:56 to pull it in just using the find so the
822:58 find table class and it says wikitable
823:01 sortable at least that's the HTML that
823:04 we're pulling in right here let me go
823:07 back because I don't know if
823:10 that's what I was seeing
823:12 earlier let's just get this rank let's
823:14 go back up
823:16 where's the
823:17 rank there we go so here's
823:20 our rank and let's go up to the table
823:24 and there's our
823:25 class yeah and that's just to me
823:28 that's a little bit odd so it says
823:30 wikitable sortable jquery-tablesorter
823:33 right here but in our
823:36 actual python script
823:39 that we're running it was only pulling
823:41 in the wikitable sortable so it wasn't
823:45 pulling in the jquery-tablesorter why
823:48 I'm not 100% sure but all things that
823:51 we're working through and we were able
823:53 to figure out
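The NoneType result above can be reproduced with a small sketch. A likely explanation for the class mismatch is that the browser adds jquery-tablesorter with JavaScript after the page loads, so the raw HTML that requests downloads only carries "wikitable sortable"; treat that as an assumption about this page. The HTML below is a hypothetical stand-in for the downloaded source.

```python
from bs4 import BeautifulSoup

# Hypothetical raw HTML as downloaded: no jquery-tablesorter class yet.
html = """
<table class="box"><tr><td>notice</td></tr></table>
<table class="wikitable sortable">
  <tr><th>Rank</th><th>Name</th><th>Industry</th></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# The class string copied from the browser inspector does not exist in the
# raw HTML, so find returns None (hence the NoneType error on later calls).
missing = soup.find("table", class_="wikitable sortable jquery-tablesorter")

# Matching the class attribute as it appears in the downloaded HTML works,
# and skips the notice box without needing an index.
table = soup.find("table", class_="wikitable sortable")

print(missing)
print(table.find("th").text)
```

Printing the soup, or checking the result against None before using it, is a quick way to confirm which class names the script actually sees.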
823:56 so we're going to make this our table we're
823:59 going to say table is equal to soup
824:02 .find_all and let's run this and if we
824:05 print out our table we have this table
824:07 now this is our only data that we are
824:10 looking at now the first thing that I
824:11 want to get is I want to get these
824:14 titles or these headers right here
824:16 that's what we're going to get first so
824:18 let's go in here we can just look in
824:20 this information you can see that these
824:21 are with these th tags and we can pull
824:25 out those th tags really easily let's
824:28 come right down here we're just going to
824:30 say th and we can get rid of this let's
824:34 run this now these are our only th tags
824:37 because everything else is a tr tag for
824:40 these rows of data so these th tags are
824:42 pretty unique which makes it really easy
824:45 which is really really great because
824:46 then we can just do world_titles is
824:49 equal to so now we have these titles but
824:52 they're not perfect but what we're
824:54 going to do is we're going to loop
824:56 through it so I'm going to say world_titles
824:58 and I'll kind of walk through
824:59 what I'm talking about it's in a list and
825:02 each one is within these th tags so th
825:05 and then there's our string that
825:07 we're trying to get so we can easily
825:10 take this list and use list
825:14 comprehension and we can do that right
825:15 down here so I'm going to keep this
825:17 where we can see it we'll do
825:19 world_table_titles
825:21 that's equal to now we'll do
825:25 our list comprehension should be super
825:27 easy we'll just say for title in
825:30 world_titles and then what do we
825:33 want we want title.text that's it
825:37 because we're just taking the text from
825:39 each of these we're just looping through
825:41 and we're getting rank then we're
825:42 looping through getting name looping
825:44 through getting industry that's
825:45 it so let's go and print our world_table_titles
825:50 and see if it worked and it did
825:54 this looks like it needs to be
825:55 cleaned up just a little bit so let's go
825:58 ahead and do that while we're here
826:01 before we actually put it into the
826:02 pandas DataFrame oops I
826:07 just wanted this actually so what we're
826:09 going to do is try to get rid of those
826:10 \n characters if we do .strip() that
826:14 may actually not work yeah because
826:16 this is a list what we need to do is we
826:18 can actually do .text.strip()
826:22 right here let's try to do it in there
826:23 there we go so now we have this and
826:27 now this world_table_titles is good to go
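The list comprehension and the .strip() fix can be sketched as below. The header row is a hypothetical stand-in with trailing newlines like the ones seen in the lesson output; the variable names follow the ones spoken in the video.

```python
from bs4 import BeautifulSoup

# Hypothetical header row whose cell text ends in a newline.
html = "<table><tr><th>Rank\n</th><th>Name\n</th><th>Industry\n</th></tr></table>"

soup = BeautifulSoup(html, "html.parser")
world_titles = soup.find_all("th")

# .text alone keeps the trailing "\n" on every title.
raw_titles = [title.text for title in world_titles]

# strip() is a string method, so it is applied to each title inside the
# comprehension rather than to the list as a whole.
world_table_titles = [title.text.strip() for title in world_titles]

print(raw_titles)
print(world_table_titles)
```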
826:30 now I'm actually noticing one thing that may
826:33 be odd yeah so we have rank name
826:36 industry goes to headquarters but then
826:38 in here we're getting rank name industry
826:41 and then the
826:42 profits which is
826:44 from this table right here which we
826:48 don't want let's scroll back up let's
826:52 kind of backtrack this and see where
826:54 this happened we did find_all table
826:57 we're looking at the first one
826:59 right and then we're doing
827:04 headquarters so we're doing print
827:06 table ah okay I think I found the issue
827:09 here and let's backtrack again
827:11 we're working through this together
827:12 we're going to make mistakes the
827:14 table is what we actually wanted to use
827:16 we just did soup.find_all th which is
827:18 going to pull in that secondary table
827:21 jeez we were not thinking here so now
827:24 we need to do find_all on the table not
827:28 the soup because we were looking at
827:29 all of them oh what a rookie mistake
827:31 okay let's go back now let's look at
827:34 this now it's just down to headquarters
827:38 okay let's go ahead and run this
827:40 let's run this now we just have
827:42 headquarters now let's run this now we
827:46 are sitting pretty
827:49 are sitting pretty okay excuse my mistakes Hey listen you know if it
827:50 mistakes Hey listen you know if it happens to me it happens to you I
827:52 happens to me it happens to you I promise you this is you know this is a
827:53 promise you this is you know this is a project this a little U little project
827:55 project this a little U little project we're creating here so we're going to
827:56 we're creating here so we're going to run into issues and that's okay we're
827:58 run into issues and that's okay we're figuring out as we go now what I want to
828:00 figuring out as we go now what I want to do before we start pulling in all the
828:02 do before we start pulling in all the data is I want to put this into our
828:04 data is I want to put this into our Panda's data frame we'll have the uh you
828:06 Panda's data frame we'll have the uh you know headers there for us to go so we
828:08 know headers there for us to go so we won't have to get that later and it just
828:10 won't have to get that later and it just makes it easier uh in general trust me
828:12 makes it easier uh in general trust me so we're going to import pandas as PD
828:15 so we're going to import pandas as PD let's go ahead and run this and now
828:17 let's go ahead and run this and now we're going to create our data frame so
828:18 we're going to create our data frame so we'll say PD dot now we have these world
828:22 we'll say PD dot now we have these world uh table titles so what we're going to
828:24 uh table titles so what we're going to do is pd. data frame and then in here
828:28 do is pd. data frame and then in here for our columns we'll say that's equal
828:30 for our columns we'll say that's equal to the world table titles and let's just
828:33 to the world table titles and let's just go ahead and say that's our data frame
828:36 go ahead and say that's our data frame and call our data frame right here let's
828:37 and call our data frame right here let's run it there we go so we were able to
828:41 run it there we go so we were able to pull out and extract those headers and
828:42 pull out and extract those headers and those titles of these columns we're able
828:44 those titles of these columns we're able to put it into our data frame so we're
828:47 to put it into our data frame so we're set up and we're ready to go we're
828:48 set up and we're ready to go we're rocking and rolling the next thing we
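A small sketch of the step just described: an empty data frame whose columns come from the scraped header list. The titles below are placeholder values standing in for world_table_titles from the scrape.

```python
import pandas as pd

# Placeholder headers; in the lesson these come from the table's <th> tags.
world_table_titles = ["Rank", "Name", "Industry", "Headquarters"]

# An empty data frame that already knows its column names.
df = pd.DataFrame(columns=world_table_titles)

print(df.columns.tolist())
print(len(df))  # number of rows so far
```

Creating the columns first means each scraped row can later be added without naming the headers again.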
828:50 the next thing we need let's go back up next thing we need
828:53 is to start pulling in this data right
828:55 here so we have to see how we can pull
828:57 this data in now if you
829:00 remember that we had those th tags those
829:03 were our titles as you can see I'm
829:05 highlighting over it but down here now
829:07 we have these td tags and those are all
829:10 encapsulated within a tr tag so these tr
829:14 represent the row
829:16 right then the d represents the data
829:19 within those rows so r for rows d for
829:21 data so let's see how we can use that in
829:25 order to get the information that we
829:26 want so let's go back up here just going
829:29 to take this because again we're only
829:30 pulling from table not soup not soup
829:34 what were we thinking um and let's go
829:36 ahead and let's look at tr let's run
829:39 this now when we're doing this tr these
829:42 do come in with the headers so we're going
829:45 to have to later on we're going to have
829:47 to get rid of these we don't want to
829:48 pull those in um and have that as part
829:50 of our data but if we scroll down
829:53 there's our
829:54 Walmart um we have the location these
829:58 are all with these td tags and then of
830:02 course it's separated by a comma then we
830:04 have our td2 so above we had our td1 so
830:08 row one row two row three all the way
830:11 down
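To make the tr and td relationship concrete, here is a hedged sketch on a tiny made-up table (the HTML is an assumption for illustration). It shows that find_all('tr') returns every row, including the header row, and that the header row holds th cells rather than td cells.

```python
from bs4 import BeautifulSoup

# Hypothetical table: one header row followed by two data rows.
html = """
<table>
  <tr><th>Rank</th><th>Name</th></tr>
  <tr><td>1</td><td>Walmart</td></tr>
  <tr><td>2</td><td>Amazon</td></tr>
</table>
"""

table = BeautifulSoup(html, "html.parser").find("table")

# Every <tr> is one row of the table, header row included.
rows = table.find_all("tr")

# Count the <td> cells in each row; the header row has none.
td_counts = [len(row.find_all("td")) for row in rows]

print(len(rows))
print(td_counts)
```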
830:13 now we will easily be able to use this right because this is our column
830:16 data and we can even call it that column
830:19 underscore data is equal to we'll run
830:22 that um and what we're going to do is
830:23 we're going to loop through that because
830:25 it was all in a list so we're going to
830:26 loop through that information but
830:28 instead of looking at the tr tag we're
830:30 going to look at the td tag so let's
830:32 come right down here we'll say for
830:35 row in column
830:37 data and we'll do a colon now we need to
830:40 loop through this we'll do something
830:42 like row.find_all and then what
830:46 are we looking for we're not looking for
830:47 the tr we're looking for the td and just for
830:51 now let's print this off see what this
830:55 looks like apparently I didn't run this
830:59 uh column data that's
831:02 why and let's run
831:04 this and what we actually need to do is
831:07 something almost exactly like
831:10 this and I'm going to put it right below
831:13 it um instead of printing this off
831:16 because again this is all in a list
831:19 we're using find_all so we're
831:21 printing off another list which isn't
831:22 actually super helpful um for each of or
831:27 all these data that we're pulling in
831:29 what we can do is we can call this uh
831:31 the row_data and then we'll put the
831:34 row data in here so we'll say for and
831:38 we'll say in row data so we'll just say
831:40 for the data in row data and we'll take
831:43 the data we'll exchange that and now
831:46 instead of uh world table titles we can
831:49 change this into uh
831:52 individual row data right and now let's
831:56 print off the individual row data so
831:59 it's the exact same process that we were
832:01 doing up here and that's how we cleaned
832:04 it up and got this and we may not need
832:06 to strip but let's just run this and see
832:07 what we get there we go um and strip I'm
832:10 sure was helpful let's actually get rid
832:12 of
832:13 this yeah strip was helpful it's the exact
832:16 same thing that happened on the last one
832:18 so let's keep that actually let's run
832:20 this and now let's just kind of glance
832:23 at this information let's look through
832:25 it this looks exactly like the
832:28 information that's in the table let's
832:29 just confirm with this first one uh 25
832:32 uh two what am I saying 572 754 2.4
832:37 2300 57275 2.4 2200 so this looks
832:41 exactly correct
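The loop described above can be sketched as follows, assuming a tiny made-up table (the HTML and cell values are placeholders). The names column_data, row_data, and individual_row_data follow the lesson; cleaned_rows is an extra list added here only so the result can be inspected.

```python
from bs4 import BeautifulSoup

# Hypothetical data rows with stray whitespace inside the cells.
html = """
<table>
  <tr><td> 1 </td><td> Walmart </td><td> Retail </td></tr>
  <tr><td> 2 </td><td> Amazon </td><td> Retail and cloud computing </td></tr>
</table>
"""

table = BeautifulSoup(html, "html.parser").find("table")
column_data = table.find_all("tr")

cleaned_rows = []  # illustration only, not part of the lesson's code
for row in column_data:
    # All <td> cells in this row, still wrapped in their tags.
    row_data = row.find_all("td")
    # Pull out the text of each cell and strip the surrounding whitespace.
    individual_row_data = [data.text.strip() for data in row_data]
    cleaned_rows.append(individual_row_data)
    print(individual_row_data)
```

Without strip(), each value would keep the whitespace that surrounds it in the page source.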
832:44 now we have to figure out a way to get this into our table
832:47 because again these are all individual
832:50 lists it's not like we're just you know
832:52 putting all of this in at one time we
832:54 can't just take the entire table and
832:56 plop it into um into the data frame we
832:59 need a way to kind of put this in one at
833:01 a time now if you're just here for web
833:03 scraping and you haven't taken like my
833:05 pandas series that's totally fine that's
833:06 not what we're here for anyways um but
833:08 what we can do we'll have our individual
833:11 row data and we're going to put it in
833:14 kind of one at a time now the
833:15 reason we have to do that is because
833:17 when we had it like this and let's go
833:19 back when we had it like this it's
833:21 printing out all of it but what it's
833:23 really doing and let's get rid of it um
833:25 what it's really doing is it's kind of
833:27 doing it like this it's printing it off
833:29 one at a time and it's only going to
833:31 save that current row of data this last
833:35 one it's only going to save that as it's
833:38 looping through so what we actually want
833:40 to do is every time it loops through we
833:42 append this information onto the
833:44 data frame so as it goes through and
833:47 eventually it's going to end up with
833:48 this one but as it goes through let's
833:50 run this as it goes through it puts this
833:53 one in and then the next time it loops
833:55 through it puts this one in and the next
833:57 time it loops through etc all the way
833:58 down
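A plain Python sketch of the point being made here, using made-up rows: a loop variable only ever holds the current item, so after the loop it holds the last row, and anything that should be kept has to be stored on every pass.

```python
# Placeholder rows standing in for the cleaned lists built from each <tr>.
rows = [["1", "Walmart"], ["2", "Amazon"], ["3", "Apple"]]

# Each pass rebinds the name to the current row and forgets the previous one.
for individual_row_data in rows:
    pass

# After the loop, only the final row is still reachable through that name.
last_seen = individual_row_data

# Storing the row on every pass keeps all of them.
collected = []
for individual_row_data in rows:
    collected.append(individual_row_data)

print(last_seen)
print(len(collected))
```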
834:02 um so let's see how we can do this so we have our data frame right here
834:04 let's get rid of this let's bring our
834:06 data frame in now again like I just
834:09 mentioned if you don't know pandas and
834:10 you haven't learned that uh you know go
834:12 take my uh series on that it's really
834:14 good and we do something very similar to
834:16 this in that series so I'm not going to
834:18 kind of walk through the entire logic um
834:20 but there is something called loc which
834:23 stands for location when you're looking
834:24 at the index on a data frame and we're
834:28 going to use that to our advantage so
834:29 we're going to say the length of the
834:31 data frame so we're looking at how many
834:33 rows are in this data frame and then
834:35 we're going to say that's our length
834:38 then we're going to take that length and
834:40 use it when we're actually putting in
834:42 this new information pretty um pretty
834:45 cool so we're going to say df.loc then a
834:49 bracket and we're putting in that length
834:50 so we're checking the length of our data
834:52 frame each time it's looping through and
834:55 then we're going to put the information
834:57 in the next position that's exactly what
834:59 we're doing so let's go ahead and put in
835:02 the individual row data um so let's just
835:05 recap we're looping through this tr this
835:09 is our column data so these tr that's
835:11 our row of data then
835:15 as we're looping through it we're doing
835:16 find_all and looking for td tags that's
835:19 our individual data so that's our row
835:21 data then we're taking that data each
835:23 piece of data and we're getting out the
835:26 text and we're stripping it to kind of
835:27 clean it and now it's in a list for each
835:31 individual row then we're looking at our
835:33 current data frame which has nothing in
835:34 it right now we're looking at the length
835:36 of it and we're appending each row of
835:39 this information into the next position
835:43 so let's go ahead and run this
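A minimal sketch of the append pattern from the recap, assuming rows that already match the column count (the column names and row values are placeholders). Because the default index starts at 0, len(df) is always the next free position.

```python
import pandas as pd

# Empty data frame with placeholder column names.
df = pd.DataFrame(columns=["Rank", "Name", "Industry"])

# Placeholder rows standing in for the cleaned lists from the scrape.
scraped_rows = [
    ["1", "Walmart", "Retail"],
    ["2", "Amazon", "Retail and cloud computing"],
]

for individual_row_data in scraped_rows:
    length = len(df)                      # rows currently in the data frame
    df.loc[length] = individual_row_data  # write the new row at the next index

print(df)
```

Each list must have exactly as many items as the data frame has columns, otherwise the assignment fails.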
835:45 it's working it's thinking and it looks
835:48 like we got an issue cannot set a row
835:50 with mismatched columns now we're
835:52 encountering an issue not one that I got
835:54 earlier but we're going to cancel this
835:57 out we're going to figure this out
835:59 together so let's print off our
836:01 individual row data let's look at this
836:04 this one is empty uh this is I'm almost
836:07 certain is probably the issue um I
836:10 didn't encounter this issue when I wrote
836:12 this lesson um but
836:14 I'm almost certain that this is the
836:15 issue right here so let's do the column
836:17 data but let's start at position um
836:21 let's try one and not parentheses I need
836:26 brackets because this is a list right so
836:27 it should work and there we go so now
836:30 that first one's gone so now we just
836:33 have the information I didn't even think
836:34 about that um just a second ago but I'm
836:36 glad we're running into it in case you
836:38 ran into that uh issue let's go ahead
836:41 and try this
836:42 again and it looked like it worked
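A hedged sketch of the problem and the fix, using made-up rows. The first list is empty, which is what a header row produces when only td cells are collected. Note that the lesson slices column_data (the list of tr tags) with [1:], while this sketch slices a list of already cleaned rows to stay self-contained.

```python
import pandas as pd

columns = ["Rank", "Name"]

# Placeholder cleaned rows; the first is empty because the header <tr> has no <td>.
all_rows = [[], ["1", "Walmart"], ["2", "Amazon"]]

# Writing an empty list into a two-column data frame is rejected by pandas.
error_message = None
try:
    scratch = pd.DataFrame(columns=columns)
    scratch.loc[len(scratch)] = all_rows[0]
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Slicing with brackets skips position 0, so only real data rows are added.
df = pd.DataFrame(columns=columns)
for individual_row_data in all_rows[1:]:
    df.loc[len(df)] = individual_row_data

print(df)
```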
836:45 so let's pull our data frame down I could
836:47 have just written df let's pull our data
836:48 frame down and now this is looking
836:52 fantastic now um these three dots just
836:55 mean there's information in there it just
836:56 doesn't want to display it but it looks
836:58 like we have our rank we have our name
837:01 we have the industry revenue revenue growth
837:04 employees and headquarters for every
837:05 single one so this is perfect now this
837:08 is exactly what I was hoping to get now
837:11 you can go in and use pandas and
837:12 manipulate this and change it and you
837:14 know dive into all the information in
837:16 there but we can also export this into a
837:19 CSV if that's what you're wanting so we
837:22 could easily do that by saying we'll do
837:24 df.to_csv and then within here we're
837:28 just going to do r and specify our file
837:31 path so let's come down here to our file
837:33 path then we'll go to our folder for our
837:36 output so we're just going to take this
837:38 path and let me do it like that so I
837:41 have this path in my OneDrive documents
837:43 python web scraping folder for output
837:45 so you know I already made this um and
837:47 I'm just going to put this right down
837:49 here now I do have to specify what we're
837:51 going to call this um we'll just call
837:53 this companies and then we have to say
837:56 .csv that is very important now if we run
837:59 this I already know just because uh we
838:02 have this rank and this index here we're
838:04 going to keep this index in the output
838:07 not great uh but let's run it let's look
838:09 at our
838:11 output there's our companies and when we
838:13 pull this up as you can see this is not
838:16 what we want because we have this extra
838:18 thing right here now if we're automating
838:19 this this would get super annoying so
838:21 what we're going to do is go back and
838:22 just say index equals false let's go out
838:25 of here and now we're just going to come
838:26 right down here we're going to say comma
838:28 index equals False and so it's going to
838:31 take this index and it's not going to
838:33 import or actually export it into the
838:36 CSV now let's go ahead and run
838:39 this let's pull up our folder one more
838:43 time and let's refresh just to make sure
838:46 should be good and now this looks a lot
838:49 better so we're able to take all of that
838:52 information and put it into a CSV and
838:54 it's all there
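A small sketch of the export step, with two assumptions: the data frame holds placeholder values, and the files go to a temporary folder instead of the OneDrive path used in the video. It writes the file with and without index=False so the header lines can be compared.

```python
import os
import tempfile

import pandas as pd

# Placeholder data standing in for the scraped companies table.
df = pd.DataFrame({"Rank": ["1", "2"], "Name": ["Walmart", "Amazon"]})

# Temporary output folder (an assumption; the lesson uses a raw-string Windows path).
output_dir = tempfile.mkdtemp()
with_index = os.path.join(output_dir, "companies_with_index.csv")
without_index = os.path.join(output_dir, "companies.csv")

df.to_csv(with_index)                  # default: the index becomes an extra first column
df.to_csv(without_index, index=False)  # index=False leaves the index out

with open(with_index) as f:
    header_with_index = f.readline().strip()
with open(without_index) as f:
    header_without_index = f.readline().strip()

print(header_with_index)
print(header_without_index)
```

The first header line starts with an empty field for the unnamed index column, which is the extra column seen in the first export.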
838:57 so this is the whole project so if we scroll all the way back
838:59 up let's just kind of glance at what we
839:01 did here scroll down we brought in our
839:04 libraries and packages we specified our
839:06 URL we brought in our soup um and then
839:10 we tried to find our table now that took
839:12 a little bit of uh testing out but we
839:16 knew that the table was the second one
839:17 so in position one so we took that table
839:20 we were also able to specify it using
839:22 find but then we used the class and of
839:25 course we just wanted to work with that
839:27 table that's all the data we wanted so
839:29 we specified this is our table and we
839:32 worked with just our table going forward
839:34 of course uh we encountered some small
839:36 issues user errors on my end but we were
839:39 able to get our world titles and we put
839:41 those into our data frame right here
839:44 using pandas then next we went back and
839:47 we got all the row data and the
839:48 individual data from those rows and we
839:51 put it into our pandas data frame then we
839:55 came below and we exported this into an
839:57 actual CSV file so that is how we can
839:59 use web scraping to get data from
840:01 something like a table and put it into a
840:04 pandas data frame I hope that this lesson
840:06 was helpful I know we encountered some
840:07 issues that's on my end and I apologize
840:10 but if you run into those same issues
840:12 hopefully that helped uh but I hope this
840:14 was helpful and if you like this be sure
840:15 to like and subscribe below I appreciate
840:18 you I love you and I will see you in the
840:20 next lesson
840:22 [Music]
840:32 so the first thing that we need
840:34 to do is import our pandas library so
840:37 we're going to say import we're going to
840:38 say pandas now this will import the
840:40 pandas library but it's pretty
840:43 commonplace to give it an alias and as a
840:46 standard when using pandas people will
840:48 say as pd so this is just a quick alias
840:51 that you can use uh that's what I always
840:53 use and I've always used it because
840:54 that's how I learned it and I want to
840:56 teach it to you the right way so that's
840:57 how we're going to do it in this video
840:59 so let's hit shift enter now that that
841:02 is imported we can start reading in our
841:03 files now right down here I'm going to
841:05 open up my file explorer and we have
841:08 several different types of files in here
841:11 we have CSV files text files JSON files
841:15 and an Excel worksheet which is a little
841:17 bit different than a CSV so we're going
841:19 to import all of those I'm going to show
841:21 you how to import it as well as some of
841:24 the different things that you need to be
841:25 aware of when you're importing so we're
841:27 going to import some of those different
841:29 file types and I'll show you how to do
841:30 that within pandas so the first thing
841:32 that we need to say is pd dot and let's
841:36 read in a CSV because that's a pretty
841:38 common one we'll say
841:40 read_csv
841:42 and this is literally all you
841:44 have to write in order to call it in now
841:47 it's not going to call it in as a string
841:49 like it would in one of our previous
841:50 videos if you're just using the regular
841:53 operating system of python when you're
841:55 using pandas it calls it in as a data
841:57 frame and I'll talk about some of the
841:58 nuances of that so let's go down to our
842:00 file explorer we have this countries of
842:03 the world CSV you just need to click on
842:05 it and right-click and copy as path and
842:09 that's literally going to copy that file
842:11 path for us you don't have to type it
842:12 out manually you can if you'd like and
842:14 we're just going to paste it in between
842:16 these parentheses now if we run it right
842:19 now it will not work I'll do that for
842:21 you it's saying we have this Unicode
842:23 error uh basically what's happening
842:26 is it's reading in these backslashes
842:28 and this colon and all those
842:30 backslashes in there and this period at the
842:31 end what we need to do is read this in
842:33 as raw text so we're just going to say
842:35 r and now it's going to read this as a
842:38 literal string or a literal value and
842:41 not as you know with all these
842:43 backslashes which does make a big difference
842:46 when we run this it's going to populate
842:47 our very first data frame so let's go
842:49 ahead and run it and now we have this
842:52 CSV in here with our country and our
842:54 region
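A hedged sketch of why the r prefix matters, followed by a read_csv call on a stand-in file. The Windows-style path and the CSV contents below are made up for illustration; the real file in the video has more rows and columns. In a normal string literal, backslash sequences such as \n and \t are turned into special characters, and a pasted path containing \U (as in \Users) is one likely source of the Unicode error mentioned above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical path: without r, "\n" and "\t" become a newline and a tab.
plain = "C:\new_folder\table.csv"
raw = r"C:\new_folder\table.csv"

print("\n" in plain)  # the plain literal contains a real newline character
print("\n" in raw)    # the raw literal keeps the backslash and the letter n

# Stand-in CSV written to a temporary folder so read_csv has something to load.
csv_path = os.path.join(tempfile.mkdtemp(), "countries of the world.csv")
with open(csv_path, "w") as f:
    f.write("Country,Region\nAfghanistan,Asia\nAlbania,Europe\n")

# read_csv returns a data frame, not a string.
df = pd.read_csv(csv_path)
print(df)
```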
842:57 region now if we go and pull up this file and let's do that really quickly
842:59 file and let's do that really quickly let's bring up this country's of the
843:00 let's bring up this country's of the world it automatically populated those
843:02 world it automatically populated those headers for us in the data frame but we
843:05 headers for us in the data frame but we don't have any column for those 0 1 2 3
843:08 don't have any column for those 0 1 2 3 so if we go back as you can see right
843:10 so if we go back as you can see right here there's this index and that's
843:11 here there's this index and that's really important in a data frame it
843:13 really important in a data frame it really makes a data frame a data frame
843:15 really makes a data frame a data frame and we use index a lot in pandas we're
843:18 and we use index a lot in pandas we're able to filter on the index search on
843:20 able to filter on the index search on the index and a lot of other things
843:21 the index and a lot of other things which I'll show you in future videos but
843:24 which I'll show you in future videos but this is basically how you read in a file
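A minimal sketch of the raw-string fix described above. The path is hypothetical, and a small in-memory CSV stands in for the countries of the world file so the snippet runs anywhere. The Unicode error comes from the \U in C:\Users, which a normal string literal treats as the start of a Unicode escape; the r prefix switches that off.

```python
import io

import pandas as pd

# Without the r prefix, "C:\Users\..." fails because \U starts a Unicode escape.
# The r prefix keeps every backslash as a literal character.
raw_path = r"C:\Users\example\Downloads\countries of the world.csv"  # hypothetical path
escaped_path = "C:\\Users\\example\\Downloads\\countries of the world.csv"
print(raw_path == escaped_path)  # both spellings describe the same path

# Stand-in for the file on disk (made-up sample rows).
csv_text = "Country,Region\nAfghanistan,Asia\nAlbania,Europe\nAlgeria,Africa\n"
df = pd.read_csv(io.StringIO(csv_text))  # with a real file: pd.read_csv(raw_path)

print(df)
print(df.index)  # the automatic index pandas adds on the left
```

With a real file you would pass raw_path straight to pd.read_csv; the header row becomes the column names and pandas adds the integer index on its own.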
843:27 this is basically how you read in a file now if we go right up here in between
843:28 now if we go right up here in between these parentheses and we hit shift tab
843:31 these parentheses and we hit shift tab this is going to come up for us let's
843:33 this is going to come up for us let's hit this plus button and what this is is
843:36 hit this plus button and what this is is these are all the arguments or all the
843:38 these are all the arguments or all the things that we can specify when we're
843:41 things that we can specify when we're reading in a file and there are a lot of
843:43 reading in a file and there are a lot of different options so let's go ahead
843:45 different options so let's go ahead and take a look really quickly really
843:46 and take a look really quickly really quickly I wanted to give a huge shout
843:48 quickly I wanted to give a huge shout out to the sponsor of this entire pandas
843:49 out to the sponsor of this entire pandas series and that is Udemy Udemy has some
843:52 series and that is Udemy Udemy has some of the best courses at the best prices
843:53 of the best courses at the best prices and it is no exception when it comes to
843:55 and it is no exception when it comes to pandas courses if you want to master
843:57 pandas courses if you want to master pandas this is the course that I would
843:58 pandas this is the course that I would recommend it's going to teach you just
844:00 recommend it's going to teach you just about everything you need to know about
844:02 about everything you need to know about pandas so huge shout out to Udemy for
844:03 pandas so huge shout out to Udemy for sponsoring this pandas series and let's
844:05 sponsoring this pandas series and let's get back to the video the first thing is
844:06 get back to the video the first thing is obviously the file path we can specify a
844:09 obviously the file path we can specify a separator which there is no default so
844:13 separator which there is no default so when we're pulling in this CSV when
844:14 when we're pulling in this CSV when we're reading in the CSV it's
844:16 we're reading in the CSV it's automatically going to assume it's a
844:17 automatically going to assume it's a comma because it's a comma separated uh file
844:21 comma because it's a comma separated uh file you can choose delimiters headers names
844:23 you can choose delimiters headers names index columns and a lot of other things
844:26 index columns and a lot of other things as you can see right here now I will say
844:28 as you can see right here now I will say that I don't use almost any of these uh
844:32 that I don't use almost any of these uh the few that I'm going to show you
844:33 the few that I'm going to show you really quickly in just a second are up
844:35 really quickly in just a second are up the very top but you can do a ton of
844:38 the very top but you can do a ton of different things and I'm just going to
844:39 different things and I'm just going to slowly go through them so that's what
844:41 slowly go through them so that's what those are you can also go down here this
844:44 those are you can also go down here this is our docstring and you can see
844:46 is our docstring and you can see exactly how these parameters work it'll
844:49 exactly how these parameters work it'll show you and give you a text and walk
844:51 show you and give you a text and walk you through how to do this again most of
844:53 you through how to do this again most of these you'll probably never use but
844:56 these you'll probably never use but things like a separator could actually
844:57 things like a separator could actually be useful and things like a header could
844:59 be useful and things like a header could be useful because it is possible that
845:02 be useful because it is possible that you want to either rename your headers
845:04 you want to either rename your headers or you don't have a header in your CSV
845:07 or you don't have a header in your CSV and you don't want it to auto-populate
845:09 and you don't want it to auto-populate that header so that is something that
845:11 that header so that is something that you can specify so for example this
845:13 you can specify so for example this header one I'll show you how to do this
845:15 header one I'll show you how to do this uh the default behavior is to infer
845:17 uh the default behavior is to infer that there are column names if no names
845:19 that there are column names if no names are passed this behavior is identical to
845:21 are passed this behavior is identical to header equals zero so it's saying that
845:23 header equals zero so it's saying that first row or that first index which is
845:26 first row or that first index which is like right here that zero is going to be
845:29 like right here that zero is going to be read in as a header but we can come
845:32 read in as a header but we can come right over here and we'll do comma
845:34 right over here and we'll do comma header is equal to and we can say none
845:38 header is equal to and we can say none and as you can see there are no headers
845:40 and as you can see there are no headers now instead it's another index so we
845:43 now instead it's another index so we have indexes on both the x-axis
845:45 have indexes on both the x-axis and the y-axis and so right now we have
845:47 and the y-axis and so right now we have this zero and one index indicating the
845:50 this zero and one index indicating the first column and the second column if we
845:52 first column and the second column if we want to specify those names we can say
845:54 want to specify those names we can say the header equals none then we can say
845:57 the header equals none then we can say names is equal to and we'll give it a
846:00 names is equal to and we'll give it a list and so the first one was country
846:04 list and so the first one was country and what's that second one oh region so
846:07 and what's that second one oh region so right here that's the first um the first
846:10 right here that's the first um the first row but we'll rename it and we'll just
846:12 row but we'll rename it and we'll just say country region and when we run that
846:15 say country region and when we run that we've now populated the country and the
846:17 we've now populated the country and the region uh we're just pretending that our
846:19 region uh we're just pretending that our CSV does not have these values in it and
846:21 CSV does not have these values in it and we have to name it ourselves that's how
846:23 we have to name it ourselves that's how you do it but let's get rid of all that
846:25 you do it but let's get rid of all that because we actually do want those in
846:27 because we actually do want those in there so we're just going to get rid of
846:29 there so we're just going to get rid of those and read it in as normal and there
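The header and names arguments from this part can be sketched like this. It is only an illustration: the in-memory CSV and its rows are made-up stand-ins for the real file.

```python
import io

import pandas as pd

csv_text = "Country,Region\nAfghanistan,Asia\nAlbania,Europe\n"  # made-up sample

# Default behaviour: the first row (header=0) is inferred as the column names.
df_default = pd.read_csv(io.StringIO(csv_text))

# header=None: no row is treated as a header, so the columns are labelled 0 and 1
# and the original header line becomes the first row of data.
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# header=None plus names=[...]: supply the column names yourself.
df_named = pd.read_csv(io.StringIO(csv_text), header=None, names=["Country", "Region"])

print(df_default)
print(df_no_header)
print(df_named)
```

Because this sample still contains its own header line, df_named keeps that line as its first data row, which matches the "pretending the CSV has no header" demo in the video.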
846:32 those and read it in as normal and there we go now typically when you're reading
846:34 we go now typically when you're reading in a file what you need to do is you
846:37 in a file what you need to do is you want to assign that to a variable almost
846:39 want to assign that to a variable almost always when you see any tutorial or
846:42 always when you see any tutorial or anybody online or even when you're
846:43 anybody online or even when you're actually working people will say DF is
846:46 actually working people will say DF is equal to DF stands for data frame again
846:49 equal to DF stands for data frame again this is a data frame in the next video
846:52 this is a data frame in the next video in this series I'm going to walk through
846:53 in this series I'm going to walk through what a series is as well as what a data
846:56 what a series is as well as what a data frame is because that's pretty important
846:57 frame is because that's pretty important to know when you're working with these
846:59 to know when you're working with these data frames but we'll assign it to this
847:01 data frames but we'll assign it to this value and then we'll say we'll call it
847:04 value and then we'll say we'll call it by saying DF and we'll run it and that's
847:06 by saying DF and we'll run it and that's typically how you'll do things because
847:07 typically how you'll do things because you want to save this data frame so
847:09 you want to save this data frame so later on you can do things like data
847:11 later on you can do things like data frame dot and you can uh you know call
847:14 frame dot and you can uh you know call different methods but you can't
847:16 different methods but you can't really do that it's not as easy to do it
847:17 really do that it's not as easy to do it if you're calling this entire CSV and
847:19 if you're calling this entire CSV and importing it every time so let's copy
847:22 importing it every time so let's copy this because now we're going to import a
847:25 this because now we're going to import a different type of file so now we've been
847:27 different type of file so now we've been doing read CSV but we can also import
847:31 doing read CSV but we can also import text files now you can do that with the
847:33 text files now you can do that with the read CSV we can import text files let's
847:36 read CSV we can import text files let's look at this one we have the same one
847:37 look at this one we have the same one it's countries of the world except now
847:39 it's countries of the world except now it's a text file because I just
847:41 it's a text file because I just converted it for this video I'll copy
847:43 converted it for this video I'll copy that as a path and so now when we do
847:45 that as a path and so now when we do this oops let me get
847:47 this oops let me get those quotes in there it'll say world.txt
847:50 those quotes in there it'll say world.txt it will still run but as you can see
847:53 it will still run but as you can see this did not import properly um we have
847:56 this did not import properly um we have this country \t region and then
847:59 this country \t region and then all of our values are the exact same
848:00 all of our values are the exact same with this \t that's because we
848:02 with this \t that's because we need to use a separator and I'll show
848:05 need to use a separator and I'll show you in just a little bit how we can do
848:06 you in just a little bit how we can do this in a different way but with that
848:08 this in a different way but with that read CSV this is how we can do it we'll
848:10 read CSV this is how we can do it we'll just say sep is equal to we need to do
848:14 just say sep is equal to we need to do \t now let's try running this and
848:17 \t now let's try running this and as you can see it now has it broken out
848:19 as you can see it now has it broken out into country and region we could also do
848:22 into country and region we could also do it the more proper way and this is the
848:24 it the more proper way and this is the way you should do it and I'll get rid of
848:26 way you should do it and I'll get rid of these really quickly but just want to
848:29 these really quickly but just want to keep them there in case you want to see
848:30 keep them there in case you want to see that but you can also do read table and
848:36 that but you can also do read table and let's get rid of this
848:37 let's get rid of this separator and now we have no separators
848:39 separator and now we have no separators just reading it in as a table let's run
848:42 just reading it in as a table let's run this and it reads it in properly the
848:44 this and it reads it in properly the first time this read table can be used
848:46 first time this read table can be used for tons of different data types but
848:48 for tons of different data types but typically I've been using it for like
848:49 typically I've been using it for like text files um we can also read in that
848:52 text files um we can also read in that CSV so let's change this right here to
848:54 CSV so let's change this right here to CSV we can read it in as a CSV but just
848:57 CSV we can read it in as a CSV but just like we did in the last one when we read
848:59 like we did in the last one when we read in the text file using read CSV this
849:02 in the text file using read CSV this read table you're going to need to
849:03 read table you're going to need to specify the separator so I'll just copy
849:07 specify the separator so I'll just copy this and we'll say comma and now it
849:11 this and we'll say comma and now it reads it in properly again you can use
849:12 reads it in properly again you can use that for a ton of different file types
849:14 that for a ton of different file types but you just need to specify a few more
849:16 but you just need to specify a few more things if you don't want to use the more
849:17 things if you don't want to use the more specific read_ function when
849:20 specific read_ function when you're using pandas now let's copy this
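Here is a rough side-by-side of read_csv and read_table on the two kinds of file from this section. The in-memory text and its rows are made-up stand-ins for the .txt and .csv files.

```python
import io

import pandas as pd

tsv_text = "Country\tRegion\nAfghanistan\tAsia\nAlbania\tEurope\n"  # tab-separated
csv_text = "Country,Region\nAfghanistan,Asia\nAlbania,Europe\n"      # comma-separated

# read_csv assumes commas, so a tab-separated file lands in one single column.
wrong = pd.read_csv(io.StringIO(tsv_text))

# Telling read_csv about the tab fixes it.
tab_via_csv = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# read_table assumes tabs, so no separator is needed here.
tab_via_table = pd.read_table(io.StringIO(tsv_text))

# Going the other way, read_table needs sep="," for a comma-separated file.
csv_via_table = pd.read_table(io.StringIO(csv_text), sep=",")

print(wrong.shape)
print(tab_via_table)
```

The two functions read the same formats; they mainly differ in which separator they assume when you don't pass one.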
849:23 you're using pandas now let's copy this again we're going to go right down here
849:25 again we're going to go right down here and now let's do Json files Json files
849:28 and now let's do Json files Json files usually hold semi structured data um
849:31 usually hold semi structured data um which is definitely different than very
849:33 which is definitely different than very structured data like a CSV where it has
849:35 structured data like a CSV where it has columns and rows so let's go to our file
849:38 columns and rows so let's go to our file explorer we have this Json sample we
849:41 explorer we have this Json sample we will copy this in as the
849:45 will copy this in as the path let's paste it right here and we'll
849:48 path let's paste it right here and we'll do read_json again these different
849:51 do read_json again these different functions were built out specifically
849:53 functions were built out specifically for these file types that's why you know
849:55 for these file types that's why you know each one has a different name so now
849:57 each one has a different name so now we're reading this in as the
849:59 we're reading this in as the Json let's read it in and it read it in
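A small sketch of read_json. The JSON text below is a made-up sample in the list-of-records layout; the real sample file in the video is much wider.

```python
import io

import pandas as pd

# Made-up JSON: a list of records, where each record becomes one row.
json_text = (
    '[{"Country": "Afghanistan", "Region": "Asia"},'
    ' {"Country": "Albania", "Region": "Europe"}]'
)

# With a file on disk you would pass the (raw string) path instead of StringIO.
df_json = pd.read_json(io.StringIO(json_text))

print(df_json)
```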
850:08 properly now let's go ahead and copy this and take a look at Excel files
850:10 this and take a look at Excel files because Excel files are a little bit
850:11 because Excel files are a little bit different than other ones that we've
850:12 different than other ones that we've looked at
850:13 looked at um so let's just do read_excel
850:17 um so let's just do read_excel and let's go down to our file
850:20 and let's go down to our file explorer and let's actually open up this
850:23 explorer and let's actually open up this workbook as you can see we have sheet
850:25 workbook as you can see we have sheet one right here but we also have this
850:27 one right here but we also have this world population which has a lot more
850:29 world population which has a lot more data let's say we just wanted to read in
850:32 data let's say we just wanted to read in sheet one we can do that or by default
850:35 sheet one we can do that or by default it's going to read in this world
850:36 it's going to read in this world population because it's the first sheet
850:38 population because it's the first sheet in the Excel file well let's go ahead
850:40 in the Excel file well let's go ahead and take a look at that let's get out of
850:42 and take a look at that let's get out of here and let's say oops I forgot to
850:45 here and let's say oops I forgot to copy the file path let's go ahead and
850:48 copy the file path let's go ahead and copy as path and we'll put it right
850:53 copy as path and we'll put it right here and let's just read it in with no
850:56 here and let's just read it in with no arguments or anything in there or no
850:58 arguments or anything in there or no parameters when we read it in it's
851:00 parameters when we read it in it's reading in that very first sheet so this
851:03 reading in that very first sheet so this is the one that has all of the data now
851:05 is the one that has all of the data now let's say we wanted to read in that
851:06 let's say we wanted to read in that extra sheet name or the second sheet
851:08 extra sheet name or the second sheet name we'll just go comma sheet_name
851:12 name we'll just go comma sheet_name is equal to and then we can
851:14 is equal to and then we can specify sheet was it sheet one like this
851:18 specify sheet was it sheet one like this yes it was so we just had to specify the
851:20 yes it was so we just had to specify the sheet name right here and then it
851:22 sheet name right here and then it brought in that sheet instead of the
851:24 brought in that sheet instead of the default which is the very first sheet in
851:26 default which is the very first sheet in that Excel now that definitely covers a
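The sheet_name behaviour can be sketched without the real workbook by building a tiny two-sheet file in memory. This assumes the openpyxl package is installed, and the sheet names and rows are made-up stand-ins.

```python
import io

import pandas as pd

# Build a small workbook in memory; the first sheet plays the role of the
# world population sheet and the second one plays the role of Sheet1.
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"Rank": [1, 2], "Country": ["Country A", "Country B"]}).to_excel(
        writer, sheet_name="world_population", index=False
    )
    pd.DataFrame({"Country": ["Country A"], "Region": ["Region X"]}).to_excel(
        writer, sheet_name="Sheet1", index=False
    )

buffer.seek(0)
first_sheet = pd.read_excel(buffer)  # no sheet_name: reads the first sheet

buffer.seek(0)
sheet_one = pd.read_excel(buffer, sheet_name="Sheet1")  # pick a sheet by name

print(first_sheet)
print(sheet_one)
```

With a real workbook you would pass the file path instead of the buffer; sheet_name also accepts a position such as 1 for the second sheet.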
851:28 that Excel now that definitely covers a lot of how you read in those files again
851:31 lot of how you read in those files again you can come in here and hit shift Tab
851:33 you can come in here and hit shift Tab and this plus sign and take a look at
851:34 and this plus sign and take a look at all the documentation and you can
851:36 all the documentation and you can specify a lot of different things things
851:38 specify a lot of different things things that I didn't think were very important
851:40 that I didn't think were very important for you guys to know especially if
851:42 for you guys to know especially if you're just starting out the ones that
851:43 you're just starting out the ones that we looked at today are what I would say
851:45 we looked at today are what I would say are like the ones that I use almost all
851:47 are like the ones that I use almost all the time so I wanted to show you those
851:49 the time so I wanted to show you those but if you're interested in any of these
851:51 but if you're interested in any of these other ones or you have very unique data
851:52 other ones or you have very unique data and you need to do that um you know it's
851:55 and you need to do that um you know it's worth really getting in here and
851:57 worth really getting in here and figuring things out a few other things
851:59 figuring things out a few other things that I wanted to show you just in this
852:00 that I wanted to show you just in this kind of first video or this intro video
852:03 kind of first video or this intro video on how to read in files um one thing
852:05 on how to read in files um one thing that you may have noticed especially in
852:06 that you may have noticed especially in this file right here is we're only
852:09 this file right here is we're only looking at the first five and then the
852:12 looking at the first five and then the last five so if we wanted to see all the
852:14 last five so if we wanted to see all the data all the data is in these like
852:16 data all the data is in these like little three dots right here right we
852:18 little three dots right here right we want to be able to see that data but
852:22 want to be able to see that data but right now we can't and that's because of
852:24 right now we can't and that's because of some settings that are already within
852:25 some settings that are already within pandas and all we need to do is change
852:28 pandas and all we need to do is change that so this one has 234 rows and four
852:31 that so this one has 234 rows and four columns so obviously we can see all the
852:33 columns so obviously we can see all the columns well let's just change the rows
852:35 columns well let's just change the rows all we'll say is pd.set_option
852:40 all we'll say is pd.set_option now what we need to do is we're going to
852:42 now what we need to do is we're going to change the rows we're not going to
852:44 change the rows we're not going to change the columns at least not on this
852:45 change the columns at least not on this one so we'll say quote
852:49 one so we'll say quote display.max.rows now if we just run
852:53 display.max.rows now if we just run this for whatever data we bring in it's
852:57 this for whatever data we bring in it's going to be able to show the max rows
852:58 going to be able to show the max rows and then we'll say
853:00 and then we'll say 235 although there's 234 rows I'm just
853:03 235 although there's 234 rows I'm just going to be safe let's run
853:05 going to be safe let's run this and now it has changed it so let's
853:08 this and now it has changed it so let's read in this file again and you'll see
853:10 read in this file again and you'll see how it's
853:11 how it's changed now we have all the numbers and
853:14 changed now we have all the numbers and we have this little bar on the right
853:16 we have this little bar on the right that allows us to go down all the way to
853:18 that allows us to go down all the way to the bottom and all the way to the top so
853:21 the bottom and all the way to the top so now we can actually look and kind of
853:22 now we can actually look and kind of skim and see our values I like that
853:25 skim and see our values I like that better than just having that you know
853:26 better than just having that you know shorter version um we can do the exact
853:29 shorter version um we can do the exact same thing on columns as well so if we
853:31 same thing on columns as well so if we look at this one this is our Json file
853:34 look at this one this is our Json file has the same thing right here we have
853:36 has the same thing right here we have what was it 38 columns but we can only
853:39 what was it 38 columns but we can only see I think it's maybe it's 20 or
853:41 see I think it's maybe it's 20 or something like that I can't remember um
853:43 something like that I can't remember um but we have 38 we can only see like
853:45 but we have 38 we can only see like let's say 15 of them or 20 of them we'll
853:47 let's say 15 of them or 20 of them we'll do the exact same thing and we'll just
853:49 do the exact same thing and we'll just say pd.set_option
853:54 say pd.set_option display.max.columns
853:56 display.max.columns and we'll set that to 40 for
853:59 columns and we'll set that to 40 for that one when we run this oops let's get
854:03 that one when we run this oops let's get over here when we run this one again we
854:06 over here when we run this one again we can now scroll over and see every single
854:09 can now scroll over and see every single one of our columns now that one is a in
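The two display settings from this part, written out as a short sketch. It uses the option names as they are listed in the pandas options documentation (display.max_rows and display.max_columns); the numbers are the ones chosen in the video.

```python
import pandas as pd

# Show up to 235 rows before pandas collapses the middle into "...".
pd.set_option("display.max_rows", 235)

# Show up to 40 columns before pandas hides the middle ones.
pd.set_option("display.max_columns", 40)

# get_option reads a setting back, which is handy for checking what is active.
print(pd.get_option("display.max_rows"))
print(pd.get_option("display.max_columns"))
```

These settings apply to every data frame displayed afterwards in the same session; pd.reset_option("display.max_rows") puts a setting back to its default.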
854:11 one of our columns now that one is a in my opinion a lot more useful I like
854:13 my opinion a lot more useful I like being able to see every single column so
854:16 being able to see every single column so definitely something that you should be
854:17 definitely something that you should be using especially when you have these
854:18 using especially when you have these really large files you want to be able
854:20 really large files you want to be able to see a lot of the data and a lot of
854:22 to see a lot of the data and a lot of the columns so when you're slicing and
854:24 the columns so when you're slicing and dicing and doing all the things that
854:25 dicing and doing all the things that we're about to learn in this pandas
854:27 we're about to learn in this pandas series you know what you're
854:29 series you know what you're looking at I also want to show you just
854:30 looking at I also want to show you just how to kind of look at your data in
854:32 how to kind of look at your data in these data frames as well that's also
854:34 these data frames as well that's also pretty important so let's go right down
854:36 pretty important so let's go right down here and the very last one that we
854:38 here and the very last one that we imported was this one right here this
854:40 imported was this one right here this read Excel so this data frame is the
854:42 read Excel so this data frame is the only one that's going to read in
854:44 only one that's going to read in let's run it um this is the last one to
854:46 let's run it um this is the last one to be run so this variable right here DF uh
854:49 be run so this variable right here DF uh it won't be applied to all these other
854:51 it won't be applied to all these other ones which we can always go back and
854:53 ones which we can always go back and change those typically you'll do
854:54 change those typically you'll do something like data frame two you want
854:56 something like data frame two you want to do something like that um so let's
854:59 to do something like that um so let's keep data Frame 2 oops so what we're
855:02 keep data Frame 2 oops so what we're going to do is we're going to bring data
855:03 going to do is we're going to bring data Frame 2 right down here and we want to
855:05 Frame 2 right down here and we want to take a look at some of this data we want
855:07 take a look at some of this data we want to know a little bit more about it
855:09 to know a little bit more about it something that you can do is dataframe2.info
855:10 something that you can do is dataframe2.info and we'll do an open parenthesis
855:14 and we'll do an open parenthesis and when we run this it's going to give
855:16 and when we run this it's going to give us a really quick breakdown of a little
855:18 us a really quick breakdown of a little bit of our data so we have our columns
855:20 bit of our data so we have our columns right here rank CCA 3 country and
855:23 right here rank CCA 3 country and capital it's saying we have
855:25 capital it's saying we have 234 values in those columns because
855:29 234 values in those columns because there's
855:30 there's 234 scroll up here because there's
855:33 234 scroll up here because there's 234 uh rows that tells me that there's
855:37 234 uh rows that tells me that there's no missing data in here at least not you
855:39 no missing data in here at least not you know completely missing like null values
855:42 know completely missing like null values there is something in each of
855:43 there is something in each of those rows the count tells me it's non-null
855:45 those rows the count tells me it's non-null so there's no null values and it
855:47 so there's no null values and it tells me the data type so it's reading
855:49 tells me the data type so it's reading in as an integer an object an object and
855:51 in as an integer an object an object and an object and it also tells us how much
855:53 an object and it also tells us how much memory it's using which is also pretty
855:55 memory it's using which is also pretty neat because when you get really really
855:57 neat because when you get really really large data sets memory usage and
855:59 large data sets memory usage and knowing how to work around that stuff
856:01 knowing how to work around that stuff does become more important than when
856:03 does become more important than when you're working with these really small you
856:05 you're working with these really small you know sample sizes that we're looking at
856:07 know sample sizes that we're looking at we can also do oops let me get rid of
856:10 we can also do oops let me get rid of that we can also do dataframe2
856:13 that we can also do dataframe2 and we'll do shape and for this one we
856:16 and we'll do shape and for this one we do not need the
856:17 do not need the parentheses and all this is going to
856:19 parentheses and all this is going to tell us is we have 234 rows and four
856:21 tell us is we have 234 rows and four columns we're also able to look at uh
856:25 columns we're also able to look at uh the first few values or rows in each of
856:28 the first few values or rows in each of these data frames so we can just say
856:30 these data frames so we can just say dataframe2.head and if we do that
856:33 dataframe2.head and if we do that it's going to give us the first five
856:34 it's going to give us the first five values but we can specify how many we
856:37 values but we can specify how many we want we can say head 10 it'll give us
856:39 want we can say head 10 it'll give us the first 10 rows right here we can do
856:42 the first 10 rows right here we can do the exact same thing and let's go right
856:44 the exact same thing and let's go right down here and we'll say tail so that'll
856:47 down here and we'll say tail so that'll give us the last 10 rows within our data
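To try info, shape, head and tail without the workbook, here is a sketch on a made-up data frame with the same dimensions as the one in the video (234 rows, four columns). The values themselves are placeholders, not the real population data.

```python
import pandas as pd

# Placeholder stand-in for the world population sheet: 234 rows, 4 columns.
dataframe2 = pd.DataFrame({
    "Rank": range(1, 235),
    "CCA3": [f"C{i:03d}" for i in range(1, 235)],
    "Country": [f"Country {i}" for i in range(1, 235)],
    "Capital": [f"Capital {i}" for i in range(1, 235)],
})

dataframe2.info()            # method: columns, non-null counts, dtypes, memory usage
print(dataframe2.shape)      # attribute, so no parentheses: (rows, columns)
print(dataframe2.head())     # first five rows by default
print(dataframe2.head(10))   # first ten rows
print(dataframe2.tail(10))   # last ten rows
```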
856:50 give us the last 10 rows within our data frame now let's copy this and let's say
856:52 frame now let's copy this and let's say we don't want to actually look at all of
856:54 we don't want to actually look at all of these values or all these columns we can
856:57 these values or all these columns we can specify that by saying df2 and oops
857:00 specify that by saying df2 and oops let's get rid of all of
857:02 let's get rid of all of this and we'll say with a quote we'll
857:06 this and we'll say with a quote we'll say Rank and now we can take just a look
857:09 say Rank and now we can take just a look at the rank data now we can't do that by
857:12 at the rank data now we can't do that by doing the index or at least not like
857:14 doing the index or at least not like this if we want to use this index that
857:17 this if we want to use this index that is right here we can but there's a very
857:19 is right here we can but there's a very special function called Lo and IO for
857:22 special function called Lo and IO for that and I'm going to have an entire
857:23 that and I'm going to have an entire video on this because it does get a
857:25 video on this because it does get a little bit more complex but there's
857:28 little bit more complex but there's df2 and there's Lo and I stands for
857:31 df2 and there's Lo and I stands for location and I location that's only for
857:33 location and I location that's only for the indexes whether it's the x axis or
857:36 the indexes whether it's the x axis or the Y AIS those are the indexes and for
857:39 the Y AIS those are the indexes and for location it's looking for the actual
857:41 location it's looking for the actual text the actual string of the index so
857:44 text the actual string of the index so if we come up here that data Frame 2 we
857:46 if we come up here that data Frame 2 we can specify
857:48 can specify 224 and it'll give us this information
857:50 224 and it'll give us this information right here in a little different format
857:52 right here in a little different format so let's go bracket and we'll say
857:56 so let's go bracket and we'll say 224 and when we run this it gives us our
857:59 224 and when we run this it gives us our rank CCA country capital with our values
858:02 rank CCA country capital with our values over here kind of like a dictionary
858:04 over here kind of like a dictionary almost now let's copy this and we'll say
858:07 almost now let's copy this and we'll say df2.iloc and right now these look the
858:12 df2.iloc and right now these look the exact same but we haven't really talked
858:14 exact same but we haven't really talked a lot about changing the index and you
858:16 a lot about changing the index and you can change the index to a string or a
858:19 can change the index to a string or a different column or something like that
858:20 different column or something like that and we'll look at that in future videos
858:22 and we'll look at that in future videos the iock looks at the integer location
858:24 the iock looks at the integer location so even if these um let's go right up
858:27 so even if these um let's go right up here even if this index had changed to
858:30 here even if this index had changed to let's say this rank or this CCA 3 or
858:32 let's say this rank or this CCA 3 or country or whatever you make this index
858:34 country or whatever you make this index the ILO will still look at the integer
858:37 the ILO will still look at the integer location so that 224 would still be 224
858:40 location so that 224 would still be 224 even if it was usbekistan
858:42 even if it was usbekistan so then when we look at this it's going
858:44 so then when we look at this it's going to be the exact same but if we had
858:47 to be the exact same but if we had changed that Index this Lo is the one
858:49 changed that Index this Lo is the one that we could search on and we could
858:51 that we could search on and we could search
858:57 usuzan is that how you spell usbekistan hey I nailed it so that is how you use
859:01 hey I nailed it so that is how you use Lo and IO again I just wanted to show
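Here is a minimal sketch of that loc versus iloc comparison. It assumes a tiny made-up frame standing in for the world population file, so the column names and values are illustrative only. Note that loc and iloc are used with square brackets rather than called like a regular function.

```python
import pandas as pd

# Tiny stand-in for the world population data (values are illustrative only).
df2 = pd.DataFrame({
    "Rank": [36, 138, 34],
    "CCA3": ["AFG", "ALB", "DZA"],
    "Country": ["Afghanistan", "Albania", "Algeria"],
})

by_label = df2.loc[2]      # looks up the row whose index label is 2
by_position = df2.iloc[2]  # looks up the row sitting at integer position 2

# With the default 0, 1, 2, ... index the label and the position coincide,
# which is why the two results look the same at this point.
print(by_label.equals(by_position))
```

Once the index is changed to something like the country name, only iloc keeps working with plain integers, and loc expects the new labels instead.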
859:03 again I just wanted to show
859:04 you a little bit about how you can look
859:06 at your data frame or search within your
859:08 data frame now in future videos I'm
859:09 going to dive a lot deeper into a lot of
859:11 the concepts that we just looked at
859:13 because I just kind of touched on them I
859:14 wanted you to have a brief introduction
859:16 to them so that in future videos I'm not
859:18 just dropping everything on you all at
859:20 once so hopefully this was a good quick
859:22 introduction to those topics you
859:24 should be able to read in a file now see
859:26 your data frame and kind of look at it
859:28 in a few different ways that we just
859:29 looked at and I hope that that was
859:30 helpful and if it was be sure to check
859:32 out all my other videos on Python and
859:34 pandas and if you like this video be
859:36 sure to like and subscribe below and I
859:38 will see you in the next
859:41 [Music]
859:43 video
859:49 [Music]
859:51 hello everybody today we're going to be
859:52 looking at filtering and ordering data
859:54 frames in pandas there are a lot of
859:56 different ways you can filter and order
859:58 your data in pandas and I'm going to try
860:00 to show you all of the main ways that
860:02 you can do that so let's kick it off by
860:03 importing our data set so we're going to
860:06 say df is equal to and we'll say
860:09 pandas and I need to import my pandas so
860:13 we'll say import pandas as pd
860:15 that's pretty important I think so
860:19 pd.read_csv and we'll do r and then
860:23 we'll say the world population CSV so
860:26 let's run this and call our data frame right
860:28 here and this is the data frame that
860:30 we're going to be filtering through and
860:34 ordering in pandas so let's kick it off
860:36 the first thing that we can do is filter
860:38 based off of the columns so the data
860:41 within our columns so Asia Europe Africa
860:43 or whatever data we may have in that
860:45 column let's go right down here we're
860:48 going to say df and then within it we're
860:50 going to specify what column we're going
860:52 to be filtering on so we're going to say
860:54 df with another bracket and we'll say
860:56 Rank so we're going to be looking at
860:59 this Rank column right here and we'll
861:01 say in that Rank column we want to do
861:03 greater than 10 and that's actually
861:05 going to be a lot of them let's do less
861:08 than so when we run this it's only going
861:10 to return these values that are less
861:12 than 10 we can also do less than or equal
861:15 to you know all of these comparison
861:17 operators so less than or equal to so
861:19 now we have all of the ranks 1 through 10
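The comparison filter described here can be sketched as follows. This assumes a small made-up frame instead of the real CSV, so the ranks and countries are illustrative only, and the column name Rank is assumed to be capitalized as it appears on screen.

```python
import pandas as pd

# Illustrative rows only; the real file has one row per country.
df = pd.DataFrame({
    "Rank": [1, 8, 36, 10, 138],
    "Country": ["China", "Bangladesh", "Afghanistan", "Mexico", "Albania"],
})

# The inner expression builds a True/False Series, one value per row.
mask = df["Rank"] <= 10

# Passing that Series back into df[...] keeps only the rows marked True.
top_ten = df[mask]
print(top_ten)
```

Any of the usual comparison operators (<, <=, >, >=, ==, !=) can be used to build the mask in the same way.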
861:22 now if we look at these countries we
861:24 can specify by specific values almost
861:27 exactly like we did here but instead of
861:29 doing a comparison operator like we did
861:31 right here and including those names
861:34 let's say Bangladesh and Brazil we can
861:36 use the isin function almost like an IN
861:38 function in SQL if you know SQL so let's
861:40 go right down here and we're going to
861:44 say specific_countries so
861:45 right now we're just going to make a
861:48 list of the countries that we want and then we'll say
861:55 Bangladesh and
861:58 Brazil so let's go right down here and
862:01 we'll say okay for these specific
862:03 countries from the data frame let's do
862:06 our bracket we'll say in this Country
862:09 column so we'll do df and then
862:13 another bracket for Country so in this
862:16 Country column we can do .isin and
862:19 then an open parenthesis and then look
862:21 for our specific countries so we're
862:23 looking at just this column and we're
862:25 saying isin so we're looking at are
862:29 these values within this column and
862:31 we're getting this error and this looks
862:33 very very odd let me this doesn't
862:37 look right there we go I just had some
862:39 syntax errors I apologize made it way
862:41 more complicated than it needs to be but
862:44 here's how you use this isin function
862:46 so we're looking at Bangladesh and
862:49 Brazil and we return those rows with Bangladesh and Brazil
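A minimal sketch of the isin filter walked through above, again on a small made-up frame (the rows are illustrative, and the Country column name is assumed). The variable name specific_countries follows the one used in the video.

```python
import pandas as pd

# Illustrative rows only.
df = pd.DataFrame({
    "Rank": [8, 7, 1, 36],
    "Country": ["Bangladesh", "Brazil", "China", "Afghanistan"],
})

# The list of exact values we want to keep, similar to IN (...) in SQL.
specific_countries = ["Bangladesh", "Brazil"]

# isin returns True for rows whose Country appears in the list.
matches = df[df["Country"].isin(specific_countries)]
print(matches)
```

Note the bracket layout: the outer df[...] does the row selection, and the inner df["Country"].isin(...) only produces the True/False mask.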
862:51 really quickly I
862:52 wanted to give a huge shout out to the
862:54 sponsor of this entire pandas series and
862:56 that is Udemy Udemy has some of the best
862:58 courses at the best prices and it is no
863:00 exception when it comes to pandas
863:02 courses if you want to master pandas
863:03 this is the course that I would
863:04 recommend it's going to teach you just
863:06 about everything you need to know about
863:07 pandas so huge shout out to Udemy for
863:09 sponsoring this pandas series and let's get back to the video
863:11 we can also do a
863:14 contains function kind of similar to isin
863:17 except it's more like LIKE in SQL
863:19 as well I'm comparing a lot of this to
863:21 SQL because when you're filtering things
863:23 my brain always goes to SQL but
863:26 in pandas it's called contains so
863:29 let's actually copy this
863:30 because I don't want to make the same
863:33 mistake again let's do that and we'll do
863:36 the bracket but instead of .isin
863:39 we're going to do .str.contains and
863:42 then an open parenthesis so we're going
863:44 to be looking for a string if
863:47 it contains let's do
863:50 United almost like United States or
863:53 any other United so let's run this and
863:55 as you can see we have United Arab
863:57 Emirates United Kingdom United States
863:59 United States Virgin Islands so we can
864:02 kind of search for a specific string or
864:05 a number or a value within our data or within that column of country
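The contains filter can be sketched like this, using a small made-up frame (the rows and the Country column name are assumptions for illustration). The method lives under the .str accessor, so it applies to text columns.

```python
import pandas as pd

# Illustrative rows only.
df = pd.DataFrame({
    "Country": ["United Kingdom", "Brazil", "United States", "Zimbabwe"],
    "Continent": ["Europe", "South America", "North America", "Africa"],
})

# str.contains checks each value for the substring, similar to LIKE '%United%'.
united = df[df["Country"].str.contains("United")]
print(united)
```

Unlike isin, which needs the full exact value, this matches any row where the text appears somewhere in the string.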
864:08 now so far
864:09 we've only been looking at how you can
864:12 filter on these columns we can also
864:14 filter based off of the index as well
864:16 and there's two different ways you can
864:18 do it or two of the main ways there's
864:21 filter and then there's loc and iloc loc
864:23 stands for location and iloc stands for
864:25 integer location and if you've seen
864:27 other previous videos I've kind of
864:28 mentioned those so we can take a quick
864:31 look at all of those so really quickly
864:33 we need to set an index because the
864:35 index right now is not the best we'll
864:38 set our index to
864:41 Country so let's say
864:42 df2
864:48 is equal to df.set_index and we'll
864:51 say Country I'm just doing df2 because
864:52 later on I want to use that data frame
864:54 again so I'm just going to assign it to
864:56 another data frame so that we can just
864:59 easily switch back and forth so now we have this index as the country
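A short sketch of why assigning to df2 keeps the original frame available. It assumes a tiny made-up frame with a Country column; the values are illustrative only.

```python
import pandas as pd

# Illustrative rows only.
df = pd.DataFrame({
    "Rank": [36, 138],
    "Country": ["Afghanistan", "Albania"],
})

# set_index returns a new data frame by default; df itself is not modified.
df2 = df.set_index("Country")

print(df2.index.name)     # the index is now labelled Country
print(list(df.columns))   # the original still has Country as a normal column
```

Keeping both around means df can still be filtered on the Country column, while df2 can be searched by its index.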
865:01 and what we can do is use the filter function so
865:07 let's go down here we'll say
865:09 df2.filter and we'll do an open parenthesis
865:14 and now we can specify our items so
865:14 these are actually going to be
865:17 specifying which columns we want to keep
865:20 so we're going to say items is equal to
865:22 then we'll make a list we'll say
865:24 Continent hope that's how we spell
865:27 continent I'm always messing up
865:29 my spelling then we'll
865:33 do CCA3 because why not you can specify
865:36 whichever ones you want when we run this
865:38 it's going to only bring in those two columns
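The items form of filter can be sketched as follows, on a small made-up frame indexed by country (column names and values are assumptions for illustration).

```python
import pandas as pd

# Illustrative rows only, indexed by country as in the video.
df2 = pd.DataFrame({
    "Country": ["Afghanistan", "Albania"],
    "Rank": [36, 138],
    "CCA3": ["AFG", "ALB"],
    "Continent": ["Asia", "Europe"],
}).set_index("Country")

# items lists the labels to keep; for a data frame the default axis is the columns.
subset = df2.filter(items=["Continent", "CCA3"])
print(subset)
```

Note that filter here selects by label name. It does not look at the values inside the columns the way the boolean filters earlier did.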
865:40 now by default it's choosing the
865:43 axis for us but we can also specify
865:45 which axis we want to search on so if we
865:48 say axis is equal to zero it's actually
865:50 going to search this axis this is the
865:53 zero axis this is the one axis so where
865:56 our columns are is one so if we go back
865:59 and do one we're searching on that one
866:01 axis or those header axes again and
866:03 this is the default but you can specify
866:06 that so if you just want to search on
866:08 you know filtering right here you can do
866:11 that and let's actually copy this and do
866:12 that right down here just so you can see
866:15 what it looks like but let's search for
866:18 Zimbabwe and we'll do Zimbabwe and we'll
866:21 be looking at the zero axis which is the
866:24 up and down on the left hand side and
866:26 when we filter on that we can filter by
866:29 Zimbabwe by looking just at the country
866:32 index we can also use like just like
866:33 we did before and I'll show you the
866:36 exact same demonstration that we did
866:39 where you can say like is equal to and
866:41 instead of having to put in a concrete
866:44 text you can just say United
866:45 just like we did before and we're
866:47 searching where the axis is equal to
866:49 zero which again is this left-hand
866:52 axis so now we're looking for United
866:53 and it's going to give us all of the
866:56 countries or all the index values that
866:57 have United in it like we were talking about before
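Filtering on the row axis can be sketched like this. The frame is a small made-up stand-in indexed by country, so the rows are illustrative only.

```python
import pandas as pd

# Illustrative rows only, indexed by country.
df2 = pd.DataFrame({
    "Country": ["Albania", "United Kingdom", "United States", "Zimbabwe"],
    "Continent": ["Europe", "Europe", "North America", "Africa"],
}).set_index("Country")

# axis=0 tells filter to match against the index labels (the rows).
one_row = df2.filter(items=["Zimbabwe"], axis=0)

# like keeps every index label that contains the given text.
united = df2.filter(like="United", axis=0)

print(one_row)
print(united)
```

Leaving axis off (or passing axis=1) would match against the column headers instead, so like="United" would then return no columns for this frame.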
867:00 we also have loc and iloc so
867:06 we can say df2.loc now this is a
867:10 specific value so we'll do United States
867:12 so loc is just looking at the
867:15 actual name or the value of it not its
867:17 position so if we search for United
867:19 States it's going to give us this right
867:20 here where it gives us all of the
867:22 columns for United States and then all
867:27 of the values for United States or we
867:27 can
867:31 do the iloc which is the integer location
867:34 which is not the exact same because
867:37 we're looking at the string for the loc
867:39 we're looking at this string but
867:41 underneath it there still is a position
867:43 that's that integer location let's do a
867:46 completely random one let's just say
867:48 three if we look at position three
867:51 it's going to give us ASM which I'm not
867:53 exactly sure what it is but it still
867:54 gives us basically the same kind of
867:57 output which is the columns and the
867:58 values so that's another way that you
868:00 can search within your index when you're
868:03 actually trying to filter down that data
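Here is a sketch of how loc and iloc differ once the index holds country names. The frame is a small made-up stand-in (values are illustrative only), laid out so that position 3 holds the ASM row as in the video.

```python
import pandas as pd

# Illustrative rows only, indexed by country.
df2 = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Algeria", "American Samoa", "United States"],
    "CCA3": ["AFG", "ALB", "DZA", "ASM", "USA"],
    "Continent": ["Asia", "Europe", "Africa", "Oceania", "North America"],
}).set_index("Country")

# loc looks up the index label itself.
by_label = df2.loc["United States"]

# iloc ignores the labels and counts positions from zero,
# so 3 is the fourth row from the top.
by_position = df2.iloc[3]

print(by_label)
print(by_position)
```

Both calls return a single row as a Series, with the column names as its labels, which is the "columns and values" output described above.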
868:05 now let's go look at the order by and
868:07 let's start with the very first one that
868:09 we looked at let's do df that's
868:10 why I kept it because I wanted to use it
868:13 later now we can sort and order these
868:14 values instead of it just being kind of
868:18 a jumbled mess in here we can sort these
868:20 columns however we would like ascending
868:22 descending multiple columns single
868:24 columns and let's look at how to do that
868:26 so we'll say df and then we'll
868:29 do df look at Rank again just
868:32 like we were doing above and let's do
868:34 df where it's less than 10 I
868:36 should have just gone and copied this I
868:39 apologize so now we have this data frame
868:44 that is less than 10 now we can do
868:47 .sort_values and this is the function
868:49 that's going to allow us to sort
868:50 everything that we want to sort so we
868:54 can do by is equal to and we'll just
868:56 order it by the exact same thing that we
868:58 were doing or calling it on we'll do
869:00 Rank so now what this is going to do
869:03 it's going to order our Rank
869:05 column and as you can see it did that
869:08 1 2 3 4 5 we can also do it with
869:11 ascending or descending so if you want
869:12 to you can look here and see what you
869:14 can do so we'll do
869:17 ascending we'll say that's equal to
869:20 True and so that's the automatic default
869:22 so that didn't change anything but if we
869:24 say False it's going to be descending
869:26 from highest to lowest so now we have it in the opposite direction
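The single-column sort can be sketched as follows, chaining it onto the same kind of rank filter used earlier. The frame is a made-up stand-in with illustrative values.

```python
import pandas as pd

# Illustrative rows only.
df = pd.DataFrame({
    "Rank": [8, 1, 36, 7, 9],
    "Country": ["Bangladesh", "China", "Afghanistan", "Brazil", "Russia"],
})

# Filter first, then sort the filtered rows.
small = df[df["Rank"] < 10]

# ascending=True is the default, so these two orders are opposites.
lowest_first = small.sort_values(by="Rank")
highest_first = small.sort_values(by="Rank", ascending=False)

print(lowest_first)
print(highest_first)
```

sort_values returns a new sorted frame each time, so small keeps its original row order.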
869:29 now we don't
869:32 have to just order or sort this on one
869:34 single column we can do multiple columns
869:36 and we can do that by making a list
869:39 right here whoops make a
869:42 list just like that and we'll input
869:44 different ones as well so now let's
869:46 input our
869:49 Country and when we run this it will
869:51 give us rank of
869:54 9 8 7 6 as well as the country of Russia
869:57 Bangladesh Brazil now if you noticed the
869:59 country really didn't change because the
870:01 rank stayed the exact same that's
870:03 because there's an order of importance
870:04 here and it starts with the very first
870:08 one if we change this around and we look
870:09 at this
870:12 one and put a comma right here
870:14 now the country is going to be sorted descending
870:16 and the rank would come second so
870:18 the rank isn't really going to
870:21 have any effect here so now we have the
870:23 country United States Russia Pakistan
870:25 and the rank really didn't get ordered at all
870:27 now if we want to see how that
870:29 can actually work let's do Continent
870:32 right here and actually put it right
870:35 here and do Country here so if we run
870:37 this it's first going to come and it's
870:40 going to organize or sort the continent
870:42 then it's going to come back and go
870:44 to the country and then it's going to
870:47 sort the country so keep your
870:49 eye right here in this Asia area because
870:51 we're going to sort this differently
870:53 than ascending so we have ascending
870:55 False and that applies to both of these
870:58 it's False and False but we can specify
871:00 which one we want to do we can do a
871:02 False here and a True here so we'll do
871:05 False comma True and what this is going
871:07 to do is it's going to say False for the
871:09 continent so the continent right here is going to stay the exact same
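The explanation is cut short at this point in the recording, so here is a sketch of where the multi-column sort appears to be heading. It assumes a small made-up frame with Continent and Country columns; the rows are illustrative only.

```python
import pandas as pd

# Illustrative rows only, deliberately out of order.
df = pd.DataFrame({
    "Country": ["India", "Brazil", "China", "Russia", "Bangladesh"],
    "Continent": ["Asia", "South America", "Asia", "Europe", "Asia"],
})

# The first column in the list is sorted first; later columns only break ties.
# ascending takes one flag per column: Continent descending, Country ascending.
result = df.sort_values(by=["Continent", "Country"], ascending=[False, True])
print(result)
```

With these flags the continents still run from Z to A, while the countries inside each continent (for example the Asia group) now run from A to Z.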
871:12 and so that
871:14 is a lot of how you can filter and order
871:16 your data within pandas I hope that this
871:18 was helpful I hope that you enjoyed this
871:19 video if you liked it be sure to like
871:21 and subscribe below check out all my
871:23 other videos on Python and pandas and I
871:26 will see you in the next
871:36 [Music] video
871:39 hello everybody today we're going
871:40 to be looking at indexing in pandas if
871:42 you remember from previous videos the
871:44 index is an object that stores the
871:47 axis labels for all pandas objects the
871:49 index in a data frame is extremely
871:51 useful because it's customizable and you
871:53 can also search and filter based off of
871:55 that index in this video we're going to
871:56 talk all about indexing how you can
871:59 change the index and customize that as
872:00 well as how you can search and filter on
872:02 that index and then we're also going to
872:03 be looking at something a little bit
872:06 more advanced called multi-indexing and
872:08 you won't always use it but it's really
872:10 good to know in case you come across a
872:12 data frame that has that
872:15 so let's get started by importing pandas
872:18 import pandas as pd now we'll get our
872:21 first data frame we say df is equal to
872:25 pd.read_csv and I've already copied
872:28 this but we're going to do r and we're
872:30 going to put this file path so I have
872:32 this world population CSV I will have
872:34 that in the description just like I do
872:38 in all of my other videos let's run df
872:40 and let's take a look at this data frame
872:42 so we have a lot of information here we
872:46 have rank country continent population
872:48 as well as the default index from zero
872:50 all the way up to 233 now if you haven't
872:52 watched any of my previous videos on
872:54 pandas the index is pretty important and
872:56 it's basically just a number or a label
872:59 for each row it doesn't even necessarily
873:01 have to be a unique number you can
873:04 create or add an index yourself if you
873:06 want to and it doesn't have to be unique
873:08 but it really should be unique
873:09 especially if you want to use it appropriately
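A small sketch of why a unique index is easier to work with. The frame and its labels are made up purely for illustration.

```python
import pandas as pd

# A made-up frame whose index repeats the label "a".
dup = pd.DataFrame({"Population": [10, 20, 30]}, index=["a", "b", "a"])

# pandas allows the repeated label and reports it through is_unique.
print(dup.index.is_unique)

# A lookup on a repeated label returns every matching row, not just one.
rows_for_a = dup.loc["a"]
print(rows_for_a)
```

With a unique index, the same kind of lookup comes back as a single row, which is usually what you expect when searching by label.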
873:11 for what we're doing the
873:13 country is actually going to be a pretty
873:16 great index because the country you know
873:18 is going to be all unique because we're
873:19 looking at every single row as a
873:21 different country as well as the
873:24 population so let's go ahead and create
873:26 this country or add this country as our
873:28 index now we can do this in a lot of
873:30 different ways but the first way that
873:32 you can do this if you already know what
873:34 you are going to create that index on is
873:36 we can just go right in here when we're
873:39 reading in this file and we'll say comma
873:41 index underscore oops I spelled that
873:44 completely wrong index_col and
873:46 we'll say that is equal to and then
873:49 we're going to say quote Country so
873:51 we're taking this country and we're going to assign it as the index
873:54 going to assign it as the index now let's read this in and as you can see
873:57 let's read this in and as you can see this is our index now it looks a little
873:59 this is our index now it looks a little bit different we didn't have this
874:00 bit different we didn't have this country header right here which is
874:02 country header right here which is specifying that this is still the
874:04 specifying that this is still the country but you can tell that this is
874:05 country but you can tell that this is the index based off the um bold letters
874:08 the index based off the um bold letters as well as it being on the far left and
874:11 as well as it being on the far left and all the regular columns for the data is
874:13 all the regular columns for the data is over here while the country header is
874:15 over here while the country header is right here and it's lower than all the
874:17 right here and it's lower than all the others just a quick way that you can see
874:19 others just a quick way that you can see that that is the index now before we
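A minimal sketch of setting the index while reading the file. The CSV text, its column names, and the population figures are made-up stand-ins for the file used in the video; only `pd.read_csv` and its `index_col` parameter come from pandas itself.

```python
import io

import pandas as pd

# Hypothetical stand-in for the population CSV (placeholder values).
csv_text = """Country,Continent,Population
Afghanistan,Asia,10
Albania,Europe,20
Algeria,Africa,30
"""

# index_col names the column that becomes the row labels.
df = pd.read_csv(io.StringIO(csv_text), index_col="Country")

print(df)
print(df.index.name)       # the index keeps the column's name
print(df.index.is_unique)  # a good index should be unique
```

With a real file, the first argument would be the file path instead of the `io.StringIO` wrapper.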
874:21 now before we move on I want to show you some other
874:22 ways that you can do this as well but
874:24 I'm going to show you how to reverse
874:26 this index before we move on and we'll
874:29 say data frame so we had our data frame
874:31 right here so we have df dot
874:34 we'll say reset_index and then we'll say
874:38 inplace is equal to True which means we
874:41 don't have to assign this to another
874:42 variable and all that stuff it'll just
874:44 change the data frame itself so now when we run that data
874:46 frame again the index was reset to the
874:49 default numbers so now let's go down
874:51 here I'll show you how to do this in a
874:53 different way you can do df dot we'll say
874:55 set_index and then we'll just say
874:59 country so very similar to when we were
875:00 reading in that file and we set the
875:02 index or that index column we said
875:05 index_col equals country if we do this and
875:08 we run it it works but if we say df
875:11 right down here it's not going to
875:14 save that if we want to save it just
875:16 like we did above we're going to say
875:18 inplace is equal to True that is going to
875:22 save it to where we don't have to assign
875:24 it to another variable so now when we run
875:26 this the data frame right here
875:28 is going to be updated because we
875:30 said inplace is equal to True
875:32 so that country will now be our index
875:34 again let's run this and there we go
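A small sketch of the reset and set steps described above, assuming a toy data frame with a Country column (the names and numbers are placeholders, not the video's data). It shows that `set_index` without `inplace=True` returns a new data frame and leaves the original alone.

```python
import pandas as pd

# Toy data with Country already set as the index (placeholder values).
df = pd.DataFrame(
    {"Country": ["Afghanistan", "Albania", "Algeria"], "Population": [10, 20, 30]}
).set_index("Country")

# Move Country back to a regular column; the index becomes 0, 1, 2.
df.reset_index(inplace=True)
columns_after_reset = list(df.columns)

# Without inplace=True the result is returned and df itself is unchanged.
df.set_index("Country")
index_name_unchanged = df.index.name  # still None

# With inplace=True the change is saved on df.
df.set_index("Country", inplace=True)
print(df)
```

Assigning the result back, as in `df = df.set_index("Country")`, has the same effect as `inplace=True`.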
875:37 really quickly I wanted to give a huge
875:39 shout out to the sponsor of this entire
875:40 pandas series and that is Udemy Udemy has
875:43 some of the best courses at the best
875:44 prices and it is no exception when it
875:46 comes to pandas courses if you want to
875:48 master pandas this is the course that I
875:50 would recommend it's going to teach you
875:51 just about everything you need to know
875:53 about pandas so huge shout out to
875:54 Udemy for sponsoring this pandas series and
875:56 let's get back to the video now what's
875:57 really great about this index is we're
875:59 able to search based off just this index
876:02 and so we can filter on it and basically
876:04 look through our data with it and there
876:06 are two different ways that you can do
876:07 that at least these are very common ways
876:09 that people who use pandas
876:11 search through that index the
876:13 first one is called loc and there's
876:15 loc and iloc which stand for location
876:17 or integer location let's look at loc
876:20 first let's say df.loc and then we'll do
876:24 a bracket now we're able to specify the
876:26 actual string the label so let's go
876:29 right up here and let's say
876:30 Albania so we'll say Albania so again
876:34 this is just looking at the location
876:36 let's run this now it's going to bring
876:38 up all the Albania data just like here
876:41 where it kind of looks like a column
876:43 in a column and we can get this exact
876:45 same data but using iloc right here
876:50 when we ran loc we were searching based
876:52 off Albania which is in position 1
876:56 so if we actually pull the one position
876:58 for that
877:00 integer with iloc we can look at the one
877:04 position and this should give us the
877:06 exact same data
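To make the loc versus iloc comparison concrete, here is a short sketch on a toy data frame (the rows and numbers are placeholders). `loc` looks a row up by its index label, while `iloc` looks it up by its integer position counting from zero.

```python
import pandas as pd

# Toy data; Albania is the second row, so its position is 1.
df = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Algeria"],
        "Continent": ["Asia", "Europe", "Africa"],
        "Population": [10, 20, 30],
    }
).set_index("Country")

by_label = df.loc["Albania"]  # label-based lookup
by_position = df.iloc[1]      # position-based lookup

print(by_label)
print(by_label.equals(by_position))
```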
877:09 now let's take a look at multi-indexing and we'll come back to a
877:11 little bit of this in a second so
877:14 multi-indexing is creating multiple indexes
877:17 we're not just going to create the
877:18 country as the index now we're going to
877:20 add an additional index on top of that
877:23 so let's pull up our data frame right
877:25 now we have the country but let's do
877:28 reset_index
877:29 and we'll say inplace equals True
877:34 let's run it so now we have our
877:38 data frame now let's set our index but
877:41 this time when we set our index we're
877:42 going to add the country as the index as
877:44 well as the continent as an index so
877:47 we'll say df.set_index then
877:51 we'll do a parenthesis and instead of
877:53 just doing country like we did before
877:56 we're going to create a list
877:59 we'll do it like that and then we'll
878:02 say
878:05 continent and separate it by a comma so
878:08 we have continent and country let's
878:10 just say inplace is equal to True now
878:15 when we run this we're going to have two
878:17 indexes and let's see what this looks
878:20 like let's run this so now we have
878:24 country as well as continent as our
878:27 index
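A minimal sketch of building the two-level index, assuming the columns are named Continent and Country (the rows are placeholders). Passing a list to `set_index` creates a MultiIndex with one level per column, in the order given.

```python
import pandas as pd

# Toy data in file order (placeholder values).
df = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
        "Continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
        "Population": [10, 20, 30, 40, 50],
    }
)

# A list of column names produces a MultiIndex: Continent first, then Country.
df.set_index(["Continent", "Country"], inplace=True)

print(df)
print(df.index.nlevels)
print(list(df.index.names))
```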
878:30 now you may notice that these indexes are repeating themselves on this
878:32 continent index we have Europe right
878:35 here and Europe right here as well as
878:37 Asia and Asia and it looks a little bit
878:40 funky but we are able to sort these
878:43 values and make them look a lot better
878:45 so let's go ahead and try this we'll do
878:47 df.sort_index and when we run this
878:52 it should sort our index alphabetically
878:55 and we can also look in here and see
878:57 what kind of things we can
878:59 specify we can specify the axis but it's
879:02 automatically going to be looking at
879:03 zero this is zero and this is one so we
879:05 have two axes within our data frame you
879:08 can choose the level whether it's
879:10 ascending or not ascending inplace kind
879:13 sort_remaining all of these
879:15 different things the only one that I
879:17 really think is worth looking
879:19 at is ascending we already know some
879:20 of these other ones but if we look at
879:23 ascending let's run it now it's sorted
879:26 these and so now it's kind of grouped
879:28 together so we have Africa and all the
879:30 African ones as well as South America
879:32 and all the South American ones
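Here is a short sketch of what sorting the two-level index does, using placeholder rows. `sort_index` sorts by the first level and then by the second, and it returns a new data frame unless `inplace=True` is passed.

```python
import pandas as pd

# Toy data in unsorted file order (placeholder values).
df = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
        "Continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
        "Population": [10, 20, 30, 40, 50],
    }
).set_index(["Continent", "Country"])

# Sorts alphabetically by Continent, then by Country within each continent.
sorted_df = df.sort_index()

print(sorted_df)
```

The original `df` keeps its file order here because the sorted result was assigned to a separate variable.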
879:35 let's really quickly say
879:38 pd.set_option
879:41 and we'll say
879:46 display.max.columns just like this let's
879:49 run it and I need to specify whoops
879:53 specify a value right here let's see how many
879:54 rows we
879:56 have 235 so let's do
879:59 235 let's run this and now when we run
880:03 this you can see that Africa is all
880:05 grouped together and all the countries
880:07 are in alphabetical order under it and
880:09 then we go all the way down to Asia and
880:12 again just all in alphabetical order
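A hedged sketch of the display setting. The spoken option name is display.max.columns, but the value used is the row count (235), so this example assumes the row limit is what was intended and uses the canonical option name `display.max_rows`.

```python
import pandas as pd

# Assumption: the goal is to show all 235 rows instead of a truncated view.
pd.set_option("display.max_rows", 235)

print(pd.get_option("display.max_rows"))
```

`pd.reset_option("display.max_rows")` restores the default afterwards.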
880:14 if we wanted to we could say
880:16 ascending equals
880:19 True and then when we run this oh I
880:22 meant to say False and then when we run
880:24 this it's the exact opposite so it
880:25 starts with South America the last one
880:28 and then goes in reverse alphabetical
880:29 order we could also say False make it a
880:33 list and do comma
880:35 True just like this and then it
880:38 would sort this first index level with False
880:40 and this next level with True so you can
880:42 really customize it but
880:45 for what we're doing we don't need any of
880:46 that we just need to be able to see this
880:48 right here
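The ascending options can be sketched like this on placeholder rows. A single `False` reverses every level, while a list gives one direction per index level.

```python
import pandas as pd

# Toy data with a Continent/Country MultiIndex (placeholder values).
df = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
        "Continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
        "Population": [10, 20, 30, 40, 50],
    }
).set_index(["Continent", "Country"])

# Reverse alphabetical order on both levels.
descending = df.sort_index(ascending=False)

# Continent descending, Country ascending within each continent.
mixed = df.sort_index(ascending=[False, True])

print(descending)
print(mixed)
```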
880:50 so now when we try to search by our index like we did before we did
880:53 df.loc now when we did that
880:56 and we said let's say Angola
880:59 when we specify Angola it's not going
881:01 to work properly because it's searching
881:04 in this first index for the first string
881:07 that we have we can search Africa
881:11 let's search for
881:13 Africa and now we have all of the
881:15 African countries and if we want to
881:18 narrow it down to Angola we can also go down
881:20 another level by adding
881:24 Angola and now we have what we were
881:26 looking at before where we're pulling
881:28 all the data for that row but we
881:30 couldn't do it just based off Africa
881:32 because we had an additional index right
881:34 here so once we called both indexes now
881:37 we get this view
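A small sketch of label lookups on the two-level index, with placeholder rows. A single label is matched against the first level (Continent), so a country name on its own raises a `KeyError`; a tuple with both labels reaches one row.

```python
import pandas as pd

# Toy data with a sorted Continent/Country MultiIndex (placeholder values).
df = (
    pd.DataFrame(
        {
            "Country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
            "Continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
            "Population": [10, 20, 30, 40, 50],
        }
    )
    .set_index(["Continent", "Country"])
    .sort_index()
)

# First level only: every row under Africa, indexed by Country.
africa = df.loc["Africa"]

# Both levels as a tuple: the single Angola row.
angola = df.loc[("Africa", "Angola")]

# A second-level label alone is not found in the first level.
try:
    df.loc["Angola"]
    angola_alone_fails = False
except KeyError:
    angola_alone_fails = True

print(africa)
print(angola)
```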
881:40 but let's look at that iloc really
881:42 quick when we run this let's just say
881:46 one because right up here we have
881:50 Angola zero and then one so you'd think it
881:54 may pull up Angola let's go ahead and
881:56 run this and it's still pulling up
881:59 Albania let's go right up here if you
882:02 remember when we didn't have the
882:04 multiple indexes it was pulling up
882:06 Albania the difference when you're doing
882:08 these multi-indexes is that loc
882:11 is able to use these index labels whereas iloc
882:15 does not go based off that
882:17 multi-indexing it's going to go based off the
882:19 row's integer position not its labels
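A sketch of why position 1 can still be Albania. It assumes, as one plausible reading of the steps above, that the sorted result was only displayed and not saved back to the data frame. `iloc` ignores index labels entirely and counts rows in whatever order the data frame currently holds; the rows below are placeholders.

```python
import pandas as pd

# Toy data in file order; Albania is the second row (placeholder values).
df = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
        "Continent": ["Asia", "Europe", "Africa", "Europe", "Africa"],
        "Population": [10, 20, 30, 40, 50],
    }
).set_index(["Continent", "Country"])

# Displaying a sorted copy does not change df itself.
sorted_df = df.sort_index()

unsorted_row = df.iloc[1]       # second row in file order
sorted_row = sorted_df.iloc[1]  # second row in sorted order

print(unsorted_row.name)
print(sorted_row.name)
```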
882:22 so that's a lot about indexing in pandas
882:24 we'll cover even a few more things in
882:26 future videos as we get more and more
882:28 into pandas but this is a lot of what
882:31 indexing looks like within pandas and
882:33 again super important to learn how to do
882:35 and know how to do because it's a pretty
882:36 important building block as we go
882:39 through this pandas series so I hope you
882:41 enjoyed this video on indexing if you
882:43 did be sure to like and subscribe below
882:45 and I will see you in the next video
882:48 [Music]
882:58 hello everybody today we're going
883:00 to be taking a look at the groupby
883:02 function and aggregating within pandas
883:05 groupby is going to group together the
883:07 values in a column and display them all
883:09 on the same row and this allows you to
883:11 perform aggregate functions on those
883:13 groupings so let's start reading in our
883:15 data and take a look so we're going to
883:17 do import pandas as
883:20 pd and then we're going to say our data
883:22 frame is equal to and we'll say pd.read_csv
883:27 we'll do an open parenthesis r and
883:31 our file path and we're going to be
883:32 looking at the flavors CSV right here so
883:36 right here we have our flavor of ice
883:38 cream we have our base flavor
883:40 whether it was vanilla or chocolate
883:42 whether I liked it or not the flavor
883:44 rating texture rating and its overall or
883:47 its total rating now these are all my
883:48 own personal scores so you know I've
883:51 spent years researching this so these
883:52 are all very accurate but this should be
883:54 a low stress environment to learn groupby
883:56 and the aggregate functions so the
883:59 first thing that we can do is look at
884:01 our groupby now you can't group by well
884:04 you can you can group by flavor but as
884:06 you can see these are all unique values
884:09 what we need is something that has
884:10 duplicate values or similar values on
884:13 different rows that'll group together so
884:16 this base flavor is actually a perfect
884:18 one to group on and we'll do that by
884:20 saying df.groupby do an open
884:24 parenthesis and we'll just specify base
884:28 flavor and this will then group together
884:30 those values and I need to make sure I
884:32 can spell properly this will group those
884:35 flavors together so let's run this and
884:38 as you can see it actually is its own
884:40 object so it's a DataFrameGroupBy
884:43 object so now that we've
884:45 grouped them let's give it a variable so
884:48 we'll say group_by_frame
884:52 let's say that's equal to let's copy
884:54 this we'll run it and now what we need
884:58 to do is run our aggregations in order
885:00 to get an output so we're going to say
885:04 mean and that's all we're going to put
885:06 just for now just to get an output that
885:08 we can take a look at and then we'll
885:10 build from there so let's go ahead and
885:12 run this and right here we have our base
885:16 flavor which is now the index
885:18 chocolate or vanilla and then it's
885:20 taking the mean or the average of all
885:23 the columns that have integers notice
885:25 that it did not take the liked column
885:27 and it did not take the flavor column
885:29 because those are strings and it
885:30 cannot average those and we'll take a
885:32 look at that later but it took all the
885:34 values that have integers and then it
885:36 gave us the average of those ratings
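A minimal sketch of the grouping and mean steps. The column names and every rating below are invented placeholders standing in for the flavors CSV. One assumption worth flagging: depending on the pandas version, calling `mean()` on a group that still contains string columns may raise an error instead of silently skipping them, so this sketch passes `numeric_only=True` to get the behavior described above.

```python
import pandas as pd

# Hypothetical stand-in for the ice cream data (all values are placeholders).
df = pd.DataFrame(
    {
        "Flavor": ["Chocolate", "Rocky Road", "Cake Batter", "Vanilla", "Mint Chocolate Chip"],
        "Base Flavor": ["Chocolate", "Chocolate", "Vanilla", "Vanilla", "Vanilla"],
        "Liked": ["Yes", "Yes", "No", "Yes", "Yes"],
        "Flavor Rating": [9, 7, 4, 6, 10],
        "Texture Rating": [8, 6, 5, 7, 9],
    }
)

# groupby on its own returns a DataFrameGroupBy object, not a table.
group_by_frame = df.groupby("Base Flavor")
print(type(group_by_frame).__name__)

# An aggregation turns the groups into one row per base flavor.
means = group_by_frame.mean(numeric_only=True)
print(means)
```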
885:57 so right off
885:59 the bat as averages with chocolate I
886:01 have a much higher rating overall than
886:03 the ones with vanilla bases now we can
886:06 actually combine all of this together
886:08 into one line and we can do something
886:09 like this so we'll
886:12 say df.groupby and we'll say mean
886:17 just like this and this will actually
886:19 run it before we didn't have any
886:21 aggregating function on there so it
886:23 just returned the groupby object but now that we combine it
886:24 all into one it will run properly now
886:27 there are a lot of different aggregate
886:29 functions but I'm going to show you some
886:31 of the most popular ones or the most
886:32 common ones that you will see so let's
886:35 copy this right here so we can do dot
886:39 count and when we run this we can look
886:41 at the count and this will show us the
886:43 actual count of the rows that were
886:45 aggregated so for chocolate we have
886:47 three so there are going to be threes all the
886:49 way across and for vanilla we had six so
886:51 we're looking at a higher count of
886:53 vanilla which if you're comparing it to
886:56 this mean up here that could be a big
886:58 skew towards the chocolate because if
887:00 you have one or two good chocolates it
887:02 could really pull the numbers up whereas
887:03 if you had two good vanillas but all the
887:06 other ones were bad it pulls that
887:07 average down so knowing the count of
887:09 something is really
887:11 good
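The one-line form and the count can be sketched as follows, on placeholder data with assumed column names. `count` reports how many non-missing values each group has in every column, which is useful context when comparing group averages.

```python
import pandas as pd

# Hypothetical stand-in for the ice cream data (all values are placeholders).
df = pd.DataFrame(
    {
        "Flavor": ["Chocolate", "Rocky Road", "Cake Batter", "Vanilla", "Mint Chocolate Chip"],
        "Base Flavor": ["Chocolate", "Chocolate", "Vanilla", "Vanilla", "Vanilla"],
        "Liked": ["Yes", "Yes", "No", "Yes", "Yes"],
        "Flavor Rating": [9, 7, 4, 6, 10],
        "Texture Rating": [8, 6, 5, 7, 9],
    }
)

# Grouping and aggregating chained into a single line.
means = df.groupby("Base Flavor").mean(numeric_only=True)

# Number of rows behind each group, repeated for every column.
counts = df.groupby("Base Flavor").count()

print(means)
print(counts)
```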
887:14 let's take a look at the next one and we can do min and max and I'll just
887:16 run these really quickly we can do min
887:19 and when we run this the first thing
887:20 that you should notice is that it now
887:22 has a flavor and a liked column and
887:25 that's because min and max will actually
887:27 look at the first letter in the string
887:28 or the first set of letters if there are
887:31 several that start the same like chocolate something it'll
887:33 compare letter by letter and then it'll
887:35 actually populate it so chocolate with
887:37 the ch chocolate is the very first or
887:40 the minimum value for that string and
887:43 cake batter is the minimum
887:46 value in vanilla as well now with the
887:48 liked it's interesting because
887:50 apparently I liked all the chocolate
887:51 ones I'm going to go take a look so
887:53 chocolate I liked chocolate I liked
887:55 chocolate I liked so there is no
887:57 no in this liked column for chocolate so yes was the only
887:59 option and now let's look at max
888:02 and it should do the exact
888:04 opposite which is going to take the
888:06 highest value even if it's a string so
888:08 Rocky Road the letter r comes later in
888:10 the alphabet so that's what it's looking
888:12 at and so does vanilla and then we have
888:15 yes as well and then of course right
888:17 here it's taking the max value so before
888:20 when we were looking at min I just
888:21 focused on those but it still does the
888:23 exact same thing to these integer
888:26 columns as well so for the max value for
888:30 vanilla it was mint chocolate chip that
888:31 was our base so I had a rating of 10 for
888:34 this vanilla row or grouping
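A short sketch of min and max on grouped data, using placeholder values and assumed column names. String columns are compared alphabetically and numeric columns numerically.

```python
import pandas as pd

# Hypothetical stand-in for the ice cream data (all values are placeholders).
df = pd.DataFrame(
    {
        "Flavor": ["Chocolate", "Rocky Road", "Cake Batter", "Vanilla", "Mint Chocolate Chip"],
        "Base Flavor": ["Chocolate", "Chocolate", "Vanilla", "Vanilla", "Vanilla"],
        "Liked": ["Yes", "Yes", "No", "Yes", "Yes"],
        "Flavor Rating": [9, 7, 4, 6, 10],
        "Texture Rating": [8, 6, 5, 7, 9],
    }
)

mins = df.groupby("Base Flavor").min()
maxes = df.groupby("Base Flavor").max()

print(mins)
print(maxes)
```

Each column is reduced independently, so one row of the result can combine values from different original rows. In this toy data the Vanilla maximum shows the flavor name Vanilla next to a flavor rating of 10 that actually belongs to Mint Chocolate Chip.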
888:37 and then we can also look at the sum
888:40 and there are all the sums for these and
888:43 again it only does integers because we
888:44 can't add the strings here are the sum
888:47 or the total values for all of them and
888:49 for the total values since we had
888:51 six rows that were grouped into
888:53 this vanilla we now have a much
888:56 higher score for vanilla
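A sketch of the grouped sum on placeholder data. As an assumption about version differences, some pandas versions will also try to add up string columns, so `numeric_only=True` is passed here to keep only the rating columns as described above.

```python
import pandas as pd

# Hypothetical stand-in for the ice cream data (all values are placeholders).
df = pd.DataFrame(
    {
        "Flavor": ["Chocolate", "Rocky Road", "Cake Batter", "Vanilla", "Mint Chocolate Chip"],
        "Base Flavor": ["Chocolate", "Chocolate", "Vanilla", "Vanilla", "Vanilla"],
        "Liked": ["Yes", "Yes", "No", "Yes", "Yes"],
        "Flavor Rating": [9, 7, 4, 6, 10],
        "Texture Rating": [8, 6, 5, 7, 9],
    }
)

# Totals per base flavor; groups with more rows naturally sum higher.
sums = df.groupby("Base Flavor").sum(numeric_only=True)

print(sums)
```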
888:58 now that's a really simple way to do your
888:59 aggregations but there is actually an
889:01 aggregation function and let's take a
889:04 look at this because this is a little bit
889:06 more complex although when I write it
889:08 out or show you I hope it makes a lot of
889:10 sense we can do .agg so this is our
889:14 aggregate function and what we need to
889:16 pass into our aggregate function is
889:18 actually a dictionary so let's do an
889:20 open parenthesis and we're going to do a
889:22 squiggly bracket and then we need to
889:25 specify what we're going to be
889:26 aggregating on or what column so let's
889:28 do this flavor rating let's copy this
889:32 we'll do flavor rating and I need to put
889:34 that as a
889:35 string and then we'll do a colon and now
889:38 we can specify what aggregate
889:40 functions we want so we've done sum
889:42 count mean min and max all of those and
889:45 we can actually put all of those into
889:47 here and perform all of those
889:49 aggregations on just one column so let's
889:52 make a list and then let's say
889:55 mean
889:57 max count and what's another one sum
890:02 so let's do all four of those only on
890:05 this flavor rating
890:07 column and when we run this we have our
890:10 base flavor right here chocolate and
890:12 vanilla but now we don't have multiple
890:14 columns we have one column with multiple
890:17 columns of our aggregations and it is
890:20 possible to pass in multiple columns
890:23 like that so we'll do texture
890:26 rating and we'll just come right here
890:28 and do a comma then we'll say
890:31 texture
890:32 rating and then a colon I don't know why
890:37 I spelled it out when I copied it but I
890:39 did and then we'll do the exact same
890:41 ones and now when we run it we're
890:43 getting the exact same columns mean max
890:46 count and sum for flavor rating then
890:48 mean max count and sum for our texture
890:50 rating
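The dictionary form of the aggregate function walked through above can be sketched like this. The frame is a hypothetical stand-in (column names and ratings are assumptions). Each dictionary key is a column name and each value is the list of aggregations to run on that column.

```python
import pandas as pd

# Hypothetical stand-in for the ice cream data; names and ratings are made up.
df = pd.DataFrame({
    "Base Flavor": ["Vanilla", "Vanilla", "Chocolate", "Vanilla", "Chocolate"],
    "Flavor Rating": [10, 2, 9, 7, 8],
    "Texture Rating": [8, 3, 8, 8, 7],
})

funcs = ["mean", "max", "count", "sum"]

# One column, four aggregations
one_col = df.groupby("Base Flavor").agg({"Flavor Rating": funcs})

# Two columns, the same four aggregations on each
two_cols = df.groupby("Base Flavor").agg({
    "Flavor Rating": funcs,
    "Texture Rating": funcs,
})

print(one_col)
print(two_cols)
```

The result has two header levels: the column name on top and the aggregation name underneath, which is the "one column with multiple columns" layout described above.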
890:53 now so far we've only grouped on one column but we can actually group on
890:56 multiple columns let's go back up here
890:58 to our data and I should have just
891:00 copied this down here let's go back down
891:03 and just look at this so really we only
891:06 grouped it on this base flavor but you
891:08 can do multiple groupings or group by
891:11 multiple columns so let's do our base
891:13 flavor which we did already as well as
891:16 the liked column so we're going to say
891:18 df.groupby then we'll do an open
891:22 parenthesis and then instead of just
891:24 passing through one string we're going
891:27 to do a list and we'll say base
891:31 flavor oops comma and then we'll do
891:35 liked so now when it groups this it
891:38 should put two groupings and let's
891:41 run this and just see oops I've got to say
891:43 let's just do
891:46 mean so now we have our chocolate and
891:49 vanilla and remember chocolate only had
891:52 yes so that's the only one that it's
891:54 going to group on but vanilla had a no
891:57 and a yes so if we look at the vanilla
891:59 we have our base flavor vanilla and then
892:01 within liked we have a no and a yes which
892:05 can show us that within our vanilla when
892:07 we group on these our nos were really
892:09 low
892:10 but our yeses were really high we
892:11 actually had a pretty similar rating or
892:13 very close to the same rating as the
892:15 ones we really liked in chocolate and
892:17 just like we did above we can take this
892:20 .agg and I'm going to copy this and
892:23 it'll perform it on each of those rows
892:26 let me close that and what did I do
892:28 wrong oh I need the squiggly
892:31 bracket and it'll show us each of those
892:34 so the mean max count and sum for all of
892:37 the chocolate and vanilla as well as the
892:39 groupings of liked yes and no
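Grouping on two columns, as just described, might look like the sketch below. The frame is again a hypothetical stand-in with assumed column names and made-up ratings, set up so that chocolate only has "Yes" while vanilla has both "Yes" and "No".

```python
import pandas as pd

# Hypothetical stand-in for the ice cream data; names and ratings are made up.
df = pd.DataFrame({
    "Base Flavor": ["Vanilla", "Vanilla", "Chocolate", "Vanilla", "Chocolate"],
    "Liked": ["Yes", "No", "Yes", "Yes", "Yes"],
    "Flavor Rating": [10, 2, 9, 7, 8],
    "Texture Rating": [8, 3, 8, 8, 7],
})

# Pass a list instead of a single string to group on both columns
by_two = df.groupby(["Base Flavor", "Liked"])

means = by_two.mean(numeric_only=True)

funcs = ["mean", "max", "count", "sum"]
stats = by_two.agg({"Flavor Rating": funcs, "Texture Rating": funcs})

print(means)
print(stats)
```

Only combinations that actually occur in the data show up, so there is no chocolate and "No" row.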
892:42 now after we've looked at all that and that's how
892:44 I usually do it there is one shortcut
892:47 function that can give you some of these
892:48 things just really quickly and so let's
892:51 go back up here and take this it's just
892:54 called describe and if you've ever
892:56 done it it's just going to give you some
892:58 high level overview of some of those
893:00 different aggregations so let's run this
893:03 and it's going to give us our chocolate
893:04 and vanilla and within each column it's
893:07 going to give us our count our mean our
893:09 standard deviation I believe is what
893:11 that is our minimum 25% 50% 75% and 100%
893:15 which is our max then our count and our
893:17 means so a lot of those aggregate
893:19 functions but the describe is you know a
893:22 very generalized function we can't
893:25 get as specific as we were with the
893:27 previous ones that we were looking at
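A minimal sketch of the describe shortcut on a grouped frame, using the same kind of hypothetical stand-in data (column names and ratings are assumptions). The two rating columns are selected explicitly so only numeric data is summarized.

```python
import pandas as pd

# Hypothetical stand-in for the ice cream data; names and ratings are made up.
df = pd.DataFrame({
    "Base Flavor": ["Vanilla", "Vanilla", "Chocolate", "Vanilla", "Chocolate"],
    "Flavor Rating": [10, 2, 9, 7, 8],
    "Texture Rating": [8, 3, 8, 8, 7],
})

# describe() returns count, mean, std, min, the quartiles and max per group
summary = df.groupby("Base Flavor")[["Flavor Rating", "Texture Rating"]].describe()

print(summary)
```

The last statistic is labeled `max` rather than 100%, and there is no way to pick individual statistics per column here, which is why the dictionary form of `.agg` is the more specific tool.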
893:29 but I just wanted to throw this out
893:30 there in case this is something that
893:31 you'd be interested in because it you
893:32 know technically is showing a lot of
893:35 those aggregate functions just you know
893:37 all at one time so that is our group by
893:39 and aggregate functions within pandas I
893:41 hope that that was helpful I hope that
893:42 you understood you know everything that
893:44 we were working on if you like this
893:45 video be sure to like and subscribe and
893:47 check out all my other videos on python
893:49 as well as pandas and I will see you in
893:51 the next
893:53 [Music]
894:03 video hello everybody today we're going
894:06 to be talking about merging joining and
894:07 concatenating data frames in pandas this
894:10 whole video is basically around being
894:11 able to combine two separate data frames
894:14 together into one data frame these are
894:16 really important to understand when
894:17 we're actually using the merge and the
894:19 join right here we have what's called an
894:21 inner join and the shaded part is what's
894:24 going to be returned it's only the
894:25 things that are in both the left and the
894:28 right data frames then we have an outer
894:31 join or a full outer join and this will
894:33 take all the data from the left data
894:35 frame and the right data frame and
894:37 everything that is similar so basically
894:39 it just takes everything we also have a
894:41 left join which is going to take
894:43 everything from the left and then if
894:44 there's anything that's similar it'll
894:46 also include that and then the exact
894:48 opposite of that is the right join which
894:50 is going to give us everything from the
894:52 right data frame and it's going to give
894:54 us everything that is similar but it's
894:55 not going to give us anything that is
894:57 just unique to the left data frame so
894:59 this is just for reference because in a
895:01 little bit when we start merging
895:03 these become very important so I just
895:04 wanted to kind of show you how that
895:06 works visually so let's get started by
895:08 pulling in our files so first we're
895:10 going to say import pandas as pd we'll
895:13 run this and then we'll say data frame
895:16 one and we'll also have a data frame two
895:18 and these are the different data frames
895:20 the left and the right data frame that
895:22 we'll be using to join merge and
895:24 concatenate so we'll say data frame 1 is
895:26 equal to pd.csv_read and we'll do r and
895:33 here is our file path so we have this
895:35 LOTR.csv that's our Lord of the Rings CSV
895:39 and let's call that really quickly so we
895:41 can see what's in there and I'm having a
895:43 dyslexic moment because it's supposed
895:45 to be read_csv I apologize for that
895:49 but this is our data frame this is our
895:51 data frame one we have three columns
895:53 it's their fellowship ID 1001 1002 1003 and
895:56 1004 their first name Frodo Samwise
895:58 Gandalf and Pippin and their skills hiding
896:00 gardening spells and fireworks so
896:02 this is our very first data frame that
896:04 we're going to be working with let's go
896:06 down a little bit let's pull this down
896:09 here and we're just going to say data
896:11 frame 2 and this is the
896:14 Lord of the Rings 2 so let's pull this
896:17 one in now as you can see it's very
896:19 similar we have fellowship ID 1001 1002 1006 1007 1008
896:23 so we have three different IDs here we
896:26 don't have 1006 1007 and 1008 in this
896:28 upper this first data frame we also have
896:31 the first name so Frodo and Sam or
896:33 Samwise are in the very first and the
896:35 second data frame but now we have three
896:37 new people Boromir Elrond and Legolas and
896:40 now we have this age column which again
896:43 is unique to just this second data frame
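If you don't have the CSV files handy, the two data frames described above can be approximated inline. This is only a sketch: the column names (`FellowshipID`, `FirstName`, `Skills`, `Age`), the row order in the second frame and the ages are assumptions for illustration, not values read from the actual files.

```python
import pandas as pd

# Left data frame: four members with a skill each (column names assumed)
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})

# Right data frame: two shared members, three new ones (ages are illustrative)
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# IDs present in both frames; these are the rows an inner join can match
shared_ids = set(df1["FellowshipID"]) & set(df2["FellowshipID"])
print(shared_ids)
```

Only 1001 and 1002 appear in both frames, `Skills` exists only on the left and `Age` only on the right.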
897:17 now the first one that I want to look at is
897:19 merge and I want to look at merge first
897:21 because I think this one is the most
897:22 important I use this one more than any
897:24 of the ones that we're going to talk about
897:25 today the merge is just like the joins
897:28 that we were just looking at the outer
897:31 the inner the left and the right and
897:32 there's also one called cross and I'll
897:34 show you that one although if I'm being
897:36 honest I don't really use that one that
897:38 much but it's worth showing just in case
897:40 you come into a scenario where you do
897:42 want to do that so let's go right down
897:44 here and I want to be able to see these
897:46 while we do it so we're going to say
897:48 data frame one and when we specify data
897:51 frame one as the very first data frame
897:54 we say df1.merge this is
897:57 automatically going to be our left data
897:59 frame then if we do our parentheses
898:02 right here and we say data frame 2 this
898:05 is our right data frame and let's see
898:07 what happens when we do this
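The bare merge call just described can be sketched as follows. The frames are inline stand-ins with assumed column names and illustrative ages, not the actual CSV contents.

```python
import pandas as pd

# Stand-in frames; column names and ages are assumptions for illustration.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# df1 is the left frame, df2 is the right frame.
# With no other arguments, how defaults to "inner" and the join keys
# are every column the two frames have in common.
merged = df1.merge(df2)
print(merged)
```

With this stand-in data only the Frodo and Samwise rows survive, carrying both `Skills` and `Age`.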
898:09 so what it's going to do and
898:11 we didn't specify this it's just a default
898:14 it's going to do an inner join so it's
898:16 only going to give us an output where
898:18 specific values or the keys are the same
898:20 now you can't see this but what is
898:22 happening is it's taking this
898:23 fellowship ID and saying I have 1001 here
898:27 and 1002 here this is the exact same as up
898:31 here with this fellowship ID and
898:33 fellowship ID of 1001 and 1002 but when we
898:36 look at 1003 and 1004 those aren't in this
898:38 right data frame and 1006 1007 and 1008 are not in
898:41 this left data frame so the only ones
898:44 that match are this 1001 and 1002 and
898:47 that's why they get pulled in down here
898:48 but because we didn't explicitly say
898:52 here's what I want to join or merge
898:54 between these two data frames it
898:56 actually is looking at the fellowship ID
898:58 and the first name so it's taking in
899:00 these unique values of Frodo and
899:02 Samwise which are the same in both which is
899:05 why it pulled them over but really quickly
899:08 let's just check and make sure that we
899:09 did it on the inner join because again
899:13 we didn't specify anything that was just
899:14 the default so we're going to say how is
899:17 equal to and then we'll say inner and if
899:20 we run this it's going to be the exact
899:22 same because again the inner is the
899:24 default but now just to show you how
899:27 it's kind of joining these two data
899:29 frames together I'm going to say on is
899:32 equal to and then I'm only going to put
899:35 fellowship ID so let's run this now the
899:38 first thing that you may have
899:39 noticed is this first name _x and
899:41 this first name _y what the merge
899:44 does as kind of a default is when you're
899:47 only joining on a fellowship ID we
899:49 have this right data frame with
899:50 fellowship ID the left data frame with
899:53 the fellowship ID if you're just joining
899:55 on these and you're not joining on the
899:57 first name and the first name then it's
899:59 going to separate those into an
900:01 underscore x and an underscore y and
900:04 even though they have the exact same
900:05 values since we are not merging on that
900:08 column it automatically separates that
900:10 into two separate columns so we can see
900:12 the values within each of those columns
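Here is a small sketch of merging on the ID column only, which is what produces the `_x` and `_y` columns described above. The frames are stand-ins with assumed column names and illustrative ages.

```python
import pandas as pd

# Stand-in frames; column names and ages are assumptions for illustration.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# FirstName exists in both frames but is not a join key here,
# so pandas keeps both copies and suffixes them with _x (left) and _y (right)
by_id = df1.merge(df2, how="inner", on="FellowshipID")
print(by_id)
```

The two name columns hold the same values for the matched rows; they are just kept apart because the merge was not told to treat them as a key.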
900:14 if we went into this on and we make a
900:17 list and let's do it like that and we
900:20 say comma and then we write first name
900:24 oops first name and then we run this
900:28 it's going to look exactly like it did
900:30 before again it automatically pulled in
900:33 both of these columns when it was
900:34 merging the first time even though we
900:36 didn't write anything but if we actually
900:38 write this it's doing exactly what
900:40 it was doing when we just had df2 we're
900:42 just now writing it out
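To check that writing the keys out really matches the default, a quick comparison like the one below can help. It is a sketch on stand-in frames (assumed column names, illustrative ages).

```python
import pandas as pd

# Stand-in frames; column names and ages are assumptions for illustration.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Keys written out as a list
explicit = df1.merge(df2, how="inner", on=["FellowshipID", "FirstName"])

# Keys left to the default (all shared columns)
implicit = df1.merge(df2)

same = explicit.equals(implicit)
print(same)
```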
900:44 now there are other arguments that we can pass into
900:46 this merge function let's hit Shift+Tab
900:49 and let's scroll down here so within
900:51 this merge function we have a lot of
900:53 different arguments that you can pass
900:54 into it first we have this right which
900:56 is the right data frame which is this
900:58 data frame two then we have the how and
901:00 the on which we've already shown how to
901:02 do there's a left_on right_on left_index
901:06 right_index not something you'll
901:08 probably use that much but you
901:10 definitely can if you want to look into
901:11 that and there's all these docstrings
901:13 which show you exactly how to use all of
901:14 these so if you're interested in looking
901:17 at the left and the right and the left
901:18 index it's all in here the one that is
901:21 really good is the sort and you can sort
901:23 it saying either it's false or true then
901:26 we have these suffixes now if you
901:28 remember when we took these out what it
901:30 automatically did was it put in these
901:33 underscore x and underscore y you can
901:36 customize that and you can put in
901:38 whatever you'd like instead of the
901:40 underscore x and underscore y you can put in
901:42 some custom string for that we also
901:45 have an indicator and a validate again
901:47 all things you can go in here and look
901:49 at I'm just going to show you the stuff
901:50 that I use the most so these things
901:53 right here are things that I definitely
901:55 use the most
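Two of the arguments mentioned above, `suffixes` and `indicator`, can be tried out with a sketch like this one. The frames are stand-ins (assumed column names, illustrative ages) and the suffix strings are arbitrary choices. An outer join is used so the indicator column shows all three of its possible values.

```python
import pandas as pd

# Stand-in frames; column names and ages are assumptions for illustration.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

labeled = df1.merge(
    df2,
    how="outer",
    on="FellowshipID",
    suffixes=("_left", "_right"),  # replaces the default _x and _y
    indicator=True,                # adds a _merge column showing each row's origin
)
print(labeled)
```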
901:56 so now that we've looked at the inner join let's copy this right
901:58 down here and let's look at the outer
902:01 join and these get a little bit more
902:03 tricky I think the inner join is
902:05 probably the easiest one to understand
902:08 we'll look at the outer it's spelled o u t
902:11 e r I don't know why I always want to
902:13 say o t t r but let's run this and see
902:16 what we get so now this looks quite
902:19 different the inner join only gave us
902:22 the values that are the exact same this
902:25 one is going to give us all of the
902:27 values regardless of if they are the
902:29 same so we have 1001 1002 1003 1004 1006 1007 and
902:33 1008 so let's scroll back up here so we
902:37 have 1001 1002 1003 1004 then 1001 1002 and 1006 1007 and 1008 so we
902:40 don't have a 1005 and then if you notice
902:43 in this data frame right here if the
902:46 value doesn't have so if we can't join
902:48 on the fellowship ID or the first name
902:51 like Legolas wasn't one that we joined
902:52 on or that has a similar value in the
902:55 left data frame it just gives us a NaN
902:58 which is not a number and it's going to
903:00 do that for any value where it couldn't
903:02 find that join or it couldn't match
903:04 something within that either ID or first
903:06 name so in age we also have that for the
903:09 ones that weren't in the right data
903:11 frame we only had 1001 and 1002 so we'll
903:15 have the age for both Frodo and Sam but
903:17 for Gandalf and Pippin we don't have
903:20 their corresponding IDs and so it's just
903:23 going to be blank for Gandalf and Pippin
903:25 and you can see that right here so again
903:28 outer joins are kind of the opposite of
903:30 inner joins they're going to return
903:32 everything from both if there is
903:34 overlapping data it won't be duplicated
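The outer join behavior described above can be sketched on stand-in frames (assumed column names, illustrative ages). Rows that exist on only one side are kept, and the columns they can't fill come back as NaN.

```python
import pandas as pd

# Stand-in frames; column names and ages are assumptions for illustration.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

outer = df1.merge(df2, how="outer")
print(outer)

# Rows only on the right have no Skills; rows only on the left have no Age
right_only_skills = outer.loc[outer["FirstName"] == "Legolas", "Skills"]
left_only_age = outer.loc[outer["FirstName"].isin(["Gandalf", "Pippin"]), "Age"]
```

Frodo and Samwise appear once each, not twice, because their rows matched on both keys.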
903:36 now let's go on to the left join and I'm
903:39 going to pull this down right here and
903:41 now we're just going to say how is equal
903:43 to left and let's run this so what this
903:48 is going to do is it's going to take
903:50 everything from the left table or the
903:52 left data frame right here so everything
903:54 from data frame one then if there is any
903:57 overlap it'll also pull the overlapped
903:59 or the you know whatever we're able to
904:01 merge on from data frame 2 so let's go
904:04 back up to our data frame 1 and 2 so
904:06 it's going to pull everything from this
904:08 left data frame because we're specifying
904:10 we're doing a left join so everything
904:12 from the left data frame will be in
904:14 there we're also going to try to bring
904:16 in everything from the right but only if
904:19 it matches or is able to merge so
904:21 just this information right here will
904:23 come over we weren't able to join on
904:26 1006 1007 or 1008 so really none of that
904:30 information is going to come over so
904:32 let's go down and check on this so again
904:35 we have 1001 1002 1003 1004 all of the data with
904:38 this first name and skills everything is
904:40 in here but then we are trying to bring
904:43 over the age but we only have matches
904:45 with 1001 and 1002 so only these two
904:48 values will come in
904:50 values will come in let's look at the right join because it's basically the
904:52 right join because it's basically the exact opposite let's look at the
904:56 exact opposite let's look at the right and this is basically the exact
904:58 right and this is basically the exact opposite of the left in the fact that
905:00 opposite of the left in the fact that now we're only looking at the right hand
905:02 now we're only looking at the right hand and then if there's something that
905:04 and then if there's something that matches in data frame one then we will
905:06 matches in data frame one then we will pull that in so this this is basically
905:09 pull that in so this this is basically just looking like data Frame 2 except
905:11 just looking like data Frame 2 except we're pulling in that skills column and
905:13 we're pulling in that skills column and since only 101 and 102 are the same
905:17 since only 101 and 102 are the same that's why the skills values are here
905:20 that's why the skills values are here now those are the main types of merges
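As a rough sketch of the left and right merges described here, the snippet below rebuilds two small data frames. The column names (`FellowshipID`, `FirstName`, `Skills`, `Age`) and the skill and age values are assumptions made for illustration; only the IDs and first names come from the walkthrough. It also assumes the merge uses its default of matching on every column the two frames share.

```python
import pandas as pd

# Hypothetical stand-ins for data frame 1 and data frame 2.
# Column names, skills and ages are assumed, not taken from the video.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Left merge: keep every row of df1, pull Age only where a match exists.
left_join = df1.merge(df2, how="left")

# Right merge: keep every row of df2, pull Skills only where a match exists.
right_join = df1.merge(df2, how="right")

print(left_join)
print(right_join)
```

Rows without a match on the other side are kept, and the columns that could not be filled show up as NaN.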
905:22 now those are the main types of merges that I will use when I'm using a data
905:24 that I will use when I'm using a data frame or when I'm trying to merge a data
905:26 frame or when I'm trying to merge a data frame but there also is one called a
905:28 cross or a cross join uh and let's look
905:31 at this one and this one is quite a bit
905:34 different here we go let's run this so
905:37 this one is different in that it takes
905:39 each value from the left data frame and
905:41 pairs it with each value in the right
905:44 data frame so for Frodo in this left
905:47 data frame it looks at the Frodo in the
905:49 right data frame Samwise in the right
905:51 data frame Legolas Elrond and Boromir all
905:54 on the right data frame then it goes to
905:56 the next value Samwise and does the
905:58 exact same thing Frodo Samwise Legolas
906:01 Elrond Boromir and it does that for every
906:03 single value so let's go right back up
906:06 here so it's taking this 1001
906:10 it's pairing it with 1 2 3 4 5 then it's
906:13 taking Samwise it's pairing it with 1 2
906:16 3 4 5 Gandalf 1 2 3 4 5 Pippin and then
906:19 you kind of see that pattern and that's
906:21 what a cross join is um there are very
906:23 few in my opinion reasons for a cross
906:26 join although if you ever do like
906:28 an interview where you're being
906:29 interviewed on Python you will sometimes
906:32 be asked about cross joins but there aren't
906:34 a lot of instances in actual work where
906:37 you really need a cross join
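To make the pattern concrete, here is a minimal sketch of a cross merge. The frames are hypothetical reconstructions (column names, skills and ages are assumed), and `how="cross"` needs pandas 1.2 or newer. No matching is done at all: every left row is paired with every right row, so 4 rows times 5 rows gives 20 rows.

```python
import pandas as pd

# Hypothetical stand-ins for data frame 1 and data frame 2.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Cross merge: the Cartesian product of the two frames.
# Shared column names get the default _x / _y suffixes.
cross_join = df1.merge(df2, how="cross")

print(cross_join.shape)
print(cross_join.head(6))
```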
906:43 let's take a look at joins and joins are pretty similar to the merge function and
906:46 pretty similar to the merge function and it can do a lot of the same thing except
906:48 it can do a lot of the same thing except in my opinion the join function isn't as
906:51 in my opinion the join function isn't as easily understood as the merge function
906:53 easily understood as the merge function it's a little bit more complicated um
906:56 it's a little bit more complicated um but let's take a look and see how we can
906:58 but let's take a look and see how we can join together these data frames using
907:00 join together these data frames using the join function so let's go right up
907:01 the join function so let's go right up here we're going to say data frame one
907:04 here we're going to say data frame one do join and then we'll do data frame two
907:07 do join and then we'll do data frame two very similar to how we did it before and
907:10 very similar to how we did it before and let's try running this and it's not
907:12 let's try running this and it's not going to work um when we did the merge
907:14 going to work um when we did the merge function it had a lot of defaults for us
907:17 function it had a lot of defaults for us let's go down and see what this error is
907:19 let's go down and see what this error is it says the columns overlap but no
907:21 it says the columns overlap but no suffix was specified so it's telling us
907:24 suffix was specified so it's telling us that it's trying to use the fellowship
907:25 ID and the first name just like the merge
907:27 did except it's not able to distinguish
907:33 which is which and so we need to go in there and kind of help it out a little
907:34 there and kind of help it out a little bit again a little bit more Hands-On
907:37 bit again a little bit more Hands-On than the merge but let's see what we can
907:39 than the merge but let's see what we can do to make this work let's do comma and
907:42 do to make this work let's do comma and we'll say on and let's really quickly
907:45 we'll say on and let's really quickly let's open this up and kind of see what
907:47 let's open this up and kind of see what we have so this one has less options
907:49 we have so this one has less options than the merge does we have other and
907:51 than the merge does we have other and that's our other data frame we can do on
907:54 that's our other data frame we can do on and we're going to specify you know what
907:56 and we're going to specify you know what column do we want to join on and then we
907:57 column do we want to join on and then we can look at how do we want it to be a
907:59 can look at how do we want it to be a left an inner an outer the same kind of
908:02 left an inner an outer the same kind of types of joins as the merge then we have
908:04 that lsuffix and rsuffix and that
908:07 right here is kind of part of the issue
908:09 that we were just facing is that those
908:11 column names are the same but if we set
908:13 lsuffix it'll add an underscore or
908:16 whatever string we want to specify
908:18 to columns that are in both the left
908:20 and the right so we can give each a unique
908:22 name and we'll no longer have that issue
908:26 and then we can also sort it like we did on the other one but anyways let's go
908:28 on the other one but anyways let's go back to our on we'll say on is equal to
908:31 back to our on we'll say on is equal to and then we'll say
908:33 and then we'll say Fellowship ID let's try running this and
908:37 Fellowship ID let's try running this and we're still getting an error it's just
908:39 we're still getting an error it's just not as simple as the merge so let's keep
908:41 not as simple as the merge so let's keep going so now let's specify the type so
908:43 going so now let's specify the type so we'll say how is equal to and we'll do
908:46 we'll say how is equal to and we'll do an
908:46 an outer and if we run this it still
908:49 outer and if we run this it still doesn't work we're still getting the
908:50 doesn't work we're still getting the exact same issue as the left suffix and
908:52 exact same issue as the left suffix and the right suffix so now let's finally
908:54 the right suffix so now let's finally resolve it I just wanted to show you how
908:56 resolve it I just wanted to show you how a little bit more frustrating it was but
908:58 now let's say lsuffix is equal to
909:03 and when we did the merge it automatically
909:05 added an underscore x but we can do
909:07 let's do
909:08 underscore
909:10 left and then we can do a comma we'll do
909:13 rsuffix
909:14 and we'll say that's equal to and we'll
909:17 do underscore right now when we run this
909:21 it should work properly let's run this
909:24 so this is our output and obviously it
909:26 looks quite a bit different over here we
909:29 have this fellowship ID then we also
909:31 have fellowship ID left first name left
909:34 fellowship ID right and first name right
909:37 so it just doesn't look right
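The sequence above can be sketched roughly as follows. The frames are hypothetical reconstructions (column names, skills and ages are assumed). The first call fails because both frames have columns with the same names; the second call runs once suffixes are supplied, but the rows do not line up, because `on="FellowshipID"` matches the left column against the right frame's default 0 to 4 index rather than against its `FellowshipID` column.

```python
import pandas as pd

# Hypothetical stand-ins for data frame 1 and data frame 2.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Without suffixes, join refuses to guess how to name the shared columns.
error_message = None
try:
    df1.join(df2)
except ValueError as err:
    error_message = str(err)
print(error_message)

# With suffixes the call runs, but the result is not what we want:
# the left FellowshipID values are compared with the right index (0..4).
joined = df1.join(
    df2,
    on="FellowshipID",
    how="outer",
    lsuffix="_left",
    rsuffix="_right",
)
print(joined)
```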
909:40 now something I didn't specify when I first started this cuz I kind of wanted
909:41 first started this cuz I kind of wanted to show you is that the join usually is
909:45 to show you is that the join usually is better for when you're working with
909:46 better for when you're working with indexes before when we were using the
909:49 indexes before when we were using the merge we were using the column names and
909:52 merge we were using the column names and that worked really well and it was
909:53 that worked really well and it was pretty easy to do but as you can see
909:55 pretty easy to do but as you can see right here when we're trying to use
909:56 right here when we're trying to use these column names it's not working
909:58 these column names it's not working exceptionally well let's go ahead and
910:00 exceptionally well let's go ahead and create our index and then I can show you
910:02 create our index and then I can show you how this actually works and how it works
910:04 how this actually works and how it works a little bit better when we're working
910:05 a little bit better when we're working with just the index although you can get
910:08 with just the index although you can get to work just the same as the merge it's
910:10 to work just the same as the merge it's just a lot more work so let's go right
910:12 down here and let's go and say df4 so
910:15 we'll create a new data frame we'll say
910:18 df1.set_index and we'll do an open
910:23 parenthesis and we'll say we want to
910:25 set this index on the
910:28 fellowship ID and then we're going to do
910:31 the join so now we're going to say join
910:33 so we're setting an index so we're
910:35 setting that index on the fellowship ID
910:37 now we're going to join it on
910:40 df2.set_index and then we're also going
910:44 to do that on the fellowship ID and I'll
910:47 just copy
910:55 this oh geez I hate it when I do that
910:58 okay now we also want to specify the
910:59 left and the right suffix so I'll just copy
911:02 this as we do need to specify this now
911:05 let's try running data frame 4 so
911:05 really quick just to recap we
911:08 were setting the indexes we were doing
911:10 the same thing above right we have this
911:12 join we were joining data frame 1 with
911:15 data frame 2 now we're joining data
911:17 frame 1 with data frame 2 except in
911:20 both instances we're setting the index
911:22 as fellowship ID so we're joining now on
911:25 that index so now let's run this and
911:27 this should look a lot more similar to
911:29 the merge than the join that we did
911:31 above except now the fellowship ID right
911:34 here is actually an index so it's just a
911:37 little bit different but we can still go
911:39 in here and do how is equal to
911:43 outer oops let's say outer so we can
911:46 still specify our different types of
911:48 joins or the different ways that we can
911:49 merge or join these data frames together
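Here is a minimal sketch of the index-based join, again on hypothetical reconstructions of the two frames (column names, skills and ages are assumed). Once both frames use `FellowshipID` as their index, `join` lines the rows up by ID. The suffixes are still needed because `FirstName` exists on both sides.

```python
import pandas as pd

# Hypothetical stand-ins for data frame 1 and data frame 2.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Default join type is a left join on the index.
df4 = df1.set_index("FellowshipID").join(
    df2.set_index("FellowshipID"),
    lsuffix="_left",
    rsuffix="_right",
)

# The same call with how="outer" keeps IDs from both frames.
df4_outer = df1.set_index("FellowshipID").join(
    df2.set_index("FellowshipID"),
    lsuffix="_left",
    rsuffix="_right",
    how="outer",
)

print(df4)
print(df4_outer)
```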
911:54 we can still specify that again it's just a little bit different and that's
911:56 just a little bit different and that's why for most instances I'm using that
911:58 why for most instances I'm using that merge function because it's just a
911:59 merge function because it's just a little bit more seamless little bit more
912:01 little bit more seamless little bit more intuitive the join function can still
912:03 intuitive the join function can still get the job done but as you can see it
912:05 get the job done but as you can see it takes a little bit more work now let's
912:07 takes a little bit more work now let's look at concatenate concatenating data
912:09 look at concatenate concatenating data frames can be really useful and the
912:11 frames can be really useful and the distinction between a merge and join
912:13 distinction between a merge and join versus the concatenate is that the
912:15 versus the concatenate is that the concatenate is kind of like putting one
912:17 concatenate is kind of like putting one data frame on top of the other rather
912:19 data frame on top of the other rather than putting one data frame next to one
912:21 than putting one data frame next to one another which is like the merge and the
912:23 another which is like the merge and the join so concatenating them is just a
912:25 join so concatenating them is just a little bit different in how it'll
912:26 little bit different in how it'll operate but let's actually write this
912:28 operate but let's actually write this out and see how this looks let's go up
912:30 here and we'll say pd.concat we'll do
912:34 an open parenthesis and then we're going
912:36 to concatenate data frame 1 comma data
912:40 frame 2 that's all we have to write and
912:42 let's run this and so just like I said
912:45 it literally took the first data frame 1
912:47 2 3 4 and put it on top of the right
912:50 data frame 1 2 6 7 8 so that is our left
912:54 data frame this is our right data frame
912:56 and they're literally just sitting one
912:58 on top of the other but just like when
912:59 we merged either with a left or a right
913:02 when you have these skills and there
913:03 aren't any values that populate for them
913:06 it is going to say NaN not a number and
913:08 since we're not actually joining we're
913:09 not joining on one and two even though
913:11 this one and this one are the same rows
913:14 it's not populating that value because
913:16 again we're not joining these together
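A small sketch of the stacking behavior, using hypothetical reconstructions of the two frames (column names, skills and ages are assumed). Note that `pd.concat` takes the frames as a single list. The last two calls preview the `join` and `axis` arguments that the walkthrough turns to next.

```python
import pandas as pd

# Hypothetical stand-ins for data frame 1 and data frame 2.
df1 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1003, 1004],
    "FirstName": ["Frodo", "Samwise", "Gandalf", "Pippin"],
    "Skills": ["Hiding", "Gardening", "Spells", "Fireworks"],
})
df2 = pd.DataFrame({
    "FellowshipID": [1001, 1002, 1006, 1007, 1008],
    "FirstName": ["Frodo", "Samwise", "Legolas", "Elrond", "Boromir"],
    "Age": [50, 39, 2931, 6520, 51],
})

# Default: stack df2 underneath df1. No rows are matched, so 1001 and 1002
# appear twice and the missing Skills / Age cells are filled with NaN.
stacked = pd.concat([df1, df2])

# join="inner" keeps only the columns both frames share.
inner = pd.concat([df1, df2], join="inner")

# axis=1 places the frames side by side, aligned on the 0..4 row index.
side_by_side = pd.concat([df1, df2], axis=1)

print(stacked)
print(inner)
print(side_by_side)
```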
913:19 we're just concatenating and putting one on top of the other now if we go into
913:22 on top of the other now if we go into this concat we say shift tab there are a
913:25 this concat we say shift tab there are a lot of different things that we can do
913:27 like the axis which if you remember axis zero is
913:29 the left-hand index and axis one is
913:33 the top index which is the columns so
913:35 you can specify that and we can also
913:40 do joins and this is the one that I'm going to take a look at but there are
913:41 going to take a look at but there are other ones that you can um look into as
913:43 other ones that you can um look into as well let's look at join let's do comma
913:47 well let's look at join let's do comma and we'll say join is equal to and let's
913:49 and we'll say join is equal to and let's do an inner join so let's see what
913:52 do an inner join so let's see what happens with this as you can see it is
913:54 happens with this as you can see it is only taking the columns that are the
913:54 same that's what this inner is doing it's
913:56 joining these columns together and the
913:58 ones that were different it didn't
914:01 take because again we weren't able to
914:02 combine them they aren't shared between
914:05 both frames let's do an outer and now
914:07 it's going to take all of them and like
914:10 I said that's doing this on these
914:12 columns right here but we can also do it
914:14 on this axis as well so let's go ahead
914:15 and say axis is equal to one and when we
914:18 run this now it's joining these on this
914:23 index right here of 0 1 2 3 4 so now
914:24 these ones are being joined together and
914:27 it's putting them side by side much like a
914:29 merge would so that's how concatenate
914:36 works and I'm going to show you one more thing and again it's not up here in this
914:38 thing and again it's not up here in this you know title because it's not one that
914:40 you know title because it's not one that I recommend but is one called append the
914:43 I recommend but is one called append the append function is used to append rows
914:45 append function is used to append rows from one data frame to the end of
914:47 from one data frame to the end of another data frame and then we can
914:48 another data frame and then we can return that new data frame and so let's
914:50 do data frame 1 dot append we'll do an open
914:56 parenthesis and we'll say data Frame 2 very similar to how we've been doing
914:58 very similar to how we've been doing other things and let's run this and as
915:00 other things and let's run this and as you can see this is almost exactly like
915:02 you can see this is almost exactly like how the concatenate did when we first
915:04 how the concatenate did when we first did it but if we read kind of this
915:05 warning it's saying the frame.append
915:08 method is deprecated and will be removed
915:11 from pandas in a future version use
915:13 pandas.concat instead so it's
915:15 literally warning us you know append is
915:17 on its way out if you want to do exactly
915:19 what you're doing right here go and try
915:21 concat or concatenate because that'll do
915:23 the exact same thing so I'm not really
915:25 going to show you any other variations
915:27 of append because there's no reason it's
915:29 going to be on its way out in the next
915:31 version so that is our video on merge
915:34 join and concatenate and append as well
915:36 in pandas and I hope that that was
915:39 helpful I hope that you learned something I mean this stuff is really
915:41 something I mean this stuff is really important because often times you're not
915:42 important because often times you're not just working with one CSV or one Json or
915:45 just working with one CSV or one Json or one text file you're working with
915:46 one text file you're working with multiple of them and you need to combine
915:48 multiple of them and you need to combine them all into one data frame and so this
915:50 them all into one data frame and so this is a really really important concept and
915:53 is a really really important concept and thing to understand with that being said
915:55 thing to understand with that being said be sure to like And subscribe check out
915:57 be sure to like And subscribe check out all my other videos on Python and pandas
915:59 all my other videos on Python and pandas and I will see you in the next
916:07 video
916:13 [Music]
916:14 hello everybody today we're going to be building visualizations in pandas in
916:17 building visualizations in pandas in this video we'll look at how we can
916:18 this video we'll look at how we can build visualizations like line plots
916:20 build visualizations like line plots Scatter Plots bar charts histograms and
916:23 Scatter Plots bar charts histograms and more I'll also show you some of the ways
916:25 more I'll also show you some of the ways that you can customize these
916:26 that you can customize these visualizations to make them just a
916:27 visualizations to make them just a little bit better with that being said
916:29 let's go right over here start importing
916:31 our libraries and we'll start with
916:33 importing pandas as pd and this one is
916:36 really all you need to actually create
916:38 the visualizations in pandas but we may
916:40 get a little bit crazy uh and so we're
916:42 going to do a few different ones as well
916:44 like import
916:46 numpy as np and then we're going to do
916:50 import matplotlib.pyplot
916:54 as plt now I may or may not use
916:57 these I just you know when I get into
916:59 visualizations I may want to change some
917:01 different things so we're going to at
917:03 least have them here in case we do want
917:04 to use them let's go ahead and run this
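Written out, the import cell described here would look something like the sketch below. One caveat: pandas draws its plots with matplotlib under the hood, so matplotlib has to be installed even if you never import it yourself.

```python
import pandas as pd               # data frames and the .plot() accessor
import numpy as np                # optional here, handy for numeric helpers
import matplotlib.pyplot as plt   # optional here, used to fine-tune figures

print(pd.__version__)
```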
917:09 so now let's get our data set that we're going to be using so let's say data
917:12 frame is equal to pd.read_csv
917:16 and let's get this in right here now
917:21 we're going to be doing these ice cream ratings let's take a look at this really
917:22 ratings let's take a look at this really quickly now these values are completely
917:26 quickly now these values are completely randomly generated they are not real in
917:28 randomly generated they are not real in any way um but that's what we're going
917:31 any way um but that's what we're going to be using cuz I just wanted something
917:32 to be using cuz I just wanted something kind of generic something that wouldn't
917:33 kind of generic something that wouldn't be too crazy confusing just something
917:35 be too crazy confusing just something that we could use and you guys can
917:37 that we could use and you guys can understand that they're just numerical
917:38 understand that they're just numerical values but let's also set that index
917:41 really quick so we'll say data frame dot
917:43 set_index and then we'll say date and
917:47 then we'll assign that back to the data
917:51 frame and we have this date column right here as our index so we have uh January
917:54 here as our index so we have uh January 1st 2nd 3rd 4th and then we have our
917:56 1st 2nd 3rd 4th and then we have our ratings right here and again these are
917:59 ratings right here and again these are all just integers and they're pretty
918:00 all just integers and they're pretty easy or are really easy to demonstrate
918:03 easy or are really easy to demonstrate how you can visualize these so that's
918:05 how you can visualize these so that's why we're using it today so the way that
918:06 why we're using it today so the way that we visualize something in pandas is we
918:08 we visualize something in pandas is we use something called plot so let's just
918:11 use something called plot so let's just take our data frame we'll do data frame.
918:13 take our data frame we'll do data frame. plot and we'll do our parentheses now
918:16 plot and we'll do our parentheses now let's go in here really quickly let's
918:18 let's go in here really quickly let's hit shift Tab and this is going to come
918:20 hit shift Tab and this is going to come up and this is pretty important because
918:24 up and this is pretty important because this kind of is going to tell us what we
918:25 this kind of is going to tell us what we can do within this plot and
918:27 can do within this plot and unfortunately there isn't like a quick
918:29 unfortunately there isn't like a quick overview we just have this doc string
918:31 overview we just have this doc string but we have our parameters right here
918:33 but we have our parameters right here these are what we can pass in to kind of
918:35 these are what we can pass in to kind of customize our visualization so the data
918:38 customize our visualization so the data is going to be our data frame then we
918:40 is going to be our data frame then we have our X and Y labels we can specify
918:43 have our X and Y labels we can specify the kind and this one's important
918:44 the kind and this one's important because you can specify what kind of
918:47 because you can specify what kind of visualization do we want we can do a
918:49 visualization do we want we can do a line plot horizontal a vertical bar plot
918:53 line plot horizontal a vertical bar plot histogram box plot and then a few others
918:55 histogram box plot and then a few others including area Pi density all these
918:58 including area Pi density all these other things we can also specify if we
919:00 other things we can also specify if we want it to be a subplot and a lot of
919:02 want it to be a subplot and a lot of these things that I'm specifying you
919:04 these things that I'm specifying you know I'm going to show you how to do you
919:06 know I'm going to show you how to do you can use a different indexes you can add
919:08 can use a different indexes you can add titles add grids Legends Styles all
919:11 titles add grids Legends Styles all these different things I mean you can go
919:13 these different things I mean you can go through here CU there are a lot but you
919:15 through here CU there are a lot but you can specify and and you know customize
919:17 can specify and and you know customize all of these things we won't be going
919:19 all of these things we won't be going into all of them but I will show you
919:21 into all of them but I will show you some of the ones that I probably use the
919:23 some of the ones that I probably use the most and that I think are the most
919:24 most and that I think are the most useful to know right away so let's get
919:26 useful to know right away so let's get out of here and we're just going to do
919:28 out of here and we're just going to do DF do plot and when we run this we'll
919:31 DF do plot and when we run this we'll get this right here and that was super
919:33 get this right here and that was super super easy created a line plot by
919:35 super easy created a line plot by literally doing just about nothing
919:37 literally doing just about nothing nothing um but by default it's going to
919:39 nothing um but by default it's going to give us a line plot so if we come up
919:42 give us a line plot so if we come up here we say kind and let me get that out
919:45 here we say kind and let me get that out of the way is equal to line and we run
919:49 of the way is equal to line and we run this so by default without us actually
919:51 this so by default without us actually having to input anything it's giving us
919:53 having to input anything it's giving us that line plot as a default so uh we can
919:56 that line plot as a default so uh we can specify it's a line plot as you can see
919:58 specify it's a line plot as you can see we already have all of our data right
920:00 we already have all of our data right here we didn't have to specify anything
920:02 here we didn't have to specify anything it kind of automatically took it in it
920:04 it kind of automatically took it in it is visualizing all three of these
920:06 is visualizing all three of these columns
920:08 columns and it has this little um Legend right
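A minimal sketch of the default call described here. The data frame is an assumption: three made-up rating columns of random values between 0 and 1 on a date index, standing in for the ice cream ratings on screen. The Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=10, name="Date"),
)

ax_default = df.plot()          # no arguments: pandas draws a line plot
ax_line = df.plot(kind="line")  # the same chart with the kind spelled out

# Every numeric column becomes its own line, and the legend lists the columns.
print(len(ax_default.get_lines()), len(ax_line.get_lines()))
```

Both calls should give the same picture, since "line" is the default value of kind.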
920:10 and it has this little legend right
920:12 here and we can specify where we want
920:14 that there is an argument to be able
920:16 to do that it also gave us these tick
920:21 marks of 0.2 0.4 0.6 0.8 1.0 again it read the data in and
920:25 said it's only going from 0.0 to 1.0
920:27 that is kind of the peak and so it kind
920:29 of automatically gave us these ticks for
920:30 us again that's another thing that you
920:33 can specify we can make it go up to 2 5 10
920:35 1,000 whatever you want it to be and
920:37 then we're doing this based off of
920:39 this date value right here really
920:40 quickly I wanted to give a huge shout
920:42 out to the sponsor of this entire pandas
920:44 series and that is Udemy Udemy has some of
920:46 the best courses at the best prices and
920:48 it is no exception when it comes to
920:50 pandas courses if you want to master
920:51 pandas this is the course that I would
920:52 recommend it's going to teach you just
920:54 about everything you need to know about
920:56 pandas so huge shout out to Udemy for
920:57 sponsoring this pandas series and let's
920:59 get back to the video if we wanted to
921:03 break these out by the actual column we
921:06 could go in here and say subplot is
921:10 equal to True and it's actually subplots
921:13 whoops and now we can run that and then
921:14 we can see each of those columns being
921:16 broken out by themselves instead of them
921:19 all being in one visualization it's now
921:21 three separate visualizations now
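A short sketch of the subplots argument just described, using the same kind of made-up data (random ratings on a date index, with assumed column names). Note the plural spelling, subplots, which is the one the video corrects itself to.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=10, name="Date"),
)

# subplots=True returns an array of Axes, one per column.
axes = df.plot(subplots=True)
print(len(axes))
```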
921:23 let's go right over here we're going to
921:24 get rid of the subplots I want to show
921:26 you just some of the different arguments
921:28 that you can use to make this look nice
921:29 because I don't want to do this on
921:31 every single visualization I just want
921:34 to show you what you can do so we have
921:36 this one right here we can add a title
921:37 notice there's no title or anything
921:39 really telling us what that is so we can
921:44 say comma title and we'll say ice cream
921:47 ratings if we run this we now have this
921:50 nice title right here now we can also
921:52 customize the labels or the titles for
921:54 the x and y axis it automatically took
921:57 this date which is right here this is
921:59 our date index it automatically took
922:01 that for us but we can customize that if
922:04 we'd like to all we have to do is comma
922:07 and then we'll say xlabel is equal to
922:10 and so our x is this date one right here
922:12 and we can say daily
922:15 rating and then we can do the ylabel
922:18 we'll say ylabel is equal to and for
922:20 this one we can say
922:22 scores I hope you cannot hear my dogs in
922:24 the background because they're being insane
922:26 but let's go ahead and run this and now
922:28 we have these daily ratings on the x-axis
922:31 and on the y-axis we have scores now
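The title and axis labels from this part can be sketched in one call. The data frame is again a made-up stand-in with assumed column names, and the exact capitalisation of the label strings is a guess at what was typed on screen.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=10, name="Date"),
)

ax = df.plot(
    kind="line",
    title="Ice Cream Ratings",  # text shown above the chart
    xlabel="Daily Rating",      # replaces the index name pandas picked up
    ylabel="Scores",
)
print(ax.get_title(), ax.get_xlabel(), ax.get_ylabel())
```

Without xlabel, pandas falls back to the name of the index, which is why the chart showed the date label on its own.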
922:33 let's go right down here and start
922:35 taking a look at our next kind of
922:37 visualization which is going to be a bar
922:40 plot so we'll do df.plot we'll do
922:43 kind is equal to and for this one we're
922:45 going to say bar now this is what your
922:47 typical bar plot will look like and a
922:50 lot of the arguments that we just did on
922:52 the line plot you can also apply to this
922:54 bar plot something that's unique to the
922:56 bar plot is that you can also make it a
922:58 stacked bar plot all we have to do is go
923:00 in here we'll say comma and we'll say
923:04 stacked is equal to True so now this is
923:06 going to make it a stacked bar chart
923:07 instead of just you know your regular bar
923:09 chart let's go ahead and run this and as
923:11 you can see this is now stacked on top
923:13 of one another with each of these
923:15 columns all representing the values that
923:17 they have now we don't always have to do
923:20 every single column we can also specify
923:22 the column that we want so let's take
923:25 the flavor rating for example we could
923:30 do flavor rating
923:33 and then it's only going
923:35 to take in that flavor rating column and
923:37 if you notice we don't have a legend
923:38 that's only when you have multiple
923:40 values and we are only looking at this
923:43 one column so all the values are right
923:44 here now in this bar chart it
923:47 automatically defaults to a vertical bar
923:49 chart but you can change it to a
923:50 horizontal bar chart let's go ahead and
923:53 take a look at how to do that I'll bring back
923:57 all of them we'll do df.plot dot and
923:58 then we'll say
924:01 barh and I don't know if I can keep
924:03 that kind equals bar let me run this
924:04 yeah I need to get rid of that because
924:08 barh is its own this is its own
924:10 function so now I'm going to run this it
924:12 should just have a stacked bar chart
924:15 except now it should be horizontal so
924:17 now you can see this worked properly
924:20 it's basically the exact same thing as a
924:22 vertical bar chart just now horizontal
924:23 which may look better especially
924:25 depending on if you have values like
924:28 this or you know something else that
924:30 just looks better being horizontal now
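The bar variants covered in this stretch can be sketched side by side. The data is a made-up stand-in with assumed column names, and selecting the single column as a Series is only one plausible reading of what was typed in the video.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=10, name="Date"),
)

ax_bar = df.plot(kind="bar")                    # grouped vertical bars
ax_stacked = df.plot(kind="bar", stacked=True)  # columns stacked on one another

# One column only: a Series plot has no legend by default. A fresh Axes is
# passed in because a Series plot otherwise draws on the current figure.
fig, ax_one = plt.subplots()
df["Flavor Rating"].plot(kind="bar", ax=ax_one)

# Horizontal bars have their own method, so kind="bar" is dropped here.
ax_barh = df.plot.barh(stacked=True)
```

In the stacked versions each bar starts where the previous column's bar ended, which is what makes the total height (or width) the sum of the three ratings.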
924:31 the next one that we're going to take a
924:33 look at is the scatter plot so we're
924:37 going to say df.plot.scatter
924:39 and if we run this we're going
924:42 to get an error what we need in order to
924:44 run this properly is we need to specify
924:46 the x and the y axis in order for this
924:50 scatter plot to work so let's go here
924:53 and we'll say x is equal to and we can
924:55 take any of our columns that we have up
925:01 here so we'll say x is equal to texture
925:05 rating and then y is equal to we'll
925:07 do overall rating
925:09 now when we run this it should work
925:11 properly let's go ahead and take a look
925:15 now if we go in here and we do shift tab
925:16 we can also see some other things that
925:19 we can specify so let's go right down
925:21 here so we have our x and we have our y
925:22 and those are the ones that we just did
925:25 we can also pass through an s which is
925:28 going to change the size
925:29 of the actual dots right here in our
925:33 scatter plot then we can also do a c
925:35 which is the color of each point let's
925:37 start with the s
925:39 let's say s is equal to let's just do
925:42 100 let's see what that looks like so we
925:44 have a much larger number let's do 500
925:46 and see what that looks like so we can
925:48 make these much larger on our
925:50 visualization depending on what you're
925:52 looking for we can also look at the
925:56 color let's put comma c so for color we
925:59 can say color is equal to and let's do
926:02 yellow let's see if this works so now
926:04 we've changed it to yellow that looks
926:07 absolutely terrible but it does work now
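A small sketch of the scatter call with the x, y, s and c arguments mentioned here. The data frame and its column names are assumptions standing in for the ratings data in the video.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=10, name="Date"),
)

# Calling scatter with no arguments fails because x and y are required.
try:
    df.plot.scatter()
except TypeError as err:
    print("scatter needs x and y:", err)

ax = df.plot.scatter(
    x="Texture Rating",
    y="Overall Rating",
    s=100,       # marker size
    c="yellow",  # marker color
)
```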
926:09 let's move on to the histogram a histogram
926:11 is always a good one it's very similar
926:14 to something like a bar chart but what's
926:15 great about a histogram is you can
926:18 specify the bins so let's go ahead
926:20 and say
926:23 df.plot.hist then we'll do an open
926:26 parenthesis and let's go ahead and hit
926:28 shift tab in here and take a look at this
926:32 one as well so some of our parameters
926:33 are the actual columns of the data
926:36 frame that we want to pull in and
926:38 you can choose the bins and they have a
926:40 default of 10 in here and so let's take
926:42 a look at how this works so we'll just
926:46 run this as it is so this is by default
926:48 what this histogram is going to look
926:51 like let's go ahead and specify our bins
926:53 we'll just say it was 10 by default
926:56 let's just do 20 and see what that looks
926:57 like so there are smaller bars right
927:00 off the bat and remember histograms are
927:02 really good for showing the distribution of
927:04 variables you know that's really what a
927:06 histogram is for but of course since
927:08 these are completely random numbers this
927:10 histogram isn't going to make any sense
927:12 at all but you can at least kind of see
927:14 visually how it works and if I didn't
927:16 mention it before which I should have
927:18 the bins represent how many intervals or bars
927:22 are down here so if we just do one there's
927:25 only going to be one very large you
927:29 know bar we could even go further
927:31 down from 10 and do five so now there's
927:34 only one two three four five so the
927:36 distribution gets smaller and things
927:39 get more compact as you spread it out
927:41 again like if we did
927:44 100 it's going to spread it out a lot
927:46 and this is what it shows you know it's
927:48 showing the distribution of the values
927:51 across however many bins you want so the 10
927:54 by default you know it usually is pretty
927:55 good for a lot of different things now
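To make the bins argument concrete, here is a sketch on made-up random data (the column names and the row count of 50 are assumptions). The np.histogram call is not from the video; it is added only to show that bins is the number of intervals the value range is cut into.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((50, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=50, name="Date"),
)

ax_default = df.plot.hist()    # bins defaults to 10
ax_20 = df.plot.hist(bins=20)  # more, narrower bars

# The same idea in numbers: 5 bins means 5 counts and 6 bin edges.
counts, edges = np.histogram(df["Flavor Rating"], bins=5)
print(counts, edges)
```

Fewer bins give wider bars with more values in each, and more bins give narrower bars with fewer values in each.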
927:57 let's go down here and look at the box
927:59 plot and the box plot is a pretty
928:01 interesting one let's go ahead and
928:02 visualize it really quickly and then
928:05 I'll kind of explain how this one works
928:09 let's do df.boxplot let's run this and
928:10 really what we're looking at is some
928:12 different markers within our data this
928:15 line right here is the minimum value
928:16 within that column we also have the
928:19 bottom of the box which is the 25th
928:21 percentile of all the values within just
928:26 this column this is the 50th percentile then we have the 75th
928:28 and then up here we have our maximum
928:30 value so I can take a glance at this and
928:32 see that we have a low minimum a high
928:35 maximum and it definitely skews towards
928:37 the lower range whereas if I look over
928:39 here we have a lower minimum and a
928:42 higher maximum and you can see that this
928:44 median point is at 0.6 versus 0.4 over
928:46 here so this one skews a lot higher now
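The markers described here can be checked against the numbers with a short sketch. The data is a made-up stand-in with assumed column names. One caveat that goes beyond the video: by default matplotlib draws the whiskers to the most extreme points within 1.5 times the interquartile range, so they match the minimum and maximum only when there are no outliers, as is typically the case with uniform random values like these.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=10, name="Date"),
)

ax = df.boxplot()  # one box per numeric column

# The same five markers as numbers: min, 25%, 50% (median), 75%, max.
summary = df.describe().loc[["min", "25%", "50%", "75%", "max"]]
print(summary)
```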
928:48 let's go down here and take a look at an
928:53 area plot we'll do df.plot.area and
928:55 let's just run this this is what we're
928:57 going to get by default now something I
928:59 wanted to show you earlier I just
929:00 haven't gotten around to I want to show
929:02 you something called figure size or figsize
929:05 so for this you know it just
929:07 looks small it looks a little bit
929:08 cramped let's say we want to increase
929:11 the size of this we'll say figsize
929:14 is equal to and let's just
929:18 do a parentheses and say 10 comma 5 that
929:20 should be pretty large this is going to
929:22 make it a lot larger just something I
929:23 wanted to throw in there I look at these
929:25 area charts as pretty similar to like a
929:28 line chart if we went and compared those
929:30 they'd be pretty similar but they're
929:32 different visually and you know you
929:33 absolutely can use these for different
929:36 types of visualizations but I don't use
929:37 this one a lot if I'm being honest
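A sketch of the area plot with the figsize argument, again on made-up data with assumed column names. The tuple is width and height in inches.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=10, name="Date"),
)

# figsize is (width, height) in inches and works on the other plot kinds too.
ax = df.plot.area(figsize=(10, 5))
print(ax.get_figure().get_size_inches())
```

Area plots in pandas are stacked by default, which is the main visual difference from the plain line plot.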
929:39 that's why it's kind of towards the end
929:40 of the video but you definitely can do
929:43 it let's go on to our very last one of
929:45 the video that's going to be the
929:49 beautiful pie chart let's say df.plot.pie
929:53 do an open parenthesis and let's run it
929:54 we're going to get this error that's
929:57 because we need to specify what column
929:59 we're working with here so let's just
930:02 say the y and that's what we need let me
930:03 open this up for
930:06 us right here we have our y and this is
930:08 our label or a column that we're
930:10 going to plot that's really all we need
930:14 so we can just say y is equal to flavor
930:18 rating let's run this
930:20 and now we get this visualization right
930:22 here let's make this one a little bit
930:30 bigger figsize is equal to 10 comma 6
930:31 now it's a little bit bigger it
930:33 definitely depends so this legend is
930:35 going to auto-populate you know you can
930:38 make this as big as you want and
930:39 obviously it's going to look a little
930:41 bit better if you do it larger and these
930:42 colors auto-populate now you can
930:45 customize these colors although I found
930:46 that when you have a
930:48 lot of them it's harder to customize
930:50 them as easily but you know definitely
930:52 look into it these are things that
930:54 almost everything in here is something
930:56 that you can customize in some way
930:58 although it does get a little bit tricky
930:59 you definitely have to do some research
931:01 and some Googling around just to kind of
931:03 figure out how to do those things now
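The pie chart steps can be sketched as follows, with a made-up data frame and assumed column names. The try block reproduces the error you get when no column is given.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=10, name="Date"),
)

# A data frame pie chart needs to know which column to draw.
try:
    df.plot.pie()
except ValueError as err:
    print("pie needs a column:", err)

# One wedge per row of the chosen column; labels come from the index.
ax = df.plot.pie(y="Flavor Rating", figsize=(10, 6))
```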
931:05 one last thing that I wanted to show and
931:07 something you know I could have probably
931:09 done at the beginning is you can
931:12 actually change the style of the visual and
931:15 we can do that pretty easily within matplotlib
931:17 there are different styles and so
931:21 let's go right here let's add a new row
931:24 a new cell and we'll say print and we'll
931:26 do plt so that's matplotlib right
931:31 here we'll do plt.style.available
931:33 and what this is going to do
931:35 is show
931:37 us all these different types
931:40 of stylings that you can do to kind of
931:42 change up this visualization then once
931:44 we find the one that we like we'll just
931:49 do plt.style.use and then in the
931:51 parentheses we'll just specify which one
931:54 we want now there's all these seaborn
931:57 ones and seaborn is a really great
932:00 library let's try seaborn-deep
932:02 I haven't tried this one at all
932:04 let's go ahead and try this and it just
932:06 changes some of the colors some of the
932:09 visuals we can try something like
932:13 fivethirtyeight let's try this that looks quite a
932:16 bit different and let's try something
932:19 like classic I don't know what this
932:21 one looks like let's just try
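A sketch of listing and applying a style. One detail here is not from the video: newer matplotlib releases renamed the seaborn styles (for example to seaborn-v0_8-deep), so the sketch looks the name up in plt.style.available instead of hard-coding it and falls back to classic. The data frame is a made-up stand-in.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the video's data (column names are assumed).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10, 3)),
    columns=["Flavor Rating", "Texture Rating", "Overall Rating"],
    index=pd.date_range("2023-01-01", periods=10, name="Date"),
)

print(plt.style.available)  # style names bundled with this matplotlib install

# Pick whichever "deep" seaborn style this install provides, else classic.
deep_styles = [name for name in plt.style.available if name.endswith("-deep")]
chosen = deep_styles[0] if deep_styles else "classic"

plt.style.use(chosen)  # affects every plot drawn afterwards
ax = df.plot(title=f"style: {chosen}")
```

Swapping chosen for "fivethirtyeight" or "classic" is enough to try the other two styles mentioned.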
932:23 it so you can try out all these
932:24 different styles find one that you like
932:26 find one that you think looks really
932:28 nice and you can run with it through all
932:30 your visualizations so this has been our
932:32 video on visualizing data in pandas I
932:34 think it's a really good introduction
932:35 to how you can visualize data within
932:37 python and in future videos we'll look
932:40 at matplotlib and seaborn which are some
932:42 really great libraries for visualizing
932:44 data which I use a lot so I hope that
932:46 you enjoyed this video if you did be
932:47 sure to check out all my other videos on
932:49 python and pandas and I will see you in
932:52 the next video
933:02 [Music]
933:04 hello everybody today we're going
933:06 to be cleaning data using paint P now there are literally hundreds of ways
933:08 there are literally hundreds of ways that you can clean data within pandas
933:10 that you can clean data within pandas but I'm going to show you some of the
933:11 but I'm going to show you some of the ones that I use a lot and ones that I
933:13 ones that I use a lot and ones that I think are really good to know when you
933:14 think are really good to know when you are cleaning your data sets so we're
933:16 are cleaning your data sets so we're going to start by saying import pandas
933:19 going to start by saying import pandas aspd and we're going to run that and now
933:22 aspd and we're going to run that and now we're going to import our file so we're
933:24 we're going to import our file so we're going to say data frame is equal to PD
933:27 going to say data frame is equal to PD that's pandas do read uncore and we
933:30 that's pandas do read uncore and we actually have this in an Excel file so
933:31 actually have this in an Excel file so we'll say read oops say read Excel do an
933:35 we'll say read oops say read Excel do an open parenthesis eses and we'll do R and
933:38 open parenthesis eses and we'll do R and then we'll paste the path right here and
933:40 then we'll paste the path right here and now we're just going to call that
933:41 now we're just going to call that variable so we'll call data frame and
933:42 variable so we'll call data frame and we'll actually read it in and look at
933:44 we'll actually read it in and look at the data so let's scroll down here and
933:46 let's take a look at this data frame or
933:48 this Excel file that we're reading in so
933:50 right off the bat we have this customer
933:51 ID that goes from 1001 all the way down
933:54 to
933:55 1020 we have this first name and
933:58 everything looks pretty good here except
934:01 in this last name column uh looks like
934:03 we have some errors we have some forward
934:06 slashes some dots some null values um so
934:11 definitely going to have to clean that
934:12 up because we don't want that in the
934:14 data we have a phone number and it looks
934:17 like we have a lot of different formats
934:19 um as well as NaNs not a number um just
934:24 lots of different stuff so we're going
934:25 to need to standardize that so clean it
934:27 up and then standardize it to where it
934:29 all looks the same um we also have
934:32 address and it looks like on some of
934:34 these we just have a street address but
934:36 on some of the other ones we have like a
934:38 street address and another location as
934:41 well as a zip code in some of them so
934:44 we'll probably want to split those out
934:45 we have a paying customer uh which is
934:48 yeses and nos and some of those are not
934:50 the same so we'll have to standardize that
934:52 we have a do not contact kind of the
934:54 same thing as the paying customer and we
934:56 have this not useful column which we'll
934:58 probably just want to get rid of okay so
935:01 the scenario is that we got handed
935:03 this list of names and we need to clean
935:05 it up and hand it off to the people who
935:07 are actually going to make these calls
935:09 to this customer list so they want all
935:11 the data in here standardized and
935:13 cleaned so that the people who are
935:14 making those calls can just make those
935:16 calls as quickly as possible but they
935:18 also don't want columns and rows that
935:20 aren't useful to them so things like
935:22 this not useful column we're probably
935:24 going to get rid of and then ones that
935:27 say do not contact if it says yes we
935:29 should not contact them we probably will
935:31 want to get rid of those somehow so
935:33 that's a lot of what we're going to be
935:34 doing to clean this data set normally
935:37 the very first thing that I do when I'm
935:39 working with a data set most of the time
935:41 except very rare cases when you're
935:43 actually supposed to have duplicates is
935:45 I actually go and drop the duplicates
935:47 from the data set completely all you
935:49 have to do for that is say
935:53 df.drop_duplicates() so they make it
935:55 super easy for you let's just run it and
935:59 up here is our original data set we have
936:02 this 19 and 20 and those are obviously
936:04 duplicates they have the exact same data
936:06 it's just a duplicate row that we need
936:08 to get rid of if we look right down here
936:11 we no longer have that 20 we now just
936:13 have one row of Anakin Skywalker and of
936:16 course we want to save that so we're
936:18 just going to say df is equal to this and then df
936:23 so now it's going to save that to the
936:25 data frame variable again and now when
936:27 we run this our data frame now does not
936:29 have any duplicates that's definitely
936:31 one of the easier steps that we're going
936:33 to look at uh things are going to get
936:34 quite a bit more complicated as we go
936:36 but I'm starting out you know kind of
936:38 simple so that we can kind of get a feel
936:40 for it and then we'll start getting into
936:42 the really tough stuff so the next thing
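A minimal sketch of the duplicate-dropping step described above, assuming pandas is imported as `pd`. The small data frame and its column names (`CustomerID`, `First_Name`, `Last_Name`) are made up for illustration and are not the actual Excel file from the video.

```python
import pandas as pd

# Hypothetical stand-in for the customer list; the last two rows are identical.
df = pd.DataFrame({
    "CustomerID": [1018, 1019, 1020, 1020],
    "First_Name": ["Harry", "Don", "Anakin", "Anakin"],
    "Last_Name": ["Potter", "Draper", "Skywalker", "Skywalker"],
})

# drop_duplicates returns a new data frame, so assign it back to keep the change.
df = df.drop_duplicates()

print(df)
```

Only fully identical rows are removed here; rows that differ in any column are kept.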
936:44 that I want to do is remove any columns
936:46 that we don't need I don't want to clean
936:48 data that we're not going to use so if
936:51 we're just looking through here you know
936:52 they may need you know first name last
936:54 name phone number for sure address might
936:57 give them some information of where
936:58 they're calling to or time zone so we
937:00 want that this not useful column looks
937:03 like a pretty good candidate to delete
937:06 and it's very easy to do that we're
937:08 going to go right down here and we're
937:09 going to say df.drop and we'll do an
937:13 open parenthesis drop just means we are
937:16 dropping that column and we can specify
937:18 that by saying columns is equal to and
937:21 then we'll paste in that column that we
937:24 want to delete so let's run this and see
937:26 what it looks like and it literally just
937:29 drops that column exactly like we were
937:30 talking about it no longer has that
937:32 column again we want to save that we can
937:34 always do inplace equals true um if
937:37 you follow this tutorial series you can
937:38 always do inplace equals true and
937:40 that'll save it as well but just for our
937:42 workflow most of the time I'm going to
937:44 assign it back to that variable um just
937:46 for keeping it the same
937:49 really quickly I wanted to give a huge shout out to the
937:50 sponsor of this entire pandas series and
937:52 that is Udemy Udemy has some of the best
937:54 courses at the best prices and it is no
937:57 exception when it comes to pandas
937:58 courses if you want to master pandas
938:00 this is the course that I would
938:01 recommend it's going to teach you just
938:02 about everything you need to know about
938:04 pandas so huge shout out to Udemy for
938:06 sponsoring this pandas series and let's
938:07 get back to the video now let's kind of
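A short sketch of dropping a column and keeping the result, assuming pandas is imported as `pd`. The column names, including `Not_Useful_Column`, are placeholders for illustration rather than the confirmed names in the file.

```python
import pandas as pd

# Hypothetical columns; "Not_Useful_Column" stands in for the column being removed.
df = pd.DataFrame({
    "First_Name": ["Frodo", "Abed"],
    "Phone_Number": ["123-545-5421", "123/643/9775"],
    "Not_Useful_Column": [True, False],
})

# Option 1: assign the result back to the same variable.
df = df.drop(columns="Not_Useful_Column")

# Option 2 (same effect): modify the data frame in place instead.
# df.drop(columns="Not_Useful_Column", inplace=True)

print(list(df.columns))
```

Assigning back keeps every step in the same `df = ...` shape, which is the workflow preference mentioned above.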
938:09 go column by column and see what we need
938:12 to fix and we'll start on this left-hand
938:13 side this customer ID to me looks
938:16 perfectly fine I'm not going to mess
938:17 with it at all the first name at a
938:20 glance also looks perfectly fine I don't
938:23 see anything wrong with it visually
938:25 which is a good thing um although
938:27 sometimes that can be deceiving and that
938:28 can cause errors down the line but we're
938:30 not going to uh assume that there are
938:32 errors in here now let's look at this
938:34 last name now the last name obviously
938:36 I'm seeing some obvious things
938:38 things that we talked about when we were
938:39 first looking at this data set we have
938:41 this forward slash which we definitely
938:44 need to get rid of we have null values
938:47 so not a number right here we have some
938:50 periods as well as an underscore right
938:52 here so all those things I think we
938:54 should clean up and get rid of so
938:56 that when the person is making these
938:57 calls you know it's all cleaned up for
938:59 them so how are we going to do that we
939:02 can actually do this in several
939:03 different ways but let's just copy this
939:05 last name the first one I'm going to
939:07 show you is strip and we'll write it
939:09 kind of like this we'll say data frame
939:11 and then we'll specify the column that
939:13 we're working with because we don't want
939:14 to make these changes or strip all of
939:17 these values from everywhere we only
939:19 want to do it on just this column if we
939:22 do this and we don't specify the column
939:23 name it will apply to everywhere so if
939:25 we're trying to do these yeah let's say
939:28 um these underscores maybe that would
939:30 mess with something else in another
939:32 column and we don't want that so we just
939:34 want to specify just this last name so
939:37 let's go last name
939:40 .str.strip now what strip does and
939:43 let's see if we can open this up really
939:45 quickly no we can't um but what strip
939:47 does I was just I was hitting shift tab
939:50 in here to see if it could bring up um
939:52 you know some of the notes on it but
939:53 what strip does is it takes either the
939:55 left side or the right side well lstrip
939:58 takes from the left side rstrip takes
940:00 from the right side and strip takes from
940:03 both but you can strip values off the
940:05 left and the right hand side and we can
940:07 specify those values now for what we're
940:09 doing in this column we can just use
940:12 strip because as you can see this
940:13 forward slash these dots as well as this
940:17 um underscore are all on the far sides
940:20 if there was a value like Swan_son
940:24 the strip wouldn't work at all because
940:26 it's not on the outside of the value or
940:28 the word so we can use strip I'll also
940:31 show you how to use replace and replace
940:34 is another really good option for things
940:36 like this but let's start with strip and
940:38 just see what it looks like and see if
940:39 we can get what we need done so let's
940:41 just run this for now see what happens
940:45 so it looks like nothing has changed
940:48 because again we're not specifying any
940:49 specific value just by default it's only
940:52 taking out whitespace so like spaces
940:54 that shouldn't be there that's what it
940:56 does by default now we can specify
940:58 within this exactly what values we want
941:01 to take out so let's go ahead and do
941:03 that let's say lstrip and let's try
941:06 to take out these dots real quick so
941:08 we're just going to do a parenthesis dot
941:10 dot dot now let's run this and see what
941:12 it looks
941:13 like for this one Potter it is now gone
941:18 so those three dots were there before
941:20 let's just show it so they were there
941:23 and then when I ran it like this now
941:25 they're gone that's what the lstrip
941:27 does it takes it only off the left hand
941:30 side now we can also do a forward slash
941:33 so we'll do something like this and
941:34 it'll get rid of the one on White but as you
941:37 can see now we aren't taking out these
941:39 three dots so they're still there now is
941:41 it possible to do something like this
941:45 where we put these values inside of a
941:46 list um let's try it so we'll say just
941:49 like this one two three let's run it and no
941:53 it doesn't um this lstrip actually sits
941:56 within the realm of regular
941:57 expression so if you've ever worked with
942:00 regular expression you know it gets very
942:02 complicated very complex so you want to
942:04 keep it kind of simple especially with
942:06 these values where we're just taking a
942:07 few out so what we're going to do is
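A small sketch of the behaviour just described, assuming pandas is imported as `pd`. The sample last names are invented to resemble the stray characters mentioned in the video.

```python
import pandas as pd

# Hypothetical last names with stray characters and extra spaces.
last_names = pd.Series(["...Potter", "/White", "Swanson_", " Baggins "])

# With no argument, strip only removes leading and trailing whitespace.
stripped = last_names.str.strip()
print(stripped.tolist())

# With an argument, lstrip removes those characters from the left side only.
left = last_names.str.lstrip("...")
print(left.tolist())
```

The string passed to `lstrip` is treated as a set of characters to remove from the edge, so `"..."` and `"."` behave the same way here.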
942:09 we're going to do dot dot dot and we're
942:13 going to take them out one by one now in order to
942:15 save this because we want to save this
942:17 we want to take out that value we don't
942:19 just want to say data frame equals
942:20 because that would be uh very bad what
942:23 this would say is now this data frame is
942:25 only equal to these values that we're
942:27 seeing right here we want to only apply
942:29 it to this column so we're going to go
942:32 like this so now when we do it and then
942:36 we call the entire data frame it's only
942:39 applying this to this one column the
942:42 last name column so let's run
942:44 it and now when we go down to Potter
942:47 right here it's cleaned up so we're
942:50 going to do the same thing but for those
942:51 other
942:52 values and we'll do it just like this
942:54 we'll do a forward slash and it's a left
942:58 strip and then we'll do I'll do the left
943:00 strip on this underscore just to show
943:02 you that it won't work and then
943:06 we will go on from there so it's not
943:08 pulling it because we're looking at the
943:09 left hand side only we need to use
943:12 rstrip so now let's use
943:15 rstrip and now that looks perfect has no
943:18 underscore so that's how you can use
943:20 strip for either the left side the right
943:22 side or just strip by itself which
943:24 covers both sides
943:26 now I showed you all of that because I am going to show you a
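The one-by-one approach walked through above could look roughly like this, assuming pandas is imported as `pd`. The column names and values are illustrative stand-ins, not the real file.

```python
import pandas as pd

# Hypothetical data with one kind of stray character per last name.
df = pd.DataFrame({
    "First_Name": ["Harry", "Walter", "Ron"],
    "Last_Name": ["...Potter", "/White", "Swanson_"],
})

# Assign to the column, not to df itself, so the other columns are kept.
df["Last_Name"] = df["Last_Name"].str.lstrip("...")
df["Last_Name"] = df["Last_Name"].str.lstrip("/")

# The underscore sits on the right-hand side, so lstrip would leave it; use rstrip.
df["Last_Name"] = df["Last_Name"].str.rstrip("_")

print(df)
```

Writing `df = df["Last_Name"].str.lstrip("...")` instead would replace the whole data frame with a single column, which is the mistake warned about above.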
943:28 different way to do it um and I
943:30 apologize because I somewhat lied to you
943:31 earlier um let's run this right here
943:36 actually we're just going to pull it in
943:37 like
943:38 this we're going to remove the
943:40 duplicates again bear with me we're
943:42 going to drop that column and then now
943:45 we're sitting with that data frame again
943:47 with those exact same mistakes I just
943:48 wanted to reset it for a second there is
943:51 a way uh that you can do this and I just
943:53 wanted to you know kind of show you how
943:55 you can do it you can do this right
943:59 here and we'll say so we're now again
944:02 we're just looking at this column just
944:04 this column and we're using strip and
944:06 let's get rid of the r cuz we want to
944:08 apply it to everywhere you can input all
944:11 of those values in visually and it will
944:14 clean it up so let's say we want to get
944:15 rid of numbers we'll do one two three
944:18 then we can do the dot so that's going
944:20 to be for a period or for a dot dot dot
944:22 Potter we could also do the underscore
944:25 and we can do the forward slash so we
944:27 put it all in one string right here now
944:32 let's take a look at this we'll get rid
944:34 of this really quickly now let's take a
944:35 look and all of them were removed I
944:38 showed you how to do it before because
944:40 that's at least how my mind would think
944:41 about it I'd think oh I can put it in a
944:43 list and run it through this lstrip or
944:45 this rstrip and it would work um
944:46 but that's not how strip works you have
944:48 to kind of combine it all into one value
944:50 so uh yes I deceived you I apologize but
944:54 now when we call data frame and we
944:56 assign it to that column so the last
944:58 name column we're assigning what we just
945:00 did to this last name column everything
945:03 should look perfect
945:05 and it does so our customer ID first
945:07 name last name are all cleaned up now
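A sketch of the single-string version, assuming pandas is imported as `pd` and using invented sample names. The argument to `strip` is one string holding every character to remove from either end.

```python
import pandas as pd

# Hypothetical last names; "Swan_son" has its underscore in the middle.
last_names = pd.Series(["...Potter", "/White", "Swanson_", "Swan_son"])

# Every character in "123._/" is removed from both ends of each value.
cleaned = last_names.str.strip("123._/")

print(cleaned.tolist())
```

Characters in the middle of a value are left alone, which is why `Swan_son` keeps its underscore; `replace` would be needed for that case.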
945:12 difficult one this is probably if I'm
945:14 being honest the hardest one I said we
945:15 were going to work up but this is
945:16 probably the hardest one of the whole
945:18 video working with phone numbers and
945:20 look at all these different types of
945:24 formats I mean it is um it's not going
945:27 to be fun and imagine you know
945:29 there's 20,000 of these you can't just
945:30 go and manually clean those up you need
945:33 something to kind of automate that
945:35 so that is what we're going to do so
945:39 let's go right down here we'll copy the
945:40 data frame and I'm going to pull it
945:43 right here so now we need to clean up
945:45 this phone number what we want is it all
945:48 to look exactly the same unless it's
945:51 blank and we'll keep it blank we don't
945:52 want to populate that data but we want
945:55 all of them to look exactly like this
945:57 one and what we're going to do is right
946:00 off the bat we're going to take all of
946:02 the non-numeric values and just
946:04 completely get rid of them strip it down
946:06 to just the numbers so this 123-643 or
946:10 forward slash will just be the numbers
946:13 same with these bars and these slashes
946:16 and everything all of these will just be
946:18 numeric then we'll go back and reformat
946:21 it how we want to format it which will
946:24 look exactly like this one um but we
946:26 just want to do it for the entire column
946:28 so let's go right up here and we're
946:30 going to try replace for the first time
946:32 so let's do phone number
946:36 just oops that's not what I wanted so
946:39 we're going to do a bracket say phone
946:41 number
946:43 .str.replace just like we did before
946:47 now we're going to use some regular
946:49 expression in here and I'll kind of do a
946:51 really high level overview although I'm not
946:52 going to dive super deep into the
946:54 regular expression then we're going to
946:56 do a parenthesis and within there we're
946:58 going to do a bracket um I can't
947:00 remember what this is called is it
947:02 called a caret I think it's called a
947:03 caret uh I'm just going to call it that
947:05 it may not be correct but I think it's
947:07 an upper arrow so it's an upper arrow a
947:10 dash oops a-z A-Z and then
947:16 0-9 now at a super high level what that
947:19 character that first thing is doing it's
947:20 saying we're going to return any
947:22 character except and then we specify
947:25 anything a to z A to Z upper or
947:27 lowercase and then actually I think this
947:29 should be like this a to z uh and then 0
947:32 to 9 so any value like abc 123 those
947:36 are not going to be matched it's going
947:37 to match all of them except these values
947:40 and then we're going to replace them by
947:42 saying comma and we're going to replace
947:43 them with nothing so this is just an
947:45 empty string so literally we're taking
947:48 everything that is not an a b c or a one
947:51 two three so a letter or a number we're
947:53 replacing all of that and then we're
947:55 replacing it with nothing so let's run
947:57 this and see what it looks like and it
947:59 looks like that worked properly
948:02 now we do have this Na cuz we had an N/a for I
948:06 don't remember maybe that was Creed
948:08 Bratton um but it worked for basically
948:11 everything else we're going to go
948:12 through the entire process and then at
948:14 the end we'll remove any values we want
948:16 them to just be completely null we
948:18 don't want them to even see NaN and
948:20 wonder what that is we just want it to
948:22 be blank and we'll do that at the very
948:24 end so now that we know that that worked
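A hedged sketch of the regular-expression replace described above, assuming pandas is imported as `pd`. The phone values are invented examples of the mixed formats mentioned, and `regex=True` is passed explicitly because the default for that parameter has differed between pandas versions.

```python
import pandas as pd

# Hypothetical phone numbers in several formats, plus a placeholder value.
phones = pd.Series(["123-545-5421", "123/643/9775", "123|543|2345", "N/a"])

# [^a-zA-Z0-9] matches any single character that is NOT a letter or a digit;
# each match is replaced with an empty string.
cleaned = phones.str.replace("[^a-zA-Z0-9]", "", regex=True)

print(cleaned.tolist())
```

Because letters are kept, a placeholder such as `N/a` only loses its slash and still has to be handled separately.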
948:26 end so now that we know that that worked let's assign it we'll do DF phone num is
948:31 let's assign it we'll do DF phone num is equal to and then we'll say data frame
948:34 equal to and then we'll say data frame and this looks a lot more standardized
948:37 and this looks a lot more standardized than it did before already but now what
948:39 than it did before already but now what we want to do is try to format this um
948:42 we want to do is try to format this um and I've done this many many times I
948:44 and I've done this many many times I always use a Lambda you can definitely
948:47 always use a Lambda you can definitely use a for loop I just I don't do it that
948:49 use a for loop I just I don't do it that way myself so I'm going to show you how
948:51 way myself so I'm going to show you how to do it using a Lambda let's get rid of
948:53 to do it using a Lambda let's get rid of this and we're going to say thef phone
948:57 this and we're going to say thef phone number we've already done that I'm just
948:59 number we've already done that I'm just going to get rid of it now we're going
949:00 going to get rid of it now we're going to say d phone number then we're going
949:01 to say d phone number then we're going to say do apply we'll do an open
949:04 to say do apply we'll do an open parentheses and then this is where we're
949:06 parentheses and then this is where we're going to build out our Lambda so we'll
949:08 going to build out our Lambda so we'll say Lambda X colon now this is where
949:11 say Lambda X colon now this is where we're going to kind of format it so what
949:13 we're going to kind of format it so what I want to do is I want to take the first
949:15 I want to do is I want to take the first three strings one two three then I want
949:17 three strings one two three then I want to add a slash and then the next three
949:19 to add a slash and then the next three strings add a slash or a dash uh and
949:23 strings add a slash or a dash uh and then that be the value that's returned
949:24 then that be the value that's returned so it's not super difficult we're just
949:26 going to do x then a bracket let me get
949:29 rid of that an x and then a bracket and
949:31 then we want the 0 to 3 so it goes 0 1
949:35 2 so 0 1 2 it doesn't include the three
949:40 it goes up to three so 0 1 2 that's
949:42 our first three values then we'll
949:45 do plus and do a quote and do a dash so
949:50 this is our first kind of sequence and
949:52 I'm just going to copy this we'll do
949:55 plus and instead of zero we are
949:58 going to start at three because now it's
949:59 inclusive so we're going to go from
950:01 three and we're going to go all the way
950:03 up to six so it should be 3 4 5
950:06 our next three values then we have a
950:08 dash and we'll copy this and we'll say
950:12 plus and now we go from six all the way
950:17 to 10
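The slicing described here can be sketched in plain Python. This is only an illustration, assuming the value is already a 10-character string of digits; the name `format_phone` and the sample number are made up for the example.

```python
# Hypothetical helper: format a 10-digit string as 123-456-7890.
# A slice x[a:b] includes index a and stops just before index b.
format_phone = lambda x: x[0:3] + "-" + x[3:6] + "-" + x[6:10]

digits = "1235455421"  # made-up sample value
formatted = format_phone(digits)
print(formatted)
```

In pandas, the same lambda would be passed to `.apply` on the phone number column.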
950:22 now let's try running this and as you can see we get an error now I
950:24 already know what the error is float
950:26 object is not subscriptable which means
950:28 we're trying to um basically look at it
950:31 like a string right now it's not a
950:32 string it's actually a number so let me
950:36 get rid of this for just a second I'm going to
950:38 show you what it's talking about so
950:40 right now we have values that are floats
950:44 and values that are strings or not even
950:46 a number so we have values that are
950:48 strings or not a number so if we want to
950:50 actually look through it like kind of
950:52 like indexing if we want to do that they
950:54 all have to be strings so we need to
950:57 change this entire column into strings
951:00 before we can apply this um formatting
951:03 now when I was creating this if I'm
951:05 being honest my first thought when I was
951:06 doing this was to do it like this str
951:10 df phone number um let's just run that
951:13 this is what the values look like um and
951:17 I don't remember why it was doing
951:20 this I can't remember but I
951:21 looked into it quite a bit and I was
951:22 like oh I need to apply this string
951:26 converting it to a string on each value
951:30 not the entire row or not the entire
951:32 column so how we can do that is actually
951:34 fairly easy because we've already done a
951:36 lot of the heavy lifting we're just
951:38 going to copy this and we're going to
951:40 say
951:42 str of x and again a lambda is
951:47 like a little anonymous function so you
951:49 could do this by saying for um x in this
951:54 uh column we could do a for loop and
951:56 then say for every x it equals the
951:57 str of x and then it changes it to a
951:59 string but a lambda just does it a lot
952:02 quicker um so
952:04 let's do that really quickly and all of
952:07 our values look exactly the same and
952:09 that's how we want it so we're just
952:10 going to copy this apply
952:15 it good
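A small sketch of the error and the fix, assuming a column named `Phone_Number` (the exact column name is not spelled out in the transcript) and made-up sample values. A missing value is stored as a float, so slicing it raises the TypeError; converting each value with `str` first avoids that.

```python
import pandas as pd

# Made-up sample data: NaN is a float, the other entries are strings.
df = pd.DataFrame(
    {"Phone_Number": pd.Series(["1235455421", float("nan"), "Na", "8766783469"], dtype=object)}
)

# Slicing only works on strings, so the NaN row raises a TypeError
# ("'float' object is not subscriptable").
try:
    df["Phone_Number"].apply(lambda x: x[0:3])
    slicing_failed = False
except TypeError:
    slicing_failed = True

# Convert each value to a string, then apply the formatting lambda.
df["Phone_Number"] = df["Phone_Number"].apply(lambda x: str(x))
df["Phone_Number"] = df["Phone_Number"].apply(
    lambda x: x[0:3] + "-" + x[3:6] + "-" + x[6:10]
)
print(df)
```

Calling `str(df["Phone_Number"])` on the whole column would instead return one string describing the entire Series, which is a likely reason the first attempt looked wrong.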
952:20 and now we're going to take this and we're going to run this again just
952:22 ignore all my commented out stuff
952:24 pretend I don't have that um so now when
952:27 we run this it should work there we go
952:30 now if we look at these numbers
952:32 123-545-5421
952:40 and it does that for every single one where there are values even when
952:41 there's nan or Na it's still adding
952:45 those dashes but we expected that so
952:49 let's apply it say is equal to and then
952:52 we'll look at the data
952:54 frame and this looks almost exactly like what
952:57 we're hoping for we just need to get rid
952:58 of these so this nan dash dash and this Na dash dash
953:03 we need to get rid of those and that is
953:05 super easy to do um we're just going to
953:08 say so now that we've done it and we'll
953:10 comment that out we'll say
953:13 df and let's copy this ignore the
953:17 messiness I do apologize for that it's
953:18 very messy um but if you're following
953:21 along with me you get what we're doing
953:23 so df phone number so only on the phone
953:25 number say .str.
953:29 replace and an open parenthesis now we can
953:32 specify this value so we want to take
953:34 this exact
953:37 value and replace it with nothing and
953:40 let's just see if that does work it does
953:43 now we have these
953:45 Na values and so let's actually I'll paste
953:49 that right down here we're going to do
953:51 this is equal to and then we're just
953:54 going to take this entire string put it
953:56 right here and put this value as
954:00 what we're looking for and then
954:01 replacing and then when we call that
954:04 data frame it should work properly and
954:07 it is perfectly cleaned
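A minimal sketch of this cleanup step, assuming the leftover values are exactly `nan--` and `Na--` and the column is called `Phone_Number` (both are assumptions for illustration). Passing `regex=False` makes pandas treat the search text literally.

```python
import pandas as pd

# Made-up sample: two formatted numbers and two leftover placeholders.
df = pd.DataFrame({"Phone_Number": ["123-545-5421", "nan--", "Na--", "876-678-3469"]})

# Replace each leftover placeholder with an empty string.
df["Phone_Number"] = df["Phone_Number"].str.replace("nan--", "", regex=False)
df["Phone_Number"] = df["Phone_Number"].str.replace("Na--", "", regex=False)
print(df)
```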
954:11 so we have every single value all the exact same they
954:13 don't have different characters or
954:15 different um you know formatting and we
954:18 got rid of all the ones that we don't
954:19 have or don't need um all the ones that
954:21 were just random values so this column
954:25 is now completely cleaned up again
954:27 definitely one of the more difficult
954:28 ones um one that I've done a thousand
954:31 times I've had to work with a lot of
954:33 phone numbers and stuff like that
954:34 this one does get very tricky especially
954:36 if you have like a plus one which is
954:38 like a country code um that can get tricky
954:40 as well but this is on a kind of a high
954:43 level this is how you can do that and
954:44 it's pretty neat how you can actually
954:46 you know clean up and standardize those
954:48 phone numbers so let's go right down
954:50 here uh let's run it the next thing that
954:52 we're going to look at is this address
954:55 now let's just pretend that the people
954:57 who are in the call center want all
954:59 these separated into three different
955:00 columns so they can read it easier see what
955:02 the zip code is where they live
955:04 uh you know whatever they want it for
955:06 let's just say we want to do that and
955:07 this is you know again for this use case
955:09 it may not make sense but you have to do
955:11 this I do this all the time um you need
955:14 to split those columns now luckily all
955:16 of these things are separated by a comma
955:19 so we can specify that we're going to
955:20 split on this comma and then we'll be
955:22 able to create three separate columns
955:25 based off of this one column which is
955:27 exactly what we want then we can name them
955:28 as well and we can do that very easily
955:31 by using this split so we're going to
955:33 say df and we want to
955:42 specify oh jeez not again so we want to specify that we're looking at the
955:44 address then we're going to say
955:47 .str.split we'll do an open
955:50 parenthesis now the very first value
955:52 that we need to specify is what we're
955:53 splitting on so we want to split on the
955:56 comma so we want to specify that and
955:59 then we need to specify how many splits
956:01 from left to right it should look for
956:04 now we'll just start with one and then
956:07 we'll go from there let's just see what
956:09 this looks
956:10 like
956:13 so it doesn't really look like it did
956:16 anything let's do two well let's go back
956:18 to one and then let's say
956:21 expand equals True when we expand it
956:25 it's actually going to uh separate it I
956:27 believe okay so we're expanding now
956:28 we're only doing this with one comma so
956:31 we're only looking at the very first
956:33 comma and splitting it but in some of
956:35 these well just in one there is an
956:37 additional comma so we should do it up
956:39 to two let's do this okay so now we have
956:43 three columns if we just save it like
956:45 this it's going to give us these 0 1
956:47 2 these basically these indexed values
956:49 for these columns and we don't want that
956:52 we want to specify what these actually
956:54 are and we can do that by saying df and
956:56 let me just do is equal to we'll do
956:59 bracket and then within there we're
957:01 going to specify our list so we have
957:03 three of them that we have so I'm
957:05 going to do um the first one this is the
957:08 street address so we'll say street
957:12 address the next one is and this one is
957:16 not a state uh but these all are states
957:18 so I'm just going to say
957:20 state and then for the very last one
957:24 that looks like a zip code so we'll say
957:26 zip and we'll do underscore code in fact I also
957:30 want to do street underscore address um so what
957:33 this is now going to do is these
957:35 three columns are going to be applied to
957:36 these three names and they'll basically
957:38 be appended it doesn't replace the
957:41 address we're not saying df address
957:43 equals the df address we're not
957:45 replacing it we're now creating
957:47 different columns so let's run it and
957:50 then let's also call it so they're right
957:52 over here on this right hand side I
957:53 couldn't see them at first but it did
957:56 exactly what we needed it to do
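The split can be sketched as follows. The column names `Address`, `Street_Address`, `State` and `Zip_Code` follow the names mentioned above but their exact spelling is an assumption, and the sample addresses are made up. In recent pandas versions the number of splits is passed as the keyword `n`.

```python
import pandas as pd

# Made-up sample addresses with zero, one and two commas.
df = pd.DataFrame(
    {"Address": ["123 Shire Lane, Shire", "93 West Main Street, Texas, 75001", "98 Clue Drive"]}
)

# Split on the comma at most twice and expand the pieces into columns,
# then attach them to the frame under new names (Address itself is kept).
df[["Street_Address", "State", "Zip_Code"]] = df["Address"].str.split(",", n=2, expand=True)
print(df)
```

Rows with fewer commas get missing values in the extra columns, and the pieces keep the space that followed each comma, so a later `.str.strip()` may be useful.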
957:58 so now if we wanted to at the very end if we
958:00 want to we're not going to we could just
958:02 delete this address and keep the street
958:04 address the state and the zip code
958:07 another really common thing that you can
958:10 do this happens often again with like
958:12 first name last name well you'll have
958:14 Alex freeberg but it's Alex comma
958:16 freeberg or Alex space freeberg and you
958:18 can separate those out into different
958:20 columns now the next one that we want to
958:22 look at is this paying customer and the
958:24 paying customer and do not contact are
958:27 very similar um in the fact that it's
958:30 yes no N Y yes no N Y
958:34 um and so let's go right on down here
958:37 and we're going to say df dot and we
958:39 want to just replace these values as all
958:43 yeses or all nos but just with the same
958:46 formatting um just to keep it consistent
958:49 so let's make anything that's an N into
958:51 a no anything that's a Y into a yes I
958:54 like it spelled out so let's change
958:56 anything that's a yes into a Y anything
959:00 that's uh a no into an N that's
959:04 usually how I do it just saves on data
959:06 because it's fewer characters although it
959:08 can be often very minimal um but let's
959:11 specify the paying
959:14 customer we'll say df bracket paying
959:18 customer then we'll do .str.
959:22 replace so now we're just going to look
959:24 for those specific values so if it's a Y
959:28 oops a capital Y then we'll say
959:32 yes now let's run it and now we have no
959:35 more Ys we now just have yeses although
959:39 now these are Yeses okay we don't
959:42 want to do that let's do if we're
959:45 looking because it's taking it's
959:47 literally looking up here and saying
959:48 okay here's a Y um let's change
959:51 that Y into a yes so now
959:53 it's doing Yeses uh we don't want that so
959:56 let's look for the yes and change it
959:59 into a Y now when we run this that looks
960:03 a lot better um so we'll
960:06 do df paying customer is equal to and then
960:10 we'll copy this we'll do the exact same
960:13 thing
960:14 no and
960:16 N then let's call it and now that entire
960:21 column looks really good except for that
960:23 value right there but I'm going to leave
960:26 that because I'm just going to apply it
960:27 to the entire thing all at once to get
960:29 rid of those at the end instead of just
960:31 going column by column and then it's
960:33 literally going to be the exact
960:34 same thing so I'm not even going to
960:36 scroll down whoops I'm just going to put
960:39 it right up here because this is the
960:41 exact same thing I'm going to save us all
960:43 some
960:49 time and when we run this this looks exactly like what we're looking for
960:51 again some not a number values but we
960:53 can get rid of that in just a second by
960:55 doing a replace over the entire data frame
960:57 and that is basically the end of
960:59 cleaning up individual columns
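A short sketch of the yes/no standardization and the pitfall hit along the way. The column names `Paying_Customer` and `Do_Not_Contact` and the sample values are assumptions for illustration.

```python
import pandas as pd

# Made-up sample values mixing the spelled-out and one-letter forms.
df = pd.DataFrame(
    {
        "Paying_Customer": ["Yes", "No", "N", "Y", "N/a"],
        "Do_Not_Contact": ["No", "Yes", "Y", "N", ""],
    }
)

# Pitfall: replacing "Y" with "Yes" also matches the "Y" inside "Yes",
# so an existing "Yes" turns into "Yeses".
pitfall = df["Paying_Customer"].str.replace("Y", "Yes", regex=False)

# Going from the long form to the short form avoids that problem.
for col in ["Paying_Customer", "Do_Not_Contact"]:
    df[col] = df[col].str.replace("Yes", "Y", regex=False)
    df[col] = df[col].str.replace("No", "N", regex=False)
print(df)
```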
961:02 now let's go right down here we're going to say df
961:05 .str.
961:07 replace and then we'll first do these
961:10 values oops so we'll do oops let me do
961:15 that there we go and replace that with
961:18 nothing and let's just see what it looks
961:20 like oops DataFrame object has no attribute
961:23 str well that's because we were looking at
961:25 columns before yeah I think I just need
961:27 to get rid of this .str we're not
961:29 looking at a column we're just really doing it
961:30 across the entire data frame now let's
961:32 try that
961:34 okay that worked
961:35 appropriately and we'll just say data
961:37 frame is equal to and then we'll copy
961:41 this and we'll do the NaN as
961:45 well and we'll
961:49 do and now when we do this it is not
961:52 going to replace these because these
961:54 aren't actually a value because we're
961:56 looking for that string we actually need
961:57 to use and I completely forgot this
961:59 I'm not going to lie to you um let's get
962:02 rid of this uh to get rid of those values
962:04 because it's literally not a number
962:05 there it is technically empty um I
962:09 forgot we can do um or we could not even
962:12 specify it we'll do df.fillna so
962:16 we're going to fill these values if
962:18 there's nothing in them we're going to
962:20 fill it and we're going to
962:22 say blank and when we run that every
962:26 value that doesn't have something in it
962:28 is going to show up blank even over here
962:30 where we only had a few all of them
962:32 throughout the data frame if it
962:33 doesn't have a value it is now blank so
962:36 let's apply
962:38 that and we'll run
962:40 this
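A minimal sketch of the frame-wide cleanup, with made-up data. The placeholder text `N/a` is an assumption, since the transcript does not say exactly which value was typed. The `.str` accessor exists only on a single column (a Series), which is why the first attempt fails on the whole DataFrame.

```python
import pandas as pd

# Made-up sample with a text placeholder and real missing values.
df = pd.DataFrame(
    {
        "Last_Name": ["White", None, "Potter"],
        "Phone_Number": ["123-545-5421", "", float("nan")],
        "Paying_Customer": ["Y", "N/a", "N"],
    }
)

# A DataFrame has no .str accessor, so df.str.replace(...) cannot work.
has_str = hasattr(df, "str")

# DataFrame.replace swaps cells whose whole value equals the given text.
df = df.replace("N/a", "")

# Real missing values (NaN/None) are not text, so fill them instead.
df = df.fillna("")
print(df)
```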
962:44 and now all of our cleaning well actually cleaning up the individual
962:46 columns is completely done we've removed
962:49 columns we've split columns we've
962:52 formatted and cleaned up phone numbers
962:54 we've also taken values off of first
962:57 name or this last name column and
962:59 then we formatted and just kind of
963:01 standardized paying customer and do not
963:04 contact now they also asked us to only
963:07 give them a list of phone numbers that
963:09 they can call so if we take a look some
963:12 of these do not contacts are Y which
963:14 means we cannot contact them and then
963:18 there are some that don't even have
963:19 phone numbers so we don't want to give
963:21 the people in the call center numbers that
963:24 they can't call or people who don't have numbers so
963:27 we want to remove those now there's a
963:29 few different ways that we can do this
963:32 but let's start with and we'll just go
963:34 by this do not contact it seems like
963:37 the most obvious one now if it's blank
963:40 we want to give them a call we only want
963:42 to not call them if they've specifically
963:45 said we cannot call them so if it's Y
963:47 we're not going to call them so what we
963:50 need to do it's not anything like this
963:53 we probably need to loop through this
963:56 column and then look at each row that
963:59 has a value of this and drop that entire
964:01 row uh and we probably will need to
964:04 do that based off this index instead of
964:07 doing it based off just this column uh
964:10 that may not make sense but let's
964:12 actually start writing it
964:14 so we'll do for x in and we need to look at
964:18 our index so we're just going to do
964:20 let's do df.index and we'll do a colon
964:25 enter and then we want to look at these
964:28 indexes how do we look at these indexes
964:30 we use loc that's going to be df
964:34 .loc and then we need to look at the value
964:36 which is this x right here so each time
964:39 it looks at the index it's looking at
964:41 the value but we want to look at the
964:43 value of this column do not contact I
964:47 don't know if I copied this before let
964:48 me copy it we only want to look at the
964:50 value in this one column if we didn't it
964:53 would look at um a different value so we
964:56 don't want that so we're looking at just
964:58 that value if it's equal to Y so if this
965:03 value is equal to Y then we want to drop
965:05 it so we actually need to say
965:07 if so if this value x in this column is
965:12 equal to Y then we want to do df.drop
965:16 and then we'll say x and I think we
965:20 have to say inplace equals True here
965:22 otherwise it won't take effect um
965:25 otherwise you have to say like df is
965:27 equal to df yeah I don't want to
965:29 start messing with that let's just do
965:31 inplace equals True
965:34 um and let's see if that works I can't
965:37 remember if this is going to work or not
965:39 invalid syntax okay
965:42 need a colon and now let's try to run
965:45 this okay okay yeah if we look at our
965:48 index we can already tell that there are
965:50 ones missing the one is missing
965:52 the three is missing um let's see and
965:56 the 18 is missing so we already got rid
965:58 of those values and you can see
965:59 that there's no Ys in here anymore
966:01 which is really good
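The loop described above can be sketched like this, using made-up data and assumed column names (`Phone_Number`, `Do_Not_Contact`). `df.index` is read once when the loop starts, so dropping rows inside the loop does not disturb the iteration.

```python
import pandas as pd

# Made-up sample: rows 1 and 3 have asked not to be contacted.
df = pd.DataFrame(
    {
        "Phone_Number": ["123-545-5421", "876-678-3469", "", "304-762-2467"],
        "Do_Not_Contact": ["N", "Y", "", "Y"],
    }
)

# Walk the index labels, look up one cell with .loc, and drop matching rows.
for x in df.index:
    if df.loc[x, "Do_Not_Contact"] == "Y":
        df.drop(x, inplace=True)  # modifies df directly instead of returning a copy

print(df.index.tolist())
```

The dropped labels leave gaps in the index, which is how the missing row numbers show up in the output.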
966:04 We can, if we want to, and we probably should,
966:05 populate that really
966:08 quickly. Let me just go up here really
966:17 quick. I'll copy this. We probably should populate that, and I didn't plan on doing
966:20 this. So if it's blank, oops, it's blank,
966:24 give it an "N", and we want to attribute it
966:27 to Do Not
966:30 Contact.
966:38 Whoops. Let's see if that works, and we probably need to do
966:41 .str. Let's just see if it
966:44 works. So if it's
966:47 blank... dude, okay, I don't know why it's
966:49 giving us a triple
966:52 N. Maybe I need to strip
966:54 this or
966:56 something. Okay, never mind, let's not
967:00 do that.
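One plausible explanation for the triple N, offered as an assumption because the exact call is not visible in the transcript: a string-level replace of the empty pattern inserts the replacement before and after every character, so a one-character value comes back three characters long. The sketch below shows that behaviour with a plain Python string and a whole-value alternative; the Series contents are made up.

```python
import pandas as pd

# Plain Python: replacing the empty string matches before and after
# every character, so a single "N" picks up two extra copies.
tripled = "N".replace("", "N")

# Hypothetical Do_Not_Contact values, including a blank entry.
flags = pd.Series(["N", "", "Y"])

# Series.replace (without .str) swaps whole values that equal "",
# leaving the existing entries untouched.
filled = flags.replace("", "N")
```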
967:03 But now we basically need to do the exact same thing for this phone number,
967:05 because if it's blank we don't want
967:08 them calling it. So we can copy this
967:11 entire thing, go right down here, but
967:14 now we're looking at phone
967:17 number. So now we're looking just at the
967:19 values within phone number, and we only
967:21 want to look at if it's blank, so if it
967:23 literally has no value we want to get
967:26 rid of it. Let's run this and see if it
967:28 works. Again, it should. Good, and now our
967:32 list is getting much smaller, so you can
967:34 see in our index a lot of those rows
967:37 were removed. And okay, good, actually this
967:41 worked itself out, because these all have
967:42 N's. So right now we're sitting
967:45 really good. Everything looks really
967:48 standardized, cleaned. Everything looks
967:51 great. I might drop this address; if you
967:53 want to, you can drop this address. But
967:55 besides that, this is all looking really
967:57 good. This Paying Customer doesn't, the
967:59 yeses and nos aren't really anything.
968:02 Now we could, and we probably should,
968:05 before we hand this off to the client or
968:08 the customer call list, we probably
968:09 should reset this index, because they
968:11 might be confused as to why there are numbers
968:13 missing, or, you know, they might use this
968:15 index to show how many people they've
968:18 called, or I don't know, something like
968:19 that. So let's go right down here. We're
968:21 going to say df. and then we'll do
968:25 reset_index,
968:27 and let's just see what this looks
968:29 like. It does work, but as you can tell,
968:32 it didn't get rid of that index
968:35 completely; it actually took the index
968:36 and saved that original one. We do not
968:39 need to save that. Whoops, let's put it
968:41 right in here. Now we're just going to do
968:42 drop=True, and when we do that it
968:46 just completely resets it: it drops the
968:48 original index and gives us a new index,
968:51 and that is what we want. Let's do df
968:53 equals, and this is our final product.
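The difference between the two `reset_index` calls can be sketched like this. The frame below is made up, with gaps in its index to imitate the state after rows were dropped.

```python
import pandas as pd

# Made-up frame whose index has gaps, as it would after dropping rows.
df = pd.DataFrame({"First_Name": ["Ann", "Cara", "Eli"]}, index=[0, 2, 5])

# Default behaviour: the old index is kept as a new column named "index".
kept_old = df.reset_index()

# drop=True discards the old index and numbers the rows 0..n-1.
df = df.reset_index(drop=True)
```

Assigning the result back to `df` is what makes the change stick, since `reset_index` returns a new DataFrame by default.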
968:57 Now, one thing that you definitely could
968:59 have done here, and I made this a
969:01 little more complicated than it
969:03 needed to be, that was just how my
969:05 brain was working at the time when I was,
969:07 you know, typing this out: we could have
969:09 done df.dropna, which is
969:13 literally going to look at these null
969:15 values.
969:17 Before, we couldn't do that with this one
969:19 because we're not looking
969:20 at NA, we're looking at Y's, so we
969:22 couldn't do that. But because we're
969:24 looking at null values, we could have
969:25 also done
969:27 dropna, and done subset is equal to, and
969:31 then done it just on this phone number,
969:35 and then done it like this, and done
969:38 inplace=True. So we could have also
969:41 done this and then said df equals. I
969:45 mean, I can run it, it's just not
969:47 going to do anything. I can run it on a
969:50 different column, but that'll mess
969:51 everything up. But this is another way
969:53 you can do it, and I'll just save it in
969:55 case you want to. I'll say "another way
969:59 to drop null
970:01 values". There you go, and that'll just be
970:03 a note for us in the future.
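A minimal sketch of the `dropna` alternative mentioned above, assuming a column named `Phone_Number` and made-up rows. One caveat worth showing: `dropna` only removes true missing values (NaN or None), so blank strings have to be converted first if they should count as missing.

```python
import numpy as np
import pandas as pd

# Made-up rows: one true null and one empty string in Phone_Number.
df = pd.DataFrame({
    "First_Name": ["Ann", "Bob", "Cara", "Dev"],
    "Phone_Number": ["123-545-5421", np.nan, "", "304-762-2467"],
})

# dropna only treats NaN/None as missing; "" still counts as a value.
only_nulls_dropped = df.dropna(subset=["Phone_Number"])

# Converting blanks to NaN first lets dropna remove both kinds of row.
df["Phone_Number"] = df["Phone_Number"].replace("", np.nan)
df.dropna(subset=["Phone_Number"], inplace=True)
```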
970:06 But this is our final product. It looks a lot
970:10 different than when we first started. I
970:12 mean, we had mistakes here, completely
970:15 different formatting in the phone number,
970:16 different address, everything that we
970:18 just talked about, and this looks just
970:20 a lot better. And you can tell why
970:22 it's really important to do this process,
970:24 because again, we're working on a very
970:26 small data set. I purposely, you know,
970:29 created this data set with these
970:31 mistakes, because, you know, when you're
970:33 looking at data that has tens of
970:35 thousands, hundreds of thousands, a million rows,
970:38 these are all things that are going to
970:39 be applied at a much larger scale, and you
970:41 won't be able to see them as easily.
970:44 You'll have to do some exploratory data
970:46 analysis to find these mistakes, and then
970:49 you're going to need to clean the data,
970:50 or do it at the same time as you're
970:52 exploring the data, so you'll clean it
970:54 up as you go. But these are a lot of the
970:57 ways that I clean data, a lot of the
970:59 things that you can do to make your data
971:01 just a lot more standardized, a lot
971:03 more visually clean, and then it
971:05 really helps later on with
971:07 visualizations and your, you know, actual
971:10 data analysis. So I hope that that was
971:12 helpful. I know that this was a long
971:13 video, I'm sure it was, but I hope that
971:16 you got something out of this and you
971:17 learned some of the techniques on how to
971:19 actually clean data in pandas. If you
971:20 like this video, be sure to like and
971:22 subscribe, check out all my other videos
971:24 on pandas as well as Python, and I will
971:26 see you in the next
971:31 video. [Music]
971:39 [Music] Hello everybody, today we're going to be
971:41 looking at exploratory data analysis
971:43 using pandas. Exploratory data analysis,
971:46 or EDA for short, is basically just the
971:49 first look at your data. During this
971:51 process we'll look at identifying
971:52 patterns within the data, understanding
971:54 the relationships between the features,
971:56 and looking at outliers that may exist
971:58 within your data set. During this process
972:00 you are looking for patterns and all
972:01 these things, but you're also looking for
972:03 mistakes and missing values that you
972:05 need to clean up during your cleaning
972:07 process in the future. Now, there are
972:09 hundreds of ways to perform EDA on your
972:11 data set, but we can't possibly look at
972:13 every single thing, so I'm just going to
972:15 show you what I think are some of the
972:17 most popular and the best things that
972:19 you can do when you're first looking at
972:20 a data set.
972:22 The first thing that we're going to do is import our libraries, so
972:24 we'll do import pandas
972:27 as pd. We're also going to import seaborn
972:30 and matplotlib. Now, during this
972:32 exploratory data analysis process I
972:35 often like to visualize things as I go,
972:38 because sometimes you just can't fully
972:40 comprehend it unless you visualize
972:42 it, and it gives you a larger, broader
972:45 glimpse of everything. So we're going to
972:47 import, and let's do seaborn,
972:51 oops, as sns, and then we'll import
972:56 matplotlib.
972:59 pyplot as
973:01 plt.
973:03 Let's run
973:04 this. This should work. Okay, perfect. Now
973:08 we need to bring in our data set. So
973:10 we've worked with that world population
973:11 data set; that is the exact one that
973:13 we're going to use now. So we'll say
973:15 dataframe equals pd.read_
973:19 csv, do r, and we'll paste in our CSV, and
973:25 this is what it should look like,
973:26 although your path may be different. Be
973:27 sure that you have the
973:29 correct file path. Then we'll read it in.
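The loading step can be sketched as follows. To keep the example self-contained it reads from an in-memory string instead of a file, and the plotting imports are left out so it depends only on pandas. The column names and rows are made up for illustration.

```python
import io
import pandas as pd

# Tiny in-memory stand-in for the CSV; rows and values are made up.
csv_text = (
    "Country,Continent,2022 Population\n"
    "Exampleland,Asia,1000\n"
    "Samplestan,Europe,\n"
)

# With a real file, pass the path instead, for example
# pd.read_csv(r"<path to your file>.csv"); the r prefix keeps
# Windows backslashes from being read as escape sequences.
df = pd.read_csv(io.StringIO(csv_text))
```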
973:32 Now, this data set should look extremely
973:34 familiar if you've done some of my
973:36 previous pandas tutorials, but I did make
973:39 some alterations to this one: took out a
973:41 little bit of data, put in a little bit
973:42 of data here and there, to change
973:45 things up. Because if it was just exactly
973:47 how I pulled it (I got this data
973:49 set from Kaggle), if it was exactly how we
973:51 pulled it, like we've looked at in the
973:53 previous videos, it's too simple. You know,
973:55 we wouldn't actually be able to do some
973:56 of the things that I would like to show
973:58 you. So be sure to actually download this
974:00 exact data set for this video, because it
974:03 is a little bit
974:04 different.
974:07 But what we're going to do now is just try to get some high-level
974:09 information from this. Now, if yours looks
974:11 just a little bit different, like your
974:12 values are in scientific notation, I
974:16 have applied this so many times I think
974:17 it's, you know, still applied to this.
974:20 You can do something, and we'll write it
974:22 right down here. We're going to do pd.set_
974:26 option, and we'll do an open parenthesis,
974:29 and we'll say
974:31 display.float_format, and so
974:34 we're going to change that float format
974:36 by just saying lambda x, colon, and then
974:40 we're going to change basically how many
974:42 decimal places we're looking at. So
974:45 let's just do here, so we'll do a quote,
974:49 percent sign, 2f, so we're formatting it,
974:52 whoops, 0.2f, so we're going to format it,
974:55 and we'll do percent x. This is going to
974:58 format it appropriately. I can run it,
975:00 and actually it will change it. This
975:02 is at 0.1, I believe, last time I did it,
975:04 so let's run this and then let's run
975:06 this again. It'll change it to .2, so
975:09 that's two. I like it at 0.1; we don't
975:12 really need it any... well, let's keep it at
975:14 .2, why not? We're going to keep it
975:16 at .2. That's how you change that,
975:18 and I like looking at it like this a lot
975:20 better than scientific notation, so just
975:23 something to point out.
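The display option described above can be sketched as follows. The name `two_decimals` and the sample values are assumptions for illustration; the option name and the `"%.2f" % x` format string follow what is dictated in the walkthrough.

```python
import pandas as pd

# Formatter that renders every float with two decimal places.
two_decimals = lambda x: "%.2f" % x

# Register it as the display format for floats in printed output.
pd.set_option("display.float_format", two_decimals)

# Made-up large value of the kind that may otherwise be shown
# in scientific notation.
sample = pd.DataFrame({"2022 Population": [1.425887337e9, 510.0]})
rendered = sample.to_string()

# Undo the change with: pd.reset_option("display.float_format")
```

This only changes how numbers are displayed; the stored values keep their full precision.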
975:25 Let's go down here and let's just pull up the data frame,
975:28 so we have this data. One of the first
975:31 things that I like to do when I get a
975:32 data set is to just look at the info, so
975:35 we're going to do .info(), and this gives
975:37 us just some really high-level
975:39 information. This is how many columns we
975:41 have, here are the column names, here are
975:44 how many values we have, and if you
975:47 notice, this is where it kind of gets... so
975:49 we have 234 in each of these. So in each
975:52 of these columns we have 234 until we
975:55 get to this 2022 population. Once we get
975:58 there we start losing some values, and
976:02 then at the world population percentage
976:04 we have all of our values, all 234 of
976:07 them. The count tells us that it's non-null,
976:09 so it does have values in it. And
976:10 then we also have the data types, and
976:12 these come in handy later, and these
976:15 are really great to know, and we'll be
976:17 able to kind of use those in a few
976:18 different ways later on in this tutorial.
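A small sketch of the `info()` step on a made-up frame. Because `info()` prints its report instead of returning it, the example captures the text in a buffer and also computes the non-null counts directly.

```python
import io
import numpy as np
import pandas as pd

# Made-up frame with one missing population value.
df = pd.DataFrame({
    "Country": ["Exampleland", "Samplestan", "Testonia"],
    "2022 Population": [1000.0, np.nan, 510.0],
})

# info() prints its report; passing buf captures the text instead.
buffer = io.StringIO()
df.info(buf=buffer)
report = buffer.getvalue()

# Non-null counts per column, the same numbers info() reports.
non_null = df.count()
```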
976:21 Really quickly, I wanted to give a huge
976:22 shout out to the sponsor of this entire
976:24 pandas series, and that is Udemy. Udemy has
976:26 some of the best courses at the best
976:28 prices, and it is no exception when it
976:30 comes to pandas courses. If you want to
976:31 master pandas, this is the course
976:33 that I would recommend. It's going to
976:34 teach you just about everything you need
976:36 to know about pandas. So huge shout out
976:37 to Udemy for sponsoring this pandas
976:39 series, and let's get back to the video.
976:41 The next thing that I really like to do,
976:43 and this one is df.
976:45 describe. This allows you to get really a
976:48 high-level overview of all of your
976:50 columns very quickly. You can get the
976:52 count, the mean, the standard deviation,
976:56 the minimum value and the maximum value,
976:59 as well as your 25th, 50th, and 75th
977:02 percentiles of your values. So just at a
977:05 super quick glance, there is a row
977:07 somewhere in here, and this country,
977:10 their population is 510 for 2022, and in
977:14 fact if you go back to 1970 it was
977:16 higher, it was at
977:17 752. That's just interesting. Then if we
977:20 look at the max population, one has
977:23 1.42 billion, I believe that's China, and
977:26 then over here in 1970 we have 822
977:29 million. Again, I still believe that's
977:31 China. But this gives you just a really
977:33 nice high-level view of all of these values,
977:36 or all these different calculations that
977:38 you can run on it. And we can run all
977:40 these individually on even specific
977:42 columns, but, you know, it's just a nice
977:45 high-level overview.
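A minimal sketch of `describe()` on a made-up frame, along with one of the statistics computed individually on a single column as mentioned above. The country names and numbers are illustrative only.

```python
import pandas as pd

# Made-up rows; only the numeric columns are summarised.
df = pd.DataFrame({
    "Country": ["Exampleland", "Samplestan", "Testonia"],
    "2022 Population": [1000, 250000, 510],
    "1970 Population": [800, 100000, 752],
})

# describe() summarises numeric columns only by default.
summary = df.describe()

# Individual statistics can also be computed per column.
smallest_2022 = df["2022 Population"].min()
```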
977:47 One thing that we just talked about was the null values
977:48 that we're seeing in here. I'd like to
977:51 see how many values we're actually
977:53 missing, because that is a problem. We
977:55 don't want to have too many missing
977:57 values, or it could really obscure or change
978:00 the data set entirely, and so we don't
978:02 want that. So we'll say df.isnull and
978:06 then we'll do a parenthesis, and we'll
978:07 say .sum, and when we do this,
978:11 whoops, .sum, there we go, when we do
978:15 this it's going to give us all the
978:17 columns and how many values we're
978:19 actually missing. Now we have
978:20 234 rows of data, so we have 41 477 55424
978:27 so we definitely have data
978:30 missing.
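The missing-value count can be sketched like this on a made-up frame; the column names imitate the data set described above but the values are invented.

```python
import numpy as np
import pandas as pd

# Made-up frame with a different number of gaps in each column.
df = pd.DataFrame({
    "Country": ["Exampleland", "Samplestan", "Testonia"],
    "2022 Population": [1000.0, np.nan, 510.0],
    "1970 Population": [np.nan, np.nan, 752.0],
})

# isnull() marks each missing cell True; sum() counts them per column.
missing = df.isnull().sum()
```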
978:33 What we choose to do with it in the data cleaning process, maybe we want
978:35 to populate it with a median value, maybe
978:37 we just want to delete those countries
978:39 entirely if the data is missing. You
978:41 know, I don't think you're going to do
978:43 that, but these are things that you need
978:45 to think about when you're actually
978:47 finding these missing values. This is
978:49 what the EDA process is all about: we
978:51 want to find either
978:54 outliers, missing values, things that are
978:56 wrong with the data, or we can find
978:59 insights into it while we're doing this
979:00 as well. So this is definitely
979:02 something that I would consider when
979:04 I'm actually going through that data
979:05 cleaning process. Really, really important
979:07 information to know.
979:09 Now let's go right down here, go to our next cell, say df.
979:13 unique, and this is going to show us how
979:15 many unique values, and it's actually
979:18 nunique. This is going to show us how
979:20 many unique values are actually in each
979:24 of these columns. And this one makes
979:27 the most sense for continents, because
979:30 I think there's only seven continents,
979:31 right? But we have six right
979:34 here. And for all of these, each of these
979:36 ranks, countries, capitals should all be
979:39 unique. That makes perfect sense. As well
979:41 as these, you know, these populations are
979:43 such specific numbers and such large
979:45 numbers, I would be shocked if any of
979:47 these were similar. And then for these
979:49 world population percentages it's much
979:52 lower, and again that makes a lot of
979:54 sense, because when we're looking at, and
979:55 we'll pull it up right here, when we're
979:57 looking at these world population
980:00 percentages, a lot of them are really
980:02 low: 0.00, 0.01 like this one, 0.2. There
980:08 are a lot of really low values for those
980:10 small countries, and so those are all,
980:12 you know, one unique value.
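A short sketch of `nunique()` on invented rows, showing why a column of rounded percentages ends up with fewer distinct values than a column of names.

```python
import pandas as pd

# Made-up rows: two countries share the rounded percentage 0.0.
df = pd.DataFrame({
    "Country": ["Exampleland", "Samplestan", "Testonia", "Demoland"],
    "Continent": ["Asia", "Europe", "Europe", "Asia"],
    "World Population Percentage": [0.01, 0.0, 0.0, 0.2],
})

# nunique() counts the distinct values in each column.
unique_counts = df.nunique()
```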
980:14 Now let's say we just have this data right here and we
980:17 want to take a look at some of the
980:18 largest countries, and we can easily do
980:21 that. We could even say max and
980:23 take a look at the largest country, but I
980:25 want to be a little bit more strategic. I
980:27 want to be able to look at some of the
980:28 top range of countries, and we can do
980:30 that based off this
980:32 2022 population. So we'll say df.
980:36 sort_values. This is how we sort,
980:39 not filter, but order our data. So
980:42 we'll do sort_values and then we'll do
980:45 by is equal to, and then we'll specify that
980:47 we want this 2022 population, and then
980:51 we're going to say comma, and we'll say...
980:53 actually, let's just run this as is,
980:55 but we'll do head because we just want
980:57 to look at the top values. So now we're
981:00 just looking at the very top values. So
981:02 what we're looking at is actually this
981:04 2022 population; that's what we're
981:07 filtering on, or sorting on, basically, and
981:09 we're looking at the very bottom values,
981:12 because it's sorting ascending, so from
981:14 lowest to highest. So this Vatican City
981:17 in Europe is, you know, 510; that's the
981:20 value that we were looking at earlier.
981:22 Now we can do comma, ascending equal to
981:25 False, because it was by default True. We
981:28 can do False, whoops, we can do False, and
981:30 then it'll give us the very largest ones.
981:33 So if we just take a look at the top
981:35 five largest by population, we're looking
981:37 at China, India, United States, Indonesia,
981:40 and Pakistan. And we can even specify
981:43 that we want the top 10 in this head. We
981:46 can bring in the top 10: we also have
981:48 Nigeria, Brazil, Bangladesh, Russia, and
981:51 Mexico.
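The sorting steps above can be sketched as follows, using invented countries and populations. The first call shows the default ascending order, the second the descending order used to find the largest rows.

```python
import pandas as pd

# Made-up rows standing in for the world population data.
df = pd.DataFrame({
    "Country": ["Exampleland", "Samplestan", "Testonia", "Demoland"],
    "2022 Population": [1000, 250000, 510, 42000],
})

# Default sort is ascending, so head() returns the smallest rows.
smallest = df.sort_values(by="2022 Population").head(2)

# ascending=False puts the largest rows first; head(n) sets how many.
largest = df.sort_values(by="2022 Population", ascending=False).head(2)
```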
981:53 Mexico and you can do this for literally any of these columns whether you want to
981:55 any of these columns whether you want to look at continent capital country um you
981:58 look at continent capital country um you can sort on these and look at them and
982:00 can sort on these and look at them and you can even look at you know things
982:01 you can even look at you know things like growth rate world percentage this
982:03 like growth rate world percentage this one seems really interesting let's just
982:06 one seems really interesting let's just look at this one really quickly before
982:07 look at this one really quickly before we move on to the next thing um if we
982:10 we move on to the next thing um if we look at this world percentage just China
982:13 look at this world percentage just China alone I believe yeah just China alone is
982:16 alone I believe yeah just China alone is 17.88% of the world so
982:33 17.88% world population percentage again just getting in here looking around
982:35 just getting in here looking around that's all we're really doing now I want
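As a rough sketch of the descending sort and the top-N lookup described above: the DataFrame below is a made-up stand-in for the video's data (the country names and numbers are placeholders, and the column name `2022 Population` is an assumption about how the real column is labeled).

```python
import pandas as pd

# Toy stand-in for the video's DataFrame; the values are invented.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "2022 Population": [510, 1_400_000, 83_000, 9_500],
})

# ascending defaults to True, so pass False to get the largest values first.
largest_first = df.sort_values(by="2022 Population", ascending=False)

# head() with no argument keeps 5 rows; head(n) keeps the first n.
top_two = largest_first.head(2)
print(top_two)
```

The same pattern should work for any sortable column, for example sorting on a growth rate or world percentage column instead.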
982:37 now I want to look at something and I have always
982:39 liked doing this which is looking at
982:41 correlations so correlation between
982:43 usually only numeric values we can do
982:46 that by saying df.corr()
982:48 with parentheses and we'll run
982:51 this and what this is is it is comparing
982:55 every column to every other column and
982:58 looking at how closely correlated they
983:00 are so this 2022 population if we look
983:03 across the board it's very highly I mean
983:06 this is a one-to-one this is highly
983:08 correlated to each other and that almost
983:11 for all of these populations they're
983:13 very very closely tied to each other
983:14 which makes perfect sense because for
983:17 most countries they're going to be
983:18 steadily increasing and so they're
983:20 probably almost exactly correlated but
983:24 we can look at these populations and if
983:26 you look at the area it's only somewhat
983:29 correlated and that's because in some
983:31 countries you know they have a very high
983:33 population but a small area or vice
983:35 versa a large area and a small
983:37 population so there isn't a one-to-one
983:39 correlation there
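A minimal sketch of the correlation matrix, using invented numbers and assumed column names. One caveat that is not from the video: recent pandas versions raise an error if `corr()` meets text columns, so this sketch selects the numeric columns first with `select_dtypes`, which is a swapped-in precaution rather than what is typed on screen.

```python
import pandas as pd

# Toy data; the values are invented for illustration only.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "2020 Population": [100, 200, 300, 400],
    "2022 Population": [110, 220, 330, 440],
    "Area": [50, 10, 40, 20],
})

# Keep only numeric columns, then compare every column with every other one.
corr = df.select_dtypes("number").corr()
print(corr.round(2))
```

Each cell is a Pearson coefficient between -1 and 1; the diagonal is always 1 because every column is perfectly correlated with itself.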
983:41 but it's hard to really just glance at this and
983:43 understand everything that's there we
983:45 could just visualize it and it would be
983:47 a lot easier so let's go ahead and do
983:51 that let's go down here we're just going
983:52 to visualize this using a heat map
983:55 basically so we're going to say
983:58 sns.heatmap and an open parenthesis and the
984:02 data that we're going to be looking at
984:03 is df.corr() the correlation and then we
984:07 also want to say annot=True I'll
984:10 kind of show you what that looks like in
984:12 just a little bit but let's do
984:16 plt.show() and this will be our first look and
984:19 I need to say show not shot we can
984:24 get a little glimpse of what it looks
984:26 like but this looks absolutely
984:27 terrible let's change the figure size
984:30 really quick so I want to make this much
984:32 larger than it already is we'll do
984:35 plt.rcParams oops right there
984:42 do an open parenthesis and then right
984:44 here we're going to do in quotes
984:46 figure.figsize this actually needs to
984:50 be in brackets I
984:52 believe just like this not parentheses
984:55 we'll say figure.figsize is equal to and now
984:58 we can specify the value that we want
985:00 let's do 10 comma seven and see if this
985:02 looks any
985:04 better no no that doesn't look good do
985:09 20 okay that looks a lot better
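The heat map steps above might look roughly like this. The data is invented, the column names are assumptions, and the `matplotlib.use("Agg")` line is only there so the sketch runs without a display; in a notebook it can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# rcParams behaves like a dictionary, hence square brackets, not parentheses.
plt.rcParams["figure.figsize"] = (20, 7)

# Toy data; the values are invented for illustration only.
df = pd.DataFrame({
    "2020 Population": [100, 200, 300, 400],
    "2022 Population": [110, 220, 330, 440],
    "Area": [50, 10, 40, 20],
})
corr = df.select_dtypes("number").corr()

# annot=True writes each coefficient inside its cell.
ax = sns.heatmap(corr, annot=True)
plt.show()
```

Setting `figure.figsize` through `rcParams` changes the default for every later figure in the session, which is why it only has to be done once.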
985:13 and you know this is just a quick way
985:15 because it gives you basically a
985:16 color-coded system highly correlated is
985:19 this tan all the way down to basically
985:21 no correlation or negative correlation
985:23 even which is black so when we're
985:25 looking at these 2022 populations and
985:28 these are populations right down here on
985:30 this axis we can see that all of these
985:32 are extremely highly correlated very
985:36 very quickly whereas the rank really has
985:38 nothing to do with it it's negatively
985:41 correlated doesn't really have anything
985:42 to do with it then for the population
985:45 and the world population percentage it
985:47 again is quite correlated except for the
985:51 area density and growth rate so I find
985:55 that really interesting that you know
985:56 the density the growth rate and the area
985:58 aren't really all that associated or
986:03 correlated with the population numbers
986:05 that is I kind of would have assumed that on
986:08 some level they went hand in hand the area
986:11 would you know again make sense
986:13 you know larger area larger population
986:15 that kind of thing but even density I
986:18 guess density and growth rate
986:21 growth rate I can see because that's a
986:22 percentage thing that could be
986:24 definitely not correlated I thought the
986:26 density would be more correlated than it
986:28 is all that to say this is one way
986:30 that you can kind of look at your data
986:32 see how correlated it is to one another
986:34 that can definitely help you know
986:36 what to analyze and look at later when
986:38 you're actually doing your data analysis
986:40 let's go right down here something
986:43 that I do almost all the time when I'm
986:45 doing any type of exploratory data
986:47 analysis like this I'm going to group
986:49 together columns start looking at the
986:51 data a little bit closer so let's go
986:54 ahead and group on the continent so
986:57 let's look at it right here let's group
986:59 on this continent because sometimes
987:01 when you're doing this EDA you already
987:02 know kind of what the end goal of this
987:04 data set is you know kind of what you're
987:06 looking for what you're going to
987:07 visualize at the end and that really
987:09 comes in handy when doing this but
987:11 sometimes you don't sometimes you're just going
987:13 in blind and so far we've really just
987:15 been going in blind we're just throwing
987:17 things at the wind kind of seeing some
987:18 overviews looking at correlation
987:21 that's all we've done now I kind of want
987:23 to get more specific I want to have like
987:25 a use case something that I'm kind of
987:27 looking for not doing full data analysis
987:30 not diving into the depths but something
987:32 we can kind of aim for so the use case
987:34 or the question for us is are there
987:36 certain continents that have grown
987:38 faster than others and in which ways so
987:41 we want to focus on these continents we
987:43 know that that's the most important
987:44 column for this use case this very fake
987:47 use case so we can group on this
987:49 continent and we can look at these
987:51 populations right here because we can't
987:53 really see growth you can see a growth
987:56 rate but the density per kilometer we
988:00 don't have multiple values for that it's
988:02 just a static one single value same for
988:04 growth rate same for world population
988:06 percentage but we have this over a long
988:09 span many many years you know 50
988:12 years of data here so this we can see
988:15 which countries have really done well or
988:18 which continents have really done well
988:19 so without you know talking about it
988:21 even more let's do df.groupby and then
988:25 we'll say continent oops let me just
988:29 copy this I'm not good at
988:31 spelling we're going to say df.groupby
988:33 and then we'll do
988:35 .mean() and we can just do it just like
988:37 this and now we have Africa Asia Europe
988:41 North America Oceania and South
988:45 America
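A small sketch of grouping on the continent and averaging, with invented rows and assumed column names. The `numeric_only=True` argument is an addition here, not something said in the video; without it, newer pandas versions complain about text columns such as the country name.

```python
import pandas as pd

# Toy data; countries and values are invented.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Continent": ["Asia", "Asia", "Oceania", "Oceania"],
    "2022 Population": [100, 300, 10, 30],
})

# One row per continent, holding the mean of every numeric column.
continent_means = df.groupby("Continent").mean(numeric_only=True)
print(continent_means)
```

The grouping column becomes the index of the result, which matters later when the table is transposed and plotted.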
988:48 okay so if I'm being completely honest I knew most of these all right
988:51 I'm no geography expert but I
988:53 knew most of these I don't know what
988:54 this Oceania is I
988:58 genuinely don't know what that is
989:01 so let's just search for that value and
989:04 see we'll come back up here in just a
989:06 second but I want to kind of
989:08 understand what this is so we're
989:10 going to do df and we'll say
989:14 continent let me sound that out for you
989:16 guys then we'll do .str.contains
989:21 oops contains good night and then I want
989:26 to look for
989:28 Oceania and let's run this oh I
989:31 need to do it like
989:38 this now let's run this so now we're looking at our data frame we're seeing
989:40 when the values have this continent as
989:44 Oceania okay so these look like
989:47 islands I'm guessing so we have Fiji
989:50 Guam New
989:53 Zealand Papua New Guinea yeah these look
989:56 like all I'm guessing based off the
989:59 continent Oceana
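The search just described is a boolean filter wrapped in square brackets. Here is a sketch with placeholder rows; the country names in it come from the transcript, but the frame itself is a toy and the column names are assumptions.

```python
import pandas as pd

# Toy data for illustration only.
df = pd.DataFrame({
    "Country": ["Fiji", "Guam", "Japan", "New Zealand"],
    "Continent": ["Oceania", "Oceania", "Asia", "Oceania"],
})

# str.contains returns True/False per row ...
mask = df["Continent"].str.contains("Oceania")

# ... and indexing df with that mask keeps only the True rows.
oceania = df[mask]
print(oceania)
```

Running `str.contains` on its own only prints the True/False column, which is probably the "I need to do it like this" moment: the mask has to go back inside `df[...]` to get the rows.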
990:01 Oceania Ocea Oceania guys this is
990:06 tough for me okay I'm doing my best
990:09 you know this is part of the EDA process
990:11 I don't know what that means I don't
990:12 know what ocean ocean Oceania geez
990:17 I'm just going to call it Oceana that's
990:18 so wrong but I'm just gonna it's so easy
990:20 for me to say you know I now am seeing
990:23 this and it looks like
990:25 islands which would make sense
990:29 because for their average they have the
990:31 highest average rank and I'm guessing
990:35 that's because they're just mostly small
990:37 countries so let's order this
990:39 really quickly we're going to do
990:42 .sort_values do an open parenthesis
990:46 and I want to sort on the population
990:48 we're just doing the average population
990:51 we'll do by equals so on the
990:54 average population and we'll do
990:57 ascending=False so we're looking
991:01 at this average or the mean population
991:04 Asia has the highest population on
991:06 average and we have South America Africa
991:09 Europe North America and then Oceana at
991:14 the very bottom which makes perfect
991:15 sense again small islands
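Chaining the sort onto the grouped means could look like the sketch below. The rows are invented and the column name `2022 Population` is an assumption about the real dataset.

```python
import pandas as pd

# Toy data; values are invented.
df = pd.DataFrame({
    "Continent": ["Asia", "Asia", "Europe", "Europe", "Oceania", "Oceania"],
    "2022 Population": [100, 300, 50, 70, 10, 30],
})

# Average per continent, then order the continents from largest to smallest mean.
ranked = (
    df.groupby("Continent")
      .mean(numeric_only=True)
      .sort_values(by="2022 Population", ascending=False)
)
print(ranked)
```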
991:19 world population percentage so each of the
991:22 countries each of those countries in
991:23 Asia makes up about 1% on average really
991:27 interesting to know and just kind of
991:29 look at this and the density in Asia
991:34 is far higher than double almost double
991:37 every single other continent really
991:41 really interesting actually now that I'm
991:42 looking at this but you know that's
991:44 something that I would actually look
991:45 into and I would be like what is this
991:47 Oceana or Oceania what does that mean
991:51 and you know let me look into that let
991:52 me explore that more because I want to
991:54 know this data set I'm trying to really
991:56 understand this data set well but what I
991:58 want to do now is I want to visualize
991:59 this
992:01 because I just feel like looking at it
992:03 it's hard to visualize and again
992:06 the use case that we're saying is
992:07 which continent has grown the fastest
992:10 like it could be percentage wise it
992:11 could be you know as just a whole on
992:14 average let's take a look so we're going
992:17 to take this and let's copy it like this
992:20 let's bring this right down here so
992:23 let's look at this so if I try to
992:27 visualize this and let's do that let's
992:29 do df2 is equal to because I already
992:33 know it's not going to look good just
992:34 based off how the data is sitting we
992:38 do df2 oops what am I doing I don't need
992:42 to do that but I will okay df2 and we'll
992:44 do df2.plot()
992:47 we'll run it just like this
992:51 as you can see Asia South America Africa
992:54 Europe North America Oceana we can kind
992:58 of understand what's happening but these
993:01 are the actual values that are being
993:03 visualized not the continents which is
993:06 what I wanted in order to switch it
993:09 and it's actually pretty easy and this
993:10 is something that you know is good to
993:13 know we can actually transpose it to
993:15 where these continents become the
993:17 columns and the columns become the index
993:20 and all we have to do is say
993:25 df2.transpose() with the parentheses
993:27 right here and let's just look at it and
993:30 then we'll save it so now all these
993:34 columns are right
993:35 here and all of the indexes are the
993:38 columns
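To make the transpose concrete, here is a sketch with a tiny made-up table shaped like the grouped result: continents as the index and population columns across the top (names and values are placeholders).

```python
import pandas as pd

# Toy grouped table: one row per continent, one column per year.
df2 = pd.DataFrame(
    {"1970 Population": [50, 5], "2022 Population": [200, 20]},
    index=pd.Index(["Asia", "Oceania"], name="Continent"),
)

# transpose() swaps rows and columns: continents become the columns
# and the former column names become the index.
df3 = df2.transpose()
print(df3)
```

This matters for plotting because `DataFrame.plot()` draws one line per column and uses the index for the x-axis, so after the transpose each continent gets its own line.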
993:42 so let's say df3 is equal to and I'm just doing that so I don't you know
993:43 write over the df or my earlier data
993:45 frames so now we have this data frame
993:47 three so now let's do df3.plot()
993:51 and it should look quite a bit
993:54 different whoops I didn't run this
993:57 let's run this and run this
994:01 and as you can see this does not look
994:03 right at all and the reason is
994:06 because we're not only looking at the
994:08 correct columns we have this density in
994:10 here world population percentage rank we
994:12 don't need any of those the only ones
994:15 that we want to keep are these ones
994:17 right here this
994:18 population now we can do that and we can
994:21 just go right up here this is where we
994:22 created that data frame two that we
994:24 transposed we can go right up here and
994:26 we can specify within this we actually
994:29 only want specific values now
994:31 we can go through and handwrite all of
994:34 these and by all means go for it but I
994:38 am going to go down here I'm going to
994:39 say df.columns and I'm going to run
994:42 this it's going to give us this list of
994:45 all of our columns and I'm just going to
994:47 you can just copy
994:50 this and you can put it right in here
994:52 think I need a list with I think it
994:54 needs to be like this let me try
994:56 running this okay so this worked
994:58 properly you can do it just like this or
995:00 a little shortcut if you want to do it
995:02 like that if you want to do a shortcut
995:04 like I would hope you would you
995:07 would just do df.columns just like
995:10 how we looked at down here except since
995:13 this is an index we can search
995:15 through it so we can just say 0 one two
995:18 okay so we can do five up to 13 because
995:22 I think it's seven and we'll just let's
995:24 see if this
995:26 works it may not I may actually need
995:28 to go like this let's see
995:31 there we go so you can just use you know
995:34 the indexing to save you some visual
995:36 space gives you the exact same output
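The column-slicing shortcut might be sketched as follows. The toy frame has fewer columns than the real one, so the slice here is `[2:5]` rather than the five up to 13 mentioned above; wrapping the slice in `list(...)` is a precaution of this sketch, not necessarily what is typed in the video.

```python
import pandas as pd

# Toy data; column layout and values are invented.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Continent": ["Asia", "Asia", "Oceania", "Oceania"],
    "2022 Population": [100, 300, 10, 30],
    "2000 Population": [80, 240, 8, 24],
    "1970 Population": [40, 120, 4, 12],
    "Area": [50, 10, 40, 20],
})

# df.columns is an Index, so it can be sliced by position like a list.
population_cols = list(df.columns[2:5])

# Selecting those columns after the groupby keeps only the populations.
df2 = df.groupby("Continent")[population_cols].mean()
print(df2)
```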
995:38 so now we have this this is our df2 now
995:41 let's go down and transpose it so now we
995:44 just have these populations and we have
995:45 our continents right here and then now
995:48 we're going to plot it and this looks
995:51 good although it's
995:54 backward okay it's
995:56 backward so what I actually want to do
996:01 is not this that is a quick way to do
996:04 it although not the best way to do it
996:08 so I'm actually going to copy all of
996:09 these and although I said it would save
996:11 us time it did not at all so I'm going
996:15 to put a bracket right
996:18 here I'm going to paste this in here and
996:20 I'm literally going to change these up I
996:23 might speed this up or I might just have
996:26 you sit through this because you know
996:28 this is an interesting part of the
996:30 process and I want you know you to get
996:32 the full experience you know what now
996:34 that I'm talking about it that is what
996:35 we're going to do you guys can hang out
996:37 with me this is a good time we have
996:40 2010
996:42 2015
996:44 2020 and 2022 now let's run it what did
996:49 I do oh too many brackets there we go so
996:52 now it's ordered appropriately we have
996:54 1970 all the way up to 2022 this is how
996:57 we want it let's transpose it
996:59 appropriately
997:01 let's run it and now we basically have
997:03 the inverted image of this
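In the video the column list is retyped by hand in chronological order. As an alternative sketch (not the method shown on screen), reversing the column Index with `[::-1]` gives the same ordering when the columns simply run newest to oldest. The data is invented, and the `Agg` line only exists so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import pandas as pd

# Toy grouped table with columns running newest to oldest.
df2 = pd.DataFrame(
    {
        "2022 Population": [200, 20],
        "2000 Population": [160, 16],
        "1970 Population": [80, 8],
    },
    index=pd.Index(["Asia", "Oceania"], name="Continent"),
)

# Reverse the column order so time runs left to right (oldest first).
ordered = df2[df2.columns[::-1]]

# Transpose so each continent is a column, then draw one line per continent.
df3 = ordered.transpose()
ax = df3.plot()
```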
997:06 now just at a glance and we haven't done anything
997:08 to this except for literally what we are
997:10 looking at at a glance we can see that
997:13 from
997:14 1970 China here you know Asia and China
997:17 are already in the lead by quite a bit
997:20 and it continues to drastically go up
997:23 especially in the 2000s like right here
997:26 it explodes like just straight up then
997:30 kind of starts going up and just
997:31 leveling off every other continent
997:34 especially Oceana is just really low
997:38 it never has done a bunch let's see
997:39 look at green green has gone up from
997:41 you know let's say
997:44 0.1 up to about 0.2 so they've almost
997:47 doubled in the last 50 years and
997:51 again you can just get an overview a
997:53 high level overview of each of these you
997:56 know continents over the span of this
997:59 time so this is kind of one way that we
998:01 can you know look at that use case we're
998:04 not going to harp on that too long I
998:05 just want to give you an example like
998:07 you know when you're looking at this
998:10 sometimes you'll have something in mind
998:11 of what you're looking for and you go
998:12 exploring and just kind of find what's
998:15 out there and find what you see
998:17 the next thing I want to look at is a box
998:19 plot now I personally I love box plots
998:22 you know they're really good for finding
998:25 outliers and there's a lot of outliers I
998:28 already know this because the
998:30 average the 25th and 50th percentiles are very
998:32 low and then there's some really just
998:34 big outliers but for your data set it
998:37 may not be that way and those outliers
998:39 may be something that you really need to
998:41 look into and box plots have been
998:43 something that I've used a lot where I
998:44 found those outliers that way and
998:46 started to dig into the data to find
998:48 those outliers and you know came across
998:50 some stuff that I'm like oh I have to
998:51 clean this up I have to go back to the
998:52 source really really powerful
998:55 and useful to be able to find these so
998:58 all you have to do is df.boxplot() and
999:01 let's take a look at it and this already
999:03 looks good as is maybe I'll make it a
999:05 little bit wider let's do figsize
999:10 oops sorry figsize is equal to let's
999:14 try
999:15 20 by
999:18 10 okay that didn't help at all I
999:21 apologize thought it would
999:23 apologize thought I would but let's keep going what this is showing us is that
999:26 going what this is showing us is that these little boxes down here which are
999:28 these little boxes down here which are actually usually much much larger
999:30 actually usually much much larger because you have a more equal
999:31 because you have a more equal distribution of of um numbers or values
999:35 distribution of of um numbers or values in the small value this is where our
999:37 in the small value this is where our averages lie this number right here is
999:41 averages lie this number right here is the upper range and then all these
999:42 the upper range and then all these values all these Open Circles those
999:45 values all these Open Circles those actually stand for
999:46 actually stand for outliers so we're looking at the 2022
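A minimal sketch of the box plot call described above. The data frame here is a made-up stand-in (column names and values are illustrative, not the video's data set), and the `Agg` backend is only chosen so the sketch runs without a display; in a notebook the plot would simply render inline.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical stand-in for the population data frame used in the video.
df = pd.DataFrame({
    "2022 Population": [1_400_000, 5_000, 12_000, 8_000, 3_000, 900_000, 7_500, 4_200],
    "2020 Population": [1_390_000, 4_900, 11_500, 7_900, 2_950, 880_000, 7_300, 4_100],
})

# One box per numeric column; figsize is (width, height) in inches.
ax = df.boxplot(figsize=(20, 10))

# With matplotlib's defaults, each box spans the 25th to 75th percentile,
# the line inside it is the median, and points beyond the whiskers
# (the open circles) are the outliers.
```

When most values are tiny compared with a few huge ones, the boxes collapse toward the bottom of the axis, which matches what the video shows.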
999:49 so we're looking at the 2022 population there's a lot of outliers now
999:52 for our data set knowing our data set is
999:54 really important outliers are to be
999:56 expected especially when most countries
999:59 or continents are small so we're
1000:01 looking at you know all of these little
1000:03 dots are outlier countries um or outlier
1000:07 values which each value corresponds to a
1000:09 country so if this was a different data
1000:12 set I would be you know searching on
1000:14 these and trying to find these so that I
1000:16 can see what's wrong with them if
1000:18 anything or if they are real um numbers
1000:20 like if this was revenue everyone's
1000:22 revenue is way down here and then
1000:23 there's one company that's making like
1000:24 10 trillion dollars that'd be an outlier
1000:27 up here and it would definitely be
1000:28 something that you want to look into
1000:30 for our data set knowing that you know
1000:32 we're looking at population this is more
1000:34 than acceptable you know oddly enough
1000:37 but that's what box plots are really
1000:39 good for showing you some of those
1000:41 quartiles the upper and the lower um as well
1000:43 as denoting these points that fall
1000:44 outside of those normal ranges for you
1000:46 to look into so really really useful
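To actually look up the rows behind those dots, one common approach is the 1.5 × IQR rule, which is also what matplotlib's default whiskers are based on. This is an assumption on my part, since the video only eyeballs the plot, and the data below is made up for illustration.

```python
import pandas as pd

# Hypothetical data: most values are small, two are very large.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "2022 Population": [5_000, 12_000, 8_000, 3_000, 7_500, 4_200, 900_000, 1_400_000],
})

col = df["2022 Population"]
q1, q3 = col.quantile(0.25), col.quantile(0.75)  # lower and upper quartiles
iqr = q3 - q1                                    # interquartile range

# Anything outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] is flagged as an outlier.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df[(col < lower) | (col > upper)]
print(outliers)
```

Whether a flagged row is a data error or a real value (a very large country, for example) is still a judgment call that depends on knowing the data set.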
1000:49 so now let's go down here pull up our data
1000:51 frame again and we've kind of just
1000:54 zoomed into the whole EDA process there
1000:56 was one last thing that I wanted to show
1000:58 you and this is the very last thing that
1001:00 we're going to look at we're ending on
1001:01 really a low point if I'm being honest
1001:03 because the last kind of stuff was
1001:04 much more exciting but there is
1001:06 something df.dtypes oops let's do
1001:11 df.dtypes
1001:12 and we'll run this now just like
1001:15 info it gave us these values but we're
1001:18 actually able to search on these values
1001:20 now so these um object float and integer
1001:24 we can search on those which is really
1001:26 great because we can do include equals
1001:29 and we can do something like number and
1001:32 none of these are numbers right or none
1001:34 of them explicitly say number but when
1001:37 we run it I'm getting an error Series
1001:39 object not oh that's because I'm doing
1001:42 um dtypes gives back a series we need to do
1001:45 select_dtypes now let's run
1001:48 this now it's only returning um the
1001:52 columns in this data frame where the
1001:54 data types are included in this number
1001:57 so you won't see any you know country or
1001:59 any of those text or the strings if we
1002:02 want to do that we go in here and say
1002:06 object and run that and this is another
1002:09 really quick way where we can just
1002:12 filter those columns to look for
1002:14 specific whether it's numeric um we
1002:17 could even do float in here and so now
1002:19 it's not including that rank which was
1002:21 an integer so we can specify the type of
1002:23 data type and it'll filter all of the
1002:25 columns based off of that which you know
1002:28 when you're doing stuff like this it
1002:29 is good to know what kind of data types
1002:32 you're working with and look at just
1002:33 those types of data types because there
1002:35 might be some type of analysis you want
1002:36 to perform on just that whether it's
1002:39 numeric or just the string or integer
1002:41 columns within your data set
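A short sketch of the dtype filtering walked through above, including the error you get from calling `dtypes` as if it were a method. The data frame is a hypothetical slice with illustrative column names, and the text column is given an explicit `object` dtype so the example behaves the same across pandas versions.

```python
import pandas as pd

# Hypothetical slice of a population table; names and values are illustrative.
df = pd.DataFrame({
    "Rank": [1, 2, 3],
    "Country": pd.Series(["China", "India", "United States"], dtype="object"),
    "2022 Population": [1_425_887_337.0, 1_417_173_173.0, 338_289_857.0],
    "Density": [146.9, 431.1, 36.1],
})

print(df.dtypes)  # a Series mapping each column name to its dtype

# df.dtypes is an attribute that holds a Series, so calling it raises TypeError.
try:
    df.dtypes(include="number")
except TypeError as err:
    print("TypeError:", err)

numeric_cols = df.select_dtypes(include="number")  # integers and floats
text_cols = df.select_dtypes(include="object")     # strings stored as object
float_cols = df.select_dtypes(include="float")     # floats only, so Rank drops out
```

`select_dtypes` also takes an `exclude` argument if it is easier to say what you do not want.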
1002:43 so again ending on a low note I apologize um you
1002:46 know everything else that we looked at
1002:47 all those other things that we looked at
1002:49 are all things that I typically do in
1002:52 some way or another when I'm looking at
1002:54 a data set exploratory data analysis is
1002:57 really just the first look you're
1003:00 looking at it you're going to be
1003:01 cleaning it up doing the data cleaning
1003:02 process and then you're going to be
1003:04 doing your actual data analysis actually
1003:06 finding those trends and patterns and
1003:08 then visualizing it um in some way to
1003:11 find some kind of meaning or insight or
1003:14 value from that data and again there's a
1003:16 thousand different ways you can go about
1003:18 this it does typically um you know
1003:21 depend on the data set but these are a
1003:24 lot of the ways that you'll clean a lot
1003:25 of different data sets and so you know
1003:27 that's why I went into the things that
1003:28 we looked at in this video so I
1003:30 hope that you guys liked it I hope that
1003:31 you enjoyed something in this tutorial
1003:33 if you like this video be sure to like
1003:34 and subscribe as well as check out all
1003:36 my other videos on pandas and Python and
1003:39 I will see you in the next
1003:41 [Music]
1003:51 video what's going on everybody welcome
1003:54 back to another video today we are back
1003:56 with another data analyst portfolio
1003:57 project where we will be scraping data
1003:59 from Amazon using
1004:05 [Music] Python now you may be asking do I need
1004:08 to know web scraping to become a data
1004:10 analyst and the answer is no you
1004:11 absolutely don't need to know it but it
1004:13 is a very cool skill to learn and in
1004:15 fact I have used it in my job in the
1004:17 past and so it is useful but you really
1004:20 don't need to know it something that it
1004:22 is used for is kind of creating your own
1004:24 data sets um and we're going to be
1004:26 looking at one where you can create your
1004:27 own data set today but there are a lot
1004:30 of other uses for web scraping and I'm
1004:32 sure I'll talk a little bit more about
1004:33 that while we're actually walking
1004:34 through the project one last thing I
1004:35 want to say before we get started is
1004:37 that this is most likely an intermediate
1004:39 project so if you are just now learning
1004:40 the basics of Python this might be a
1004:42 little bit challenging for you but I
1004:44 still recommend going through it because
1004:46 I will do my best to walk through
1004:48 everything every single step of the way
1004:49 and kind of explain all the concepts
1004:52 and so you can still learn something
1004:53 even if you aren't super good at Python
1004:55 right now with that being said let's
1004:56 jump over to my screen and get started
1004:58 on the project all right so we are going
1004:59 to get started and if you didn't watch
1005:01 the last project I had people download
1005:04 Anaconda uh we use Jupyter notebooks um
1005:07 and I'll show you how to get to that in
1005:08 just a second but I'll leave this
1005:09 link in the description if you haven't
1005:11 done that already and you are just doing
1005:13 this project um but you'll go you'll
1005:15 download Anaconda you know download
1005:17 super easy um and you're going to open
1005:18 up Jupyter notebooks I'll launch it
1005:20 right now I already have it open uh but
1005:23 I'll open up another one just for you
1005:25 know the purposes of demonstration what
1005:28 we are going to do today and what we um
1005:31 what people voted on I mean there's like
1005:33 there was like 8,000 people that voted
1005:35 um in the poll that I made of what data
1005:37 you wanted me to scrape there was like
1005:39 Amazon cryptocurrency weather um
1005:42 something else I don't remember
1005:44 overwhelmingly I mean like 70% of people
1005:46 maybe even 80% I you don't don't fact
1005:49 check me on that voted for Amazon um and
1005:53 so I'm going to do it now there are many
1005:56 things that you can scrape um off of
1005:58 Amazon just a ton of stuff um and I'm
1006:02 going to show you how to do it I'm going
1006:04 to show you how to make it useful how to
1006:06 make a data set um and it's going to be
1006:10 really interesting but there are lots of
1006:12 other ways to do this and so I think um
1006:14 and I have already kind of created it
1006:17 I'm going to show you how to do it off
1006:18 of this page um when you're actually in
1006:20 an item and you can scrape you know
1006:22 basically anything in here um and I'll
1006:24 show you how to do that another thing
1006:27 that is a little bit more advanced and
1006:29 that's why this first video is starting
1006:31 off I think on the more easy side it's
1006:33 not easy but it's easier the next thing
1006:36 the next video that I'm going to make is
1006:38 how to actually do um basically do
1006:42 multiple items right so this item this
1006:45 item this item this item and then
1006:48 traverse through the different pages so
1006:50 there's 20 pages um you want all of that
1006:53 data how do you get all of that that'll
1006:56 be the next project um I don't know when
1006:57 I plan on doing that I have it like 90% of
1007:00 the way done um but I had this one
1007:02 completed and so I wanted to get that
1007:03 out to you guys now but that will
1007:05 probably be the next project I think
1007:06 that is much more difficult um and so if
1007:09 you can understand this one and you get
1007:11 it and you understand it then the
1007:13 next project you should be able to
1007:14 understand too it's just a little bit more
1007:16 complicated so with that being said um
1007:19 we are going to actually get into the
1007:20 project I'm going to delete one of these
1007:23 um all we're going to do is go to new do
1007:26 Python 3 it'll open up a new one we'll
1007:29 call this um
1007:32 Amazon web
1007:35 scraper um project that's what we'll
1007:38 call it did I spell it right perfect um the
1007:42 first thing that we need to do uh or
1007:44 that we should do is
1007:47 upload um or import our
1007:50 libraries so I'm going to say um import
1007:53 oops what am I doing it's off to a
1007:56 terrible start there we go import
1007:59 libraries now I'm not going to write out
1008:00 all the libraries um I have some things
1008:03 that I'm going to be copying and pasting
1008:04 throughout this I won't there's only a
1008:06 few things that I'm copying and pasting
1008:07 you can take a quick glance um some of
1008:09 the things that I just don't want to
1008:10 waste time on um because this could be a
1008:12 long video I don't know I don't want to
1008:14 waste time on stuff like this um and so
1008:17 you know I'm just going to copy and
1008:19 paste it you guys are going to I'm going
1008:21 there will be a link below if you
1008:22 haven't clicked it already that will go
1008:24 to the GitHub page where you can
1008:26 literally have all of this code already
1008:28 written I do recommend writing it all
1008:31 yourself because you will learn it much
1008:33 better I promise because then you'll make
1008:34 mistakes and you'll figure it out and
1008:35 all that good stuff but you
1008:37 will have that code available so just go
1008:38 copy and paste it um that's what I would
1008:40 do but what we are going to be
1008:42 using today is uh something called
1008:44 Beautiful Soup requests um then we're
1008:48 going to be using time and datetime and
1008:51 a potential one if you want to get and
1008:53 I'm going to show you this at the end
1008:54 this is not really part of the project
1008:56 it goes above and beyond but this
1008:58 library here is for sending emails to
1009:01 yourself um and I'll show you how uh you
1009:04 can use it if you want to I already have
1009:06 the whole code written out um you can
1009:08 just steal it and try it out yourself
1009:10 and see if you can get it to work but
1009:11 this one is not um as important I'll put
1009:14 it down here so um let's move on
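The import cell described above probably looks something like the sketch below. Beautiful Soup, requests, time and datetime are named in the video; the email library is not named, so `smtplib` (Python's standard-library SMTP client) is an assumption here.

```python
# Import libraries
from bs4 import BeautifulSoup  # HTML parsing (installed as the beautifulsoup4 package)
import requests                # HTTP requests to fetch the page
import time                    # pausing between repeated checks
import datetime                # timestamping each scraped row

import smtplib                 # assumed: standard-library client for sending emails
```

Anaconda typically ships with `requests` and `beautifulsoup4`; in a plain Python install they would need to be installed with pip first.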
1009:18 now one thing I want to say before we get too
1009:20 into it is that well give me a
1009:23 second is that right here in front of me
1009:27 is a different laptop now it took me a
1009:30 solid I would say you know 10 hours or
1009:34 so to write all of this it took over the
1009:37 course of like two weeks in my free time
1009:38 I'd pick it up it took me a solid you
1009:41 know two weeks on and off an hour here
1009:43 an hour there to finish this project um
1009:47 and I made a ton of mistakes and messed
1009:49 a bunch of things up and I finally got
1009:50 it to work um you know after a bunch of
1009:52 revisions that's typically how things go
1009:54 when I do projects and so uh I'm about
1009:58 to give you a streamlined version of
1010:00 this because I have all the code right
1010:02 down here and so I'm going to be
1010:04 glancing at this a lot um just so I
1010:07 don't make this video 20 hours of trying
1010:09 to remember all the code off the top of
1010:11 my head I have it written out already I
1010:12 already did the project it works it's
1010:14 beautiful it's a good project so um I
1010:16 don't want to waste your time and I just
1010:18 want you to know that you know
1010:21 nobody should be able to do this off the top of
1010:23 their head in an hour most people won't
1010:26 um it takes time you make mistakes um
1010:30 but uh let's get started on the project
1010:33 now in this uh in this what we're going
1010:37 to have to do is we're going to have to
1010:41 tell Beautiful Soup and requests where
1010:44 we are actually getting this data from
1010:46 what website um what is our computer you
1010:48 know some information from our computer
1010:50 I'm going to again there's going to be a
1010:52 little copying and pasting in here
1010:54 because you don't ever you will never
1010:55 ever ever need to know this um but right
1010:58 here we're going to basically connect
1011:00 to the website so I'm just going to say
1011:02 connect to
1011:03 website and we're going to say URL is equal
1011:07 to and let's go get our
1011:09 URL so we have this right here so
1011:12 literally just go up here do you know uh
1011:15 Ctrl+A copy that oops that's the
1011:19 actual project get rid of
1011:21 that uh paste it in here and that is our
1011:24 URL we will use that in just a second uh
1011:27 what am I doing
1011:30 let me just get some room here and then
1011:34 what we're going to need is something
1011:36 called headers now again you will never
1011:38 ever ever need to know this so I'm just
1011:41 going to say headers um what I'm going
1011:43 to do is I'm going to copy this I'm
1011:44 going to show you how to get this really
1011:46 quick um but it's something called headers
1011:51 so uh let me show you how to use how to
1011:54 get
1011:56 this and why you don't need to know any
1011:58 of this so what this headers is is this
1012:01 something called a user agent you need
1012:02 to do this for your computer um and you
1012:06 can do that by going to this link right
1012:08 here so I'm going to put this link in
1012:10 the description so that you can go and
1012:12 get that and there's something right
1012:13 here called the user agent so all you
1012:16 have to do is copy this just like this
1012:20 do copy I'm going to go back here and
1012:22 I'll show you that it's I'm going to
1012:24 copy it in um it'll be the exact same so
1012:27 there you go
1012:29 it's the exact same um all of this extra
1012:33 stuff Accept-Encoding Accept um this
1012:38 HTML stuff Connection close all the you
1012:40 don't need to know any of it I promise
1012:42 it'll never come in handy ever in
1012:44 life actually there will be one person
1012:46 who that comes in handy for and then
1012:48 they'll message me
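For reference, a headers dictionary of the kind being pasted in here might look like the sketch below. Every value is a placeholder assumption, not the video's actual text: the header names follow what is read out (User-Agent, Accept-Encoding, Accept, Connection), and the User-Agent should be replaced with the string reported for your own browser.

```python
# Illustrative request headers; all values are placeholders.
headers = {
    # Identifies the browser and operating system making the request.
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    # Compression formats the client can decode.
    "Accept-Encoding": "gzip, deflate",
    # Content types the client prefers, HTML first.
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    # Ask the server to close the connection after the response.
    "Connection": "close",
}

for name, value in headers.items():
    print(f"{name}: {value}")
```

Sites often respond differently to requests that do not look like they come from a browser, which is the practical reason for sending a User-Agent at all.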
1012:52 um but we are now connecting um using our computer using
1012:55 this
1012:56 URL and then what we want to write is we
1012:58 want to write page we're going to say
1013:01 equals and this is where we start using
1013:03 uh these libraries so we're going to use
1013:05 requests.get and we are going to pull in
1013:08 that URL and we're just going to say
1013:11 headers is equal to our headers right
1013:15 here so uh we have this and this is
1013:19 where we're going to actually
1013:20 start getting the data bringing in the
1013:23 data um and it's not going to look like
1013:26 that at first but I'll try to print some
1013:28 stuff out as we go along the way so
1013:30 that you can kind of see what it looks
1013:31 like and how we're going to kind of make
1013:33 it more useful because it comes in very
1013:36 dirty uh when we first get it and some
1013:38 of the things I'm going to show you will
1013:40 just help clean that up
1013:42 um and before we actually go any further I don't want
1013:44 my head to be here for the entire time
1013:46 I'm going to get rid of myself so you
1013:47 can just see the page uh I just it's
1013:51 less distracting uh I hate when I feel
1013:53 like people are always watching me so I
1013:55 want people to just focus on the code uh
1013:58 so I will see you in a little bit let's get
1014:00 back into it all right so what we are
1014:02 going to do is we are actually going to
1014:03 start using the Beautiful Soup library
1014:06 all right so we are going to say soup1
1014:08 is equal to and this is where we
1014:11 actually start bringing in Beautiful Soup
1014:12 and you guessed it you're going to say
1014:13 BeautifulSoup and then in parentheses
1014:16 we're going to do page.content
1014:18 um and again these aren't really
1014:22 things that you need to remember or need
1014:24 to memorize we're just pulling in the
1014:26 content from the page that's really all
1014:27 we're doing right now and it comes
1014:29 in as HTML so we're going to do html.parser
1014:32 uh and let's see if I can print
1014:35 out uh actually let me just do soup1
1014:38 I don't like I don't like doing upper
1014:40 caps on
1014:41 stuff let's see if anything prints out
1014:43 real quick so we are literally pulling
1014:46 in all of the
1014:49 HTML
1014:52 HTML um and let me go show you really quick because we're going to get to this
1014:53 quick because we're going to get to this in a second anyways um if you come here
1014:57 this is
1014:59 a static page basically written
1015:01 in HTML um if you have never seen HTML
1015:04 before um you
1015:07 know actually a lot of this is you know
1015:11 just stuff that most people will never
1015:13 use uh it's just good to know some of
1015:15 the stuff is good to know so as you see
1015:17 I'm scrolling on this right side by the
1015:18 way I did right-click and inspect or
1015:21 control shift I whichever one works
1015:23 better for you but as I'm scrolling over
1015:25 this you should see it kind of
1015:26 highlighting different areas um it's
1015:29 hard to kind of get what you want let's
1015:31 say we want this title um what I can do
1015:34 is I can click select element go right
1015:37 here um and then we can select like
1015:39 the header or the title of the
1015:41 page now I just want to show you
1015:44 though what we're pulling in so we're
1015:46 pulling in this doctype HTML all of
1015:49 this is coming in so that's what this is
1015:51 right here this doctype HTML and we're
1015:54 pulling every single thing in that is
1015:56 what we're doing right now uh so let's
1015:59 get or let's go down a little bit let's
1016:02 do soup2 we're just going to do a
1016:04 very uh you know an upgrade to soup1
1016:08 basically we'll do BeautifulSoup
1016:11 again and then we're going to do uh soup1
1016:15 so we're pulling in that content
1016:18 again so that's soup1 and we're going
1016:20 to do .prettify if you don't know
1016:24 what that is it is common in a lot of
1016:27 different languages and a lot of
1016:29 different stuff um it just makes things
1016:31 look better that's really all it
1016:35 is uh I don't know why I'm using double
1016:39 quotes you can do
1016:41 single ones if you want um and now let's
1016:43 do Beautiful Soup two and it should just
1016:46 be better formatted um
1016:49 and let's see if that's true and it is
1016:52 so before if you could tell
1016:53 it didn't have basically any
1016:55 formatting it has a little bit of
1016:56 formatting now um it'll help in a second
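The two parsing steps described here can be sketched roughly as follows. This is a minimal sketch, not the exact notebook cell: the `html` string is a made-up stand-in for `page.content` (which in the video comes from a `requests` call made earlier), so the example runs without touching the network.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for page.content; the real value is the product page HTML.
html = "<html><body><span id='productTitle'>  Example Shirt  </span></body></html>"

# First pass: parse the raw HTML with Python's built-in HTML parser.
soup1 = BeautifulSoup(html, "html.parser")

# Second pass: prettify() returns an indented string with one tag per line,
# which is then parsed again so soup2 is a searchable object too.
soup2 = BeautifulSoup(soup1.prettify(), "html.parser")

print(soup1)             # everything on a single line
print(soup2.prettify())  # same markup, indented
```

Both objects hold the same elements; the prettified version is mostly easier to read when it is printed.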
1016:59 um and you'll see that but now what we
1017:03 want to do is go back and we want to
1017:05 actually get the data that we want now
1017:07 you can get any data you want I'm going
1017:09 to show you simple things really really
1017:13 easy um in my opinion it
1017:16 gets more difficult the more complicated
1017:18 stuff you start pulling um and
1017:21 you'll understand that as we go into it
1017:23 so what I'm going to do is I'm going to
1017:25 select this and I'm going to select this
1017:27 um the title I want that and so if you
1017:30 do span id it's equal to productTitle
1017:34 so we need to remember that um
1017:37 class we don't need to know class I
1017:40 believe uh we're going to be using that
1017:42 ID this um ID equals productTitle so
1017:46 that's what we're going to be using um
1017:48 class will come in in the next video
1017:50 when we start looking at these uh but
1017:52 not in this one so let's remember ID
1017:54 equals productTitle so let's go back
1017:57 over here so we have this soup2 it has
1018:00 basically all of that HTML in it right
1018:03 down here that is what we're
1018:04 pulling in so we need to kind of specify
1018:07 what we actually want so let's say title
1018:09 that's what we're going to be getting um
1018:11 and we're going to do soup2 so
1018:13 taking all that content um we'll do
1018:16 .find and we're going to do open
1018:18 parenthesis and we're going to say we
1018:19 want to find that ID where it's equal to
1018:23 productTitle
1018:25 and then we're going to do
1018:29 .get_text and then we're going to do
1018:33 open parentheses so now let's um let's
1018:37 print the
1018:40 title and see what we get all right so
1018:43 that is exactly what we're looking for
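The lookup by ID can be illustrated like this. It is a sketch under assumptions: the HTML snippet and the product name are invented, and only the `productTitle` ID and the `find` / `get_text` calls come from the walkthrough.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; on the real page the span sits deep inside the document.
html = """
<div>
  <span id="productTitle">
      Example Data T-Shirt
  </span>
</div>
"""

soup2 = BeautifulSoup(html, "html.parser")

# find(id=...) returns the first element whose id attribute matches,
# and get_text() returns the text inside it, whitespace included.
title = soup2.find(id="productTitle").get_text()

print(repr(title))  # repr makes the surrounding whitespace visible
```

If no element has that ID, `find` returns `None` and the chained `get_text()` call raises an `AttributeError`, so a changed page layout tends to show up as that error.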
1018:44 it's funny got data MIS um T-shirt
1018:50 that is what we're trying to pull in so
1018:52 that's perfect that's exactly what we
1018:53 want we don't uh let me just do
1018:57 this save me some time later on we don't
1018:59 only want the title we are also going to
1019:01 be pulling in the price so if you can
1019:04 guess uh we'll be doing uh a data
1019:08 set on the actual
1019:10 pricing um and so let's go back here
1019:14 we're going to again use this right here
1019:16 and we're going to go to this
1019:18 price and it says again we're going to
1019:21 look at this ID the ID equals
1019:23 priceblock_ourprice so fairly easy you can
1019:27 copy this I'm just going to write it out
1019:30 um we're going to say price is equal to
1019:33 soup2.find
1019:34 and then it's going to be again
1019:38 ID is equal to and it's going to be
1019:41 priceblock_ourprice did I
1019:45 spell that right
1019:47 oops excuse me there we go and the exact
1019:51 same
1019:52 thing .get_text
1019:56 parentheses uh and there's a get_text
1019:58 there's a get all or get all text um so
1020:02 you know that get_text is a specific
1020:04 thing that we are using we might use
1020:07 a different one later on um but
1020:10 that is what we have so now
1020:13 let's print the title and print
1020:16 why do I have all
1020:18 this too much uh too much space so let's
1020:22 print the title and print the price
1020:25 let's see what we get okay so we have
1020:27 our title and we have our price I mean
1020:30 you know I don't know what all this
1020:32 white space is over here um but it looks
1020:35 like there's a lot of white space over
1020:36 here we'll have to get rid of that uh in
1020:39 a little bit as we clean it up a little
1020:40 bit you can if you want do things like
1020:46 um you can get and this is up to you I'm
1020:49 not going to do this right now but I'm
1020:50 just going to show you how to do it you
1020:52 can get this where you're pulling in the
1020:54 ratings um which is you know if you want
1020:57 to look at like the ratings over
1020:59 time or what ratings are for specific
1021:02 products that could be really useful um
1021:05 you can pull basically anything you can
1021:06 go down the product details and look at
1021:08 dimensions uh anything you want on this
1021:11 page it is static so you can go in here
1021:15 and pull anything you just have
1021:17 to pull it from the HTML know where
1021:18 you're looking pull it in um and now
1021:21 when we go back here excuse me I'm going
1021:23 to show you now kind of how to use this
1021:26 right because we have this
1021:28 but how are we going to use it um that's
1021:30 kind of the important part I think first
1021:32 thing we need to do is clean this up a
1021:34 little bit because it just is you know
1021:39 if we try to use this it wouldn't be
1021:41 super useful because it'd be just a
1021:43 little bit dirty it's not super
1021:45 clean um so what we want to do is let's
1021:49 start with the price why not uh we're
1021:52 going to say price.strip um and that's
1021:55 just going to take uh basically the
1021:59 junk off of either side and so let's run
1022:02 that real quick so this is what we have
1022:04 but what we can also do is I don't want
1022:07 that dollar sign I just want the numeric
1022:09 value um later on we are going to be
1022:11 putting this and we're going to be um
1022:13 creating a process to put this into an
1022:15 Excel file again we're trying to create
1022:18 a data set I don't want you to have to
1022:19 copy and paste stuff it's all going to
1022:21 be automated basically to input this
1022:24 data into an Excel file for you or a CSV
1022:26 file for you so um you know think about
1022:29 making it useful in a CSV or in an Excel
1022:32 later on so what we can do is do a
1022:35 bracket and we're going to do one and
1022:38 then everything after that so basically
1022:39 it's just going to take everything from
1022:41 the first position onward uh so let's
1022:44 run that and there we go so let's just
1022:47 say price is equal to price.strip um
1022:52 and pull uh just do everything after
1022:54 that first um that first not value what
1022:58 am I saying what's the word for that I
1023:00 can't remember the word the first space
1023:02 that's not the right word but all right
1023:03 let's do the title um this is basically
1023:05 going to be the exact same thing um
1023:08 super easy so we're just going to do
1023:10 title.strip and open
1023:13 parentheses um and we can you know if
1023:16 you want to do this exact same thing so
1023:20 now we have it it's a little bit cleaner
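A small sketch of the cleanup step, using made-up raw strings in place of what `get_text()` returned (the real title and price are not reproduced here). `strip()` removes whitespace from both ends, and the `[1:]` slice keeps everything from index 1 onward, which drops the first character.

```python
# Hypothetical raw values, padded with whitespace the way scraped text often is.
raw_title = "\n\n   Example Data T-Shirt   \n\n"
raw_price = "\n  $16.99\n"

title = raw_title.strip()      # whitespace removed on both sides
price = raw_price.strip()[1:]  # strip first, then skip index 0 (the dollar sign)

print(title)
print(price)
```

Python strings are zero-indexed, so the "first position" mentioned above is index 1, meaning the second character. The slice only does the right thing while the value still starts with the dollar sign.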
1023:21 so this is what it originally looked
1023:23 like and now this is what it looks like
1023:25 so you know nothing super crazy but you
1023:30 know something interesting to know now
1023:33 we are about to in the very next part
1023:35 what we are going to do and let me just
1023:37 add a few of these because makes me feel
1023:39 better um what we are about to do is
1023:41 we're going to create our CSV to insert
1023:45 this data into the CSV and then later on
1023:47 what I'm going to do is show you kind of
1023:49 how to um automate this process to pull
1023:51 this data
1023:54 um to create a data set right just
1023:56 pulling this one time and putting it into a
1023:58 CSV really doesn't do anything you can
1024:00 just copy and paste that and save
1024:01 yourself a lot of time um what I'm going
1024:03 to show you is um basically doing it
1024:07 over time and just having it
1024:09 automated in the background that is what
1024:11 I'm going to show you um I guess a
1024:13 spoiler but what we need to do is we
1024:16 need
1024:17 to create uh create the CSV insert it
1024:22 into the CSV and then create a process
1024:24 to append more data into that CSV um I'm
1024:27 doing a lot of talking let's do some
1024:29 writing so what we need to do is we're
1024:32 going to use um I should have done this
1024:34 at the top maybe I'll go back and add
1024:36 that later on we're going to do import csv
1024:38 now in a CSV what you want is you
1024:41 want headers and then you want the data
1024:44 right so for our headers and we're going
1024:46 to call it header we're going to do um
1024:49 we're going to do a bracket and let's
1024:51 make the first one a title because
1024:54 that's going to be uh we can call it
1024:56 title you can call it product um
1024:59 whatever you want I'm just going to call
1025:00 it because I've been using title I'm
1025:01 going to call it title um and then we'll
1025:03 also
1025:04 have
1025:06 price now we need our data so I'm going
1025:08 to say data is equal to now this is
1025:11 important um right now how our data is
1025:15 and I can do this right here we're going to
1025:17 type um title or no let's do type
1025:20 price so these are strings and that's
1025:24 important to know um again I don't want
1025:27 to get too much into you know
1025:29 dictionaries and arrays and lists
1025:32 and strings and all these things but
1025:33 this is a string and you can't put that
1025:36 right now it's not super usable what
1025:38 we're going to do is make this a list um
1025:42 and so I'm doing an open bracket and I'm
1025:44 going to say our data is title
1025:48 comma price now if I do
1025:56 type of data I'll just run that
1026:00 it's a list now um and this is important
1026:03 because you can run into a lot of issues
1026:05 with this stuff it's really important to
1026:07 remember what's what type
1026:11 um how do I say this uh how your data is
1026:15 is it a list is it an array is it a
1026:17 dictionary um you know what is it these
1026:20 things are important they do play a big
1026:22 impact especially with this type of
1026:24 stuff so just wanted to show you that
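The type check described above looks roughly like this. The values are placeholders, and the capitalized header labels are an assumption about how the columns were typed in the notebook.

```python
# Hypothetical cleaned values standing in for the scraped title and price.
title = "Example Data T-Shirt"
price = "16.99"

header = ["Title", "Price"]  # column names (exact casing is an assumption)
data = [title, price]        # one row, in the same order as the header

print(type(price))  # each scraped value is a str
print(type(data))   # the row itself is a list
```

The row is built as a list because `csv.writer.writerow` expects one sequence of fields per row. Passing a bare string would write every character as its own column.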
1026:25 but what we are now going
1026:28 to do is create a CSV um you're going to
1026:32 create an Excel I call an Excel CSV
1026:35 you know whatever you want to call it so
1026:37 what we are going to do is we are going
1026:38 to say with and we're going to say open
1026:42 and now we're going to name our file you
1026:44 can name this whatever you want I'm
1026:46 going to call it
1026:49 uh um
1026:52 Amazon web
1026:55 scraper data set that's real long
1026:58 .csv and then we're going to do
1027:00 underscore w and that means
1027:04 write um oh whoops that's not right just
1027:08 like I was wondering why that was uh in
1027:09 black uh so we're going to do w which
1027:11 means write um and then we're going to
1027:14 do newline and if you don't know what
1027:16 newline is uh all that does is when we
1027:20 insert the data it doesn't have a
1027:23 space in between each CSV row and then we
1027:25 are going to do
1027:28 encoding is equal to
1027:34 UTF8 and that is it and we'll just say
1027:37 as uh let's do f so some of that stuff
1027:42 you don't need to know some of it's
1027:43 useful this w you definitely need to know
1027:45 this newline is good to know and um
1027:48 I might take it out just to
1027:49 show you what it actually does because
1027:51 it's annoying if you don't have it I
1027:53 promise um but you know that
1027:56 newline's important this encoding you know
1027:58 good to know I think that's by default
1028:00 it's like that uh anyways what
1028:02 we're going to do now is we're going to
1028:05 uh it's something within the
1028:07 CSV um library so we're
1028:11 going to do something called csv
1028:14 writer and oops csv.writer
1028:18 and we're going to do open
1028:20 parenthesis and that is that and we'll
1028:22 just call that
1028:25 writer and then we'll do this is
1028:28 where we need to actually create the
1028:30 header so uh we're going to do
1028:34 writer.writerow uh and this is
1028:40 just for the
1028:41 initial um
1028:44 import or um not import
1028:48 the initial insertion of the data into
1028:50 the CSV this is what's important the
1028:53 next one that we're going to write is
1028:54 for when we're actually appending the
1028:55 data which is going to be a little bit
1028:56 different but anyways we're going to do
1028:59 writerow open parenthesis and this is
1029:01 where that header is going to go so
1029:04 these headers are
1029:05 going to be the title and the
1029:07 price and then for our last one we're
1029:10 going to actually write the data which
1029:12 is this data right here and we're going
1029:14 to say
1029:15 writer.writerow
1029:18 and we're going to do data so this
1029:20 one we are creating the
1029:23 CSV and then we are inserting the header
1029:27 and inserting the data so super easy
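Put together, the file-creation step might look like the sketch below. The row values are placeholders, the exact spelling of the file name is an assumption based on how it is read out, and the file goes to a temporary directory here only so the example does not leave anything behind (the video writes it next to the notebook).

```python
import csv
import os
import tempfile

header = ["Title", "Price"]
data = ["Example Data T-Shirt", "16.99"]  # hypothetical cleaned values

# Assumed file name; a temp directory keeps the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "AmazonWebScraperDataset.csv")

# "w" creates the file or overwrites an existing one.
# newline="" lets the csv module control line endings, which avoids
# blank lines between rows on Windows.
with open(path, "w", newline="", encoding="UTF8") as f:
    writer = csv.writer(f)
    writer.writerow(header)  # first row: column names
    writer.writerow(data)    # second row: the scraped values

# Read the file back to check what was written.
with open(path, newline="", encoding="UTF8") as f:
    rows = list(csv.reader(f))

print(rows)
```

Because the mode is `"w"`, every run replaces the whole file. Adding rows over time needs append mode, which is the different version mentioned for later.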
1029:32 yeah I think that's fairly
1029:33 straightforward right now let's do this
1029:37 and let's see what happens so I just ran
1029:40 it um let's go over here in here
1029:44 somewhere Amazon web scraper data set
1029:47 let's open that
1029:49 up and there we go oh
1029:53 jeez this isn't good can't verify my uh
1030:01 subscription uh why does it say $6.99 I'm going to go back and look but I
1030:04 think I know the
1030:05 issue um but this is exactly what we
1030:08 want now of course we want more data and
1030:11 maybe a little bit more useful data um
1030:14 and I'll show you how to get that in
1030:14 just a second but we just created that
1030:17 out of thin air uh that was not I didn't
1030:19 have that saved before so we have this
1030:21 data set and the issue was that I ran
1030:26 this multiple times so now it's $6.99 if
1030:29 I do it again it's 99 uh and if I did it
1030:31 again it gets rid of everything so
1030:33 I'm just going to run this again run
1030:35 this
1030:41 again now everything's back to normal okay so now if we run this it's going to
1030:45 overwrite this Amazon web scraper data
1030:47 set .csv and it will put the data in
1030:52 properly so there we go oh jeez guys
1030:55 this is embarrassing
1030:57 I'm
1030:59 embarrassed no I don't want this okay
1031:03 perfect um guys if you can't tell I'm
1031:07 in need of some um I'm in
1031:10 need of some help here but I'm just
1031:14 kidding I'm doing fine uh I just
1031:16 don't know why that uh why I don't have
1031:19 my uh subscription activated it's not
1031:21 going to matter for this video I guess
1031:22 but that's really random um so we got
1031:25 what we need that's perfect
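The shrinking price comes from re-running a cell that reassigns `price` from itself: each run slices one more character off the front. The sketch below reproduces that with a made-up starting value (the real price is not stated), then shows one possible alternative that is not from the video, cleaning from the raw value and removing only a leading dollar sign.

```python
raw_price = "\n  $16.99\n"  # hypothetical scraped value

# Simulate running the same cleanup cell three times in a row.
price = raw_price
history = []
for _ in range(3):
    price = price.strip()[1:]  # drops one more leading character each time
    history.append(price)
print(history)

# Alternative (an assumption, not the video's code): always start from the
# raw string and strip only "$", so repeated runs give the same answer.
def clean_price(raw):
    return raw.strip().lstrip("$")

print(clean_price(raw_price))
print(clean_price(clean_price(raw_price)))
```

Re-running the scraping cell first, as done in the video, also resets the value. The helper just makes the cleanup safe to repeat.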
1031:28 now what we want to do after this um I
1031:31 guess actually what is important is some
1031:33 more useful
1031:34 data something that I like to do a lot
1031:38 when I do this type of
1031:40 stuff is I like to have some type of
1031:42 date stamp um or some type of timestamp
1031:45 to know when I collected this data it
1031:47 usually comes in handy later on um I
1031:50 have never regretted putting it in there
1031:52 I'll show you really quick how you can
1031:54 do it uh you're going to do import
1031:55 datetime
1031:58 geez I hate having to format stuff like
1032:00 that and what you can do is you can do
1032:02 date let me get datetime and you do
1032:06 .date.today open parentheses and that is
1032:11 going to give us this right here uh and
1032:13 so we're just going to do um today
1032:16 that's what we'll call it is equal to
1032:19 this and we'll say print today and there
1032:24 we go so that is today's date is the 20th
1032:27 of August in
1032:28 2021 so today is now um is now this
1032:33 2021 so today is now um is now this so actually I'm going to get rid of that
1032:36 actually I'm going to get rid of that I'm going to put it back up here I'm
1032:38 I'm going to put it back up here I'm going to put it right there I'm going to
1032:40 going to put it right there I'm going to run it again let's add this right here
1032:44 run it again let's add this right here we'll do
1032:47 we'll do um we'll do we'll call it
1032:50 um we'll do we'll call it date and then we'll add
1032:53 date and then we'll add today and we'll just run this again
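A minimal sketch of the date stamp step just described, assuming placeholder title and price values and a hypothetical file name scraper_demo.csv (none of these come from the video). It stamps the row with today's date and writes a header plus one row in "w" mode, which creates or overwrites the file.

```python
import csv
import datetime

# Placeholder values standing in for the scraped title and price (hypothetical).
title = "Example T-Shirt"
price = "16.99"

# Date stamp recording when this row was collected.
today = datetime.date.today()

header = ["Title", "Price", "Date"]
data = [title, price, today]

# Hypothetical file name; "w" creates the file or overwrites an existing one.
csv_path = "scraper_demo.csv"
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerow(data)
```

Because the file is opened with "w", running this again replaces whatever was in the file with the header and a single row.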
1032:57 and what we can do just to check the
1033:01 data without having to open up the file
1033:03 every single time which is super
1033:05 annoying is we're going to use pandas
1033:06 again I should have imported this at the
1033:08 top I'm not doing
1033:10 this off the top of my head but I
1033:12 didn't have it 100% planned so import
1033:15 pandas and we're just going to say
1033:17 pd.read_csv and then we'll read it in
1033:22 what you can do or what I often do is I
1033:25 go to properties and I go right
1033:37 here and we'll say boom boom backslash this right here this I am doing
1033:40 off the top of my head I don't do this
1033:41 often I think I have this memorized by
1033:43 now I hope and then we'll do
1033:47 print oh no we don't have to do print
1033:49 we'll just do this what do I do r
1033:52 let's actually call this data frame
1033:57 and we'll do
1034:00 print let's see what happens perfect
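A small sketch of checking the file with pandas instead of opening it by hand. The file name and the sample row are assumptions written here only so the example runs on its own; the "r" mentioned above is most likely the raw-string prefix used for Windows paths.

```python
import pandas as pd

# Hypothetical file; written here only so the example is self-contained.
csv_path = "scraper_demo.csv"
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    f.write("Title,Price,Date\nExample T-Shirt,16.99,2021-08-20\n")

# On Windows, a raw string such as r"C:\some\folder\file.csv" keeps the
# backslashes in the path from being read as escape characters.
df = pd.read_csv(csv_path)
print(df)
```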
1034:03 okay so what we have now is our
1034:07 new header our new data that we added in
1034:09 there so we have our title we have our
1034:11 price and we have our date now again you
1034:15 can customize this whatever you want to
1034:16 add go back here find what
1034:20 you want do you want to make
1034:22 sure it has a men's option or different
1034:25 colors or you want to pull in this
1034:27 information whatever you want it
1034:29 really does not matter it just matters
1034:31 that you get what you need for
1034:34 whatever purpose you're making
1034:36 this for this is more of an introductory
1034:38 video on how to scrape data from Amazon
1034:41 the next video will probably be a
1034:42 little bit more difficult and in-depth
1034:44 but this is kind of let's get you guys
1034:46 started so we now have this and this
1034:49 is beautiful
1034:56 now something that you want to do when you're scraping data and you're
1034:58 getting I guess data over time and
1035:02 that's kind of what we're doing it's going
1035:04 to be almost like a price tracker
1035:07 over time is you want to then append
1035:10 data to this so we can't only create it
1035:15 and that's what this does because if I
1035:16 run this 100 times it'll only give me
1035:18 this first row we need to now append
1035:20 data to this so
1035:23 let's pull this down here
1035:27 again I haven't added a
1035:29 bunch of notes I'm going to say now we
1035:31 are appending data to the CSV I haven't
1035:35 added a ton of notes I'll try to go back
1035:37 maybe afterwards and add some notes for
1035:38 people who like to read
1035:40 notes
1035:42 so what we are now going to do is
1035:44 we're going to change this w to an a+
1035:47 now this is going to be how we append
1035:50 the data and we no longer need the
1035:53 header so we aren't going to do
1035:54 the header anymore and there we go so
1035:57 now instead
1036:01 of creating that header again creating
1036:03 that first row of data again we are
1036:06 ignoring the existing data and we're now going to
1036:08 the next free row and appending
1036:12 data which means to add data on to
1036:15 that and so if I run this which I'm
1036:17 not going to right now I mean why not I
1036:21 can run it and then we can read
1036:23 this in so now there's our data
1036:25 I'll run it a few more
1036:28 times I ran it like three or four more
1036:30 times I read that in and there we go
1036:32 now it's all the exact same data super
1036:35 boring but very good to have
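A minimal sketch of the switch from "w" to "a+" described above, again with a hypothetical file name and placeholder values. The header is written once, and every later write opens the file in append mode so each new row lands after the existing ones.

```python
import csv
import datetime

csv_path = "scraper_demo.csv"  # hypothetical file name

# One-time setup: create the file with its header row.
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerow(["Title", "Price", "Date"])

# Every later run: open with "a+" so the new row is added at the end
# and no header is written again.
for _ in range(3):
    data = ["Example T-Shirt", "16.99", datetime.date.today()]
    with open(csv_path, "a+", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(data)
```

Running the append part repeatedly with an unchanged price gives identical rows apart from the date, which matches what shows up in the video.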
1036:41 now we don't want to have to come in here and run this every day let's say
1036:43 we're going to do this daily we don't
1036:46 want to have to come and run this
1036:48 every single day right we want a way
1036:50 where it does it while we sleep it does
1036:52 it in the background of our laptop
1036:55 and is easy to do right I don't want to
1036:57 come in here every single morning with
1036:59 an alarm set on my phone every single
1037:00 morning I want to automate
1037:03 this so how are we going to do that
1037:08 give me one second if you didn't know
1037:10 I have three kids and one of them is
1037:11 waking up I'll be right back all right I
1037:14 think he is asleep at least let's
1037:17 hope he's asleep so now what we're going
1037:19 to do is we're going to put this all into
1037:27 this check_price
1037:31 now you may never have used oh
1037:35 geez what are these things called oh my
1037:37 gosh
1037:39 super used all the time you'll know
1037:43 what it is
1037:45 not a function I don't even remember
1037:48 what it's called maybe this is a function
1037:51 I can't think I'm having like a
1037:53 writer's block or whatever that is we're
1037:55 going to put it all in here and then
1037:57 we're going to be able to use this price
1037:58 check later because we want to be
1038:00 able to automate this so let's go back
1038:02 all the way up
1038:03 here we are going to use this so let's
1038:07 copy all of that
1038:10 in and oh jeez I hate
1038:20 this all right everything just like that so this pulls in our
1038:22 data pulls in all
1038:26 of our data down to the title and the
1038:28 price we want
1038:30 to make it look
1038:34 right so we're going to put it right
1038:37 here so now we have it formatted
1038:41 properly we want to add our
1038:50 datetime do it just like that I'm sure there's a
1038:52 better way to do
1038:53 this then we
1038:57 need this right
1039:06 here and just like that so now we have our header and our data and then
1039:08 we want to pull this in right
1039:12 here boom boom boom
1039:16 okay so everything that we just wrote
1039:19 out we are now putting into this check_price
1039:23 now you can call it whatever you
1039:25 want doesn't matter but let's run that
1039:29 see if we get any errors we don't so
1039:31 this is now good to go
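A rough sketch of what wrapping the steps in a check_price function could look like. The scraping itself is not reproduced here: fetch_title_and_price is a hypothetical stand-in that returns fixed values, and the default file name is an assumption.

```python
import csv
import datetime


def fetch_title_and_price():
    """Hypothetical stand-in for the scraping code; returns fixed values."""
    return "Example T-Shirt", "16.99"


def check_price(csv_path="scraper_demo.csv"):
    """Collect one observation and append it to the CSV."""
    title, price = fetch_title_and_price()
    today = datetime.date.today()
    data = [title, price, today]
    # "a+" appends, so earlier rows are kept.
    with open(csv_path, "a+", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(data)
    return data
```

Putting everything in one function means a scheduler or loop only has to call check_price() to record a new row.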
1039:34 basically what we are going to use
1039:37 this for and what this is going to do
1039:40 is we are going to put this on a timer
1039:43 you know have you ever wanted to
1039:45 check something once a day once every 10
1039:49 seconds once a minute whatever you want
1039:51 and you don't want to have to actually
1039:53 pull up your phone and look at it this
1039:55 is how we are going to do that so we had
1039:57 something called let's see time
1040:01 this library time right here that's what
1040:02 we're going to use right now so we're
1040:04 going to say while
1040:10 True and go like this do a
1040:13 colon we're going to say check_price
1040:17 that's what we just wrote out and
1040:19 we're going to do time.sleep now
1040:23 this is completely up to you how
1040:26 much time you want to put in here for
1040:28 the purposes of demonstration I'm going
1040:30 to put 5 seconds which means every 5
1040:33 seconds it is going to run through this
1040:36 entire process
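A sketch of the timer loop, under one assumption that is not in the video: a max_runs argument is added so the example can stop on its own. With max_runs left as None it behaves like the plain while True loop described above.

```python
import time


def run_on_timer(task, interval_seconds, max_runs=None):
    """Call task(), sleep, and repeat.

    max_runs=None loops forever, which is equivalent to:
        while True:
            task()
            time.sleep(interval_seconds)
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        time.sleep(interval_seconds)
    return runs


# Short demonstration: three runs, a tenth of a second apart.
run_on_timer(lambda: print("checking price"), interval_seconds=0.1, max_runs=3)
```

In the notebook the task would be check_price and the interval 5 for the demonstration.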
1040:39 and so let's run this really quick and I'm going to run it for
1040:42 let's say 30 seconds and then I'm going
1040:45 to pull this in right
1040:49 here so we just looked at it earlier we
1040:52 had four well five rows of data right
1040:58 what we are going to do is in just a
1041:00 second I'm going to stop this you know
1041:01 maybe after 30 seconds or so we're going
1041:03 to see how much data is in
1041:05 there and let's stop it right now
1041:07 it's been going long enough and
1041:10 let's run it so now we have five six
1041:13 seven eight so I guess I ran it for 20
1041:14 seconds
1041:16 that was for demonstration purposes
1041:19 I'd never do anything
1041:21 every 5 seconds unless it was like
1041:22 Black Friday on
1041:24 Amazon we can set this
1041:27 as long or as short as you want you can
1041:30 run it every second if you want that
1041:32 doesn't make sense to me but you can
1041:35 what we can do is do a little bit of
1041:37 math and I don't know this off the
1041:39 top of my head so I'm going to do the
1041:42 math with you live pretty exciting stuff
1041:45 got the calculator out so there are 60
1041:49 seconds in a minute and this goes by
1041:51 seconds by the way and you could
1041:54 put some expression up
1041:59 here for calculating this but I'm just
1042:01 going to put in the number because it's
1042:03 easier maybe not easier I'm just
1042:05 going to do it there's 60 seconds in
1042:08 a minute there are 60
1042:11 minutes in an hour so that's one hour
1042:15 and we can do 24 hours in a day so
1042:17 that's
1042:23 86,400 I believe did I read that right
1042:26 yes so this now if I ran this
1042:31 this is going to check the
1042:34 price every single day
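The calculator arithmetic above can also be written as an expression instead of a typed-in number; the constant names here are only illustrative.

```python
SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24

# 60 * 60 = 3600 seconds in an hour, times 24 hours in a day.
seconds_per_day = SECONDS_PER_MINUTE * MINUTES_PER_HOUR * HOURS_PER_DAY
print(seconds_per_day)  # 86400
```

Passing seconds_per_day to time.sleep makes the loop wait one day between checks.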
1042:36 and this is the entire point of this
1042:40 project not the entire point but
1042:43 this is a big part of this project is we
1042:45 want to create our own data set now
1042:47 something that I personally really love
1042:50 is a data set that
1042:52 I can do some type of
1042:55 time series with now this is not
1042:58 exciting it's probably not super
1042:59 exciting for this right but you get the
1043:04 idea that if this price were to change
1043:08 we would then see that reflected in the
1043:10 data at some
1043:11 point you can do this on any item you
1043:15 could ever imagine on Amazon it's the
1043:17 exact same process and some items change
1043:20 often this t-shirt will most likely
1043:23 never change and so again
1043:25 this is for demonstration purposes
1043:27 the code itself will be nice to put in a
1043:30 project although the data set that you
1043:31 get from this probably won't be the best
1043:34 I would
1043:35 imagine but notice that this is running
1043:38 I can then minimize this and this can
1043:40 run on my computer basically as long as
1043:44 my computer is
1043:47 working one thing I will say before I
1043:50 go on to some more stuff
1043:53 is that I personally when I
1043:56 created this I
1044:01 did something similar and I put this in
1044:03 Visual Studio Code and I didn't put
1044:06 it in Jupyter notebooks that's a
1044:09 personal preference I would look into
1044:11 that if that is something that you want
1044:14 I think Visual Studio Code is a
1044:15 little bit easier for automating these
1044:17 types of tasks but for illustrative
1044:20 purposes and for demonstration purposes
1044:22 you cannot beat Jupyter notebooks that's
1044:24 why I did it
1044:26 so with all that being said that is
1044:28 basically the end of the project now
1044:30 I'm not going to stop this and read it
1044:32 again but you get the point we now
1044:36 have a data
1044:39 set that oh jeez all this again that now
1044:42 has data I'm getting out of here oh
1044:45 geez it's hounding me let me get out of
1044:47 here oh
1044:49 no this is embarrassing guys I'm
1044:52 embarrassed we now have a CSV file with
1044:55 data in it now you run this in the
1044:57 background of your computer you can do
1044:59 that I have done it I've run it for
1045:01 weeks I have run it for months if you
1045:04 restart your computer just come back in
1045:06 here and restart this process
1045:09 it's the same for any automated process
1045:11 unless you start using some online
1045:14 automation service which will run it
1045:16 regardless of your computer they do it
1045:18 either in the cloud or on some
1045:22 server so this is a really
1045:25 good option again if you restart your
1045:27 computer or something happens and you
1045:28 lose connection just come in here run
1045:30 through this script again except
1045:32 for the one where it deletes all your
1045:35 data don't run that one again only run
1045:37 that one time in
1045:42 fact what I would do is I would
1045:45 just comment this out right I'd come in
1045:48 here and I would just comment this
1045:50 out so that anytime I come back in here
1045:53 I would never accidentally delete all my
1045:54 data
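Commenting the cell out is the approach taken in the video. A different safeguard, sketched here as an alternative and not something shown on screen, is to create the file only when it does not exist yet; ensure_csv is a hypothetical helper name.

```python
import csv
import os


def ensure_csv(csv_path, header=("Title", "Price", "Date")):
    """Create the CSV with a header only if it does not exist yet."""
    if os.path.exists(csv_path):
        return False  # leave existing data untouched
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(header)
    return True
```

With a guard like this the setup step can be rerun after a restart without wiping rows that were already collected.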
1045:56 but that is what this project does now
1045:58 something really interesting something
1046:00 that I have done in the past that I
1046:02 thought was really cool really useful I
1046:06 actually did it
1046:09 for some
1046:10 watches that I was watching especially
1046:13 on Black Friday that's when I used it I
1046:16 was interested in a price drop or
1046:19 specific price change and
1046:27 what I basically did was I
1046:31 said if the price is lower than let's
1046:36 say we wanted it to drop below
1046:39 $14 it would then send an email and
1046:44 I'm going to show you the script that I
1046:45 used it still works and if this is
1046:49 something that you are interested in
1046:50 this could be a completely different
1046:52 project I just think it's interesting
1046:54 and I wanted to show it to you although
1046:56 I wouldn't say this is part of the
1046:58 final project let me just come in
1047:02 here and we are going to create this
1047:07 super simple not super simple we're
1047:10 sending a mail we're connecting to a
1047:12 server we're using Gmail we're
1047:15 logging into our account that is my
1047:17 email you will not get my password we're
1047:19 creating the subject the body we
1047:22 configure or just kind of create this
1047:24 message and then we send a mail so then
1047:27 I have this def or this send_mail
1047:32 I am blanking on what this is called I'm
1047:33 going to call it a function but that's
1047:34 probably not right so if that price
1047:37 drops below a certain point it'll send
1047:39 me an email
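A hedged sketch of the price alert idea, not the script shown in the video. It assumes Gmail's SSL endpoint smtp.gmail.com on port 465 and uses placeholder addresses; build_alert and its wording are illustrative. Only the message is built here, because sending needs real credentials.

```python
import smtplib
from email.message import EmailMessage


def build_alert(price, threshold, sender, recipient):
    """Return an EmailMessage if price is below threshold, else None."""
    if price >= threshold:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"Price dropped below ${threshold:.2f}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"The item is now ${price:.2f}.")
    return msg


def send_mail(msg, user, password):
    """Log in over SSL and send the message (needs real credentials)."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(user, password)
        server.send_message(msg)


# Placeholder addresses; a price of 13.50 is below the 14.00 threshold.
alert = build_alert(13.50, 14.00, "me@example.com", "me@example.com")
```

Inside check_price this would come down to calling send_mail only when build_alert returns a message.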
1047:42 I have used this and was able to buy a watch that
1047:45 was like let's say 140 bucks
1047:47 for like 90 bucks on a Black Friday
1047:49 sale I was really really happy about
1047:50 that so this can be used in that way as
1047:53 well not something you need to write into
1047:56 your project just something I'm going to
1047:57 include down here if you want to try it
1048:01 I think it's super interesting something
1048:02 really
1048:03 fun really fun to mess around with I
1048:06 enjoyed this so with that being said
1048:11 this is the project in the
1048:15 next one and I promise you this one is
1048:17 probably going to get a lot
1048:19 more difficult if you thought this one
1048:21 was easy which I hope you
1048:23 do then that means you're
1048:24 pretty good at Python in the
1048:27 next web scraping project
1048:30 and I hope to do many of these I might
1048:32 do even all the ones that I put in
1048:34 that poll but I started with the one
1048:35 that was the most
1048:37 popular if you were able to
1048:40 get through this I think that is
1048:42 fantastic I think this is a solid
1048:44 project to create a data set and so
1048:48 use this how you will you can copy my
1048:51 code exactly I don't have a problem with
1048:53 that again I don't think this is
1048:55 beginner there are some a little bit
1048:56 more advanced things not even
1048:59 advanced just like intermediate level
1049:01 things that you kind of learn as you
1049:03 get into it and so I hope that this
1049:05 was instructional I hope I explained it
1049:08 well and I hope that this is
1049:11 useful again when you actually
1049:13 use this you'll have 22 23 24 25 you
1049:19 know you'll see a price change a price
1049:22 change a price change a price change go
1049:25 use a product or go to something that
1049:27 you are interested in or that
1049:29 fluctuates often and there are plenty
1049:32 of those on Amazon I promise you there are
1049:34 some that literally change almost every
1049:36 other day like down a dollar up a dollar
1049:38 and then Black Friday just goes crazy
1049:42 with these price changes so use this
1049:44 as you will I hope that this was
1049:46 instructional I hope that it's useful I
1049:48 think I said that before I'm
1049:50 doing this because I think it's really
1049:52 interesting it's really useful
1049:56 this to me again was a good
1049:58 introduction a really good introduction
1050:01 to web scraping because in this next one
1050:04 it gets quite a bit more difficult I
1050:07 would say on a scale of difficulty
1050:09 this is like maybe a four and it'll
1050:12 probably jump up to like a seven on this
1050:13 next one just much
1050:16 more technical or coding heavy so
1050:22 look forward to that if
1050:23 that's something that you look forward
1050:24 to with that being said I'm going to go
1050:27 back over here for my send-off with that
1050:29 being said I hope this was helpful I
1050:33 hope that you learned something don't
1050:36 get mad at me if it was too easy don't
1050:37 get mad at me if it was too hard
1050:39 I'm doing my best over here so I
1050:41 appreciate your patience thank you so
1050:43 much for watching I really appreciate it
1050:46 much for watching I really appreciate it if you like this video be sure to like
1050:48 if you like this video be sure to like And subscribe below and I will see you
1050:50 And subscribe below and I will see you in the next
1050:52 in the next [Music]
1050:54 [Music] video
1051:04 what's going on everybody welcome back to another video today we're going to be
1051:06 creating a script to automatically take
1051:08 data from a crypto
1051:11 [Music]
1051:14 API now this project stems from an
1051:17 earlier video that I did where I walked
1051:18 through what an API was and how you can
1051:20 use it and in that video I showed you
1051:22 how to use CoinMarketCap's API so you
1051:24 could start pulling in their crypto data
1051:26 and in this video we're going to take it
1051:27 one step further and automate that
1051:28 process now we're going to do a little
1051:29 bit of transformation with the data I'm
1051:31 going to show you some cool stuff of how
1051:33 you can use it and maybe we'll do a
1051:34 little bit of visualization at the end
1051:36 but that is not the main point of this
1051:38 video it's mostly around the automation
1051:40 piece and a little bit of the data
1051:42 cleaning piece as well now fair warning
1051:44 this is not a beginner level project
1051:45 it's probably more like an intermediate
1051:47 project and it's not even a complete
1051:49 project per se because we're not doing
1051:52 all the data cleaning we're not doing
1051:53 all the visualizations but if you
1051:55 follow along we're going to cover a lot
1051:57 of different things and you're really
1051:58 going to set yourself up to be able to
1052:00 do just about anything you want with
1052:02 this data or different APIs that you
1052:04 pull from so with that being said let's
1052:05 jump onto my screen and get started with
1052:06 the project
1052:08 all right so this is where we stopped in our last video so if you
1052:10 haven't watched it now is the time to go
1052:12 back and do that I'll have a link in the
1052:14 description also all the code that we're
1052:16 going to be looking at today and working
1052:18 through is going to be in a GitHub repo
1052:21 below so you can go and get all the code
1052:23 and have it completely finished and just
1052:24 follow along or you can code it from
1052:27 scratch along with me I do recommend
1052:29 writing it from scratch if you can
1052:31 because I think you'll learn more and
1052:32 you'll make mistakes and you'll learn
1052:33 from that as we go through it but it is
1052:36 up to you so let's get started and as
1052:39 you can see we have the script right
1052:41 here and I'm starting basically from
1052:43 scratch I have a completed one up here
1052:45 I'm actually going to get rid of those
1052:47 and what we're going to do is we're
1052:49 going to start from exactly where we
1052:51 started in our last one I'm going to run
1052:52 the script this is going to pull from
1052:55 our API and we're going to look at the
1052:59 dictionary set our option and do our
1053:02 json_normalize so this is where we
1053:04 literally left off from the
1053:06 last video so we have all of this
1053:09 data
1053:11 and what we want to do with it is we
1053:14 want to kind of automate that process
1053:16 right because we don't want to have to
1053:17 come in here run this and put it
1053:20 into a CSV manually or something like
1053:23 that we want to automate this data
1053:25 collection process so that we can just
1053:27 have the data ready for us to use and
1053:29 it all be ready to go so we're going to
1053:32 be using this script but
1053:35 we might want to add a little bit more
1053:37 to it before we do that the first
1053:40 thing that I want to do before
1053:44 anything is something that I like to do
1053:46 when I'm creating these automation
1053:47 scripts is I like to add a timestamp
1053:51 and the reason for that is because I
1053:53 want to know when each of
1053:56 those loops you can say runs through
1053:59 and does those automated runs right
1054:01 so if I do it every day I want to know
1054:03 what time of day I ran it making sure
1054:06 each run ran
1054:07 successfully
1054:09 and so all I'm going to do is I'm going to add a new column at the
1054:11 end and just call it timestamp so let's
1054:14 go right up here and we're going to say
1054:18 pd dot and there's something called
1054:20 to_datetime so we're going to do
1054:23 pd.to_datetime
1054:26 and then we're going to do now and
1054:31 what this is literally going to do is
1054:32 take the timestamp of
1054:36 right now when it's running and it's
1054:39 going to show that now we need to of
1054:41 course add a new column for
1054:44 that so all we're going to do is we're
1054:45 going to say data frame whoops we'll say
1054:49 data frame and let me see real quick we
1054:52 just have the
1054:53 data
1054:55 we need to create this
1054:57 data frame right here so data frame
1054:58 equals and then this json_normalize and
1055:01 we're going to say data frame and then
1055:02 we're going to do a bracket and we're
1055:04 going to say timestamp and well
1055:07 are all these lowercase we're going to
1055:10 keep with the lowercase we're going
1055:11 to say
1055:13 timestamp and we do that bracket and we'll
1055:15 say equals so what this is going to do is
1055:17 first off it's going to
1055:19 assign this df as our data
1055:21 frame and then we're going to add this
1055:24 timestamp and add this new column and
1055:26 so let's run this really
1055:29 quickly and let's go all the way to the
1055:32 right and this is our timestamp and this
1055:35 is the time that it is right now this
1055:37 is the day that I'm running it this is
1055:39 the time that I'm running it and so this
1055:41 is working properly
1055:43 now if you look really quickly there is a last_updated
1055:46 in here and this is very close to this
1055:50 timestamp but it is not the same thing
1055:52 if you look through this data
1055:54 and you really dig into it a little bit
1055:56 this last_updated is coming from
1055:58 CoinMarketCap's API and this is when
1056:01 the actual cryptocurrency was updated
1056:05 in their system and so it is going to be
1056:07 really close but it's not going to be
1056:08 exact and so I don't like to rely on
1056:12 built-in ones that are coming
1056:13 from an API or something I want to make
1056:15 one myself that's running on the system
1056:16 where I'm creating the automated process
1056:18 that's just something I do
1056:23 so now we have this original data frame created
1056:26 right we now have what we need but
1056:30 what we want to do is to keep adding
1056:32 data to this we don't want it to just
1056:35 create these 5,000
1056:38 rows we want it to create 5,000 5,000
1056:41 5,000 over time whether it's a day an
1056:44 hour a week whatever you want to run
1056:46 it so what I'm actually going to do
1056:49 is I'm going to limit this a lot I just
1056:51 want to look at the top let's say 15 so
1056:54 we're going to do that and we're going
1056:55 to run through all this again so now I
1056:57 just have the top 15 it's going to be
1057:01 easier to see and it won't take as
1057:03 much time to run our scripts again you
1057:05 can keep as many as you'd like if you
1057:07 want 100 200 all 5,000 you do whatever
1057:10 you'd like
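The limit change can be sketched as a request parameter. This assumes the endpoint and the `start`, `limit` and `convert` parameter names from CoinMarketCap's public listings example, which are not shown in this part of the video, so check them against the current API documentation. `build_query` is a hypothetical helper, and nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Assumed endpoint for the latest listings; verify before relying on it.
URL = "https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest"

def build_query(limit):
    """Return the query string asking for the top `limit` coins in USD."""
    parameters = {"start": "1", "limit": str(limit), "convert": "USD"}
    return urlencode(parameters)

# Ask for the top 15 coins instead of all 5,000.
print(URL + "?" + build_query(15))
```

A smaller limit keeps each pull quick and the appended data frame easy to read while testing.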
1057:13 but what we are now going to do is we're going to create a function
1057:16 using this original script so again
1057:18 we have this data frame and we are going
1057:21 to create an automated process
1057:24 a script to automate this
1057:26 that is going to append data to this
1057:28 data frame right here so that's kind of
1057:30 the big thing that we're trying
1057:31 to accomplish in this project so
1057:35 let's go up here and
1057:37 we'll just take from here all the way to
1057:41 here we're just going to copy this and
1057:45 paste it down here now what we
1057:48 need to do is we need to create a
1057:50 function so we're going to say
1057:52 def and we're going to call this
1057:54 api_runner because this is going to
1057:57 run our API whenever we need it to
1058:01 run now when you are
1058:03 formatting something for a function
1058:06 it needs to be formatted properly and
1058:09 so what we need to do is go over
1058:10 here and hit tab we're going to do this all
1058:13 the way down I'm just going to skip
1058:14 forward to when it's all done all
1058:15 right so now we have this URL and what
1058:18 we want to add because again
1058:20 this is going to run through
1058:21 this automated process we're going
1058:23 to run this function what
1058:26 we want is to also add this right here
1058:28 so we need to take this and we're going to
1058:31 need to add
1058:32 this we'll just put it down
1058:36 here
1058:37 [Music]
1058:39 okay and let's do that
1058:43 so what we have so far is really close to what we want
1058:46 our function to be we have this
1058:50 function that we're going to be running
1058:52 through it's going to call this function
1058:54 it's going to call the API we're
1058:56 going to use our key we are going to
1059:00 test it load it and
1059:03 format it right here then we're going to
1059:04 add this timestamp and then we will have
1059:06 this now right now it's just
1059:09 going to print this data frame basically
1059:11 but that's not what we want right now
1059:13 what we want is to actually append this
1059:16 data so when it gets to here when it
1059:18 gets to this data that's going to be
1059:20 right here what we want to do
1059:23 now since we already have the original
1059:25 data frame set up up top is we now want
1059:27 to say that this is going to be data
1059:29 frame two and we're going to say it's
1059:32 going to append it to data frame two and
1059:34 so the original data frame we're going
1059:36 to say
1059:37 df2.append and we're going to say
1059:41 df2 all this does is this says this new
1059:46 data that's going to be coming in every time
1059:48 let's say it's a loop and it's just
1059:49 looping through pulling the data pulling
1059:50 the data pulling the data we're going to
1059:53 create this data frame we're going to
1059:55 add this timestamp like we
1059:57 want and then we're going to append that
1059:59 to this original data frame so as of
1060:03 right now this looks good we'll
1060:05 run it in a second I'll create it so I
1060:09 just created it
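The append step can be sketched as below. This is a variant, not the exact code from the video: `fetch_listings` and `take_snapshot` are hypothetical stand-ins that return made-up records instead of calling the API, and `api_runner` takes and returns the data frame rather than modifying one defined outside the function. It also uses `pd.concat`, because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

def fetch_listings():
    """Hypothetical stand-in for the API call; returns made-up records."""
    return [
        {"name": "CoinA", "quote": {"USD": {"price": 1.23}}},
        {"name": "CoinB", "quote": {"USD": {"price": 4.56}}},
    ]

def take_snapshot():
    """Pull one batch of records, flatten it, and stamp it with the run time."""
    df2 = pd.json_normalize(fetch_listings())
    df2["timestamp"] = pd.to_datetime("now")
    return df2

def api_runner(df):
    """Return the original frame with one fresh snapshot appended to it."""
    # pd.concat replaces the old df.append(df2) call.
    return pd.concat([df, take_snapshot()], ignore_index=True)

# Original data frame, then two more pulls appended to it.
df = take_snapshot()
df = api_runner(df)
df = api_runner(df)
print(len(df))  # 3 snapshots of 2 rows each
```

Passing the frame in and returning it keeps the function free of shared state; the video instead updates a data frame defined at the top of the notebook.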
1060:12 so now we need to actually create our script to automatically run this so
1060:15 we're going to do something called
1060:17 import os and let me tell you there's a
1060:20 thousand different ways to do this and
1060:21 there are better ways to do this but
1060:24 they are much more complex much more
1060:26 complicated and some cost money in order
1060:28 to do it I'm going to show you different
1060:31 options on how to do this in future
1060:33 videos on how to automate your Python
1060:35 scripts but this one to me is one I've
1060:38 used a lot many many times for
1060:40 different projects and it works so I'm
1060:43 not going to show you the most
1060:44 complicated thing in the world I'm going
1060:45 to show you something that I've just
1060:46 used a lot and so we're going to say
1060:49 from time import time from time import
1060:54 sleep that one's
1060:56 important and now we're going to create
1060:58 our loop so what the time
1061:02 and the sleep and the os your
1061:05 operating system what these are
1061:06 going to do is they're going to give us
1061:08 the ability to track the time and we're
1061:12 going to be able to run through and call
1061:14 this function at certain intervals that
1061:17 we want
1061:20 so let's create our for loop we're going to say for i in now
1061:24 you can create this specific part in
1061:28 different ways but what I'm going to do
1061:30 is I'm going to say range of one
1061:32 let's say
1061:33 333 and I say 333 and if you remember
1061:37 from the first video on the API you only
1061:39 have 333 runs per day and so if I
1061:45 ran this 333 times today that would be
1061:49 our max and so that's why I'm using that
1061:51 333 just for reference so now we're
1061:53 going to
1061:54 do
1061:56 api_runner so in this loop we're going
1061:59 to call this function up here and then
1062:01 I want to
1062:04 have an output to show that this is
1062:06 running through successfully so I'm just
1062:09 going to and you can write anything here
1062:10 we're just going to say API Runner
1062:16 completed
1062:19 successfully how do you
1062:21 spell that successfully that doesn't
1062:25 look
1062:26 right I'm just going to say completed
1062:28 all right forget that I don't remember
1062:30 how to spell successfully if
1062:32 I spelled it right you guys
1062:34 spell it that way but I can't remember
1062:36 now we're going to use this sleep right
1062:38 here now this counts in seconds you
1062:41 can change it to minutes hours whatever
1062:43 we're going to have it run every minute which
1062:46 is every 60 seconds and so
1062:49 I'm just going to say it's going to
1062:51 sleep for one
1062:53 minute and then we're going to say
1062:57 exit so all this is going to do and this
1063:01 is again fairly simple it's just a
1063:04 simple for loop and what it says is it's
1063:07 going to call this API it's going to
1063:09 tell us that it ran successfully and
1063:11 then it's going to wait for 60 seconds
1063:13 and it's going to run again that's it
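The loop described above can be sketched like this. `run_on_interval` and its parameter names are assumptions made for illustration; the video writes the for loop directly with `range`, `print`, `sleep(60)` and `exit()`. The demo at the bottom uses a tiny stand-in job and no waiting so it finishes immediately.

```python
from time import sleep

def run_on_interval(job, runs, interval_seconds):
    """Call job() `runs` times, pausing interval_seconds between calls."""
    for i in range(runs):
        job()
        print("API Runner completed")
        # Unlike the video's loop, skip the wait after the final run.
        if i < runs - 1:
            sleep(interval_seconds)

# Short demo: 3 runs with no real waiting.
# The video's settings would be runs=333 and interval_seconds=60,
# with api_runner passed in as the job.
calls = []
run_on_interval(lambda: calls.append(1), runs=3, interval_seconds=0)
print(len(calls))
```

Keeping the run count and interval as parameters makes it easy to test with a short interval before leaving the script running for a day.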
1063:17 so let's run this and see what happens see
1063:18 if what we did works so it ran the first
1063:21 time now I'm not going to
1063:25 bore you because I'm doing this live
1063:26 exactly what we're about to get is what
1063:28 we're going to use I didn't run it
1063:29 overnight or for a week so that we
1063:32 have a bunch of data what you are
1063:34 going to work with I'm going to work
1063:35 with as well so I'm going to wait a few
1063:37 minutes I'm going to let this run I want
1063:39 you to do the same thing I'm going to
1063:41 let this run for maybe like five minutes
1063:43 or so and we'll work with what we have
1063:46 and we'll keep going with the project
1063:47 because again the point of
1063:50 this project is not to create the final
1063:52 product or create all the
1063:54 visualizations that will most likely be in
1063:56 another video where we're taking all
1063:58 this data and doing all these things
1063:59 with it the point of this video is to
1064:00 automate it clean it up to where we have
1064:03 it to where we can really use it and
1064:05 then I'm going to let you guys loose and
1064:06 you guys can do whatever you want with
1064:08 it and I think it's really setting you
1064:10 up for a lot of successful projects in
1064:13 the future that you can do all by
1064:14 yourself without me having to walk you
1064:16 through it so as you can see it's
1064:17 already run through twice I'm going to
1064:19 pause for a second I'm going to let that
1064:21 run through just a few more times and
1064:24 then we will continue with the project
1064:26 all right we are back and of course it's
1064:28 only run what five times it has not
1064:31 reached the limit of 333 so we are
1064:33 perfectly fine what I'm going to do is
1064:35 I'm just going to stop this by clicking
1064:36 this square up here and it's going to
1064:39 give us some error and then we're going
1064:40 to check it and we will see what we have
1064:45 I don't know why it's taking so long if
1064:46 I'm being honest all right so I
1064:47 interrupted it and let's run this let's
1064:50 see what we got I hope we have more than
1064:51 15 because if not I'm going to be very
1064:53 upset
1065:00 okay so well I made a mistake I was
1065:03 supposed to put data frame right here
1065:06 and I had data frame two so
1065:11 change your script do not do what I just
1065:14 did it's
1065:16 supposed to be data frame append and
1065:18 we're supposed to be appending
1065:19 this data frame two to the
1065:23 original data frame so I messed up
1065:26 on that one let's rerun that
1065:30 let's
1065:33 see local variable df referenced before
1065:36 assignment okay this is perfect because
1065:38 this happened to me before we're
1065:41 running into all sorts of good stuff I
1065:42 like to keep this stuff in my videos I
1065:44 laugh because I hate running into
1065:46 mistakes but everybody says they're
1065:48 happy that I do this so I'm going to
1065:51 keep doing it I'm not going to cut this
1065:52 out I promise
1065:54 out I promise um but what we actually need to do is we
1065:56 but what we actually need to do is we need to go back up to this function
1065:58 need to go back up to this function because what happened was is we called
1066:00 because what happened was is we called this data
1066:01 this data frame and now it's it's because it's in
1066:04 frame and now it's it's because it's in a function it's in what they would call
1066:06 a function it's in what they would call a local variable what we need to do is
1066:09 a local variable what we need to do is we now need to state that this is a
1066:13 we now need to state that this is a global um it's just called a global
1066:16 global um it's just called a global that's all it is um and so what we're
1066:18 that's all it is um and so what we're going to do is we're going do tab we're
1066:19 going to do is we're going do tab we're say Global say
1066:22 say Global say DF and what this should do is this
1066:26 DF and what this should do is this should declare it as a global variable
1066:28 should declare it as a global variable and it should let this run properly
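A minimal sketch of the fix described here, assuming a module-level df and a made-up batch of rows standing in for the API response. The helper name add_batch and the sample values are hypothetical. DataFrame.append, which the video calls, was removed in pandas 2.0, so this sketch swaps in pd.concat.

```python
import pandas as pd

# Module-level frame, standing in for the original pull (sample values are made up).
df = pd.DataFrame([{"name": "Bitcoin", "price": 30000.0}])


def add_batch(rows):
    """Hypothetical helper: append one batch of rows to the module-level df."""
    # Without this declaration, the assignment below makes df local to the
    # function, and reading it on the right-hand side raises UnboundLocalError.
    global df
    df2 = pd.DataFrame(rows)
    # pd.concat replaces df.append(df2), which pandas 2.0 removed.
    df = pd.concat([df, df2], ignore_index=True)


add_batch([{"name": "Bitcoin", "price": 30010.0}])
add_batch([{"name": "Bitcoin", "price": 30020.0}])
print(len(df))  # the original pull plus two batches
```

The declaration only matters because the function assigns to df. Code that just reads a module-level name does not need it.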
1066:32 let's hope it
1066:34 does all right it's
1066:38 running um again I run into mistakes
1066:40 let me tell you something while we're
1066:43 here for just a second this project I
1066:46 ran into probably a hundred mistakes or
1066:48 a hundred errors issues that I had to
1066:51 research for hours um and hours I'm
1066:53 legitimately on Stack Overflow and just
1066:55 Googling and figuring these things out
1066:56 there were a lot of new things that I
1066:59 had never run into before um just on
1067:01 this project and so um everything that
1067:03 you're seeing is from after I went
1067:06 through all of those things or after I
1067:07 fixed all of those things and had to
1067:09 really work through them it was
1067:11 very um it was frustrating at times I
1067:13 just couldn't figure it out and so
1067:14 what you're looking at is kind of the
1067:16 polished version of that now that I have
1067:19 everything laid out because I can't
1067:21 spend 10 hours on a project nobody would
1067:24 watch it so just know that if you are
1067:25 running into some of these mistakes or
1067:27 you run into mistakes later on when
1067:29 you're expanding this project that's
1067:31 completely normal so what we're going to
1067:32 do is we're going to let this run for a
1067:35 little bit and then after maybe three or
1067:37 four minutes we'll come back and we'll
1067:40 keep going with the project all right so
1067:43 let's run this and check and see if we
1067:46 have uh the data that we're looking for
1067:48 uh and it looks like we do let's go
1067:52 actually back up here really
1067:55 quick um we want to set this to display
1067:58 max rows because I want to be able to
1068:00 see all the rows and not just um a few
1068:04 of them and that just
1068:06 gives us this scrolling instead of that
1068:09 dot dot dot that shows us just a few
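The display setting mentioned here can be sketched as follows. The option name display.max_rows is a real pandas option, but the value passed in the video is not stated in the transcript, so None (no limit) is an assumption.

```python
import pandas as pd

# None removes the row limit, so a notebook shows every row (scrolling)
# instead of truncating the middle of the frame with "...".
pd.set_option("display.max_rows", None)

print(pd.get_option("display.max_rows"))
```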
1068:12 so there's our original 15 and then we have
1068:15 the next um the next loop and then we
1068:17 have the next loop and let me scroll
1068:19 over to the timestamps and I'll show you
1068:22 what I mean um so was ran on
1068:26 52651 let's go down
1068:29 526 at 150
1068:34 2905 I say 1501 2905 and then the next
1068:37 one you can see was ran at
1068:40 36 31 these are all the ones one minute
1068:42 after each other my original one was
1068:43 from
1068:48 earlier 32 33 yeah so you can see 32 31
1068:51 3030 or um 3029 and this one was about
1068:54 15 minutes ago when I first
1068:57 um ran the original data frame right all
1068:59 right guys this is Alex from the future
1069:01 I've actually completed this entire
1069:02 project uh in the video and you're about
1069:05 to see all that after this but I wanted
1069:06 to show you one more thing that you can
1069:08 do in this function up here that I
1069:11 didn't show you uh originally that I'm
1069:12 coming back to show you and that's how
1069:16 to actually put it into a CSV now all
1069:18 we've done in this one is we've kept
1069:21 it all enclosed in a data frame and
1069:24 that's it and that may be great but a
1069:25 lot of you guys are going to want to
1069:28 automate this and put it into a CSV and
1069:30 I want to show you how to do that all
1069:31 right so what I'm going to show you
1069:33 really quickly is right here in this uh
1069:35 in this folder right here I have all
1069:37 these different API 3s and 4s these
1069:39 were tests that I did before but what
1069:41 you can do is instead of just putting it
1069:44 into a data frame you can actually
1069:46 append the data to a CSV and have that
1069:49 CSV sitting out there for you instead of
1069:51 just keeping it all in the data frame
1069:53 and there's a lot of different uses for
1069:56 that you may want to have that file
1069:58 separately from here just in case
1070:00 something times out or something breaks
1070:02 which is a legitimate concern or your
1070:04 computer shuts off or something like
1070:06 that that is a legitimate concern so
1070:08 what we're going to do is we're going to
1070:11 say um if not and this is basically an
1070:14 if statement we're going to say
1070:15 os.path.isfile so what this is going
1070:22 to do is check if there's already a file
1070:25 under this name and we're going to do r
1070:30 or R um if you have never done um
1070:33 if you've never done CSV stuff before
1070:35 it's really important that you put that
1070:37 or you're going to get an error every
1070:39 time so we're going to take this right
1070:41 here and we're going to copy that and
1070:43 we're going to put that right here and
1070:47 then we're also going to do a slash and
1070:49 then we're going to name it basically um
1070:51 let's name this API because I don't
1070:52 think I have that one in there I think I
1070:54 deleted it yeah so I don't have API so
1070:56 I'm just going to keep it API.csv
1070:59 and then I'm going to close that
1071:01 parentheses and then we're going to add
1071:03 a colon right here and we're going to
1071:07 say if that does not exist we are going
1071:10 to write this to it and create it so
1071:12 we're going to say df that's
1071:14 this data frame right
1071:19 here df dot we're going to say to_csv
1071:23 and we're going to do that r and
1071:27 then we're going to copy this so let's
1071:31 just let's just replace it like
1071:34 that and then we're going to say
1071:36 comma
1071:41 header oops header is equal
1071:46 to column_names so what this is
1071:48 going to do is if we run through this
1071:52 and what we would have to do is um I'll
1071:53 talk about this in a little bit we'll
1071:55 have to change this up a little bit but
1071:57 what this is going to do is it's going to
1072:00 check to see if this file right here
1072:03 exists if it does not it is going to
1072:06 create it and create the column headers
1072:08 based off of this data frame that is
1072:11 what that does now what we want to do is
1072:14 say else and this next part that we're
1072:16 going to write is saying if there's
1072:19 already the API file there we want to
1072:21 append the data we don't want to
1072:22 overwrite it or anything like that we
1072:23 want to append the data so we're
1072:25 going to say we're basically going to
1072:27 copy
1072:29 this maybe not the whole thing but I
1072:31 already did it um so we're going to copy
1072:35 that and we're going to say mode oops
1072:38 mode equals
1072:41 a and a stands for append and then we're
1072:44 going to say header oops keep messing up
1072:48 header and we'll say false oops we're
1072:49 going to say False which means when it
1072:51 appends the data it's not going to use
1072:53 the column headers every time
1072:55 which you don't want because every time
1072:58 you append it if you added the headers
1073:00 every 15 rows you're going
1073:02 to have another set of headers that you're
1073:04 going to have to like go out into that
1073:06 CSV and filter out and get rid of
1073:08 them so we're going to say header equals
1073:10 False now just a second ago I said you
1073:12 would need to mess with this just a
1073:14 little bit and you would because every
1073:17 time um you'd be putting in this data
1073:19 frame which it's already appending to
1073:21 this data frame so every time you'd be
1073:23 creating a lot of duplicates if you
1073:25 kept it exactly as is what you're
1073:27 going to need to do is basically take it
1073:30 back to its um bones um so you
1073:31 need
1073:35 to kind of keep it like this so what you
1073:37 need to do is just now run this and it
1073:40 would work perfectly uh let's test it
1073:43 really quick um to see if it works uh
1073:44 because I'm promising you something
1073:46 I want to make sure it actually works
1073:50 let's run it this time okay so it just
1073:52 ran for the first time so it should have
1073:53 created this file
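The create-or-append logic walked through above can be sketched like this. The helper name append_batch_to_csv, the temporary folder, and the sample rows are assumptions for illustration; the video's real folder path is not reproduced. The video passes header as the text column_names, while the documented values for header are a bool or a list of strings, so this sketch uses header=True. It also writes only the newest batch each time, which is one way to avoid the duplicates mentioned above.

```python
import os
import tempfile

import pandas as pd


def append_batch_to_csv(df2, csv_path):
    """Hypothetical helper: write one batch, creating the file on the first call."""
    if not os.path.isfile(csv_path):
        # First run: create the file and write the column names once.
        df2.to_csv(csv_path, header=True)
    else:
        # Later runs: mode="a" appends, header=False avoids repeating the names.
        df2.to_csv(csv_path, mode="a", header=False)


# Throwaway location for the demo. On Windows a literal path is usually written
# as a raw string (the r prefix) so that backslashes are not treated as escapes.
csv_path = os.path.join(tempfile.mkdtemp(), "API.csv")
batch = pd.DataFrame({"name": ["Bitcoin", "Ethereum"], "price": [30000.0, 2000.0]})

append_batch_to_csv(batch, csv_path)  # creates API.csv
append_batch_to_csv(batch, csv_path)  # appends two more rows

with open(csv_path) as f:
    lines = f.read().splitlines()
print(len(lines))  # one header line plus four data rows
```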
1073:56 let's go see if that works properly so
1073:58 now it just created that file and now
1074:01 we're going to see if it actually
1074:03 appends the data so let's wait just one
1074:05 time um and then I'm going to stop it
1074:07 I'm going to see if it works again I'm
1074:09 just verifying to make sure that what
1074:12 I'm telling you is actually working uh
1074:13 because if it doesn't I would feel
1074:16 terrible we don't want that and while
1074:19 that's running actually I'm going to add
1074:21 this because now I want to show you how
1074:24 to call it um super easy we're just
1074:25 going to do
1074:27 pd.read_csv we do that we're going to call
1074:36 this just like
1074:39 that and then we're going to say df
1074:42 and we're just going to do
1074:45 72 something random because I've already
1074:46 done this whole project I don't want to
1074:48 mess anything up so we're going to say
1074:53 df72 so now let's stop this
1074:56 um and what we're going to do is once
1074:58 that stops we're going to run this and
1075:01 see if it actually um worked and
1075:03 make sure that this actually pulled the
1075:05 data in all right so we interrupted it
1075:08 the file is ready to be read in so let's
1075:13 read it in there's our file um let's see
1075:15 what did I mess up or did I mess
1075:16 anything
1075:19 up ah I didn't mess anything up this is
1075:22 the index for this file and we already
1075:23 had this in here we'd probably be able
1075:25 to get rid of it but if you see we have
1075:28 zero one two three four five six seven eight nine
1075:31 14 then we have zero one two three and if we
1075:33 look at the timestamp it should be one
1075:35 minute apart so it's 11
1075:40 1945 it said 12045 so this worked
1075:42 exactly as planned
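A small sketch of reading the appended file back in, which also shows why the index restarts at zero for each batch. The temporary file and sample rows are assumptions; the name df72 follows the transcript.

```python
import os
import tempfile

import pandas as pd

# Throwaway file standing in for API.csv.
csv_path = os.path.join(tempfile.mkdtemp(), "API.csv")

batch = pd.DataFrame({"name": ["Bitcoin", "Ethereum", "Tether"]})
batch.to_csv(csv_path)                          # first run: creates the file
batch.to_csv(csv_path, mode="a", header=False)  # second run: appends

df72 = pd.read_csv(csv_path)

# The first column is the index that each batch saved, so it restarts per batch.
saved_index = df72.iloc[:, 0].tolist()
print(saved_index)
```

If that saved index is not wanted, passing index=False to to_csv leaves it out of the file.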
1075:44 um again you have two different options you can just keep it
1075:46 how it was before and I'll leave both of
1075:48 those options you know in the
1075:50 script so that you can kind of choose
1075:53 which one you want but um that's how you
1075:55 do that so then right here you're
1075:57 appending it to a CSV file and then if
1075:59 you just keep this and you get rid of
1076:00 all this you're just appending it to a
1076:03 data frame now please continue with the
1076:05 rest of the video that I already have
1076:08 done um but again I'm future Alex so uh
1076:09 please continue with the rest of the
1076:13 video okay so we have all this data we
1076:17 have we have so many columns we can do
1076:19 now you know if you want to completely
1076:21 just go and do your own thing you
1076:23 absolutely can do that I'm going to mess
1076:26 around with a few things um kind of show
1076:29 you something that I did that I thought
1076:32 was really interesting um in order to
1076:33 visualize this data a little bit and
1076:35 transform it a little bit to make it
1076:36 more
1076:38 usable um but we're not doing a full
1076:39 data cleaning that's not what this
1076:41 project is I'm not doing a full data
1076:44 cleaning of this data that would be a
1076:45 very large undertaking because
1076:47 honestly this needs a lot of work one
1076:49 thing that I do want to clean up really
1076:54 quick uh is this right here
1076:56 the math will be fine it's just the way
1076:58 that it's shown on here is in
1077:00 scientific notation and I don't like it
1077:02 so what I'm going to do really
1077:06 quickly is just um get rid of that so
1077:07 we're going
1077:09 to we're gonna say
1077:15 pd.set_option and
1077:18 this is going to be parentheses I'm
1077:21 going to say display this is just
1077:23 how this is formatting so we're going to
1077:25 say display.float_format and we're going to say comma and
1077:34 now we're going to use this
1077:38 lambda say x colon and we're going to
1077:40 say '%.5f' and that right there and we're
1077:50 going to say % x now if you don't
1077:53 know what lambdas are um I
1077:56 highly recommend looking those up um
1077:59 again this is not a beginner tutorial
1078:03 whoops no such keys display.floor_format
1078:07 that makes sense uh this is float yeah
1078:09 guys this is not a beginner's level all
1078:11 right uh you can't use floor_format
1078:13 this is float_format all right so
1078:15 now let's take a look at this uh this df
1078:16 uh this data frame that we have so we're
1078:19 just gonna hit df hit enter and now our
1078:20 numbers are a little bit more easily
1078:23 readable I prefer it this way you do not
1078:25 have to do this I'm doing this just
1078:27 because this is what I
1078:30 prefer
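The formatting option dictated above, written out as a short sketch. The option name and the lambda follow the transcript; the sample frame and its column name are made up to show the effect.

```python
import pandas as pd

# Show floats with five decimal places instead of scientific notation.
# This changes display only; the stored values are untouched.
pd.set_option("display.float_format", lambda x: "%.5f" % x)

sample = pd.DataFrame({"price": [0.5, 1234567.891]})  # made-up values
print(sample)

formatter = pd.get_option("display.float_format")
```

Misspelling the key (for example floor_format instead of float_format) raises the "no such keys" error seen in the video.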
1078:33 so let's jump right into it um something that when I saw this data I
1078:34 was like something that I really thought
1078:37 was interesting is this percent change
1078:40 of one hour percent change 24 hours 7
1078:43 days 30 days 60 days 90 days if you're
1078:44 not in crypto or you don't do investing
1078:47 or anything like that what this is going
1078:49 to show us is how I mean it's pretty
1078:53 obvious how much the price of this coin
1078:55 has changed over the last hour 24 hours
1078:58 seven days so as you can see it's
1079:01 barely fluctuated over the past 24 hours
1079:04 a little bit over the past um seven days
1079:06 a lot over the last 30 days 60 days and
1079:10 90 days 20 minus 26% minus 33% we're in
1079:12 May we just had kind of a crash in
1079:15 crypto a couple weeks ago so I mean this
1079:19 tracks right but I want to visualize
1079:23 this see this and kind of see um
1079:26 you know how this is going to look and
1079:28 if I can gain any insight from that
1079:30 information and just having it all
1079:33 displayed for me but in its current
1079:37 state um you know we really cannot do
1079:41 that um now another issue not an issue
1079:42 but another thing that we have to take
1079:44 into consideration is we
1079:47 have Bitcoin right here we have
1079:49 Bitcoin right here after different pulls
1079:51 now we just did it a minute after each
1079:53 other but for your project you may do a
1079:57 run each day a run every hour or
1080:00 something like that right
1080:03 and if you did that your data could be
1080:07 very different and so you may just want
1080:09 to take this first one but what I'm
1080:11 going to do for the sake of this project
1080:14 I'm going to group them so let's go down
1080:19 here and we're going to say df.groupby
1080:22 and so if you've ever done something
1080:25 like SQL uh this is how you group by in
1080:28 pandas basically we're going to group by
1080:31 uh the name so on Bitcoin Ethereum te
1080:34 so we're gonna do that on
1080:38 name and uh I'm gonna say
1080:42 sort is equal to False oops I'm not
1080:45 going to sort it uh you could say True
1080:48 there but we're not going to and I guess
1080:50 you'll see why later we're going to do
1080:51 an open
1080:54 bracket and now we need to choose what
1080:56 we're going to group by uh or
1080:57 what columns we're going to
1080:59 have so I'm going to do another open
1081:01 bracket and I'm just going to copy and
1081:03 paste these so I'm going to start right
1081:06 here at quote percent one hour so I'm
1081:10 going to do boom and
1081:14 then go over one and we're going to take
1081:16 24
1081:19 hours paste that
1081:24 comma we have the 7-day 30-day and we're going to do like
1081:31 that and I'm just going to do comma I'm
1081:33 gonna do the same one but I'm just going
1081:36 to manually change it to
1081:38 30-day get rid of that at the end I don't
1081:41 know what that is uh then we're going to
1081:44 do 60
1081:47 days and comma and we're going to do our
1081:51 last one which is 90 days and let's see what that gives us
1081:56 uh doesn't give us anything okay I know what's wrong here
1081:59 anything okay I know what's wrong here um we forgot to add basically the what
1082:01 um we forgot to add basically the what we're we have we're grouping by
1082:04 we're we have we're grouping by something we need to have like an
1082:06 something we need to have like an average a
1082:08 average a mean a mode or something like that right
1082:12 mean a mode or something like that right so all we have to do is go to the end
1082:14 so all we have to do is go to the end right here and let's just do we're going
1082:16 right here and let's just do we're going to do an
1082:18 to do an average um and so we're taking this
1082:22 average um and so we're taking this number let's say this is for Bitcoin so
1082:24 number let's say this is for Bitcoin so we're going to take this number in this
1082:26 we're going to take this number in this one hour for every time it's Bitcoin
1082:28 one hour for every time it's Bitcoin it's going to group them all together um
1082:30 it's going to group them all together um and then it's going to average them so
1082:32 and then it's going to average them so in the past five minutes where it's been
1082:35 in the past five minutes where it's been running we're going to take the average
1082:36 running we're going to take the average or the mean of that so let's run this
1082:40 or the mean of that so let's run this again and so now this is our output
1082:43 again and so now this is our output let's take a
1082:44 let's take a look Oops I meant down here let's run
1082:48 look Oops I meant down here let's run this
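The group-then-average step described above can be sketched roughly like this. The sample rows and the `quote.USD.percent_change_*` column names are assumptions standing in for the appended API pulls, not the exact data from the video.

```python
import pandas as pd

# Hypothetical sample standing in for several appended API pulls.
# Column names are assumed to follow the "quote.USD.percent_change_*" pattern.
df = pd.DataFrame({
    "name": ["Bitcoin", "Ethereum", "Bitcoin", "Ethereum"],
    "quote.USD.percent_change_1h": [0.10, -0.20, 0.30, -0.40],
    "quote.USD.percent_change_24h": [1.0, 2.0, 3.0, 4.0],
})

cols = ["quote.USD.percent_change_1h", "quote.USD.percent_change_24h"]

# Group every pull by coin, keep only the percent-change columns,
# and average them. Without an aggregation such as .mean(), the result
# is only a lazy GroupBy object, not a table.
df3 = df.groupby("name", sort=False)[cols].mean()
print(df3)
```

Each coin ends up as one row, with one averaged column per time window.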
1082:55 now now what we have is all of these um cryptos these are all 15 that we have
1082:57 and this is the average for the 1
1082:59 hour 24 hours 7 days 30 days 60 days and 90
1083:05 days so now we have all of our cryptocurrencies over here we have our
1083:07 cryptocurrencies over here we have our percent changes up top and then our
1083:09 percent changes up top and then our averages um here as well and so now what
1083:13 averages um here as well and so now what we're going to do is you know if you try
1083:16 we're going to do is you know if you try to visualize this as is doesn't really
1083:19 to visualize this as is doesn't really work because these percent changes are
1083:22 work because these percent changes are up here as columns and we don't really
1083:24 up here as columns and we don't really want them as columns because that it
1083:26 want them as columns because that it just doesn't work for visual for
1083:28 just doesn't work for visual for actually creating the visualizations we
1083:30 actually creating the visualizations we really need these to be rows and so my
1083:33 really need these to be rows and so my initial thought when I was doing this
1083:35 initial thought when I was doing this was I of course I need to Pivot um you
1083:37 was I of course I need to Pivot um you know if you've ever used pivot like an
1083:39 know if you've ever used pivot like an Excel or powerbi or something like that
1083:41 Excel or powerbi or something like that that was my first thought and I tried
1083:43 everything and I could not
1083:45 get it to work and I almost gave up
1083:47 until I ran across something called
1083:50 stacking and so this was not
1083:54 something that I think I have used
1083:56 it before but I couldn't remember to
1083:58 be completely frank I couldn't
1083:59 remember how to do this so
1084:02 once I saw what it was I did dot stack let's
1084:06 make that df4 you don't have to do
1084:09 this uh you can keep this all the original data frame I'm just I like for
1084:12 original data frame I'm just I like for visual purposes you can see like the
1084:13 visual purposes you can see like the progression that we're making um but I
1084:16 progression that we're making um but I like to you know create its new data
1084:18 like to you know create its new data frame and I can always go back and look
1084:19 frame and I can always go back and look at this data frame three um as we go but
1084:23 at this data frame three um as we go but you don't you don't have to do that
1084:24 you don't you don't have to do that that's just what I'm doing so now let's
1084:27 that's just what I'm doing so now let's take a look at this now uh up here we
1084:29 take a look at this now uh up here we had Bitcoin and we had all these columns
1084:31 had Bitcoin and we had all these columns and we had uh these numbers as rows but
1084:35 and we had uh these numbers as rows but now we have all of these as rows as well
1084:39 now we have all of these as rows as well this how we have this is much much more
1084:42 this how we have this is much much more usable um and if you've ever done
1084:44 usable um and if you've ever done something like pivot or the stacking
1084:46 something like pivot or the stacking before you'll know that you you kind of
1084:48 before you'll know that you you kind of have to do it if you really want to
1084:49 have to do it if you really want to visualize this
1084:51 visualize this well but um you because we just stacked
1084:55 well but um you because we just stacked it it kind of changed it so if we look
1084:58 it it kind of changed it so if we look at um let's look at the type of let's do
1085:01 at um let's look at the type of let's do type of data frame three this is
1085:04 type of data frame three this is before um before we stacked it this was
1085:08 before um before we stacked it this was in a data frame but now let's go and
1085:11 in a data frame but now let's go and look at data frame four so this is a
1085:14 look at data frame four so this is a series this is no longer a data frame so
1085:17 series this is no longer a data frame so we have to remember that that's that's
1085:19 we have to remember that that's that's really important because we can no
1085:20 really important because we can no longer treat it as a data frame it's now
1085:23 longer treat it as a data frame it's now a series so we want to get it back to a
1085:25 a series so we want to get it back to a data frame we don't want it to be like
1085:28 data frame we don't want it to be like that because you can't really use it in
1085:29 that because you can't really use it in the series so what we're going to do and
1085:32 the series so what we're going to do and let me just create a few of these so you
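Here is a minimal sketch of what stacking does and why the type changes. The small `df3` table and its short column labels are made up for illustration.

```python
import pandas as pd

# Hypothetical averaged table: one row per coin, one column per time window.
df3 = pd.DataFrame(
    {"1h": [0.2, -0.3], "24h": [2.0, 3.0]},
    index=pd.Index(["Bitcoin", "Ethereum"], name="name"),
)

# stack() moves the column labels into an inner index level,
# so every (coin, window) pair becomes its own row.
df4 = df3.stack()

print(type(df3).__name__)  # the wide table is a DataFrame
print(type(df4).__name__)  # the stacked result is a Series
print(df4)
```

Because the stacked result has only one column of values left, pandas returns it as a Series with a two-level index rather than a DataFrame.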
1085:34 let me just create a few of these so you can be up here better so now what we're
1085:37 going to do is we're going to say
1085:39 df4 dot and something called to_frame
1085:42 so we're going to make this into a
1085:47 frame and now we're going to specify the name and it doesn't mean um the name
1085:50 name and it doesn't mean um the name like right here we have actually mean
1085:52 like right here we have actually mean the name of these values right here this
1085:55 the name of these values right here this is part of the stacking process in these
1085:58 is part of the stacking process in these columns or these two columns so let's go
1086:02 columns or these two columns so let's go right here and we're going to call it
1086:04 right here and we're going to call it let's just say
1086:06 let's just say values and let's make this data frame
1086:12 values and let's make this data frame five and let's see the output whoops for
1086:15 five and let's see the output whoops for data frame five and now so there's that
1086:18 data frame five and now so there's that values and now this already looks a lot
1086:22 values and now this already looks a lot better right so it's in this it's in
1086:24 better right so it's in this it's in this more um this is already a data so
1086:27 this more um this is already a data so this is a data frame so let's look at
1086:29 this is a data frame so let's look at type data frame five so now it's in a
1086:32 type data frame five so now it's in a data frame
1086:33 data frame but the issue is is that this name is
1086:37 but the issue is is that this name is kind of acting like a an index which we
1086:41 kind of acting like a an index which we don't want because we want to be able to
1086:43 don't want because we want to be able to use this so it doesn't really have an
1086:45 use this so it doesn't really have an index at the moment so we need to give
1086:48 index at the moment so we need to give it an index but typically when you give
1086:51 it an index but typically when you give an index you'll do something like um
1086:53 we'll say df5 dot
1086:56 set_index and then you'll do something
1086:59 like name so let's just do df6
1087:03 is equal to and we'll see what
1087:08 happens here it's going to give us an error oops what I meant is we're going
1087:10 error oops what I meant is we're going to do data frame five
1087:13 to do data frame five bracket uh name and that's a column
1087:16 bracket uh name and that's a column right we're going to do that and it's
1087:18 right we're going to do that and it's basically going to say that that's not
1087:20 basically going to say that that's not going to work and and what we need to do
1087:23 going to work and and what we need to do is what or at least what I want to do
1087:26 is what or at least what I want to do and what we're going to do in this video
1087:28 and what we're going to do in this video is I'm going to create numbers I really
1087:30 is I'm going to create numbers I really would just want it to be numbered one
1087:32 would just want it to be numbered one two three four five that's what I want
1087:34 two three four five that's what I want um but we don't have that right now I
1087:36 um but we don't have that right now I can't just will it into existence so now
1087:39 can't just will it into existence so now what we're going to do is kind of create
1087:41 what we're going to do is kind of create uh an index basically out of thin air so
1087:44 we're going to do pd.Index
1087:46 and we're going to say you know
1087:54 we basically want how many um rows are in here that's where we want our our um
1087:57 in here that's where we want our our um index to be we want it to count how many
1087:59 index to be we want it to count how many are in here now you can make this
1088:00 are in here now you can make this Dynamic and I it probably wouldn't be
1088:03 Dynamic and I it probably wouldn't be that hard but I'm gonna take this super
1088:04 that hard but I'm gonna take this super lazy route um and I'm just GNA
1088:08 say let's do
1088:11 df5.count
1088:15 and there's 90 values in here so
1088:19 what I'm going to do is I'm going to do
1088:21 a
1088:22 range of 90 and this is not I
1088:27 would definitely make this dynamic but
1088:29 again I'm just
1088:31 being a little bit lazy we'll call
1088:37 this index is equal to and I'm going to put this Index right here so now this is
1088:38 put this Index right here so now this is a number so now it's going
1088:41 a number so now it's going to literally Index this for us now I've
1088:45 to literally Index this for us now I've ran into this issue many times um so
1088:48 ran into this issue many times um so what I need to actually do is to reset
1088:51 what I need to actually do is to reset this index and then do it properly the
1088:53 this index and then do it properly the first time uh so let's do re let's get
1088:56 first time uh so let's do re let's get rid of this let's reset this index um
1088:59 rid of this let's reset this index um and it actually fixed itself um so what
1089:03 and it actually fixed itself um so what was happening was is we were indexing
1089:05 was happening was is we were indexing something that was already indexed we
1089:07 something that was already indexed we were causing
1089:08 were causing issues in a nutshell so we reset the
1089:11 issues in a nutshell so we reset the index and now this is what it looks like
1089:13 index and now this is what it looks like and this is exactly what we want this is
1089:16 and this is exactly what we want this is really how we wanted it formatted in
1089:17 really how we wanted it formatted in order to for our visualizations we have
1089:20 order to for our visualizations we have multiple rows for the Bitcoin um each of
1089:23 multiple rows for the Bitcoin um each of these columns are is now a row with the
1089:25 these columns are is now a row with the value attached to it exactly what we
1089:28 value attached to it exactly what we wanted so um really quick I for whatever
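The index fix can be sketched as follows. This is one way to get the same shape, not necessarily the exact cell order from the video, and it sizes the index from the data instead of hard-coding 90.

```python
import pandas as pd

# Hypothetical frame whose index still carries the two stacked levels.
df5 = pd.DataFrame(
    {"values": [0.2, 2.0, -0.3, 3.0]},
    index=pd.MultiIndex.from_product(
        [["Bitcoin", "Ethereum"], ["1h", "24h"]], names=["name", None]
    ),
)

# reset_index() turns both index levels into ordinary columns and
# numbers the rows 0, 1, 2, ... The unnamed inner level gets the
# default column name "level_1".
df6 = df5.reset_index()

# Optional: an explicit index built from the row count (dynamic, not 90).
index = pd.Index(range(len(df6)))
df6 = df6.set_index(index)

print(df6)
```

Note that `reset_index()` on its own already produces a numbered index, so the explicit `pd.Index` step is only needed if you want to control the numbering yourself.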
1089:33 wanted so um really quick I for whatever reason it it makes that uh level one I
1089:36 reason it it makes that uh level one I don't know why but we're just going to
1089:37 don't know why but we're just going to rename that column really quickly so
1089:39 we're going to do
1089:41 df6.rename
1089:44 and then we're going to do an
1089:46 open parenthesis and say columns equal to
1089:51 and we're going to do one of these
1089:54 curly brackets and we're going to
1089:56 say
1089:58 level_1 and we do a colon and then
1090:06 we want to change it to and I'm just going to
1090:07 call this percent_change
1090:11 so let's call this df7
1090:15 again you don't have to do that
1090:21 I'm just doing it so now this looks much much better now let's try to visualize
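A short sketch of the rename step, using a made-up two-row frame. The dictionary maps the old column name to the new one.

```python
import pandas as pd

# Hypothetical frame after reset_index(), with the default "level_1" name.
df6 = pd.DataFrame({
    "name": ["Bitcoin", "Bitcoin"],
    "level_1": ["1h", "24h"],
    "values": [0.2, 2.0],
})

# rename() returns a new DataFrame; df6 itself is left unchanged.
df7 = df6.rename(columns={"level_1": "percent_change"})

print(df7.columns.tolist())
```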
1090:23 much better now let's try to visualize this one um because we haven't done any
1090:25 this one um because we haven't done any visualizations yet we've just been
1090:26 visualizations yet we've just been messing with the data a little bit I I
1090:28 messing with the data a little bit I I you know I kind of want to see how we
1090:30 you know I kind of want to see how we can use this it's something that I
1090:32 can use this it's something that I personally am interested in so I kind of
1090:34 personally am interested in so I kind of wanted to see visualize how these
1090:36 wanted to see visualize how these changed over these these time periods um
1090:39 changed over these these time periods um but we need to um import some stuff in
1090:41 but we need to um import some stuff in order to be able to visualize this so
1090:44 we're going to import seaborn as sns and if
1090:48 we need to we're going to import
1090:51 matplotlib as well I don't know if we'll
1090:57 use it right now or at all but um we're going to we're going to add it in here
1091:00 going to we're going to add it in here either
1091:00 either way so now those are added and so what
1091:04 way so now those are added and so what we're going to do is come right here
1091:05 we're going to do is come right here we're going to do
1091:06 sns.catplot and
1091:13 we're going to say the x axis is equal
1091:16 to and we want to do this as
1091:19 percent_change
1091:23 and then we have the y axis now we
1091:27 want the y axis to be these values
1091:30 right here so comma y is equal to and
1091:35 we're going to say
1091:39 values oops and then we're going to say comma and we'll say we want to basically
1091:42 comma and we'll say we want to basically create a Legend um I guess you could
1091:45 create a Legend um I guess you could call it we're going to say Hue is equal
1091:47 call it we're going to say Hue is equal to name um I'll show you what it looks
1091:50 to name um I'll show you what it looks like without it and then you know you
1091:52 like without it and then you know you can
1091:53 can see that we need that we're going to say
1091:56 the data is equal to
1092:01 df7
1092:04 and then we are going to say the
1092:07 kind is equal
1092:14 to now let's run this and see what we get and super quickly with just you know
1092:18 get and super quickly with just you know limited um inputs here's what we have
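The plot call described above looks roughly like this. The transcript does not capture which `kind` was typed, so `kind="point"` here is an assumption, and the data is a made-up sample.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical long-format data: one row per (coin, time window).
df7 = pd.DataFrame({
    "name": ["Bitcoin", "Bitcoin", "Ethereum", "Ethereum"],
    "percent_change": ["1h", "24h", "1h", "24h"],
    "values": [0.2, 2.0, -0.3, 3.0],
})

# hue="name" draws one colored series per coin and adds the legend.
# kind="point" is an assumption; other catplot kinds would also work.
grid = sns.catplot(x="percent_change", y="values", hue="name",
                   data=df7, kind="point")
```

Leaving out `hue` would pool every coin into a single series, which is why the legend matters here.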
1092:22 limited um inputs here's what we have now this looks really good we can narrow
1092:25 now this looks really good we can narrow this down if we wanted to to a few less
1092:27 this down if we wanted to to a few less because there's a lot here and there's a
1092:29 because there's a lot here and there's a lot of colors but again that's just
1092:31 lot of colors but again that's just because we have a lot of different stuff
1092:34 because we have a lot of different stuff but there's a few that are doing really
1092:36 but there's a few that are doing really well I think this is
1092:39 well I think this is Tron um and then we have a few that are
1092:41 Tron um and then we have a few that are not doing so well but it's really hard
1092:44 not doing so well but it's really hard to see if you look down here it's really
1092:46 to see if you look down here it's really hard to see this um and that's just
1092:49 hard to see this um and that's just because of the the column name
1092:52 because of the the column name and so I actually want to change these
1092:54 and so I actually want to change these column names or these values so that
1092:56 column names or these values so that when we visualize it right down here it
1092:59 when we visualize it right down here it it doesn't look like that I kind of want
1093:01 it doesn't look like that I kind of want this to be you know at least one good
1093:04 this to be you know at least one good visualization you can take out of here
1093:06 visualization you can take out of here this is definitely not perfect or
1093:07 this is definitely not perfect or complete by any means but you know you
1093:09 complete by any means but you know you can take take that away from here um so
1093:12 can take take that away from here um so let's um I did Alt Enter which adds
1093:16 let's um I did Alt Enter which adds another row I could have just pushed
1093:17 another row I could have just pushed plus that's was kind of the lazy way um
1093:20 plus that's was kind of the lazy way um what I'm going to do
1093:22 what I'm going to do is I'm going to change these um these
1093:25 is I'm going to change these um these values in here so how I'm going to do
1093:27 values in here so how I'm going to do that is I'm going to do data frame seven
1093:29 that is I'm going to do data frame seven and we only want to look at this one
1093:31 and we only want to look at this one column so we'll do that right
1093:35 column so we'll do that right there and we want to say dot
1093:39 replace and we're going to do an
1093:41 open parenthesis and then a bracket now
1093:45 what we need to do and I'm just going to show
1093:48 you one of them is I'm going to say
1093:52 this one
1093:53 hour value and then what I need
1093:56 to do is a comma another bracket and
1093:59 this is what it's going to change to I'm
1094:00 just going to say one hour
1094:05 um and we'll do this one really quick and then I'm gonna I don't want you to
1094:07 and then I'm gonna I don't want you to have to watch me type all this out but
1094:08 have to watch me type all this out but I'm going to go through and basically do
1094:09 I'm going to go through and basically do all of this uh for those but let's let's
1094:12 all of this uh for those but let's let's see this really quick and so now as you
1094:14 see this really quick and so now as you can see that um the originally it said
1094:16 can see that um the originally it said quote. USD percent change 1 hour is now
1094:19 quote. USD percent change 1 hour is now only 1 hour now
1094:22 only 1 hour now this didn't actually do anything we need
1094:24 this didn't actually do anything we need to apply it to this right here so I'm
1094:27 to apply it to this right here so I'm going to say data frame 7 is equal
1094:30 going to say data frame 7 is equal to and then we'll run data frame 7 again
1094:35 to and then we'll run data frame 7 again so now that has actually changed that
1094:37 so now that has actually changed that value now I'm going to go through and
1094:39 value now I'm going to go through and I'm going to update that for every
1094:41 I'm going to update that for every single one all right so I basically just
1094:43 single one all right so I basically just put the other ones um in here that we
1094:45 put the other ones um in here that we wanted to change with commas afterneath
1094:48 wanted to change with commas afterneath so I have 24 hours comma with the seven
1094:50 so I have 24 hours comma with the seven days 30 days 60 days 90 days and then
1094:53 days 30 days 60 days 90 days and then this bracket over here which tells uh it
1094:56 this bracket over here which tells uh it what to change it do 24 7 days 30 days
1094:59 what to change it do 24 7 days 30 days 60 days 90 days so let's run this I
1095:02 60 days 90 days so let's run this I haven't even tried it yet uh and it
1095:05 haven't even tried it yet uh and it looks like it obviously worked properly
1095:07 looks like it obviously worked properly so now let's go back down here and let's
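A sketch of the relabeling step with two lists, one of old values and one of replacements. The long `quote.USD.percent_change_*` labels and the short `1h`/`24h` replacements are assumptions for illustration.

```python
import pandas as pd

# Hypothetical long labels left over from the original column names.
df7 = pd.DataFrame({
    "name": ["Bitcoin", "Bitcoin"],
    "percent_change": ["quote.USD.percent_change_1h",
                       "quote.USD.percent_change_24h"],
    "values": [0.2, 2.0],
})

old = ["quote.USD.percent_change_1h", "quote.USD.percent_change_24h"]
new = ["1h", "24h"]

# replace() returns a new Series, so the result must be assigned back
# to the column for the change to stick.
df7["percent_change"] = df7["percent_change"].replace(old, new)

print(df7)
```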
1095:10 so now let's go back down here and let's run this
1095:11 run this again and look at that it looks so much
1095:15 again and look at that it looks so much cleaner so much nicer um and as you I
1095:19 cleaner so much nicer um and as you I mean all of them with that 1 hour change
1095:21 mean all of them with that 1 hour change has very little change and then you can
1095:24 has very little change and then you can look back so we can see back within 90
1095:26 look back so we can see back within 90 days it's gone a lot of these have gone
1095:28 days it's gone a lot of these have gone down which again if you're following
1095:30 down which again if you're following crypto you know there's a big crash
1095:32 crypto you know there's a big crash recently um especially with with you
1095:34 recently um especially with with you know all these altcoins um that you're
1095:36 know all these altcoins um that you're seeing right here went down a ton so I
1095:39 seeing right here went down a ton so I think this is um Avalanche or die or
1095:42 think this is um Avalanche or die or whatever these ones are you know went
1095:44 whatever these ones are you know went down dramatically whereas there's one up
1095:47 down dramatically whereas there's one up here this Lone Wolf um that's just
1095:50 here this Lone Wolf um that's just that's just did do really well for
1095:51 that's just did do really well for whatever reason so it's really
1095:53 whatever reason so it's really interesting um to see now this is a
1095:55 interesting um to see now this is a pretty specific um visualization that I
1096:00 pretty specific um visualization that I personally wanted to see and I thought
1096:01 personally wanted to see and I thought was interesting you can do absolutely
1096:04 was interesting you can do absolutely whatever you want to do with this data I
1096:06 whatever you want to do with this data I mean there's so much here you can do a
1096:08 mean there's so much here you can do a lot I mean a lot with this data
1096:11 lot I mean a lot with this data especially depending on how long you
1096:13 especially depending on how long you track it right I only did this over the
1096:15 track it right I only did this over the course of like five minutes but if you
1096:17 course of like five minutes but if you set this up um and you can track it over
1096:20 set this up um and you can track it over a longer time
1096:22 a longer time now um let's say you wanted to do
1096:25 now um let's say you wanted to do something much simpler uh you just
1096:27 something much simpler uh you just wanted to look at like Bitcoin over that
1096:30 wanted to look at like Bitcoin over that time that you you know uh uh took the
1096:32 time that you you know uh uh took the data in that's going to be a lot simpler
1096:34 data in that's going to be a lot simpler than what we just did and I'll show you
1096:36 than what we just did and I'll show you how to do that really quickly so we're
1096:37 how to do that really quickly so we're going to look at the data frame and we
1096:40 going to look at the data frame and we are going to say uh or we're going to
1096:42 are going to say uh or we're going to take specific columns we just want um a
1096:46 take specific columns we just want um a few columns that we want to keep or or
1096:49 few columns that we want to keep or or pull from so we're going to take uh oops
1096:51 pull from so we're going to take uh oops we're going to take the name
1096:54 we're going to take the name column we're going to do
1096:57 column we're going to do uh might be easier if I copy them but
1096:59 I'm just going to write them out
1097:02 quote.USD.price this is the price of the
1097:06 actual
1097:07 cryptocurrency then we're going to
1097:09 do
1097:12 timestamp and let's make this a data frame and
1097:16 we're just going to call it df10 for absolutely
1097:18 no
1097:21 reason uh maybe made at n it would have been easier so now we just have these um
1097:25 been easier so now we just have these um these columns and you know we have all
1097:27 these columns and you know we have all these separate columns so what we can do
1097:31 these separate columns so what we can do and the re kind of the reason I want to
1097:32 and the re kind of the reason I want to show you this is you can just query this
1097:34 show you this is you can just query this really quickly and just take the columns
1097:37 really quickly and just take the columns that you want so let's say we just
1097:38 that you want so let's say we just wanted to look at Bitcoin so we're going
1097:40 to say
1097:43 df10.query and open parenthesis and we're
1097:46 going to say name and a single equals sign is
1097:50 not right here when you're doing it
1097:52 like this you need to say equal equal
1097:56 so name equal equal
1098:01 Bitcoin and we're going to do it just like
1098:03 that and we're going to say
1098:05 df10 is equal to let's try running that I
1098:11 think something's wrong with it try it like
1098:13 like this oops all right let's try that there
1098:17 this oops all right let's try that there we go it was just the I needed a double
1098:19 we go it was just the I needed a double quotation instead of a single quotation
1098:21 quotation instead of a single quotation that was the issue so now we have
1098:23 that was the issue so now we have Bitcoin we have the price and we have
1098:24 Bitcoin we have the price and we have these time stamps so this is the actual
1098:26 these time stamps so this is the actual time when we ran it so this is the
1098:28 time when we ran it so this is the original data frame and then in the you
1098:30 original data frame and then in the you know this this project it took me 15
1098:32 know this this project it took me 15 more minutes to get this one and then we
1098:33 more minutes to get this one and then we had it running properly for the next
1098:35 had it running properly for the next five minutes so that's you know that's
1098:37 five minutes so that's you know that's actually what we have now if we want to
1098:40 actually what we have now if we want to just visualize this really simply what
1098:43 just visualize this really simply what we can do is we're going to
1098:45 say we're going to do sns.lineplot
1098:51 and that's going to be like a little line chart or line graph what whatever
1098:54 line chart or line graph what whatever you want to call it and then we're going
1098:56 you want to call it and then we're going to say x is equal to and we'll say
1099:01 to say x is equal to and we'll say quote no actually we wanted the time
1099:04 quote no actually we wanted the time stamp to be on the x-axis um and then
1099:07 we'll do y is equal to
1099:14 quote.USD.price and let's see if that
1099:19 works could not interpret timestamp for
1099:22 the
1099:24 parameter that's because it's not
1099:28 understanding that the
1099:30 data equals df10 now let's try
1099:38 this all right so this is uh looks terrible let
1099:40 terrible let me me just say SNS doet underscore
1099:46 me me just say SNS doet underscore theme and open parentheses we'll do
1099:49 theme and open parentheses we'll do style is equal to dark
1099:54 style is equal to dark grid this looks a little better now
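A minimal sketch of the plotting step just described, assuming the appended pulls live in a pandas DataFrame with a `timestamp` column and a `quote.USD.price` column. The sample rows and the helper names `make_sample_rows` and `plot_price_over_time` are made up for illustration, and pandas and seaborn need to be installed for the plotting function to run.

```python
from datetime import datetime, timedelta


def make_sample_rows(start, prices, step_minutes=5):
    """Build hypothetical rows: one price per pull, spaced step_minutes apart."""
    return [
        {"timestamp": start + timedelta(minutes=i * step_minutes),
         "quote.USD.price": price}
        for i, price in enumerate(prices)
    ]


def plot_price_over_time(rows):
    """Draw price against timestamp as a line chart and return the Axes."""
    # Imported here so the sample-data helper works without the plotting stack.
    import pandas as pd
    import seaborn as sns

    df = pd.DataFrame(rows)
    sns.set_theme(style="darkgrid")  # the style mentioned in the video
    # Passing data=df lets seaborn resolve the column names given for x and y.
    return sns.lineplot(x="timestamp", y="quote.USD.price", data=df)


# Made-up prices, only there to give the chart something to draw.
rows = make_sample_rows(datetime(2023, 1, 1, 12, 0), [16500.0, 16512.5, 16498.2])
```

Calling `plot_price_over_time(rows)` in a notebook cell should draw the line chart. Leaving out `data=df` is what triggers the "could not interpret" error, because seaborn then has no table to look the column names up in.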
1099:58 again we are looking at just a very very
1100:01 short time series but we can look at
1100:06 just Bitcoin or we could look at
1100:08 multiple and we're showing
1100:11 this line that's showing us the
1100:12 trajectory over time so you can get
1100:15 really creative with this you can run
1100:16 this for a long time you can show
1100:18 Bitcoin over days weeks or months
1100:21 however long you run this and so that's
1100:23 really all I've got honestly like I
1100:26 said I wouldn't say this
1100:27 is a complete full project but I'm
1100:30 showing you how to do something to
1100:31 enable you to
1100:33 run with the ball and do basically
1100:35 whatever you want with this you can pull
1100:38 data from a different
1100:40 API or you can use this exact API and data
1100:44 but I wanted to show you just a few
1100:46 things that I initially saw that I might
1100:48 do with the data and you have so
1100:51 much let me go back to this original
1100:53 data
1100:54 frame right we'll use this one right
1100:57 here look at all
1100:59 this data I mean you have so much
1101:02 data actually let's go to this one this
1101:04 one's better you have so much data so
1101:06 many numbers here so many columns
1101:08 that we didn't even look at that you can
1101:11 use and so there's a lot
1101:15 that you can use here and I'm really
1101:17 trying to just set you up so that you
1101:19 can run with it and do whatever you want
1101:22 I could have done a thousand different
1101:23 things here but I tried to just
1101:25 show you two things that you can do with
1101:27 the data that I thought were pretty
1101:28 interesting or simple to do and
1101:32 I want you guys to go out and do
1101:36 something way way better than what I did so I hope that this was helpful I hope
1101:38 that this showed you how to automate
1101:40 that process so you don't have to sit
1101:42 there and click it and append it and do
1101:44 all these different things by hand
1101:47 and hopefully that will be
1101:48 helpful in your future projects so with
1101:51 that being said thank you so much for
1101:53 watching if you made it all the way to the
1101:55 end you guys are fantastic if you like
1101:57 this video be sure to like and subscribe
1101:59 below I'll see you in the next video
1102:12 what's going on everybody welcome
1102:14 back to another video today I'm going to
1102:16 be walking you through how to create
1102:17 your very own portfolio
1102:22 website [Music]
1102:27 now we just completed our data analyst portfolio project series where we walked
1102:29 through four projects in SQL Tableau and
1102:31 Python and so if you have completed
1102:34 those projects you now want to share
1102:35 them with potential employers and I
1102:37 think the best way to do that is to
1102:39 create your own website in just a little
1102:40 bit I'm going to show you two options for
1102:42 how you can actually create your own
1102:43 website the first one is a website
1102:45 builder like wix.com and the second one
1102:48 is hosting your own website through
1102:49 something called GitHub Pages now if you
1102:51 have never created your own website
1102:53 before it can sound a little bit
1102:54 daunting but don't worry I'm going to
1102:56 walk you through every single step of
1102:57 the way from the very start to the very
1102:59 end and once you reach the end you'll
1103:00 have a complete data analyst portfolio
1103:04 website so without further Ado let's jump on my screen and let's get started
1103:06 all right so the website that you're
1103:07 looking at right now is the actual
1103:08 website that we are going to build in
1103:10 this video it is hosted on GitHub
1103:12 Pages or github.io so this is actually
1103:15 being hosted right now by GitHub Pages
1103:17 so if you type this in I'll leave a link
1103:19 in the description if you type this
1103:20 in you will get this page and you can
1103:24 check it out for yourself if you don't
1103:25 want to just watch me look at it so
1103:28 it has this little header and
1103:30 you can write a little bit about
1103:31 yourself and then these are our actual
1103:34 projects so this is our data cleaning in
1103:36 SQL project and then there's the
1103:38 covid data exploration Tableau
1103:40 dashboards movie correlation with Python
1103:43 this is a future video I plan on
1103:45 doing a few more of these projects
1103:47 because I just really enjoy them
1103:50 and then there's this contact
1103:52 information at the bottom so it's a
1103:54 really
1103:55 simple website and it gets the point
1103:58 across and I have something similar
1104:01 to this for my own personal one I use
1104:03 a different variation but this all
1104:06 comes from this website HTML5 UP there
1104:10 are lots of templates lots of options
1104:12 that you can use again the one we're
1104:14 going to be working with is this one but
1104:19 I use a different one for mine and they are really good
1104:21 I mean super easy to build and customize
1104:25 yourself and I will say again I have no
1104:28 experience doing this I just watched a
1104:30 YouTube video that showed me how to do
1104:32 this and now I am creating my own
1104:34 YouTube video to show you how to do this
1104:36 so it's coming pretty much full
1104:38 circle so like I said there's no real
1104:41 narrative to it it just clicks through to your
1104:42 project if you click on this and
1104:44 let's just open a new tab it'll take you
1104:47 right to the GitHub project
1104:49 and then whoever is checking
1104:51 this out like an employer or a
1104:53 recruiter can see your code so super
1104:56 simple another way that you can do this
1104:59 is creating your own website
1105:01 through a template or something
1105:03 like that almost like a blog style so
1105:06 I imagine it being something very
1105:08 similar to this where there's this
1105:10 introduction and you can talk about
1105:12 where you got the data set how you
1105:13 got the data and then you can
1105:16 have a more narrative approach with
1105:19 screenshots and with some code as well
1105:21 so this person included
1105:23 screenshots and then there's the code
1105:25 right here that I can actually copy
1105:28 and paste and it just walks through
1105:30 the logic of how the project was done
1105:34 there's a story to it really and so that
1105:38 might be something that you're interested in now I have done something
1105:40 like this in the past and I used Wix and
1105:42 you can do this completely for
1105:44 free the one we're doing today is
1105:46 completely free as well but if
1105:48 you want
1105:51 the customized URL you do have to pay
1105:54 for it on Wix but you can get a free Wix
1105:57 website with Wix in the URL so
1106:00 try this out these are super
1106:03 easy you can find thousands of templates
1106:05 and a million tutorials on how to do
1106:06 them so that's not the one we're
1106:08 going to be working on today so with
1106:11 that being said the very first
1106:13 thing that we need to do before we do
1106:15 anything is actually download Visual
1106:18 Studio Code this is where we're going to
1106:20 open that HTML and we're going to be
1106:22 working with it in there again I
1106:25 don't know if I said this before but it
1106:27 seems a little bit intimidating at first
1106:31 but once we actually start looking at it it's a lot easier than it looks I
1106:33 promise you so if you are like me and you
1106:35 have a Windows computer you'll just go
1106:38 right here you'll install it super
1106:40 easy to install I'm not going to walk
1106:41 you through how to do that of course
1106:43 I already have it up and running down
1106:46 here so once you have that installed
1106:49 what you're going to do is come to
1106:50 this website a link should be in the
1106:52 description we are going to download
1106:55 this all you have to click is the free
1106:57 download it's going to pop up I'm going
1106:59 to put it in my downloads I'm going to click
1107:03 save
1107:05 fantastic so let's go to the
1107:07 downloads and it should be right here
1107:09 now if we open this up it has a few
1107:11 different things in it okay so I'm
1107:14 using the Brave browser so that's going
1107:16 to be right here so that's the
1107:17 symbol but for you if you're using
1107:19 Google Chrome that should be the symbol
1107:21 there as well but this is everything
1107:24 that you should be seeing and what we
1107:26 want to do is we want to take it out of
1107:27 this zip folder because
1107:31 there are things that can read into it with
1107:34 Visual Studio Code but I want to make
1107:35 this as user friendly as I possibly can
1107:38 so what we're going to do is we're going
1107:40 to create a new folder and I'm just
1107:42 going to call it massively or you can
1107:44 call it Port website whatever you
1107:49 want to call it I'm just going to do Port website
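The unzip-into-a-new-folder step can also be done from Python. This is a minimal sketch and not what the video does (the video copies the files by hand in the file explorer); `extract_template` and the zip file name in the usage note are hypothetical, and `Port website` is the folder name chosen above.

```python
import zipfile
from pathlib import Path


def extract_template(zip_path, dest_dir):
    """Unpack every file in the zip into dest_dir and return the extracted names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the new folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        # Extracting leaves the zip itself untouched, like copying instead of cutting.
        archive.extractall(dest)
        return sorted(archive.namelist())
```

For example, `extract_template("Downloads/template.zip", "Port website")` would leave an `index.html` and the image and asset folders inside `Port website`, assuming the downloaded zip is named `template.zip`.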
1107:51 and I'm going to
1107:53 copy this in I'm not going to cut it
1107:55 just in case I make a mistake so I'm going
1107:58 to put all of those
1108:01 things in
1108:02 here and now what we're going to do is
1108:05 we're going to go to Visual Studio Code
1108:08 right here and you should be greeted
1108:11 with this right here and we're
1108:13 just going to click open folder and
1108:15 we're going to go to Port website and
1108:17 we're going to click select
1108:18 folder and you're going to say yes I
1108:21 trust this one and right over here are
1108:24 all of the documents that we were just
1108:26 looking at now the only one
1108:29 really that we're going to be working in
1108:31 we'll work a little bit in the images
1108:33 because I'll show you how to add your
1108:34 own images but really the only one we're
1108:37 going to be working in is this index so
1108:41 again it looks complicated if you've
1108:44 never looked at HTML before it does
1108:46 look a little bit complicated but HTML
1108:49 to me
1108:50 is one of the more easily understood
1108:53 languages once you start
1108:55 getting into it which we're about to
1108:58 we're going to walk through the entire process it actually makes a lot of sense
1109:00 and it is pretty simple something
1109:03 that you're going to want is
1109:05 something called Live Server so
1109:07 if I click right here and I click
1109:10 open with Live Server you don't have it
1109:11 yet I'm guessing unless you've done this
1109:13 before it's going to open up this
1109:15 website and this is what we're looking
1109:17 at right now so it has a bunch of
1109:20 gibberish or some language that I do not
1109:23 know and so we can view this live in
1109:27 just a second I'm going to take myself
1109:30 off screen but before I do that let's
1109:33 search for that
1109:39 live I think it's called live share or
1109:42 live server let me see what this is
1109:45 called yeah Live Server so come right
1109:48 here it's called Live Server there
1109:50 it is yeah that's the one so this is
1109:53 Live Server you just need to click
1109:54 install it takes like 5 seconds and it
1109:56 should be completely installed what
1109:59 this does is it just hosts a local
1110:02 website it's not something that anybody
1110:03 can access but it connects to your
1110:06 code and when we make updates
1110:08 you can see
1110:09 those updates live so I'll show you all
1110:14 that in a second just be sure to um be sure to download that or install that uh
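Live Server is a Visual Studio Code extension. To show the idea of a local-only preview, here is a rough sketch using Python's built-in `http.server` module instead; it is a stand-in, not how the extension works internally. `start_local_preview` and `preview_url` are made-up names, and unlike Live Server this version does not refresh the browser on save, so you reload the page yourself.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def start_local_preview(folder, port=0):
    """Serve `folder` on 127.0.0.1 in a background thread; port=0 picks a free port."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(folder))
    # Binding to 127.0.0.1 keeps the site reachable from this machine only.
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def preview_url(server):
    """Return the address of index.html on the running preview server."""
    host, port = server.server_address[:2]
    return f"http://{host}:{port}/index.html"
```

With the folder from earlier, `server = start_local_preview("Port website")` followed by opening `preview_url(server)` in a browser shows the page. Call `server.shutdown()` and `server.server_close()` when you are done.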
1110:16 with that being said let's get out of
1110:19 this let's go back right
1110:21 here I am going
1110:23 to take myself off screen so that you
1110:25 can see everything that I am seeing as
1110:27 well it's been really great seeing
1110:29 you I have lots of different videos coming
1110:33 up lots of new projects I
1110:36 really enjoyed this project series I
1110:38 think I'm just going to do more of them
1110:39 all right I'm going to get myself off
1110:41 screen so let's look at what we actually
1110:45 need to do
1110:50 so let me see okay so we're already
1110:51 connected to the
1110:53 live server actually I got rid of it
1110:56 whoops let's pull this over and let's
1111:00 pull
1111:01 that and we're going
1111:04 to open in Live Server so if we look
1111:09 right over here and I know this is going to
1111:10 be a little bit squished and I'm sorry
1111:12 about that but if we look right over
1111:16 here this says this is massively so
1111:19 you can change that that's this
1111:21 right here and we're going
1111:23 to say Alex the analyst portfolio and
1111:27 we'll get rid of this massively I'm
1111:29 going to hit Ctrl+S to save you can also go
1111:31 up here and hit save but I'm going to
1111:34 hit Ctrl+S and just
1111:38 like that it updates on the website now
1111:41 again this is just local so it's
1111:43 nothing that anybody can see so don't
1111:47 worry but what we're going to do is I'm going to walk you through the entire
1111:49 process of creating this and then at the
1111:51 end I will show you how to host it on
1111:53 GitHub and honestly it's a
1111:56 fairly easy process it just takes a
1111:58 little bit of time to customize it all
1112:00 so let's get into it so we have this
1112:04 you may not be able to see it let me
1112:05 actually pull this up so it says
1112:06 Massively by HTML5 UP we're going to
1112:08 customize that as well
1112:11 whoops I don't want to do that every
1112:12 single time I'm going to try not to
1112:14 go full screen and go back and everything like
1112:16 that so we're just going to say Alex the
1112:18 analyst
1112:25 portfolio Ctrl+S and right up here that changed it you may not be able to
1112:26 see yeah don't ask me that again thank
1112:27 you right up here you probably can't
1112:29 see at the moment we'll see that later
1112:31 but it customizes this tab
1112:35 which is really cool so let's go right down here now
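The text shown on the browser tab comes from the page's `<title>` element. As a small sketch of that edit done in code rather than by typing in the editor, the hypothetical helper below swaps the title text in an HTML string. The sample markup assumes the template's title reads `Massively by HTML5 UP`.

```python
import re


def set_page_title(html, new_title):
    """Return html with the text inside the first <title> element replaced."""
    pattern = re.compile(r"(<title[^>]*>).*?(</title>)", re.IGNORECASE | re.DOTALL)
    # A function replacement keeps backslashes in new_title from being read as escapes.
    return pattern.sub(lambda m: m.group(1) + new_title + m.group(2), html, count=1)


# Assumed starting markup, shortened to just the part that matters here.
sample = "<html><head><title>Massively by HTML5 UP</title></head><body></body></html>"
updated = set_page_title(sample, "Alex the analyst portfolio")
```

If the page has no `<title>` element the string comes back unchanged, so the helper is safe to run on any file.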
1112:39 this is where it says a free fully
1112:41 responsive HTML5 template we can
1112:47 customize that and I highly encourage
1112:49 you to do so and they
1112:53 actually included their Twitter handle
1112:56 right here and you can do the same if
1112:58 you look at this one right here I
1113:00 included my Alex the analyst handle
1113:03 that goes to my YouTube channel and you
1113:04 can do the exact same thing include
1113:06 your LinkedIn or your GitHub profile or
1113:08 whatever you want to include in there
1113:11 so be aware that you can do
1113:14 that so let's say oops I need to
1113:18 click back in here
1113:20 so we're going to
1113:22 say
1113:23 data analyst skilled in and then
1113:28 again don't write what I'm writing
1113:30 I'm just going to make it
1113:32 really simple but this part is
1113:35 meant to be a little bit about you
1113:37 who you are so I'm going to say data
1113:38 analyst skilled in
1113:40 SQL Tableau and
1113:44 Python and then I'm just going to get
1113:47 rid of all of this
1113:50 yep everything from here
1113:54 over and Ctrl+S
1113:56 and so super simple actually let me
1113:59 where was that
1114:01 here it is we don't need that
1114:06 actually we don't need anything from
1114:09 here
1114:11 over probably here honestly see what
1114:14 that looks like and yeah
1114:16 again you can use any website right here
1114:20 that you want and you can customize what it looks like
1114:21 so I'm going to say Alex the analyst
1114:24 and then whatever URL you want to
1114:26 include in there that's what you need to
1114:28 put so now if I hit
1114:30 Ctrl+S now it says Alex the
1114:33 analyst so pretty
1114:36 easy now we're going to go down and you
1114:40 can use this however you want to use it
1114:42 you can even make this
1114:45 like one of
1114:48 your readmes like an about you and put the link
1114:50 for that again
1114:54 on this one I decided to include the
1114:56 project that we've done
1114:58 that I thought was the most impressive or
1115:01 I don't know the coolest one I don't
1115:03 know if you consider data cleaning in
1115:05 SQL cool but I do I think it's cool so
1115:09 I included that one as my very first one
1115:10 so that's what we're going to do
1115:12 right here so we're going to go down and
1115:16 it's going to
1115:18 say
1115:20 let's see it says this is massively
1115:22 that's not
1115:23 it cool so let's see what oh okay I
1115:26 know what that is we'll come back to
1115:27 this up here in just a little bit I'm
1115:30 going to go full screen I'll show you
1115:31 what this is and then we'll come back to
1115:33 it but if we go right down here this is
1115:38 our what they're calling a featured post and then the ones below this are posts
1115:41 and then the ones below this are posts so in our featured post um I'm going to
1115:44 so in our featured post um I'm going to get rid of the date I don't want them to
1115:45 get rid of the date I don't want them to know that I just created it like um I
1115:49 know that I just created it like um I don't know oops I keep doing uh control
1115:52 don't know oops I keep doing uh control a selecting everything whoops so we're
1115:55 a selecting everything whoops so we're going to say um data cleaning in
1116:00 going to say um data cleaning in SQL and we'll get rid of
1116:03 SQL and we'll get rid of this and contrl S again I'm just
1116:06 updating it a lot so that you see what
1116:09 I'm doing and where it's going and we're
1116:11 going to get rid of basically all of
1116:13 this and go back and we're just going to
1116:16 say in this project we clean data in
1116:21 we clean let's do we clean housing data
1116:24 in SQL
1116:26 Server and Ctrl+S so super easy again
1116:30 uh give a little bit more description I
1116:31 did in my other one um and you have the
1116:33 you have you can see that website so go
1116:35 check it out and then we'll have an
1116:37 image and I'm going to show you um at
1116:40 the end we're going to go back and redo
1116:41 all the images but I'm not going to do
1116:44 that at this very
1116:45 moment um
1116:48 so what
1116:49 now you can have this full story I chose
1116:52 to do view
1116:59 project and I hit Ctrl+S it says view project I think that just looks better
1117:00 especially if you're displaying a
1117:02 project I think it is nice uh now we go
1117:05 into all the individual posts um
1117:07 actually no wait what I want I want to
1117:10 show you really quick is how you
1117:11 actually link it to this so let's go
1117:14 right over here this is our COVID uh that's
1117:17 our COVID one here's a data cleaning
1117:19 project so all you have to do is take um
1117:24 take this website so that's the URL and
1117:27 you're going to put it right here now
1117:29 there's three different places this href
1117:31 is places are places where you can put a
1117:33 link to a website um and on here it
1117:37 references this right here so you can
1117:39 they can click on this data cleaning in
1117:41 SQL they can click on the image um as
1117:44 because you know this href is right next
1117:46 to this image they can also click on the
1117:49 view project button so you can put it in
1117:52 all three um and you'll just go like
1117:54 this you'll you'll stick the URL right
1117:57 where that um hashtag or pound sign
1118:01 is and then we're going to save that
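The href edit described above can be sketched in Python, the language used elsewhere in this boot camp. This is only an illustration under assumptions: the markup below is a hypothetical stand-in for the template's featured post (its tags and class names are not taken from the video), and PROJECT_URL is a placeholder address. It shows the same idea as the manual edit, where each of the three placeholder links (href="#") is pointed at the project.

```python
# Hypothetical featured-post markup, loosely modelled on the template described above.
# The class names and the image file name are assumptions for illustration only.
FEATURED_POST = """\
<article class="post featured">
  <header class="major">
    <h2><a href="#">Data Cleaning in SQL</a></h2>
    <p>In this project we clean housing data in SQL Server.</p>
  </header>
  <a href="#" class="image main"><img src="images/housing.jpg" alt="" /></a>
  <ul class="actions special">
    <li><a href="#" class="button large">View Project</a></li>
  </ul>
</article>
"""

# Placeholder URL; swap in the address of your own project.
PROJECT_URL = "https://github.com/your-username/your-repo"


def link_post(html: str, url: str) -> str:
    """Point every placeholder link (href="#") in the snippet at url."""
    return html.replace('href="#"', f'href="{url}"')


linked = link_post(FEATURED_POST, PROJECT_URL)
print(linked.count(PROJECT_URL))  # title, image and button links
```

Doing it by hand in the editor, as in the video, is just as good; the sketch only makes explicit that the title, the image and the button each carry their own href.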
1118:05 oops oh I I this is embarrassing I am
1118:08 not a website I am not a web developer
1118:10 as you can see um but then if I go in
1118:13 here and I right click and I say open
1118:15 link it is going to take me to that
1118:18 project so super super simple and we're
1118:19 going to do basically that for all of
1118:21 these um I'm only going to show you
1118:23 three and then you can do the rest but I
1118:25 want to show you how to also do the um
1118:27 put the Tableau it's the exact same
1118:29 thing but you know it's different so
1118:31 wanted to show it to you so the next one
1118:35 that we're going to do is go down to
1118:37 posts
1118:38 and again I'm going to get rid of this
1118:40 date you can keep that in there if you
1118:41 want excuse me and that's totally fine
1118:44 just
1118:45 update the date um this is that said mag
1118:49 again I think this might be like some
1118:50 language that I just don't know about um
1118:52 the next one is data
1118:55 exploration in
1118:57 SQL and I'm going to get rid of
1119:00 this and we'll save that
1119:03 perfect and we'll do view
1119:11 project cool and yeah so now we need to um
1119:16 customize this summary and so I'm just
1119:19 going to say something really simple um
1119:23 data exploration of
1119:37 Server there we go let's save that we have view project now let's go get our
1119:39 project so this is the data exploration
1119:43 we're going to take this we're going to
1119:45 copy it and we're going to put it right
1119:46 in
1119:48 here and right in here as well and if
1119:52 you want to you can also include it
1119:54 right up here so we have it in all three
1119:56 places uh again once you click on these
1120:00 they will come up let's go to the next
1120:03 one we're going to get rid of
1120:06 this this one is going to be our Tableau
1120:09 projects so actually let me just copy
1120:11 that while we're here this is going to
1120:12 be our Tableau projects so if you have
1120:15 one specific project that you want to
1120:17 include what you need to do is actually
1120:20 go in here click view grab that URL what
1120:24 I am doing is I am just sharing my
1120:26 Tableau Public page so if you have tons
1120:29 of projects in here and um you want to
1120:32 display all of them then or you want
1120:35 them to be able to see all of them and
1120:37 go and pick and see and choose what they
1120:38 want to look at then just choose this
1120:40 URL that we're choosing right here so um
1120:44 in here on in the um HTML we're going to
1120:47 put I'm going to put Tableau
1120:53 projects and let's go like
1120:56 this and then we will get rid of uh that
1121:01 hashtag pound sign whatever you want to
1121:03 call
1121:04 it and we'll hit Ctrl+S and oh we got
1121:08 to do the
1121:10 um this as
1121:18 well this is my this is going to be a terrible don't use this this is my
1121:21 Tableau this holds I'm just this is bad
1121:24 this holds all of my
1121:27 Tableau
1121:29 dashboards don't please don't do this um
1121:32 I am doing this because I don't want to
1121:35 take forever in a video to make it
1121:37 perfect um and then you know you're
1121:39 going to do the exact same thing so in
1121:41 this one right here I included four so
1121:44 I'm going to keep
1121:45 four
1121:47 um let me do the
1121:49 no I'm just going to do these three I'm
1121:51 not gonna take up more of our time um so
1121:56 we did those I'm just going to keep
1121:57 these three in for visual purposes but
1122:01 once you get down here um you know what
1122:04 we're going to do is delete some of this
1122:06 right so we this is our data
1122:08 exploration and where's our
1122:11 Tableau this is our Tableau right here
1122:14 so Tableau projects they're separated by
1122:16 these articles so what we're going to do
1122:17 is go around right here and we're going
1122:19 to go down down down down to right here
1122:22 this is going to get rid of all these
1122:24 other articles or all these other what
1122:26 they're calling um posts so we're going
1122:29 to get rid of those and we're going to
1122:31 hit
1122:32 save and now as you can see we have our
1122:35 header we have our first project and we
1122:38 have our second and our third I would
1122:40 include those other projects that we've
1122:42 done in here so that it looks good this
1122:45 is this footer right here we don't need
1122:47 that because we don't have any
1122:49 um anything else in there so we're going
1122:51 to get rid of that as well and now we
1122:53 just have this information
1122:55 now I don't have anything where they can
1122:59 do the name email message or you can
1123:00 keep that in there if you'd like um but
1123:03 I am going to get rid of this so we're
1123:05 going to go right here that's the
1123:07 section so don't delete the section we
1123:10 want that I'm going to delete this
1123:11 footer section as what they're calling
1123:13 it and now we have this address phone
1123:17 email social um and I'm going to get to the
1123:19 social in just a second it's again super
1123:22 easy but for the address I just put
1123:25 location I don't want to give somebody
1123:26 my address or put it on a website
1123:27 anywhere um it's not something I want to
1123:30 do so what we're going to do is just put
1123:33 I'm going to put
1123:34 Dallas and Texas and we can keep it like
1123:37 that and we'll hit oops we'll hit save
1123:41 and it'll have Dallas Texas um hate the
1123:44 look of the zeros 6 7 8 9 0 so we're
1123:48 going we're going to do that phone
1123:50 number
1123:52 two3
1123:54 56
1123:55 7890 and then email and we'll
1123:59 put Alex the analyst 95@gmail.com
1124:05 if you have issues with this um you can
1124:08 email me
1124:10 but I'll try I will try to respond to
1124:13 all your emails I get a lot um so I will
1124:16 do my best but that is my actual email
1124:18 if you are curious
1124:19 now um now that we have this we also
1124:23 have these the social media now I want
1124:26 to display my LinkedIn and I also want
1124:29 to display my GitHub so what I'm going
1124:32 to do right here is I'm going to go over
1124:34 here and do
1124:40 LinkedIn perfect let's go to this so I'm going to take my LinkedIn
1124:48 URL and I am going to get rid of these first two because I'm only going to
1124:50 include two and for this one I'm going
1124:53 to do uh
1124:55 LinkedIn oops linkedin and then for
1125:01 right here I'm going to replace that
1125:03 with linkedin
1125:05 and what you're going to do is put
1125:08 this link right here and then we're
1125:11 going to go get get the
1125:13 GitHub so let's do GitHub oh who is this
1125:17 sign up what is going on
1125:20 um I don't there let's just go back here
1125:24 I that was some I was like viewing a
1125:26 while back or something um so we're
1125:28 going to take the GitHub and we're going
1125:30 to put that right here so it already has
1125:34 it as um the GitHub is this supposed to
1125:37 be
1125:39 lowercase I think it is let me see if
1125:42 this is lowercased as well yeah um so do
1125:45 it like that do it lowercased um I
1125:47 forgot that that was how they did it
1125:51 um and oh that's the label that doesn't
1125:53 matter as much but this right here is
1125:54 the class is actually the important part
1125:56 because then when we go back here there
1125:59 is no LinkedIn image but when we save it
1126:02 oops when we save it it has the LinkedIn
1126:05 image because it's already a class that
1126:06 was created in this HTML um
1126:10 template so we have that um and let me
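To make the point about the class versus the label concrete, here is a small Python sketch. It rests on assumptions: the "icon brands fa-..." class pattern is a guess at how an icon-font template might name its classes, and both profile URLs are placeholders. Check the class names your own template actually ships with. The idea it illustrates is the one described above: the lowercase class selects the icon, while the label is only descriptive text.

```python
def social_icon(label: str, url: str) -> str:
    """Build one list item for a social link.

    The class name is derived from the label in lowercase, because the
    class (not the visible label) is what selects the icon.
    """
    css_name = label.lower()
    return (
        f'<li><a href="{url}" class="icon brands fa-{css_name}">'
        f'<span class="label">{label}</span></a></li>'
    )


# Placeholder profile URLs; replace them with your own.
links = [
    ("LinkedIn", "https://www.linkedin.com/in/your-profile"),
    ("GitHub", "https://github.com/your-username"),
]

icons_html = "\n".join(social_icon(label, url) for label, url in links)
print(icons_html)
```

If the icon does not show up after saving, the first thing to check is whether the class is spelled exactly as the template's stylesheet defines it.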
1126:14 bring this full screen really quick
1126:16 because there are a few things that we
1126:17 couldn't see in that that screen these
1126:20 right here are things that we could not
1126:22 see before um and these as well so what
1126:28 we can do is we're going to go down here
1126:29 we're just going to copy these social
1126:31 we're going to replace them right here
1126:33 so they can have those and then we're
1126:34 going to get rid of these two right here
1126:36 and this says this is massively um and
1126:38 we're going to change that as well let's
1126:40 make this full screen for the first time
1126:42 feels good um I hate doing split screen
1126:45 but I do it for you guys um so this is
1126:50 massively and we're just going to put
1126:52 we're just going to get rid of these two
1126:54 this is um it's called the navigator the
1126:57 the different tabs we're going to get
1126:58 rid of those two tabs and then for this
1127:00 I'm just going to call it
1127:02 projects and I'll once I once we go back
1127:05 and update all this then you will um
1127:08 you'll see those
1127:09 changes so let's see so we made those
1127:12 changes here's our social or the social
1127:14 medias uh social media stuff we're going
1127:17 to go and copy copy these
1127:20 two and we're going to replace all of
1127:24 these with
1127:27 this
1127:28 um and let's save that and let's go back
1127:33 so now as you can see those two are gone
1127:35 this says projects there's only two
1127:36 right here and if you click on it it's
1127:38 going to go to my LinkedIn or your
1127:41 LinkedIn when you do it um and this will
1127:44 take you to the GitHub so it is all
1127:48 working as intended this is great um
1127:51 when you scroll down and it says
1127:52 massively we can change that as well and
1127:54 we should let's do that really quick um
1127:58 we'll just
1127:59 say Alex the
1128:01 analyst and we'll update
1128:04 that and there we go so in a nutshell
1128:08 this is the a lot of it um we need
1128:12 images and I don't think I set this up
1128:16 for this video so I'm going to I'm going
1128:18 to like cut myself off for like 2
1128:20 seconds go pull those images in um
1128:23 because it could take like a few minutes
1128:25 I don't want to waste your time and then
1128:26 I'll come back so I'll see you in two
1128:28 seconds all right so I just pulled over
1128:30 the images that we are going to use
1128:32 let's go to the downloads um they're
1128:34 right here they're the housing Tableau
1128:36 and COVID um if I open up this COVID one this
1128:39 is what the image looks like this is
1128:40 what we're going to use for that COVID
1128:43 project so I'm going to copy these I'm
1128:45 going to go into the portfolio website um
1128:47 that we just have I'm going to go to
1128:49 images and I'm going to insert these in
1128:51 here so now that we have those images in
1128:54 here let's go
1128:56 back and let's see what we got so we
1128:59 just put these images in this um you'll
1129:02 have this folder right here and you can
1129:05 open it up and you can see all of these
1129:07 that we have so all we're going to do is
1129:10 go and replace the images these these
1129:12 you know temporary images that they had
1129:15 for us and we should be gold and then
1129:18 we're going to actually upload it to to
1129:20 GitHub and then create our website for
1129:23 free so let's go right down here this is
1129:26 our very first uh one this is our data
1129:29 cleaning in SQL this is with the housing
1129:31 data so this image right over here it
1129:35 says images p1. jpeg so jpeg I don't
1129:39 know why I said it like that so this is
1129:41 the housing so what we're going to do
1129:42 right here is do housing and it'll
1129:45 autocomplete for us um so that housing
1129:47 should be in there now next one is the
1129:50 data exploration in SQL that was with
1129:52 the COVID so we're going to get rid of this
1129:54 we're going to say COVID um because that is
1129:57 the image that I have right over here
1129:59 and then the last one is excuse me
1130:02 Tableau so let's go right over here
1130:04 let's do Tableau
1130:06 let's get rid oh I got to save that
1130:10 uh Ctrl+S perfect and now let's look
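A quick way to catch a mistyped image name after swapping the placeholder images is to check that every img src in the page points at a file that exists. The Python sketch below is one hedged way to do that with the standard library only. The file names housing.jpg, covid.jpg and tableau.jpg are assumptions chosen for the demo, and the temporary folder stands in for the website folder.

```python
from html.parser import HTMLParser
from pathlib import Path
import tempfile


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> also arrive here by default.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_images(html: str, site_root) -> list:
    """Return the image paths referenced in html that are not files under site_root."""
    collector = ImgSrcCollector()
    collector.feed(html)
    return [src for src in collector.sources if not (Path(site_root) / src).is_file()]


# Demo in a throwaway folder: two images exist, one is referenced but missing.
with tempfile.TemporaryDirectory() as root:
    images = Path(root) / "images"
    images.mkdir()
    for name in ("housing.jpg", "covid.jpg"):
        (images / name).touch()
    page = (
        '<img src="images/housing.jpg" alt="" />'
        '<img src="images/covid.jpg" alt="" />'
        '<img src="images/tableau.jpg" alt="" />'
    )
    missing = missing_images(page, root)

print(missing)
```

In real use you would read your own index.html and pass the website folder as site_root; an empty list means every referenced image was found.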
1130:14 at
1130:16 it there you go there you go oh this
1130:18 one still says full story go change that
1130:20 um I'm going to go change it just
1130:23 doesn't feel
1130:24 right uh view project oh that's not how
1130:28 you spell
1130:30 it okay Ctrl+S perfect okay so now
1130:36 this looks a lot better um and when we
1130:39 host it um through GitHub Pages or
1130:42 github.io this is going to be what it
1130:45 looks like I mean it is and you can add
1130:48 a lot more to it you can take away from
1130:50 it you can add as many projects as you
1130:51 want you can keep adding you can copy
1130:53 those articles or those posts and you
1130:55 can just keep adding them um so this is
1130:59 kind of what it's going to look
1131:01 like
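Copying an article block for each new project can also be sketched as a tiny generator. This is a hypothetical illustration, not the template's actual markup: the article structure, class names, image file names and URLs below are all assumptions. It shows the pattern described above, where each post is the same block with a different title, summary, link and image.

```python
from html import escape

# Hypothetical post markup; adjust it to match the article blocks in your template.
ARTICLE_TEMPLATE = """\
<article>
  <header>
    <h2><a href="{url}">{title}</a></h2>
  </header>
  <a href="{url}" class="image fit"><img src="images/{image}" alt="" /></a>
  <p>{summary}</p>
  <ul class="actions special">
    <li><a href="{url}" class="button">View Project</a></li>
  </ul>
</article>"""


def render_post(title: str, summary: str, url: str, image: str) -> str:
    """Fill the article template for one project, escaping text for HTML."""
    return ARTICLE_TEMPLATE.format(
        title=escape(title),
        summary=escape(summary),
        url=escape(url, quote=True),
        image=escape(image, quote=True),
    )


# Placeholder projects; the URLs and image names are made up for the example.
projects = [
    ("Data Exploration in SQL", "Exploring COVID data in SQL Server.",
     "https://github.com/your-username/covid-exploration", "covid.jpg"),
    ("Tableau Projects", "A collection of Tableau dashboards.",
     "https://public.tableau.com/app/profile/your-profile", "tableau.jpg"),
]

posts_html = "\n".join(render_post(*project) for project in projects)
print(posts_html)
```

The printed blocks can be pasted into the posts section of the page. For three or four projects, copying by hand as shown in the video is perfectly reasonable.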
1131:02 and it was not that hard I don't think I
1131:05 hope this was not too difficult I really
1131:07 don't think it is um it's really just
1131:09 using a template and kind of
1131:10 understanding a little basics of HTML so
1131:14 um we are going to take this and we we
1131:16 have this saved already we have this all
1131:19 saved what we are going to do now is
1131:22 upload this to GitHub so let's go right
1131:25 over here let's go to here and let's go
1131:29 to
1131:31 repositories and how do where where's
1131:34 the new one oh I need to sign in okay
1131:37 I'm going to get rid of this part so you
1131:38 can't see it so we are going to say a
1131:41 new
1131:42 repository we're going to call it Alex
1131:45 the analyst
1131:46 2 .
1131:49 github.io so we're going to write it
1131:51 just like that you know if your name's
1131:54 um
1131:55 Alex Jimmy I don't know why I said Jimmy
1132:00 Alex Jimmy Alex jimmy. github.io you can
1132:03 always go back after the fact and change
1132:05 this so it's not a big deal whether you
1132:07 change it or not and we're going to
1132:10 create this
1132:12 repository we're going to say upload an
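The repository name matters because it decides the address the site ends up at. As far as I know, GitHub Pages serves a repository named exactly like the account plus .github.io at the bare address, and any other repository under a path with the repository's name. The Python sketch below encodes that rule. The account name alexjimmy is just the made-up example from above and the repository name portfolio is hypothetical.

```python
def pages_url(username: str, repo: str) -> str:
    """Return the address GitHub Pages would be expected to use for a repository.

    A repository named <username>.github.io is served at the bare address;
    any other repository is served under a path named after the repository.
    """
    user = username.lower()
    if repo.lower() == f"{user}.github.io":
        return f"https://{user}.github.io/"
    return f"https://{user}.github.io/{repo}/"


user_site = pages_url("alexjimmy", "alexjimmy.github.io")
project_site = pages_url("alexjimmy", "portfolio")
print(user_site)
print(project_site)
```

So if you want the short address, the part before .github.io has to match your GitHub username.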
1132:14 existing
1132:15 file and instead of choosing them what
1132:18 we're going to do is just go right over
1132:20 here go to this and we're just going to
1132:23 copy this in or not copy it in but drag
1132:25 it in okay so we're going to take this
1132:27 drag it in right here and it can take a
1132:29 it'll take a little bit has a 75 but it
1132:32 shouldn't take that
1132:39 long and let's just wait for it I'm taking a sip of water I
1132:41 apologize but it is literally uploading
1132:43 just everything that we had in there so
1132:44 all the updates and all the changes and
1132:46 all the stuff that we um had
1132:48 and it looks like it's done so let's
1132:50 just
1132:51 write initial
1132:53 commit commit
1132:55 changes it is processing
1132:59 it all right and it should be done very
1133:02 very soon as long as I have a good
1133:04 internet
1133:06 connection we shall
1133:16 see stick with me it's taking its time um while while it's loading let's
1133:19 time um while while it's loading let's go over to oh oh there it is so perfect
1133:21 go over to oh oh there it is so perfect so here's everything that we have has
1133:23 so here's everything that we have has this read me that it generated let's
1133:25 this read me that it generated let's over to
1133:27 over to settings and we have this U
1133:31 settings and we have this U github.io and if we go right down here
1133:34 github.io and if we go right down here to GitHub Pages pages settings now has
1133:38 to GitHub Pages pages settings now has its own dedicated tab let's check it out
1133:40 its own dedicated tab let's check it out here so it is
1133:44 here so it is um it's currently disabled but we're
1133:46 um it's currently disabled but we're going to say want it to do pull from the
1133:48 going to say want it to do pull from the main um I think it's the doc we'll see
1133:52 main um I think it's the doc we'll see I'm going to save this your site is
1133:54 I'm going to save this your site is ready to be published let's open this up
1133:57 ready to be published let's open this up okay site not found maybe it's from the
1134:01 okay site not found maybe it's from the root
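The root versus /docs guess here can be checked locally before changing the setting. This is a minimal sketch, assuming you have a local copy of the repository and that the site's entry page is named index.html; find_pages_source is a hypothetical helper made up for illustration, not part of any GitHub tooling, and it only looks at local files.

```python
from pathlib import Path


def find_pages_source(repo_dir: str) -> list[str]:
    """Return the publishing folders ("/" or "/docs") that contain index.html.

    Hypothetical helper for illustration; it only inspects local files
    and does not talk to GitHub.
    """
    repo = Path(repo_dir)
    # The two folder choices discussed in the video: repository root and /docs.
    candidates = {"/": repo, "/docs": repo / "docs"}
    return [name for name, folder in candidates.items()
            if (folder / "index.html").is_file()]
```

Calling find_pages_source(".") from the top of the repository returns a list of the folders that hold an index.html, which is probably the folder to pick in the Pages settings.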
1134:03 save your site is having a build
1134:06 problem let me see if I can actually
1134:09 change the name I already have an Alex
1134:11 analyst but I'm going to see it's already
1134:13 taken I'm just going to try this one
1134:15 one more time oh and now it's working
1134:19 I have no idea why it didn't work
1134:22 before but this is fantastic it was
1134:25 giving me all this maybe I was
1134:27 just reading too much into that I
1134:29 had never tried to create another
1134:32 github.io or GitHub Pages on this so
1134:36 anyways thanks for sticking with me
1134:37 through all that stuff so now we have
1134:41 our actual website it doesn't look
1134:44 the same up here because of that thing
1134:47 that we were just looking at it should
1134:49 just be this part right here but this
1134:52 is an actual website now it's being
1134:54 hosted through GitHub and it's
1134:56 completely free if you want to pay you
1135:00 can hide this from your GitHub your
1135:03 repository has to be public something
1135:05 I didn't mention when you're doing this
1135:08 your repository has to be public if I
1135:12 change the visibility to private you
1135:15 will not be able to see it anymore
1135:18 you'll have to then pay if you want to
1135:19 make this repository private you have to
1135:21 then pay I think it's like $4 a month or
1135:23 something like that so worth looking
1135:25 into if you don't want to display
1135:28 that on your GitHub worth looking into
1135:31 but this is our final product I mean it
1135:34 looks pretty fantastic and you can use
1135:36 any of these templates right there are
1135:38 lots of different templates that are
1135:40 fantastic I mean they look amazing they
1135:43 look professional it's really up to
1135:45 your style like this one looks kind of
1135:48 cool a little bit edgy for my
1135:50 taste but this one looks really good
1135:52 too might be able to add some more
1135:54 narrative to that one so again go
1135:57 through it make a good choice
1136:00 in it and then update it how we updated
1136:03 it I will include the let's see I
1136:08 will include everything that's in here
1136:10 and I'll keep this on this GitHub
1136:13 so you can go in there and if you want
1136:15 to download these images you can
1136:16 download the images that I used or
1136:19 you can go find your own just you
1136:20 know try to get like HD images
1136:23 on Google just type in Google Images and
1136:26 search for whatever image you want to
1136:27 search try to get an HD image with that
1136:30 being said that is the entire project I
1136:32 hope this didn't go too long
1136:34 this may have gone you know this may
1136:37 have gone like 30 to 45 minutes but in the
1136:40 end of it at the end which is
1136:42 where we are now we have an entire
1136:44 website it was completely free and I
1136:46 hope that you can host the projects and
1136:48 you can create more projects I
1136:50 will be coming out with more projects
1136:52 myself that hopefully will be
1136:53 interesting to you in the future so with
1136:56 that being said thank you guys for
1136:58 joining me for those of you who stuck it out to
1137:00 the very end you are fantastic you know
1137:02 post your website on LinkedIn
1137:05 and tag me in it because I love seeing
1137:07 you guys do these projects and this
1137:09 stuff so I'm super excited to see all of
1137:11 these that you guys tag me on on
1137:14 LinkedIn and whatnot so with that being
1137:16 said this is it I hope you learned
1137:18 something I hope that it worked for you
1137:21 and I appreciate you watching be sure to
1137:23 like and subscribe below and I will see
1137:25 you in the next video
1137:38 [Music] goodbye what's going on everybody
1137:40 welcome back to another video today I'm
1137:42 going to help you create a data analyst
1137:47 resume [Music]
1137:52 now when I say data analyst resume it's not that much different than a regular
1137:54 resume except that it's going to be
1137:56 catered for a data analyst job in just a
1137:59 second we're going to take a look on my
1138:00 screen at a sample resume I'll have the
1138:02 template in the description so you can
1138:03 just go and download it and fill in your
1138:05 information but it's a fantastic
1138:07 starting place for actually creating your
1138:08 resume when we're looking at this resume
1138:10 we'll take a look at each section kind
1138:12 of dissect each part of it and then at
1138:14 the very end I'll give some extra tips
1138:16 on what you should include and how to
1138:17 actually write your resume as well so
1138:19 without further ado let's jump onto my
1138:20 screen take a look at the resume and see
1138:22 how you can create your own data analyst
1138:24 resume so here's our sample resume I'm
1138:26 just going to walk through the entire
1138:27 thing super quick and then we'll break
1138:28 down each section individually I'll give
1138:30 my thoughts and some tips on each
1138:32 section and remember you can download
1138:34 this exact thing in the description
1138:36 below I'll have a link I'll probably put
1138:38 it on my GitHub or somewhere else but
1138:39 it'll be free to download so you can
1138:41 go ahead and do that but let's zoom in
1138:43 just a little bit so at the very top we
1138:46 have our header we have some just basic
1138:49 contact information then we have
1138:51 skills then we have projects and notice
1138:53 the projects are up here at the top and
1138:55 we'll get to that later about the order
1138:57 of where you should be putting your
1138:58 things then we have work experience and
1139:01 then we have education so really quickly
1139:04 I'm going to zoom out and I hope you can
1139:06 still see it the order is actually quite
1139:09 important now there is one piece that is
1139:11 not in here right now and that is a
1139:14 summary section I don't have a summary
1139:16 section on my real resume I just don't
1139:19 think it's useful or helpful I don't
1139:21 have one you can include one and it
1139:23 would be right up here at the very top
1139:25 now why do we have the skills and
1139:27 projects at the top well it's because
1139:30 most people who are trying to break
1139:33 into data analytics don't have any
1139:34 experience in data analytics if I am
1139:37 reading this resume as a hiring manager
1139:40 and the first thing that I look up here
1139:41 and I see is experience and it's not
1139:44 analyst it's a teacher or a nurse or
1139:46 something I'm going to be like
1139:47 this person doesn't have any experience
1139:49 I don't want to hire them the first
1139:51 thing that you want to have in your
1139:52 resume is something that is good for the
1139:54 hiring manager to see the first several
1139:56 things you should put all your best
1139:57 stuff at the top that's what I
1139:59 believe so I think that these skills are
1140:03 really strong a lot of great skills and
1140:05 then these projects are all really good
1140:07 projects now this is just a sample these
1140:09 aren't all real projects or they are
1140:11 real projects they're just not you
1140:14 know ones that I built myself it's just
1140:16 a sample so then right here we have
1140:19 our work experience now if you're like I
1140:21 said a nurse or a teacher or a lawyer or
1140:22 something that's not relevant to data
1140:24 analytics you want that at the bottom
1140:26 and then you're going to want to tie in
1140:28 some things in these descriptions and
1140:29 then the education at the bottom my
1140:31 education was terrible okay I had a
1140:33 bachelor's in recreational therapy which
1140:36 had nothing to do with data analytics so
1140:38 for a tech job that was not good I always
1140:40 had mine at the bottom so let's start at
1140:43 the very top and walk through each
1140:45 section so
1140:47 at the very top you want to have maybe a
1140:50 title but for sure your full name you
1140:53 definitely want to include your phone
1140:55 number if you're okay with them calling
1140:56 you but definitely an email for sure
1140:59 include things like a LinkedIn profile
1141:01 or a GitHub profile you can also put
1141:03 your portfolio in fact I highly
1141:05 recommend putting your portfolio because
1141:06 it just looks good and if they check it
1141:08 out that's a really good thing and then
1141:10 your location because sometimes your job is
1141:12 going to be location based whether
1141:13 you're in Dallas or another metropolitan
1141:15 city it's just nice to have that on
1141:18 there this should be the simplest one to
1141:20 fill out unless you haven't built out
1141:22 something like a portfolio then you just
1141:24 don't include it but this one should
1141:26 be the simplest one right you're just
1141:28 putting contact information maybe a link
1141:30 to a website next we have the skill
1141:32 section and this one on my own personal
1141:34 resume I have at the very top I
1141:36 typically recommend anyone who does not
1141:38 have experience who is trying to break
1141:39 into data analytics to put this at the
1141:41 top as well and have these skills and
1141:44 know these skills that's important
1141:46 but when the hiring manager
1141:48 first sees this there's just going
1141:50 to be a mental check okay they have the
1141:52 skills that we're looking for let's move
1141:53 on to the rest of the resume but you
1141:55 want as many mental checks for what
1141:58 they're looking for at the beginning
1142:00 I'm just going to keep
1142:01 repeating that this is how I
1142:04 personally write my skills so I write
1142:07 something like SQL and then I'll say SQL
1142:09 Server MySQL PostgreSQL now I have
1142:12 used all these different types of SQL in
1142:14 my actual job if you haven't
1142:16 done that and you're just starting
1142:17 out maybe you put something like you
1142:21 know
1142:21 subqueries stored procedures joins
1142:24 whatever the actual things within SQL I
1142:27 don't recommend
1142:28 that as much because typically people
1142:32 know what SQL is like if they use SQL
1142:34 they know what SQL is so they're just
1142:35 going to expect that you know those
1142:36 things now for something like Python
1142:39 it's different because there are
1142:40 packages there are
1142:41 packages and libraries within it so
1142:43 you can specify I have worked with
1142:46 pandas in my actual job and I look for
1142:48 people who know pandas as well because
1142:50 you know we use it so actually
1142:52 specifying these packages or libraries
1142:54 is really helpful so this is how I would
1142:57 put these things on a resume now this is
1143:00 another resume this is our sample two
1143:02 I'm going to maybe include this one down
1143:04 below although I don't like this format
1143:06 as much but if you like it you can use it but
1143:09 here's another way that you can show
1143:11 these skills just a different way to do
1143:12 it I want to show you both ways we
1143:15 have like Python and the libraries
1143:16 underneath it I've even seen it where
1143:18 people will write out almost like let
1143:20 me go down here they'll write out like a
1143:22 narrative they'll do
1143:25 Python and then they'll have like a
1143:26 colon and then they'll say used to
1143:32 manipulate data and I'm not spelling
1143:34 that right in pandas dot dot dot and
1143:36 they'll write it out you can do that as
1143:38 well again I like bullet points
1143:41 because it's to the point it's exactly
1143:42 what you need let's get rid of this one
1143:44 real quick so this is the one
1143:47 that I like so that's the skill
1143:50 section let's move down to the projects
1143:53 now the project section is almost
1143:56 primarily for people who are just
1143:58 starting out once you get experience
1144:00 typically you maybe have one project on
1144:02 there or no projects at all but the
1144:04 project section is used as kind of
1144:07 in lieu of actual experience right I've
1144:10 always said that you need to build
1144:12 projects not just for your resume but
1144:15 also for the interviews so then when
1144:17 you get into an interview you can point
1144:18 to these projects and say yes I've used
1144:20 SQL I did it in this project and they
1144:22 may have seen it and you can walk them
1144:24 through how you actually used it it
1144:26 gives you more credibility than just
1144:28 saying you know how to use SQL so within
1144:31 the project section we're going to have
1144:33 a project like this one says data
1144:35 science job market exploratory data
1144:37 analysis so this is a personal project
1144:40 and then within it they did some really
1144:42 great stuff here's usually what I
1144:45 recommend and this is in here which is
1144:47 you specify what you did you say I used
1144:49 Python and what did you do to analyze
1144:52 this and gain insights into the job market
1144:55 then you walk through some of the things
1144:56 that you actually did things like regex
1144:58 techniques you used pandas matplotlib
1145:01 you built a word cloud these are keywords
1145:04 that somebody will look for and they
1145:06 even highlighted them which I personally
1145:08 like and do myself they highlighted
1145:11 these things so that the viewer or the
1145:13 hiring manager is actually seeing
1145:15 them making sure that they're bold so
1145:16 that they are catching their eye so I
1145:19 personally do this and I recommend this
1145:21 that's all it needs to be it just needs
1145:23 to be I built a Tableau dashboard doing
1145:26 this from this data set I cleaned it in
1145:28 SQL and you show those skills something
1145:31 that's important in both the skill
1145:33 section and the project section is using
1145:36 and highlighting your skills as much as
1145:39 possible especially if you don't have
1145:42 any experience if you've never had a job
1145:43 before once you have a job and you come
1145:46 down to like the work experience then it
1145:47 kind of speaks for you but if you don't
1145:50 you want the projects and the skills to
1145:52 speak to your skills and
1145:53 credibility so we have this right here
1145:56 now one thing that's not in here that I
1145:58 actually do recommend is a hyperlink
1146:00 maybe right here or actually this being
1146:03 a hyperlink to the project because they
1146:06 might read this and be like we work
1146:08 with you know data science job market
1146:11 data I don't know and then they'll click
1146:13 on this link and they can see your work
1146:15 that is the one thing that I would
1146:16 change in this other than that
1146:18 this is exactly how I would have it very
1146:20 similar to my own and a lot
1146:22 of this that I did I actually took from
1146:25 other resumes and formatted how I prefer
1146:27 and like it so again some of this is
1146:30 personal preference and you can change
1146:31 it however you want that's just how I
1146:33 like it so that is the project section
1146:36 now we're going to go down to the work
1146:37 experience section now this person does
1146:39 have a little bit of analyst
1146:43 experience so you know if you don't
1146:46 that's okay
1146:47 but you put your previous experience now
1146:49 here's what I recommend if you've been a
1146:51 teacher for 15 years you've been a nurse
1146:53 for 10 years you've had 10 different
1146:55 jobs don't put all your experience on
1146:56 here maybe put your last two jobs
1147:00 going back maybe three years I don't
1147:02 recommend you filling it up because it's
1147:04 not going to be super relevant unless
1147:06 you're applying for a healthcare data
1147:07 analyst position and you have a nursing
1147:09 degree then it's relevant and that
1147:11 experience is super helpful because it's
1147:12 domain experience right then you may go
1147:15 back five years just you know use your
1147:17 discretion but what you need to include
1147:20 is of course your title where you worked
1147:22 your location and the times that's
1147:24 standard for almost any resume but
1147:26 within here what you really want to
1147:28 do is highlight again the skills if you
1147:30 can if you can't that'll change but in
1147:34 here he says implemented a new reporting
1147:35 using Excel pivot and VBA which reduced
1147:38 processing time by 50% these types of
1147:42 quantitative information I reduced time
1147:45 I saved the company money I did
1147:48 something quantitative putting that in
1147:50 here is always helpful always highly
1147:52 recommended although it can be tough to
1147:54 measure these things right
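One way to get a number like that 50% is simple before-and-after arithmetic. This is a rough sketch, assuming you can estimate how long the task took before and after your change; percent_reduction and the sample hours are made up for illustration and are not from the sample resume.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which a value dropped going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Made-up example: a weekly report that took 4 hours now takes 2.
print(f"{percent_reduction(4, 2):.0f}% less processing time")
```

The same arithmetic works for cost or error counts, as long as the before and after figures are honest estimates you could explain in an interview.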
1147:56 typically what I recommend especially if you're
1147:58 first starting out is to highlight
1147:59 skills if you're a teacher you've
1148:01 probably used Excel and you've probably
1148:03 used Excel for something closer to data analytics
1148:05 than you think just in a teacher way and
1148:08 not a data analytics way but you can
1148:10 reword these things and make them sound
1148:12 good if you are a nurse like I was
1148:15 saying you've used Excel you've used
1148:18 a health information system you've used
1148:21 some type of database talk to that
1148:23 include that in here and it can be
1148:26 hard to write these out and I'm going to
1148:28 show you a way in just a little bit for
1148:30 how you can write these out and think
1148:31 about these things or have a way to help
1148:33 you write them or give you ideas we'll
1148:35 get to that in a second lastly we have
1148:38 the education piece this is again really
1148:40 simple at the very bottom education what
1148:42 your degree was where you went and if
1148:45 you have you know some helpful things
1148:47 to include you can do that and then when
1148:49 you actually went now you can include
1148:51 other things in here as well like boot
1148:53 camps if you went to a boot camp or you
1148:55 could also include things like a GPA
1148:57 although I don't personally recommend it
1148:59 GPA has never been anything that I've
1149:01 ever cared about or I've seen anyone
1149:03 care about ever so you don't normally
1149:06 have to include it one other thing that
1149:08 you can include at the very bottom is
1149:10 something like
1149:11 certifications I personally don't put
1149:13 a lot of stock in certifications unless
1149:16 it is one that I have recommended in
1149:17 a previous video like the Tableau
1149:19 certification or Tableau Desktop
1149:21 certification if you're applying to a
1149:23 job that uses Tableau that actually could
1149:25 be really good so definitely include
1149:27 that but ones on Udemy ones on Coursera
1149:31 or like my Alex the analyst boot camp
1149:33 that I have on my channel I wouldn't
1149:35 really include that in your resume it's
1149:37 mostly for learning if you get something
1149:40 like the Tableau one or the AWS Cloud
1149:43 one or the Azure Cloud one those are
1149:45 all actual certifications that can help
1149:47 you and give you credibility towards a
1149:50 certain skill now really quickly let's just take a glance at the other resume
1149:52 this is resume 2 so we have the
1149:54 education at the top doesn't have to be
1149:57 at the top unless it's relevant in which case
1149:58 you could put it at the top we have a skill
1150:00 section and again this is the projects
1150:02 same projects and then work experience
1150:04 this is just a little bit different
1150:06 order so you can do it like this as well
1150:09 in a different way you can write the
1150:10 skills and you can also include a
1150:12 summary section as well so that's the
1150:14 meat and potatoes of how I would create
1150:16 a data analyst resume now writing
1150:18 it is actually a different beast right
1150:20 you have to actually write it out get
1150:22 something on the resume and then apply
1150:24 using that resume but it can be hard to
1150:26 come up with these ideas so I just
1150:29 want to show you something that a lot of
1150:31 people have been using I personally
1150:32 haven't written a resume in a little
1150:34 while so I don't use it for my own
1150:36 resume or haven't used it but I will
1150:38 and that's using ChatGPT or some
1150:40 variation whether it's on Bing or you
1150:42 know you get some different version or
1150:44 some new product that's out there at the
1150:45 moment I'm just going to show you how to
1150:47 do it in ChatGPT some of the things
1150:48 that you can prompt it to do and that'll
1150:51 be it I'm just going to show you kind of
1150:55 some ideas that it can generate for you to help you write these things all right
1150:56 so here on my screen we're on ChatGPT
1150:58 if you haven't used it I'll leave a link
1151:00 in the description I also have a whole
1151:01 video on how to use ChatGPT for data
1151:03 analysis so I like ChatGPT now I've
1151:07 already written out these questions
1151:08 because I don't want to wait for the
1151:09 responses but here's what I asked it to
1151:11 do and you can do some variation of this
1151:13 whether you're a nurse or a lawyer or a
1151:16 teacher or whatever I said I'm a
1151:18 math high school teacher trying to
1151:19 become a data analyst how can I use my
1151:21 experience on my resume to help me get a
1151:23 job this is just to help provoke some
1151:26 ideas and it says you know you most
1151:28 likely have some skills emphasize your
1151:30 quantitative skills so those are some of
1151:32 the things you can focus on showcase
1151:34 your ability to communicate complex concepts
1151:36 which is really important in data
1151:37 analytics being able to present
1151:39 information which teachers have
1151:41 highlight your experience with
1151:42 technology hopefully you're using some
1151:44 type of you know database for
1151:46 students or you know Excel or something
1151:48 like that you can highlight that and
1151:52 showcase your ability to solve problems now the next thing that I asked it was I
1151:54 built a COVID Tableau dashboard using
1151:57 Tableau how can I add this to my resume
1152:00 and then it's going to tell you exactly
1152:02 how you can do that it's going to say
1152:04 include the link to your dashboard which
1152:05 I also recommend provide a brief
1152:07 description highlight your data
1152:09 visualization skills include screenshots
1152:11 or images which is what I would be
1152:12 putting in the project itself not on
1152:14 your resume then provide context for the
1152:17 data all really good stuff really great
1152:20 now the last thing is kind of what I'm
1152:21 trying to get at as a whole it can help
1152:24 you write things so I said
1152:28 write two sentences highlighting my
1152:30 COVID Tableau dashboard to add to my
1152:32 resume and it's going to say developed a
1152:35 COVID Tableau dashboard to visualize
1152:37 pandemic trends using real-time data
1152:38 sources demonstrating strong data
1152:40 visualization and analysis skills so
1152:43 this can help you generate those
1152:45 descriptions in your work experience it
1152:47 can help you generate the descriptions
1152:49 in your projects and this can be really
1152:52 helpful to just generate some ideas cuz
1152:53 I personally really struggle with like
1152:56 highlighting my skills and descriptions
1152:58 within those things this can be a way to
1153:00 kind of help you do that so don't you
1153:03 know just copy and paste but let it
1153:07 prompt you let it give you ideas now the last thing that I want to mention is
1153:09 just your overall resume as a whole the
1153:11 template that I use the template that I
1153:13 recommend is very very friendly
1153:16 to these automated systems that check
1153:18 your resume if you did not know most
1153:21 companies especially big companies use
1153:23 these automated systems that scan your
1153:25 resume see if it has what they're looking
1153:27 for and then that resume if it gets
1153:29 through that system gets passed on to a
1153:32 recruiter or hiring manager typically
1153:34 most companies don't go straight to the
1153:36 hiring manager so you need a resume that
1153:39 can pass through those initial systems
1153:41 and pass those tests the resumes that I've
1153:43 shown you today will do that they have
1153:45 bullet points they have the keywords
1153:46 they have everything you need that's why
1153:48 I recommend or partially why I recommend
1153:50 this type of resume other ones that have
1153:53 images and different fonts and different
1153:55 stylings can cause issues with these
1153:57 automated systems where it just doesn't
1153:59 read it properly or you know it doesn't
1154:01 read the right words that you want it to
1154:04 read so just know that these types of
1154:07 resumes have different uses right you're
1154:09 not just handing it off to somebody
1154:11 where they can read it and it needs to
1154:12 be visually stimulating really what you
1154:15 need is for it to get through those
1154:16 initial systems which these resumes
1154:19 if you write them well you have good you
1154:21 know skills and the right things on your
1154:22 resume they will pass through that first
1154:27 layer to get to those hiring managers so again be sure to download those those
1154:28 are completely free I highly
1154:31 recommend using them I think they're
1154:32 really good so be sure to download those
1154:34 use those just put in your own
1154:36 information be sure to build out your
1154:37 own projects don't just keep the ones
1154:39 that are on there because you'll need to
1154:41 be able to speak to them sometimes
1154:42 recruiters or hiring managers are going
1154:44 to ask you about them how you built it
1154:46 what you did and you can also point to
1154:47 those projects in your actual interview
1154:50 so I hope that this was helpful I hope
1154:52 that your resume is ready to go I hope
1154:54 that you're ready to start applying for
1154:55 those data analyst jobs thank you guys
1154:57 so much for watching I really appreciate
1154:59 it if you like this video be sure to
1155:01 like and subscribe below and I'll see
1155:05 you in the next [Music]
1155:14 video what's going on everybody my name
1155:17 is Alex Freberg and today we're going
1155:18 to be walking through my top three tips
1155:20 on how to use LinkedIn to land a job
1155:22 LinkedIn is a fantastic place to look
1155:24 for a job it's its own little ecosystem
1155:26 where career-driven people can connect
1155:28 and talk with one another and help each
1155:30 other find jobs I personally have landed
1155:32 jobs through LinkedIn and so I know how
1155:33 effective it can be let's jump over to
1155:35 my screen and I'm going to show you my
1155:37 top three strategies that I have found
1155:39 to be the most successful for actually
1155:40 finding a job so I'm logged into my
1155:42 completely anonymous account here and
1155:44 I'm going to show you the very first tip
1155:45 which is you shouldn't be just applying
1155:47 to a position you should be actually
1155:48 reaching out to the recruiter and I'm
1155:50 going to show you exactly how to do that
1155:52 so the first thing that we have to do is
1155:53 actually find a job that we want to
1155:55 apply to so let's go to the job section
1155:57 right over here and let's search for
1156:01 data
1156:03 analyst and let's do that
1156:05 in let's do
1156:08 Chicago because why not so it's going
1156:11 to search for data analyst positions in
1156:14 Chicago we have one right here let's see
1156:17 what it looks like cuz you know I don't
1156:19 want to apply to jobs that I'm not
1156:21 extremely qualified for so this is a job
1156:23 that I want to apply for and before I
1156:25 actually go and apply to the job I
1156:27 want to see if I can reach out to a
1156:30 recruiter and talk to them beforehand so let me show you how to do that so what
1156:31 we're going to do is actually click on
1156:33 the company right here it's going to
1156:34 take us to basically their LinkedIn
1156:36 profile page for their entire company
1156:38 and we're going to scroll down we're
1156:39 going to go over to
1156:41 people and then we're going to search
1156:43 for
1156:44 recruiter
1156:47 so if we scroll down all the way to the
1156:49 bottom we can see that there are
1156:50 recruiters that actually work in-house
1156:52 for this company and so now would be a
1156:54 time where I actually reach out to some
1156:55 of these recruiters and I say hey I see
1156:57 a job that I really like I think I'm
1156:59 really qualified for it and I would love to
1157:00 talk more about it with you you can ask
1157:02 them things about the job to make sure
1157:03 that it is a good fit for you and then I
1157:06 highly recommend asking them what
1157:07 they think is the best way to apply for
1157:09 this job to make sure that your resume
1157:11 gets noticed and you get an interview
1157:12 since they are a recruiter who works at
1157:14 this company they may be the one
1157:15 who's actually going to be looking at
1157:17 these resumes and so they may give you a
1157:18 tip on the best way to actually apply
1157:20 they may also just ask you to send them
1157:22 your resume directly so that they can look at
1157:24 it or maybe later on down the line this
1157:26 actually is a person who is reviewing
1157:27 resumes and so if they come across your
1157:29 resume they may be able to put a face to
1157:31 the name and that may give you bonus
1157:33 points I'm going to leave a template
1157:34 script in the description in case you
1157:35 don't know exactly what you want to say
1157:37 to this recruiter and it'll give you
1157:40 just a baseline of some of the things that you might want to say number two is
1157:42 to actually ask for a referral now if
1157:44 you don't know what a referral is it is
1157:45 where somebody who already works at
1157:47 the company can refer you to a specific
1157:49 job and that might get you a little bit
1157:51 higher on the list for interviews so I
1157:52 highly recommend reaching out to
1157:54 somebody who already works at that
1157:55 company and asking if they're willing to be
1157:57 a referral for you I get people reaching
1157:59 out to me all the time asking me to be a
1158:01 referral for them for my company and
1158:03 nine times out of 10 I say yes I always
1158:05 ask to see their resume first just to
1158:07 make sure that their resume aligns with
1158:08 the position at least a little bit but
1158:10 there's basically no harm in me being a
1158:12 referral for somebody in fact I may
1158:14 actually get a bonus if that person ends
1158:16 up getting hired and so for the most
1158:18 part there's almost no risk for the
1158:19 employee in actually being a referral
1158:21 and so a lot of times they will say yes
1158:23 now let me show you how to do that and
1158:25 it is very similar to finding a
1158:26 recruiter so we're going to stay on this
1158:28 people section but instead of searching
1158:30 for a recruiter we're going to search
1158:31 for a job title that is similar to yours
1158:34 so let's actually see if they do already
1158:35 have any data analysts and if they do
1158:38 that is the person that we're going to
1158:39 reach out to because that is the person
1158:42 we'll probably have the best connection with so it looks like we have six
1158:44 employees and let's scroll down and so it
1158:46 looks like all these people have data
1158:48 related jobs and so I would reach out to
1158:50 these people and say I saw an open data
1158:52 analyst position at your company I would
1158:53 love to know more about your company as
1158:55 a whole and then you can talk to them a
1158:57 little bit and then in the end your goal
1158:58 is to ask them for a referral and if
1159:01 that happens that is fantastic and then
1159:02 you can go ahead and apply for the job
1159:04 and mark them as a referral for you now
1159:06 my third tip on how to get a job through
1159:08 LinkedIn is to actually have recruiters
1159:09 reach out to you so let me show you how
1159:11 to do that the first thing we're going
1159:13 to do is actually go over to my profile
1159:15 here and we'll click view
1159:18 profile now there's a few things that we
1159:20 want to make sure that we have on here
1159:22 so that recruiters can reach out to us
1159:24 the first thing that I want to do is to
1159:26 actually come to this section right here
1159:27 which is show recruiters you're open to
1159:29 work and when I click on this I can
1159:31 actually choose some job titles and some
1159:33 locations where I actually want to apply
1159:34 and have recruiters reach out to me and
1159:36 so right now I have data analyst
1159:39 in the DFW area which is where I live I
1159:41 can also add titles like business
1159:44 analyst and then maybe junior data
1159:46 analyst entry-level data analyst or
1159:48 things like that that could potentially
1159:51 have recruiters reach out to me for positions that I'm interested in and
1159:53 then you can say that you're immediately
1159:54 and actively applying and you can also
1159:56 say that you're only looking for
1159:58 full-time positions or contract
1159:59 positions and then you can actually add
1160:01 this to your profile and I only want
1160:03 recruiters to see that because I do
1160:05 currently have a job at McDonald's and
1160:07 so I don't want McDonald's firing me
1160:09 because I'm looking for employment
1160:10 elsewhere so let's save that and it
1160:13 looks like it was updated and so now
1160:15 when recruiters are searching for
1160:16 candidates for a specific position you
1160:18 will be on that list so that they can
1160:20 find you and reach out to you something
1160:21 else I should mention is on your profile
1160:23 page I would try to have some type of
1160:25 professional photo so that you look
1160:27 really good I would also try to include
1160:28 data analyst somewhere in your title if
1160:30 you already have a data analyst job and
1160:32 you're looking for another one you can
1160:33 just have your previous company but if
1160:35 you're looking for a data analyst job
1160:36 you can always put seeking data analyst
1160:38 position or something like that another
1160:40 thing I think is really important is
1160:42 having really good descriptions for your
1160:44 previous work I don't currently have
1160:46 this but I would go a little bit into
1160:47 the work that I actually do make sure
1160:50 that the experience matches kind of what
1160:51 you're looking for if you do have
1160:54 previous experience if not that's totally fine the next section on your
1160:56 profile page that I would recommend
1160:57 looking at and updating is your skill
1160:59 section and so you want to go in there
1161:01 and make sure you have all of your
1161:02 relevant really data analyst heavy
1161:05 skills on there specifically hard skills
1161:07 because soft skills aren't going to
1161:09 translate too much into this section I
1161:11 would definitely stick to things like
1161:12 SQL Python Tableau Excel things that
1161:15 data analysts are going to use because
1161:17 this is where they're going to actually
1161:18 look and see if you have the skills that
1161:20 they are looking for for that position
1161:22 when I was applying to jobs and only
1161:23 applying to job postings and not using
1161:25 any of these strategies my success rate
1161:28 was 0.4% which means out of 1,000
1161:31 applications that I filled out and sent
1161:32 my resume to I only heard back from four
1161:34 of them to actually get an interview but
1161:36 with these strategies I was able to get
1161:38 that up to 10% and at my best I was able
1161:40 to get that up to 15% but that's because
1161:42 I was applying to a lot less positions
1161:44 and I was targeting jobs that I really
1161:45 wanted to work for and so I put in more
1161:47 effort in order to contact people and
1161:49 work with recruiters in order to get
1161:50 that job I genuinely hope that these
1161:52 strategies can be helpful for you
1161:54 especially if you're trying to apply for
1161:55 jobs right now thank you guys so much
1161:57 for watching I really appreciate it if
1161:59 you liked this video and got anything
1162:01 out of it at all be sure to like and
1162:02 subscribe below and I'll see you in the
1162:04 next video hello everybody
1162:05 congratulations if you are watching this
1162:08 that means that you completed the data
1162:09 analyst boot camp if you haven't don't
1162:11 keep watching this is only for people
1162:13 who have completed the data analyst boot
1162:14 camp playlist on my YouTube channel woo
1162:17 all right now that we filtered those
1162:19 people out I'm going to show you how you
1162:20 can download your certificate and your
1162:22 certification now that you've completed
1162:24 the data analyst boot camp I will leave
1162:26 a link in the description but let's go
1162:27 on to my screen I'm going to show you
1162:28 how to actually access this and download
1162:31 your certification all right guys don't
1162:32 go around telling people this or sharing
1162:34 this uh but this is our data analytics
1162:37 boot camp on the Alex the analyst GitHub
1162:39 right up here I will have this link in
1162:41 the description what you can go ahead
1162:43 and do is you can come right here you
1162:45 can download this you'll just right
1162:46 click or click download and you just do
1162:48 something like save image as um or you
1162:51 can come to this one this is the one
1162:52 that I think is the real money maker
1162:54 here uh this is the certificate of
1162:56 completion for the data analytics boot
1162:59 camp I have my not signature but my name
1163:02 as well as uh my position with a blank
1163:06 space right here to fill in your name
1163:07 feel free to put this on LinkedIn or
1163:09 Twitter or Instagram and tag me in that
1163:11 because I would love to just say
1163:12 congratulations because honestly it's a
1163:14 lot of work to go through all those
1163:16 videos and learn all of those skills so
1163:18 congratulations I hope that you learned
1163:19 something along this journey a new skill
1163:21 a new thought a new idea and I'm proud
1163:23 of you I'm proud of you for putting in
1163:25 the work it's not easy but you did it
1163:27 and I hope that you came out on the
1163:29 other side better for it so congrats
1163:31 I'll see you in the next
1163:44 [Music] video