0:03 This DBT masterclass will make you a pro
0:05 DBT developer from scratch because you
0:09 will master DBT core and DBT CLI, DBT
0:12 models, DBT Jinja, DBT macros, DBT
0:15 generic tests and singular tests, DBT
0:17 seeds, building slowly changing
0:20 dimensions using snapshots, DBT node
0:23 selection, DBT profiles. Not only this,
0:25 you're also going to learn CI/CD workflows
0:28 within DBT so that you can deploy your
0:32 objects. But why should you learn DBT in
0:35 2025? Because every other organization
0:39 is looking for DBT developers right now.
0:41 And this is the one-stop solution to
0:45 master DBT in 2025. Do you know what? I
0:47 have literally put my heart into this
0:50 video? I have covered so many areas
0:52 which are necessary for you. I have
0:54 referred to so many resources. So now
0:56 it's your responsibility to show your
0:57 love in the comment section. You need to
0:59 support this channel if you want more
1:01 and more content, more and more useful
1:03 tutorials for your data engineering
1:06 career and let's get started with this
1:08 amazing masterpiece. So what's up?
1:10 What's up? What's up mafam? First of
1:14 all, happy Sunday. And do you know what?
1:17 I am so so so happy because this is one
1:20 of the most requested videos on this
1:22 channel. And I was literally waiting for
1:24 recording this particular video because
1:27 DBT is such a lovely thing and such a
1:28 revolutionary thing in the, you
1:30 can say, world of modern data
1:33 engineering. And I can feel like why DBT
1:36 is so much in demand because it should
1:38 be in demand because even if you are
1:41 totally unaware of DBT, what is DBT?
1:43 What is the full form of DBT? I already
1:44 mentioned in the introduction of this
1:47 video that you're going to become a pro in
1:50 DBT and I do not expect any kind of
1:53 prior knowledge with DBT or with you can
1:56 say any kind of knowledge related to
2:00 DBT. Just forget about anything. Just
2:02 just relax. just relax because in this
2:04 particular course you're going to become
2:08 a pro, because DBT is so so so easy to
2:11 start that you can just watch a quick
2:13 tutorial about DBT and you can say hey I
2:17 know DBT but when you actually try to
2:19 learn the skills which are required to
2:23 crack the interviews, you will say, oh,
2:26 DBT is not that easy. So you need to
2:28 understand all the ins and all the outs
2:30 of DBT
2:34 And as you know as you know me and this
2:37 channel this channel will give you and
2:39 obviously is giving you a lot,
2:41 a lot of knowledge, and this video, I can
2:44 say that this is the most in-depth
2:47 video that exists on YouTube. Trust me
2:50 trust me. So I am really excited. I am
2:53 really really really excited because I
2:55 personally love DBT and you will love
2:57 DBT a lot after this video. One more
3:00 thing after learning this technology you
3:03 can actually feel confident because DBT
3:07 is very much in demand plus DBT is not
3:10 actually mastered by a lot of people. So
3:14 you will gain I will say
3:16 much more advantage if you add this
3:19 technology because see a lot of people
3:21 can add DBT to their resume, but the
3:23 thing is when they will be asked the
3:26 question that time will be the deciding
3:29 factor whether they will be accepted or
3:31 not, because it is very easy to write,
3:34 uh, it is very easy to write DBT in a resume.
3:37 But what are models, what are snapshots,
3:39 what are seeds? These are very basic questions.
3:42 And when they go deeper into Jinja
3:45 functions and all... Ansh Lamba, come here, you
3:47 said that you do not need to, you
3:48 told us that we do not need to know
3:50 anything. Yeah. So I just used some
3:53 terms, right, just slang related to DBT. So
3:57 just sit back and relax because this
3:59 video is specially made for you on your
4:01 special demand and you're going to love
4:04 it. Okay. So let's get started. Plus
4:07 plus this is like pure hands-on video.
4:09 So we will be covering a little bit of
4:11 theory and then we directly jump onto
4:13 the practical and we will cover more
4:16 theory along with the practical. Okay,
4:17 this is not like we will be just
4:20 covering the theory and then at the end
4:22 we will do the practical. No, we will
4:24 learn by doing it. Okay, let's let's
4:27 let's get started. And this is like
4:29 first of all, congratulations to all these
4:32 people who have cracked MNCs, their
4:36 dream job in the US or wherever. There
4:38 are so many, so many messages, so
4:39 many comments. So these are
4:41 the comments that I was able to gather
4:44 this week. So even if you have cracked
4:46 offers just let me know in the comment
4:47 section. I would love to highlight your
4:49 story and I literally want to see your
4:53 comments here. You will also write these
4:54 comments that you have just cracked the
4:56 offer. You have just cracked the job.
4:57 You have just cracked your dream dream
5:02 role. Trust me. Trust me. So now let's
5:05 talk about DBT.
5:07 What is DBT? Everyone talks about DBT
5:12 nowadays and, Ansh, we have FOMO. We
5:15 also want to talk about DBT. Bro, you
5:16 will be talking about DBT, plus you will
5:18 become a pro developer. So let me just
5:20 tell you what DBT is, and let me just
5:23 clear up some myths. A lot of people think DBT
5:27 is an orchestration tool. Okay. A lot of
5:30 people think DBT is a transformation
5:33 tool. Okay. A lot of people think DBT is
5:37 a data warehousing tool. Okay. Okay.
5:39 Makes sense. See,
5:41 I know there are like so many stories. I
5:43 know like there are so many myths. Let
5:47 me just tell you DBT is nothing.
5:49 Okay. So then what is DBT? First of all,
5:50 let me just tell you the full form of
5:53 DBT. DBT stands for... bro, first of all,
5:56 just start taking notes, because this boy...
6:01 Okay, it's fine. Just take your notes.
6:05 It's fine. So, DBT stands for data build
6:09 tool. What is that? Data build tool.
6:10 What can you interpret from this
6:13 definition? Data build tool. Hm. So, it
6:14 is something related to data domain.
6:17 Okay, that's for sure. Build. Build can
6:18 be anything. It can be pipelines, it can
6:20 be data warehouses, it can be
6:21 transformation, it can be engine,
6:23 anything. A tool is a tool. Okay. So this
6:25 definition alone is not
6:28 enough to understand DBT. That's true.
6:29 Let me just tell you: if you are a data
6:34 engineer, okay, if you're new to the data
6:36 engineering domain, if you're trying to
6:38 switch to data engineering domain, you
6:39 would have heard about a term called
6:44 ETL. Okay.
6:46 Oh, so it is an ETL tool. Shh. Just
6:49 listen to me.
6:53 So, ETL. ETL stands for extract,
6:55 transform,
7:02 load. Simple, very good. Over the course
7:05 of time, we introduced another term
7:06 which is called ELT. Because with the
7:09 rise of big data, we prefer doing ELT.
7:12 That means we first extract the data,
7:14 then we load it, and then we transform
7:18 it. Right? Okay. Extract
7:21 load transform. Make sense?
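The ELT ordering described here can be sketched in a few lines of Python. This is only an illustration, assuming an in-memory SQLite database as a stand-in for the warehouse; the table names (`raw_orders`, `orders_clean`) and the sample rows are hypothetical and do not come from the video.

```python
import sqlite3

# Hypothetical source rows, standing in for an API or a source database.
SOURCE_ROWS = [
    ("1", " alice ", "10.50"),
    ("2", "BOB", "3.00"),
    ("3", "carol", "not_a_number"),
]


def extract():
    """E: pull rows from the source system as-is."""
    return list(SOURCE_ROWS)


def load(conn, rows):
    """L: land the raw rows in the warehouse without changing them."""
    conn.execute("CREATE TABLE raw_orders (id TEXT, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def transform(conn):
    """T: clean the data with SQL that runs inside the warehouse."""
    conn.execute(
        """
        CREATE TABLE orders_clean AS
        SELECT CAST(id AS INTEGER)   AS order_id,
               LOWER(TRIM(customer)) AS customer,
               CAST(amount AS REAL)  AS amount
        FROM raw_orders
        WHERE amount GLOB '[0-9]*.[0-9]*'  -- keep only numeric-looking amounts
        """
    )


def run_elt():
    conn = sqlite3.connect(":memory:")
    load(conn, extract())  # extract, then load the raw data first
    transform(conn)        # transform last, inside the database
    return conn.execute(
        "SELECT order_id, customer, amount FROM orders_clean ORDER BY order_id"
    ).fetchall()


if __name__ == "__main__":
    print(run_elt())
```

In an ETL pipeline the cleaning in `transform` would happen before `load`, outside the warehouse; in ELT the raw data lands first and the cleaning runs as SQL inside it.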
7:28 Now, DBT is... okay, suspense is over.
7:30 It is this
7:34 transform and this transform.
7:40 Oh, okay. So, DBT means your
7:41 transformation layer.
7:47 Hm. Okay, what does it mean? So,
7:50 basically DBT is not any kind of
7:53 orchestration tool. Orchestration, or
7:55 basically the data extraction part, will be
7:58 done by your orchestration tool, such as
8:00 Apache Airflow, Azure Data Factory,
8:03 maybe Databricks Lakeflow Jobs, or any
8:08 kind of stuff. Okay. Now, the
8:11 load step, where you store your data in
8:14 the data lakehouse,
8:18 SQL Server, dedicated databases, dedicated
8:21 SQL pools, anything, will be done by your
8:27 same tools: Synapse, Redshift, Google
8:30 BigQuery, all those things, Snowflake, okay,
8:33 Databricks, all those things. Those are the
8:38 same. Now, whatever you do in the
8:40 transformation layer will be done by
8:45 data build tool. Okay. So Ansh Lamba, for
8:48 example, we use something called PySpark.
8:52 Mhm. For big data engineering, a lot. Yes,
8:56 that's true. Will it be replaced by DBT?
8:58 Like, what is that? We have also heard
9:02 about this thing. Okay.
9:04 PySpark
9:07 does almost everything. It extracts your
9:10 data. It transforms your data. It loads
9:14 your data. It optimizes your data. It
9:17 partitions your data. It does a lot of
9:19 other things as well rather than just
9:22 performing PySpark transformations.
9:26 That's simple. That's true. Okay. So yes,
9:29 when you are using PySpark just for the
9:30 transformation,
9:35 it can be replaced by DBT, but
9:37 just the transformation part. PySpark
9:42 does a lot more than the transformations.
9:44 make sense? It reads your data. It
9:47 parses your data. It works with JSON. It
9:49 works with API calls. It writes your
9:51 data in the multiple data lakes and so
9:53 many things, right? So many things. So
9:55 the PySpark transformation part can be
9:58 replaced with DBT.
9:59 Oh,
10:04 okay. So now, one thing: can we say that
10:05 we should not learn PySpark, we should
10:08 just shift our focus to DBT? No, that's
10:11 not true, because PySpark is PySpark.
10:13 PySpark is, you can say, a kind of code
10:17 that you have to write. Okay. So can we
10:19 say that we need to integrate DBT with
10:22 PySpark, or you can say DBT with existing
10:24 PySpark code? Yes.
10:28 Yes, you need to do that. That's what
10:31 all the companies are doing right now. H
10:33 but why do we need DBT? Let's talk about
10:35 like why do we need DBT? Like that is
10:37 clear. DBT is just a tool. Like, what is
10:39 DBT? How do we build with DBT? We will talk
10:41 about that. But we know this thing, that
10:43 DBT is something that is used to
10:45 transform our data. Simple. That is clear
10:48 so far. Very good. Now let's talk about
10:50 why DBT. You can just write it here: why
10:55 DBT. Okay. Like, you can just say why DBT.
10:59 Okay. It's "why DBT", not "why is DBT".
11:01 So why DBT? Why do we need DBT?
11:03 Everything is running fine. People are
11:05 writing PySpark code. People are writing
11:07 SQL code. People are writing Python
11:12 code. Why? Why a new framework? Why?
11:13 Just because it is new, so everyone
11:15 is just feeling happy to use it? Not
11:18 really. It's because DBT provides you
11:22 something called modularity,
11:26 or you can say, a templating feature.
11:29 Okay. What does it mean? Let's suppose
11:33 you have built a data warehouse. Okay.
11:34 You have built slowly changing
11:38 dimensions. Okay. You have built silver
11:42 layer, gold layer, whatever. Okay.
11:43 Okay.
11:46 In another project, maybe you can say in
11:48 another business data warehouse. Okay,
11:50 business unit data warehouse you did the
11:53 same thing. Then in the other business
11:55 unit you did the same thing. Then in the
11:56 other project you did the same thing. So
11:59 every time you are doing the same stuff
12:01 again and again there is no modularity.
12:04 There's no templating feature available.
12:06 You have to write your static code again
12:09 and again. There is no plug and play.
12:11 Plug and play means you develop something
12:13 one time and you can use
12:16 the same thing in different areas. You
12:18 cannot do that here. Like,
12:20 maybe you can say you have
12:22 developed one piece of logic, you have just
12:23 written down your code, and you want to
12:26 use that code in multiple places. You
12:28 cannot do that. But with DBT we can do
12:31 that. So what we do we simply define the
12:32 code we simply write the code at one
12:35 place and we use something called as
12:38 templates. They're called Jinja templates.
12:40 So if you're aware of Jinja
12:44 templates, you will have, you can say, a 1%
12:46 edge in that particular case. And if you do
12:48 not know about Jinja templates, what
12:51 they are, don't worry, Ansh Lamba is here. Even
12:53 if you do not know about Jinja templates,
12:54 I will just let you know. Okay, don't
12:55 worry. But yes, if you know Jinja
12:57 templates, it's good. And I would say those who
12:59 are in the software engineering industry, who
13:01 have done a BTech, would know Jinja templates,
13:03 because they use them with HTML a lot, with
13:05 Flask applications and all. So they would
13:07 know Jinja templates very well. But
13:08 it's not a big deal. I will just let you
13:10 know, don't worry. Jinja templates are
13:12 very easy. So they just use Jinja
13:15 templates and they make their code
13:17 modular. That means they do not write
13:21 static code. They write dynamic code.
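The difference between static and dynamic code can be sketched as follows. DBT itself uses Jinja for templating; to stay dependency-free, this sketch uses Python's standard-library `string.Template` as a stand-in, so the `$placeholder` syntax below is not Jinja syntax. The table names, column names and the deduplication logic are hypothetical examples, not taken from the video.

```python
from string import Template

# One reusable template instead of one hand-written query per business unit.
# `$source_table` and `$key` are filled in at render time.
DEDUP_TEMPLATE = Template(
    "SELECT * FROM (\n"
    "  SELECT *, ROW_NUMBER() OVER (\n"
    "    PARTITION BY $key ORDER BY updated_at DESC\n"
    "  ) AS rn\n"
    "  FROM $source_table\n"
    ") AS ranked\n"
    "WHERE rn = 1"
)


def render_dedup(source_table: str, key: str) -> str:
    """Render the shared deduplication logic for one table."""
    return DEDUP_TEMPLATE.substitute(source_table=source_table, key=key)


if __name__ == "__main__":
    # Hypothetical business units reusing the same logic.
    for table, key in [("sales.orders", "order_id"), ("hr.employees", "employee_id")]:
        print(render_dedup(table, key))
        print("---")
```

The logic is written once and rendered per table, which is the "write at one place, use in multiple places" idea; in a real DBT project the same role is played by Jinja expressions and macros inside SQL files.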
13:24 Ansh Lamba, just let me know one thing.
13:28 DBT has just, you can say, entered
13:30 the market. Hmm,
13:33 people would have been writing dynamic code
13:36 before that as well. I'm so sure, like,
13:39 not the startups, but yeah, big companies.
13:41 You are right. So why do we need DBT?
13:46 See, people were writing, you can say,
13:49 modular code. That's true. But there was
13:52 not, you can say, a very efficient way.
13:54 People were using Python list
13:57 comprehensions. People were using Python
13:59 classes. People were using Python
14:01 functions to make that code modular. But
14:04 with DBT, they can just do all those
14:06 things very easily. Very easily. And
14:08 obviously that's not the only thing. With
14:11 DBT they also get some amazing features,
14:12 like how they can just build slowly
14:14 changing dimensions, because DBT is
14:16 purely focused on data engineering,
14:18 that's it. So they know what we need. We
14:20 need incremental data loading. We need
14:22 slowly changing dimensions. We need these
14:23 because these are like bread and butter
14:25 for us. We cannot live without
14:27 them. We can live without bread and
14:29 butter. But yeah, it's fine. So this is
14:31 the thing: DBT knows that pain. Okay. So
14:34 it provided extra support for those
14:39 features. Obviously, DBT can do a lot of
14:41 things other than data engineering. It
14:43 can just be used in anything, because the sky
14:45 is the limit when you are just trying to
14:47 be creative, right? But it is focused
14:50 more on data engineering. Make sense?
14:53 Make sense? That is why we need DBT. And
14:55 obviously it is not like DBT just
14:58 appeared and people are just hyping it.
15:02 No. See, organizations have used DBT.
15:05 Organizations have tried DBT and they
15:07 obviously got some amazing results in
15:09 code management and all that stuff. So
15:12 that's why people are appreciating it.
15:14 That's why big companies are demanding people.
15:18 Let's say they need DBT developers.
15:21 Make sense? Makes sense. Makes sense.
15:25 Very good. So that is why we need DBT.
15:28 Okay. So I know this is just the
15:29 starting of the course but we can say
15:32 one thing that DBT is required in the
15:33 modern world, because Ansh Lamba, you have
15:36 given us a very very very cool picture
15:39 of DBT. See, I'm just telling you the
15:41 truth and you are smart enough to take
15:44 the decisions. Okay. Okay. So, what do
15:47 we have next? In the next slide, we have
15:50 DBT platforms. This is one of the
15:54 most confusing parts. We have so many,
15:56 so many names. DBT core, DBT
15:59 cloud, and now we have DBT canvas. Ansh
16:01 Lamba, what are these things? What are
16:04 these things? See, if you want to just
16:06 perform some stuff related to DBT,
16:10 obviously you need a platform, right? Okay.
16:14 So the very popular one, or you can say the one
16:16 which is the backbone of DBT, is DBT
16:18 core.
16:21 DBT core. Now, what is DBT core? Let's talk
16:28 about it. DBT core is the CLI tool, which
16:31 is open source. Open source, that means
16:36 it is... yes. Wow. Amazing. This amazing
16:38 framework is available for free.
16:39 Literally free because open source is
16:43 free, right? Yes. Yes. Yes. That's why
16:45 it is very much in demand because right
16:46 right now every organization is trying
16:48 to build something on top of this
16:50 engine. So basically DBT core is like a
16:55 CLI tool the backbone of DBT. Okay. This
16:58 tool is available openly and everyone
17:01 can use it. Make sense? So this is the
17:04 you can say, core engine. DBT core is
17:08 the core engine behind DBT. Okay, makes
17:13 sense. With this you can build almost
17:15 anything related to DBT. Not anything at all,
17:17 but anything related to DBT that we talk
17:20 about. Okay. Plus it is open source. It is
17:23 open source. That means it is not
17:25 managed for you by anyone.
17:27 It is open source, right?
17:29 You will be managing it. So that means
17:31 you need to take care of it, or you can say,
17:36 manage it. Okay. In terms of compute, in
17:38 terms of,
17:41 you can say, um,
17:44 compute. Yeah. Obviously most of the
17:46 things are like compute, because other
17:47 than that I don't think you would
17:48 need anything. Yeah. Basically compute,
17:51 or basically, yeah, another thing after
17:53 compute: Git.
17:55 Like, all the CI/CD part as well, you need
17:57 to manage it on your own. Okay. Because
17:59 it is open source,
18:02 similar to, you can say, Apache Spark.
18:04 Okay. Okay. Makes sense. Ansh Lamba, why
18:06 did you talk about Apache Spark? Because
18:08 it will feel relevant to you. Okay. DBT core
18:11 is fine. Open source, free and all. What
18:17 is DBT cloud? So DBT cloud is a managed
18:20 product which is built on top of DBT
18:23 core. Oh,
18:26 okay. Similar to Databricks, because
18:28 its founders created Apache Spark and made
18:31 it open source, and they built a product
18:32 on top of it which is called
18:34 Databricks. Similarly, DBT core is the
18:37 engine and they have built a product on
18:39 top of DBT core. It's called DBT cloud.
18:42 Obviously, it is a product. So, it
18:46 will be managed by DBT. Okay,
18:48 managed by DBT
18:50 and it's an amazing product. Let me just
18:52 tell you it's an amazing product because
18:53 obviously all the capabilities of dbt
18:56 core are there plus more and more
18:58 enhancement git management yes there is
19:01 like managed git repos available in dbt
19:03 cloud I have tried it it's amazing okay
19:05 there are like so many other features UI
19:07 you will get the UI because if you are
19:09 using CLI you will not get the UI right
19:13 but if you want to see the UI okay then
19:17 dbt cloud is the choice right so dbt
19:19 cloud is like the managed product built
19:22 on top of DBT core.
19:24 Okay. And in this video, we're going to use
19:28 DBT core. Okay. By the way, DBT cloud is
19:31 also free. Not entirely but for a few
19:33 days and after that you need to pay
19:34 obviously because it is not free like it
19:37 is a product, a proper product. But I know
19:39 you would need something which will
19:41 be purely free. Okay. So I just
19:42 picked this particular service DBT core.
19:44 Okay. And don't worry I will just show
19:45 you how you can just set up your DBT
19:48 cloud account as well. It is very easy and
19:49 you will get a free trial as well. So
19:51 that is up to you. Make sense? But yeah,
19:53 we will be just,
19:55 well, like, we will be just working
19:57 with DBT using the DBT core engine itself.
19:59 Okay? And don't worry, the installation
20:02 part is a little bit complex, but I
20:03 will just let you know everything. Make
20:06 sense? Very good. Now let's talk about
20:09 DBT canvas. What is this DBT canvas? So
20:12 this is the new thing and to be honest I
20:15 have also not explored much of DBT
20:16 canvas, because I think it was recently
20:18 added. Not recent like just
20:20 today, but yeah, it was added, I think,
20:22 I would say, a few months back. So what
20:25 is DBT canvas? And I would say it's not
20:28 something that should be tried or
20:30 used by everyone, but yeah, it is an
20:32 amazing feature. Let me just tell you
20:34 DBT canvas is a feature which is
20:37 available within DBT cloud. Oh, okay.
20:39 So, this is not like separate
20:41 environment. No, no, no, no. Just a
20:44 feature available within DBT cloud.
20:48 Okay, make sense. So, DBT cloud actually
20:52 has, you can say, a dedicated tab for
20:55 DBT canvas. What does it do? So, remember I
20:58 told you that we transform our data with
21:01 DBT in the templating format? Yes.
21:05 obviously you need to write the code and
21:06 obviously I will just show you how to
21:09 write the code. SQL, Python, all those
21:12 code, right? Okay, but DBT canvas says
21:15 hey you do not need to write any code.
21:18 You can build the same thing using drag
21:20 and drop feature. So it was recently
21:22 added and it is more of a draggy droppy
21:25 feature. If you like coding, obviously
21:27 you will go with normal coding, but if
21:29 you like the drag and drop thing,
21:30 then you can just try dbt canvas. I
21:33 myself have not tried dbt canvas much.
21:35 So let me be very honest with you but
21:37 yeah it's your choice even if you have
21:39 tried DBT canvas it's your choice as a
21:41 developer if you want to go that way
21:44 right? So it's your choice. Not like, oh,
21:45 DBT canvas is there, we will be just
21:47 using DBT canvas, DBT canvas, DBT canvas.
21:51 Bro, hold on, hold on. There is a coder,
21:54 okay, there is a non-coder. A non-coder will
21:56 pick DBT canvas. If you are a coder, you
21:58 will pick DBT cloud or DBT core coding,
22:00 right so it's up to you but they are
22:02 giving you the choice why they're giving
22:04 you the choice because they are serving
22:08 the entire enterprise, entire
22:10 organization and in every organization
22:13 there's a combination of developers,
22:15 coders, non-coders. In some
22:17 organizations, people are not
22:19 pro coders. They know
22:21 about the data industry, but they're not
22:22 pro coders. So they have the feature of
22:24 DBT canvas. So they are just giving you
22:27 all the flavors. It's up to you, like,
22:29 what's your taste, right?
22:33 It depends upon your taste buds. Wow.
22:35 So this is all about your DBT core and
22:38 all just one last part that I want to
22:39 discuss and then we're going to jump
22:42 straight on the practical. So what is
22:45 that backbone of DBT?
22:47 What is the backbone of DBT? The
22:52 backbone of DBT is DBT models. Okay,
22:53 what is this? Is it a machine
22:55 learning model, or what model? So
22:56 basically it is not any kind of machine
23:01 learning model. DBT models are something
23:05 which hold your code. Oh,
23:10 the coding part is your models. Models.
23:13 Models. Models
23:17 are the coding part of DBT. Okay. And
23:18 this is the thing which is
23:22 revolutionized by DBT canvas. So you do
23:24 not need to write code to build your DBT
23:26 models. You can use draggy droppy
23:28 feature. But if you want to write code
23:31 and trust me you should go with the
23:33 coding part and once you are good with
23:36 that you can just switch to your draggy
23:38 droppy feature. Okay. So if you want to
23:40 build your DBT models you have to write
23:42 code and models are something which will
23:44 populate the data and don't worry it
23:46 will be very relevant and let me just
23:47 give you a quick example. Okay. And this
23:51 example will build your base. Okay. So
23:55 let's say you have... Ansh Lamba, just give us
23:58 a detailed example. Okay. The example is
24:02 very easy. So let's say this is you.
24:04 Okay.
24:07 Okay. This is you as a data engineer. Or
24:08 let's say there's a data engineer, not
24:11 you. Okay. Because you will say, hey, I
24:14 don't look like this.
24:16 Okay. So let's say there are two data
24:17 engineers. [Music]
24:21 Okay. And they work in a team and they
24:23 are best friends.
24:27 Okay. Okay. So they they both are in a
24:29 same team.
24:31 Very good.
24:34 Okay. Let me just redraw it.
24:36 Ansh Lamba,
24:38 just build a nice data engineer. That's
24:40 it. Okay. So there's a data engineer.
24:43 Okay. And this person works with any, or
24:44 let's say this person works in any
24:47 organization in which they are using,
24:50 let's say, Databricks. Let's say they
24:54 are using Databricks, or let's say
25:03 they're using Snowflake. Both are fine
25:06 because it actually doesn't matter and
25:11 they want to build their bronze layer
25:13 their silver layer their gold layer
25:14 because obviously these are the three
25:15 layers that we build using
25:17 transformation. So they want to build
25:19 these three layers, bronze, silver, gold
25:22 using DBT. Okay. So what will happen?
25:25 Let's say there's a source here.
25:28 Okay.
25:30 Any kind of API, any kind of SQL
25:32 database, anything, literally anything.
25:35 This is a source. So what will happen?
25:40 Let's say this is our DBT.
25:43 Okay, this is our DBT.
25:46 So what will we do? We will, or let's say
25:50 this data engineer will, use this source
25:54 in DBT, okay, and cook something.
26:03 Make sense? It will build the models and
26:06 these models will populate the data back
26:09 to these particular platforms, such as
26:13 Databricks, Snowflake, Synapse, Redshift,
26:17 BigQuery, Fabric, like anything, anything.
26:19 Ansh Lamba,
26:22 hold on, quick question.
26:24 Okay, this will write the code. This
26:28 will perform the transformation. But we
26:31 are smart enough. Okay, and we know that
26:34 whenever we want to transform our data,
26:37 we need some kind of compute.
26:40 So who will provide that compute?
26:42 Because DBT doesn't have anything
26:44 because DBT is just like transformation
26:47 layer. It doesn't even have any kind of
26:48 compute that we have already discussed
26:51 that it has nothing. So who will
26:52 transform the code? Who will provide
26:55 that computation power? That's a very
26:57 good question.
26:59 Those things will be provided by these
27:02 platforms. So it is the condition of
27:05 DBT. Hey bro, I will transform your
27:07 data. I will make your code modular. I
27:09 will make templates for you. But in
27:14 return, I will use your resources,
27:15 because I do not have any kind of
27:19 resources. I can write templating code.
27:23 I can write modular code, but I need to
27:25 use your compute. That means
27:27 Databricks clusters, that means Snowflake
27:31 compute, Fabric compute,
27:34 Redshift compute, BigQuery
27:37 compute. So DBT doesn't hold... That's a
27:40 very good point. DBT doesn't have any
27:42 kind of compute, any kind of cluster,
27:46 nothing. It is just a templating format
27:48 that we use while transforming our data.
27:50 That's it. It will use our resources
27:52 whose resources like your platform's
27:56 resources. Make sense? So this is the
28:01 backbone of DBT. That means models.
28:04 I hope it was helpful.
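The idea that models hold the code while the platform supplies the compute can be sketched like this. It is a simplified illustration, not how DBT is implemented: an in-memory SQLite database stands in for Databricks or Snowflake, the model names and queries are hypothetical, and the run order simply follows the dictionary order, whereas a real DBT project works out the order from the dependencies between models.

```python
import sqlite3

# Each "model" is just a name plus a SELECT statement (hypothetical names).
MODELS = {
    "bronze_orders": "SELECT id, customer, amount FROM source_orders",
    "silver_orders": (
        "SELECT id, UPPER(customer) AS customer, amount "
        "FROM bronze_orders WHERE amount IS NOT NULL"
    ),
    "gold_revenue": (
        "SELECT customer, SUM(amount) AS revenue "
        "FROM silver_orders GROUP BY customer"
    ),
}


def run_models(conn, models):
    """Wrap each SELECT in CREATE TABLE AS and hand it to the platform.

    No rows are processed in Python here: the database engine behind
    `conn` does all the work, which is the role the warehouse plays.
    """
    for name, select_sql in models.items():
        conn.execute(f"DROP TABLE IF EXISTS {name}")
        conn.execute(f"CREATE TABLE {name} AS {select_sql}")


def demo():
    conn = sqlite3.connect(":memory:")  # stand-in for the data platform
    conn.execute(
        "CREATE TABLE source_orders (id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO source_orders VALUES (?, ?, ?)",
        [(1, "alice", 10.0), (2, "bob", None), (3, "alice", 5.5)],
    )
    run_models(conn, MODELS)
    return conn.execute(
        "SELECT customer, revenue FROM gold_revenue ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

The Python side only sends SQL text; every join, filter and aggregation is executed by the engine behind the connection, which mirrors the "I will use your compute" condition described above.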
28:08 Okay. So now in order to work with DBT
28:10 cloud, it is very easy. You can simply
28:12 go there and just build your models.
28:16 Ansh Lamba, what platform are we using
28:18 with DBT? Because you mentioned so many
28:20 platforms. So we'll be working with
28:24 Databricks as our platform. Why?
28:28 Because Databricks provides a free
28:31 platform that we can use. We do not need
28:33 to pay for any cluster. We do not need
28:35 to pay for any kind of uh warehouse. We
28:37 do not need to pay for any kind of data
28:38 that we are writing in that particular
28:40 platform. We do not need to pay anything
28:42 plus we do not need any kind of credit
28:43 card, debit card. We do not need any
28:46 kind of business email, nothing.
28:48 Everything is free. So that's why I have
28:51 picked Databricks. Plus,
28:55 very good point: it doesn't matter, bro.
28:57 Literally, it doesn't matter if you use
28:59 Databricks, if you use Snowflake, if you
29:04 use Fabric, if you use Redshift, BigQuery,
29:08 anything, because we are covering DBT and
29:12 99% of the things are just pure DBT. Only
29:14 the 1% part, that means when we'll be just
29:17 checking our data, hey, is everything
29:20 there, hey, are the tables there, hey, are the
29:21 bronze layer, silver layer, gold layer there,
29:24 only then do we need to use this platform.
29:26 Otherwise it doesn't
29:29 matter. But we just wanted to pick
29:31 something. I picked Databricks because
29:32 it is easier for you to set up
29:35 Databricks, it's free, and it's easy to use.
29:37 make sense very good so this is the
29:38 thing that we will be just setting up
29:40 first of all that's our first thing okay
29:45 Databricks. Then we need to install DBT
29:48 core and there are some steps that are
29:50 involved because you would need to
29:52 install so many things first of all you
29:54 would need to install Python Hey, why
29:56 Python? Because it uses Python you can
29:58 say interpreter to just process your
30:01 code. Okay, you need to install Python.
30:03 Then you need to install Git.
30:05 Okay, why Git? Because we'll be just
30:06 taking care of CI/CD part as well. And
30:08 as I just mentioned that we are just
30:11 learning DBT with Git. Okay, we do not
30:12 know about Git. We do not know about
30:16 CI/CD. It's fine. It's fine. I will just
30:19 tell you. Don't worry. Okay, Python, Git,
30:23 and I think you also need to install UV.
30:26 What is UV? If you do not know about UV,
30:28 it's fine. Basically, UV is a new
30:31 tool which has recently entered our
30:34 Python world. And UV is basically a
30:36 package manager. Instead of using pip,
30:38 now we use UV. That's a transition that
30:40 everyone is making right now. Every
30:42 developer and you should also do that.
30:44 And UV is an amazing package manager.
30:45 And don't worry, I will just let you
30:48 know that as well. Don't worry. Don't
30:50 worry, bro. Don't worry. So, Python, Git,
30:52 UV. And I think you would need a code
30:55 editor. And I will say VS Code, because
30:59 VS Code, it's like, oh man, amazing themes.
31:04 Oh. So Python, Git, UV, VS Code, and what else? I
31:06 think that's it. Yeah these are some of
31:07 the things that we need to install. Yes.
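Once these tools are installed, a quick check that they are reachable from the terminal can look like the sketch below. The executable names are assumptions: on some systems Python is available as `python3` rather than `python`, and the VS Code launcher `code` is only on the PATH if that option was enabled during installation.

```python
import shutil

# Tool name -> candidate executable names (assumptions, see the note above).
REQUIRED_TOOLS = {
    "Python": ["python", "python3"],
    "Git": ["git"],
    "uv": ["uv"],
    "VS Code": ["code"],
}


def check_tools(tools=REQUIRED_TOOLS):
    """Return {tool name: path or None} by searching the PATH."""
    found = {}
    for name, candidates in tools.items():
        # Take the first candidate executable that resolves to a path.
        found[name] = next(
            (path for path in map(shutil.which, candidates) if path), None
        )
    return found


if __name__ == "__main__":
    for name, path in check_tools().items():
        status = "found at " + path if path else "NOT FOUND on PATH"
        print(f"{name:8} {status}")
```

A tool reported as not found may still be installed; it usually means the terminal needs to be reopened or the install location is missing from the PATH.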
31:09 Yes. Yes. It's a pain but you need to
31:11 bear this pain because the result is
31:13 amazing bro. Okay so now let's actually
31:15 start with setting up our Databricks
31:18 account, and not just the account, setting up
31:20 our, basically, like, we can just set up
31:22 Databricks. Okay. And we will be
31:24 setting up the source as well, after
31:26 installing DBT. Okay, because first
31:29 of all, in this section we are just
31:31 focusing more on setting up our
31:33 environment which environment not from
31:36 the data perspective. Environment means
31:38 the technologies, tools and all right
31:41 okay so let's get started and just
31:43 follow this part very carefully because
31:45 a lot of things I would say everything
31:46 is dependent on this particular section
31:48 because installation is everything bro
31:52 okay so let's see so let's see how we
31:54 can just quickly create our Databricks
31:55 account, and it's very very easy. Let me
31:57 just check if my voice is being recorded,
32:01 because you know me, right? So now,
32:03 Databricks. Databricks is very easy to set up.
32:05 You simply need to go to Google or
32:07 any kind of search engine in any kind of
32:09 browser. Just type Databricks Free Edition.
32:14 Bam. Okay. Perfect. Click on the very
32:16 what is this man? Click on the very
32:18 first link. Okay. And this will just
32:21 take you to this page. Sometimes it can
32:22 take you to a different page like this.
32:25 Let's say like this. Uh no worries.
32:27 Click on Try Databricks and that's fine. Okay. So
32:29 at the end you will land on this page.
32:32 Just click on this particular thing. Do
32:34 not click here, on Express Setup.
32:36 Express Setup is also a free service, but
32:38 it's not always free. It's
32:41 free for a limited time period, but you
32:43 need everything free, right? So now
32:45 just click on this "Looking for Databricks
32:48 Free Edition" and click here. So here you
32:49 need to provide your normal Gmail
32:51 account normal Microsoft account any
32:53 account. You do not need to specify any
32:54 business email student account no
32:56 nothing just your regular email ID and
32:59 that's it. Okay. Once you do that, just
33:01 land on the same page and this time
33:03 scroll down and click on already have an
33:05 account and click on login because now
33:07 you do not want to create your account.
33:09 Now you just want to log in. Okay, then
33:10 you can simply click on Continue with
33:12 Google if you have that account. Okay,
33:16 then just click on this AWS thing. So
33:18 this is the homepage that you will also
33:20 see in your particular case. So this is
33:22 the Databricks workspace homepage. Yeah,
33:24 it was that easy. I told you, that's why
33:26 I picked Databricks, because our focus
33:27 is on learning DBT. We do not want to
33:30 spend our time creating the source,
33:34 creating the source data and all. So our
33:37 focus is on DBT, right? So this is the
33:39 thing, and Databricks is all set, all set, all set.
33:41 And yes, we'll be setting up the
33:43 source, don't worry. We'll be creating
33:45 catalogs and all those things, tokens
33:46 and all, everything will be there, but
33:50 this resource, this technology, is done.
33:53 So this is done. Now let's install all
33:55 the things one by one by one for our DBT
33:58 and let's actually set that now. Okay,
34:00 let's do it. So let's start our
34:03 installation related to DBT stuff. First
34:07 of all, we need a kind of code editor.
34:09 Okay, the best code editor. There are
34:11 like so many code editors but just
34:15 download VS Code. Okay, simply search VS
34:18 Code install. VS Code.
34:20 What is this?
34:23 VS Code, Visual Studio Code, whatever you
34:25 want to call it. Okay, simply go to
34:28 this particular link and here you can
34:31 click on Windows and download and
34:32 install Visual Studio Code. That's
34:34 it. And just a quick homework I would
34:37 say not a homework like I can expect
34:40 this much of thing from you. Yes, we are
34:43 just mentioning each and every step that
34:44 you need to do this, you need to do
34:48 that, you need to do. Okay, the
34:50 expectation from you is maybe you'll be
34:52 using Mac, maybe you'll be using any
34:54 other laptop or anything. These are very
34:57 basic installation steps that I would
34:59 expect you to figure out. Still I'm just
35:01 showing you each and every step
35:04 everything. But if you feel stuck, let's
35:06 say you are installing git, let's say
35:07 you are installing Python, let's say you
35:09 are installing UV. So these are the
35:11 things which are very generic and very
35:13 easy to find and very easy to
35:16 troubleshoot. I would expect if you are
35:17 a data engineer, if you want to become a
35:19 data engineer, you can just troubleshoot
35:22 this basic stuff. So this is a kind of prerequisite.
35:30 Do that. Okay. So very good, just like
35:32 pre-, you can say, homework that I would
35:33 expect from you because these are the
35:35 installation that you need to take care
35:37 of according to your system, right? So
35:41 you can do that. I know just just go on
35:44 Google search if you feel stuck, right?
35:45 Still I'm showing you everything step by
35:48 step but sometimes as per the machine
35:50 sometime it does not work you need to go
35:53 and troubleshoot it right okay so just
35:55 go and just download VS code and I'm
35:56 expecting that you have downloaded
35:59 Visual Studio Code in your system very
36:02 good that technology is done okay now
36:05 what's next? The next part is you need to
36:08 install something called Git. Okay, so
36:12 I'll simply say Git download,
36:16 Git download. Okay. And then here is the
36:18 link. So here you have all the options.
36:20 If I click on Windows, then I can simply
36:22 say Git for Windows 64-bit setup. I can just
36:25 click on this particular link. And here
36:27 is my Git downloaded. I already have Git
36:30 here. Okay. So I can just show you my
36:32 Git. If I go to Command Prompt, I can
36:34 simply say git
36:38 --version and I will get git version
36:41 2.50.1.windows.1. So this is my Git
36:42 version for now. Make sense? So this is
36:44 my Git. So you can also download it and
36:47 you can confirm it just by typing, uh, in
36:49 your search bar, simply write cmd, which
36:51 will open your Command Prompt, and simply
36:55 write git, space, hyphen hyphen version. It will
36:57 give you the Git version to confirm, like,
37:00 everything is installed
37:02 like efficiently. Nothing is broken,
37:04 nothing is missing. Everything is there.
37:06 Make sense? So this is like very basic
37:08 installation step. just go and click
37:10 next next next and just pick the folder
37:13 write things and that's it again this
37:14 much of thing I would expect from you
37:16 that you can download git on your system
37:20 Make sense? So you have Git. Make sure you
37:23 write in your cmd git --version and you
37:26 see the result. Very good, Git is also done.
37:29 The next thing is, next thing is Python.
37:31 You need to install Python. You will say,
37:34 "Ansh Lamba, we are pro data engineers and
37:36 we already have Python in our system." So,
37:39 bro, you still need to install Python.
37:41 Why? Because, let me just show you one
37:45 thing: DBT compatibility
37:47 with Python,
37:50 because DBT's latest version is not
37:52 compatible with all the Python versions.
37:54 So if I just click on this link. So
37:55 these are the basic versions. So
37:57 currently I know you would have
38:00 installed Python 3.13 maybe. Okay. So it
38:02 is not compatible with the latest
38:03 version and we are using latest version
38:06 of DBT. So see there's a danger mark. So
38:09 you cannot use 3.13. So you have to
38:13 install 3.12. Oh, so how can you
38:16 check your Python version? Again, in cmd,
38:19 simply write python and you will see the
38:22 version. So I also have 3.13. Yes, I
38:25 also have 3.13 because I wanted to show
38:27 you how you can install Python 3.12.
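The version check described here can also be done from inside Python. The following is a minimal sketch, not something shown in the video: the constant `REQUIRED_MAJOR_MINOR` and the function `matches_required` are hypothetical names, and the value `(3, 12)` simply mirrors what the speaker says about 3.12 versus 3.13. The official compatibility table remains the source of truth.

```python
import sys

# Assumption for illustration: the video treats Python 3.12 as the version to
# use and 3.13 as unsupported by the DBT release in question.
REQUIRED_MAJOR_MINOR = (3, 12)


def matches_required(version_info=sys.version_info, required=REQUIRED_MAJOR_MINOR):
    """Return True when the interpreter's major.minor equals the required pair."""
    return (version_info[0], version_info[1]) == required


if __name__ == "__main__":
    running = ".".join(str(part) for part in sys.version_info[:3])
    verdict = "OK" if matches_required() else "install 3.12 as described above"
    print(f"Running Python {running}: {verdict}")
```

Running the file with the interpreter you plan to use for the project tells you whether that interpreter is the one the tutorial expects.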
38:29 Okay. So I also have 3.13. So I also
38:31 need to install 3.12. Let's install it
38:34 together. So I will simply click on this
38:37 link and which link? Let me just type. I
38:41 will say Python download. Okay. And this
38:42 is download Python. Do not click here
38:44 because it will install 3.13. Scroll
38:47 down. Okay. Scroll down and it will say
38:49 looking for a specific release. Then
38:53 search for 3.12. See 3.12.11
38:55 then 3.12.10. You can just download
38:56 anything. Let's say I want to download 3.12.11.
39:00 It should be 3.12 and that's it. And
39:03 after that it can be anything. So I will
39:06 simply say this one. Okay. So this is
39:09 the 3.12. Everything looks fine. Okay.
39:11 So let me just download it. I can simply
39:13 click here download button. And it will
39:15 give me all the options that I have. It
39:18 has gzipped source tarball and blah blah
39:20 blah. Where is that particular
39:24 downloader? No installers. Okay. Okay.
39:25 Okay.
39:27 Get the latest release of 3.3. The
39:28 release you're looking for is security
39:32 bug fix release for... Okay, so we can
39:35 simply search for any other release, if
39:39 they have one. Let's try 3.12.9.
39:42 Download. Okay, description, files. Yeah,
39:44 here we have Windows 64-bit so you can
39:46 just download any one but it should be
39:48 3.12 that's it and once you get Windows
39:51 installer 64-bit that's fine just click
39:53 on this particular thing so it has now
39:55 downloaded Python for me let me just
39:57 open that
39:59 And one of the most important things that
40:03 you can miss: you need to check this box,
40:06 "Add python.exe to PATH". Otherwise it
40:08 will not add Python to the PATH, and then if you
40:10 try to use Python in your VS Code
40:12 you will see errors: "Hey, where is
40:15 Python? I cannot find Python." So just
40:16 check this box. Okay, then you can say
40:19 Install Now and it will install
40:23 Python. That's it. Just, just, just a
40:24 few seconds and that's it. Python is there.
40:28 and we will just test it. Don't worry.
40:30 So I'm installing 3.12. This should be
40:34 fine. Make make sense. Makes sense.
40:37 Okay. So it is all almost almost there.
40:39 And after that let me just remind if we
40:41 need to do if we need to download
40:43 anything else.
40:45 Let me do some calculation. I think yeah
40:47 one more thing is left. So setup was
40:49 successful. That's fine. I can now test
40:52 my Python version. I can simply say python.
40:56 What is that?
41:01 NameError: name 'python' is not defined. Okay.
41:05 Oh, makes sense. Makes sense. Makes
41:07 sense. So now we can just open a new cmd
41:09 prompt. Okay. "Install the latest
41:13 PowerShell." I can simply say, uh, python.
41:16 Okay. So now it is showing 3.12. "Ansh Lamba,
41:18 a quick question, out of the box. Can we
41:21 ask?" Yes, go ahead.
41:23 Earlier you had 3.13.
41:26 Now you just installed 3.12. So now it
41:30 is showing 3.12. Where is 3.13 now? So
41:34 3.13 is also there. 3.12 is also there.
41:37 So what happened? It just overrode
41:40 the PATH.
41:43 So now every time your system picks the
41:46 latest version that you installed. Not
41:49 the latest release; latest means the more
41:51 recent version you installed, so
41:54 it picks only that particular version.
41:57 "In our system we are using Python 3.13,
41:59 so what will happen to those projects?"
42:00 Nothing will happen, because you can
42:02 install, or you can basically create,
42:04 virtual environments within your Python
42:06 folders, within your, you can say, VS
42:08 Code editor, and then you can work with
42:10 any Python version. It is very simple.
42:12 You just need to define which Python
42:14 version you need to use for your
42:16 project. But you need to have those
42:20 Python versions in your system. Okay.
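To see why a plain `python` command resolves to one installation while the other is still on disk, it can help to list every Python executable along the PATH in lookup order. This is a rough sketch under stated assumptions: `pythons_on_path` is a hypothetical helper, the file names it looks for are a simplification, and Windows applies further rules (such as PATHEXT and app execution aliases) that it ignores.

```python
import os
from pathlib import Path


def pythons_on_path(path_value=None):
    """List python executables found in each PATH directory, in lookup order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    names = ("python.exe", "python", "python3")
    found = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for name in names:
            candidate = Path(directory) / name
            if candidate.is_file():
                found.append(str(candidate))
    return found


if __name__ == "__main__":
    # The first entry is typically the one a bare `python` command starts.
    for position, executable in enumerate(pythons_on_path(), start=1):
        print(position, executable)
```

If both 3.12 and 3.13 appear in the output, both are installed, and the earlier PATH entry is the one the terminal picks by default.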
42:22 Can you just show us how we can just
42:23 create virtual environments and all
42:24 because we haven't created that?
42:25 Obviously, I will just help you out.
42:27 Don't worry. So now you have both the
42:28 Python versions. Okay, makes sense. So
42:30 now let's see how we can just create
42:32 virtual environments. And before that,
42:35 before that you need to download UV
42:36 because I told you that we will be using
42:40 UV. UV is not something that you need to
42:43 download from Google. You need to add
42:45 that package using pip. Now what is pip?
42:47 pip is the traditional package manager
42:50 in Python, which gets automatically
42:51 installed on your system when you
42:54 install Python. Make sense? So with
42:57 the help of pip, you can install UV.
42:59 Okay. And how can we install UV?
43:01 So you can just open another terminal
43:05 and you can simply say pip install
43:07 uv. You can do it from here as well, and
43:09 from VS Code as well. But let's do it
43:13 from here. Let's hit enter: pip
43:16 install uv.
43:19 So, "Successfully installed uv", "Installing
43:20 collected packages". I think I already had
43:23 UV, but now it has just downloaded it from
43:26 here as well. So it's fine. It's fine.
43:30 "To update, run python.exe -m pip install
43:32 --upgrade pip" and all. And I think, yeah,
43:34 it's fine. So I already had UV, maybe
43:36 not with this particular Python version.
43:38 So that means, like, it has just
43:39 downloaded it one more time, but
43:41 it's fine. So now you have everything.
43:44 Now let's actually go inside our VS Code
43:47 and let's create virtual
43:49 environments and all those things. Okay,
43:52 let me just show you. So we are into our
43:56 lovely VS code editor. Wow. So in your
43:58 case it can look a little different
44:00 because obviously there are some
44:03 customizations, there are some changes
44:05 as per the you can say OS that we use
44:08 but overall it should be same. Okay. So
44:10 first thing first you need to set up
44:13 basically create a folder. Okay. So you
44:15 can create a folder in your C drive
44:17 anywhere in your computer. Just create
44:19 that folder and you need to open that
44:23 folder with the help of VS Code. How? I
44:25 am expecting that you have created
44:27 that folder. I know that you know how to
44:29 create a folder on your Windows, on your
44:33 Mac. Okay. Then click on File and then
44:36 click on Open Folder. Make sense? When
44:37 you click on Open Folder, you will see
44:40 the location, and just pick that folder.
44:42 In my case, I have created this folder,
44:45 which is called Ansh Lamba DBT tutorial.
44:46 Make sense? So this is the folder that
44:47 I'm using and it is empty. You can see
44:49 there's nothing inside it. We need to
44:52 create an empty folder. Make sense? So I
44:53 am expecting that you will do that
44:56 thing. Okay, very good. Sorted. Good.
44:58 Second thing that we need to do, we need
45:01 to install an extension.
45:04 Okay, which extension? Python extension.
45:07 So simply click on these boxes. In
45:08 your case it may be on the left-hand
45:10 side. So just look for this thing. It's
45:12 called Extensions. Just go here and
45:15 simply search Python.
45:23 Okay. Python by Microsoft. Python by
45:25 this. So I think this is the one that I
45:27 have also installed. Yeah. So just click
45:29 on this and it is by Microsoft and in my
45:31 case I have already installed it. What
45:33 is this? Basically, it is not an
45:34 interpreter. It is just an extension
45:37 which provides some, you can say, support.
45:40 Which support? It can be, let's say,
45:42 Pylance, which is like the IntelliSense
45:44 that we use with Python, just for
45:46 code completion. It can give you
45:49 some hints while writing the code, color-
45:50 changing properties for the modules and
45:52 all those things, and it is very helpful
45:54 when you develop the code locally,
45:56 and that's one of the main reasons
45:58 why developers prefer building code
46:00 writing code locally because they get
46:02 additional support that they do not get
46:04 on the hosted platforms or managed
46:06 platforms. Yes, I know like companies
46:07 are doing their best to add those
46:10 capabilities, but still it's not that
46:13 much, you can say, catchy as what we have in
46:16 VS Code. As you can see: "Visual Studio Code
46:18 extension with rich support for the Python
46:20 language (for all actively supported Python versions), providing
46:21 access points for extensions to
46:24 seamlessly integrate and offer support
46:26 for IntelliSense (Pylance)", okay, "debugging
46:28 (Python Debugger)" is there, "formatting,
46:30 linting, code navigation, refactoring,
46:31 variable explorer, test explorer, and
46:33 more." So it is an amazing extension. You
46:35 can just download it. Okay. One
46:36 extension is this one that we need to
46:38 download. More extension. I will just
46:40 let you know. Make sense? Okay. So this
46:42 is basically the tab that we use and I
46:44 can simply close it. Very good. Now
46:47 I just want to show you something. If
46:49 you click on these three dots, you will
46:50 see Terminal, and you can click on
46:53 New Terminal. Okay. So by default it
46:55 will open this thing, PowerShell. In your
46:57 case it can be, like, a lot of things. If
46:58 I click on this dropdown, I have a lot
47:00 of options. In your case, I think
47:03 you will not see Git Bash or JavaScript
47:04 Debug Terminal. So it's fine, it's
47:06 fine. I have some additional terminals
47:08 in my system. Okay. So here I just want
47:10 to show you the python version first of
47:11 all. So this is the terminal that you
47:13 can use, like PowerShell. So this is the
47:15 biggest advantage of VS Code, that you
47:18 can run your terminal
47:20 in parallel with your code. I will write
47:23 python and I will see Python 3.12.9.
47:25 Very good. I can literally write
47:28 anything. Or I can even
47:30 create multiple terminals. I can
47:31 click on plus and I can simply create
47:33 another terminal. There's a shortcut key
47:36 for the terminal. Okay. And it's Ctrl,
47:38 Shift, and,
47:41 uh, it's called backtick. It is just
47:45 one key above the Tab key, and I
47:47 can just show you here as well: Terminal.
47:49 See, this is the shortcut, and in your
47:51 case you can also explore it. So let's
47:53 say this one. Okay, perfect. Now I can
47:56 simply say git --version.
47:58 Perfect. So this is my Git version. I
48:01 can run all the terminal
48:02 commands as well. And obviously, if you
48:04 are learning DBT, it's all, like,
48:07 command line interface. It's all CLI. So
48:10 you have to use VS Code in order to run
48:13 DBT, right? Okay, perfect. So one thing
48:15 that we need to create first of all. So
48:18 this is our folder. Now, DBT says you
48:22 can download the DBT, you can say, package, but
48:24 in order to work with it efficiently you
48:28 need to install the
48:31 package with respect to an environment.
48:33 So you need to create a virtual
48:35 environment. So basically I will open a
48:38 terminal. Traditionally we were writing
48:41 this code, let's say python -m, which
48:43 is module, then we write venv, then the
48:46 virtual environment name, which is .venv. You can
48:47 pick any name, but this is, like, common
48:49 practice, that we have .venv.
48:51 Okay. So this is the code that we used
48:55 to write. Okay. But with UV we do not
48:57 need to write anything literally. Yes.
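For reference, the traditional `python -m venv .venv` step has a standard-library equivalent that can be called from Python itself. The sketch below is illustrative only: `create_env` is a hypothetical helper, and it passes `with_pip=False` so that it does not depend on pip being bootstrapped, which differs from the command-line default.

```python
import tempfile
import venv
from pathlib import Path


def create_env(project_dir, name=".venv"):
    """Create a virtual environment folder, similar to `python -m venv .venv`."""
    env_dir = Path(project_dir) / name
    # with_pip=False skips bootstrapping pip; the command line includes pip by default.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # Use a throwaway folder so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as scratch:
        env = create_env(scratch)
        print("created:", env.name, "config present:", (env / "pyvenv.cfg").is_file())
```

The `pyvenv.cfg` file inside the folder is what marks it as a virtual environment; UV produces the same kind of `.venv` folder without this manual step.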
48:58 Let me just first of all show you
49:00 something about UV. UV is, like, an
49:02 amazing package.
49:11 Okay. Let me just click here.
49:18 Boom.
49:26 Okay. Okay. Okay. Why am I seeing this
49:31 particular thing in another language?
49:38 I think yeah. Astral. I found it. So
49:40 this is the official documentation for
49:43 UV. And I can also bookmark it.
49:47 installation UV. No, just UV. Perfect.
49:49 So, this is the installation tab, and
49:50 here you can see that this is how we
49:53 install UV, and you have
49:55 already run this command. So, we have UV
49:57 in our, you can say, terminal, in our
49:59 system. So there are, like, so
50:01 many options with which we
50:04 can install UV, and UV is an
50:05 amazing thing and do you know what's the
50:07 best thing? Let's say you still want to
50:09 use pip. You can use pip with UV as
50:10 well. And don't worry, I'll just let you
50:12 know everything that we need to we need
50:15 to know about UV because obviously this
50:18 is not like UV masterclass but we will
50:22 learn UV as well. Okay. So that is fine.
50:24 Now we need to install DBT. Let me just
50:27 write DBT installation and I will just
50:32 show you what you need. Who
50:34 is this? So basically install from
50:37 source. There are basically so many
50:40 ways with which we can install the DBT
50:43 Core engine. The best thing is
50:46 installing it with the pip command,
50:47 or basically not from the pip command,
50:49 basically from the package. Why do we
50:51 call it pip? Because pip is very
50:52 popular, right? So I will just click on
50:53 Install with pip. So here are the
50:55 commands. So first of all we need to
50:56 create a virtual environment, and we will
50:58 not be creating it like this, because this
51:00 is, like, very, very, very old school. Okay.
51:03 And once we have a virtual environment,
51:05 then we need to install this particular
51:08 package using this thing, dbt-core. Now
51:11 we cannot install only and only dbt-core.
51:14 We need to install an adapter as
51:16 well. Now this can be your interview
51:19 question: what is an adapter? Remember, I
51:23 just told you that with DBT we need
51:26 a kind of platform,
51:28 because we need to use compute, we
51:31 need to use a cluster, we need to use
51:33 the platform where we will be
51:35 processing our data, right? So the plugin for that platform is
51:36 called an adapter. That is just the fancy
51:38 name, that's it. And we have so many
51:40 adapters: we have the Databricks adapter,
51:43 Snowflake adapter, Fabric adapter,
51:44 Synapse adapter. There are so many
51:45 adapters, and let me just tell you how
51:47 you can install adapters. It's very easy.
51:52 See, yes, the documentation is a little bit,
51:53 you can say, confusing, but don't worry,
51:56 I'm here to tell you guys. Okay, so first
51:58 of all I am here in this particular
51:59 folder, and you also need to make sure
52:02 that you are in the right folder. "Now, Ansh,
52:05 now just tell us how we can create
52:07 a virtual environment." First of all, you do not
52:09 need to create a virtual environment. Yes, you
52:10 will be creating a virtual environment, but
52:12 automatically. You need to simply write
52:16 uv init. That's it, literally that's it.
52:20 Just hit Enter and boom, you will see
52:24 magic. Magic literally. Yes. See. Ooh
52:26 all these things are there. Yes. All
52:29 these things are there. So in my case I
52:32 can also see .git. In your case maybe
52:33 you will not be able to see .git,
52:37 because by default all the, you can say,
52:40 .git, dot things are, like, hidden by
52:42 default in your system. You can actually
52:44 enable it, but that's not a big deal,
52:47 because this is just Git, and it is,
52:50 like, hidden for a reason: because
52:52 you should not touch it, because if you
52:54 touch it you can mess up your codebase.
52:56 That is why it is hidden by default. "So,
52:58 Ansh, why have you enabled it?" Because I
53:01 like to do that. So you do not need to
53:02 worry about that. If it is hidden,
53:04 it's fine, and you can actually enable it
53:06 from here as well. You can just, I
53:08 think, click on Settings. And if you
53:11 go to the Command Palette and if you see,
53:13 um, Open File Settings, I guess, Open
53:16 Settings. Okay. And if you scroll
53:20 down here you will see an option,
53:22 Exclude. Yeah. So here you will
53:24 have that .git option. Just click
53:27 on the cross and it will be gone. Okay, just
53:29 a tip for you. You can also mess up
53:32 your codebase. Join me. Okay, so
53:35 that is good. So all our things are
53:37 ready. What are these files? These files
53:39 are created by UV. Okay. And these are
53:41 very very important. We are not going to
53:43 worry about all the files because it is
53:45 related to UV and you do not need to
53:47 worry about that. One thing is very
53:48 important. Click on the .python-version file.
53:51 So this is the file. So let's say you
53:53 are using
53:56 basically your UV for any other project
54:00 and you want to use Python version 3.13.
54:02 You just need to define it here and
54:06 that's it. This way it will use 3.13. If
54:08 you want to use 3.12 just define it here
54:12 that's it and save the file. Wow. That's
54:16 why I love UV. Okay. Otherwise, you have
54:19 to create your virtual environment based
54:22 on a specific Python version and you
54:25 need to tell it, hey, create my environment with
54:26 this particular Python version. But that
54:28 is not an ideal way. This is an ideal
54:31 way. Make sense? Okay. Very good. Now,
54:33 next thing is, "Ansh Lamba, where is our
54:34 virtual environment?" It will be
54:37 automatically created for you the moment
54:39 you run one command. It is called uv
54:42 sync. uv sync. Okay. And just make sure
54:44 the Python version is this one and it is
54:47 saved. Okay, I will hit enter and that's
54:49 it. Here is your virtual environment
54:51 created, .venv. Make sense? So what it
54:54 does, it basically syncs all your
54:57 properties defined in your codebase.
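One of the properties picked up from the project is the version pinned in the `.python-version` file. As a small illustration, the pin can be read and compared against the running interpreter. Both function names below are hypothetical, and the comparison assumes the file holds a plain dotted number such as `3.12`; other pin formats are not handled.

```python
import sys
from pathlib import Path


def pinned_version(project_dir="."):
    """Return the text stored in .python-version, or None if the file is absent."""
    pin_file = Path(project_dir) / ".python-version"
    if not pin_file.is_file():
        return None
    return pin_file.read_text(encoding="utf-8").strip()


def interpreter_matches(pin, version_info=sys.version_info):
    """True when the interpreter's version starts with the pinned numeric parts."""
    wanted = [int(part) for part in pin.split(".")]
    return list(version_info[: len(wanted)]) == wanted


if __name__ == "__main__":
    pin = pinned_version()
    if pin is None:
        print("No .python-version file in this folder")
    else:
        print(f"Pinned {pin}; running interpreter matches: {interpreter_matches(pin)}")
```

Run from the project folder with the environment's interpreter, this gives a quick confirmation that the environment was built against the pinned version.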
55:01 Simple. Very good. Now we want to
55:02 install some libraries basically
55:04 packages. So you would remember we
55:06 installed things like this: pip
55:08 install and the package name, right? But we
55:10 do not need to do that. We need to
55:13 use UV. We simply write uv add and then
55:15 the package name. Let's say the package name is
55:19 dbt-core. Right? Hit enter. It has
55:21 started downloading dbt-core. And it is
55:23 done. Very good. Now we need to install
55:28 one more thing. It is uv add dbt-databricks,
55:30 because we are using
55:32 Databricks for our adapter. Hit enter. So
55:34 now it is downloading this particular
55:37 adapter. Wow. Literally wow. So it has
55:41 almost downloaded it, and it is good.
55:45 Yes, yes, it is good. So that's how it
55:48 has downloaded this thing. "Ansh Lamba, before
55:53 UV we used to create..." What is this, man?
55:56 Yep. "So before UV we used to create a
56:00 file. It's called requirements.txt,
56:03 so that we can store all our packages.
56:05 How will we be storing this thing?"
56:07 So here in UV we have something called
56:11 pyproject.toml. This file
56:14 stores all your packages. See, it has a
56:16 list called dependencies, and it has
56:20 all the packages that you are using.
56:22 Wow. So we do not need to manage
56:25 requirements.txt as well. Yes, that's why
56:30 UV is UV.
56:31 "Ansh Lamba, let's say we still want to use
56:33 requirements.txt, because we want to use
56:35 it." Okay, you can write something like
56:39 this: uv pip freeze. As I just
56:40 mentioned, with UV you can use the exact
56:42 pip commands as well. You just need to
56:45 give a prefix, uv. That's it. pip
56:48 freeze > requirements.txt.
56:52 Hit enter. And here is your
56:54 requirements.txt file. And
56:57 see all the requirements are here. All
56:59 the requirements. Ooh, there are so
57:01 many. Yeah, obviously because these are
57:02 like all the dependencies as well that
57:04 it writes in requirements.txt. But with
57:07 TOML it just writes the parent package
57:09 name, that's it. Okay. But I just showed
57:13 you both the ways. Okay. "So now, Ansh
57:14 Lamba, a quick question, last
57:18 question." Okay. Let's say you want to
57:20 install any dependencies which are in
57:23 your requirements. So we used
57:26 to write pip install
57:29 -r requirements.txt. So here you
57:32 can simply say uv add, and instead of the
57:33 package name you simply need to write
57:35 -r requirements.txt. That's it.
57:37 Simple dimple. Perfect. "So, Ansh Lamba,
57:40 everything is done?" Yes. One thing: we
57:41 need to activate the virtual environment
57:45 as well. Yes. In order to activate the
57:46 virtual environment... I think we didn't
57:48 activate the virtual environment and we directly
57:51 installed our packages. As you can
57:55 see, it's automatic. I think so. I think
57:57 so, because we didn't enable the
57:58 virtual environment and we directly installed
58:00 the packages. Ideally we should have
58:02 enabled the virtual environment first and then we
58:03 should have installed them, because now what
58:06 happened: it would have installed all
58:08 the packages on my local machine,
58:10 basically in the global environment, not
58:11 my virtual environment. But it is fine, it
58:14 will not stop you. So in 99% of the
58:18 cases it installs the you can say
58:19 dependencies in your virtual environment
58:21 itself. But I also need to cover 1%
58:23 people because I want to help everyone.
58:25 So the thing is, even if you are not sure,
58:27 like, whether it has downloaded, or you
58:29 can say installed, the packages in your
58:31 machine, in the global environment, or in
58:34 your virtual environment, simply type uv and then
58:37 you can say remove and the
58:40 package name: uv remove, let's say, dbt-core,
58:42 that's it, and hit enter. And then you can
58:44 say uv remove dbt-databricks and hit enter. So
58:46 what it will do, it will simply remove
58:47 everything in the TOML file. As you can
58:49 see, I have also run that command so that
58:51 I can show you. So now it is empty.
58:54 Happy now? Let's reinstall it. So before
58:56 reinstalling it, make sure that your
58:59 environment is activated. But by default
59:02 it will be, you can say, activated. See,
59:04 it is showing me the folder name,
59:07 which is this one, Ansh Lamba DBT tutorial.
59:09 So this means my environment is
59:11 activated. That's why I'm seeing
59:12 this thing here. Okay. And you can also
59:14 confirm it from here as well. As you can
59:17 see, Ansh Lamba DBT tutorial, 3.12.9: that
59:19 means this is the environment being used. If
59:22 only Python 3-something is written, that
59:24 means your global environment is being
59:25 used, but you need to use the virtual
59:28 environment. Okay. But still, there's a
59:30 command: .venv, that means your
59:34 virtual environment name, then Scripts,
59:37 then activate. With this command it will
59:38 activate your environment. Obviously it
59:40 was activated, but I just reactivated it.
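Besides reading the prompt or the VS Code status bar, Python itself can report whether it is running from a virtual environment. A minimal sketch, with `in_virtual_env` as a hypothetical helper name: inside a venv-style environment `sys.prefix` points at the environment folder, while `sys.base_prefix` still points at the base installation.

```python
import sys


def in_virtual_env():
    """True when the running interpreter belongs to a venv-style environment."""
    # The two prefixes are equal only when the base installation is in use.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    kind = "virtual environment" if in_virtual_env() else "global interpreter"
    print(f"{kind}: {sys.executable}")
```

Running this with the `python` command from the terminal in question shows which interpreter that terminal actually uses.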
59:43 Okay. So that is all. Now your virtual environment
59:45 is also there. Now you can simply say uv add
59:53 dbt-core and dbt-databricks.
59:55 Let's install both together. So as you
59:57 can see, my list is back again. Okay,
60:00 makes sense. Very, very, very good. And we
60:02 can also update requirements.txt. Yes, this
60:04 is, like, boilerplate code, but you have to
60:05 write it if you are a data engineer, if
60:08 you are a developer. So it's fine. uv,
60:11 not add, pip
60:14 freeze and then
60:17 requirements.txt. Okay. So now it has
60:19 updated this requirements.txt as well. So
60:21 now this is good. Everything is fine, and
60:23 we can also confirm our DBT is working.
60:25 Simply search, not search, like, write dbt
60:28 and hit enter. It will return some, you
60:31 can say, manual guide, all that help: dbt
60:33 build, clean, and all those things. If you
60:35 are able to see this thing, that means
60:37 your installation is perfect, everything
60:40 is nice. Make sense? So that's all you
60:42 need to do, and that's all you need to do
60:44 as a prerequisite. So that means your DBT
60:47 is fine. Now your DBT is installed
60:49 successfully, and yes, even if you saw some errors, it's fine.
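As a complement to running `dbt` in the terminal, the installed package versions can be queried from Python's package metadata. This is only a sketch: `installed_version` is a hypothetical helper, and it reports on whichever environment the script is run from, so it should be started with the project's interpreter.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("dbt-core", "dbt-databricks"):
        print(name, installed_version(name) or "not installed in this environment")
```

A missing package here, while `dbt` works in the terminal, usually means the script ran with a different interpreter than the one in `.venv`.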
60:52 some errors it's fine. Okay. So now what we need to do we need to make an initial
60:54 we need to do we need to make an initial commitment which is very very important.
60:55 commitment which is very very important. Okay. So this is a get thing. So you can
60:57 Okay. So this is a get thing. So you can just copy me what I'm doing. So simply
60:59 just copy me what I'm doing. So simply write get
61:01 write get uh and see one more thing even if you
61:04 uh and see one more thing even if you know get we do not need to say get in it
61:06 know get we do not need to say get in it because in the UV init your get init
61:08 because in the UV init your get init command is embedded okay so I'll simply
61:10 command is embedded okay so I'll simply say get branch minus m main simply hit
61:14 say get branch minus m main simply hit this okay so what it will do it will
61:16 this okay so what it will do it will rename your master branch to main
61:18 rename your master branch to main because main is a new word then you can
61:20 because main is a new word then you can simply say get add then space then dot
61:24 simply say get add then space then dot hit enter then write get commit space
61:28 hit enter then write get commit space minus min then write the message in the
61:31 minus min then write the message in the double quotes initial commit. Now hold
61:34 double quotes initial commit. Now hold on do not hit enter because I need to
61:36 on do not hit enter because I need to talk to you. Look into my eyes. Okay. So
61:38 talk to you. Look into my eyes. Okay. So the thing is this is these are the
61:41 the thing is this is these are the commands that are related to get. These
61:44 commands that are related to get. These commands do not align with dbd. But if
61:50 commands do not align with dbd. But if you are using dbt locally you need to
61:54 you are using dbt locally you need to know how to run only these two commands.
61:56 So just a quick overview. If you are not
61:58 familiar with git, not a big deal. We are
62:00 not using git in detail. But there are
62:02 two commands. One is git add, which
62:05 stages the files that you want to include
62:08 in your next commit. When we
62:11 write git add . that means we want to
62:13 select all the files. Okay. And then we
62:17 write git commit to say that we want to make a
62:19 commit of this change. And then comes the
62:21 message, and we are saying "initial commit",
62:22 and hit Enter. Perfect. So now we have
62:25 made the commit successfully in our main
62:28 branch. Okay, make sense? So we'll be
62:30 learning some git as well. So it is like
62:33 a full package, which will make you
62:35 learn some git as well, some
62:37 Python development as well, and some
62:39 code setup as well. So everything is
62:41 there. And that is
62:43 why I love dbt, because you need
62:44 to learn so many things whenever you
62:46 work with dbt, and that's fine.
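The git steps described above can be sketched as a short shell session. This is only a minimal sketch, assuming git is installed. It works in a throwaway temporary folder and runs git init itself (in the video, uv init has already done that). The demo file and the identity passed with -c are hypothetical placeholders.

```shell
# Work in a throwaway folder so the sketch does not touch a real project.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet                 # in the video this was already done by uv init
echo "# demo" > README.md        # placeholder file so there is something to commit

# Rename the current branch to main. The video says "minus m"; the capital -M
# variant is used here so the rename cannot fail if main already exists.
git branch -M main

git add .                        # stage every file in the folder

# The -c flags set a throwaway identity for this one commit (an assumption,
# so the sketch runs on machines without a configured git identity).
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit --quiet -m "initial commit"

git log --oneline                # shows the single commit on main
```

Only git add and git commit are needed day to day; the branch rename is a one-time step.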
62:48 Okay. So this is done and we have
62:51 successfully set up our environment. Now
62:53 let's go back to Databricks and
62:56 set up our source: basically the
62:59 catalog, the schemas, the data, where
63:02 the data is, where the sources are,
63:03 everything. Okay, because dbt is ready now,
63:06 let's go to Databricks and set that
63:07 up. First of all, let's quickly create
63:10 the catalog, because that is
63:11 obviously the prerequisite, because we need
63:13 to set up the source, right? So I will
63:15 simply go to Catalog. Okay, and just
63:17 ignore these catalogs; these are my
63:19 catalogs that I was just playing with. So
63:21 simply click on this plus button, and
63:23 let's create a new catalog. And this
63:25 catalog name will be, let's say, dbt
63:27 tutorial.
63:29 Okay, let's say dbt tutorial dev. Okay,
63:31 because this is a dev catalog. Then
63:33 click on Create, and then click on
63:35 Configure catalog, and then scroll down
63:38 and click on Next and then Save. So your
63:40 catalog is created. A catalog is basically
63:42 equivalent to a database. Okay. So
63:44 within that we can have many schemas. So
63:46 let's create a new schema and it will be
63:48 called source. Okay. Because this
63:50 will be our source. So simply click on
63:52 Create. Perfect. Our source schema is
63:55 ready. Click on source. And now we need
63:57 to create some tables, right? So we do
63:59 not have any data so far.
64:01 The data is on my GitHub
64:03 repository. Let me just show you.
64:06 Let's click on this link. Okay.
64:09 And I can take you to GitHub. So
64:11 this is a very, very beautiful
64:13 GitHub repo, or rather not a repo, a
64:16 GitHub account. You can simply click on this
64:17 button and then you can simply go to
64:19 Repositories. Okay. So simply click on
64:21 Ansh Lamba YouTube. Okay. And within this
64:23 you need to find the right folder.
64:25 It is called dbt masterclass
64:28 here. So within that we have several CSV
64:31 files. Just make sure you download all
64:32 the CSV files, because we'll be
64:34 using these CSV files to create our
64:36 source tables. Okay. So let's create a
64:40 table. You can simply click on Table,
64:42 and it will ask you where the file is.
64:44 Simply click on the Browse button, and you
64:46 now need to upload that
64:48 particular CSV file. Okay, make sense?
64:50 Very good. So, as you can see, I
64:52 have uploaded the fact sales file,
64:54 a .csv file, and it will
64:58 create the table on top of that CSV
65:00 file. Well, not exactly on top of that CSV
65:02 file: it will take that CSV file,
65:04 convert it into Delta format, and
65:06 create the table. We do not need
65:07 to worry about that, because all of
65:10 this will be managed
65:12 by Databricks itself. So we just need the
65:14 data, that's it. And we have this data, and
65:17 we have many IDs, keys, and all
65:21 those things, right? All of these are
65:24 here. Okay, make sense? And these are the
65:26 columns that we have in this table:
65:28 promotion SK, customer, product, store, date,
65:31 and so on. All the IDs are
65:33 here. Simply click on Create table, and
65:36 then it will create the table. Okay, so
65:40 everything will be managed by Databricks.
65:42 So this is just us
65:44 preparing the source. Okay. And this way
65:46 you need to
65:48 create
65:50 tables like this, one by one. If I
65:51 scroll down, I will see all the columns
65:54 like this. Okay. And this is basically
65:55 a managed table in Databricks. Okay.
65:58 Similarly, you can simply click on Create,
66:00 or rather, not from here. Go to the
66:03 source schema. Okay. And within this you
66:06 should see that table. And if you
66:08 are not able to see it, that makes
66:10 sense, because when you created that
66:12 table, you didn't pick the right schema.
66:13 So simply click on Create and click on
66:16 Table, and then browse for the file
66:18 one more time. Here you need to pick
66:21 the right catalog as well. So I think
66:23 the catalog is fine. Source is also
66:24 fine. That means I think our table is
66:26 created. Let me just refresh it.
66:29 So you can simply go to Catalog.
66:32 Okay. And then go to your catalog, which
66:36 is called
66:38 dbt tutorial dev. Perfect. Within
66:41 source we have the table. Yeah, we have
66:42 the table. So similarly you need to
66:45 create more tables. Simply click on
66:46 source, then Create, then Create table. So
66:49 what you need to do is upload
66:50 all the files one by one: fact sales,
66:53 fact returns, dim customers, dim products,
66:55 everything, every file. And you need
66:57 to have all of those tables ready. Let
66:59 me just do it for you quickly so that
67:01 you can also do it, right?
67:03 So as you can see, I have prepared
67:06 all the tables: dim customer, dim product,
67:08 dim store, and two fact tables,
67:10 fact returns and fact sales. So all our
67:12 tables are ready, and that means
67:15 our source is ready. Okay. So
67:18 now the situation is we have our source
67:21 sitting in Databricks. Make sense?
67:23 And this is a schema. We need to create
67:26 bronze, silver, and gold layers in
67:30 Databricks using dbt models. Basically, all
67:32 the things that we would do within
67:34 Databricks, we need to do in
67:36 dbt instead. We will not be using Databricks
67:38 for our transformations. We will be
67:40 using dbt only. Make sense? So now let's
67:46 start building something called
67:49 models, and see how we can initialize the
67:51 dbt connection with Databricks and all
67:54 of that. Let me just show you.
67:57 So now it's time to write our first dbt
68:00 command. Yes, our first dbt CLI command. We
68:03 are just going to write that. And what
68:04 is that command? So basically,
68:07 whenever you want to work with dbt
68:10 on any platform (we call that an
68:12 adapter, to use the more technical term), we
68:14 need to create a few things. First of all,
68:17 there's the connection, and even before the
68:20 connection there's something
68:22 called a project. So let me just show you
68:23 the hierarchy. It is not very
68:26 technical, but yes, you need to understand
68:28 this. So first of all there is a
68:30 project. Okay? Very
68:34 good. Because obviously, whatever you
68:36 are building, you are building it in a
68:38 project, right? So the first thing will be the
68:41 project, and the project is so
68:43 important. I would say it is one
68:45 of the most important things and one of
68:47 the easiest things, but yes, one
68:50 mistake can break a lot of things.
68:52 But don't worry, I'm with you. And then we
68:54 have something called... by the way, I
68:57 should not draw this arrow, because
68:58 earlier it was supposed to be like
69:02 this: there was a
69:04 hierarchy before, but now there's no
69:05 hierarchy. We can create independent
69:07 connections as well. So then we have
69:09 something called a connection. Make
69:11 sense? There's
69:13 no hierarchy. Remember, both are
69:15 independent. We can use a connection
69:18 with any project, and we can
69:21 use many connections
69:22 within one project now. Okay. So now
69:26 first of all, let me open a fresh
69:27 terminal.
69:29 Okay. Perfect. So now I will write my
69:30 first dbt command. Just make sure you
69:33 are inside this particular folder, which
69:34 is called ansh dbt tutorial. Okay,
69:37 obviously in your case it will be a
69:38 different folder. I will say dbt
69:42 init. Okay, 3, 2, 1, let's go. And you will
69:46 see something happen in your
69:48 terminal. You need to wait for a few
69:49 seconds. Okay. So now it is saying: enter
69:51 a name for your project. Oh okay. So
69:54 here's the thing. When you are using
69:57 dbt Cloud, all these things are
70:00 turned into a nice UI
70:05 (that is why they charge for it, right?). But
70:07 when you are not using
70:09 dbt Cloud and are instead using
70:12 dbt Core, which is the free version, all
70:14 these things happen in your
70:16 terminal. But do you know what? Here is
70:19 a fun fact: coders love using the
70:24 CLI more. Okay. So, obviously it is
70:28 really fun, and let me show you why
70:30 and how. So, first of all, let's give it a
70:31 nice project name. Okay. I want to name
70:33 it, let's say, ansh
70:36 dbt
70:38 tutorial
70:40 ... no, dbt YouTube. Let's say ansh dbt YouTube.
70:43 Make sense? I will hit Enter.
70:46 Oh wow. So many things actually
70:48 happened. I can see a new folder, ansh
70:50 dbt YouTube. Okay. Do not open that.
70:52 Wait, wait. I
70:54 will let you know when. Now, if you read
70:57 everything here, it is
71:00 written: happy modeling. Okay, it's
71:02 copying me. I always say happy learning.
71:03 And now it is saying happy modeling. So
71:06 now it is asking: which database would
71:08 you like to use? There are only two
71:10 options, Databricks and Spark. "But Ansh
71:12 Lamba, we have heard that we can
71:14 work with Snowflake, Fabric, and so
71:18 on. Why are there only two options?"
71:21 Because you have only installed the dbt
71:23 Databricks adapter; that is why it is
71:25 making suggestions based on that. Okay, so
71:28 just hit 1 and then Enter, which means
71:30 you want to work with Databricks.
71:33 Perfect. So now, obviously,
71:36 you need to wait a little bit,
71:38 because it is doing its work behind
71:40 the scenes, and you will be prompted very
71:43 soon for what you need to do
71:45 next. Make sense? So it is saying: warning,
71:48 thrift,
71:50 SSL compat, using legacy validation
71:53 callback, whatever. And one more thing,
71:55 which is very important: whenever you are
71:58 working with the CLI, sometimes it can take a
72:01 lot of time, but you do not need to hit
72:04 any key. Just be patient, and
72:08 everything will appear for you. Make
72:09 sense? Okay, now see, it is asking me
72:12 one thing. What is that? It is asking me
72:14 for the host. Now, where is the host? As I
72:17 just mentioned, dbt will not be
72:20 responsible for the compute. You need to
72:22 host your own compute. Okay, where is
72:25 the compute? Where is the host in your
72:27 Databricks? Let me just show you. If
72:29 you open your Databricks
72:31 workspace, just go to Compute.
72:33 Okay, and you will see something called
72:35 a SQL warehouse. So, dbt will be using
72:38 our SQL warehouse, and this is a free SQL
72:40 warehouse that we get with the Databricks
72:42 account. So, you simply need to click on
72:43 this. Okay. And then simply go to
72:46 Connection details and simply copy this
72:48 hostname. That's it. And I will just
72:50 paste it here and hit Enter. Now it is
72:54 asking for the HTTP path. I will simply
72:56 copy this HTTP path. Basically, these are
72:58 the connection details that you need to
72:59 provide. Hit Enter. Now this is the best
73:02 part. It is asking: hey, these are the
73:05 connection details, I know, but how will I
73:07 have the power to use them? I
73:10 do not have any permission. Yes, I
73:14 will say, you can do that, and I will
73:16 provide you a token. Make sense? First
73:18 of all, hit 1, because we want to
73:20 provide a token. So just use a token. Now it
73:23 is asking: what is the token value? You
73:26 need to create a new token. Just go
73:28 to your Settings,
73:30 go to Developer, and then
73:34 under Access tokens click Manage. I
73:37 already have one token. Okay. I will
73:39 simply click on Generate new token,
73:41 and I will name it dbt YouTube.
73:44 Make sense? Now I will simply click on
73:45 Create. It will generate the token for
73:47 me, and obviously I will not show it to you. So
73:50 now here is a catch. Let me just tell
73:52 you. You will click on Generate, you will
73:54 see the token, and you just copy
73:58 it. I have also copied the value of the
74:01 token. But you know what? When I
74:04 come here, click on this terminal, and
74:07 press Ctrl+V,
74:09 I have pressed Ctrl+V, but still I will
74:12 not be able to see the value. Why?
74:13 For security reasons. And
74:15 that is a very good reason. By the
74:17 way, there could be
74:19 a better way to do that: whenever I
74:21 press Ctrl+V, it could replace
74:24 the value with asterisks, because then
74:26 at least I would know I have pasted the
74:27 value. But it is there. You just need to
74:29 hit Enter, and that's it. See, I
74:32 pasted the value, but still you will not
74:34 see anything. Perfect. Now it is asking:
74:36 do you want to use Unity Catalog
74:38 or not? Bro, you cannot say no to Unity
74:42 Catalog in 2025. Simply enter 1.
74:45 Perfect. So now it is asking for the catalog
74:47 name. What should the catalog name be?
74:49 What is the initial catalog? We have
74:51 already created the catalog. As you can
74:53 see, if I go here to Catalog, we have
74:58 catalog, catalog, catalog... dbt core
75:01 catalog dev, dbt core catalog prod. I
75:04 think these are the two, right? No, no,
75:08 no. Where is it? Here: dbt
75:11 tutorial dev. Yeah, those are my
75:13 old catalogs. Just ignore them. I was
75:15 playing with dbt a lot. So it is
75:18 called dbt tutorial dev. Okay. So, I
75:21 will simply provide the name here:
75:24 dbt tutorial dev. Even if you do not have
75:27 the catalog, it will create one for you.
75:29 But you are a good boy, or a good
75:31 girl, so you already have one. So, now
75:32 it is asking for the schema. It is fine; you can
75:35 simply say default, because we will be
75:36 changing the schema name later. Don't
75:38 worry about that. Just write default.
75:40 Threads: 1. One thread is enough. Hit
75:41 Enter, and perfect. Do you know what it
75:45 has said? Now your connection is
75:47 established. You can test your
75:49 connection by running your second dbt
75:51 command. It is called dbt debug. Hit
75:54 Enter and just wait. It will
75:56 simply run the checks, and if everything is
75:58 fine, you will see all checks passed.
76:01 And it is saying: error, not found,
76:03 dbt_project.yml, and so on. Very good. I
76:06 wanted to show you this particular
76:08 error. Why, Ansh Lamba? Obviously,
76:12 there's a catch. So, let me just wait
76:15 for it to complete. You will see one
76:17 check failed. Very good. Failures are
76:19 good. Really? Yeah, sometimes. So, just
76:23 close this terminal. And even if you do
76:25 not close this terminal, it's
76:26 fine. So, the thing is, you do not
76:30 have a dbt project at this level. This is
76:35 my parent folder.
76:38 But our dbt project is in this
76:42 particular folder. Oh,
76:45 so you first need to go inside this
76:47 folder. Then you need to run all your
76:49 dbt commands. Otherwise, it will not
76:52 be able to run anything. Okay. I will
76:54 simply say cd. cd stands for change
76:56 directory. And I will simply say ansh dbt
76:59 YouTube. So now it will take me inside
77:02 this folder. And now I will say dbt
77:05 debug and hit Enter.
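One way to picture why the first dbt debug failed: the command was run from the parent folder, while dbt_project.yml lives inside the project subfolder, and dbt does not search subfolders for it. The sketch below only imitates that check with stub files. The folder name my_dbt_project is a hypothetical placeholder, and dbt itself is not invoked.

```shell
# Hypothetical layout: a parent folder that contains the dbt project folder.
parent="$(mktemp -d)"
project="$parent/my_dbt_project"      # placeholder project folder name
mkdir -p "$project"
touch "$project/dbt_project.yml"      # empty stub standing in for the real file

# Report whether a folder directly contains dbt_project.yml.
has_dbt_project() {
  if [ -f "$1/dbt_project.yml" ]; then
    echo "found"
  else
    echo "not found"
  fi
}

echo "parent folder:  $(has_dbt_project "$parent")"   # mirrors the failed check
echo "project folder: $(has_dbt_project "$project")"  # mirrors the passing check

# In a real setup the fix is the one shown in the video:
#   cd <your project folder>
#   dbt debug
```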
77:08 debug and hit enter. And now you will see it will work fine.
77:11 And now you will see it will work fine. It should by the way. See valid, valid,
77:14 It should by the way. See valid, valid, valid. All checks passed. Very good. So
77:16 valid. All checks passed. Very good. So we have all the things ready. Okay. And
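The failed check and the fix above come down to where the command is run: dbt has to find dbt_project.yml from the folder you are in. A minimal shell sketch of that idea follows. The helper name `is_dbt_project` is hypothetical (it is not a dbt command), and it only looks at one directory, which is a simplification of whatever lookup dbt itself performs.

```shell
# Hypothetical helper (not part of dbt): does this directory hold a dbt project file?
is_dbt_project() {
  [ -f "${1:-.}/dbt_project.yml" ]
}

# Run the connection check only from a project root; otherwise say what to do.
if is_dbt_project .; then
  dbt debug
else
  echo "No dbt_project.yml in $(pwd). cd into the project folder, then run: dbt debug"
fi
```

Run from the parent folder, this prints the hint; run from inside the project folder, it calls `dbt debug` as in the video.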
77:20 I want to show you one more thing. One
77:22 more thing. What is that? It is saying
77:25 using profiles at: my C drive, Users,
77:30 then my username, then .dbt, then
77:32 profiles.yml.
77:34 So this is a profile that is created for
77:38 us, and it is one of the most important
77:40 files, I would say the most important file,
77:43 for dbt Core. Without this you cannot do
77:46 anything. Ansh Lamba, why is it so important?
77:49 Because in this particular profile, which
77:52 is a YAML file,
77:54 all the connection details, the token,
77:58 things like the HTTP connection,
78:00 the host, everything is written in that
78:02 particular file.
78:04 Okay, quick question that I also had.
78:07 Why didn't it create the file inside
78:09 this folder, and instead created it in
78:12 the C drive? So, by default, it creates
78:15 the file in this particular location.
78:17 But you know what? We can copy
78:19 this file to this folder at any time, and every time
78:22 we run any dbt command, it will first
78:25 search for this file in this particular
78:27 folder. If it does not find it there, it
78:29 will fall back to this default location. So ideally,
78:32 being a good developer, you should
78:33 provide this file here in this
78:35 particular folder. Make sense? Okay.
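The lookup order described above can be sketched as a small shell function. This is an approximation, not dbt's own code: `find_profiles` is a hypothetical helper, and the `DBT_PROFILES_DIR` override at the top is an extra step that the video does not mention (recent dbt versions document it alongside the `--profiles-dir` option).

```shell
# Hypothetical helper that mimics the lookup order described above.
find_profiles() {
  if [ -n "${DBT_PROFILES_DIR:-}" ]; then
    echo "$DBT_PROFILES_DIR/profiles.yml"   # explicit override, if you set one
  elif [ -f "./profiles.yml" ]; then
    echo "$(pwd)/profiles.yml"              # file sitting next to dbt_project.yml
  else
    echo "$HOME/.dbt/profiles.yml"          # default location in the home folder
  fi
}

echo "dbt would likely read: $(find_profiles)"
```

Running it once from the parent folder and once from the project folder (after copying profiles.yml there) shows the path switching from the home folder to the project folder.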
78:37 Very good. By the way, it is a
78:39 very small thing but it can create a lot
78:42 of disasters. So these are very
78:45 detailed points that I'm providing
78:46 to you, because it is fine
78:49 even if you do not know them, but when
78:52 you're sitting in interviews, when
78:53 you are preparing for your dream
78:55 roles, you need to know these details.
78:58 Right. Right.
79:00 So let's explore this folder. What do we
79:02 have inside this folder? And you will
79:03 see magic. Just click here. Oh,
79:07 this is your dbt. Okay, first of all,
79:10 Ansh Lamba, cool down. What is this?
79:13 First of all, we are seeing the folders
79:14 analyses, logs, macros, models, seeds,
79:19 snapshots, tests, then .gitignore, dbt_project.yml,
79:23 and README. README is fine. We know
79:24 what a README is. But Lamba, so
79:27 many folders. What is this?
79:30 What is this? So this is your dbt.
79:32 Really, trust me, this is very easy. This
79:36 is just a kind of folder structure that
79:38 we follow in dbt. That's it. That's it.
79:42 That's it. That's it. That's it. Okay.
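To see the same structure from a terminal instead of the file explorer, a short check like the one below can help. The list of entries is an assumption about what the starter template usually contains (a logs folder typically appears only after dbt has run), and `check_layout` is a hypothetical helper name.

```shell
# Hypothetical helper: report which of the usual starter entries exist in a folder.
check_layout() {
  for entry in analyses macros models seeds snapshots tests dbt_project.yml README.md .gitignore; do
    if [ -e "$1/$entry" ]; then
      echo "found   $entry"
    else
      echo "missing $entry"
    fi
  done
}

# Check the current directory; pass another path to check a different folder.
check_layout .
```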
79:43 Make sense? Very good. First things
79:45 first, I want to show you something. As
79:47 I told you, the most important thing
79:50 in this whole development is the
79:53 project file. You have a file called
79:55 dbt_project.yml. Okay. If I click here and
79:59 close the
80:01 terminal,
80:03 you will see this particular file like
80:06 this. What is this thing? So as you can
80:08 see, the name is Ansh dbt YouTube,
80:12 the profile is Ansh dbt YouTube, and then model-paths
80:15 and all those things. So in dbt
80:19 Core, whatever you do, literally
80:24 whatever you do, you have to provide
80:28 the metadata here, because this is the
80:32 backbone of dbt Core. Whatever you
80:34 write here, dbt will go to that
80:36 location and do the work. So for
80:38 example, models. I have written, well, not I, by
80:41 default dbt Core has written it for me,
80:43 and obviously I can change it. It says
80:45 that if you want to run models, dbt needs to go
80:48 to this location, which is the models
80:50 folder. See, we have models here. Make
80:53 sense? It is saying models. If I
80:55 want to rename the folder, I have to change the
80:58 path here as well. Usually we do not do
81:01 that. But if you want to do
81:04 it, if your organization wants to use
81:05 a different naming convention, you have
81:08 the liberty. You have the liberty. Okay.
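The relationship between the project file and the folders it points to can be sketched by writing a trimmed-down dbt_project.yml into a scratch folder. The project and profile name `my_dbt_project` is a placeholder, not the name used in the video, and only a handful of the keys from the generated file are shown.

```shell
# Write a trimmed-down dbt_project.yml into a scratch folder.
# "my_dbt_project" is a placeholder name, not the project from the video.
mkdir -p my_dbt_project
cat > my_dbt_project/dbt_project.yml <<'EOF'
name: 'my_dbt_project'
version: '1.0.0'

# Must match a profile name in profiles.yml
profile: 'my_dbt_project'

# Folders dbt scans; if a folder is renamed, its entry here must change too
model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
analysis-paths: ["analyses"]
EOF

# Show the line that ties "dbt run" to the models folder.
grep 'model-paths' my_dbt_project/dbt_project.yml
```

If the models folder were renamed to, say, `transformations`, the `model-paths` entry would need the same new name for dbt to keep finding the models.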
81:11 Makes sense? Very good. So we are going to
81:13 cover dbt_project.yml in a lot of
81:16 detail, because obviously everything will
81:18 be configured here. So you do not need
81:20 to worry about that. But I just wanted
81:22 to show you a first glimpse of
81:24 the dbt_project.yml file, because it is
81:26 the most important file. Okay, make
81:29 sense? And whatever we have here, do not
81:30 worry about it, because we'll be
81:32 covering that. Now, if you look at this
81:35 file, this is called a YAML file. Okay?
81:38 Some people say YAML Ain't Markup Language, some
81:40 people say Yet Another Markup Language.
81:42 You can just say YAML. So if you look at the
81:44 language mode on this button, it
81:46 says YAML. But you know what? We
81:49 cannot keep this particular mode. Why?
81:52 Because we'll be using Jinja functions,
81:54 Jinja templates. By default, the YAML mode
81:58 doesn't know what Jinja within YAML is.
82:01 So, we need to tell it. How can we
82:03 tell it? There are many ways.
82:05 But the best way to tell it, and I would
82:08 say this way is not just for telling it
82:11 that we are using Jinja within SQL or
82:13 Jinja within YAML; this particular thing
82:16 is very handy when you are doing local
82:18 development for dbt. It's called dbt
82:21 Power User, or basically the Power User for dbt extension.
82:24 Let me just show you. If you go here and
82:27 search dbt and hit enter, there are
82:31 basically two very popular extensions,
82:33 but this one is more popular: Power
82:35 User for dbt. Just click here and click
82:38 on Install, and you can simply install
82:40 it. What will this do? It will make your
82:43 coding better. Why? Because it will
82:45 autocomplete your code. It will give you
82:47 suggestions. It will give you lineage.
82:49 It will give you almost all the features
82:52 that are available in dbt Cloud. That's
82:54 why I love it. You can build graphs, you
82:56 can build the DAG, you can build anything.
82:58 And after this extension, you will feel
83:00 like, why do I need to use dbt Cloud when
83:02 everything is available here? See, dbt
83:06 Cloud is a managed product, and
83:08 obviously there are many reasons
83:09 why you should use it. But when you
83:11 are learning, when you do not want to
83:13 pay, then obviously dbt Core is for you.
83:16 But when you are in an organization,
83:18 the organization would love to work with dbt
83:20 Cloud, I'm quite sure. Okay. So simple,
83:23 simple, simple. So now this extension is
83:25 there. Okay. Now if I go to, let's say,
83:29 files... actually, we can
83:33 obviously do a lot with this
83:35 extension. So one thing that I
83:37 want to tell you: just click on this dbt
83:39 Core button, which is here at the bottom.
83:43 Just click here and you will see Setup
83:45 Extension. Let's set up this extension,
83:47 which is this particular setup for dbt.
83:49 Okay. So it is saying Select Python
83:51 Interpreter. Yes, we have done that.
83:53 Assistant file types, sorry,
83:55 Associate File Types is the thing that I
83:56 wanted to mention. Just click here and
83:58 simply pick Associate File Types. You
84:00 can see it is telling us that for dbt
84:04 Power User to work, *.sql
84:07 file types need to be associated with
84:10 the value sql or jinja-sql, and *.yml file types
84:15 should be associated with the value yaml
84:18 or jinja-yaml.
84:21 Make sense? This is the thing that I'm
84:23 telling you, because
84:25 this particular extension will not be
84:28 able to understand if you are, let's say,
84:30 writing Jinja code. What
84:31 is Jinja? Again and again I'm
84:33 saying Jinja. What is Jinja? It's an
84:36 amazing, amazing templating framework. Okay.
84:38 So anytime you want to use Jinja in your
84:42 YAML or in your SQL, it will be almost
84:45 impossible for the editor to actually
84:49 understand what's going on. But when we
84:51 set this up, it will be
84:53 one of the easiest things for it to
84:56 understand what's going on. So it is
84:58 telling us what we need to do. Make
85:00 sense? Okay. So what I will do now: I
85:02 will simply click Associate File Types.
85:04 Okay. And I will pick this one. So now
85:08 it will give me this particular option.
85:09 I will simply click Add Item. I will enter
85:12 *.sql. That means for all the files with the
85:16 .sql extension, I want to provide the
85:18 value jinja-sql. Okay. Perfect. I want
85:22 to add one more item.
85:26 Oops.
85:28 *.yml,
85:29 and the value will be jinja-yaml. Make
85:33 sure you are writing yaml here in the value, and here in the pattern
85:36 just yml. Okay. Just click OK, and that's
85:40 it. That's it.
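The two items added in the dialog correspond to VS Code's `files.associations` setting. As a sketch, the same mapping could be written by hand into a workspace settings file. Placing it in `.vscode/settings.json` is an assumption here; the dialog in the video may store it in user settings instead. The `jinja-sql` and `jinja-yaml` language modes are only available once the extension setup has provided them.

```shell
# Write workspace settings only if none exist yet, so nothing is overwritten.
mkdir -p .vscode
if [ ! -f .vscode/settings.json ]; then
  cat > .vscode/settings.json <<'EOF'
{
  "files.associations": {
    "*.sql": "jinja-sql",
    "*.yml": "jinja-yaml"
  }
}
EOF
fi

# Print the settings so the two associations can be checked by eye.
cat .vscode/settings.json
```

Note the asymmetry the video points out: the file pattern uses `yml`, while the language mode is spelled `jinja-yaml`.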
85:43 Now you can simply close this, and you can click on
85:45 Welcome and tick it, because you
85:47 have just done that. Run dbt deps: we will
85:51 run it, don't worry. Okay. You can simply
85:53 click Finish Setup, Mark All as Done. Very
85:56 good. Simply close this. Now if you open
85:59 this particular file, dbt_project.yml,
86:02 you will observe that the logo has changed.
86:04 See, earlier it was a red logo, now it is a
86:06 grayish logo. And now if you click on
86:08 this file: earlier you were seeing
86:10 YAML here. You can even go back to the
86:14 previous section; you can
86:16 rewind the
86:19 video a little, okay, and you
86:22 will see that earlier it said
86:24 just YAML, but now it says
86:27 Jinja YAML.
86:29 Make sense? Okay. So this was a very small
86:33 thing. But again, details:
86:36 details are everything when you are a
86:38 data engineer. You cannot miss even one
86:40 thing. Make sense? Very good. Very very
86:42 very good. By the way, take notes. These
86:46 are very detailed points. Okay? And
86:48 you need to note each and every thing.
86:50 Make sense? By the way, in your case,
86:52 you will see folders with different
86:54 icons, because I'm using a theme. So do
86:56 not worry about that. This is not
86:57 related to dbt. This is my VS Code setup.
87:00 If you want to download this
87:02 theme, you can go to the themes.
87:05 Just go to Extensions, and I think the
87:06 theme is called, I think, folder
87:09 something,
87:11 I don't know, the
87:13 Material, I think. Yeah, Material
87:16 Icon Theme. You can download it if
87:17 you want. Okay. So if I go here, I
87:19 have already installed it. It's an
87:21 amazing theme, as you can see from my lovely
87:22 folders. Okay. So that's it. That's it.
87:25 See, these are the things
87:26 that make developers happy.
87:29 What does a developer want? What does
87:31 he or she want? Nothing, just beautiful
87:33 folders, nice colors, a nice theme, and
87:36 that's it. Okay.
87:40 So let's see what we have next. Now I have done
87:44 a small thing that you also need to do.
87:46 And what's that? I have copied the
87:51 profiles.yml file. Remember, when we were
87:54 configuring our connection, we
87:56 saw a location: C drive, Users,
88:00 then a folder, then one more folder, then
88:02 profiles.yml. Obviously in your case it
88:04 will be different, and most probably it
88:06 will be on the C drive. Okay. One thing:
88:09 it will be in the .dbt folder, and
88:12 you may need to show hidden folders to see it. I would
88:14 expect that as a data engineer you know how to
88:16 show hidden folders: you can simply
88:18 right-click or click on the
88:20 three dots in your File Explorer, and you
88:22 just need to enable hidden items,
88:24 because folders that start with a
88:26 dot are often hidden by default. So you
88:28 need to enable that in order to see the
88:30 file, and that's it. You just need to
88:32 bring that file here. Just copy and
88:34 paste it into the root directory. Let me
88:36 show you the folder structure so
88:38 that you will not be confused. You need
88:40 to paste that file. You do not need to
88:44 move that file. Just copy
88:46 it. Just create another copy of that
88:48 file. That's it. And paste that file
88:51 into the dbt YouTube folder. That means in
88:54 the root directory of your dbt project.
88:59 Make sense? Inside this folder, not
89:02 inside this one. Inside this dbt
89:06 project folder, and at the same level
89:08 where you have dbt_project.yml.
89:11 Make sense? Very good.
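The copy step above is done through the file explorer in the video. A command-line equivalent might look like the sketch below, assuming a Unix-like shell (Git Bash on Windows, or macOS or Linux). `copy_profiles` is a hypothetical helper, and the project path in the usage line is a placeholder.

```shell
# Hypothetical helper: copy (not move) profiles.yml from a source folder into a
# project root, refusing if the target does not contain dbt_project.yml.
copy_profiles() {
  src_dir="$1"
  project_dir="$2"
  if [ ! -f "$project_dir/dbt_project.yml" ]; then
    echo "$project_dir has no dbt_project.yml; not copying." >&2
    return 1
  fi
  cp "$src_dir/profiles.yml" "$project_dir/profiles.yml"
}

# Dot-folders such as ~/.dbt only show up when listing with -a.
ls -a "$HOME" | grep '^\.dbt$' || echo "no ~/.dbt folder found yet"

# Usage (placeholder path): copy_profiles "$HOME/.dbt" path/to/your_project
```

Because the helper uses `cp`, the original file in the home folder stays where it is, which matches the advice to copy rather than move.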
89:14 Now you will be seeing that I have two files: profiles.yml
89:16 and profiles ps.yml. So basically, what I
89:19 have done: this is my original file,
89:22 in which I have all the
89:23 information, and this is the file
89:26 in which I have, obviously,
89:31 removed the token. I don't want to show
89:32 it. Obviously I will revoke the
89:34 token, so even if it is visible that is fine,
89:36 but just to be on the safe side I have
89:38 created a duplicate copy of it. You
89:39 do not need to create that. I have
89:41 created it so that whenever I am
89:43 demonstrating this file, I'll be
89:45 using the ps one. ps means pseudo. So I have
89:47 created profiles ps.yml, and I will be
89:51 leaving the token field empty. But in your
89:54 case it will be filled in, so you do not need to
89:55 worry about that. Okay, make sense? Very
89:58 good. So now everything is fine, right?
90:01 So in dbt_project.yml,
90:03 the first thing you need to make sure of is
90:05 that your profile is not set to default.
90:09 It should be set to the same name as
90:12 your project, which is this one, Ansh
90:15 dbt YouTube, and this
90:18 same name should be in this
90:22 profiles ps.yml as well. Make sense? So
90:25 the two should match exactly.
90:28 So if I open profiles ps.yml,
90:31 you will see that this is our profile
90:35 name, Ansh dbt YouTube, and within this
90:38 profile we have two connections.
90:40 Well, not two; we'll be creating the
90:42 second one very soon, but for now we just
90:43 have one, the dev environment, which has this
90:46 catalog, host, HTTP path, schema set to default,
90:49 threads set to one, the token, which I have removed,
90:51 and type set to databricks. You do not need to
90:52 remove your token, okay, because it will
90:54 be used to authenticate against your
90:57 SQL warehouse. Make sense? Very
91:00 good. So now make sure this name matches
91:03 here, with this one and with this one as
91:07 well. Perfect.
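The name match described above can be checked mechanically. The sketch below writes a stripped-down project file and a Databricks-style profile into a scratch folder and compares the names. Every value (`my_dbt_project`, `my_catalog`, the host, the HTTP path) is a placeholder, the token is left empty as in the demo copy shown in the video, and `profile_matches` is a hypothetical helper. Strictly, what has to line up is the `profile:` value in dbt_project.yml and a top-level key in profiles.yml; using the project name for both is the convention followed here.

```shell
# Scratch folder with a minimal project file and a matching profile.
mkdir -p profile_demo
cat > profile_demo/dbt_project.yml <<'EOF'
name: 'my_dbt_project'
profile: 'my_dbt_project'
EOF

cat > profile_demo/profiles.yml <<'EOF'
my_dbt_project:
  target: dev
  outputs:
    dev:
      type: databricks
      catalog: my_catalog
      schema: default
      host: example.cloud.databricks.com
      http_path: /sql/1.0/warehouses/placeholder
      token: ""
      threads: 1
EOF

# Hypothetical helper: succeed only if the profile named in dbt_project.yml
# exists as a top-level key in profiles.yml.
profile_matches() {
  wanted=$(grep '^profile:' "$1/dbt_project.yml" | cut -d: -f2 | tr -d " '\"")
  [ -n "$wanted" ] && grep -q "^${wanted}:" "$1/profiles.yml"
}

if profile_matches profile_demo; then
  echo "profile names match"
else
  echo "profile names differ"
fi
```

The helper is deliberately simple: it assumes the `profile:` line has no trailing comment and that the profile key starts at the beginning of a line.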
91:10 So now, first things first: everything is set up. I would say 20 to
91:13 25% of the work is done. Really? Yes.
91:16 Because these are the things that are
91:18 expected from you, because whenever
91:20 someone is hiring you, they do not know
91:22 anything about dbt. They just know that
91:24 they want to hire someone. If you
91:26 say, hey, I just want to develop, I
91:28 don't want to configure and I do not
91:30 know how to configure things, they
91:32 will say, okay, bye-bye. So you need to
91:34 know these things, because,
91:36 see, this technology is really new. No one
91:38 will be there to help you out. You need
91:40 to figure things out. And how? After
91:42 watching this video, right? Okay.
91:45 So first of all, let me create a simple model.
91:48 Directly a model, Ansh Lamba? Yes, yes, yes. So
91:52 first of all, you have so many folders,
91:54 right? Okay, very good. Now within these
91:56 folders, let me zoom in. No, it's fine.
92:00 Within the models folder, this is our
92:02 you can say,
92:05 favorite folder, because we'll be
92:08 working a lot with this folder.
92:10 Okay, very good. And it is not empty. If
92:13 I click on it, you will see a
92:15 subfolder within it. It's
92:17 called example. And then within
92:19 example, we have one, two, two SQL files
92:22 and one schema.yml file.
92:25 I didn't create that. I know. I know.
92:28 But still, it is there for you. So I
92:30 don't want to keep it. I will simply
92:31 delete it. Why? Because I prefer my own
92:34 naming convention, and I prefer creating
92:38 everything in the medallion architecture
92:40 way: bronze, silver, gold. I
92:42 don't like to keep example and blah blah
92:44 blah. It is just there for quick
92:46 reference, and we do not like references.
92:49 Yeah, sometimes we don't. So I will simply
92:51 delete
92:54 the example folder. Yeah, just delete the
92:56 whole example folder. Now if I click on
92:58 models, it is empty. Perfect. That's
93:00 what I want. Now I will create a
93:02 folder inside it. Click on the models folder
93:04 and click on the New Folder button. And I
93:06 will simply name it bronze.
93:10 Perfect. And I will click on models
93:12 one more time and create one more folder
93:15 at the same level as bronze.
93:17 Just make sure you're not creating the
93:19 folder within bronze. So bronze, silver, and
93:23 then click on models one more time, and
93:25 then gold.
93:28 Okay, perfect. Now I will create one
93:31 more folder. It's called sources. And
93:33 this is an advanced feature. Okay.
93:35 Really? Yes. Source or sources? Let's
93:38 say source. Okay. Source. And this is an
93:42 advanced feature that we'll be
93:43 talking about. Don't worry. Okay. So
93:44 this is our medallion
93:46 architecture that we have set up in
93:48 our folder structure. Make sense?
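The folder clicks above amount to removing the starter example and creating four sibling folders under models. A shell sketch of the same steps, with `my_dbt_project` as a placeholder for the project folder:

```shell
project_dir="my_dbt_project"   # placeholder project folder

# Remove the starter example models, if present.
rm -rf "$project_dir/models/example"

# One folder per layer, all directly under models/ (none nested inside bronze).
for layer in bronze silver gold source; do
  mkdir -p "$project_dir/models/$layer"
done

# List the result: bronze, gold, silver and source should be siblings.
ls "$project_dir/models"
```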
93:51 first of all, we know that we want to
93:54 populate the bronze layer. That is for
93:56 sure. That is for sure. Very good. So if
94:00 I just go to, let's say, my Databricks, I
94:04 know that I want to pull the data right
94:08 into the bronze layer. Make sense? So this
94:11 is our source, and just a quick note: I
94:15 have added one more table, which is
94:16 called dim date, as you can see. Um, why?
94:19 Because there are some, you can say,
94:22 advanced scenarios that we can
94:23 cover if we have a date dimension as well,
94:25 basically a date column. And you can also
94:28 create this particular table. How? I have
94:31 just dropped the CSV in my GitHub repo.
94:34 You can just download it and
94:35 create the table as I just explained how
94:37 you can create the table, right? So
94:39 just create that table within the source
94:41 and that's it. This is our, like,
94:43 basically source that we are using.
94:44 Make sense? By the way, do you know
94:46 what happened with me right now? Do you
94:49 know I recorded
94:52 almost all the bronze layer
94:53 ingestion and now I have to
94:56 re-record. No worries. No worries. And
94:58 that's why you are seeing this file
94:59 sources.yml. Don't worry. Let me just
95:00 delete this first of all. I will just
95:02 create a fresh file for you. Okay. So
95:05 basically we have our folders ready:
95:08 bronze, silver and gold. So these are
95:11 the folders and these are ready.
95:13 Obviously these are empty and this
95:15 bronze folder was almost full, but no
95:17 worries, no worries. Anything for you,
95:19 my lovely data fam, anything. So now let's
95:22 recreate this. Okay. So in the bronze
95:25 folder we want to populate the data from
95:29 the source in as-is form. Why?
95:32 Because, uh, according to the medallion
95:33 architecture rules, in the bronze layer
95:35 we do not perform any kind of
95:36 transformations. We simply want to pull
95:38 the data. Makes sense. Now you will be
95:40 like, very excited: What is the code? How
95:43 can we create the table and blah
95:44 blah blah? It is very simple.
95:47 Let me just show you. Let me just click
95:49 on this bronze folder. Okay. Let me just
95:51 click plus, new file, and let's create
95:53 bronze
95:56 sales.sql. Yes, SQL. So within this you
96:01 want to create a table in your bronze
96:03 layer in Databricks. But you are
96:05 in DBT. How will it create that? You will
96:07 think, okay, we would need to write
96:09 something like this: create table blah
96:11 blah blah. No, nothing. You just
96:14 need to write your select statement and
96:16 that's it. Really? Yes. Let's say I want to
96:19 pull all the data from the source, so I
96:21 will simply write this: select * from,
96:24 um, Databricks,
96:29 uh, what was the catalog name? What
96:31 was the catalog name? Let me just check
96:33 it in profiles.yml. Okay. So, the catalog
96:38 name is DBT tutorial dev. Okay.
96:43 DBT tutorial, dot source, as you know.
96:47 Then fact sales. Okay.
96:52 So, this is my query. Just remember, we
96:54 do not need to write a semicolon. Okay?
96:56 When we are working with DBT, do
96:58 not write a semicolon, otherwise it will
96:59 throw an error. So, this is our file.
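As a rough sketch, the hard-coded bronze model described here might look like the following. The file path and the underscored identifiers (`bronze_sales.sql`, `dbt_tutorial_dev`, `source`, `fact_sales`) are assumed spellings of the names spoken in the video, not confirmed by it.

```sql
-- models/bronze/bronze_sales.sql (assumed path)
-- A dbt model is only a SELECT; dbt generates the CREATE statement around it.
-- There is deliberately no trailing semicolon.
select *
from dbt_tutorial_dev.source.fact_sales
```

The three-part name is catalog, schema, table, which is how Databricks Unity Catalog addresses a table.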
97:01 Let me just first of all save it. Okay.
97:04 Yes. And currently we are on the main
97:06 branch. So do not worry about that. We
97:08 will be creating feature branches
97:10 as well because you know that we should
97:12 work in the Git manner. But currently I
97:14 just want to focus more on this
97:15 particular thing. Once you have this
97:17 fundamental, then we will be switching
97:19 to the branches. Okay. So this is the
97:22 way we should query the data. Okay.
97:25 Now in DBT Cloud we get amazing UI
97:28 things so that we can run our
97:30 queries, preview the data and all. But
97:33 what's in our DBT Core? Hm. Nothing. But
97:37 with the help of that extension, we are
97:39 very lucky that we can use almost
97:42 all those things that we use in DBT
97:43 Cloud. Can you see this toolbar? This is
97:46 because of that particular DBT
97:48 extension, the Power User for DBT extension.
97:50 If you click on this play button, it
97:53 will execute this SQL command for you,
97:55 literally. And you know what? You will
97:57 say, Ansh Lamba, okay, this is the output.
98:00 How is it running this particular
98:01 SQL query? It is using our compute, which
98:04 is in Databricks. Really? Yes. So
98:07 this is the output, and in your case if
98:09 it is a different color you can simply
98:11 pick the colors from here. Maybe it will
98:13 be like this and you can simply
98:15 click on this paint button. You can pick
98:17 this as well, vapor, and oh, this is also
98:20 nice, man. Solarized dark, then monokai.
98:25 This is also nice. Let's pick this one.
98:27 Uh, yeah, this is, or let's say
98:32 product. Product is nice. Okay. So this
98:34 is the query. Okay. And we can also look
98:37 at the lineage, and let me just reset it.
98:39 Yeah. This is the latest lineage. The
98:41 previous one was the updated one that I
98:43 will just show you. Don't worry,
98:45 because I'm just re-recording it. So
98:47 this is the thing, like bronze sales, and
98:49 we do not have any kind of lineage. Why?
98:50 Because this is the only piece that we
98:52 have. This is the only object. This is
98:55 the way that
98:57 a lot of people prefer doing it.
98:59 But according to me, this is not the
99:01 ideal way. Really? Why? See,
99:05 this is the query. Make sense? Very
99:08 good. This is the table and we know that
99:10 we are pulling the data from this
99:12 particular source. But if you want to
99:16 track it in future, if you want to
99:18 keep the lineage robust enough to know
99:22 the sources whenever you want, you
99:25 should use dynamic sources, and this is,
99:29 you can say, a recently added concept in
99:32 the DBT world as well, like very
99:35 recently, but yeah, it was not very
99:37 popular before. Obviously these are
99:40 like small, advanced things,
99:41 so it's my duty to tell you all the
99:44 advanced things as well. So instead of
99:46 writing this, it will also work. It's
99:48 not like it will not work.
99:50 It will work. But the professional way
99:52 is we should create a source. So for
99:55 now, if I just run this command, if I go
99:58 to lineage, I just see this particular
100:00 model. I do not know how this model...
100:03 what is this saying, man? An error
100:05 occurred while trying to execute a
100:07 query. Why? Why? Why? Why? Why?
100:13 Why?
100:14 Syntax error? What is the syntax error? App,
100:18 DBT version, DBT profile name,
100:22 DBT YouTube, target name, detailed error.
100:25 Let me just close it. Let me just run it
100:27 one more time. This query was
100:28 running just 2 minutes back. Just 2
100:31 minutes back. And it is running now as
100:33 well. Huh. So now if you just click on
100:37 the lineage.
100:38 Oh man. So here you are not seeing from
100:43 where you are pulling this table.
100:46 Oh, makes sense. Makes sense. Point,
100:48 valid point. So we do not have something
100:51 like this: hey, this is the source and
100:53 this is coming from here. We should add
100:57 that source if you want to become an
100:59 efficient data engineer, like a top data
101:01 engineer, and especially with DBT it's
101:04 very, very important. So that's why I
101:06 created this folder called source. So
101:08 within this I will create a source
101:11 profile, not profile, I would say source,
101:13 you can say, property. I will simply click
101:15 on this plus file and I will create a
101:18 file called sources.yml.
101:21 Yes, it will be a YAML file. Okay, make
101:24 sense? And now it is saying, hey, this is a
101:28 YAML file, and what will be the structure
101:31 and what will be the code, what will be
101:33 this, this, this, that, that? Let me just
101:34 show you, it's very simple. Let me just
101:36 take you to the documentation. Just
101:37 search, um, what will be the
101:40 best keyword? Uh, sources, sources and
101:43 DBT. Perfect.
101:46 Okay. Scroll down. Just click on any
101:48 link. You will be landing on something
101:50 relevant. Obviously, obviously the
101:52 DBT documentation. So it is saying
101:53 add sources to your DAG. Make
101:55 sense? So if you scroll down, this is
101:57 the code that you will be using to
102:00 create your source. Do not worry about
102:03 anything. This is code that you do
102:05 not need to remember. You need to
102:07 understand it. YAML is not for learning.
102:10 YAML is for understanding. Make sense?
102:13 And do not be hesitant to work
102:17 with YAML because YAML is everywhere: in
102:20 DBT, in CI/CD life cycles, like
102:22 literally everywhere. And you do not
102:24 need to remember it, just understand it.
102:26 Okay. So let me just copy this code.
102:28 Okay. And let me just show you one more
102:30 thing, that we have some functions. Okay.
102:32 Okay. Let me just show you after that,
102:34 one by one. Okay. So let me just paste
102:36 this code here and let me just remove
102:38 the unnecessary stuff. Perfect. So what
102:40 this code is saying: it is saying, first
102:42 of all, version two. Make sense? This is
102:44 the kind of version that we know.
102:45 Then sources, that means this is
102:48 particularly defining sources. Very
102:51 good. Then name. Then we need to define
102:54 this source. Basically we need to
102:57 provide a name. Okay. And ideally in the
102:59 DBT world we should provide, or basically
103:03 we should create, one source per schema.
103:07 This is not a hard rule, but
103:10 this is a very strong recommendation by
103:12 DBT. Let me just show you the
103:13 documentation. So it says: by default,
103:16 schema will be the same as name.
103:18 Okay. Add schema only if you want to use
103:21 a source name that differs from the
103:22 existing schema. That means it
103:24 says, hey, you should use the schema, okay, for
103:29 your source name, it should be equivalent. So
103:31 I'll simply say source. Why source? Just
103:34 let me know in the comment section. Why?
103:36 Because my schema name is source,
103:39 coincidentally.
103:41 So I will simply say source. Simple. Okay.
103:44 And what is the database name? In the DBT
103:47 world, database name is equivalent to the
103:49 catalog in Databricks. Okay. So it is
103:51 DBT tutorial dev. Very good.
103:53 Now we can simply define all the
103:56 tables within this list. This is a list
103:59 because there's a hyphen after
104:01 this. So we can simply say this is
104:04 tables. Okay, tables is basically a key.
104:07 Make sense? Tables is basically a key.
104:09 Within this key, we have a list of
104:11 dictionaries. Let's say this is a key.
104:13 This is a value. This is a key. This is
104:15 a value. Make sense? So this is
104:18 basically the list for the tables.
104:21 And this is the first item. And this is
104:23 the second item. If I want to write the
104:26 equivalent JSON for this, it will look
104:28 like this. Let's say sources.
104:32 Uh, sources.
104:34 Yeah. Perfect. Yes. So sources is just
104:39 one key. Okay. And then within this we
104:41 have a list of all the sources. So
104:45 everything is inside this particular
104:47 dictionary. Okay. Because we have name
104:50 is source. Perfect. Database, this. Schema,
104:53 this. Tables: tables is itself a list.
104:57 Why? Because we have these particular
104:59 two values. Make sense? Then within this
105:03 list we have a dictionary of key-value
105:05 pairs: name equals orders, and that's
105:07 it. See, name orders, then name orders.
105:10 That's it. That's how you can feel
105:13 YAML more and more, whenever you try
105:16 to relate it with JSON, because you are
105:18 already familiar with JSON or basically,
105:20 uh, your dictionaries, and now YAML. Simple.
105:24 And just tell me one thing, just be
105:26 honest: which one is easier to read
105:30 once you understand that? Okay, once you
105:32 understand both, now if you want to
105:34 make a choice, which one is easier
105:36 to read and basically
105:39 makes the code look cleaner? Obviously
105:43 YAML. See, it is so, so, so neat and clean,
105:46 whereas, see, dictionary, then list,
105:50 then double quotes and all those things.
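To make the YAML-versus-JSON comparison concrete, here is a small sketch of the same structure written both ways. The table names `orders` and `customers` follow the sample in the dbt documentation, and `dbt_tutorial_dev` is an assumed spelling of the catalog name; the two documents are separated by `---` only for illustration, and a real properties file would contain just the first one.

```yaml
# Block style: a key holding a list of dictionaries.
version: 2
sources:
  - name: source
    database: dbt_tutorial_dev
    schema: source
    tables:
      - name: orders
      - name: customers
---
# The same data in JSON syntax (JSON is also valid YAML).
{
  "version": 2,
  "sources": [
    {
      "name": "source",
      "database": "dbt_tutorial_dev",
      "schema": "source",
      "tables": [{"name": "orders"}, {"name": "customers"}]
    }
  ]
}
```

Each hyphen in the block form corresponds to one element inside square brackets in the JSON form, and each indented `key: value` pair corresponds to an entry inside curly braces.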
105:53 That's why YAML is a choice for big
105:55 projects as well. Make sense? This was
105:57 just like a quick
106:00 your YAML class, because see, it is really
106:02 important. It is directly related to
106:03 this. Okay. So now name is... let's name
106:06 all the tables. Fact sales.
106:20 Fact returns. Okay. Dim date. Yes, we have dim date.
106:25 Then we have name dim store. Yes. Dim
106:28 product. Yes.
106:30 And then we have dim customer. Perfect.
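Put together, the sources file dictated here would look roughly like the sketch below. The file path and all underscored identifiers are assumed spellings of the spoken names; only the six table names and the source, schema, and catalog names come from the video.

```yaml
# models/source/sources.yml (assumed path)
version: 2

sources:
  - name: source                # referenced later as source('source', ...)
    database: dbt_tutorial_dev  # the Databricks catalog
    schema: source              # optional here, since it matches the name
    tables:
      - name: fact_sales
      - name: fact_returns
      - name: dim_date
      - name: dim_store
      - name: dim_product
      - name: dim_customer
```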
106:33 So these are done. Let me just save
106:35 this. Perfect. Do you know what? Now I
106:38 can use these sources directly from
106:41 here. Really? Yes. Let me just show you.
106:45 So there's a function for this and you
106:47 need to use double curly braces for
106:49 that. And I will simply use source. This
106:51 is basically a Jinja function. Okay.
106:53 Source,
106:55 source,
106:57 source, and then we need to provide the
106:59 source name. Source name is
107:00 coincidentally source. Okay. Then we
107:04 need to pick the table name. Basically
107:06 which table you want to pick
107:08 from that source. I want to pick
107:10 fact sales. Make sense? Okay. Let me
107:13 just save it and run it. I should see
107:15 the exact result I was seeing before.
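A minimal sketch of the rewritten model, assuming the source is named `source` and the table is spelled `fact_sales` in the sources file:

```sql
-- models/bronze/bronze_sales.sql (assumed path)
-- source(<source name>, <table name>) looks up the relation declared in
-- the sources file and registers the dependency, so it appears in lineage.
select *
from {{ source('source', 'fact_sales') }}
```

At compile time dbt replaces the Jinja expression with the fully qualified relation, something like `dbt_tutorial_dev.source.fact_sales`; the exact quoting depends on the adapter.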
107:18 Plus I will just show you the advantage
107:20 as well. See the lineage. Now let me
107:22 just refresh it. See, now you are seeing
107:26 this particular... Oh, this is so smooth,
107:28 man. Nice.
107:29 So now you are seeing fact sales as a
107:32 source for this bronze sales. Make
107:35 sense? Now you are seeing this. So now
107:37 you can see the advantage of using
107:40 source. Make sense? Very good. So now
107:43 what will I do? I will simply quickly
107:46 create all the other models as well for
107:48 bronze. I will say bronze returns
107:52 .sql.
107:53 Let me paste it here. Let's say
107:56 returns. Okay. Let's save and run it to
108:00 make sure it is fine.
108:04 Bronze, let's say customer.
108:08 Let's paste it here, dim customer.
108:13 Like, obviously you do not need to create
108:15 all the bronze models, because this is
108:17 not a project. This is a tutorial where
108:18 we are just learning new things
108:20 within, you can say, DBT. We do not need
108:22 to create all the bronze layer, all the
108:24 things. This is just a demonstration for
108:26 you so that you can feel relevant.
108:28 Okay. And dim customer. Let's run this.
108:32 But still I'm creating everything
108:33 for you. Bronze and then date dot
108:37 SQL.
108:40 And then I can say dim date.
108:55 YouTube sales, uh, what was the source name? Source,
109:00 and then
109:02 let's say this one is
109:04 dim... dim is done, I guess. Yeah, now we
109:07 can say,
109:08 uh, dim customer is also done. Now dim
109:11 product.
109:13 Okay, makes sense. Let's run this.
109:16 Oh, this is date. Okay,
109:28 compilation error. What is the error? What is the error? There's a problem in
109:30 DBT compilation, project failed. Bronze
109:32 date. Bronze date. Bronze date.SQL. Oh,
109:35 because we do not have C in this.
109:38 See, see C.
109:47 Let's run this. Perfect. Dim date is also done. Now let's create
109:51 bronze,
109:53 um, product
109:56 dot SQL,
110:12 and our last table is dim supplier. What was that? Dim store.
110:17 Okay. Dim store
110:26 .sql. Perfect. Dim store is also done.
110:35 Oops. Oops. Perfect. So all the models are ready.
110:38 Really? These are just the scripts, bro.
110:40 Where are the models? Now we will be
110:42 running our models. So basically
110:44 the bronze folder is ready.
110:46 Dynamically, we are using sources, not
110:50 just hard-coded values. Right. Now I will
110:52 just ask you one thing, one small thing.
110:54 What's that?
110:56 Now if I want to run these models, how
111:00 can we run these? Because see, we just
111:02 have a select statement, just a select
111:04 statement. We do not have anything else.
111:07 But that's the magic of DBT. Okay. Now let
111:11 me just show you something. If I just go
111:14 to dbt_project.yml, if you just
111:16 scroll down, you will see something
111:17 called models, and DBT YouTube, which is
111:20 my project name. Within that we have
111:23 example. If you remember, we had an example
111:26 folder within the models folder, but we
111:30 removed it. We created one folder called
111:31 bronze. So we need to rename it here as
111:34 well.
111:36 Make sense? So now, Ansh Lamba, what is
111:38 this materialized view and this plus
111:40 sign? That means whatever you have
111:42 within the bronze folder, it will create
111:45 views in Databricks. But you will
111:48 say, Ansh, I want to create tables because
111:50 I love tables. So you can simply replace
111:53 this value with table. Now whatever you
111:56 have, like whatever models you are
111:59 building, if I run these models, these
112:02 models will be creating tables for me in
112:05 the Databricks catalog and schema.
112:09 Okay. And first of all save this file.
112:11 So this is the way you materialize it.
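The edit described here would leave the `models` block of the project file looking roughly like this sketch. The project key `dbt_youtube` is an assumed spelling of the spoken project name; use whatever name your own project file declares.

```yaml
# dbt_project.yml (fragment)
models:
  dbt_youtube:               # project name; assumed spelling
    bronze:                  # renamed from the scaffolded `example` key
      +materialized: table   # was `view`; applies to every model in models/bronze
```

The key under the project name has to match the folder name under `models/`, which is why renaming the folder requires renaming it here too.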
112:14 Okay. And you know the best thing,
112:16 the best thing about it? This is
112:20 basically called a config. This is like
112:22 one of the configs. Let me just write it
112:25 for you. This is basically the
112:28 configuration.
112:30 One of the configurations. Make sense?
112:33 Wow man, your handwriting is so good.
112:42 Okay. So this is one kind of configuration that we are adding here.
112:44 And do you know what? Do you know what?
112:46 We can define configuration in multiple
112:49 areas.
112:51 areas. Okay. What do you mean in multiple
112:53 Okay. What do you mean in multiple areas? Configuration we will just talk
112:56 areas? Configuration we will just talk about in very few seconds. Let me first
112:58 about in very few seconds. Let me first of all run this model so that you will
113:00 of all run this model so that you will know the importance of configuration.
113:02 know the importance of configuration. Because right now you will say what is
113:04 Because right now you will say what is the importance? No. But just remember
113:06 the importance? No. But just remember that this this is one of the
113:08 that this this is one of the configuration. Okay. And see it is
113:12 configuration. Okay. And see it is saying other options
113:14 saying other options makes sense. So let's save this. And now
113:16 And now, how can we run this particular
113:18 model? There are multiple ways. You can
113:19 even use your extension button. But
113:22 the best way is using the terminal. Just
113:25 open the terminal. Okay. And just make
113:28 sure that you are inside this particular
113:31 dbt project folder, because if you run
113:34 your model right from here, it will not
113:35 run. Why? Because you are inside your
113:38 parent folder. You are not inside this
113:40 folder. So I will simply say cd, that
113:43 means change directory, and an
113:46 dbt YouTube.
113:48 Now I am inside this folder. Now I can
113:50 just run that. I will simply say dbt run
113:53 and hit enter. That's it. Now what
113:56 will this command do? You
113:59 will see all the logs here. See, all the
114:01 logs are here. And now it is building
114:04 our models, and you are seeing everything
114:07 in front of your eyes. See: 1, 2, 3, 4,
114:10 because there are six tables in total. As
114:12 you can see, all the things are here.
114:15 Everything. Wow. Everything. Everything.
114:18 See: completed successfully. Wow.
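The "cd first, then dbt run" rule can be sketched as a small Python wrapper. This is only an illustration under assumptions: `run_dbt` is a made-up helper name, and the check for dbt_project.yml simply mirrors the point made above, that dbt needs to be started from the folder that holds the project file. `dbt run` itself is the real command used in the video.

```python
import subprocess
from pathlib import Path


def run_dbt(project_dir: str, *args: str) -> subprocess.CompletedProcess:
    """Run a dbt command from inside the project folder.

    Raises FileNotFoundError when project_dir has no dbt_project.yml,
    which is the situation you hit when you forget to cd into the project.
    """
    project = Path(project_dir)
    if not (project / "dbt_project.yml").is_file():
        raise FileNotFoundError(f"no dbt_project.yml in {project.resolve()}")
    # Equivalent to: cd <project_dir> && dbt <args>
    return subprocess.run(["dbt", *args], cwd=project, check=True)


# Example call, not executed here: run_dbt("my_dbt_project", "run")
```

Failing early with a clear message is a design choice for the sketch; dbt reports its own error when the project file is missing.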
114:23 The same thing is visible in dbt Cloud
114:26 in the form of a catchy UI. That is the
114:30 difference. That's it. Obviously,
114:31 they're just providing a managed
114:34 service, so they need to
114:36 provide a nice UI for all of this. But
114:39 the terminal fulfills that need as well.
114:42 Okay. So, let me just go to
114:44 Databricks to see what actually
114:46 happened. So if I just refresh
114:49 and go to the default schema,
114:51 see, all the tables are created here.
114:54 Bronze customer, dim date, product, return,
114:56 sales, bronze store: all the tables. Why the
114:59 default schema? Because in your profiles,
115:03 if I go to
115:05 profiles.yml, you have given the
115:09 default schema. So that is why all the
115:12 objects are created within this
115:14 particular schema.
115:16 Make sense? Make sense? Okay, good.
115:20 Now, so many questions. I know, I know,
115:24 I know. So many questions you have.
115:26 Hey, hey, hey, wait, hold on, hold
115:29 on, hold on. Okay, so this is very
115:32 interesting. First of all, you have
115:33 created your first tables.
115:34 Congratulations.
115:38 Okay, now,
115:39 let me just continue my previous thing.
115:41 But before that, let me just show you
115:43 one thing, because I just saw that folder
115:44 and I'm really excited. You will see one
115:47 more folder that has just been created for
115:49 you. It is called target. It was not
115:52 there before. It has just been created. It is
115:54 created whenever you run the models or
115:57 whenever you compile the models. Really?
116:00 Yes. Why? Let me show you why. So you
116:04 have written the query here, right? This
116:07 is a very simple query. But let's say you
116:08 are writing a very complex query. Don't
116:10 worry, you'll be writing complex
116:11 queries as well. Let's say you are
116:13 writing a very complex query and you
116:15 just want to see, hey, which query is
116:17 actually used to create our models? We
116:20 are very excited to know that. So this
116:22 is the folder for that. If you click
116:24 here on target, then on run, not compiled,
116:27 run,
116:28 you will see all these models. If I
116:31 click here, let's say bronze customers,
116:33 just open anything. If I click here,
116:35 this is the real query that has been
116:39 running behind that model: create or
116:42 replace table, catalog name, schema name,
116:45 table name, using delta, which is the
116:48 Databricks syntax, as select * from this
116:51 particular
116:53 thing. This is the real query that is
116:56 running behind your dbt models to
116:59 populate the tables in Databricks.
117:02 Okay. So this is the place where you can
117:04 come and see your query. If you just
117:07 want to troubleshoot, if you just want
117:08 to confirm, hey, what's going on? What
117:10 is the query? This is the place. Okay,
117:14 this is the place. And you didn't write
117:16 anything, no create or replace, nothing.
117:17 Everything is added by dbt
117:20 models. Make sense? All this boilerplate
117:23 code is gone. Very good. Very good. So
117:27 that's the thing: target.
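To make the "boilerplate is added for you" idea concrete, here is a toy Python function that wraps a model's SELECT in the kind of statement read out above. It is a simplified illustration, not dbt's actual implementation; the exact SQL the Databricks adapter writes into target/run can differ by version. The catalog, schema, and source names are placeholders.

```python
def wrap_model_sql(catalog: str, schema: str, name: str,
                   select_sql: str, materialized: str = "table") -> str:
    """Return an illustrative DDL statement for a model's SELECT."""
    relation = f"{catalog}.{schema}.{name}"
    if materialized == "table":
        header = f"create or replace table {relation} using delta as"
    elif materialized == "view":
        header = f"create or replace view {relation} as"
    else:
        raise ValueError(f"unsupported materialization: {materialized}")
    return f"{header}\n{select_sql.strip()}"


# Placeholder names: my_catalog, default, bronze_customer, source.dim_customer
ddl = wrap_model_sql(
    "my_catalog", "default", "bronze_customer",
    "select * from my_catalog.source.dim_customer",
)
print(ddl)
```

The model file itself contains only the SELECT; everything before it comes from the chosen materialization.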
117:30 Now you will say, Ansh Lamba, let's say we have created so
117:33 many models. Okay, you have deleted so
117:36 many models. Let's
117:38 suppose you have deleted so many models.
117:40 Okay.
117:41 Now,
117:43 all these things in run will still be there. They
117:46 will not be automatically removed. Really?
117:48 Let's say I deleted this bronze date.
117:50 Right now I don't want to do that, but
117:52 let's say I deleted it. Even after that,
117:54 this bronze date will be here. It
117:56 will not be gone. Hm, then there will be
117:59 garbage, right? And to clean the garbage,
118:03 what do we do? We simply clean it, right?
118:06 Okay, so that is why we have an amazing
118:08 command here in the dbt terminal. So I
118:11 will first of all cd to
118:14 this folder and I will run dbt clean.
118:16 What will this command do? It will clean
118:18 all the things within the target folder,
118:20 and you will see the target folder is gone.
118:23 So you can run the dbt clean command when
118:25 you just want to clean everything,
118:27 because every time you
118:30 run your models, a fresh target
118:33 folder will be there. So don't worry,
118:35 just clean everything. Okay, make sense?
118:37 See, now there's no target folder. Very
118:39 good.
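What dbt clean does can be approximated in a few lines of Python. This is a sketch under assumptions: dbt deletes the folders listed under `clean-targets` in dbt_project.yml, and the two names used here (`target` and `dbt_packages`) are assumed values, so check your own project file. The demo runs against a throwaway folder, not a real project.

```python
import shutil
import tempfile
from pathlib import Path


def clean(project_dir: Path, clean_targets=("target", "dbt_packages")) -> list:
    """Delete each clean-target folder under project_dir; return what was removed."""
    removed = []
    for name in clean_targets:
        path = project_dir / name
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(name)
    return removed


# Demo on a temporary folder with one stale file left by a deleted model.
project = Path(tempfile.mkdtemp())
stale = project / "target" / "run" / "bronze_date.sql"
stale.parent.mkdir(parents=True)
stale.write_text("-- SQL left over from a deleted model\n")

removed = clean(project)
print(removed, (project / "target").exists())
```

Because the next dbt run or dbt compile recreates the target folder, deleting it is safe.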
118:42 Now let's talk about what I was just highlighting before:
118:44 configurations. Configurations in dbt
118:46 are very important. And configurations
118:51 can be configured. Wow. Configurations
118:54 can be configured. Configurations can be
118:56 configured in three different areas in
119:00 dbt. Oh, just listen to this carefully.
119:03 This is very important. So for now we
119:07 have written our configuration in the
119:10 project file, dbt_project.yml. This is our
119:14 configuration. Configuration is
119:16 kind of,
119:19 you can say, a way to tell dbt
119:22 how we want to run our models, how we
119:24 want to materialize, where we want to
119:27 materialize. All these questions are
119:29 answered through configurations, and
119:32 there are three different areas to store
119:34 the configuration. The first one is this, as
119:36 we all can see: materialized: table,
119:39 right? Very good. There's one more way,
119:43 which is called properties. Let me show
119:45 you.
119:47 So the first is dbt_project.yml.
119:50 Okay. The second area is properties.
119:54 What is properties? I will let you
119:56 know. Don't worry. The third one is the
119:59 config block. Okay. And what is the priority?
120:04 Let's say I define one config here and
120:09 here and here. So priority will be given
120:13 to the block the most, then properties, then
120:16 dbt_project.yml, even if I define
120:20 the same configuration with different
120:22 values in these three areas. Let's pick
120:24 this example: materialization. Let's say
120:26 I want to materialize all the
120:31 models within the bronze folder as
120:32 tables. Okay. So what will it do? It will
120:35 materialize all the things as tables.
120:37 Make sense? But let's say I want to
120:40 materialize my dim date in the bronze
120:43 folder as a view.
120:46 Oh, then what can I do? I can simply say
120:50 I want to materialize
120:52 my dim date as a view in the properties or
120:55 in the block. So that will take precedence
120:58 over dbt_project.yml. Don't worry, I'll
121:00 show you as well. For now, I'm just
121:02 giving you the overview. Okay. Then
121:05 let's say you have written
121:06 something in the properties.
121:08 Now if you write something in the
121:10 block, this will take the most precedence
121:13 and it will override that as well. Make
121:15 sense? So this is just the order of
121:17 precedence.
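The order of precedence described above can be modeled as a tiny Python merge, where later (more specific) layers overwrite earlier ones. This is a simplified sketch: `resolve_config` is a made-up helper, and it only models configs that are overridden outright, such as `materialized`. Some dbt configs (tags, for example) are combined across levels rather than replaced.

```python
def resolve_config(project: dict, properties: dict, block: dict) -> dict:
    """Merge configs so that block > properties > project."""
    resolved = {}
    for layer in (project, properties, block):  # least to most specific
        resolved.update(layer)
    return resolved


project_cfg = {"materialized": "table"}     # dbt_project.yml, bronze folder
properties_cfg = {"materialized": "view"}   # properties file entry for one model
block_cfg = {}                              # no config block in the model file

print(resolve_config(project_cfg, properties_cfg, block_cfg))
```

With an empty properties entry and an empty block, the folder-level `table` setting from dbt_project.yml is what remains.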
121:19 Now let me just show you, if I can find a very good resource for
121:22 that. Let's say, um,
121:26 let me search
121:29 config, configuration, properties.
121:32 Perfect. dbt config parameters. I want the
121:36 official documentation. Perfect. Define
121:38 configs. So here it is saying config
121:41 inheritance, blah blah blah,
121:44 example. We do not need the example for now.
121:46 I just want to show you that one block.
121:50 General properties, general configs.
121:54 Okay. General properties. Okay. This is
121:57 fine.
121:58 General configs.
122:01 Okay.
122:03 Okay.
122:13 Access. Uh, yeah, let's explore this one. So this is saying: define
122:15 configs. So here we can define configs.
122:19 So it is saying the most specific config
122:20 always takes precedence. As I just
122:22 mentioned, the most specific, that means
122:25 the config closer to your model,
122:28 takes more precedence. This generally
122:30 follows the order above: an
122:33 in-file config, that means the block, then
122:36 properties, and then the project file. The
122:39 same thing that I've just mentioned. Same
122:41 thing, okay? Same thing.
122:46 Now, if you just scroll down, you will see
122:47 some examples and all, and let me just
122:49 show you, actually. So basically, let's
122:51 create a properties file, because you have
122:53 not been introduced to properties. Let me
122:55 introduce you to properties. So
122:58 here: what is a property file?
123:00 Basically, a property file
123:05 can actually hold a lot of things.
123:07 Property files can hold your data tests,
123:09 which we'll be covering very soon.
123:11 Property files can actually do a lot of
123:13 stuff. For now, you can say property
123:15 files are also used to define
123:17 configs. For now, that's it. That's it.
123:19 Make sense? And you can define your
123:22 property files in your models folder. Okay? And
123:25 they are also in the YAML format. Makes
123:28 sense? Okay. Now let's say I want to
123:32 create one property file within this
123:34 bronze folder. Okay. Let's create one.
123:37 And what will be the name of the property
123:38 file? It can be literally anything, but
123:40 I personally prefer to create the file
123:43 with the name
123:47 properties.yml,
123:49 because we can just create one file
123:50 within one folder, and properties.yml is
123:53 very, you can say, um, easy to understand: it
123:56 is like properties.yml
123:58 for properties.
124:00 So how can we define properties? Now just go to the
124:02 documentation and grab any
124:04 sample code. Let's say version: 2
124:06 and blah blah blah. I don't need sources
124:09 because we just want to configure
124:11 models.
124:13 So models, name, config, materialized,
124:16 and that's it. This is good. So if I
124:19 paste it here, and obviously I need to
124:21 say version:
124:25 2. Perfect. So this is the models key, and
124:28 we know that within our models, let me
124:32 just see: "must match the file name". That
124:35 is fine. Okay, makes sense, makes sense,
124:37 makes sense. Okay, perfect. So this is
124:39 saying models. So whatever models we
124:42 have within this particular directory, the name
124:45 should match exactly, and it is a
124:47 good comment written here. I will not
124:48 remove it; it will benefit you as
124:50 well. So I can simply change this to my
124:53 model. Let's say I want to set this
124:57 property for bronze date. Make sense?
125:00 You do not need to write .sql;
125:02 .sql is just an extension. Just write
125:03 bronze date. That's it. Then config. And
125:06 we do not need to configure anything else.
125:09 We want to configure something called
125:11 materialized.
125:14 Okay. Now I want to say I want to
125:16 materialize this as a view, let's say.
125:18 Okay. And let's remove the other things, or
125:21 let's keep it. Let's say I want to
125:23 materialize bronze product as a
125:26 view as well. Make sense? So let me just save
125:29 this. Do you know what? If I run
125:32 the models, not just this
125:34 model, the whole thing, all the
125:35 models, now this will take precedence.
125:38 Really? Yes. This is the magic of it. So
125:41 this is a property file. This is just the
125:44 second area.
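The properties file described above might look roughly like the YAML in this sketch, which a short Python script writes to a throwaway folder so that nothing in a real project is touched. The model names `bronze_date` and `bronze_product` are assumed spellings of the models named in the video; each `name` has to match a model file name without the .sql extension.

```python
import tempfile
from pathlib import Path

PROPERTIES_YML = """\
version: 2

models:
  - name: bronze_date       # must match the model file name, without .sql
    config:
      materialized: view
  - name: bronze_product
    config:
      materialized: view
"""

# In a real project the file would sit next to the models it describes,
# for example models/bronze/properties.yml.
bronze_dir = Path(tempfile.mkdtemp()) / "models" / "bronze"
bronze_dir.mkdir(parents=True)
properties_path = bronze_dir / "properties.yml"
properties_path.write_text(PROPERTIES_YML)
print(properties_path.read_text())
```

Models in the bronze folder that are not listed here keep the folder-level setting from dbt_project.yml.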
125:46 Let's talk about the third area. Let's say this is the
125:49 properties file, built just for these two,
125:53 right? Let's say in your bronze,
125:56 um, let's say bronze sales. In
125:59 bronze sales you can add config here
126:03 as well. Really? Yes. This is called
126:07 block-level configuration. I can simply write
126:08 something like this: config(materialized
126:11 = 'view').
126:14 Perfect. Let me just save it. So now what
126:17 will happen? It will create this as a
126:19 view. So these are the three different areas
126:22 where you can define your
126:23 configurations: project level, properties, and
126:25 block-level configurations.
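A model file with a block-level config could look like the SQL held in the string below. The `{{ config(materialized='view') }}` line is standard dbt syntax; the `source('source', 'fact_sales')` reference is a hypothetical placeholder, since the actual query is not read out in the video. The small regex helper is only a toy to show where the setting lives; it is not how dbt parses Jinja.

```python
import re

MODEL_SQL = """\
{{ config(materialized='view') }}

select *
from {{ source('source', 'fact_sales') }}
"""


def block_materialization(model_sql: str):
    """Return the materialization named in an in-file config block, or None."""
    match = re.search(
        r"config\(\s*materialized\s*=\s*['\"](\w+)['\"]", model_sql
    )
    return match.group(1) if match else None


print(block_materialization(MODEL_SQL))
```

Because the block sits inside the model file, it is the most specific of the three areas and wins over both the properties file and dbt_project.yml.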
126:27 So if I just go and run dbt run, you
126:31 will see that it will create... Encountered an
126:34 error. Wow. What is the error? No dbt
126:37 project, obviously. See, this is the error
126:39 that you will see if you do not run
126:41 cd. See, that's what you see,
126:44 Ansh Lamba. You also forgot to run cd.
126:46 Okay, it's fine. It happens. Now I can
126:48 say dbt run. It is fine. So now
126:53 it is running. Okay. Now it is saying
126:56 something else. Okay. What is this?
126:58 Databricks SQL adapter,
127:01 exception, concurrency one thread, found
127:03 six models, database error. So I think my
127:06 token had expired, because I set the
127:08 token for just one day. So I
127:10 created a new token and I updated it in
127:13 profiles.yml, in the token field.
127:15 So I think now it should work, because
127:17 that is a very silly and very small
127:19 error. Let's hope it works. So now
127:22 it is saying six models, six sources,
127:24 686 macros. Wow. So let's wait. And as I
127:27 just mentioned, it takes some time.
127:29 Do not cancel anything. It can mess
127:31 up the code base. Do not do
127:33 anything. Make sense? Okay. So even if
127:36 you see the same error, let's say your
127:38 token has also expired, do not worry.
127:40 Just go to Databricks. Same steps.
127:42 Create a new token. And this time you
127:44 do not need to run dbt init from
127:45 scratch. No, no. Just update that token
127:48 within your profiles.yml. That's it.
127:51 Make sense? Make sense? Now it is
127:55 running. Let's wait for a few more
127:58 seconds.
128:00 And that's it. Okay. So yeah, this is
128:03 the file. You can just update it here.
128:06 And I have updated it here. Okay. So now
128:09 it is saying 1 of 6 START. So now it
128:11 has started. Very good. Now, let's wait.
128:14 "You may opt into new behaviors sooner
128:17 by setting..." Oh, yeah. Yeah, that's fine.
128:21 Okay. Okay. Okay.
128:30 Okay. Completed successfully. Let me just
128:33 show you. If I show you in the catalog
128:36 this time, dbt tutorial dev, if I go to
128:39 default, you will see views. See, now some
128:43 tables are gone. Why? Because bronze
128:46 date, bronze sales and bronze product are views now.
128:50 Now we have
128:52 tables and views in combination.
128:55 Sometimes you will see both tables
128:57 and views, because sometimes it does
128:59 not delete your existing tables. It
129:01 just creates new views. It totally
129:04 depends on how you have configured
129:05 your token and which
129:06 permissions you have provided. But it's fine.
129:09 It should create the views as well. That is
129:11 more important. And how it took
129:13 precedence: you have just seen that. You
129:15 have added config in three different
129:17 areas, and how it took
129:19 precedence, you just witnessed that.
129:22 Make sense? Make sense? Very good. And
129:25 let me just tell you,
129:27 these are high-level things:
129:30 configuring at the project level, then the
129:33 properties level, then the block level. So
129:36 just feel happy. Okay? Just feel happy.
129:39 Okay. And I know you are happy, you are
129:41 with me. So
129:44 good, good, good. So this is our bronze
129:46 layer so far. Okay. This is our bronze
129:49 layer so far. Makes sense.
129:51 So we have created sources and everything. And we
129:52 have actually done a lot of things. Let
129:54 me first of all commit this. So if I
129:56 just go to the
129:59 terminal, and this time you do not
130:01 need to run cd, because now you are
130:04 committing and you want to commit everything,
130:06 not just this dbt project. So you
130:10 will say git status first of all, and you
130:12 will see a lot of things. A lot of things?
130:14 See, obviously not a lot of things, because
130:17 a lot of things are inside this folder.
130:18 So I will simply say git add . (dot). I told
130:22 you it is just for selecting these
130:24 things. Then committing it: git commit
130:28 -m, and I will write the message,
130:30 let's say "bronze layer", because we have
130:32 prepared the bronze layer. Perfect. So all
130:36 these things are committed. Now you are not
130:39 seeing any green color or anything.
130:41 seeing any green color or anything so now now from now onwards we will not be
130:44 now now from now onwards we will not be directly working with the main branch
130:46 directly working with the main branch that is not the ideal scenario You
130:48 that is not the ideal scenario You should create the feature branches and
130:50 should create the feature branches and then you should just merge those
130:52 then you should just merge those branches within the main. But be very
130:54 branches within the main. But be very careful. Okay. So now let's create a
130:56 careful. Okay. So now let's create a branch. How you can just create a
130:58 branch. How you can just create a branch? It is very simple. Just say get
131:00 branch? It is very simple. Just say get switch
131:02 switch minus C. Okay. And then you can simply
131:05 minus C. Okay. And then you can simply say feature let's say un in your case it
131:08 say feature let's say un in your case it can be something else. What is this
131:09 can be something else. What is this command for? This command will create a
131:12 command for? This command will create a new feature branch from the head of the
131:14 new feature branch from the head of the current branch which is the main branch.
131:16 current branch which is the main branch. Okay. Hit enter and now you are switched
131:19 Okay. Hit enter and now you are switched to the feature branch as you can see.
131:20 to the feature branch as you can see. And now whatever you will do will only
131:23 And now whatever you will do will only be available in the feature branch until
131:26 be available in the feature branch until and unless you merge those changes. Make
131:28 and unless you merge those changes. Make sense? Good. So now if I ask you one
131:32 simple question, and I know you had this
131:35 question somewhere in your mind but you
131:37 were hiding it. I know this. See, you
131:39 are saying: hey Ansh Lamba,
131:42 it's fine, we are able to develop
131:45 our objects and everything in the
131:47 default schema. We know that we can
131:50 change the schema as well; obviously we
131:52 know that we can rename it as,
131:54 let's say, bronze. But
131:58 this is fine for just one layer. What if
132:02 we want to create different schemas for
132:04 different models? Let's say a bronze schema
132:07 for bronze models, a silver schema
132:09 for silver models, a gold schema for gold
132:12 models. This is an advanced topic, but
132:14 let me just make it very easy for you,
132:16 very easy. Okay, so let me first of
132:19 all take you to the documentation page.
132:20 Okay, so
132:23 when I said it is an advanced
132:25 topic but it is very easy, I meant to
132:26 say that now it is easy for you. Why?
132:29 Because I have already introduced you to the
132:31 configurations, properties, those blocks,
132:35 and this can be achieved through
132:37 that thing. Let me just take you to the
132:38 documentation. Let me just write, let's
132:40 say, custom schema dbt. Perfect. So this
132:45 is the page, and first of all let's turn
132:46 on the lights. Oh, now it's fine. So by
132:49 default, all the dbt models
132:52 are built in the schema specified in
132:53 your dbt environment, or basically the
132:55 profile's target. Make sense? But we want
132:58 to use something else. Okay. So in that
133:01 particular scenario you again have
133:02 options. You can configure the schema in
133:05 your project, in your config block, or in
133:10 your, you can say, normal properties file
133:12 as well. Okay, let's do that. See, now
133:16 all those advanced DBT concepts, and I'm
133:19 not kidding. You can talk to any
133:21 DBT developer, anyone who is giving
133:24 interviews for DBT roles. Just ask
133:26 this question of them, and ask:
133:28 are you getting these types of
133:30 questions? These are literally the
133:31 advanced-level topics that you are
133:33 mastering just like this. And in
133:37 return I'm asking you to support this
133:39 channel. In return I'm asking you to
133:41 drop your lovely comment on this
133:43 channel. Just write something so that
133:46 YouTube can also, you can say, help me
133:50 and what you are doing.
133:52 So just support this channel.
133:55 Just write your comments, write your
133:58 feedback, positive feedback.
134:01 Just do it, bro. Do it. You have to
134:04 support this channel. So now I was
134:08 saying that we can do that thing at
134:11 the project level as well. So let's say
134:13 we have bronze, right? I can simply say
134:17 schema, and I will simply say bronze.
134:20 That's it. And similarly, I can create the
134:22 silver layer as well. Okay. Silver layer:
134:25 materialized table. I will be creating the
134:27 silver layer as
134:29 well, and the schema is silver. And then gold:
134:33 materialized, gold. Perfect. So this is the
134:36 schema that we have defined at the dbt
134:38 project level itself. We could have done
134:41 that here as well. Let's say if I go to
134:43 models, if I go to bronze, if I go to
134:46 properties, I can do it here as well.
134:48 Let's say bronze date and bronze
134:53 product. So this is obviously
134:54 on the model level. Okay. And this is on
134:59 the model level. And now I can simply
135:00 say
135:03 schema bronze. I can do it on the model
135:07 level as well, basically in the
135:08 properties. But obviously it is very
135:11 time-consuming, and you
135:14 would not like to spend your time
135:15 defining the, you can say,
135:18 materialization for each model.
135:20 Let's say you just want to save
135:23 only one object in a particular schema,
135:25 in, you can say, your own
135:29 schema, just for one model;
135:31 then you can do that. But ideally, in
135:35 90% of the cases, the schema should
135:38 remain the same for all the models in each
135:41 layer. So that is why it makes more
135:43 sense to define the schema in the dbt
135:46 project. But you have the liberty. You
135:49 have the liberty. Make sense? So I can
135:51 just leave it here as well. It's fine,
135:53 because it will take precedence and the
135:54 values are the same, so it will not make
135:56 any difference. Make sense? So now this
135:59 is not the only thing that you need to
136:00 do. Really? Yes. Because let me
136:03 tell you how dbt creates a schema for
136:06 you. There's a macro called, uh, this one.
136:11 It is called generate_schema_name. So
136:13 this is the macro that it runs behind
136:15 the scenes if we want to define a
136:17 custom schema. You do not need to
136:19 understand this macro for now. Just
136:20 forget about that. Okay. So this is the
136:23 code that you can understand. It is
136:25 written that whatever value you
136:28 write in the custom schema name, it will
136:30 add the default schema as a
136:33 prefix as well. So that means, let's say you have
136:35 written bronze. It will make it
136:37 default_bronze, default_silver, because
136:41 your default schema name is
136:43 default. That is why. So you need to
136:46 copy this code, and you need to remove
136:49 this part and save it in the macros
136:52 folder. Let me just show you. If you
136:53 go here in macros, it is empty,
136:56 just .gitkeep. Just create a file
136:58 within this folder and just say
137:03 generate_schema.sql.
137:08 That's it. And here you can just paste the macro. And here you can simply
137:10 remove this part. That's it.
137:14 Perfect. And you can just save this.
137:16 Perfect. That's it. Now you will get the
137:18 exact schema that you want.
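To make the effect of removing the prefix concrete, here is a small Python sketch of the two behaviours. In dbt this logic lives in a Jinja macro named `generate_schema_name`; the Python function names below are hypothetical, and the target schema `default` is the one mentioned in the video.

```python
def default_schema_name(custom_schema_name, target_schema):
    """Model of the stock behaviour: the target schema is used as a prefix."""
    if custom_schema_name is None:
        return target_schema
    return f"{target_schema}_{custom_schema_name.strip()}"


def overridden_schema_name(custom_schema_name, target_schema):
    """Model of the edited macro: the custom schema name is used on its own."""
    if custom_schema_name is None:
        return target_schema
    return custom_schema_name.strip()


# Compare both behaviours for the three layers, with "default" as the target schema.
for layer in ("bronze", "silver", "gold"):
    before = default_schema_name(layer, "default")
    after = overridden_schema_name(layer, "default")
    print(f"{before} -> {after}")
```

In both versions a model without a custom schema still lands in the target schema; only models that set a custom schema are affected by the change.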
137:21 Let me go to the terminal. Okay. And now let's
137:24 say cd, because we need to first of all
137:27 go inside the project. Now let's say dbt
137:30 run. And now let's see if we are able to
137:32 create our objects in the dedicated
137:34 schema. Let's see.
137:41 And all these things are being done in the feature branch. So
137:43 that is good. That is the ideal
137:45 practice that we should follow. Make sense?
137:48 Makes sense.
137:50 And
137:52 meanwhile I can look at my list
137:53 that I want to cover today. This is
137:56 done. I don't want to leave
137:58 anything out, because these are very, very
138:00 important. Okay.
138:02 Okay. Okay. Okay.
138:05 Oh yeah. We can talk about
138:06 this thing. It is called node selection.
138:08 Let's talk about it. It's very nice.
138:11 It's very nice.
138:20 So, as you know, it takes a few seconds, but it is fine, because maybe
138:23 it is just turning on the warehouse, and
138:27 then it will be running all
138:29 these things. So it's fine. It's fine.
138:31 We are okay with
138:34 that.
138:40 Concurrency one threads, target equal. Okay. Now it has started, and fingers
138:43 crossed we should see everything in the
138:45 bronze schema only. No prefix, no
138:49 default schema, nothing,
138:52 nothing.
138:54 Okay. First table. Okay. Okay. Okay.
138:55 Okay. This is the best part of the dbt
138:57 run, when we see okay, okay, okay, okay, and
139:00 then we see successfully accomplished,
139:03 completed successfully. Wow. So now I'm
139:05 very excited to see the results. So
139:08 let's go to Databricks and let's see
139:11 the results. Let me just refresh it.
139:14 DBT. Oh, perfect. We have the bronze schema,
139:16 and you know that we didn't create the
139:18 schema. Okay. And now we have everything
139:20 ready. Wow. This is it. This is the
139:22 thing that we were looking for, and we
139:24 have successfully done that. And see how
139:28 easy that was, and see how difficult that was
139:32 here. So now just drop a lovely
139:34 comment right now. Right, right, right
139:37 now. So now let's talk about a very
139:40 simple thing which is called node
139:42 selection in DBT. What is node
139:44 selection? Every time, if you observe,
139:47 we write this thing:
139:49 we write dbt run, and what it does is it
139:53 runs all the models. But sometimes
139:56 you do not want to run all the models.
139:59 Let's say you want to run only one or
140:01 two models, just to check something or,
140:04 let's say, just to do anything, right? So
140:07 in that particular case you can say dbt
140:09 run, then space, --select, and then the
140:12 model name. Let's say I want to run only
140:14 bronze date. That's it. Okay. And then I
140:18 can hit enter. And now what it will do:
140:21 it will only run the specific model,
140:23 which is bronze date. And you can see it
140:26 is done. Completed successfully. Okay,
140:29 this was very nice. Let's say you want
140:31 to run
140:33 multiple models. You can simply write
140:35 something like this: dbt run --select.
140:40 Okay. And if it is not clear, let me
140:42 just write clear. And yeah, so dbt run
140:48 --select, and then you can write in double
140:51 quotes, because you are writing multiple
140:52 models. So bronze date,
140:56 and you can write, let's say, bronze
141:00 store.
141:12 Okay. Makes sense now. See, it has just run two models, one and two. That's it.
141:14 So that's how you can run only
141:18 selected models. That's it. Let's
141:21 say you have bronze, silver, gold, all
141:23 the folders ready, and you want to
141:25 run only the bronze folder. Bronze
141:28 folder. That's it. You can also do that:
141:30 dbt run --select, and then obviously the
141:34 folder path; it will be models, then
141:38 bronze, and then all the models. Now it
141:40 will run all the models only within the
141:42 bronze folder. Yes, we do not have
141:44 silver and gold; that is why it is running
141:46 all six. But imagine you have silver
141:49 and gold as well, and you can run
141:51 your models like this. This is called
141:53 node selection in DBT. Okay, this is a
141:55 very fancy word, fancy term, but very
141:59 easy to understand. But you need to
142:01 remember this, right? Very good.
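The three selection cases above (one model, several models in quotes, a whole folder) can be modelled with a few lines of Python. This is only a rough sketch of the idea, not how dbt implements `--select`, and real dbt selectors support far more than this. The function `select_models`, the file paths, and the model `silver_sales` are hypothetical; `bronze_date` and `bronze_store` are assumed spellings of the models named in the video.

```python
def select_models(models, selector):
    """Return the model names matched by a space-separated selector string.

    Each token matches either a model name exactly or a folder path prefix.
    Several tokens are combined as a union, in the order they are matched.
    """
    selected = []
    for token in selector.split():
        folder_prefix = token.rstrip("/") + "/"
        for name, path in models.items():
            matches = token == name or path.startswith(folder_prefix)
            if matches and name not in selected:
                selected.append(name)
    return selected


models = {
    "bronze_date": "models/bronze/bronze_date.sql",
    "bronze_store": "models/bronze/bronze_store.sql",
    "silver_sales": "models/silver/silver_sales.sql",  # hypothetical model
}

print(select_models(models, "bronze_date"))               # a single model
print(select_models(models, "bronze_date bronze_store"))  # several models
print(select_models(models, "models/bronze"))             # everything in one folder
```

The point of the quotes in the multi-model command is simply that the whole space-separated list is passed as one selector string.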
142:04 Now we are going to cover something called data
142:07 tests, and along with this we will be
142:10 introduced to something called DBT
142:13 packages. Packages? We do have packages
142:16 in DBT. Yes, we have a lot of things in
142:18 DBT. So let me just introduce you to
142:21 data tests and packages. Let me
142:26 just show you. So now let's talk about
142:29 DBT tests.
142:31 DBT test, I think, is actually a
142:35 game-changing thing. Yes, DBT is
142:39 providing us the ability to write our
142:42 code in the templating format and so
142:43 many other things. But meanwhile, you
142:46 didn't actually think: hey, let's talk
142:48 about some real stuff. If we are
142:50 actually building data engineering
142:52 solutions,
142:54 we usually validate before building
142:57 anything. And that's one of the most
143:00 important things for data engineers.
143:02 Why? Because, let's say, a very common, very
143:04 simple example: duplicates.
143:07 Deduplication is one of those things
143:09 that you cannot
143:13 negotiate on, right? So all those things
143:15 need to be there before building your
143:18 objects. What do you think, DBT has not
143:21 thought about this thing? Obviously it has. So
143:24 DBT test is, you can say, a way where
143:27 we can add rules, you can say
143:30 expectations, you can say some kind of
143:33 criteria, validation steps, checks,
143:36 whatever you want to call them. You can add
143:38 those using the DBT test module. Okay. And
143:41 before talking about this: while
143:42 recording this video, two more offer
143:45 letters, and congratulations to both of you.
143:48 I thought I'll just add it here
143:50 while recording this video. I just
143:52 saw these comments, so I thought,
143:54 let's add these comments as well. So
143:56 many, many congratulations, and happy
143:58 learning.
144:00 Okay. So now you know about DBT test.
144:03 Okay. Very, very well done that you
144:06 now know a little bit about DBT test, very
144:09 little. In DBT we have multiple types of
144:12 DBT tests. Earlier that was not the case, but
144:15 now we have more and more types, and
144:16 obviously this is Ansh Lamba, right? So
144:19 obviously you will get the updated
144:20 knowledge. So first of all let's talk
144:23 about the very basic one and the most
144:27 widely used one, which is called the
144:30 generic test. So, as the name suggests,
144:33 generic test, or generic tests,
144:36 whatever you want to say. So this
144:39 will apply some kind of validation check.
144:42 The name generic means these tests
144:45 are generally used in all
144:47 solutions. And do you know what? We have,
144:50 I think, four to five different
144:52 kinds of generic tests that we can add in
144:54 our solutions. Make sense? These are very,
144:58 very handy, because obviously these are
145:00 generic tests; they know that you
145:02 will be adding these things. In this
145:05 particular generic test we have not null,
145:07 we have, let's say, unique, we have
145:10 accepted values, and we have
145:12 relationships. These are basically, I
145:14 think, the four types of generic tests that we
145:15 have. Okay. Not null and unique
145:18 are those things that you have to add in
145:20 almost all the
145:21 solutions, because in every table you
145:23 will expect a primary key, and it is the
145:26 behavior of a primary key that it should
145:27 be unique and not null, right? So
145:30 you can add those things. And this is all
145:32 about generic tests. And now let's
145:34 actually practice this, and let's
145:36 actually see how we can use
145:39 generic tests in our solution. Make sense?
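Before seeing how these are written in dbt, it may help to see what each of the four checks asserts. The Python sketch below models them on plain lists of dictionaries; it is an illustration only, since in dbt these tests are declared in a properties file and run as SQL against the warehouse. All function names, table names, and sample rows here are hypothetical. Each function returns the failing records, so an empty result means the check passes.

```python
from collections import Counter


def not_null(rows, column):
    """Rows where the column has no value."""
    return [row for row in rows if row[column] is None]


def unique(rows, column):
    """Non-null values that appear more than once in the column."""
    counts = Counter(row[column] for row in rows if row[column] is not None)
    return sorted(value for value, n in counts.items() if n > 1)


def accepted_values(rows, column, values):
    """Rows whose non-null value is outside the allowed set."""
    return [row for row in rows if row[column] is not None and row[column] not in values]


def relationships(child_rows, column, parent_rows, parent_column):
    """Child rows whose non-null key has no matching row in the parent table."""
    parent_keys = {row[parent_column] for row in parent_rows}
    return [row for row in child_rows if row[column] is not None and row[column] not in parent_keys]


# Hypothetical sample data with one deliberate failure per check.
stores = [
    {"store_id": 1, "region": "north"},
    {"store_id": 2, "region": "south"},
    {"store_id": 2, "region": "east"},
]
sales = [
    {"sale_id": 10, "store_id": 1},
    {"sale_id": 11, "store_id": 9},
    {"sale_id": None, "store_id": 2},
]

print(unique(stores, "store_id"))
print(not_null(sales, "sale_id"))
print(accepted_values(stores, "region", {"north", "south"}))
print(relationships(sales, "store_id", stores, "store_id"))
```

The primary key expectation mentioned above corresponds to running both `unique` and `not_null` on the same column.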
145:41 generic test in our solution make sense let's see so I am here back on my datab
145:44 let's see so I am here back on my datab portal and before just showing you
145:46 portal and before just showing you anything. Let me just take you to the
145:47 anything. Let me just take you to the documentation page and let's see if we
145:50 documentation page and let's see if we find the best web page for that about
145:53 find the best web page for that about dbt test command. No, we do not need to
145:54 dbt test command. No, we do not need to see dbt test command. We need to see dbt
145:58 see dbt test command. We need to see dbt generic test
146:00 generic test because before running the command we
146:03 because before running the command we need to build test right uh writing yes
146:06 need to build test right uh writing yes custom generic test we are not talking
146:08 custom generic test we are not talking about custom but I can just find the
146:11 about custom but I can just find the relevant URL. This is best practices.
146:13 relevant URL. This is best practices. Okay, generic test blah blah blah. So
146:16 Okay, generic test blah blah blah. So all these things we'll be just covering
146:17 all these things we'll be just covering in the generic test as well. But now we
146:19 in the generic test as well. But now we are these are like basically custom
146:20 are these are like basically custom ones. So this is the one that I was
146:23 ones. So this is the one that I was looking for. Let's say this kind of
146:26 looking for. Let's say this kind of syntax. Add description to generic test
146:35 column name data test relationships. Okay, makes sense. Makes sense.
146:39 Okay, makes sense. Makes sense. Let me find a better link.
146:42 Let me find a better link. DVD DVD DVD
146:45 DVD DVD DVD generic test or let's say DVD test
146:49 generic test or let's say DVD test let's see about DVD test command about
146:52 let's see about DVD test command about references
146:54 references so this is basically command that you'll
146:55 so this is basically command that you'll be just talking about don't worry but
146:56 be just talking about don't worry but before that we need to build DVD test
147:01 before that we need to build DVD test uh
147:03 uh improve
147:05 improve what is this oh improve the search
147:08 what is this oh improve the search engine what are you doing test and DBD
147:11 engine what are you doing test and DBD should we use start like should we use
147:13 should we use start like should we use starting perplexity? Do you want us to
147:14 starting perplexity? Do you want us to use that? Okay. What data tests are
147:18 use that? Okay. What data tests are available for me to use in DBT? Perfect.
147:20 available for me to use in DBT? Perfect. This was the web page I was looking for
147:22 This was the web page I was looking for and thank you so much for providing us.
147:24 and thank you so much for providing us. So, as I just mentioned that basically
147:26 So, as I just mentioned that basically these are the four major types of tests
147:28 these are the four major types of tests available. Unique, not null, accepted
147:31 available. Unique, not null, accepted values and relationships. These are very
147:33 values and relationships. These are very basic ones and very fundamental ones.
147:34 And don't worry, I will be telling you
147:36 how you can just even add the severity
147:37 and all those things. But I just want
147:39 you to, you can say, I just want you
147:42 to be introduced to these things. And
147:46 was this helpful? Not really. And it is
147:48 saying custom schema, dbt-utils package, dbt
147:50 Core discussion. Let's see unique. Do we
147:53 have anything in unique? So now I'm not
147:55 relying on this search engine and I have
147:57 to manually search for it. So, what is dbt?
148:01 Build your dbt projects.
148:04 Okay, I think it should be in this one.
148:06 Build dbt projects. I think so. Build
148:09 your DAG. Perfect. Here: tests, data tests. And
148:12 we are not talking about unit tests right
148:13 now. Unit tests are, like, very, very
148:15 important, I will just let you know, and
148:16 this is, for now we are just doing data
148:18 testing. Okay. So, add data tests to your
148:20 DAG, and this was the web page that I was
148:22 looking for and which search engine
148:24 provided to me, but no worries. So
148:26 basically, let's start with the overview. Data
148:28 tests are assertions, or basically you can
148:30 say validation checks, or basically
148:32 expectation is, like, a very good word in
148:34 the world of data engineering. Okay. Data
148:36 tests are assertions you make about models
148:38 and other resources in your dbt project:
148:40 sources, seeds, snapshots. And what
148:43 are seeds and snapshots? Obviously we
148:46 haven't talked about that, but
148:48 yes, we can add tests on seeds and
148:50 snapshots as well, which we'll be just
148:52 talking about very, very soon. Okay. Then it
148:54 is saying when you run dbt test, dbt will
148:56 tell you if each test in your project
148:59 passes or fails. Very good, that's what we
149:01 want, right? And now let's scroll down. We
149:03 have singular data tests and generic data
149:06 tests. That's the thing that we need to
149:08 cover in this particular section. So
149:10 let's start with singular tests. Okay, not
149:13 like singular, generic data tests. Okay, so
149:15 this is the singular data test, just
149:16 ignore it, and this is the generic data test.
149:18 So generic data test is like, you can say,
149:23 to my knowledge, or basically if you
149:26 follow my guidance, you should always add
149:29 data tests in the properties file, because
149:33 you will be adding tests on specific
149:36 columns, and each specific, or basically
149:39 any column that you want to use in your
149:40 data test will be available in the
149:42 model. That's why this is the file that
149:45 you can use in your testing. Make sense?
149:48 Not like exactly this one. You need to
149:49 tweak it. But this is a very good
149:51 example where you can just use the
149:53 properties file to create the data test.
149:56 And it is very easy. And remember, we
149:58 define data tests on the column level. As
150:00 you can see, first of all we define the
150:02 model, and within that model, and remember
150:05 this is a list. So these are, like, all
150:07 the models. But for now we just have
150:09 only one model, which is called orders.
150:12 So this is a list, and the first item is name
150:14 and orders. So basically this will be our
150:16 dictionary. Okay. So name orders, columns,
150:19 and then a list of dictionaries: name
150:22 order ID and then data tests, unique and
150:25 not null. Yes, we can add multiple data
150:28 tests to a single column. It is accepted.
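To make the "list of dictionaries" shape concrete, here is a minimal annotated sketch of the same structure. It follows the docs-style orders example described above; the comments are only there to label which parts are lists and which are dictionaries.

```yaml
models:                  # "models" holds a list: one entry per model
  - name: orders         # the dash starts a list item; the item is a dictionary
    columns:             # "columns" is again a list of dictionaries
      - name: order_id   # one dictionary per column
        data_tests:      # a list, so one column can carry several tests
          - unique
          - not_null
```

Each extra model is one more item under models, and each extra column is one more item under columns, at the same indentation as its siblings.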
150:33 Make sense? Very good. Then we have
150:36 accepted values. Then we have arguments.
150:38 Then we have values, and we have so many
150:41 other things. Okay. So let's copy this
150:44 and let's actually see this in practice,
150:47 because when you will be just building
150:48 the stuff, then you will just get to know,
150:50 okay, we can just add so many things. And
150:53 what about severity? We will just talk
150:55 about severity as well, like how we can
150:56 just define severity, and it should be on
150:59 this particular page. If not, I can just
151:01 show you, it's not a big deal. It's just
151:03 a config that we need to add. So let's
151:05 go to our VS Code and, perfect. So here
151:08 we have this particular macro that we
151:10 built last time. Okay. So now let's say
151:14 I have this particular bronze layer and
151:16 we are getting data from the source. We
151:18 know that, but we actually do not know,
151:20 like, what kind of data we'll be
151:22 receiving, right? There may be some
151:24 malformed records, non-unique records.
151:27 Yes, in the bronze layer we actually do
151:30 not apply any transformation, but at
151:32 least we can add some tests. Because just
151:35 tell me one thing: you are using a table
151:38 from the source, okay, and
151:42 you do not want, forget about
151:43 transformation, forget about
151:45 transformation, you will at least expect
151:48 that your table should not have
151:50 duplicates in the primary key. Because if
151:52 there are duplicates in the primary key,
151:54 that is a source problem, that is not
151:56 your problem. You do not need to
151:59 deduplicate it, because when you are
152:01 expecting a primary key in the source
152:03 table, that means database
152:05 administrators or any source provider,
152:07 it can be an API provider, database,
152:09 anyone, it's their responsibility to add
152:12 a primary key, right? You cannot say, we
152:16 know how to deduplicate, we can just
152:18 deduplicate. No, no.
152:22 If you are expecting a primary key in
152:24 the source, that means you need a
152:27 primary key in the source. If they say
152:30 that they do not have a primary key in the
152:32 source, that is a different story,
152:34 right? When they are saying that they
152:36 have a primary key, that means they need
152:37 to add it. Let's say they said, hey, in
152:39 the bronze sales, if I just open this
152:42 one, if I just run this, so let's say
152:45 they are saying that we have
152:48 um, a primary key in this particular table.
152:51 Okay. And the column will be, let's see
152:55 the results first of all, and then we can
152:57 just figure out, like, what will be the
152:59 primary key, and then we can just add
153:01 data tests on top of those columns, if
153:04 there's a composite key or if it's
153:06 just, like, a single column. Okay, doesn't
153:08 matter much. So why is it taking so long
153:10 for the first time? Because obviously the
153:12 cluster was turned off, so it takes a
153:15 little bit more time. So sale ID, as you
153:17 can see, is a primary key here. Sale ID.
153:20 See,
153:22 right? So, we can just add a test on the
153:25 sale ID that it should be unique, plus it
153:28 should be not null. Make sense? And same
153:31 thing, we can just do it on, let's say,
153:35 bronze store as well. Let's say
153:38 bronze store. If I just run this,
153:48 if I run this, yes, perfect. Because every store should be unique, right? So
153:50 we can actually add these kinds of tests
153:52 on the primary keys. Make sense? And
153:55 then we'll be just adding more and more
153:56 tests. Don't worry. So this is perfect.
153:59 Make sense? So now what are we going to do?
154:01 We will simply open our properties YAML
154:03 file. Okay. And here we have all the
154:06 models listed, right? Let's add one more
154:09 model
154:11 and it will be
154:17 bronze sales. Make sense? Okay. And by the way, we can
154:20 just paste the code as well, like from
154:22 here. So if I go here, I can simply copy
154:26 from columns instead of just writing
154:28 everything again,
154:31 columns until not null.
154:35 Perfect. So now
154:38 if I write this columns, columns is
154:43 order ID. Not really. It was sale ID in
154:45 our case.
154:48 Sale ID. Perfect. Should I snooze my
154:54 No, no, it's fine. Code completion. So
154:56 here, let's add data tests. So what we
154:59 need to do, we'll simply say, hey, this
155:01 is my column name and this is sale ID.
155:04 And then within that, we, basically you
155:06 and me, we just want to apply this data
155:09 test. So we will simply write data
155:10 tests. And then what type of data tests do we
155:13 want to add? Unique and not null. Make
155:16 sense? So let's apply the same thing to
155:19 our
155:21 another model that we discussed, which is
155:23 bronze sales.
155:25 Make sense? Okay. Very good. And this
155:27 time it will be store SK.
155:33 Okay. Store SK. And let me just confirm
155:35 what was the
155:38 name of this
155:39 fact sales.
155:42 Bronze sales basically. Let me run this
155:45 because I think it was SK instead of ID.
155:49 Let's confirm.
155:51 No, it's fine. Sales ID. Sales ID, not
155:54 sale ID. Perfect.
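A minimal sketch of the two primary-key checks described so far. The exact identifiers are assumptions: the model names are written here as bronze_sales and bronze_store, and the key columns as sales_id and store_sk, based only on how they are spoken in the video, so adjust them to whatever your models and columns are really called.

```yaml
# Assumed names: bronze_sales / bronze_store models, sales_id / store_sk keys.
models:
  - name: bronze_sales
    columns:
      - name: sales_id
        data_tests:
          - unique      # the primary key must not contain duplicates
          - not_null    # the primary key must always be populated
  - name: bronze_store
    columns:
      - name: store_sk
        data_tests:
          - unique
          - not_null
```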
155:58 Perfect. Perfect. Perfect. So, this is
155:59 the, these are basically the two tests
156:01 that we have added. Let's say we want to
156:03 add one more test. What is that test
156:06 that you want to add? Let's say we want
156:10 to add
156:12 a test for another column, which is, if
156:15 you remember, we have one more
156:18 column, it's called store name I guess.
156:21 Let's say we want to apply a strong
156:23 check on the store name, because
156:26 sometimes they can just use any
156:28 different naming, okay, they can
156:30 just use, let's say, another case
156:32 sensitivity. So we want to make sure that
156:34 they are using the exact store name. Make
156:37 sense?
156:38 We want to just make sure of this. So what
156:40 will I do? I will create a data test on
156:43 top of this as well. Can we do that?
156:44 Yes. So how? So now I will go to bronze.
156:49 First of all, it's not sales, it's
156:51 bronze store.
156:53 Okay. So this is my model name and
156:56 within that I have this column. I will
156:58 create another,
157:01 another column. Let's say name,
157:06 name and then store name. Yes. But this
157:09 time I do not want to use unique or not
157:11 null. No, I'll be using a special kind
157:13 of generic test. It's called accepted
157:16 values.
157:18 Accepted values. Make sense? So within
157:22 that, yes, in YAML we pass the list in
157:25 the form of dashes. But in this
157:27 particular case, the inline list will be
157:29 more readable, right? Like this. And the
157:32 store names are, let's say, Mega Mart
157:36 Manhattan.
157:42 And then we have Mega Mart Brooklyn. And then we have
157:50 Mega Mart Queens. Not Queens. I thought it is reading my file, but now
157:52 it is just giving me some US city names.
157:56 Mega Mart
158:03 and then Mega Mart San Jose, not San Francisco.
158:11 San Jose. Oh, we have one in Toronto as well.
158:13 Nice.
158:15 It's called Mega Mart Toronto. Okay,
158:18 nice.
158:20 So,
158:22 Mega Mart Toronto.
158:26 Perfect. Toronto. Toronto. Some people
158:29 say Toronto. It's not Toronto. It's
158:31 Toronto.
158:33 Toronto.
158:35 Toronto. The best. Toronto.
158:40 So now this test is also ready. Accepted
158:44 values. Okay. Data test. But the thing
158:46 is, but the thing is you cannot, you
158:50 cannot write directly like this, like
158:51 accepted values. Really, like, you can
158:54 write it, but ideally you can just say, hey,
158:57 if you just want to add accepted values,
158:59 then you, I think, pass, like, one
159:02 more value, it is called arguments. Okay,
159:05 and within that you need to pass the
159:06 values. It is a recent change, it was not
159:09 in the previous versions. So what do you
159:11 need to write? You simply need to say,
159:12 first of all, arguments. Like, first
159:15 accepted values is fine. After accepted
159:17 values you can say like this.
159:21 Uh, first of all let me just remove it. I
159:24 will say Ctrl+X.
159:26 Okay. So now within the accepted values,
159:29 just remember the indentation. Hit one
159:31 more space, or basically tab, then simply
159:34 write arguments.
159:41 Okay. When you write arguments, wow, spelling, look at the spelling.
159:44 So when you pass arguments, now within
159:47 the arguments, you will write something
159:49 called values, within the arguments.
159:53 Within the arguments, like, just hit one
159:56 more space and then you will simply
159:57 write values,
159:59 and the values will be this list. This
160:02 list.
160:03 Let me remove the space.
160:06 So this is the new way to do it. Okay.
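A sketch of the nesting just described: accepted_values, then arguments one level in, then values one more level in. The model name bronze_store and column name store_name are assumed spellings, and the store list is only the handful of names mentioned aloud, so it is illustrative and very likely incomplete for the real table. As far as I know the arguments key belongs to newer dbt releases (the docs mention v1.10.5 and later); the older form is shown in a comment.

```yaml
models:
  - name: bronze_store          # assumed model name
    columns:
      - name: store_name        # assumed column name
        data_tests:
          - accepted_values:
              arguments:        # newer syntax: test arguments nested here
                values: ['Mega Mart Manhattan', 'Mega Mart Brooklyn', 'Mega Mart San Jose', 'Mega Mart Toronto']
          # Older syntax, without the arguments level:
          # - accepted_values:
          #     values: ['Mega Mart Manhattan', 'Mega Mart Brooklyn']
```

The comparison is exact, so a single misspelled or differently cased store name in this list (or in the data) is enough to make the test fail.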
160:10 And why am I seeing this particular
160:13 color mismatch?
160:15 Uh oh. Arguments,
160:19 arguments. It's fine. Okay. So this way
160:22 you can apply multiple data tests to your
160:25 columns, or basically to your source, or
160:26 to anything. So for now we have just
160:29 applied data tests to our bronze models.
160:31 You can apply tests on silver models, on
160:33 gold models, on seeds and snapshots, which
160:36 we'll be just talking about very, very
160:37 soon. And this is the way. Let me just
160:39 save it. So now we want to test it. I
160:41 will open the terminal. Okay. I will
160:43 simply say cd
160:46 dbt YouTube,
160:48 so that I can just enter inside
160:50 that folder. Now I will simply say dbt
160:53 test. When I run dbt test, what will
160:55 happen? dbt test will
160:59 test all the, you can say, things that we
161:02 have for this particular project. For this
161:05 particular project. Make sense? So let's
161:08 run this command and let's see if
161:09 all our tests passed or failed.
161:12 Okay, let's see. Found six models, five
161:14 data tests, six sources. 1 of 5
161:17 START, one fail. Okay.
161:19 Okay, that's fine. Let's see why it
161:22 failed. Failed in test accepted values.
161:24 Okay, very good. So it is saying it
161:26 actually failed. Failure in test
161:29 accepted values, bronze store, store name,
161:31 and so on, whatever. So this test actually
161:34 failed, and that's not a big deal. If it
161:37 has failed, it's fine, because we will
161:39 just make sure, like, what is the thing
161:41 that made it fail, and it's a good
161:45 check. See, if we have anything, let's
161:48 say, which is not relevant, it will be
161:50 failing. Okay. So now let's see why it
161:52 failed.
161:54 It should not be failing. Let me just
161:56 check the real values in
161:58 Databricks. Maybe they are just giving us
161:59 the cleaner version. So I'll simply go
162:01 to Catalog
162:03 and dbt tutorial dev,
162:06 bronze,
162:08 bronze, let's say, store,
162:11 sample data.
162:14 So the value is fine. Manhattan. Okay, I think
162:18 I wrote the wrong spelling for Manhattan.
162:20 Yep, it's "a", it's not "e". See,
162:25 one wrong value in the source will not
162:29 allow you to process your data. That's
162:32 the power of data tests. And that's the
162:36 obvious behavior that we should see.
162:38 Make sense? dbt test.
162:41 Now let's run it one more time. Now
162:43 let's see if it fails. Oh, it failed
162:45 again. Maybe there's, like, a mismatch. One
162:48 more time. Got five results, configured
162:49 to fail if not equal zero. Completed
162:52 with one error. We just have one error.
162:53 Okay.
162:55 Okay. Only one test failed actually, and
162:58 other tests are running fine. PASS, START
163:01 test unique bronze sales ID, store key.
163:04 Okay. Okay. So, accepted values is
163:06 failing. Okay. Makes sense. Mega Mart,
163:09 Mega Mart Brooklyn,
163:11 [Music]
163:13 Mega Mart San Jose.
163:16 Okay.
163:18 Mega Mart Toronto.
163:21 Uh, I think it is fine.
163:26 Mega Mart
163:28 Toronto. Okay, so now let's cover
163:32 that particular topic as well. I think
163:33 this is the right time to cover that. So
163:35 let's say you are receiving these issues,
163:38 or basically these errors, and if the
163:41 test fails it will throw an error. But
163:44 sometimes you do not want to throw an
163:46 error, you just want to throw a warning.
163:48 Maybe this is, maybe this is not
163:50 much critical to your particular table,
163:52 this is just, you can say, a good-to-have
163:55 test but not, like, a mandatory test. So you
163:58 can just configure the severity, severity
164:02 of your test as well. How can you just
164:05 cover severity of your test? You simply
164:07 need to add config in your particular
164:10 test. So let's say you want to add
164:14 config, let's say here, uh, data tests,
164:17 accepted values,
164:20 and you will simply say
164:30 here, yes, config. And then you will simply say config, and
164:33 in config, I think, if I'm not wrong, it is
164:34 called severity directly. So these are
164:37 basically recent changes. So severity is
164:40 warn. So let's say I just want to warn
164:42 myself, okay, instead of throwing an
164:45 error. Now let's try to run this, because
164:47 this time we know that we have wrong
164:49 values, but still we should see a warning
164:51 instead of an error. And I think the accepted
164:54 values indentation should be like this.
164:57 Just add one more tab. Yep. Because it
164:59 should be inside accepted values, because
165:01 this is a property, basically
165:02 configuration for the accepted values data
165:04 test. Let's see. Let me just first of
165:06 all save the file, because for sure it
165:09 will pick only the saved version of the
165:11 file. So now let me run cd
165:15 and I think I will be using this one and
165:18 dbt test. Let's see, and this time we
165:21 should see a warning instead of a failure.
165:22 Let's see. Okay, perfect. Now we are
165:25 seeing warning and WARN. Now it is not
165:27 throwing an error. It is saying, hey, you
165:28 have a warning in test accepted values.
165:31 Maybe you have some wrong
165:32 spelling. Maybe you have something that
165:34 is not expected. So this is basically
165:37 the warning that we are seeing. And this
165:39 way you can add the severity.
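A sketch of where the severity setting goes: a config block inside the accepted_values test, at the same indentation as arguments. The model, column, and store names are the same assumed spellings as before and are only illustrative.

```yaml
models:
  - name: bronze_store          # assumed model name
    columns:
      - name: store_name        # assumed column name
        data_tests:
          - accepted_values:
              arguments:
                values: ['Mega Mart Manhattan', 'Mega Mart Brooklyn', 'Mega Mart San Jose', 'Mega Mart Toronto']
              config:
                severity: warn  # report failing rows as a warning, not an error
```

With severity set to warn, unexpected values are still reported, but the run summary counts the test under WARN and the command does not end in an error because of it. Leaving the config out keeps the default severity, which is error.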
165:42 See, at the end it is saying PASS equals
165:44 4, WARN equals 1 and ERROR equals
165:47 zero. That's how you can just set the
165:49 severity in your data test. And let's
165:52 see what's actually wrong with
165:53 this. Maybe we have just misspelled
165:56 something.
165:57 Mega mart Brooklyn.
166:01 So it will take time. So let's add a test
166:05 on a different column. That's not a big
166:06 deal. So let's say I want to add a test
166:09 on any other column, because here I would have to
166:12 check the spelling and the case
166:14 sensitivity of each letter. So let's say I want to add
166:15 a test on city, or basically on the
166:17 country. We know that there will be just
166:19 two countries, USA and Canada, for this
166:21 particular store data. So let's add
166:23 those values. Okay. And let's see. So
166:26 the column name is country.
166:29 Column name is country.
166:49 And let's use double quotes maybe, if it wants double quotes, but let's try with
166:51 this one. If it gives an error, or basically a
166:53 warning, we will just try with
166:54 the double quotes. Okay. So let's now
166:58 run dbt test.
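The country check described above could look roughly like this. The model name `bronze_store` is an assumption; the column name `country` and the two values come from the walkthrough. Single-quoted and double-quoted strings are equivalent in YAML here, and the comparison against the data is usually case-sensitive, so the values need to match the stored spelling exactly.

```yaml
models:
  - name: bronze_store            # assumed model name
    columns:
      - name: country
        data_tests:
          - accepted_values:
              values: ['USA', 'Canada']   # must match the stored values exactly
```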
167:06 Perfect. Let's see. Let's see what we get. Okay. Warning.
167:10 Oh, that is something else.
167:11 Configuration paths exist. Okay. Okay.
167:12 Configuration paths exist in your dbt_project.yml
167:14 file which do not apply to any
167:16 resources. Configuration paths. Okay. Oh,
167:19 that is just a normal warning for
167:20 our dbt_project.yml, that is fine. But
167:23 all our tests have passed. Yes. So
167:25 that means there was some, you
167:27 can say, spelling
167:32 mistake, case sensitivity and all. So we
167:34 just applied the test on a
167:35 different column. Okay, make
167:37 sense? So this way you can actually see
167:40 all the tests have passed, right? And
167:42 zero warnings and zero errors. That's how
167:43 you work with data tests. You add
167:45 severity, you add multiple, you can say,
167:48 conditions on one column. So I have just
167:51 shown you so many ways to do it, right?
167:54 So this was all about the generic tests.
167:57 Now let's talk about the singular tests.
167:59 What are singular tests? So now if we
168:02 talk about singular tests, first of all
168:05 let me just tell you what singular
168:07 tests are. Okay. So you understood that these
168:08 are the generic tests, and these are the
168:10 tests that we apply on a column.
168:12 Singular tests are like one step ahead.
168:15 Why? And what's different in
168:18 singular tests? Basically, singular tests,
168:22 to me, if I want to understand
168:24 singular tests, I will treat singular tests
168:27 as logical tests,
168:30 tests that can be built for KPIs, tests that can
168:34 be built for the business. Because these generic
168:36 tests are for us, for data engineers. Okay,
168:39 obviously those singular tests are for
168:42 us as well, but
168:44 the stakeholder for those tests can
168:47 be the business as well. But the business is
168:50 not responsible for these kinds of
168:51 tests: unique, not null. Just imagine a
168:54 business success manager or whatever
168:55 business manager or whatever manager
168:58 saying, hey, why do we have nulls?
169:00 Obviously they cannot, they do not know,
169:01 right? So now singular tests are for
169:06 when you want to test a logic, when you
169:09 want to test a KPI logic, when you just
169:11 want to test something else which will
169:13 require multiple columns from multiple
169:15 tables. It can be literally anything. So
169:18 just a quick example. So what you do:
169:20 first of all, in your dbt
169:23 project you will be having one folder,
169:25 it's called tests. So whatever you will
169:28 be just writing here, it will be treated
169:30 as your singular test. Okay. Whatever
169:35 you will be writing here. Yes, you can
169:37 create generic tests here as well. Really?
169:39 Yes, I will just talk about that as well:
169:41 what are custom generic tests. But let's
169:43 talk about the singular tests first of
169:44 all. So if you just want to create a
169:46 singular test, you can just think of it
169:48 like any SQL statement that you want to
169:50 run as a test. It can be
169:53 anything. Let's say you do not want any
169:55 price less than
169:58 zero. Okay, so it should be
170:00 non-negative, because prices are not
170:01 negative. It can be zero if like 100%
170:03 off is there. But you do not want
170:06 something to be there with,
170:08 let's say, a negative value. So let's see,
170:12 let's see one test. So let's actually
170:16 try to build it, because only then you will
170:18 understand it. Okay. And if you
170:20 understand it, only then you will crack
170:22 the interviews, and if you crack the
170:24 interviews, only then you will be
170:26 happy, and if you are happy, only then I
170:29 will be happy. So there's a chain, right?
170:31 So let's say bronze sales, if I just
170:34 query this table,
170:36 and we want to just make sure that
170:38 whatever the value is for this
170:41 particular column, which is called gross
170:43 amount, it should not be less than zero.
170:46 That wouldn't make any sense, right? And you
170:48 need to make sure, because
170:50 there can be an error on the source
170:52 side, where you are saying, hey, you have
170:54 just given me a negative value, and
170:56 just one minus sign can disturb, or I
171:00 would say disrupt, the whole finance
171:03 sheet, right? One single negative sign. So
171:06 it's not an error in the values, just
171:09 one sign can actually disrupt everything.
171:12 So we need to make sure, we need to make
171:14 sure. Okay. So how can we make sure?
171:16 So obviously we do not have any kind of built-in
171:18 generic test for non-negative values
171:21 and all those things, or basically this
171:23 is not just on one column, there can
171:25 be multiple columns, multiple logics
171:28 and multiple things. Let me just run this.
171:30 Let's say I want to make sure that
171:34 all these columns should not be negative:
171:37 discount amount, okay, and net amount, or
171:40 let's say gross amount and net amount.
171:42 And yeah, these two columns, because
171:44 discount can be negative. Maybe you are
171:45 overcharging. So maths, simple maths. I
171:49 know you have not studied maths, but
171:51 trust me. So gross amount and net
171:54 amount. Okay. So let's create a test.
171:58 So if you just go to the tests folder, let's
171:59 create a file, and I will simply say
172:04 finance, or basically non-negative,
172:07 non_negative_test.sql.
172:10 Let's create this file. So what will I
172:12 do? I'll simply write select *
172:20 from, and you know what we are using: we are using source. So I'll just
172:23 use the Jinja function source, and the source
172:25 name is source,
172:28 source,
172:35 and then you can simply write bronze, not bronze, I would say fact sales,
172:38 because that is our, you can say,
172:42 data. But you know what, we can actually
172:44 apply this test on top of our bronze
172:47 model instead of the source, because we need
172:50 to make sure that once the data is
172:51 loaded in bronze, it should not be
172:53 negative. Make sense? So we can just
172:55 actually apply it on bronze as well. So now
172:58 you will learn one more thing. What is
173:00 that? It is called the ref function. Ref
173:02 function is, so, hey, snooze,
173:06 giving spoilers.
173:08 giving spoilers. So
173:11 So so now let's talk about ref function.
173:13 so now let's talk about ref function. Now what is ref function? Ref function
173:14 Now what is ref function? Ref function is an abbreviation for reference. So we
173:18 is an abbreviation for reference. So we are using source objects using source
173:21 are using source objects using source function. Now from now onwards we want
173:25 function. Now from now onwards we want to use let's say models from the bronze
173:27 to use let's say models from the bronze layer and we know that we have all these
173:30 layer and we know that we have all these models bronze returns bronze sales
173:32 models bronze returns bronze sales bronze store and so many other things.
173:34 bronze store and so many other things. How we can just use these models as a
173:36 How we can just use these models as a source. So there's a function called
173:38 source. So there's a function called ref.
173:40 ref. So I will just use ref and here you can
173:43 So I will just use ref and here you can simply write the model name which is
173:46 simply write the model name which is bronze and then let's say sales make
173:50 bronze and then let's say sales make sense and then I will say and don't
173:53 sense and then I will say and don't worry I'll just show you the compiled
173:54 worry I'll just show you the compiled version of this as well where
173:58 version of this as well where uh gross amount
174:06 is greater than zero, and no, not greater than zero, less than zero, and I will just
174:08 tell you why,
174:10 and net amount,
174:14 Ctrl+I to continue with Copilot,
174:18 net amount
174:20 less than zero. Why less than zero?
174:22 Because a singular test works in
174:26 the reverse order. If it returns any
174:29 record, then it will be treated as a
174:32 failed test. If it does not return any
174:35 record, that means it will be treated as a
174:37 passed test. So obviously we do not want
174:39 to see any record for less than zero and
174:42 less than zero. Make sense? Before
174:43 running it, I will just show you the
174:44 compiled version of it. If I click here,
174:46 compile, dbt preview. So this is the
174:49 compiled version of it. See, so we know
174:51 that whatever we have in bronze
174:54 sales, if we are referring to bronze sales,
174:56 we know that this object is created in
174:58 this particular catalog and in this
174:59 particular schema, and the model name is
175:01 bronze sales. So whatever we are writing
175:03 here in ref, it will be pointed directly to
175:05 this particular object behind the
175:07 scenes.
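To make the compile step concrete, here is a small sketch of what `ref` resolves to. The identifier `bronze_sales` is assumed from the spoken model name, and the catalog and schema names in the comment are placeholders, since the transcript does not state them.

```sql
-- What is written in the file:
select *
from {{ ref('bronze_sales') }}

-- Roughly what the compiled SQL looks like (placeholder catalog and schema):
-- select *
-- from `my_catalog`.`my_schema`.`bronze_sales`
```

Besides resolving the name, `ref` also tells dbt that this query depends on the `bronze_sales` model.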
175:08 Make sense? Good. So now we can simply
175:11 close it. And now we can simply run it
175:13 as well, just to make sure it is
175:15 running fine.
175:18 And let's see if it runs. It should run,
175:20 because that's just a Jinja function.
175:22 So see, we do not have any kind of
175:24 result. That means it is a passed test.
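Putting the dictated pieces together, the singular test file might look like the sketch below. The column identifiers `gross_amount` and `net_amount` and the model identifier `bronze_sales` are assumed from the spoken names. The condition uses `and` as in the video; that only flags rows where both columns are negative, so `or` may be closer to the stated intent of catching a negative value in either column.

```sql
-- tests/non_negative_test.sql
-- A singular test passes when the query returns zero rows.
select *
from {{ ref('bronze_sales') }}
where gross_amount < 0
  and net_amount < 0   -- as dictated; consider `or` to catch either column being negative
```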
175:26 Okay. So this is our test. Now if I just
175:29 run, let's say, the dbt test command one more
175:31 time, what will happen? It will
175:33 include this test as well.
175:35 dbt test.
175:42 No dbt_project.yml found at expected path. Obviously. Obviously, who will write cd?
175:46 Oh man. Yep.
175:48 dbt test.
175:51 Perfect. So now it is adding this test
175:53 as well. This time we have six tests.
175:54 Earlier it was only five tests, right?
175:57 And six tests, as you can see: unique,
175:58 bronze, blah blah blah. And I think this
176:01 test is here. Non-negative test. See,
176:03 this is passed: RUN and then PASS.
176:07 Make sense? That's how you can define
176:09 your singular tests. And just imagine, as I
176:13 just mentioned, this select statement can
176:15 be so complex.
176:18 There can be multiple joins, there can be
176:20 nested subqueries, there can be so many
176:22 KPIs, so many custom functions, literally
176:26 anything. Singular tests, that's why I just
176:29 treat them as logical tests, or basically
176:31 alerts.
176:32 Let's say you are finding something,
176:34 right, and if you see your profit goes
176:38 down by 20%, you want to fail everything.
176:41 That can be a test, right? Because you
176:43 cannot expect profit less than 20%. Like,
176:45 obviously in the real world it is less
176:46 than 20%. But that's it. So this way you
176:50 can build your singular tests.
176:52 Make sense? Make sense? Makes sense.
176:54 Makes sense. Now let's talk about the
176:58 custom generic tests. That's an
177:00 advanced topic, but I want to cover this
177:02 as well. So if I just go to the
177:04 documentation, let's see,
177:06 custom generic test,
177:10 custom generic test,
177:13 dbt.
177:21 Custom compile. Oh wow, man, what a search engine. Writing custom generic data tests.
177:24 Wow.
177:25 So, writing custom generic data tests.
177:27 Very good. So now it is saying, as I just
177:30 mentioned, that if you want to
177:32 create a custom generic data test,
177:34 you have to use the tests directory. Okay.
177:36 And what you need to do, basically, let me
177:38 tell you: you have to create a generic
177:40 folder within the tests folder, so that it
177:42 will be treated as your generic data
177:44 test, but it will be a customized one.
177:47 So far we have just four generic tests:
177:49 not_null, unique, accepted_values
177:54 and relationships. But let's say you want
177:56 to create a custom generic test for
177:58 non-negative, because you know that
178:00 non-negative is another very common scenario,
178:02 a common test for your organization. So
178:04 you can pre-build your generic test, and
178:07 it will be a customized one, and you can
178:09 use it in the same way that you use the
178:11 other generic tests. Wow. Literally wow.
178:15 So here let's see some more things, and
178:18 this is the way that we create the
178:20 generic data test. Basically, this is a
178:22 kind of macro, not, yeah, a kind of
178:25 macro with the test keyword that we need
178:29 to create. Obviously, we haven't covered
178:31 macros yet, but there's nothing special
178:33 in it, because only two lines are
178:36 reserved for the macro. One is the top one
178:39 and the second one is this one. See? And
178:42 the other things are exactly the same. Make
178:46 sense? So what you need to do: here you
178:47 will simply write your select statement
178:49 that you are writing, like select column,
178:51 and it will be dynamic, because you are
178:53 creating a generic test that you
178:54 can use anywhere, literally
178:56 anywhere. So you will be using column
178:58 name, then model, and then whatever, and
179:01 then you can simply say, hey, it should
179:03 not return anything. Make sense? Make
179:05 sense? Make sense? So I know it is a
179:07 little bit complex, but don't worry, let
179:09 me just show you, then you will
179:10 understand it. Okay. So we are back here
179:13 in our VS Code editor. So let's try to
179:17 create the customized generic test.
179:19 Okay, make sense. So for that, what
179:21 will I do? I will simply create a
179:23 directory within this tests folder. I
179:25 will simply call it generic, because
179:26 this is mandatory. Okay, generic. And
179:30 within this generic folder, whatever I'll be
179:33 writing here will be treated as my
179:35 customized generic test. Let's create a
179:37 file within this. And I will name it,
179:39 let's say,
179:41 generic
179:43 and then non-negative.
179:51 Make sense? .sql. Very good. So now we simply need to, first of all, write our
179:52 SQL statement: select *
179:57 from our table name or whatever that we
180:00 want. Okay. So here obviously it will be,
180:02 let's say, table. Make sense? Make sense?
180:06 Okay. Very good. And in this case it can
180:08 be model, because in this particular test,
180:10 you remember, we referred to the model. So
180:12 let's say we just wrote model. Very
180:14 good. And we want where, any column on
180:19 which we want to apply this test. Let's
180:20 say column should be less than zero.
180:23 Why? Because if it returns anything, that
180:25 means it has failed. So you know this
180:26 thing. Now we need to convert this
180:29 simple thing into a customized generic
180:31 test using a macro. So it is very
180:33 simple. You simply need to write, let's
180:35 say, this, and then test, and you can write
180:40 a percentage sign here
180:42 instead of writing the double curly braces. So
180:45 this is basically the function that
180:47 you'll be writing. It's called test,
180:49 and after test you need to give a name.
180:52 If you follow my guidance, I
180:53 will ask you to keep the test name
180:55 and the file name the same for better code
180:57 management. I will simply call it
180:59 generic non-negative. Perfect. So this
181:01 is my test name. Okay. And this is
181:04 basically the macro. Macro means
181:06 function. If you create a function in
181:09 Python, it is equivalent to a macro here.
181:11 Okay. In Jinja I will just
181:13 pass two parameters. One is model. The
181:16 second one is column. Make sense? So
181:19 these are basically two parameters. So
181:22 obviously we need to remove this
181:23 particular thing here. So what do we need
181:25 to write? We need to say, hey, the model
181:28 is not a hardcoded model, because, just use
181:30 your common sense, if you're writing
181:32 select * from model, it will look for
181:34 model as a table. But you need to say, hey,
181:37 this is a variable, just use model here.
181:41 Simple. And same with column as well.
181:44 Let's say like this. Simple.
181:48 Simple. Okay. And then you need to
181:50 simply write end test.
181:55 endtest. Perfect. That's it. Let me
181:57 just show you the compiled version of
181:58 it. So what is it referring to? Obviously it
182:01 will not give you anything, because we
182:02 have not passed the values. We have not
182:04 called this function. This is just
182:05 basically the, you can say, coded
182:09 version of this particular macro. Okay.
182:11 And if you go to the documentation
182:13 page, you can see everything is the
182:14 same. This is test, and then the
182:16 function, or basically macro, name, and we
182:18 are just using model and column name,
182:20 right, like this. And it's not column,
182:22 it's column_name. So we can just
182:24 correct it,
182:26 and it should be exactly the same. Yes, it
182:28 should be exactly the same, because this is a
182:30 macro that we are building for a
182:32 customized generic test. Okay. Then
182:34 column_name here as well. Perfect. Just
182:36 correct it. Perfect. So this is
182:38 correct it. Perfect. So this is basically the generic test that we have
182:39 basically the generic test that we have built. If I go to my project yam
182:42 built. If I go to my project yam basically properties yaml file let's say
182:45 basically properties yaml file let's say I want to add this particular generic
182:47 I want to add this particular generic test to my column and which column I
182:51 test to my column and which column I want to apply that particular test on
182:56 want to apply that particular test on gross amount. Okay I want to apply this
182:59 gross amount. Okay I want to apply this on gross amount and I will simply say
183:04 on gross amount and I will simply say data tests and it will become what is
183:07 data tests and it will become what is the test name? What is the test name
183:09 the test name? What is the test name that you'll be using now? What is the
183:11 that you'll be using now? What is the test name? It is called generic
183:15 test name? It is called generic non negative.
183:18 non negative. Perfect. That's it. That's it.
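For reference, the properties entry described above could look roughly like the sketch below. The file path and the model name bronze_sales are assumptions made for illustration, since they are not spelled out at this point in the video; gross_amount and generic_non_negative follow the names spoken in the walkthrough.

```yaml
# models/bronze/properties.yml  (hypothetical path)
version: 2

models:
  - name: bronze_sales            # assumed model name
    columns:
      - name: gross_amount
        data_tests:
          - generic_non_negative  # the custom generic test built above
```

Because the test is listed under a column of a model, dbt already knows which model and which column to hand to the test.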
183:22 And you may feel this is not enough. Why?
183:26 Because, if you remember,
183:28 we need to pass two parameters as well,
183:31 right? Column and model name. Basically,
183:33 model and column_name. But actually you
183:36 are passing both the parameters. How?
183:37 This is the model name here, and this is
183:39 the column name. So it will
183:40 automatically pass both the parameters.
183:42 Okay, so you are passing them. Do not
183:44 feel like you're not passing the values.
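A minimal sketch of what such a custom generic test might look like, assuming the rule is simply "the column must not be negative". The file name and the exact condition are assumptions, since the macro body is not shown in this part of the video; model and column_name are the two arguments dbt fills in automatically from the properties file.

```sql
{# tests/generic/generic_non_negative.sql  (hypothetical file name) #}
{% test generic_non_negative(model, column_name) %}

-- Return the offending rows; the test passes when this query returns no rows.
select *
from {{ model }}
where {{ column_name }} < 0   -- assumed condition

{% endtest %}
```

Note that the second argument must be named column_name, not column, which is the correction made in the walkthrough.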
183:45 If I run dbt test now, let's see what
183:48 happens.
183:49 cd, and then dbt test,
183:57 and this time we have seven data tests. Perfect.
183:59 And six, seven. Perfect, it's working fine. You
184:03 can see our test, which is called
184:04 generic_non_negative, the bronze test. This is the
184:06 one. And it passed.
184:14 Hey Siri, shut up.
184:21 Oh man. So this is all about your
184:25 customized generic data test. I hope you
184:27 got all the data testing in DBT. So now
184:31 let's see what DBT seeds are. I know,
184:34 very fancy name, and you know what? It's a
184:37 very simple and very handy thing,
184:40 very handy. What are DBT seeds? Have you
184:42 ever worked with lookup files, lookup
184:46 tables,
184:47 you can say mapping tables, mapping files?
184:51 No? Very good, slow claps. If yes, very
184:55 good. So basically, what are these things?
184:58 These are basically, you can say, the same
185:01 thing in DBT. So if you have worked with
185:04 mapping tables: mapping tables can be,
185:06 let's say, parameter files. They can be
185:08 treated as just a lookup,
185:10 anything which is very small, which is
185:12 very static, which will rarely change. So
185:15 you do not want to load that data from
185:18 the source and do something; it is for
185:20 your particular reference. It is for
185:22 your particular development. That is
185:24 called a mapping table. And that is the
185:26 same thing, but we have a fancy name: DBT
185:28 seeds. And I love this name. I love this
185:30 name, DBT seeds. So what are DBT seeds?
185:33 So let's say you have a mapping table.
185:35 It can be literally anything. It can be,
185:36 let's say, a mapping table for categories.
185:39 It can be a mapping table for, let's say,
185:42 anything, literally anything. Okay?
185:43 Because there are infinite use cases
185:46 for mapping tables. It can be literally
185:47 anything, bro. So what do we do in DBT
185:51 seeds? DBT says, if you want to work with
185:54 mapping tables, lookup files, whatever
185:56 you used to store in your platforms,
185:58 let's say any platform, AWS, Azure, GCP,
186:02 a database, anywhere, you can store those
186:04 things in DBT seeds as well. How? So,
186:06 you just need to create CSV files. Okay,
186:09 CSV files. Simple CSV files. And those
186:12 CSV files will be used to create your
186:16 mapping table
186:19 in your
186:21 desired schema
186:32 and catalog, because obviously we can just
186:34 configure it. Okay. So this is the
186:36 concept of DBT seeds. Simple. You will say,
186:38 Ansh Lamba, this is so simple. Obviously,
186:40 obviously, obviously this is so simple,
186:42 because you need to hit that subscribe
186:43 button, because I'm just making it simple
186:45 for you.
186:47 So, desired schema, okay, and desired
186:50 catalog, and it will create a mapping
186:51 table on top of the CSV file, and
186:53 everything will be automated, as you know
186:54 about DBT, right? So let's see how we can
186:57 work with DBT seeds using a
187:00 real use case. Okay, let's see. So this is
187:02 our lovely
187:06 DBT project. Okay. So now what do we need to
187:08 do? We have, yes, a dedicated folder for
187:11 seeds as well. That is why I told you
187:13 that all this folder structure can be
187:16 modified, but it is perfect, it is perfect,
187:20 do not change it. So if you go inside
187:22 seeds, you can create a file. It's called,
187:25 let's say, mapping
187:27 dot csv. Now, it does not have to be
187:30 called mapping exactly, okay? It's not like
187:32 that. You can give it any name. Let's
187:34 say lookup,
187:37 lookup dot
187:39 csv. It can be lookup.csv, right? It
187:41 can be literally anything. And within
187:43 that, you just need to create a CSV. CSV
187:45 means comma-separated values, right? You
187:48 didn't know the full form of CSV? Oh,
187:50 man. Come on, man. So, what can be
187:53 a good example? Let's say you are
187:55 storing some information,
187:59 any information. Let's
188:02 create a dummy table. Okay. So
188:04 let's say you are storing something
188:05 called customer information. Okay, we
188:08 already have a customers table, but
188:09 let's imagine. I know you
188:12 are really good at imagination. So let's
188:14 say customer ID.
188:17 Okay, perfect. See? Love you, man. So this
188:21 is our CSV. Let me save this. So I have
188:25 this CSV file within seeds. If I now
188:28 run my model, it will not create that
188:30 seed. Why? Because we have a special
188:32 command for it. It's called dbt seed.
188:34 But you know what? It will create
188:36 everything in the default schema and
188:38 default catalog. Oh, so Ansh Lamba, do we need
188:42 to configure the schema for seeds as
188:43 well? Perfect. Yes. So if I go to
188:46 dbt_project.yml,
188:47 the same way that we have done for
188:49 models, we need to do it for seeds as
188:52 well. I will simply write seeds. Okay.
188:55 And then I will simply write my DBT
188:58 project name.
189:01 Perfect. And within this I will simply
189:03 say +schema,
189:06 and the schema will be, let's say, bronze,
189:08 because I want to save it in
189:10 bronze. Makes sense? So now let me just
189:12 save it. You can also create a
189:14 properties file within seeds as
189:18 well. That's fine, but for seeds it is
189:21 good to have it here. To my
189:24 understanding, you should define your
189:26 schemas and all in the
189:30 project, at the project level, because it
189:31 is very easy to look up. But yes, if you want
189:34 to change the schema
189:36 or basically anything, you can
189:38 either create properties or a config block.
189:40 Simple. Okay, so this is fine. Let me just
189:43 run this seed: dbt seed. Okay, so let me
189:48 run this. Let's see what happens.
189:50 Obviously an error. Obviously, who will write the
189:52 cd command? cd, and then
189:55 dbt seed.
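The project-level seed configuration described above can be sketched as follows. The project key my_dbt_project is a placeholder, not the name used in the video: it has to match the name declared at the top of your own dbt_project.yml.

```yaml
# dbt_project.yml (excerpt)
seeds:
  my_dbt_project:      # hypothetical: use the `name:` of your own project
    +schema: bronze    # load every CSV under seeds/ into the bronze schema
```

With a file such as seeds/lookup.csv in place, running dbt seed from the project directory loads it as a table named after the file (lookup).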
189:57 Perfect. So let's see. It is saying, okay,
190:01 concurrency, threads, found six models.
190:03 Okay, so now: configuration paths exist in your
190:06 dbt_project.yml file which do not apply to
190:08 any resources. That we know, yeah, yeah, yeah,
190:10 silver and gold are those things. So you
190:12 need to wait. It is just creating that
190:13 particular table, and whenever it creates
190:15 a new table, it takes time.
190:18 And you can literally see that table in
190:19 Databricks, and I will just show you,
190:21 don't worry,
190:22 don't worry.
190:23 So here, let's wait. And here we have
190:26 seven data tests, six models, one seed, six
190:28 sources. Earlier,
190:31 I think it was just zero seeds, right? But
190:34 now you are seeing one seed as well, and
190:37 six sources. Perfect.
190:40 Makes sense? A little bit about why we
190:43 need seeds and how to use seeds. Okay, so
190:47 meanwhile it is just creating these
190:48 things and taking some time, so let me
190:51 take you to the DBT seeds docs. Let's see if
190:53 we find anything
190:55 that will be important to you, because I
190:57 know you are a pro. You know everything.
190:59 So, add seeds to your DAG. Seeds are
191:02 CSV files in your DBT project, typically
191:04 in your seeds directory, that DBT can
191:06 load into your data warehouse using the dbt
191:08 seed command. Simple. Same thing we have
191:10 just done. Okay. And it is saying you
191:12 can also use the seed. That is very
191:14 important. That is my second step that I
191:16 will tell you. Once the seed is
191:18 created, how will we be using the seed
191:21 within our DBT project? The same way you use
191:23 other models, using the ref function. No
191:26 difference, zero difference. So as you
191:29 can see, this is the command to use the
191:31 seed. Simple, simple, simple, simple.
191:33 Let's see. Okay, completed successfully.
191:35 This is done. Let's see if it has
191:37 created the seed for us. If I go to
191:39 catalog, if I go to
191:42 my catalog, bronze, tables, do I have the
191:45 lookup table? Yes. See, it has created
191:48 the seed with the same name as
191:50 the file. Simple: lookup. Makes sense?
191:53 Very good. Now let me show you one
191:54 more thing. Within the DBT project you will be
191:57 seeing one folder. It is called
191:59 analyses. What is this thing? What is
192:02 this analyses folder? Basically,
192:05 let's say I want to run one SQL query.
192:09 I do not want to include that SQL query
192:12 in the building of my project or
192:15 objects. But I still want to have one
192:17 quick folder where I can just go and run
192:20 any query, and that query will not be
192:22 part of my development. But it is for my
192:24 reference. If I want to come back
192:26 and use the same query, that
192:28 we can do in the analyses folder. See,
192:30 it is an empty folder. Let's say I want
192:32 to create one simple quick query. I will
192:35 simply say, let's say,
192:40 explore. Okay, explore.sql. So this
192:44 particular SQL file will not be used
192:48 anywhere,
192:50 nowhere, but it will be here for
192:54 my reference. Let's say I want to see
192:57 what data I have in the seed.
192:59 Let's say select * from
193:04 ref, not bronze_sales, lookup. Okay, I want
193:07 to run this. Let's save, and let me
193:11 just run it. Okay,
193:14 just to see the output, like, what is the
193:16 output? I know it is the lookup.
193:18 Okay, so this is the data. Perfect. So
193:20 now, I do not want to include this
193:23 particular query in any corner of my
193:26 project. No. But I want to keep this
193:29 particular file for analysis, for
193:31 quick queries, for, you can say, quick
193:34 query building, anything, literally
193:37 anything, bro. Whatever.
193:38 Literally anything.
193:41 That's why it is called the analyses folder.
193:42 If you want to analyze your data, but
193:44 you do not want to include that in your
193:46 project, that is the thing that you do
193:48 under the analyses folder. Makes sense?
193:51 And this way, this query is
193:54 now saved here. It's not going anywhere.
193:57 But it will not be used anywhere in
193:59 your project. It's just for your
194:00 reference. Makes sense? So this way you
194:03 also saw how you can use a lookup,
194:08 or basically seeds, using the ref function,
194:10 the same way that you use other models.
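The quick query from this step can be sketched like this, assuming the seed file is named lookup.csv as in the walkthrough. In a default dbt project the folder is named analyses; files there are compiled by dbt (for example with dbt compile) but never built as tables or views.

```sql
-- analyses/explore.sql
-- A seed is referenced with ref(), exactly like a model.
select *
from {{ ref('lookup') }}
```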
194:12 Makes sense? I hope.
194:15 I hope. So now let's commit these
194:19 changes. We have made a lot of
194:21 changes. Okay. So now what will I do? I
194:23 will open my terminal, and I do not need to
194:26 write cd, because we are just
194:27 committing everything, including these
194:29 things as well. So I'll simply say, and
194:31 make sure you are on the feature branch.
194:33 Okay. So I will simply say git add dot,
194:37 git commit
194:39 minus m, and we will say, what did we
194:42 learn in this particular section?
194:45 Um,
194:47 seeds,
194:49 and we learned about tests,
194:52 etc. Okay, perfect. So all are committed,
194:57 and now it's time to learn about Jinja.
195:01 Let's explore Jinja and macros in
195:04 detail. Okay. And then you will
195:06 become a master of Jinja
195:08 and macros. Trust me, it's fun. It's very
195:11 fun, and after learning Jinja you will
195:13 understand the power of DBT. That is, you
195:15 can say, the main power of DBT. You can
195:18 make templates and all those things.
195:20 Okay. So before that,
195:22 hold on, hold on. You can
195:24 actually see the lineage as well. Let me
195:25 just show you. If you run this command,
195:28 you can see the lineage as well for the
195:30 seed. If you click on reset... basically
195:34 this is just a file. If you just go to
195:36 seeds here and click on
195:39 lineage:
195:42 a valid DBT file, etc., blah blah blah.
195:46 Oh man. You just need to
195:48 click on reset and you will also see the
195:49 lineage, and I have already shown the
195:52 lineage, so that is fine. But yeah, you
195:54 can see the lineage as well. And don't
195:55 worry, all those things will be covered
195:57 when you are building silver and gold
195:59 here, because you will see the entire
196:00 lineage. I love lineage because it is a
196:03 great thing. Okay. And yes, don't worry,
196:06 we're going to cover target, snapshots as
196:08 well. And this target folder, as you can
196:10 see, we have so many things now in
196:11 run. We have tests, we have seeds, we
196:14 have models. All those things that we are
196:16 building here, everything is being stored
196:19 here in the form of,
196:21 not logs, basically the queries that are
196:24 being used. So if you go to tests,
196:26 you will see this test query was
196:28 actually used to run your non-negative
196:30 test .sql, which is a singular test. See,
196:34 this is your query: select count(*) as this,
196:37 this, from this, select from this. So this
196:39 is the way it runs your singular test.
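As a rough sketch, the query dbt stores under target/run for a test has the shape below: your test's select is wrapped in an outer count. The relation name and the inner condition are assumptions for illustration, not copied from the video, and the exact wrapper text can vary by dbt version and adapter.

```sql
select
  count(*) as failures,
  count(*) != 0 as should_warn,
  count(*) != 0 as should_error
from (
  -- the body of the singular test: every row returned counts as a failure
  select *
  from my_catalog.bronze.bronze_sales   -- hypothetical relation
  where gross_amount < 0                -- assumed condition
) dbt_internal_test
```

A test passes when failures comes back as 0.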
196:42 Makes sense? So this is the way to look
196:44 at the queries. Each query is here.
196:46 Nothing is hidden from you. Nothing is
196:48 behind the curtains. No, everything
196:50 is in front of you. You just need to
196:52 find the curtains. Okay. And you know
196:55 how to clean this. So that is also done.
196:58 So far so good. And I'm happy that
197:01 you're learning a lot in the world of
197:02 DBT. Okay. So now let's see what
197:04 Jinja and macros are. So let's talk about
197:09 Jinja, finally. What is Jinja? Jinja is
197:13 basically, you can say, a kind
197:15 of framework.
197:18 Basically, it is a templating framework,
197:20 or basically a template framework,
197:22 a framework which is used
197:24 to build templates. If you are from a
197:27 software engineering background, if you
197:30 have done your BTech, and I would say a lot
197:33 of you would have done BTech, because
197:35 very few people are there in tech who
197:39 have not done BTech. I am one of those, so
197:43 it's fine. So if you have
197:45 done BTech, you would have, you can
197:47 say, seen Jinja in your college days, in
197:50 your university days. What is Jinja? You
197:54 use, or basically any developer uses,
197:58 Jinja mostly with HTML, okay? Whenever
198:02 they want to make dynamic web pages,
198:04 they use Jinja to pass dynamic content
198:07 into their web pages, because this is a
198:09 templating language. But it is not like
198:11 you can only use Jinja with HTML.
198:14 No, Jinja is Jinja. Jinja is a
198:17 templating framework. You can put any
198:19 stuff within that Jinja. That's it. You
198:22 can connect your Jinja with HTML. You
198:25 can connect your Jinja with Python.
198:27 Obviously, you would need to download
198:29 some dependencies to render the template
198:31 for Jinja. And with Flask, it is
198:33 already there. So, yeah,
198:35 whenever you build Flask
198:37 applications, you use Jinja a lot. Okay.
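To make the templating idea concrete, here is a small sketch of Jinja inside a dbt SQL file. The column list and the model name bronze_sales are made up for illustration. Statements such as set and for go inside {% ... %}, values are printed with {{ ... }}, and {# ... #} is a Jinja comment.

```sql
{% set amount_columns = ['gross_amount', 'net_amount'] %}   {# hypothetical columns #}

select
  {% for col in amount_columns %}
  sum({{ col }}) as total_{{ col }}{% if not loop.last %},{% endif %}
  {% endfor %}
from {{ ref('bronze_sales') }}   {# assumed model name #}
```

When dbt compiles this, the loop is unrolled into one sum(...) expression per column, so adding a column to the list adds a line to the generated SQL.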
198:39 So, this is Jinja in short. If you
198:41 have not done BTech, I
198:43 can feel you. Okay. So, Jinja is a
198:45 templating framework. So if
198:48 you have used Python list comprehensions
198:50 to pass dynamic content, I would say this
198:52 is a better way to do it, or I would say a
198:54 more professional way to do it, using
198:56 Jinja. Makes sense? Okay. By
199:00 the way, yeah, just opt for some
199:02 different degrees as well. BTech is not
199:04 everything. And I think nowadays, in the
199:07 world of AI, BTech is not actually
199:09 very relevant, because I don't know
199:11 why big universities are not
199:13 updating their syllabus. I'm not saying
199:15 that all those syllabi are not
199:17 important, but they need to integrate
199:19 some things which are relevant for the
199:21 modern world. But before going to the
199:23 universities, we need to update the
199:25 syllabus of schools. So obviously that
199:28 is the reason why, even after cracking
199:30 the top, you can say the most difficult,
199:33 entrance examination, which is JEE,
199:37 students studying in those biggest
199:39 universities still opt for online
199:43 boot camps. And that is the reality of
199:45 the education system, because they are
199:49 not getting the education which is
199:51 actually relevant, or maybe
199:54 there are not many resources available
199:56 in those big universities. I don't
199:58 know what the gap is, but I know
200:00 there is a gap, which is why students of those biggest
200:03 universities still opt for
200:05 online boot camps.
200:08 I don't know; obviously I didn't go
200:10 to that university, so how can I
200:11 comment? But I know there's a gap. And who
200:14 will fill that gap? Obviously someone.
200:17 Okay. And yeah, there are so many
200:19 gaps.
200:21 I am just trying to fill a very small
200:23 gap, that is, for data engineering. Okay,
200:25 that's it: big data, data engineering,
200:28 cloud, all those things. Okay, Ansh Lamba, if
200:31 you get a chance to go to those
200:32 universities, will you go, like, as a
200:35 speaker or as a mentor? Obviously, I
200:38 would love to, because I would be very
200:41 raw on the stage, and I will
200:42 definitely spit some facts. I don't care
200:45 who is listening, whether the
200:46 principal or whoever is listening.
200:49 Okay. So let's talk
200:53 about those things
200:55 in the upcoming days. Okay. So now let's
200:59 explore Jinja. So first of all, let me
201:01 just take you to the, you can say,
201:04 backbone of the Jinja that we are going to use
201:06 in DBT. I will simply search "DBT for
201:08 loops", let's say, because for loops and if
201:11 conditions are the most used ones,
201:13 and I want to see the Jinja...
201:16 Uh, oh, "DBT for loops". "Jinja for loops".
201:22 Perfect: Template Designer
201:24 Documentation. Okay, let's see this. So
201:26 this is Jinja, and as I just mentioned,
201:28 this is primarily used with dynamic
201:30 HTML content, but Jinja is Jinja. Okay, so
201:34 as you can see, there is a for loop
201:37 written, and this is your Jinja code.
201:40 Whatever is written inside it, it doesn't
201:43 care. It can be HTML, it can be Python, it
201:45 can be JavaScript,
201:47 it can be anything, it can be any code.
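For reference, a for loop wrapped around HTML looks roughly like this in plain Jinja. This is only a sketch: `users` and `username` are placeholder names assumed for illustration, and the HTML around the tags is just text as far as Jinja is concerned.

```jinja
{# Jinja only evaluates the tags; the <ul>/<li> markup is passed through as plain text #}
<ul>
{% for user in users %}
  <li>{{ user.username }}</li>
{% endfor %}
</ul>
```

Swap the HTML for SQL and the same loop works unchanged, which is the idea DBT builds on.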
201:49 It will simply put this thing, okay, and
201:53 it doesn't need to worry about how you
201:55 will be rendering it, because
201:56 rendering will be done by your
202:00 interpreter.
202:01 Make sense? You need to import libraries
202:04 that can read this Jinja. We have
202:06 multiple libraries in Python for
202:08 rendering Jinja, and the same thing, I
202:10 think, DBT is also leveraging with SQL
202:12 Jinja. So that is fine. Okay, so this is
202:14 a kind of for loop. There are
202:16 so many things that we are going to
202:18 see in this particular example.
202:21 Okay. So now let me just take you
202:23 to Jinja with respect to DBT. So
202:25 this is the official documentation for
202:26 Jinja and macros in DBT. Okay. So this is
202:29 the overview. You can combine SQL with
202:31 Jinja, a templating language. As I just
202:33 mentioned, using Jinja turns your DBT
202:35 project into a programming
202:37 environment for SQL. That means you will
202:40 feel that you do not need to write those
202:42 static SQL commands. No, you can
202:44 actually perform proper programming: for
202:46 loops, if conditions, case statements,
202:49 all those things with respect to SQL. And
202:52 obviously in SQL we know that we do not
202:53 have these capabilities. We do not have
202:55 capabilities to run for loops and all
202:57 those things. But with Jinja, we can do
202:59 this, and this is a game-changing thing.
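As a rough preview of what "programming for SQL" means, here is a minimal sketch of a loop that generates one column per list item. The table `payments` and the columns `order_id`, `payment_method` and `amount` are hypothetical names assumed for illustration; they are not from this project.

```jinja
{# Hypothetical list of values; each one becomes its own aggregated column #}
{% set payment_methods = ["bank_transfer", "credit_card", "gift_card"] %}

select
    order_id,
    {% for method in payment_methods %}
    sum(case when payment_method = '{{ method }}' then amount end) as {{ method }}_amount,
    {% endfor %}
    sum(amount) as total_amount
from payments
group by order_id
```

When compiled, the loop expands into three `sum(case ...)` lines, so adding a fourth payment method only means adding one item to the list.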
203:02 Let me be very honest with you, it's an
203:05 amazing thing. Okay, so "Getting started"
203:07 and blah blah blah. So first of all let
203:10 me just take you to... So now let's talk
203:13 about Jinja in the real world. We just
203:16 got the, you can say, high-level
203:17 understanding. Now it's time to
203:18 actually explore it in our DBT project.
203:20 Okay. By the way, Jinja is amazing,
203:23 amazing, amazing. I don't know
203:24 why. Okay, I don't know, I don't
203:27 know. So let's talk about Jinja. So let
203:31 me just create, first of all, a simple
203:34 Jinja in the analyses folder, because
203:36 obviously we do not want to save that
203:37 particular Jinja, but I want to just
203:39 show you how it works. So let's create a
203:42 file within this. I will say jinja
203:46 1, or let's say one jinja, or jinja
203:49 one. Let's say jinja 1.sql.
203:52 Okay, makes sense. So this is our
203:54 jinja.sql,
203:56 jinja 1.sql basically. So we are going
203:58 to create a Jinja here. So how can we
204:00 create it? Let's get started with
204:02 that. So first of all, treat Jinja, or
204:06 basically treat Jinja SQL, or basically
204:09 you can say treat Jinja, as adding the
204:12 programming capability on top of your
204:14 SQL that is missing, right? That is
204:16 missing, and as we all know, SQL is
204:18 very, very powerful but it lacks
204:20 programming
204:22 capabilities and all. So let's say I
204:24 want... In the programming world we start
204:26 with variables, right? So how can we
204:28 create variables within Jinja? It is very
204:30 simple. You will start with mustache
204:32 syntax, basically double... oh man, I need to
204:35 snooze this, bro. Okay, okay, so
204:39 now it's fine. So whenever we use
204:41 double curly braces, it is called
204:43 mustache syntax. Okay, so if I want to
204:46 create a variable, I will simply say set
204:49 variable, let's say variable one, or let's
204:51 say variable name, whatever. Okay, set
204:54 variable name equals to, let's say I want
204:56 to say "Ansh Lamba". Make sense? So you can
205:00 set your variable, basically create
205:02 your variable. Now obviously, whenever
205:04 you work in the programming world, you
205:05 set so many things, right, in
205:08 the, you can say, variables. So now
205:11 there's one change that you need to
205:13 make. What's that? Whenever you use
205:16 double quotes, not double
205:18 quotes, basically double curly braces,
205:20 it is fine. But whenever you are
205:22 defining something, it is the rule of
205:24 thumb that you have to use a percentage sign.
205:28 Okay? So just replace the inner curly braces with the
205:30 percentage sign and that's it. Make
205:33 sense? Let me just save it. And if you
205:35 want to see the result of it,
205:37 obviously you will not see anything. So
205:39 now let's print this variable. And now
205:42 you will be using, so remember, whenever
205:44 you want to use any variable, you use it
205:48 within the double curly braces. I will
205:50 simply say variable name. Make sense? Okay, so
205:53 this will print this. Whenever you want
205:56 to
205:57 develop anything using a for loop, if
206:00 conditions, or set variable, you have to use
206:03 curly braces with a percentage sign. Okay,
206:06 so I cannot run this, because this will
206:08 not run anything, but yes, I can compile
206:09 it. So I will simply click on this
206:11 compile button and I will see Ansh Lamba.
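What was just typed can be sketched roughly as below. The identifier `variable_name` and the value are assumptions based on what is spoken; the file on screen may use a slightly different name.

```jinja
{% set variable_name = "Ansh Lamba" %}

{{ variable_name }}
```

`{% ... %}` holds a statement (here `set`) and prints nothing itself, while `{{ ... }}` prints the value of an expression. Compiling this should give two empty lines followed by the name on the third line, because the line breaks after the `set` tag are kept.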
206:14 Make sense? So now you will observe one
206:16 thing. This is a very small thing and it
206:18 is optional, but it makes your code more
206:20 readable. You are seeing two empty lines
206:23 there. Why? Do you know why? Because you
206:25 have two empty lines here as well. If
206:27 you want to get rid of this thing, you
206:29 can simply add a minus sign after this
206:32 percentage and before this percentage,
206:33 like this. Now if you just say compile,
206:38 it will... obviously you do not
206:40 have this variable name here. If you
206:42 just, let's say, paste variable name here
206:45 and if you run this now, you will see
206:47 that it will paste it. But wait, I think
206:51 it is not refreshed.
206:53 So this is variable name. Okay, makes
206:56 sense. And let me just go here. Ctrl S.
207:01 Save.
207:03 Are you kidding me, bro? Are you kidding
207:04 me?
207:06 So, let me just add a minus here as well.
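A minimal sketch of the minus-sign trick, using the same assumed `variable_name` as before. A `-` right after `{%` strips the whitespace before the tag, and a `-` right before `%}` strips the whitespace after it, including line breaks.

```jinja
{#- The minus signs strip the surrounding whitespace, so no empty lines are left behind -#}
{%- set variable_name = "Ansh Lamba" -%}

{{ variable_name }}
```

With the trailing `-%}` in place, the blank line after the `set` tag is removed and the name should come out on the first line of the compiled output.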
207:09 By the way, it is not required. But
207:10 yeah,
207:16 Ctrl S. Oh, I told you it is a refresh issue. So, if I just
207:19 click here now, it should return it
207:21 in the first line. That doesn't make any
207:23 sense if it returns in the third line.
207:24 So, just refresh it. Sometimes it takes
207:25 time. So, this was a very basic use
207:27 case. Let me just level up. Okay. jinja
207:31 2.sql.
207:33 Perfect. So now let's explore how we can
207:36 run a loop. Because whenever we
207:38 think about a programming interface,
207:39 programming language, programming
207:40 capabilities, we always think about for
207:43 loops, because we work with big data. We
207:45 work with so many entities in one, you
207:47 can say, file, in one particular notebook.
207:49 So let's talk about the for loop. How can we
207:51 run loops within Jinja? So let's
207:53 say I have a list. It can be any list.
207:55 Let's say, uh, apples. Okay, apples. So,
208:00 let's store all the apples. Um, let's
208:03 put, uh, I don't like Fuji. I like Gala.
208:08 Then we have, I think, Red Delicious.
208:13 Yep, Red Delicious. And then I think we
208:17 have one more, which was
208:21 Honeydew, I guess. Fuji is nice,
208:24 sometimes it is fine, sometimes it is
208:26 not. And then we have Gala.
208:31 Oh yeah, we have McIntosh as well. I
208:34 don't like McIntosh, it is
208:37 very, very sour, but some people like it.
208:39 So that's it. Or should we add more
208:41 apples? More types of apples.
208:44 Uh, let's say Honeycrisp. Honeycrisp is
208:46 the most expensive one, by the way.
208:48 "Okay, Ansh Lamba, how do you know this many
208:51 types of apples? How?" So basically I
208:53 used to work in a store and I used to
208:57 put all the types of apples on the
208:59 shelves. So I know all these
209:00 things. So I used to open big boxes
209:02 of apples and I used to put
209:04 Gala on this shelf, um, Red Delicious on
209:07 this shelf, Fuji on this. Fuji was
209:09 very, very seasonal. So I used
209:12 to rarely see Fuji apples. McIntosh was
209:15 like, I don't like McIntosh. Honeycrisp
209:18 was the most expensive one. So I
209:21 know all these things. Yeah, your boy
209:23 has worked in stores and I used to
209:26 put all those types of apples on
209:28 the shelves. So I know a lot about these
209:30 kinds of things, like fruits, apples and
209:32 different types of things.
209:34 Okay. So yeah, it is what it is. So now
209:38 let's say I want to print all the
209:40 types of apples, but
209:42 dynamically. Okay. So how can we do
209:43 that? So I will simply say percentage.
209:46 Okay. for, the same way, the same syntax you
209:49 use in Python, same is here: for i in
209:53 apples. Okay, now you do not need to
209:56 write apples like this, with its own curly braces. Why?
209:59 Because you are already inside this
210:01 curly brace. So you do not need to
210:03 write it again and again. Make sense?
210:05 Because this is not a kind of
210:07 variable that you're printing here. for i in
210:10 apples, and I want to show i. i is just
210:12 the ith value of apples. Then I will
210:14 simply write endloop,
210:18 or basically not endloop, endfor, I
210:21 guess. endloop, endfor, I can just
210:23 check, no worries. So this will give me
210:26 the values dynamically. Let me just
210:28 compile it.
210:30 Okay. So I can see apples, I can see i.
210:33 Okay. Oh, makes sense. So here I can
210:36 simply say, what is i? i is just, you can
210:41 say, i. It's just a string, because
210:46 whatever you write here like this,
210:48 all these things will be treated as a
210:50 string. If you want to actually use a
210:53 variable within a particular loop, you
210:56 have to write something like this, in double curly braces. Let's
210:58 say this, endfor.
211:01 Uh, compile. Okay. So now I can say that
211:05 something is...
211:08 I think it is not endfor,
211:10 or basically it is not endloop. Let me
211:12 just confirm. I think it was endfor or
211:16 endloop. Let me just check, because I
211:18 know if I go to the search engine I
211:19 will never find that. So
211:24 let me just check.
211:27 Uh, I saved it somewhere.
211:30 Um, it's endfor. Okay, it is fine. Oh, okay.
211:35 endfor. Okay. Okay. Makes sense.
211:37 Uh, okay. So basically the thing is,
211:42 obviously, in order to run this
211:45 we need to apply some SQL commands
211:47 or anything, because this will not show
211:49 you anything, because once the for loop
211:51 is over, how will you see that, right?
211:53 And for that, let's actually create a
211:56 real list inside the Jinja and let's
211:58 actually run, you can say, a select
212:00 statement, a kind of select statement. Or
212:01 I do not want to directly show you
212:03 the select statement, because
212:04 for some people it can
212:06 be confusing. So let me first of all
212:08 show you without the select statement.
212:10 Percentage, um, percentage. Okay, here,
212:15 okay, set. Yep. Oops.
212:19 Perfect. So now let me just save it and
212:21 let me just show you. Okay, perfect. Now
212:23 we have Gala, Red Delicious, Fuji,
212:25 McIntosh, Honeycrisp. Okay. And if you
212:27 want to remove the extra space, you
212:29 already know how you can remove
212:30 that. It is
212:33 here. You can add a minus sign here,
212:37 a minus sign here. Uh, that's it.
212:41 And just refresh it. It takes some time.
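Putting the pieces together, the loop file can be sketched roughly as follows. The list values are taken from what is read out after compiling; the exact placement of the minus signs on screen is an assumption.

```jinja
{#- Define the list with set, then loop over it; a bare i would be plain text, {{ i }} prints the value -#}
{%- set apples = ["Gala", "Red Delicious", "Fuji", "McIntosh", "Honeycrisp"] -%}

{% for i in apples -%}
{{ i }}
{% endfor %}
```

Two details are easy to trip over: the loop is closed with `endfor` (there is no `endloop`), and inside `{% ... %}` the list is referenced as plain `apples`, without its own curly braces.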
212:43 So yeah, that's how you can
212:47 run a loop within this. Okay. Now, the
212:50 second, or basically third, thing
212:53 which is very much used whenever we
212:55 work with, you can say, programming
212:57 languages, whenever we talk about
212:59 programming capabilities, we always talk
213:01 about something called if conditions.
213:04 Right? So let's say I want to run a loop
213:07 over this apples list, but I want to
213:09 only show it if the apple name is not
213:12 McIntosh, because I do not like McIntosh.
213:15 Okay. So how can we do that? It's
213:16 very simple. Let me just make some room.
213:20 Okay. So I will simply add one more if
213:22 condition after this. So I will simply
213:23 say, and just for the indentation I am
213:26 adding the if condition at this level so
213:29 that you can understand it. Okay. So it
213:32 will be like this: if, and I can say i
213:37 equals equals, uh, let's say not equals,
213:40 basically not equals McIntosh or
213:44 anything.
213:46 McIntosh. If that is not equal, right,
213:50 then I want to print i. And this
213:54 autocompletion is like,
213:57 bro. Okay, if this is not equal to
214:00 McIntosh, only then print i, otherwise
214:04 print nothing, because it's just an if
214:06 condition. But if you want to add an elif,
214:08 you can also add that. Okay, so let's
214:11 say, let's add an else condition just to
214:13 make the code, you can say, more readable
214:15 or nicer. It's not recommended
214:17 to add else every time, because in the
214:19 programming world we do not write else
214:20 every time, but just to show you, you can
214:22 write else like this. Okay, and then in else
214:27 you can say i. So this time what it will
214:29 do, it will write "I hate"
214:33 whatever is McIntosh. Okay. And this
214:37 way you can combine your string and your
214:40 variable. So if you observe one thing, I
214:43 have used a string with this variable. So
214:46 it is equivalent to using an f-string in
214:49 Python. Similar to this, let's say f and
214:51 then this, "I hate" and then a variable. If
214:55 in Python you use a single curly
214:56 brace, in Jinja you use a
214:59 double curly brace. Not a big deal.
215:01 This simple. It is equivalent to this.
215:04 Makes sense? So let's run it now and
215:05 let's see what we get.
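A rough sketch of the loop with the if/else added. The list and the "I hate" text follow what is said here; the indentation and exact spelling on screen are assumptions.

```jinja
{# Same assumed list as before; the if/else decides what each iteration prints #}
{% set apples = ["Gala", "Red Delicious", "Fuji", "McIntosh", "Honeycrisp"] %}

{% for i in apples %}
    {% if i != "McIntosh" %}
        {{ i }}
    {% else %}
        I hate {{ i }}
    {% endif %}
{% endfor %}
```

`I hate {{ i }}` plays the role of Python's `f"I hate {i}"`: the text outside the braces is emitted as-is and the expression inside is substituted. Minus signs can be added to the tags, as shown earlier, to trim the extra blank lines from the compiled result.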
215:08 Uh, okay, so we got Gala, Red Delicious,
215:10 Fuji, and for the McIntosh we got "I hate
215:13 McIntosh". Okay, makes sense? So that's how
215:15 you can run loops, and you can now
215:19 actually build your dynamic SQL
215:21 statements. Now let's take a very real-
215:23 world example, and this is an end-to-end
215:25 example which will clear all of your
215:27 Jinja concepts, basically. And this will
215:30 set a very strong base for your future.
215:33 Whenever you want to use Jinja with
215:35 DBT, you can easily use that, because
215:37 this is one of those things which are
215:39 used almost every single day. Okay. So
215:42 let's create one more file in the
215:44 analyses folder,
215:45 and I will call it jinja
215:52 3.sql. Perfect. So within this, let's say we
215:55 want to incrementally load the data.
215:59 Okay, incrementally load the data. Okay,
216:02 makes sense. And we want to
216:06 incrementally load the data from the
216:08 bronze model. Let's say from this
216:10 particular model called bronze_sales.
216:13 Okay, let me just open this model and
216:14 let me show you what we have in
216:16 this particular model, because I
216:18 want to see whatever columns we have.
216:21 Okay, let's wait. Let's wait.
216:26 And you should know the concept of
216:28 incremental loading. If not, I will
216:29 give you a very quick glimpse of it.
216:32 It's not rocket science. Okay, I
216:35 sometimes... let me share, while in
216:38 the meantime it is running. So every... oh
216:41 man, it was a very interesting topic. Do
216:43 you want to hear that? Let me just tell
216:45 it. So we do not have a date
216:49 here, but we have a date key. If you see,
216:51 date_sk, and how many date_sk values we have.
216:55 Okay. So for example, we
216:58 will be using this date_sk column for
217:00 our incremental loading. Okay. So I was
217:02 our incremental loading. Okay. So I was just saying if you just want to hear
217:04 just saying if you just want to hear otherwise you can just skip one to two
217:05 minutes. So I was just saying, see,
217:07 whenever we want to explain
217:08 something to anyone, let's say we want
217:11 to explain something new to anyone,
217:12 okay, and if it is challenging, or let's
217:15 say even if it is not challenging, we
217:16 always say, "Hey, it is not rocket
217:19 science. It is very easy. It is not
217:21 rocket science. It is very easy. It is
217:23 not rocket science. It is doable." I
217:25 sometimes think about those people who
217:27 are actually studying rocket science,
217:29 because this has been fed into their
217:31 minds. Even if rocket
217:33 science is easy (I know it is not easy),
217:35 just think about their mentality.
217:37 They know that it is difficult. They
217:40 know this is the example that is
217:41 given to every single person: "Hey, this
217:44 is doable. This is not rocket science."
217:45 Just think about those who are
217:47 studying rocket science. Just think
217:48 about the mentality that you have
217:50 developed for them. So cruel, man. So now
217:54 let's get back to our Jinja. So let's
217:56 say we want to incrementally load
217:58 the data from bronze_sales. Okay. So
218:01 first of all, in the real world, in the
218:03 data engineering world, we do not create
218:05 two different processing units for
218:09 incremental data loading, one for the initial
218:11 run and one for the incremental run. If you know
218:15 about the fundamentals of data
218:16 engineering, I expect that you know
218:17 something. Okay. So we do not create two
218:20 different units, or basically two
218:22 different processing notebooks and
218:25 all. We want to process our initial load
218:28 and incremental load both in a single
218:30 notebook. Okay. So we create a kind of
218:32 flag. Let's say init flag. Okay. Init
218:36 flag equals zero or one. And how
218:38 can we create variables in Jinja?
218:41 We know that. Oh man. Let me just snooze
218:44 it.
218:46 Yeah. So now let me show you. You can
218:50 simply say set init load, or basically
218:53 you can say incremental load or incremental
218:57 flag. Okay. By default this value is one
219:02 because initially you will be
219:04 loading the whole data, right? So this
219:05 value is one. Perfect. And we also have
219:08 one value which is called last load date.
219:12 Okay, last load. Now, we do not have a
219:15 date, so we have a date key. So let's say
219:17 the last load was with ID 3. Make
219:21 sense? Very good. Now I'll be writing this
219:23 query: select * from, and we know
219:29 what we want to use. We want to use the ref
219:31 function, right?
219:33 ref, bronze_sales.
219:36 bronze_sales. Perfect. But here is the
219:39 thing. Whether it is an incremental load or
219:44 an initial load, you need to
219:47 complete your query accordingly. So now
219:50 here we'll be using an if condition. We'll
219:52 be saying
219:54 if,
219:56 okay, because now is the role of Jinja.
219:58 So here I will say if
220:03 incremental flag
220:06 == 1. If it is one, that
220:09 means this is an incremental
220:11 load. Okay, let's say zero. Okay. If it
220:14 is == 0, that means
220:16 you do not want to perform an incremental
220:17 load. This is just an initial load, so
220:20 you will do actually nothing. Actually
220:24 nothing. Make sense? You can simply
220:25 write 1 == 1; it will do
220:28 nothing, right? Okay, this is one way of
220:31 doing it. The more professional way is to
220:33 treat it with the incremental load
220:35 directly. So this is the query that will
220:37 be running if the incremental flag equals
220:41 zero, that means full load. But if it is
220:43 an incremental load, then we will be
220:46 processing like this: where... what was
220:48 the key column? I forgot. Let me just run
220:51 it. I think it was date_sk, if I'm not
220:54 wrong.
220:56 date_sk. Yeah, date_sk. Okay. So then
221:00 we'll be saying
221:03 date... let me just close it.
221:06 Yeah. So we'll be saying date_sk
221:15 greater than... greater than which variable? last_load.
221:18 Okay. Makes sense. And that's it. Then
221:21 we'll be saying endif.
221:29 endif. Perfect. So this is the way you can do this thing. Let me click on
221:31 compile and let's see what query it has
221:33 returned. So it has created this query:
221:35 select * from this table where...
221:36 Oh, I forgot to add where. Let me add
221:39 where. Yep.
221:43 Here, let's add where. Perfect. So now it
221:47 is saying
221:48 where... Hey. Oh man. Just refresh it.
221:51 Just refresh. It takes some time. So you
221:53 can imagine it will be where here,
221:55 obviously, because we have added it here.
221:57 So it will be where date_sk greater than
222:00 three, because it is an incremental load.
222:02 How do we know? Because we have set it
222:03 to one. Make sense? So this
222:07 is the way that we can actually do it.
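The flag-and-filter model described above can be sketched as follows. This is a minimal sketch rather than the exact file from the video: the identifiers `incremental_flag`, `last_load`, `bronze_sales` and `date_sk` are assumed spellings of the names spoken in the transcript.

```sql
{# Assumed variable names; 1 means "incremental run", 0 means "full load" #}
{% set incremental_flag = 1 %}
{# Highest date key already loaded (the transcript uses 3 as its example) #}
{% set last_load = 3 %}

select *
from {{ ref('bronze_sales') }}
{% if incremental_flag == 1 %}
{# Rendered only for incremental runs; a full load gets no filter at all #}
where date_sk > {{ last_load }}
{% endif %}
```

With the flag set to 1 the compiled query should end in `where date_sk > 3`; with the flag set to 0 the `where` line is left out and the whole table is selected.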
222:09 If you want to make it even more
222:12 professional, and it will be a
222:14 challenging one, but Ansh Lamba is here,
222:16 your boy, your brother is here. Let me
222:20 just tell you. So let's say from this
222:22 particular model you do not want to pull
222:24 all the columns. You do not want to write
222:25 select *; you want to pull only,
222:27 let's say, specific columns, right? So
222:30 let's create the list of specific
222:31 columns that we want to pull. Make sense?
222:34 I will say set
222:37 columns list equals, and let's define the
222:40 list of columns. Let's say I want to pull
222:41 date_sk,
222:43 I want to pull sales_id,
222:48 okay, makes sense, I want to pull sales_id,
222:51 and I also want to pull, let's say, order
222:53 amount,
222:55 and so many things, right? Let's say we
222:57 want to pull these three columns. Make
222:58 sense? So what I will do here: I will not
223:00 write a *. I will run a loop to
223:03 dynamically call the columns. How? I will
223:05 say for i, like this:
223:11 for i in columns list. Okay. And within
223:15 that I will run basically a loop on top
223:19 of this. So I will simply say I want to
223:22 return i, and I want to return a comma.
223:25 Simple. And let's actually run this.
223:27 There's an issue with this. I will
223:28 let you know what it is. And oh man,
223:31 this is still not refreshed. What is
223:34 this?
223:36 Oh man. Let me just refresh the window.
223:40 Command palette. Refresh. Reload window.
223:42 Okay. Let's see.
223:51 Let me run this. So, it is taking some time. Let's wait
223:55 for it to load.
223:58 Oh, man. It will open, I think, multiple
224:01 screens now. So, let's wait.
224:03 It is activating extensions. Oh, yeah.
224:05 See, I told you. Let me close
224:09 all of these.
224:13 Okay. Perfect.
224:16 Okay, perfect. And now here we
224:19 want to say endfor, obviously. Perfect.
224:24 Now let's see
224:27 the compiled version. And again I need to
224:31 reload it. See, these are
224:34 some of the drawbacks of using free
224:38 versions, because we have to
224:40 use extensions, and extensions are
224:42 not always very
224:45 
224:47 helpful. Reload window. Sometimes it
224:49 becomes really, really slow; sometimes it
224:51 is fine. So now it is activating
224:54 extensions. Let's wait.
224:56 Okay.
224:59 Let's wait.
225:02 Okay, parsing dbt. Okay, perfect. Let
225:05 me just open this. Perfect. Now this is
225:08 my query, which is dynamically created.
225:11 There is just one issue. "Ansh, there's no
225:13 issue. Just run it." Shut up. There is
225:16 one issue. Let me just tell you. So
225:18 basically, if you see, after our last column
225:21 should we enter a comma? No. That is
225:25 against SQL syntax. Exactly. So here
225:27 we will be using a built-in
225:30 variable of the Jinja for loop. It is
225:33 called loop.last. Let
225:36 me just check
225:38 loop.last
225:43 Jinja. It is not a dbt function. It is a Jinja feature. Okay. loop.last.
225:48 Okay.
225:49 Yeah. Here. Where is that? Where
225:51 is that?
225:54 I know I can simply use it, and you
225:56 can too, because these search engines
225:58 are becoming so, so good
226:00 nowadays. I know that, let's say...
226:08 Yeah, this is still better. Yeah, it is saying loop.last, I told you. So we need
226:10 to say: only add a comma if it is not
226:15 the last item of the loop. loop.last
226:17 is nothing but a
226:20 built-in variable that will be true
226:22 or false on the basis of the position of
226:25 the iterated value. If that is the
226:28 last element, it will be true, otherwise
226:30 false. So I'll simply say this comma... let
226:33 me first of all remove this comma, and I
226:34 will now add... See, it is so smart. It
226:37 knows what I want to type. So I'll
226:38 simply say if not loop.last, then only
226:41 add this comma. Which comma? You can
226:44 see it here. Otherwise do not add a
226:46 comma. Okay, let me now run this.
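The column loop with the trailing-comma guard can be sketched like this. The list name `cols_list` and the column spellings are assumptions based on the spoken names, and the third column is written as `gross_amount`, the name the video settles on a little later once "order amount" fails to resolve.

```sql
{# Assumed list name and column spellings #}
{% set cols_list = ['sales_id', 'date_sk', 'gross_amount'] %}

select
{% for col in cols_list %}
    {# loop.last is true only on the final iteration, so the last column gets no comma #}
    {{ col }}{% if not loop.last %},{% endif %}
{% endfor %}
from {{ ref('bronze_sales') }}
```

`loop.last` is one of the special variables Jinja exposes inside a `for` block (alongside `loop.first` and `loop.index`), which is why no helper function has to be defined for it.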
226:50 Ansh Lamba,
226:52 you have to refresh.
226:54 You have to refresh your window.
226:57 Okay, let's wait. It should work. And
226:59 this is, you can say, your base of Jinja.
227:01 Okay, once you know all these things
227:04 that I have just discussed, I
227:06 will tell you: go through
227:09 this section of the video
227:10 multiple times, because it is really new
227:12 to many people and, let me be very
227:14 honest, it was new to me as well. Well,
227:16 not really new. I already knew Jinja
227:20 before jumping onto dbt. But yeah, when
227:23 you use it with SQL statements, it is
227:25 really new, because you would have never
227:26 used these kinds of functions. Even if
227:28 you are from a software engineering
227:29 background, you would not have used
227:31 these particular functions in your SQL, in
227:33 your data engineering stuff,
227:35 right? So I would recommend you to
227:37 practice and play with it. Now we
227:39 do not have a comma, and that's what we
227:41 want. Okay. So now if I run this, what
227:44 will happen? Because this is just the
227:45 compiled version, right? Let's see the
227:47 result, if we have any errors while
227:49 running it. I do not expect anything,
227:52 but yeah. Yes: "A column, variable, or
227:55 function parameter with name order_amount cannot
227:56 be resolved." Yeah. So this is not a
227:58 syntactical issue. It's just the
228:00 wrong column name. So we can simply say
228:02 gross_amount, simple,
228:04 instead of order_amount.
228:07 So nothing is wrong on the
228:09 Jinja and dbt side. So let's see.
228:14 Let's see.
228:22 So perfect, we have got the dynamic value, okay, for our dbt
228:26 Jinja function. Very, very good. So
228:28 that's all about Jinja. And now let's
228:30 talk about macros in Jinja. What are
228:33 macros? Basically, macros are equivalent
228:36 to functions in Python or in any
228:39 programming language. Okay. So what's so
228:43 special in that? Nothing special. You
228:45 can just reuse your code. Okay,
228:47 what's so
228:48 special in functions? Obviously, we
228:50 create functions so that we can reuse
228:52 our logic. Similarly, we can create
228:54 functions in Jinja so that we can
228:56 reuse the logic. Okay? And it can be any
228:59 macro. Okay? Let me first of all show
229:01 you how you can create a macro. So,
229:03 let's create macro one. And don't worry,
229:06 the real macro will be there in the
229:08 macros directory. But just to show you... or
229:11 let's directly create it there. Okay,
229:13 let's directly create it there. So let's
229:16 go to macros and let's create our first
229:19 macro. We already have one macro, by the
229:21 way, generate schema, if you remember, and
229:23 we didn't create it. We just copy-pasted
229:25 it because it is the recommended way to
229:27 change the schema name. Okay. So this is
229:29 also a kind of macro, and this is the macro
229:31 that always gets run whenever we
229:33 want to populate anything. Okay.
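For reference, the schema macro mentioned here is dbt's `generate_schema_name`, which dbt calls for every model to decide where it gets built. The pasted file is not shown in this section, so the version below is an assumption: it follows a commonly documented override that uses the custom schema name on its own instead of appending it to the target schema.

```sql
{# Assumed contents; the video's copy may differ #}
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {# No custom schema configured: fall back to the target schema #}
        {{ default_schema }}
    {%- else -%}
        {# Custom schema configured: use it as-is, without a prefix #}
        {{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
```

dbt's built-in default differs only in the `else` branch, where it renders the target schema, an underscore, and then the custom name.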
229:37 Sorted. So let's create our first
229:40 macro.
229:42 So the first macro will be very simple.
229:46 Let's say first macro, and let's give it
229:48 a very good name. Okay. So let's call it,
229:51 because I'm going to use it in my
229:52 silver layer, I'll simply say
229:55 multiply.
229:57 Okay. Multiply.
230:00 Yeah. multiply.sql. Let's say
230:03 you want to create a macro for
230:04 multiplication, because in the fact table
230:06 you create so many
230:08 aggregations based on some calculations.
230:10 So you do not want to... basically,
230:12 obviously you want to reuse the code, but
230:13 you do not want to rewrite the code
230:15 again and again. Okay. So let's try to
230:17 write the code. So
230:21 that slang, I love that slang.
230:25 So now what we need to do, we will simply
230:29 say... I'm really bad at that slang. So let
230:32 me first of all
230:35 snooze this.
230:37 Okay. So now what you need to do: you
230:39 will simply write macro. Okay,
230:43 perfect. And now you'll be writing the macro
230:45 name, similar to when you write def and
230:48 the function name, right? Same. Let's say I
230:50 want to create this multiply.
230:53 I will recommend you to keep the macro
230:55 name and file name the same for better code
230:57 management, because when you're learning
231:00 it doesn't matter, but when you develop
231:01 things it actually matters a lot. So
231:04 now, whenever we create a function
231:05 we pass some parameters, because we
231:07 expect some values. I will say column
231:09 one,
231:10 column two. Make sense? Okay. Column one,
231:14 column two, simple. So these are our
231:16 parameters. And what do
231:18 we need to return? I want to return...
231:21 obviously this is a variable. So I
231:24 will simply say column one. Otherwise,
231:26 remember, if you do not encapsulate this
231:28 in double curly braces, it will treat it
231:30 as a string. I will say column one
231:33 multiplied by
231:35 column two. Make sense? And let's say
231:37 endmacro.
231:39 endmacro.
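The macro dictated above would look roughly like this. The file path and the parameter spellings `col_1` and `col_2` are assumptions; the transcript only says "column one" and "column two".

```sql
{# macros/multiply.sql (assumed path, file named after the macro) #}
{% macro multiply(col_1, col_2) %}
    {# The double curly braces render the argument values; without them the text col_1 * col_2 would be emitted literally #}
    {{ col_1 }} * {{ col_2 }}
{% endmacro %}
```

A macro does not return a value the way a Python function does. Whatever text its body renders is pasted into the SQL at the point where the macro is called.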
231:49 endmacro. Perfect. So this is our macro. Make sense? Okay, so this is
231:52 our macro, and how can we run it?
231:53 Obviously we need something to call it,
231:56 maybe a kind of select statement or
231:57 anything. Let's go to analysis to
231:59 test it. Let's go to
232:03 Jinja 3, not Jinja 3. Let's create a new
232:05 file, and I'll simply say query macro,
232:10 query_macro.sql. Now you will be feeling
232:12 how important this folder is, because you
232:13 want to test so many things. I will
232:15 simply say select,
232:18 let's say. Now, how can I use the
232:20 macro? I just need to
232:22 use the macro name. It's multiply.
232:27 Okay. And I think we also need to create
232:29 the macro before using it, because
232:31 there's a way we can use it. So
232:34 when we create macros, we may also
232:35 need to run something like, let's
232:37 say, dbt run or something so that
232:39 it will be populated. But I'm not sure;
232:41 maybe it is automatically
232:42 populated. Let me just go and test:
232:45 multiply, let's say, 10 and 50. Okay. As
232:50 test column,
232:52 simple. I want to run this. So what
232:54 will happen now?
232:56 So this is my compiled code. Makes
232:59 sense. If I run this, what will
233:01 happen?
233:06 I should see 500 as the test column. No: "Cannot resolve routine multiply on search
233:08 path." Multiply on search path. I
233:11 think we need to first of all create
233:12 this macro. This is what I think, because
233:16 obviously in order to
233:19 create this macro... hm, let me just go to the
233:21 documentation, because this is also like a
233:25 macro.
233:27 Macros and Jinja in dbt.
233:36 Jinja and macros. Okay. Macros. Okay.
233:41 Okay. We put it in the macros folder. Okay.
233:46 So our logic is also ready. "A model which
233:48 uses a macro might look like this." Okay.
233:50 "This would compile to" this one. Okay.
233:55 Amount, this, this. "Using a macro from a
233:59 package."
234:01 We are not talking about packages right
234:02 now. I know that. So that is fine.
234:09 Jinja naming convention... from app_data.payments.
234:12 Okay, select cents_to_dollars amount as
234:15 amount_usd. Okay, so there's a
234:18 function they have created to convert
234:20 cents into dollars, but we are
234:22 not doing that. So
234:25 macro, build macro... maybe we need to first
234:28 of all build it, because select multiply...
234:31 and basically we forgot to
234:34 add this. So if you ever want to use a
234:36 macro, we have to treat it like
234:38 this, and yeah, that's it.
234:43 That's it. So
234:46 let's try to run it now and see
234:48 what we get.
234:50 Okay. So whenever you want to use
234:52 this particular thing... compilation error in
234:53 analysis query_macro. What is this? "expected token
234:57 'end of print statement', got 'as'", line two.
235:02 View detailed error.
235:04 Okay. Compilation error: multiply
235:08 10, 50 as test column. Okay.
235:13 Okay, let's try to use it on a
235:15 particular table just to understand it
235:17 better. Compilation error in analysis
235:20 query_macro. Okay: expected token 'end of
235:22 print statement', got 'as'. Hm. Okay, makes
235:25 sense. That's a very basic mistake. We
235:29 just need to take it out. And that's it.
235:34 And that's it. Perfect.
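The two errors above correspond to two ways of getting the call wrong. Writing `select multiply(10, 50) as test_col` with no braces sends `multiply` to the warehouse as if it were a SQL function, which is what the "cannot resolve routine" message reports. Putting the alias inside the braces makes Jinja stop at `as`, which is the "end of print statement" compilation error. A minimal sketch of the working form follows; the file path and the alias spelling `test_col` are assumptions.

```sql
{# analyses/query_macro.sql (assumed path) #}
{# Only the macro call goes inside the double curly braces; the alias stays outside as plain SQL #}
select {{ multiply(10, 50) }} as test_col
```

This should compile to a `select 10 * 50 as test_col` statement, which returns the 500 the transcript expects. To use the macro on real columns, the names would be passed as quoted strings, for example `{{ multiply('quantity', 'unit_price') }}`, where both column names are hypothetical.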
235:38 And that's it. Perfect. Perfect. Perfect. Perfect. So, let's run
235:40 Perfect. Perfect. Perfect. So, let's run it. Obviously, obviously, obviously some
235:43 it. Obviously, obviously, obviously some things can happen. Okay. When the person
235:46 things can happen. Okay. When the person is tired,
235:48 is tired, not really.
235:51 not really. And it should work now. I should see
235:53 And it should work now. I should see 500. So, just add this these two curly
235:55 500. So, just add this these two curly braces. If not, you can just Oh, yeah.
235:57 braces. If not, you can just Oh, yeah. Perfect. If not, you can just try
235:58 Perfect. If not, you can just try running it as well. It's not a big deal.
236:00 running it as well. It's not a big deal. Sometime it works, sometime it does not.
236:03 Sometime it works, sometime it does not. Yes.
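The error read out above ("expected token end of print statement, got as") is what Jinja reports when something other than the macro call sits inside the double curly braces. A minimal sketch of the macro and a working call follows. The file name and the argument names col1 and col2 are assumptions for illustration, since the transcript does not show the macro body.

```sql
{# macros/multiply.sql (hypothetical file name and argument names) #}
{% macro multiply(col1, col2) %}
    {{ col1 }} * {{ col2 }}
{% endmacro %}
```

```sql
{# Broken: the alias is inside the braces, so Jinja stops at "as":
   select {{ multiply(10, 50) as test_col }}
   Working: only the macro call goes inside the braces. #}
select {{ multiply(10, 50) }} as test_col
```

Under these assumptions the second file compiles to a plain `select 10 * 50 as test_col`, which is where the 500 mentioned in the video would come from.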
236:04 So, now we have also covered Jinja
236:07 functions and we have just built an
236:08 amazing function as well, let's say
236:10 multiply. Okay, makes sense.
236:12 Very
236:13 good. So far we have built our bronze
236:16 layer. Now let's try to build, you can
236:18 say, the silver layer. Okay, because
236:20 obviously the bronze layer is built. So
236:22 let's say in the silver layer, what do we
236:24 need to build? We need to build a kind
236:26 of OBT, which is one big table, which is
236:30 very popular nowadays, which is very
236:32 trending nowadays, I would say, in the
236:33 data engineering community, because people are
236:35 more inclined towards
236:38 one big table. It is fine when
236:40 you have a small data model, but when the
236:42 data model becomes too big, people try
236:46 to just create one big table based on
236:48 their requirements. Let's say one big
236:51 table for finance, and
236:53 basically one big table per entity
236:57 instead of having multiple tables.
236:58 So let's
237:00 explore that as well. And in this
237:01 particular case we're going to write,
237:03 not complex, a little bit more
237:05 advanced SQL that you will also learn.
237:07 So let's go to silver and let's finally
237:10 build our silver layer. In order to
237:12 build the silver layer I will go to
237:14 models, silver, and let me first of all give you
237:17 the requirements. So in the
237:19 silver layer, what I'm going to do:
237:22 obviously I'll be using, you can say,
237:24 functions like multiply, and you can build
237:27 other macros as well. We'll be just
237:29 using these macros. And just a small
237:32 homework: try to build more and more
237:34 transformation functions that you
237:35 can
237:38 actually use within your silver
237:40 layer. Make sense? Okay, very good. So
237:42 now let's try to create one big table.
237:45 And what will the one big table
237:47 look like? So basically one big
237:49 table will look like this. So we have
237:52 bronze customers, we have bronze
237:54 products, we have sales. Okay. And we
237:57 have store as well. If I go to bronze
237:59 sales, we have all the foreign keys and
238:01 the primary key. Right? So let's build our,
238:05 you can say,
238:07 um, okay,
238:10 I want to do something for snapshots as
238:12 well because I want to show you. Okay,
238:14 snapshots we can just treat
238:15 differently. That's fine. Okay, makes
238:16 sense. So let's create a silver layer.
238:18 First of all, let's go step by step. So
238:20 we have all the foreign keys here,
238:22 right? So let's try to create a silver
238:24 model based on the models that we have.
238:28 Okay. So first of all, what will I do? I
238:30 will treat my bronze sales as the main table.
238:34 Make sense? And after that I will attach
238:38 it with products and customers.
238:41 Okay, makes sense. And
238:44 store as well, but that is optional. But
238:46 for now, let's build bronze sales with
238:49 products and customers because I have a
238:50 very good use case that is used in the
238:52 real world. So, let's do that. So, let's
238:54 go to silver. Let's create a new file
238:56 and I will say silver
238:59 sales, or I will say silver
239:04 sales info. Okay? Because this is a
239:07 combined, this is an entity. Do not treat
239:10 it as, like, you can say, um,
239:14 transforming each table. No, this is a
239:17 modern world use case where we just try
239:20 to, you can say, build one big table for
239:24 each entity. This is an
239:27 entity which will be used by the
239:28 business. Make sense? Okay. Because see,
239:32 there is no hard and fast rule for data
239:34 engineering. It totally depends on how our
239:37 stakeholders want to use our data. So
239:40 let's say if you are building something,
239:42 if someone asks you, or if someone, let's
239:44 say, um, says like, hey, what
239:48 are you doing, you need to just build
239:50 this thing in bronze, you need to just, um,
239:53 enrich or transform data in enriched or
239:55 basically silver, you need to build a
239:57 dimensional data model in gold. Just ask
240:00 the person, are you the stakeholder? The
240:02 person will say no. So just say shut up,
240:05 because our stakeholder just said that I
240:07 want data like this, so it's my duty to
240:10 deliver the data like this. Who are you
240:12 to just pass the comments?
240:14 This is a reality. It's not like, if you
240:17 are an engineer, you have to
240:19 serve the dimensional data
240:20 model every time. Maybe they just want a simple
240:24 transformed table with combined columns.
240:27 It's called one big table. It's their
240:29 use case. Who are you to comment? So
240:32 just know these things as well, that
240:33 these things also exist. It's not like
240:35 you have to deliver a dimensional data
240:37 model. Yes, in most of the scenarios, and
240:39 so far, we are delivering dimensional
240:41 data models, fact tables, dimension
240:43 tables, um, all those aggregated views.
240:46 Yes, this is the trend so far, but you can
240:48 say more and more requirements are also
240:51 coming from the one big table side as
240:53 well. Makes sense. So you
240:57 should know that side as well. Okay. So
240:59 let's try to build that particular
241:01 piece. So here I'm going to write my
241:03 select statement and it will not be
241:05 very simple. So let's use something
241:07 called a common table expression,
241:08 because,
241:12 uh, DBT says if you just want to write
241:14 nested subqueries, just go with CTEs. Why?
241:19 More readable, better maintenance, you can
241:22 say management, plus easy to query. Yes,
241:25 you need to understand CTEs for that and
241:28 I expect that you know CTEs in SQL. And
241:31 yeah, you should know SQL if you are a
241:34 data engineer, okay, or any data
241:36 professional. And that is why I always
241:37 say, if you know SQL you are
241:40 actually in a very good position, because
241:43 in most of the things SQL is a backbone.
241:46 Okay, so let's write our first CTE. I will
241:49 say with,
241:50 uh, sales,
241:52 okay, or basically bronze sales, just to
241:56 make it more readable. And let me
241:59 just turn it off, let me just snooze it.
242:01 Okay, so now I will say as,
242:05 select.
242:07 I want
242:21 Okay. I want product SK. Not really. I just want their name. So I will say unit
242:24 price or quantity basically.
242:28 Okay. Quantity and unit price. We have
242:31 gross amount. We can just use gross
242:32 amount. Okay.
242:39 Gross amount makes sense. Then we have discount.
242:42 Okay. Then let's say payment method.
242:54 Perfect. I want these columns from bronze sales. So this is my first
242:56 CTE that I have created. Okay. Let me say
242:59 from,
243:04 and I will refer to my bronze layer. As you know, we can refer to that with ref,
243:07 bronze sales.
243:15 Perfect, and this is my name, bronze sales. Let's create the second CTE. And how can we
243:17 just create a second CTE? This is very
243:19 confusing to some, but yeah, don't
243:21 worry, this was confusing to me as well,
243:23 so don't feel like that. So let's say you
243:25 want to create a second CTE, so you will be
243:26 thinking like, hey, we need to write this,
243:29 let's say,
243:31 with, and then the CTE or whatever name, let's
243:34 say bronze,
243:36 um, products. Okay, you'll be thinking like
243:39 this. No. Once you write with, that's the
243:41 only time you need to write with.
243:43 Other than that you will be simply
243:45 defining the name, that's it. You can
243:46 simply say bronze, okay, and just for
243:49 readability you can hit enter, and then
243:50 bronze products, simple, and then as, and
243:54 then the query. Now within this query you can
243:56 also inherit this query. Let's say you
243:58 want to write select * from bronze
244:00 sales, or let me just make it sales,
244:02 that's it, otherwise it will be confusing
244:04 with the model name. Okay, now
244:06 you can also write something like this:
244:09 select * from
244:12 sales where sales ID is something like
244:16 this, anything. So this way you can
244:17 inherit the CTE. That is called nested
244:20 common table expressions. Okay, but we are
244:23 just creating products, so I will simply
244:25 write select. Uh, basically I think here I
244:30 need to pull product ID as well. Yes,
244:39 ID, and then what is the column name for that other
244:40 thing?
244:43 Product ID, uh, product SK, not ID. Product
244:46 SK and customer SK. SK means, like,
244:48 surrogate key.
244:56 Okay. And then I will say customer SK.
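The rule described a moment ago (write "with" once, then separate each further CTE with a comma and a name) can be sketched as below. This is an illustration rather than the exact file from the video: the column names, the model name bronze_sales, the CTE name filtered_sales and the placeholder filter are all assumptions.

```sql
{# Sketch only: names and the filter are placeholders, not taken from the video #}
with sales as (
    select sales_id, product_sk, customer_sk, gross_amount
    from {{ ref('bronze_sales') }}
),

-- No second "with" here: just a comma, the next name, and "as"
filtered_sales as (
    select *
    from sales               -- a later CTE can read from an earlier one
    where sales_id is not null
)

select * from filtered_sales
```

A CTE can only refer to CTEs defined above it in the same statement, which is why the order of the definitions matters.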
245:04 Perfect. So here now I can simply search
245:07 product SK or ID. Let me open bronze
245:10 product.
245:12 Let me run this.
245:25 Okay. So here we have product SK. Perfect. And we basically want category.
245:27 Okay. So here I will say product SK,
245:34 product category, or basically category.
245:45 Perfect. From, as you know,
245:47 ref,
245:49 bronze product.
245:59 Perfect. So let's create our third CTE, which is called, I think, customer,
246:02 and this will be like as,
246:10 select, and in the customer I'm expecting customer SK. Okay,
246:13 perfect. And let's explore what we
246:15 have in customer.
246:18 Let's run this.
246:29 Customer SK. Okay. And then we have loyalty tier. Hm. Let's use this loyalty
246:33 tier, or let's say gender.
246:37 Uh, yeah, let's say gender, because we can
246:39 just analyze who is making more
246:40 sales, boys, girls, or... Yeah.
246:44 So let's say gender. Perfect.
246:48 And this is our,
246:51 okay, customer SK. Perfect. So this is
246:54 our query. So these are three CTEs. Now
246:56 we're going to use these three CTEs in
246:58 our main select statement. And how can we
247:00 do that? I will simply say
247:01 select, uh,
247:10 okay, do we need to write a comma? Let's explore. Okay, select, okay, customer,
247:13 we have, oh man, okay, by the way, that's
247:16 what we need: select sales ID, product SK,
247:19 gross amount, that's it.
247:21 Actually I can ignore these surrogate
247:23 keys because I don't want these
247:25 surrogate keys. I just want product
247:27 category and gender, that's it. And this
247:30 is our query, and I can actually first of
247:33 all compile this code just to make sure
247:35 we have everything okay.
247:38 And perfect. And let's do one more thing.
247:42 We can use, uh, payment method, gross amount,
247:48 we had more things, right? We had, let's
247:51 say,
247:53 one thing was amount, something like that.
247:56 Let me just check, because I want to use
247:57 that macro that we created. That way you
248:00 will understand how we can
248:02 utilize our macros in select
248:03 statements. It is unit price and
248:05 quantity. Let's create a transformed
248:08 column, basically a calculated column, by
248:10 multiplying these two columns. Okay. So
248:12 I'll simply say the macro, which is multiply,
248:17 and I will pass these two columns, which
248:19 is, uh, unit price,
248:27 and then I will pass, uh, quantity.
248:30 Perfect. And I will call it
248:33 calculated
248:39 price, or calculated gross amount. It makes sense.
248:41 Okay.
248:45 So this way you can just perform this.
248:47 Let me just first of all compile this
248:48 code. So this is my compiled code. As
248:51 you can see, we do not have any kind of...
248:53 Oh, see, this is automatically replaced
248:54 with unit price into
248:56 quantity where we have this particular
248:58 macro. That's how you can work with this
249:01 thing. Make sense? Okay. So now this is
249:04 my model. This is my silver model. Make
249:07 sense? And just to make sure it will
249:08 work fine, we can simply run this as a
249:10 select statement before running it
249:12 in...
249:14 The table or view products cannot be
249:16 found. Really?
249:20 Bronze. Oh, I wrote bronze products. Let
249:22 me remove it. Okay, now let's try
249:28 products.
249:31 Okay, now let's run it.
249:41 Column, variable, function, customer SK. Okay. Oh, really? Customer SK is not
249:43 there. Customer
249:46 SK.
249:48 I think customer SK is there. What's the...
249:53 Okay. Customer dot customer SK,
249:56 sales dot customer SK. And what is the
249:59 error? Column, variable, or function
250:01 parameter with name customer SK cannot
250:04 be resolved. Are you serious? Let me
250:06 check
250:08 customer.
250:10 Let me run it.
250:21 We have customer SK. We have, bro, customer SK.
250:24 Customer SK. Okay.
250:27 So where are we using customer SK? We
250:29 are using it here. Sales dot customer
250:31 SK, customer dot customer SK. Customer SK
250:34 is here.
250:36 Customer SK is here as well.
250:41 Huh. Strange.
250:44 Let me run it again.
250:57 Cannot be resolved. Okay. What's the issue?
251:04 Oh, who will write from?
251:07 Perfect, now let's run it.
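Putting the pieces walked through so far together, the joined model might look roughly like the sketch below. It is a reconstruction under assumptions, not the file on screen: the snake_case column names, the model names passed to ref, the use of inner joins, and the exact place where the multiply macro is called are all guesses based on what is said in the video. The macro is assumed to take two column expressions and emit their product.

```sql
{# models/silver/silver_salesinfo.sql (hypothetical path and names) #}
with sales as (
    select sales_id, product_sk, customer_sk,
           unit_price, quantity, gross_amount, payment_method
    from {{ ref('bronze_sales') }}
),

products as (
    select product_sk, category
    from {{ ref('bronze_product') }}
),

customer as (
    select customer_sk, gender
    -- the missing "from" here is what caused "customer_sk cannot be resolved"
    from {{ ref('bronze_customer') }}
)

select
    sales.sales_id,
    sales.gross_amount,
    sales.payment_method,
    {{ multiply('sales.unit_price', 'sales.quantity') }} as calculated_gross_amount,
    products.category,
    customer.gender
from sales
join products on sales.product_sk = products.product_sk
join customer on sales.customer_sk = customer.customer_sk
```

Because the macro arguments are passed as strings, the compiled SQL contains the plain expression `sales.unit_price * sales.quantity`, which matches the replacement seen in the compiled code.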
251:20 So now, perfect, we have the view that I wanted. Okay, perfect. But this is not the
251:22 end result. Okay, this is just the joined
251:24 view that we want. Now I want to find
251:27 that,
251:29 which category,
251:30 okay, per gender performed well. Which
251:34 category per gender performed well?
251:37 Let's say category beverages. So
251:40 let's say in beverages,
251:42 which gender performed better. So
251:44 I just want to find it for each category,
251:46 and I want to, let's say, order it by
251:50 the amount or whatever. So this query is
251:51 not completed. That's why I told you
251:52 this will be a complex query. But
251:54 that's what we build in the real world,
251:56 right?
251:58 That's why DBT is here to help
251:59 you out with complex SQL queries.
252:02 Make sense? Okay. So, this is my select
252:04 statement. Okay. So, I can just make it
252:06 another CTE. Okay. And I can call it
252:10 as,
252:12 um, joined query,
252:16 as.
252:18 Perfect.
252:20 Joined query as this one. Okay.
252:25 So, this is my joined query. Okay. Now I
252:27 can just use this joined query to perform
252:30 the aggregation. Now we simply say
252:31 select,
252:33 select, uh,
252:36 gender,
252:37 first of all category,
252:40 category then gender. Okay. Then let's say
252:44 sum of amount,
252:46 uh, gross amount,
252:49 perfect, as total sales.
252:53 Okay, from
252:56 joined query.
252:58 Perfect. Group by.
253:01 I can say group by all as well, because
253:03 in Databricks we have GROUP BY
253:05 ALL, but it's fine: category, this.
253:07 And I want to say order by,
253:10 order by, I want to say, total sales desc.
253:13 Perfect, let's run this. Let's see, this is
253:15 basically our end query.
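The aggregation just described wraps the joined select in one more CTE and groups it. A compact sketch follows, trimmed to the columns the aggregate needs. Model names, column names and the inner joins are assumptions, as the transcript does not show the code itself.

```sql
{# Sketch of the final silver query; names are assumptions #}
with sales as (
    select product_sk, customer_sk, gross_amount
    from {{ ref('bronze_sales') }}
),

products as (
    select product_sk, category
    from {{ ref('bronze_product') }}
),

customer as (
    select customer_sk, gender
    from {{ ref('bronze_customer') }}
),

joined_query as (
    select products.category, customer.gender, sales.gross_amount
    from sales
    join products on sales.product_sk = products.product_sk
    join customer on sales.customer_sk = customer.customer_sk
)

select
    category,
    gender,
    sum(gross_amount) as total_sales
from joined_query
group by category, gender      -- on Databricks, "group by all" would also work
order by total_sales desc
```

Listing the grouping columns explicitly keeps the query portable to warehouses that do not support GROUP BY ALL.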
253:25 Perfect, I got all the things. See: gaming female 2599, gaming other 20, and gaming
253:28 male
253:29 178 to4. Wow. Females are actually
253:34 making more sales in gaming. Wow. TV and
253:37 home theater as well. Blogs as well.
253:40 Computers and tablets as well. Beverages
253:43 as well. Okay. It's... Oh, in cleaning, bro,
253:48 just check cleaning. Okay. In cleaning, males
253:51 are making more, you can say, sales.
253:54 Okay. Personal care. Obviously, that's
253:56 not even a question. Males will be
253:59 in the negative.
254:01 They just use the same product that
254:03 females use, like they just say, hey,
254:04 just give me a little bit of it and I
254:06 can just use it. Bakery makes sense.
254:09 Okay, dairy is like normal because
254:11 everyone consumes dairy products.
254:13 Produce. Okay, so see who is making more
254:16 sales, who is more expensive. Just
254:21 see, the data is in front of you. You can
254:25 just decide who is actually spending
254:28 more. Okay. So that's where your salary
254:32 is going, right? Your package is not
254:35 less. Females are making more
254:38 sales. Not sales, like, more expenses.
254:42 So now our silver layer is also ready.
254:46 Okay. And by
254:48 the way, I was just kidding. So do not
254:49 take it personally. Just a fun class. Do
254:52 not behave like an immature kid. I know
254:56 there are a few immature kids. Do
254:58 not worry, you will also mature with
255:00 time.
255:02 the time. Just so this is our final query that we need
255:04 so this is our final query that we need to build for the silver layer. And we
255:05 to build for the silver layer. And we have actually made a lot of changes to
255:07 have actually made a lot of changes to this silver layer. And a little homework
255:09 this silver layer. And a little homework for you. This type of query I want you
255:12 for you. This type of query I want you to build for returns as well because
255:13 to build for returns as well because everything will be exactly same but
255:15 everything will be exactly same but instead of sales it will be returns.
255:17 instead of sales it will be returns. Okay? This is a small homework for you.
255:20 Okay? This is a small homework for you. Make sense? Very good. So this have a
255:22 Make sense? Very good. So this have a silver layer one big table that we want
255:23 silver layer one big table that we want to just find this is basically not a one
255:25 to just find this is basically not a one big table basically a KPI that we have
255:27 big table basically a KPI that we have created in the silver layer okay on the
255:30 created in the silver layer okay on the basis of the requirement that we got
255:32 basis of the requirement that we got make sense very good now let's actually
255:35 make sense very good now let's actually run dbt run d run dbt run
255:44 okay now I will say dbt run I can run all the things but I don't want to run
255:46 all the things but I don't want to run all the things I will simply say select
255:48 all the things I will simply say select and I will say models mod and then
255:51 and I will say models mod and then silver because I just want to run silver
255:53 silver because I just want to run silver models because our all the other models
255:55 models because our all the other models are already there. I don't want to use a
255:57 are already there. I don't want to use a compute.
255:59 compute. Okay. So let's see if our silver model
256:01 Okay. So let's see if our silver model is ready. It's called silver sales info
256:04 is ready. It's called silver sales info and
256:06 and okay makes sense.
256:08 okay makes sense. Sales info running complete
256:09 Sales info running complete successfully. Let me just check and
256:11 successfully. Let me just check and verify if we got it.
256:20 Okay. Now in the silver, perfect, we got this table. Perfect, perfect, we got this
256:23 table in silver as a view, and that's how
256:25 we create, basically you can say, complex
256:28 views, and that's how we use macros and
256:31 Jinja within DBT. I hope it gave you
256:33 an end-to-end understanding. Make sense?
256:36 Now I want to talk about snapshots. It is
256:40 very, very important, and let me just give
256:41 you a quick hint: we use snapshots to
256:44 work with slowly changing dimensions.
256:47 Wow, I know you're really, really happy
256:49 hearing that. "Oh, we're actually
256:50 learning slowly changing dimensions!" Yes.
256:52 Now let's actually see how you can
256:54 build slowly changing dimensions within
256:55 DBT. Okay, let me just show you. So,
256:58 bro, now it's time to talk about
257:02 snapshots. Basically, snapshots are
257:06 created
257:10 so that we can work with slowly changing
257:12 dimensions using DBT. You know that,
257:15 being a data engineer, one of the most
257:19 important tasks for a data engineer is
257:22 to create a slowly changing dimension.
257:24 And on top of that, creating a slowly
257:27 changing dimension type one is not a big
257:29 deal. It's very easy because it's a simple
257:31 upsert. But creating a slowly changing
257:34 dimension type two is a task, because you
257:36 need to keep track of the history as
257:38 well. Make sense? And the good news is,
257:41 with the help of snapshots, it is very
257:43 easy. Okay? So do not think that it will
257:44 be really complex. It is
257:46 critical, but it is not complex. Okay, it
257:49 is complex, but it will not feel complex,
257:50 because I'll be walking you through it. Okay,
257:52 so let me just take you to the
257:54 documentation, and I hope this time the
257:58 search engine will help us. Okay,
258:00 snapshots
258:02 DBT. By the way, snapshots were recently
258:05 updated. Earlier we used to use SQL
258:08 files to create snapshots, as we do
258:10 for other models, but now we are
258:15 using YAML files directly to create
258:17 snapshots. Obviously we can use SQL files
258:19 as well, but that is the legacy way, and
258:22 I don't want you to learn the legacy
258:24 things, right? Let me hit enter and let's
258:26 see what the search engine has returned.
258:29 Okay, that looks good: "Add snapshots to
258:32 your DAG". Okay, so basically let's scroll
258:34 a little. What are snapshots? Analysts
258:37 often need to look back in time, yes, at
258:39 previous data states in their mutable
258:41 tables. Obviously, because every table
258:44 can be changed. While some source data
258:46 systems are built in a way that makes
258:48 accessing historical data possible, that is
258:50 not always the case. That is true. That
258:52 is true. DBT provides a mechanism,
258:54 snapshots, which records changes to a
258:56 mutable table over time, which is also
258:58 called slowly changing dimension type
259:00 two. And I hope that you know about
259:02 slowly changing dimension type two. If
259:04 you do not know,
259:06 this is basically an example. So let's
259:08 say there was an order and its status was
259:11 pending on this particular date, and on
259:13 the next day it says shipped. But we need
259:16 to create a slowly changing dimension
259:18 type two for this particular table, which
259:20 will say, hey, this was the previous
259:22 status and this is the recent status.
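To make the pending-to-shipped example concrete, here is a rough sketch of what a type 2 history for one order could look like and how the current row might be picked out. The table name `orders_snapshot` and the dates are made up for illustration; `dbt_valid_from` and `dbt_valid_to` are the validity columns DBT adds to a snapshot table.

```sql
-- Hypothetical snapshot of an orders table after the status changed once.
-- id | status  | dbt_valid_from      | dbt_valid_to
-- ---+---------+---------------------+--------------------
--  1 | pending | 2024-01-01 00:00:00 | 2024-01-02 00:00:00
--  1 | shipped | 2024-01-02 00:00:00 | NULL

-- Current version of every order: the row whose validity window is still open.
select id, status
from orders_snapshot          -- assumed table name
where dbt_valid_to is null;   -- NULL marks the current row when no end date is configured

-- Full history of one order, oldest version first.
select id, status, dbt_valid_from, dbt_valid_to
from orders_snapshot
where id = 1
order by dbt_valid_from;
```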
259:25 That is why there is dbt_valid_to. Basically, there
259:27 are columns which say valid from and
259:30 valid to. If the record is currently in use and
259:34 it has not been updated since, then
259:36 obviously valid to will have the value
259:40 either null or a very big value,
259:44 something like 9999-09-09, something like
259:47 that. Okay. Configuring snapshots. As
259:50 you can see, currently we use YAML
259:54 files directly. Earlier we used to use
259:56 SQL files. Okay. And let's see if we
259:59 have anything special, and do not worry
260:01 about configuration, because I will be
260:02 walking you through it with a very good
260:03 example. So as you can see: "To add a
260:06 snapshot to your project, follow these
260:08 steps." For users on version 1.8, the
260:11 previous one, because I think we are
260:12 using 1.10,
260:14 it says to refer to the legacy snapshot
260:16 configuration. If I just open this link,
260:18 you will see that they are using SQL
260:20 files. See: config, select from,
260:24 blah blah blah. So this was the kind of
260:26 configuration block that we used to
260:28 write, and we were using the snapshot
260:30 block here. But that is the legacy
260:32 way. So I will not cover it, because you
260:35 will be confused after that, right? And
260:37 I don't want to do that. And let's see.
260:40 And as we know, this is just telling us
260:42 what you can use, because
260:44 there are multiple options to
260:46 create snapshots within DBT. There are
260:48 multiple strategies. One is based on,
260:50 you can say, timestamp. One is based
260:53 on check. There are so many things, but I
260:55 will just tell you the best way, and in
260:58 such a way that you can digest that
261:00 knowledge. Okay. Because the
261:03 more you work with DBT, the more
261:05 you will learn with time. But you
261:08 need very strong fundamental knowledge
261:10 about snapshots. So let's create a
261:13 snapshot based on any table. And what do
261:16 you think, which table should we
261:17 pick? If I go to bronze, if I go to,
261:21 let's say, bronze store, this can be a
261:22 very good example for our slowly changing
261:25 dimension. But the thing is, I want a
261:27 table in which we also have a
261:30 date column. Okay? Because
261:31 whenever we want to
261:34 work with a slowly changing dimension, we
261:35 have to make sure that we have a date
261:37 column. Let's see if we have one in bronze
261:41 store.
261:43 Uh, no. So let's do one thing. Okay,
261:46 let's do one thing. Let's create a table
261:49 so that we can understand how
261:51 slowly changing dimension type two
261:53 works. Okay. So I will go to Queries
261:56 and I will simply click on Create query,
261:59 just to create a simple table,
262:01 and this query you'll be using to
262:04 populate the data as well within this
262:05 particular table. Okay, let's create a
262:08 table. Create table. I'll pick the
262:10 catalog. The catalog is DBT tutorial dev.
262:13 And just a hint: very soon we are also
262:16 covering the deployment, because we are
262:18 already using continuous integration,
262:19 that is the CI part, and very soon we'll be
262:21 covering the CD part, that is continuous
262:23 deployment. So don't worry about that.
262:26 So I will be picking, let's say, the bronze
262:29 schema. I will say create table. Let's
262:32 say we want to create a slowly changing
262:34 dimension type two on top of a products
262:36 table, which is a very popular, you can
262:39 say, table, right? Products. I know we
262:43 already have a product table, but let's
262:44 create one called products. Okay,
262:48 products.
262:50 Okay, or let's say items, just to make
262:53 sure it is different. Okay, items.
262:56 Make sense? And let's create columns such
262:59 as, let's say, ID. Okay, ID will be int. Then
263:03 name, item name. Okay, it will be string.
263:06 Okay, and then, let's say, one more
263:09 column, let's say category. Okay,
263:12 category string. And obviously a create
263:14 date or an update date. Okay, let's say
263:17 update date, meaning at what time this
263:22 particular record was updated. Okay,
263:24 timestamp. Simple. And I want to just insert
263:27 some records.
263:30 Insert into
263:33 items, and
263:36 values:
263:38 one,
263:40 and then item one, category one. Perfect.
263:43 This is a very good example, and I'll be
263:45 using current timestamp instead of this
263:46 particular timestamp, because we'll be
263:48 adding more and more records. Right? So
263:50 current timestamp.
263:58 Perfect. Current timestamp, and current
264:00 timestamp, and let's insert three records.
264:03 It's fine. Okay, let me just run this.
264:10 So this is basically our table, which is in the bronze schema. Remember this.
264:12 This is not in our source. I think we
264:15 should have created this table in the
264:16 source schema. Ideally,
264:20 that makes sense. So let's do one thing.
264:22 Let's create this table in the source.
264:24 Okay, let me just pick source. Okay. And
264:28 then click on Run all. Make sense? Okay.
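The table built in the steps above can be sketched roughly like this. The catalog identifier `dbt_tutorial_dev`, the column names, and the second and third rows are assumptions based on what is said in the video, not a copy of the query on screen.

```sql
-- Assumed identifiers: catalog dbt_tutorial_dev, schema source, table items.
create table if not exists dbt_tutorial_dev.source.items (
    id          int,
    name        string,
    category    string,
    update_date timestamp   -- when this record was last updated
);

-- Three starter records; current_timestamp() stands in for a hard-coded time.
insert into dbt_tutorial_dev.source.items values
    (1, 'item1', 'category1', current_timestamp()),
    (2, 'item2', 'category2', current_timestamp()),
    (3, 'item3', 'category3', current_timestamp());
```

Inserting another row for id 3 later, with a new name and a fresh timestamp, is one way to simulate a change that a type 2 dimension should record.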
264:31 Very good. Just forget about the bronze
264:33 schema. This is in our source, and we
264:34 want to use this from source. Makes
264:36 sense. Now, let me just jump to my VS
264:38 Code editor. So, what are we going to do?
264:41 We need to create a snapshot. Simple.
264:44 And this is the snapshots folder. This is a
264:47 directory. And I know you are smart
264:48 enough. You will say, "Ansh Lamba, we need to
264:50 create one file inside this folder." Yes.
264:51 And it will be a YAML file. So let's
264:54 say
264:56 gold_items.yml.
264:58 Okay, because this is a
265:02 kind of gold layer that we are building,
265:04 because in the gold layer we populate
265:05 the dimensions most of the time. Okay,
265:08 so let's say this is our
265:10 gold_items.yml. Make sense? So
265:12 now what are we going to do?
265:15 You can directly
265:17 use that table as a source for this
265:20 particular snapshot.
265:24 Okay. But ideally, and this is not a hard
265:28 rule, ideally, you can say as a
265:31 good practice, we should always create a
265:34 dedicated source for these snapshots.
265:37 The reason?
265:40 Because with your raw source
265:44 you will have trouble defining
265:48 the primary key.
265:50 Okay. And why is that? Because let's say
265:54 you have the source. Okay, let's say
265:57 items: item one, item two, item three.
266:01 Make sense? Let's say the next day, or
266:04 let's say after a few minutes,
266:06 I change the item name from item three
266:10 to new item three, or let's say item
266:12 three new. Okay, the primary key is still
266:16 three, but the values changed. So that
266:18 means it is following the rule of the
266:20 primary key. That is fine. But when I
266:23 am creating a slowly changing
266:25 dimension, I need to define the
266:28 key column, right? At that time there will
266:30 be two rows with the key three, and that will
266:33 create a problem, because there cannot be
266:36 duplicate values in the key column. Make
266:38 sense? So in that particular scenario, in
266:41 the real world, basically, you can
266:44 say, in real-time scenarios, we create a
266:47 source, and this source's task is only and
266:51 only deduplication. It can obviously do a
266:54 lot of other things as well, but mainly
266:57 its task is to dedup, basically deduplicate,
267:00 the data. So what will it do? Every time,
267:03 it will pick only and only the latest
267:07 timestamp, basically each key with its
267:09 latest timestamp. Make sense? And it will
267:12 pass that data to the slowly changing
267:14 dimension type two, and in the slowly
267:16 changing dimension type two, basically the
267:18 snapshot, we will be managing the history.
267:22 Got it 100%? Be honest. Obviously you
267:27 will get to 100% when you see the
267:30 code. Let me show it to you. So this is our
267:33 gold_items.yml. Okay. And before
267:35 populating it, what do we need to do?
267:37 Let's first of all create the
267:40 source. Okay. Not the source for this gold snapshot, the
267:45 real source, the one in the source schema. If I
267:48 go to models
267:55 and source, this one. So let's say we have one more source.
268:04 Okay. And the source name is, I think, items. Yep. Perfect. So we have the source. Now
268:08 we are going to create our source for the YAML.
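For reference, a source entry for the new table could look roughly like this in the project's sources YAML file. The file path, the `database` value and the source name `source` are assumptions pieced together from what is said in the video.

```yaml
# models/source/sources.yml  (assumed path)
version: 2

sources:
  - name: source                 # referenced as source('source', 'items')
    database: dbt_tutorial_dev   # assumed catalog name
    schema: source
    tables:
      - name: items
```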
268:13 Basically a
268:15 downstream, no, not downstream, basically the
268:17 upstream for the YAML. What is
268:19 downstream, upstream? Okay, fancy words.
268:22 Dependency. So basically this particular
268:24 gold YAML will be populated using
268:27 something, and that something will be in
268:29 the gold folder. Let me create that
268:31 thing. So this is basically a model, and
268:34 its name will be source
268:42 source, um,
268:44 okay,
268:46 source_gold_items.
268:50 Make sense? Dot sql. Just to keep the
268:53 naming convention the same: source_gold_items.
268:55 Okay, makes sense. So what will I do? I
268:58 will simply say select star
269:05 from ref,
269:14 not ref, I would say source. Okay, source. The source name is source, and then items. Okay, but
269:17 now we need to dedup, as I just mentioned.
269:20 And how can we dedup? We always use
269:22 row number to dedup the data. So I will
269:24 say row number over, partition by
269:26 item ID, not item ID, we just have ID, so
269:30 ID. Okay. And order by will be update
269:33 date.
269:42 Update date. Perfect. And I will just name it dedup key, or basically dedup.
269:45 Okay. Dedup.
269:49 Okay.
269:58 Where dedup equals 1. Makes sense? And this way we will be getting
270:00 only and only the latest, you can say,
270:04 value. But obviously I do not want the
270:08 dedup
270:10 column. So what will I do? I will simply
270:13 select the columns from this. So let's
270:15 say
270:17 with.
270:37 Select star from dedup key. Yeah, makes sense. So now what will happen? We need
270:41 to select the columns. I want ID only.
270:45 I want name.
270:48 I want category. I want update date.
270:51 That's it. I don't want all the values.
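Putting the dictated pieces together, the deduplication model might look something like the sketch below. The file name `source_gold_items.sql`, the column names and the CTE name are assumptions. The ordering is written as descending here so that row number 1 is the latest record per id, which is what the explanation calls for.

```sql
-- models/gold/source_gold_items.sql  (assumed path)
with dedup_query as (
    select
        *,
        -- number the rows of each id, newest update_date first
        row_number() over (
            partition by id
            order by update_date desc
        ) as dedup_key
    from {{ source('source', 'items') }}
)

-- keep only the latest row per id and leave the helper column out
select
    id,
    name,
    category,
    update_date
from dedup_query
where dedup_key = 1
```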
270:53 So let me just query this and let's see
270:55 what we get. So this will be,
270:57 basically, the source for this particular
271:00 snapshot. Okay. Perfect, just as I
271:04 expected. So what are we going to do now?
271:07 Now it's time to populate the
271:11 snapshot. Okay. So this file is empty. Let's
271:14 go to the documentation. And
271:17 this is basically the
271:19 example usage that we can also try.
271:22 Okay. Example usage. And there are
271:24 so many example usages. If you just
271:26 scroll to the top,
271:29 you will see this particular thing as
271:31 well. That is also a kind of usage.
271:33 Okay. If you scroll down, relation,
271:38 relation, it's fine.
271:41 You can also pick this one as well. It's
271:42 up to you which one you want to pick.
271:44 Okay. So, let's say, let's pick this
271:47 example usage, the one it calls example
271:49 usage. Perfect.
271:52 And it does not have all the
271:54 things. I want that big date as well. So
271:58 let's pick this one. Yeah, perfect. So
272:00 let me just copy this
272:03 and let me just paste it here.
272:06 Perfect. So now it's time to configure
272:09 it. And now it's time to tell you each
272:11 and everything about the snapshot. First of
272:13 all, we need to name it. And just to
272:16 make everything, you can say, consistent,
272:19 we need to pick the same name,
272:22 gold_items.
272:24 Make sense? Now what will be the source
272:27 of this particular snapshot? We just
272:30 created the source. What is the source?
272:32 The source is
272:35 this one. This model. Which model?
272:40 source_gold_items. Make sense? If you
272:43 just use your
272:46 common sense about how it will use
272:48 that particular source: in the
272:51 relation you have to mention that source
272:54 for this particular snapshot. But the thing
272:56 is, here you will be using a
272:58 function instead of just writing the
273:00 name directly. Why? Because as you know,
273:02 every time we use the ref function,
273:06 we do not use the ref function bare. We
273:08 use it inside Jinja curly braces. But here you do not
273:11 need to wrap it like that: ref
273:14 and then gold items, okay, basically
273:17 source_gold_items.
273:20 Here you can directly use the ref
273:22 function without any kind of braces.
273:29 Here as well, you can see whether they have used the ref function. No, because they have
273:31 just used source directly. But yeah, you
273:33 can just use the ref function. It's fine.
273:35 Now it is asking for config. So what
273:38 will be the schema? Schema will be gold.
273:40 Okay. Database, that means catalog
273:43 name. The catalog name will be, uh, what was
273:47 the catalog name?
273:58 DBT tutorial dev. And it is saying unique key is ID. As I just told you,
274:00 strategy is very important. There are
274:01 basically two types of strategies: timestamp
274:03 and check. DBT recommends using the
274:05 timestamp strategy. And in the industry
274:08 as well, we use this particular
274:10 strategy. We consider a column which is
274:12 of the timestamp type, and from it DBT decides
274:15 which rows have changed. Okay. And it is
274:17 saying, okay, I will pick this
274:20 strategy. But what is the column name?
274:21 The column name is update date. Make
274:24 sense? And what is this thing? dbt_valid_to_current.
274:27 So basically, if the value is
274:30 not updated, or basically if the value is
274:31 not changed, what value will it put
274:34 in the valid-to column? That is dbt_valid_to_current.
274:36 If you do not pass it, it will
274:37 provide null. If you pass it like this,
274:40 a very big date, it will use this
274:42 value. So, it depends on how your
274:43 organization wants to see the data.
274:45 Okay? Some organizations prefer
274:48 null. Some organizations prefer this
274:50 far-future date. It totally depends, because
274:51 they do not want to keep nulls in
274:53 their column. That's the only thing.
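With every field filled in as described, the snapshot file would look roughly like this. The underscored names (`gold_items`, `source_gold_items`, `dbt_tutorial_dev`, `update_date`) are assumed spellings of what is said aloud, and the end date shown is the one from the DBT documentation example rather than anything confirmed from the screen.

```yaml
# snapshots/gold_items.yml
snapshots:
  - name: gold_items
    relation: ref('source_gold_items')   # the dedup model, not the raw table
    config:
      schema: gold
      database: dbt_tutorial_dev         # catalog name
      unique_key: id
      strategy: timestamp
      updated_at: update_date
      # value written to dbt_valid_to for current rows; omit it to get NULL
      dbt_valid_to_current: "to_date('9999-12-31')"
```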
274:55 Okay? So, this is done. Now, what do we
274:58 want to do? We want to populate this
275:00 particular snapshot. Okay. How can we
275:02 populate this particular snapshot?
275:04 I will run a command:
275:08 cd. Okay. And then dbt snapshot.
275:13 So when I hit dbt snapshot, it will go
275:16 to the snapshots folder and it will build
275:19 the snapshot. Okay. And as you can see,
275:21 completed with one error. Very good.
275:23 Failure in snapshot gold_items. And what
275:25 is the error?
275:27 Database error. Okay. The table or
275:31 view gold.source_gold_items cannot
275:35 be found. Very good. This was expected,
275:37 because obviously this particular source
275:39 gold items is not actually
275:42 created so far, because we directly used
275:44 this particular command. And I wanted to
275:46 show you this, because see, I just ran
275:49 this command, dbt snapshot.
275:52 Make sense? Make sense? And if we just
275:55 want to run the models, we run this
275:56 thing, dbt run and then the model name. Or if
276:00 we just write this, it will run all the
276:01 models. Do we need to run so many things?
276:05 Like, every time, do we need to run
276:07 multiple commands? And how will we
276:09 keep worrying about all these things,
276:11 like should we run this thing, should we
276:13 run that thing? So that is why we have a
276:16 command called dbt build. It just
276:19 runs your models, snapshots, seeds,
276:23 literally everything. This is the major
276:26 command that we use: dbt build. Hit this.
276:29 Now it will build the entire
276:32 project. And you will see that in the
276:35 logs. So it will build all your models
276:38 one by one. Seeds, snapshots,
276:41 everything. See, everything.
276:44 Now it will be just one of 17, all the
276:46 17 things, all the macros, all the
276:49 seeds, sources, everything, literally
276:52 everything. This is the command that we
276:54 use to build all the things. This is the
276:56 command that we use to orchestrate our
276:58 DBT pipelines. This is the command that
277:00 we use for the deployment as well. Yes,
277:03 there is a small tweak in that. We will
277:05 talk about that as well in
277:06 the CD part, that is deployment. So, it
277:09 is saying completed successfully. Wow.
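Why did dbt snapshot fail on its own while dbt build succeeded? A rough way to picture it is a dependency graph that gets built upstream-first. The Python sketch below is only a simulation of that idea, not dbt's implementation; the node names other than source_gold_items and gold_items are hypothetical, and it assumes source_gold_items is produced by a model.

```python
from graphlib import TopologicalSorter

# Hypothetical project graph: each node maps to the nodes it depends on.
deps = {
    "seed.lookup": set(),
    "model.bronze_items": set(),
    "model.source_gold_items": {"model.bronze_items"},
    "snapshot.gold_items": {"model.source_gold_items"},
}


def build_order(graph):
    """Order in which a build-everything command could process the nodes."""
    return list(TopologicalSorter(graph).static_order())


def missing_upstreams(node, graph, existing):
    """Upstream objects a single-node run would need but cannot find."""
    return sorted(d for d in graph[node] if d not in existing)


# Running only the snapshot against an empty warehouse: its input is missing.
print(missing_upstreams("snapshot.gold_items", deps, existing=set()))

# Building everything: upstream nodes always come before the snapshot.
print(build_order(deps))
```

The first call mirrors the "cannot be found" error from the snapshot-only run; the second shows the snapshot placed after the object it reads from.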
277:11 So now I'm very excited to see the
277:14 results for my catalog, basically this
277:18 SCD. Okay. So let's create the query
277:22 here. I will say
277:25 select
277:31 * from, the catalog name we already know. So it
277:33 will be gold items, right?
277:36 And this is gold.
277:40 Let's run this. Let's see what we
277:42 have. Okay, so all these things are here, and
277:46 it created some additional columns for
277:47 us. As we all know, this is like a date, and
277:50 this is like valid from, and this one
277:52 has the very big date. Now I want to test
277:54 it. I will change the name of item
277:56 three. Okay, I will put the value like
278:00 this. Let's say
278:07 I will first of all remove these values. Let's only change three and I will just
278:09 make it item three new. Okay. And I will
278:12 insert this record with the updated
278:14 timestamp, because obviously a new value is
278:16 added. So we are expecting that. Okay.
278:19 Cannot be found. Okay. Makes sense,
278:22 because it is in the source. Yep. So now
278:25 we are adding this record in our source
278:27 table. If we query the source table, we
278:30 will be seeing two different values for
278:31 three. But our source that we are using
278:34 for the snapshot will dedupe it and it will
278:37 only pass this particular value. But do
278:39 you know what? It will not simply upsert
278:41 this value. It is a slowly changing
278:44 dimension type two. It will keep the
278:46 previous value as well and it will
278:48 add this one as well. So you want to test
278:50 it. I also want to test it. Okay. So now
278:52 I will simply run dbt build one more
278:56 time. Let's see what happens. Okay.
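The dedupe step described here, keeping only the newest version of each item before the snapshot sees it, can be sketched in plain Python. This is an illustration of the idea rather than the SQL used in the project; the column names id, name and updated_at are assumptions.

```python
from datetime import datetime


def latest_per_key(rows, key="id", ts="updated_at"):
    """Keep only the most recently updated row for each key."""
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[ts] > latest[k][ts]:
            latest[k] = row
    return list(latest.values())


# Hypothetical source table after inserting the changed record for item 3.
source_rows = [
    {"id": 1, "name": "Item 1", "updated_at": datetime(2025, 1, 1)},
    {"id": 3, "name": "Item 3", "updated_at": datetime(2025, 1, 1)},
    {"id": 3, "name": "Item 3 new", "updated_at": datetime(2025, 6, 1)},
]

for row in latest_per_key(source_rows):
    print(row["id"], row["name"])
```

The source table still holds both versions of item 3; only the newer one is passed on to the snapshot.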
278:58 Okay. And let's see if our slowly
278:59 changing dimension type two is working
279:02 fine or not. Okay. I am also testing it.
279:05 You are also testing it. So
279:08 both are testing together. And let me
279:11 just meanwhile check if we want to
279:12 cover
279:14 more things before deployment, because
279:16 obviously there are so many things
279:18 and
279:20 you have to explore things when you
279:22 are working with DBT more and more.
279:24 But I am sure all these
279:26 things will build you a strong, strong,
279:28 strong base. Okay, makes sense.
279:35 Okay.
279:37 Okay, completed successfully. Yeah, let's
279:39 test it. And I think we're good to go
279:42 with the deployment. Yes, let's
279:43 start the deployment. But yes, don't
279:45 worry, don't worry. Don't worry, don't
279:46 worry. I'm going to talk about slowly
279:50 changing dimension type two, because we
279:51 need to validate it. So let's see what
279:54 the value is now in gold items. We
279:56 should see both the values of item three,
279:58 right?
280:00 And pick the gold schema,
280:09 and perfect. If we do not see both the values, how,
280:12 how can we not see them? We will see. See, we have
280:16 built slowly changing dimension type two in just
280:18 a few minutes, in just a few minutes. And if
280:22 you have built slowly changing dimension
280:24 type 2 in the real world without DBT, by
280:27 just using PySpark, you know that it
280:29 takes a lot of time. You know that.
280:31 Okay? Because you need to first update
280:33 the existing values, then you need to
280:36 insert the new values, right? So it's
280:41 like this. And you can see that for the
280:44 previous value the valid-to date is now
280:47 the current date, I mean
280:49 today's date, because it was expired
280:52 today. But for the new value it is the
280:54 very big date, the very long date, right? Very,
280:57 very good. Very good.
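The "first update the existing values, then insert the new values" routine mentioned above is what a hand-written SCD type 2 load has to do. Below is a minimal Python sketch of those two steps, offered as an illustration and not as what dbt runs internally. The column names, the far-future sentinel, and using the record's updated_at as the change time are all assumptions.

```python
from datetime import datetime

FAR_FUTURE = datetime(9999, 12, 31)  # hypothetical "very big date"


def apply_scd2(history, incoming, key="id", tracked=("name",)):
    """Return a new history with changed rows expired and new versions added."""
    result = [dict(r) for r in history]
    current = {r[key]: r for r in result if r["valid_to"] == FAR_FUTURE}
    for row in incoming:
        old = current.get(row[key])
        if old is not None and all(old[c] == row[c] for c in tracked):
            continue  # nothing changed, keep the current version
        changed_at = row["updated_at"]
        if old is not None:
            old["valid_to"] = changed_at  # step 1: expire the previous version
        new = {key: row[key], **{c: row[c] for c in tracked},
               "valid_from": changed_at, "valid_to": FAR_FUTURE}
        result.append(new)  # step 2: insert the new version
        current[row[key]] = new
    return result


t0, t1 = datetime(2025, 1, 1), datetime(2025, 6, 1)
history = [{"id": 3, "name": "Item 3", "valid_from": t0, "valid_to": FAR_FUTURE}]
incoming = [{"id": 3, "name": "Item 3 new", "updated_at": t1}]

for r in apply_scd2(history, incoming):
    print(r["name"], r["valid_from"].date(), r["valid_to"].date())
```

Both versions of item 3 survive: the old one closed at the change time, the new one open until the far-future date, which matches what the query on gold items shows.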
280:59 So you have successfully built a slowly
281:02 changing dimension type two as well. And
281:03 if we just look at our development, it
281:05 looks cool. Okay. And trust me, you have
281:08 literally learned so many things. Let me
281:09 just close the terminal. And I hope now
281:12 you know a lot of things and you are
281:15 ready to explore and build amazing
281:18 solutions in the real industry. And I am
281:21 so sure that you can confidently clear
281:23 the interviews. Do you know why? This
281:25 technology is new. Amazing updates are
281:28 coming every single day. So now you have
281:32 the updated knowledge. Now you know the
281:34 updated syntax. Now you know all the
281:37 new things, right? You have the latest
281:39 context in your mind. Feel confident.
281:43 Feel free to sit in the interviews
281:45 wherever DBT is required and just say
281:48 confidently, hey, I have watched Ansh
281:51 Lamba's YouTube video. So, just ask me
281:53 any question about DBT. I can say you
281:56 will be able to answer 90% of the
281:58 questions. Let me just talk about that
282:00 10% of the questions. Even if you are not
282:03 able to answer 10% of the questions
282:06 accurately, you will be able to give
282:09 some context for that question. And that
282:11 is what an interviewer always,
282:14 you can say, interviewers always look
282:16 for. Why? Because they are not expecting
282:18 you to just answer all the questions
282:20 accurately. You are not ChatGPT. You
282:23 are a person who will be developing
282:25 solutions. So even if
282:28 you are not aware of something, you
282:30 should have some context, so that you
282:32 can easily find the relevant documents,
282:34 you can easily find relevant resources,
282:36 and you can work with the people already there;
282:39 obviously they will be having
282:40 some experts in the existing
282:42 team. So you should be able to do
282:45 all of these things, right? So that is
282:47 why, trust me, you're good. Okay. So now
282:52 let me first of all go to Git. Okay, let
282:55 me just say git add.
282:58 Okay, then say git commit, and the
283:01 message will be
283:03 all done.
283:05 Okay, perfect. Now let's say git switch
283:08 main. Let's merge the code. Okay,
283:12 it's a kind of pull request if you have
283:14 a repo. Okay, git merge feature on.
283:19 Perfect. So we are on the main branch and
283:21 everything is committed. Very good. See
283:24 how easily you are learning Git as well.
283:26 That's the best way to learn Git. Just
283:28 start using it. Just start using it.
283:30 Okay. So now let's talk about the
283:32 continuous deployment part. So now you
283:34 know that you have built everything in
283:40 the dev environment. Dev environment
283:42 means basically
283:44 the dev catalog. Okay. You can imagine
283:48 that in your company, because every
283:50 company has a different
283:53 structure, you can say,
283:55 to manage environments. Okay,
283:59 let's say they are treating the dev catalog
284:02 differently, the QA catalog differently, or
284:04 basically the prod catalog differently,
284:06 because we are directly talking
284:07 about production. Okay, so they have a
284:10 dedicated catalog for production. Make
284:13 sense? So now you want to move all the
284:17 things, literally all the things, from
284:20 whatever you have in the dev catalog. Let's
284:21 say you have
284:24 so many things; you want all the same
284:26 things in the prod catalog as well. How
284:29 can you do that? How? Here the
284:33 profiles.yml file will help you a lot.
284:36 Okay. Why? Let me just tell you. Don't
284:38 worry. So whatever you have built here,
284:41 so many things, right? Literally so many
284:43 things, so many folders, seeds, models,
284:46 bronze, gold, so many things, right? So
284:49 many macros. You cannot
284:52 hardcode everything again
284:54 just to populate the same stuff in
284:56 production. That is why you need to
284:58 deploy. That is why we are saying
285:01 continuous deployment. And you know
285:02 what? Deployment in DBT is very easy.
285:05 Okay, how? Because of this
285:07 profiles.yml. Let me just open this.
285:10 If you look at this particular file, I
285:12 told you that this is basically a kind
285:14 of connection.
285:16 This is a kind of connection.
285:19 Make sense? In which we have written
285:21 almost everything. Hm. Make sense? And
285:25 within this file, we get something
285:27 called, not really, but a
285:30 kind of environment variables.
285:34 Okay, environment variables.
285:38 Just for your understanding, these are
285:40 not actually environment variables,
285:42 because in DBT Cloud we have a dedicated
285:45 tab for environment variables, but we
285:47 call these target variables, for DBT Core
285:51 and DBT Cloud as well. So these are
285:53 target variables. Make sense? This is the
285:56 right word, but from an understanding point
285:58 of view you can treat them as
286:00 environment variables, because if you
286:02 already know some deployment stuff, CI/CD,
286:04 you know we create
286:06 environment variables so that we can
286:07 change the values dynamically. So both
286:09 are the same idea. Okay, very good. So now
286:12 what do we need to do? We need to deploy
286:16 this thing into the production catalog.
286:18 Where is the production catalog?
286:20 Obviously we need to set it up. Okay, let's
286:22 set up the production catalog, because
286:23 that is also a task. Just use your
286:26 common sense. I know it is not very
286:28 common, but try to use it. You will be
286:30 able to use it. When you started your
286:33 project, you had your source schema
286:36 ready, which is your source data. Make
286:39 sense? Very good. Just tell me one
286:41 thing. Will you use the same data as dev,
286:45 or will you be having different data
286:47 in prod? In most of the scenarios,
286:49 different data in prod. But obviously in
286:52 this particular tutorial we can use the
286:54 same data, but I want you to create
286:57 the same environment that you will be
286:59 having in the real industry. So what I'm
287:01 going to do: I will be creating the same
287:03 source schema in the production catalog
287:06 and I will show you how you can
287:07 change the values, because it is
287:10 very easy to just connect to the same
287:11 source, because there will be no changing
287:13 environment variables and blah blah
287:15 blah. But I want you to learn
287:17 what we actually do in the real
287:19 world. Okay. So let's create a catalog
287:21 first of all, and I will name it as, let's
287:24 say,
287:26 what is this? What is this? Let me just
287:29 go to catalog.
287:32 Okay, perfect. Create a catalog, and the
287:34 catalog name will be DBT
287:38 tutorial
287:40 prod. Okay, dbt tutorial prod. So this
287:44 is my catalog. Configure catalog. Grant
287:46 access. Next, next, and save. Done.
287:51 Okay. So now let's create a schema. And the
287:53 schema name will be source. Make sense?
287:56 Source.
287:59 Click on create. Now you will be saying,
288:01 Ansh, do we need to again click on create
288:03 table, uploading files? Do we need to do
288:05 that? You can, but let me just tell you a
288:08 hack. Basically it's not a hack, it's like
288:10 a smart way of doing it. So I can just go
288:12 to queries, okay, and let me just use the
288:15 existing query that I just used. Okay,
288:18 and let me just remove everything, and
288:21 what I will do, I will say
288:24 create
288:26 table, and the table name will be, let's say,
288:29 uh, what was the table name
288:32 in the source? Okay. dim customer.
288:40 Okay, makes sense. And where do we want to create this? In the production.
288:43 Okay. In the source schema. Perfect. And
288:46 what's the query? as select * from
288:52 DBT
288:54 core dev dot source dot
289:05 hey, dbt core catalog? No, what is this? dbt tutorial dev,
289:08 source,
289:10 and this is gold, prod basically. Hey,
289:15 what are you doing? Okay, source.
289:18 Perfect.
289:19 DBT tutorial
289:27 prod, source, and then obviously dim customers,
289:29 customer basically. Let's run this. So
289:31 what will it do? It will simply
289:32 replicate the table. And it is saying
289:35 prod. Obviously it is not prod, it is
289:37 dev. It will simply duplicate the data
289:40 from dev to our prod. And this way
289:42 we can have an exact copy, or it can
289:45 be different as well, but you need to use
289:46 common sense; we are simply
289:48 creating a source schema, that's it. It can
289:50 be different, it can be same, right? So dim
289:52 customer, and then let's say dim product.
290:10 Then we need to say dim product, dim sales.
290:24 Okay, perfect. Then fact returns.
290:34 Okay, perfect. What do we have next?
290:48 Okay, dim date and then dim store. See how quick it is. How was the smarter
290:52 way of doing it?
290:55 dim store. Dim store.
291:04 Perfect. I think we are good. Yep. I think so. I think so.
291:08 Okay, let me just double check before
291:11 seeing any errors. Oh, items is left. We
291:15 need to create items. Yes,
291:21 items, and items.
291:24 Not dim items, just items.
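The copy step repeated above, one "create table ... as select * from ..." per source table, can be generated instead of typed by hand. The Python sketch below only builds the SQL strings and does not run them. The catalog names dbt_tutorial_dev and dbt_tutorial_prod and the table list are assumptions based on what is spoken in the walkthrough, so check them against the real workspace before using anything like this.

```python
def ctas_statements(tables, src_catalog, dst_catalog, schema="source"):
    """Build one CREATE TABLE ... AS SELECT statement per table."""
    return [
        f"CREATE TABLE {dst_catalog}.{schema}.{t} "
        f"AS SELECT * FROM {src_catalog}.{schema}.{t}"
        for t in tables
    ]


# Illustrative subset of the source tables named in the walkthrough.
tables = ["dim_customer", "dim_product", "dim_date", "dim_store", "items"]

for stmt in ctas_statements(tables, "dbt_tutorial_dev", "dbt_tutorial_prod"):
    print(stmt + ";")
```

Each printed statement can then be pasted into the SQL editor, which avoids the copy-and-edit slips that show up in the narration.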
291:38 Okay. So this is also done. So now what do we need to do? We have everything set up
291:40 for our deployment. So now what can we
291:43 do? We can create another connection.
291:46 Yes, just copy this value and then just
291:51 paste it here and then just call it
291:54 prod.
291:56 Okay. And you can change the value here
291:59 as well, because the rest of the stuff is the
292:01 same. Okay. And don't worry, I will just
292:03 paste this value in my original
292:04 profiles.yml file. This is just for you.
292:06 Okay. So obviously you will be having a
292:08 token here as well. So now what will
292:10 happen? This is saying, you need to
292:13 understand this. Okay. This is saying
292:17 target. Okay. What is the target currently,
292:20 as per this particular file? Dev. So
292:23 currently it is pointing to this
292:26 particular connection.
292:28 Okay. Everything is going here.
292:31 You will say, Ansh Lamba, do we need to change
292:33 this value? Will we be saying prod?
292:35 That's one way of doing it, and that's an
292:37 unprofessional way of doing it. We
292:39 should not change anything in this
292:41 particular file regarding target. Okay.
292:44 So now what will we do? Let me tell you one
292:46 thing. If I say target, there are
292:48 basically target variables, which I just
292:50 mentioned as environment variables. For
292:52 example,
292:54 target.catalog. What will be the
292:57 value of target.catalog
293:01 according to the current specification?
293:03 Just tell me this thing. It will be
293:08 this one, right? This one. Why?
293:12 Because target means dev. dev.catalog
293:15 means this one.
293:18 But what if I change this value
293:21 dynamically, and if I say target.catalog,
293:24 what will be the value of this?
293:25 Obviously this one, because it will say
293:28 prod.catalog. Make sense? prod.catalog.
293:32 This one.
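The lookup just described, where target names one of the connections and target.catalog reads that connection's catalog, can be mimicked with a plain dictionary. This Python sketch is a simplified picture and not how dbt parses profiles.yml; the catalog and schema values are assumptions. The override argument stands in for choosing the target at run time, which dbt supports through its --target command-line flag, so the file itself never has to be edited.

```python
# Simplified stand-in for one profile with two outputs (connections).
profile = {
    "target": "dev",  # default target, as in the walkthrough
    "outputs": {
        "dev": {"catalog": "dbt_tutorial_dev", "schema": "default"},
        "prod": {"catalog": "dbt_tutorial_prod", "schema": "default"},
    },
}


def resolve_target(profile, override=None):
    """Return the active connection settings, like dbt's `target` variable."""
    name = override or profile["target"]
    if name not in profile["outputs"]:
        raise KeyError(f"unknown target: {name}")
    return {"name": name, **profile["outputs"][name]}


print(resolve_target(profile)["catalog"])                   # default target
print(resolve_target(profile, override="prod")["catalog"])  # chosen at run time
```

The same project code then reads a different catalog depending only on which target was selected.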
293:34 This one. Oh okay. Make sense? So now our first
293:39 Oh okay. Make sense? So now our first task is to parameterize all the areas
293:44 task is to parameterize all the areas all the things wherever we have used or
293:47 all the things wherever we have used or basically hardcoded the catalog value.
293:50 basically hardcoded the catalog value. Make sense? So in this case we have to
293:53 Make sense? So in this case we have to use something called as target dot
293:56 use something called as target dot catalog. Make sense? Okay. So let me
293:58 catalog. Make sense? Okay. So let me just show you what will be the value of
294:00 just show you what will be the value of that thing. If I just go to my analysis,
294:03 that thing. If I just go to my analysis, okay, and if I just create a file, let's
294:06 okay, and if I just create a file, let's say target variables
294:10 say target variables dossql, okay, if I say
294:14 dossql, okay, if I say target dot catalog, let's see what we
294:19 target dot catalog, let's see what we will be getting.
294:20 will be getting. Let's see dbt tutorial dev. See because
294:25 Let's see dbt tutorial dev. See because currently our profiles is actually
294:28 currently our profiles is actually pointing to dev. Make sense? Make sense?
294:31 Very good. This is the, you can say,
294:34 environment-specific variable. Similarly,
294:36 we have target.schema, and so
294:38 many things, right? Very good. So our
294:41 task is to parameterize everything. So
294:44 you know that in our dbt_project.yml,
294:47 okay, if I scroll up, we do not
294:50 actually have anything related to catalog.
294:52 But in our sources, let's start from
294:55 scratch. In our sources, we have
294:59 hardcoded the value for this thing.
295:00 See, database here. We need to change its value
295:03 from the hardcoded catalog to
295:09 target.catalog. Simple.
295:10 This is the way to do it. And I think we
295:12 also need to add single quotes, otherwise
295:13 it will not work. Yeah, we need to add
295:16 single quotes. Just make sure you are
295:18 adding single quotes. Okay.
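A small sketch of what that change looks like, written in Python so it can be run as is. The quotes matter because a bare `{` at the start of a YAML value is read as the beginning of an inline mapping, so the Jinja expression has to be quoted to stay a plain string. The source name, schema name and both helper functions below are assumptions for illustration, not the files from the video.

```python
import re

# Assumed shape of the sources file; source and schema names are placeholders.
SOURCES_YML = """\
sources:
  - name: my_source
    database: '{{ target.catalog }}'
    schema: my_schema
"""

# Matches a value that consists of a Jinja expression, with optional quotes.
JINJA_VALUE = re.compile(r":\s*(['\"]?)\{\{.*?\}\}(['\"]?)\s*$")


def unquoted_jinja_lines(yaml_text):
    """Return lines whose value is a bare {{ ... }} without matching quotes."""
    bad = []
    for line in yaml_text.splitlines():
        match = JINJA_VALUE.search(line)
        if match and not (match.group(1) and match.group(1) == match.group(2)):
            bad.append(line.strip())
    return bad


def render_catalog(yaml_text, catalog):
    """Toy substitution of {{ target.catalog }}; dbt itself uses Jinja for this."""
    return re.sub(r"\{\{\s*target\.catalog\s*\}\}", catalog, yaml_text)


print(unquoted_jinja_lines(SOURCES_YML))                       # nothing flagged
print(unquoted_jinja_lines("database: {{ target.catalog }}"))  # flagged: no quotes
print(render_catalog(SOURCES_YML, "dbt_tutorial_dev"))
```

The same quoted expression would go wherever else the catalog was hardcoded.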
295:21 Makes sense? So this is the way to
295:23 do this. And if you remember, we have
295:26 one more area where we have hardcoded
295:28 the database value. What is that?
295:30 Snapshot. Okay. If you go to the snapshot,
295:34 if you go here,
295:36 in this YAML as well, we have
295:38 hardcoded the value. Let's change it
295:41 here as well. Perfect. Make sense? So
295:45 now it can be deployed. Now in order to
295:48 deploy it, first of all, obviously, I need
295:51 to make the changes in the original
295:52 file as well. Let me just quickly do it.
295:54 So I have also made the changes in my
295:56 original profiles.yml. So I am very
295:58 excited to deploy these changes to my
296:00 production. I will simply open the
296:02 command prompt. Okay. Just be ready. Now is the
296:06 moment. Okay.
296:09 So I will simply say cd. Perfect. And I
296:12 will say dbt build. And here we're going
296:14 to override the target value.
296:17 How? You'll simply say --target
296:20 prod.
296:22 Okay, it should match the value that you
296:24 have defined in your profiles, right,
296:26 the second connection name.
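The command being typed here is `dbt build --target prod`. Below is a rough Python sketch of the rule just described, that the name after `--target` has to match one of the outputs defined in the profile. The profile contents, the `pick_target` helper and the error wording are assumptions for illustration, not dbt's actual code or message.

```python
# Hypothetical outputs defined under one profile in profiles.yml.
PROFILE_OUTPUTS = {
    "dev": {"catalog": "dbt_tutorial_dev"},
    "prod": {"catalog": "dbt_tutorial_prod"},
}
DEFAULT_TARGET = "dev"


def pick_target(outputs, default, override=None):
    """Choose the output for a run; the override plays the role of --target."""
    name = override if override is not None else default
    if name not in outputs:
        valid = ", ".join(sorted(outputs))
        raise ValueError(f"no target named '{name}'; valid targets: {valid}")
    return name, outputs[name]


print(pick_target(PROFILE_OUTPUTS, DEFAULT_TARGET))          # plain `dbt build`
print(pick_target(PROFILE_OUTPUTS, DEFAULT_TARGET, "prod"))  # `dbt build --target prod`

try:
    # A profile that only defines dev cannot serve a prod run.
    pick_target({"dev": PROFILE_OUTPUTS["dev"]}, DEFAULT_TARGET, "prod")
except ValueError as err:
    print(err)
```

So if the run complains about the target name, the first thing to check is the list of outputs in the profiles file that dbt is actually reading.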
296:29 Should I hit enter? Hit it.
296:33 Wow. "Does not have a target named prod.
296:38 The valid target names for this profile
296:41 are dev." Let me just check my profile. I
296:43 think I just messed something up,
296:45 maybe. So we have dev here. There was a
296:48 small typo. Everything is fine in
296:50 this particular dummy file, but there was
296:52 a typo in my original file. Okay. And by
296:54 the way, I will just revoke the token. I
296:55 don't know why I'm just...
296:58 oh man. dbt build. Let me just run
297:02 it again. Are you ready this time?
297:06 Really? Okay, that's it.
297:10 Woohoo. Fingers crossed.
297:13 Okay. Eight models, seven analyses,
297:15 seven data tests, one seed, one snapshot,
297:17 seven sources.
297:19 Let's see one by one. We want to see
297:21 "completed successfully". I want to
297:23 see that,
297:25 because we have literally, yes, you as
297:28 well, literally worked hard to build this
297:31 whole thing, right?
297:33 And it's time to deploy it.
297:36 And no one actually cares about your
297:39 development if it is not deployed. Your
297:41 project manager, your manager, no
297:45 one. If it is deployed, everyone will
297:47 say well done. If it is not deployed,
297:49 "Hey, what have you done? We
297:51 do not actually care
297:52 about your development" and blah blah
297:54 blah. So deployment is very important.
297:59 We should build something that will
298:00 deploy first, and then we will start
298:02 building, because obviously they
298:04 will only value you once the objects are
298:06 deployed.
298:08 By the way, that was a really, really,
298:09 really deep thought. We should first
298:12 deploy it and then we should start
298:15 building. Is it possible, kind of
298:19 like, we can deploy maybe the
298:21 infrastructure first, like all the
298:24 dependencies, all the variables?
298:26 I don't know. We'll think about it.
298:29 We'll let you know. Don't worry. Don't
298:31 worry. So, let's see. Okay,
298:34 it has actually started now. And I hope
298:37 everything is fine.
298:50 Okay. And just a quick one: it will also perform your tests as well. Just for your
298:53 quick overview, the dbt build command will
298:56 take care of your tests as well. Do not
298:57 feel like, hey, will it skip the tests?
299:00 No. Oh, by the way, 16 are done. 17.
299:03 Wow.
299:06 Perfect, perfect, perfect, perfect. It
299:08 has completed successfully. No failures.
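A quick check on that counter, as a sketch: `dbt build` executes models, tests, seeds and snapshots, while analyses are only compiled and sources are only referenced. Assuming the 17 in the progress output counts executed nodes, the numbers read out from the summary line add up.

```python
# Counts read out from the run summary in the video.
found = {
    "models": 8,
    "analyses": 7,
    "data_tests": 7,
    "seeds": 1,
    "snapshots": 1,
    "sources": 7,
}

# Resource types that dbt build actually executes.
executed_types = ("models", "data_tests", "seeds", "snapshots")

executed = sum(found[kind] for kind in executed_types)
print(executed)  # 17
```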
299:11 Wow. Let me see if it is deployed
299:16 or not. Okay, let me just refresh.
299:20 And if I just open it. Oh, all the
299:23 schemas are there. Bronze, yes. Gold,
299:28 yes. Silver. Perfect. So, this way
299:32 you can also deploy it. Right? Right?
299:35 Are you
299:38 happy?
299:41 And if I just open this
299:44 particular one, okay, and if I just run
299:48 this. Why do I want to run this? Just
299:50 to show you the lineage, because
299:52 I told you I would show the
299:53 lineage as well. So this is the lineage.
299:55 Okay. And this is an amazing UI now. And
299:58 you can even reduce the size a little
300:00 bit. So this way you can also see the
300:02 lineage of each model. And you are
300:04 seeing bronze customer, bronze
300:05 product, bronze sales and so on. Now you
300:08 will say, Ansh Lamba, where is the lineage of
300:10 bronze customer? Because
300:12 by default it will only show you the
300:15 direct lineage of this particular sales
300:17 info, because it is directly connected to
300:18 this. If you want to see the lineage of
300:20 this one, you can click on the plus sign.
300:23 Just click on plus and see: dim customer
300:25 source. Same with this one. Same with
300:28 this one. Perfect. Make sense? So if you
300:31 click on this, you will see that all the
300:33 orange ticks are for this particular
300:35 one. If you click here, only this is the
300:37 one. If you click here, only this is the
300:38 one. If you click here, only this is the
300:39 one, and it is going to this
300:42 particular downstream as well. Make
300:44 sense? So this is a very good overview
300:46 of this particular, you can say, lineage.
300:48 It is dynamic. You can change it as
300:50 well.
300:51 Wow. Literally. Wow. So I'm really,
300:54 really excited about committing all the
300:56 changes, and I want to just talk about
300:57 one more thing. Let me just make the
300:59 commit.
301:04 Okay. git add, because I just modified that
301:06 file.
301:17 git commit -m. Now it's done. Perfect. So now let's talk about one
301:19 more thing. It is not directly relevant
301:21 to your, you can say, dbt CI/CD.
301:24 It is just, you can say, additional information.
301:26 What's that? It is indirectly
301:28 related to it. So far we have created a
301:31 kind of
301:32 local repo, everything, right? A local repo.
301:37 You can also create a remote repo, and I
301:40 want to give you a small homework for
301:41 this. Really? Yes.
301:44 So if you know how to create a GitHub
301:48 repo, it's very simple. You can simply
301:50 create your GitHub account and create
301:52 your remote
301:54 repo in GitHub. Make sense? Okay, you
301:58 need to push this local repo, the
302:01 one that we have just created, to that
302:03 remote repo. Okay, and it's very easy.
302:07 If you do not know how to do all of
302:09 these things, okay, you can just watch
302:11 my Git and GitHub masterclass
302:13 video, and you can simply search on
302:15 Google for "Git GitHub Ansh Lamba". So see, as
302:19 I've talked about this many times, if
302:22 you want to look for any topic
302:23 related to data engineering, just search
302:25 the topic with Ansh Lamba. Just add Ansh
302:28 Lamba and see, this is the video: Git and
302:31 GitHub. And if you do not know about Git,
302:33 it will give you all the
302:34 information that is there for you. And
302:35 you should know Git in 2025, bro. It's
302:37 like, 2025 is also
302:41 almost over. You should just know about
302:43 Git. It's mandatory stuff. Okay, so I
302:47 will expect that you now know how to
302:49 push the changes to GitHub. Okay, so
302:52 this is your local repo, and this is just
302:54 a homework for you, and it will help you
302:56 to add this project to your resume.
302:58 Okay, because a lot of people say, hey,
303:00 how can we add this to my
303:02 resume? How can we do this, bro? So
303:05 many excuses. Just listen to me. So this
303:08 is your local repo, and let's say this is
303:10 your remote repo. This is, let's say,
303:12 GitHub.
303:14 It's very easy. Let me just give you
303:16 quick actions. So first of all, this is
303:17 your local repo. This is your main
303:18 branch. Okay, you just need to create a
303:20 repo on GitHub. Let's say this is your
303:22 repo. If you create a README file, it
303:25 will create a commit. And what will
303:27 happen? You first need to pull those
303:29 changes into your local repo. Okay, let's
303:32 say pull main. Make sense?
303:36 Even before that, you first need
303:38 to create a remote connection. So you
303:40 can simply say git remote -v. I'm
303:42 just giving you Git commands. git remote
303:44 -v. Just check if you have
303:46 anything. If you see the URL, that means
303:48 your local repo is already connected
303:50 with GitHub. If not, you
303:52 will not see anything. Then you need to
303:53 set the remote repo. How? You can simply
303:58 say git remote add origin. Okay, you
304:02 can say
304:04 git remote. I think these are the
304:06 commands. You can double check, but I
304:08 know these are the commands. git remote
304:09 add origin. And then just provide your
304:10 URL. Okay, whatever the URL is for your
304:14 repo. Make sense? So this way you have
304:16 just added your remote origin. Then you
304:18 need to pull all the commits that are
304:21 there in the remote repo. What are the
304:22 commits? You will say it is empty. But if you
304:24 created the README file, it created
304:26 one commit. So the best practice is: do
304:28 not create a README file, so that there
304:29 will be no commits there. Okay? If you
304:32 have created it, it's fine. If you have not
304:34 created it, it's fine. So that means this
304:36 step is optional. Okay. This step is
304:39 optional.
304:41 Make sense? The next step will be,
304:44 the next step will be to push these
304:47 changes. So you will simply say git push
304:50 origin main. Okay. So this will push all
304:53 these changes to main. And during this
304:55 step it will also ask for the, you can
304:58 say, authentication, basically a token. So
305:00 you can just provide the token. And how
305:02 can you create the token? You
305:03 should know how to create a token in
305:05 GitHub. Just go to your settings, then
305:06 developer settings, then just create a
305:08 classic token, and that's it. Okay. So
305:10 the moment you hit this git push origin
305:13 main, it will create a kind of repo. Okay,
305:16 the repo is already there, but it will just
305:18 populate all these changes. So this way
305:20 you have your changes in your remote
305:22 repo as well, similar to this one. I hope
305:26 it makes sense. Okay, make sense? Very
305:29 good. So this is the way you can push
305:31 your changes, publish your changes, and
305:33 you can highlight these things. You can
305:34 flex, actually. If you are in university
305:37 or anywhere, you can flex these things
305:39 with your friends: hey, we have just
305:41 built a DBT project, a cool project. If you
305:43 are already in the industry, you can
305:44 flex, let's say, in
305:46 interviews or anywhere. Okay, because
305:48 flexing is important, especially, yeah, if
305:52 you are from, like, your favorite country.
305:55 So this is the way you can just push all
305:57 these changes from the local repo to GitHub.
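To keep the order of those Git steps straight, here is a small Python sketch that only assembles the commands as text; it does not run Git. The `push_plan` helper and the repository URL are placeholders made up for illustration.

```python
import shlex


def push_plan(remote_url, current_remotes, remote_has_commits, branch="main"):
    """Return the git commands, in order, for publishing a local repo."""
    steps = [["git", "remote", "-v"]]  # check whether a remote is already set
    if "origin" not in current_remotes:
        steps.append(["git", "remote", "add", "origin", remote_url])
    if remote_has_commits:
        # Only needed when the GitHub repo was created with a README.
        steps.append(["git", "pull", "origin", branch])
    steps.append(["git", "push", "origin", branch])
    return [shlex.join(step) for step in steps]


# Placeholder URL; use the URL of your own repository.
url = "https://github.com/your-user/your-repo.git"

for command in push_plan(url, current_remotes=[], remote_has_commits=False):
    print(command)
```

If the remote does start with a README commit, a plain pull into a repo that already has its own history may be refused as unrelated histories; adding `--allow-unrelated-histories` to the pull is the usual way around that, which is one more reason to create the remote repo empty.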
305:58 Okay. Make sense? Very, very good.
306:02 So this is your DBT masterclass
306:05 with CI/CD.
306:07 I hope it was helpful, and I just want
306:09 one thing from you. I literally put my
306:12 heart into this video because I
306:14 wanted to make this video special. So
306:16 just make sure this video reaches the
306:19 maximum number of people. I am not just
306:22 saying this for my, you can say, benefit. I
306:25 want to make this video available to the
306:28 maximum number of people so that they
306:30 can also learn this new technology and
306:32 they can grow in their careers, obviously
306:34 for their family, for their own, you can
306:37 say, happiness and all those things. I
306:39 want to see smiles on people's faces,
306:40 because DBT is a very, very new
306:43 technology and interviewers are
306:45 expecting you to know DBT, and if I
306:48 know that this video can help them, I
306:50 will be more than happy to bring that
306:53 smile to their faces, and that's it. In
306:55 return, you can simply write down your
306:59 honest feedback in the comment section,
307:01 and I'm open to that. Everything
307:04 is obviously transparent in my case,
307:06 right? You know me very well.
307:09 Nothing is hidden from you. So this way
307:14 let's end this video. It was lovely,
307:17 lovely, lovely covering this technology.
307:20 And just click on the video coming up on
307:22 the screen and I will see you there.
307:24 Bye-bye.