YouTube Transcript:
Data Structures - Full Course Using C and C++

Summary

Core Theme

This content provides a comprehensive introduction to data structures, focusing on fundamental concepts like arrays, linked lists, stacks, queues, trees, graphs, and their respective operations and implementations. It emphasizes the importance of choosing the right data structure for efficient software development by analyzing time and space complexities.

In this lesson and in this series of lessons, we will introduce you to the

concept of data structures. Data structures are among the most fundamental

building-block concepts in computer science, and good knowledge of data

structures is a must to design and develop efficient software systems. Okay, so let's

get started. We deal with data all the time and how we store, organize and group

our data together matters. Let's pick up some examples from our day-to-day life

where organizing data in a particular structure helps us. We are able to search

a word quickly and efficiently in a language dictionary because the words in

the dictionary are sorted. What if the words in the dictionary were not sorted?

It would be impractical, if not impossible, to search for a word among millions of

words. So a dictionary is organized as a sorted list of words. Let's pick up

another example. If we have something like a city map, data like the position of a

landmark and the road network connections is all organized in the form

of geometries. We show the map data in the form of these geometries on a two

dimensional plane. So map data needs to be structured like this so that we have

scales and directions and we are effectively able to search for a landmark

and get a route from one place to another. And I'll pick one more example. For

something like the daily cash-in and cash-out statement of a business, what we

also call a cash book in accounts, it makes most sense to organize and store the

data in the form of a tabular schema. It is very easy to aggregate data and

extract information if the data is organized in columns in

tables. So different kinds of structures are needed to organize different kinds of

data. Now computers work with all kinds of data. Computers work with text, images,

videos, relational data, geospatial data and pretty much any kind of data that we

have on this planet. How we store, organize and group data in computers

matters because computers deal with really really large data and even with

the computational power of machines, if we do not use the right kind of

structures, the right kind of logical structures, then our software systems

will not be efficient. A formal definition of a data structure would be that a

data structure is a way to store and organize data in a computer so that the

data can be used efficiently. When we study data structures as ways to store and

organize data, we study them in two ways. First, we talk about

them as mathematical and logical models. When we

talk about them as mathematical and logical models, we just look at an

abstract view of them. We just look, from a high level, at what features and

what operations define that particular data structure. An example of an abstract view

from the real world: the abstract view of a device named

television can be that it is an electrical device that can be turned on and

off. It can receive signals for satellite programs and play the audio and video of the

program, and as long as I have a device like this, I do not care how the circuits

are embedded to create this device or which company makes this device. So this

is an abstract view. So when we study data structures as mathematical or

logical models, we just define their abstract view or in other words, we have

a term for this, we define them as abstract data types. An example of an abstract

data type can be: I want to define something called a list that should be

able to store a group of elements of a particular data type and we should be

able to read the elements by their position in the list and we should

also be able to modify the element at a particular position in the list. I would

say store a given number of elements of any data type. So we are just defining a

model. Now we can implement this in a programming language in a number of ways.

So this is a definition of an abstract data type. We also abbreviate abstract data

type as ADT, and if you see, all the high-level languages already have a

concrete implementation of such an ADT in the form of arrays. So arrays give

us all these functionalities. So arrays are data types which are concrete

implementations. So the second way of talking about data structures is talking

about their implementation. So implementations would be some concrete types

and not an abstract data type. We can implement the same ADT in multiple ways

in the same language. For example in C or C++ we can implement this list ADT as a

data structure named linked list and if you have not heard about it, we will be

talking about them a lot. We will be talking about linked list a lot in the

coming lessons. Okay so let's define an abstract data type formally because this

is one term that we will encounter quite often. Abstract data types are entities

that are definitions of data and operations but do not have implementations.

So they do not have any implementation details. We will be talking about a lot

of data structures in this course. We will be talking about them as abstract

data types and we will also be looking at how to implement them. Some of the

data structures that we will talk about are array, linked list, stack, queue, tree,

graph and the list goes on. There are many more to study. So when we study

these data structures we will study their logical view. We will study what

operations are available to us with these data structures. We will study the cost

of these operations mostly in terms of time and then definitely we will study

the implementation in a programming language. So we will be studying all these

data structures in the coming lessons and this is all for this introductory

lesson. Thanks for watching. In our previous lesson we introduced you to the

concept of data structures and we saw how we can talk about data structures in

two ways. One as a mathematical and logical model that we

also term an abstract data type or ADT, and then we also study data structures

as concrete implementations. In this lesson we will study one simple data

structure. We will first define an abstract view of it. We will first define

it as an abstract data type and then we will see the possible implementations and

this data structure is the list. A list is a common real-world entity. A list is nothing

but a collection of objects of the same type. We can have a list of words, we can

have a list of names or we can have a list of numbers. So let us first define

list as an abstract data type. So when we define abstract data type we just

define the data that it will store and we define the operations available with the

type and we do not go into the implementation details. Let us first define

a very basic list. I want a list that can store a given number of elements of a

given data type. This would be a static list. The number of elements in the list

will not change and we will know the number of elements before creating the

list. We should be able to write or modify element at any position in the

list and of course we should be able to read element at a particular position

in the list. So if I ask you for an implementation of such a list and you

have taken a basic course in programming, a basic introductory course, then you'll

be like, hey, I know this, an array gives us all these features. All these

operations are available with an array. We can create an array of any data type

so let us say if we want to create a list of integers then we declare the array

type as integer and then we can give the size as a parameter in declaration. I can

write or modify the element at a particular position. The elements are accessed as A[0], A[1]

and so on. We all know about arrays, and then we can also

read the element at a particular position. The element at the ith position is accessed

as A[i]. So an array is a data structure that gives us an implementation for this list.
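As a rough sketch of the static list just described, a plain C array already covers it. The names LIST_SIZE, list_write and list_read are made up for illustration and do not come from the lesson; the size of 5 is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

#define LIST_SIZE 5  /* hypothetical fixed size, known before the list is created */

/* Write or modify the element at position i of the fixed-size list. */
void list_write(int list[LIST_SIZE], size_t i, int value) {
    assert(i < LIST_SIZE);  /* positions run from 0 to LIST_SIZE - 1 */
    list[i] = value;
}

/* Read the element at position i of the fixed-size list. */
int list_read(const int list[LIST_SIZE], size_t i) {
    assert(i < LIST_SIZE);
    return list[i];
}
```

A caller would declare the list as `int A[LIST_SIZE] = {0};` and then use `list_write(A, 0, 2)` or simply `A[0] = 2`; the helper functions only add a bounds check on top of ordinary indexing.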

Now I want a list that should have many more features. I want it to handle more

scenarios for me. So I'll redefine this list here. I do not want a static list, a

static collection with a fixed size. I want a dynamic list that should grow as

per my need. So the features of my list are that I'll call my list empty if there

are no elements in the list. I'll say the size of the list is 0 when it is empty

and then I can insert an element into the list, and I can insert an element at any

position in an existing list. I can remove an element from the list.

I can count the number of elements in the list, and I should be able to read

or modify the element at a particular position in the list, and I

should also be able to specify the data type for the list. So

while creating the list I should be able to say whether this is a list of integers

or whether this is a list of strings or floats or whatever. Now I want a data

structure which is implementation of this dynamic list. So how do I get it?

Well actually we can implement such a dynamic list using arrays. It's just that

we will have to write some more operations on top of arrays to provide

for all these functionalities. So let us see how we can implement this

particular list using arrays. Let's for the sake of simplicity of design assume

that the data type for the list is integer. So we are creating a

dynamic list of integers. What we can do to implement such a list is

declare a really large array. We will define some max size and declare an array

of this max size. Now as we know the elements in the array are indexed as A[0],

A[1], A[2] and we go on like this. So what I'll do is I'll define a variable that

will mark the end of the list in this array. So if the list is empty we can

initialize this variable or we can set this variable as minus 1 because the

lowest index possible is 0. So if end is minus 1 the list is empty. At any time

a part of the array will store the list. Okay so let's say initially when the

list is empty this pointer end is pointing to index minus 1 which is not

valid which does not exist. And now I insert an integer into this array and let's

say if we do not give the position at which the number is to be inserted the

number is always inserted towards the tail of the list towards the end of the

list. So the list will be like we will have an element at position 0 and now

end is index 0. So at any time this variable end marks the end of

the list in this array. Now if I want to insert something in the list at a

particular position let's say I want to insert number 5 at index 2 then to

accommodate 5 here at this particular position we will have to shift all the

elements one unit towards the right. We need to

shift all the elements starting at index 2 towards the right. Okay I just inserted

some elements into the list let me also write the function call for these. Let's

say we went in this order we inserted 2 then we inserted 4 and then we inserted

in the end we are inserting 5 and we will also give the position at which we

want to insert. So this insert with two arguments would be the call to insert

element at a particular position. So after all these operations after all these

insertions this is what the list will look like. This arrow here marks the end

of the list in the array. Now if I want to remove an element from a particular

position let's say I make a call to the remove function. I want to

remove the element 2. So I'll pass the index 0 here I want to remove the element

at index 0. So to do so all these elements after index 0 will be shifted one unit

towards the left or towards the lower indices and 2 will go away. Now this end

variable here is being adjusted after each insertion that we are making. So after

this insertion end will be 0, after this 1, 2, 3 and so on, and after this remove end

will be 4 again. Okay, looks like we pretty much have an implementation of

this list on the left that is described as an abstract data type. We have a logic

of calling the list empty when we have this variable end equal to minus 1. We

can insert an element at a particular position in the list. We can remove an

element. It's just that we have to perform some shifts in the array. We can

count the number of elements in the list. It will be equal to end plus 1, the value

in the variable end plus 1. We can read or modify the element at a position; well,

this is an array so we can definitely read or modify element at a particular

position. If we wanted to choose the data type it was just choosing the array of

that particular data type. Now this looks like a cool implementation except that

we have one problem. We said that the array will be of some large size some

max size. But what is a good max size? The list can

always grow to exhaust the array. There is no good max size. So we need to have a

strategy for the scenario when the list will fill up the whole array. So what do

we do in that case? We need to take that into account in our design. We cannot extend the same

array. It is not possible to do so. So we will have to create a new array a

larger array. So when the array is full we will create a new larger array and

copy all the elements from the previous array into the new array. And then we can

free the memory for the previous array. Now the question is by how much should we

increase the size of the new array? This whole operation of creating a new array

and copying all the elements from the previous array into the new array is

costly in terms of time and definitely a good design would be to avoid such big

cost. So the strategy that we choose is that each time the array is full we create

a new larger array of double the size of the previous array. And why this is the

best strategy is something that we will not discuss in this lesson. So we will

create a larger array of double size and copy elements from previous array into

this new array. This looks like a cool implementation. The study of data structures

is not just about studying the operations and the implementation of these operations.
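The array-backed dynamic list described above can be sketched in C roughly as follows. This is only an illustration under some assumptions: the list holds int values, the backing array is heap-allocated with malloc so that it can be replaced when full (the lesson speaks of declaring a large array), and the names DynList and dynlist_* are invented here. The field end plays the role of the end variable from the lesson, with -1 meaning the list is empty.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int *data;      /* heap-allocated backing array */
    int  capacity;  /* current size of the backing array */
    int  end;       /* index of the last element; -1 when the list is empty */
} DynList;

/* Create an empty list with the given starting capacity. Returns 1 on success. */
int dynlist_init(DynList *l, int capacity) {
    assert(l != NULL && capacity > 0);
    l->data = malloc((size_t)capacity * sizeof *l->data);
    if (l->data == NULL) return 0;
    l->capacity = capacity;
    l->end = -1;
    return 1;
}

/* Number of elements is always end + 1. */
int dynlist_count(const DynList *l) { return l->end + 1; }

/* Insert value at index pos, shifting later elements one unit to the right.
   When the array is full, a new array of double the size is created first. */
int dynlist_insert_at(DynList *l, int pos, int value) {
    assert(pos >= 0 && pos <= l->end + 1);
    if (l->end + 1 == l->capacity) {
        int new_capacity = l->capacity * 2;
        int *bigger = malloc((size_t)new_capacity * sizeof *bigger);
        if (bigger == NULL) return 0;
        memcpy(bigger, l->data, (size_t)l->capacity * sizeof *bigger);
        free(l->data);              /* free the memory of the previous array */
        l->data = bigger;
        l->capacity = new_capacity;
    }
    for (int i = l->end; i >= pos; i--)
        l->data[i + 1] = l->data[i];
    l->data[pos] = value;
    l->end++;
    return 1;
}

/* Insert at the tail of the list when no position is given. */
int dynlist_add(DynList *l, int value) {
    return dynlist_insert_at(l, l->end + 1, value);
}

/* Remove the element at index pos, shifting later elements one unit to the left. */
void dynlist_remove_at(DynList *l, int pos) {
    assert(pos >= 0 && pos <= l->end);
    for (int i = pos; i < l->end; i++)
        l->data[i] = l->data[i + 1];
    l->end--;
}

/* Release the backing array. */
void dynlist_free(DynList *l) {
    free(l->data);
    l->data = NULL;
    l->capacity = 0;
    l->end = -1;
}
```

Reading or modifying an element is plain indexing on `l.data[i]`. The shifting loops in insert and remove, and the copy performed when the array is full, are where the time goes.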

It's also about analyzing the cost of these operations. So let us see what are

the costs in terms of time for all these operations that we have in the dynamic

list. Access to any element in this dynamic list, if we want to access

it using the index for read or write, will take constant time because we

have an array here. And in an array, elements are arranged in one contiguous block of

memory. Using the starting address or the base address of the block of

memory and the index or the position of the element, we

can calculate the address of that particular element and access it in

constant time. In Big O notation, which is used to describe the time complexity of

operations, constant time complexity is written as Big O of 1. If we wanted

to insert an element at the end of the list then that again will be

constant time, but if we insert an element at a particular position in the

list then we will have to shift elements towards higher indices. In the worst case

we will have to shift all the elements to the right when we will be inserting at

the first position. So the time taken for insertion will be proportional to the

length of the list let's say the length of the list is n or in other words we

will say that insertion will be Big O of n in terms of time complexity. If you do

not know about Big O notation do not bother just understand that inserting an

element at a particular position will be a linear function in terms of the size

of the list. Removing an element will again be Big O of n. Time taken will be

proportional to the current size of the list; n is the size of the list here. Okay,

now inserting an element at the end: we just said that it will happen in constant

time. It is not so if the array is full, because then we will create a new array. Let's

call inserting an element at the end adding an element. Adding an element will take

constant time if the array is not full, but it will take time proportional to the

size of the array if the array is full. So adding in the worst

case will be Big O of n again. As we said, when the array is full we create a new

array double the size of the previous array and then we copy the

elements from the previous array into the new array. So prima facie, what looks

like the good thing with this kind of implementation. Well the good thing is

that we can access elements at any index in constant time which is the

property of the array but if we have to insert some element in between and if we

have to remove an element from the list then it is costly. If the list grows and

shrinks a lot then we will also have to create a new array and go through all this

business of copying elements from the previous array into the new array again and again. And

one more problem is that a lot of the time a lot of the array would be unused; the

memory there is of no use. So definitely the use of an array as a dynamic list is not

efficient in terms of memory. This leads us to think:

can we have a data structure that will

give us a dynamic list and use the memory more efficiently. We have one

data structure that gives us good utilization of the memory and this

data structure is linked list and we will study about the linked lists in the

next lesson. So that's it for this lesson thanks for watching. In this lesson we

will introduce you to the linked list data structure. In our previous lesson we

tried to implement a dynamic list using arrays and we had some issues there. It

was not the most efficient in terms of memory usage, in terms of memory consumption.

When we use arrays we have some limitations. To be able to understand

linked list well we need to understand these limitations. So I am going to tell

you a simple story to help you understand this. Now let us say this is computer's

memory and each partition here is one byte of memory. Now as we know each byte of

memory has an address we are showing only a section of the memory that's why it is

extending towards the bottom and the top. Let's say the address increases from

bottom to top. So if this byte is address 200 the next byte would be address 201

and next byte would be address 202 and so on. What I want to do is I want to draw

this memory from left to right horizontally instead of drawing it from

bottom to top like this. This looks better. Let's say this byte here is

address 200 and as we go towards the right the address increases. So this is

like 201 and we go on like 202, 203 and so on. It doesn't really matter whether we

show memory from bottom to top or left to right. These are just logical ways to

look at the memory. So coming back to our story memory is a crucial resource and

all the applications keep asking for it. So Mr. Computer has given this job of

managing the memory to one of his components to one of his guys who he

calls the memory manager. Now this guy keeps track of what part of the memory

is free and what part of the memory is allocated and anyone who needs memory

to store something needs to talk to this guy. Albert is our programmer and he is

building an application. He needs to store some data in the memory so he

needs to talk to the memory manager. He can talk to the memory manager in a

high level language like C. Let us say that he is using C to talk to the memory

manager. First he wants to store an integer in the memory. So he communicates

this to memory manager by declaring an integer variable something like this.

The memory manager sees this declaration and he says that okay you need to store

an integer variable. So I need to give you four bytes of memory because integer

variable is stored in four bytes in a typical architecture and let us say in

this architecture it is stored in four bytes. So the memory manager looks for

four bytes of free space in the memory and assigns it or allocates it for

variable X. Address of a block of memory is the address of the first byte in the

memory. So let us say this first byte of memory here is at address 217. So variable

X is at address 217. So memory manager kind of communicates it back to Albert

that hey I have assigned address 217 for your variable X you can store whatever

you want there and Albert can fill in any data into this variable. Now Albert

needs to store a list of integers a list of numbers and he thinks that the

maximum number of integers in this list will be four. So he asks the memory

manager for an integer array of size four named A. Now array is always stored in

memory as one contiguous block of memory. So memory manager is like okay I need to

look for a block of memory of 16 bytes for this variable this array A. So the

memory manager allocates this block starting address 201 and ending address

216 for this variable A which is an array of four integers. Because array is

stored as one contiguous block of memory and memory manager conveys the starting

address of this block whenever Albert tries to access any of the elements in

the array let's say he tries to access let's say he tries to write some value at

the fourth element in the array which he accesses as A3. Albert's application

knows where to write this particular value because it knows the base address

the starting address of the block A the array A and from base address using the

index which is three here it calculates the address of A3. So it knows that A3

is at address 213. So to access any of the elements in the array the application

takes constant time and this is one awesome thing about arrays that

irrespective of the size of the array, an application can

access any of the elements in the array in constant time. Now let's say Albert

uses this array of four integers to store his list. So I'll fill in some

values here at these positions let's say this is 8, this is 2, this is 6, this is

5, this is 4. Now Albert at some point feels that okay I need to have one more

element in this list. Now he has declared an array of size 4 and he wants to add

a fifth element in the array. So he asks the memory manager that hey I want to

extend my array A is it possible to do so I want to extend the same block and the

memory manager is like when I allocate memory for an array I do not expect that

you will ask for an extension. So I use whatever memory is available adjacent to

that block for other variables. In some cases I may extend the same block but in

this case I have the variable X next to your block so I cannot

give you an extension. So Albert is like what all options do I have? Memory

manager is like you can tell me the new size and I can recreate a new block at

some new address and we will have to copy all the elements from the previous block

to the new block. So Albert says that okay let's do it but the memory manager

is like you still need to give me the size of the new block. Albert thinks that

this time he'll give a really large size for the new array or the new block so

that it does not fill up. This new block starting at address 224 is allocated.
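In C, this exchange with the memory manager roughly corresponds to what the standard library function realloc does for heap blocks: it may extend the block in place, or it may allocate a new block elsewhere, copy the old contents and free the old block. The sketch below assumes the array was obtained with malloc (a plainly declared array such as the one in the story cannot be resized), and grow_block is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>

/* Ask for a block big enough for new_count ints, keeping the existing values.
   Returns the address of the (possibly moved) block, or NULL if the request
   failed, in which case the original block is left untouched and still valid. */
int *grow_block(int *block, size_t new_count) {
    assert(new_count > 0);
    int *bigger = realloc(block, new_count * sizeof *bigger);
    return bigger;
}
```

After a successful call the caller must use only the returned address, because the old address may no longer be valid. The copying cost Albert worries about is still paid whenever the block has to move.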

Albert asks memory manager to free the previous block and this is some cost he

has to copy all the elements all the numbers from the previous block into the

new block and now he can add one more element to this list and he has kept his

array large this time just in case he needs more numbers in the list. So the

only option that Albert had was to create an entirely new block, an

entirely new array and Albert is still feeling bad because if the list is too

small he is not using some part of the array and so memory is getting wasted

and if the list again grows too much he will again have to create a new array a

new block and he will again have to copy all the elements from the previous block

into the new block. Albert is desperately seeking a solution to this problem and

the solution to this problem is a data structure named linked list. So let us

now try to understand linked list data structure and see how it solves Albert's

problem. What Albert can do is that instead of asking the memory manager for

an array which will be one large contiguous block of memory he can ask memory for

one unit of data at a time for one element at a time in a separate request.
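A small sketch of what "one request per element" could look like in C, assuming each request is a separate malloc call. The names store_one and store_each are illustrative only. Nothing guarantees that the returned blocks sit next to each other in memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Ask the allocator for room for exactly one int and store value there.
   Returns the address of that block, or NULL if the request failed. */
int *store_one(int value) {
    int *slot = malloc(sizeof *slot);
    if (slot != NULL) *slot = value;
    return slot;
}

/* Make one separate request per element. out[i] receives the address of the
   block holding values[i]. Returns 1 on success; on failure, frees whatever
   was already obtained and returns 0. */
int store_each(const int *values, size_t n, int **out) {
    assert(values != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = store_one(values[i]);
        if (out[i] == NULL) {
            while (i > 0) free(out[--i]);
            return 0;
        }
    }
    return 1;
}
```

Note that this sketch keeps the addresses in a separate array out, which sidesteps the real question of how the blocks find each other.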

I am cleaning up the memory here once again let's say Albert wants to store

this list of four integers in the memory. What if he requests memory for one

integer at a time. So first he pings memory manager for some memory to store

number six memory manager will be like okay you need space to store an integer

so you get this block of four bytes at address 204. So Albert can store number

six here. Now Albert makes another request, a separate request, for number five. Let's

say he gets this block starting at address 217 for number five. Because he makes a

separate request he may or may not get memory adjacent to number six; the higher

probability is that he will not get an adjacent memory location. So similarly

Albert makes separate requests for numbers four and two. So let's say he gets these

two blocks at address 232 and 242 respectively for numbers four and two. So

as you can see when Albert makes separate requests for each integer instead of

getting one contiguous block of memory he gets these disjoint non-contiguous

blocks of memory. So we need to store some more information here we need to

store the information that this is the first element in the list and this is the

second element in this list. So we need to link these blocks together somehow. With

an array it was very simple. We had one contiguous block of memory so we knew

where a particular element is by calculating its address using the

starting address of the block and the position of the element in the array but

here we need to store the information that this is the first block which stores

the first element and this is the second block which stores the second element

and so on. To link these blocks together and to store the information that this

is the first block in the list and this is the second block in the list what we

can do is that we can store some extra information with each block. So what if

we can have two parts in each block something like this and in one part of

the block we store the data or the value and in the other part of the block we

store the address of the next block. In this example in the first block the

address part would be 217, the address of the next block that stores 5, and in

this next block or the second block the address part would be 232. In the block

at address 232 we will store the address 242, the address of the next

block that stores number 2, and the block at 242 is the last block; there is no

next block after this. So in the address part we can have the address as 0. 0 is an

invalid address; 0 can be used to mark that this is the end of the list, there is no

link to the next node or next block after this particular block. So Albert now

actually has to request memory manager for a block of memory that will store

two variables: one an integer variable that will store the value of our element

and one a pointer variable that will store the address of the next block or

the next node in the list. In C he can define a type named node like this. He

will have two fields in the node one to store the data this field will be an

integer and one more field to store the address of the next node in the list. So

Albert will ask the memory manager for memory for a node

and the memory manager will be like okay you need a node that needs four bytes

for an integer variable and four more bytes for the pointer variable that will

store the address. A pointer variable also, in a typical 32-bit architecture, is stored in

four bytes. So now memory manager gives us a block of eight bytes and we call

this block a node. Now notice that the second field in the node structure is

node star which means pointer to node so this field will only store an address

of the next node in the list. So if we store the list like this in the memory as

these non-contiguous nodes connected to each other then this is a linked list

data structure. Logical view of the linked list data structure will be

something like this. Data is stored in these nodes and each node stores the data

as well as the link to the next node. So each node kind of points to the next

node. The first node is also called the head node and the only information about

the list that we keep all the time is address of the head node or address of

the first node. So address of the head node kind of gives us access to the

complete list. The address in the last node is null or 0 which means that the

last node does not point to any other node. Now if we want to traverse the

linked list the only way to do it is we start at the head and we go to the first

guy and then we ask the first guy the address of the next guy address of the

next node and then we go to the next node and ask the address of the next node and

this is the only way to access the elements in the linked list. If we want to

insert a node in the linked list let's say we want to add number 3 at the end

of the linked list then all we need to do is first create a node

independently and separately. It will get

some memory location. So we created this node with value 3. Now all we need to do

is fill the address properly adjust these links properly. So the address of this

particular node will be filled in this node with value 2, and in this new node the

address part can be null, since it is the last node and does not point to any other

node. Let's also show these nodes in the memory here. So I've written the address

of each node in brown at top of these nodes and I've also filled in this address

field of each node. Let's say the node for value 3 gets address 252. So this is

how things will be in the memory and this is how the logical view will be. The

linked list is always identified by the address of the first node and unlike

arrays we cannot access any of the elements in constant time. In the case of

arrays, using the starting address of the block of memory and using the position of

the element in the list, we could calculate the address of the element, but in this

case we have to start at the head and we have to ask this element for next

element and then ask the next element who is your next. It's like playing

treasure hunt. You go to the first guy and then you get the address for the

second guy and then you go to the second guy and you get the address for the

third guy. So the time taken to access elements will be proportional to the size

of the list. Let's say the size of the list is n. There are n elements in the

list. In the worst case, to access the last element, you will go through all

the elements. So time taken to access elements is proportional to n or in other

words we say that this operation will cost us or rather the time complexity of

this operation is big O of n. Insertion into the list. We can insert anywhere in

the list. We first need to create a node and just adjust these links properly. Like

say I want to insert 10 at the third position in the list. So all we need to do is create a node,

store the value 10 in the data part, something like this. Let's say we get the

node 10 at address 310. So we will adjust the address field in the second node to

point to this node with address 310 and this node will point to the node with

value 4. Now to insert also we will have to traverse the list and go to that

particular position and so this will be big O of n again in terms of time

complexity. The only thing is that the insertion will be a simple operation. We

will not have to do all the shifts as we had to do in an array to insert

something in between. We had to shift all the elements by one position towards

higher indices. Similarly to delete something from this list will also be O n.
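The insertion steps described above (create a node, walk to the position, adjust two links) can be sketched in C roughly as follows. This is only an illustrative sketch: the function name `insert_at`, the 1-based position convention, and the field name `next` are assumptions made for the example, not something fixed by the lesson.

```c
#include <stdlib.h>

/* One node of a singly linked list of integers. */
struct Node {
    int data;           /* value stored in the node */
    struct Node *next;  /* address of the next node, NULL for the last node */
};

/* Insert `value` so that it becomes the node at 1-based position `pos`.
   Returns the head of the list (a new head only when pos == 1).
   The list is left unchanged if `pos` is out of range or malloc fails. */
struct Node *insert_at(struct Node *head, int value, int pos)
{
    if (pos < 1)
        return head;

    struct Node *node = malloc(sizeof *node);
    if (node == NULL)
        return head;
    node->data = value;

    if (pos == 1) {              /* new first node: no traversal needed */
        node->next = head;
        return node;
    }

    /* Walk to the node at position pos - 1. This walk is the O(n) part. */
    struct Node *prev = head;
    for (int i = 1; i < pos - 1 && prev != NULL; i++)
        prev = prev->next;

    if (prev == NULL) {          /* position is beyond the end of the list */
        free(node);
        return head;
    }

    node->next = prev->next;     /* new node points to the old node at pos */
    prev->next = node;           /* previous node now points to the new node */
    return head;
}
```

Notice that no element is shifted: once the walk is done, the insertion itself is just two pointer assignments.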

So we can see some good things about linked list. There is no wasted memory

in the sense that no allocated memory sits unused. We are using some extra memory

to store the addresses but we have the benefit that we

create nodes as and when we want and we can also free the nodes as and when we want.

We do not have to guess the size of the list beforehand like in the case of arrays.

Now we will discuss all the operations on linked list and the cost of these

operations as well as comparison with arrays in our next lessons. We will also be implementing

linked list in C or C++. So this is all for a basic introduction to linked list. Thanks for watching.

In our previous lesson we introduced you to linked list data structure and we saw how linked

lists solve some of the problems that we have with arrays. So now the obvious question would be

which one is better, an array or a linked list. Well there is no such thing as one data structure

is better than another data structure. One data structure may be really good for one kind of

requirement while another data structure can be really good for another kind of requirement.

So it all depends upon factors like what is the most frequent operation that you want to

perform with the data structure or what is the size of the data and there can be other factors as well.

So in this lesson we will compare these two data structures based on some parameters based on

the cost of operations that we have with these data structures. So all in all we will comparatively

study the advantages and disadvantages and try to understand in which scenario we should use an array

and in which scenario we should use a linked list. So I will draw two columns here one for array

and another for linked list and the first parameter that we want to talk about is the cost of accessing

an element irrespective of the size of an array it takes constant time to access an element in the

array. Now this is because an array is stored as one contiguous block of memory. So if we know the

starting address or the base address of this block of memory let us say what we have here is an integer

array and the base address is 200 the first byte in this block is at address 200 then let's say if

we want to calculate the address of element at index i then it will be equal to 200 plus i into

size of an integer in bytes. So size of an integer in bytes is typically 4 bytes so it will be 200

plus 4 into i. So if 0th element is at address 200 if we want to calculate the address for element

at index 6 it will be 200 plus 6 into 4 which will be equal to 224. So knowing address of any

element in an array is just this calculation for our application. In terms of big O notation, constant

time is also called big o of 1. So accessing an element in an array is big o of 1 in terms of

time complexity. If you are not aware of big o notation check the description of this video for

a tutorial on time complexity analysis. Now in a linked list data is not stored in a contiguous

block of memory. So if we have a linked list something like this let's say we have a linked

list of integers here then we have multiple blocks of memory at different addresses. Each block in

the linked list is called a node and each node has two fields one to store the data and one to store

the address of the next node. So we call the second field the link field. The only information

that we keep with us about a linked list is the address of the first node which we also call

the head node and this is what we keep passing to all the functions also the address of the head

node. To access an element in the linked list at a particular position we first need to start at

the head node or the first node and at the first node we need to see the address of the second node

and then we go to the second node and see the address of the third node. In the worst case to

access the last element in the list we will be traversing all the nodes in the list. In the average

case we will be accessing the middle element maybe. So if n is the size of the linked list

(that is, the number of elements in the linked list) then we will traverse n by two elements.
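To put the two access costs side by side, here is a small sketch in C. The helper names `element_address` and `list_node_at` are made up for illustration, and the base address 200 with 4-byte integers is simply the example used in the lesson.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
};

/* Array: the address of element i is base + i * element size.
   It is a single calculation no matter how large i is, hence O(1). */
unsigned long element_address(unsigned long base, unsigned long i,
                              unsigned long elem_size)
{
    return base + i * elem_size;
}

/* Linked list: to reach the node at 0-based position `index` we must
   follow the links one by one from the head, hence O(n).
   Returns NULL if the list has no node at that position. */
const struct Node *list_node_at(const struct Node *head, int index)
{
    if (index < 0)
        return NULL;

    const struct Node *current = head;
    for (int i = 0; i < index && current != NULL; i++)
        current = current->next;   /* ask this node for the next one */
    return current;
}
```

With base address 200 and 4-byte integers, `element_address(200, 6, 4)` gives 224, the same figure worked out above.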

So the time taken in the average case also is proportional to number of elements in the linked

list. So we can say that the time complexity in average case is big O of n. So on this parameter

cost of accessing an element arrays score heavily over linked list. So if you have a

requirement where you want to access elements in the list all the time then definitely array is a

better choice. Now the second parameter that we want to talk about is memory requirement or

memory usage. With an array we need to know the size of the array before creating it

because array is created as one contiguous block of memory. So array is a fixed size.

What we typically do is create a large enough array and some part of the array stores our list

and some part of the array is vacant or empty so that we can add more elements in the list.

For example we have an array of seven integers here and we have only three integers in the list.

Rest four positions are unused. There would be some garbage value there.

With linked list, let's say we have this linked list of integers.

There is no unused memory. We ask memory for one node at a time so we do not keep any reserved

space but we use extra memory for pointer variables. And this extra memory requirement for pointer

variables in a linked list cannot be ignored in a typical architecture let's say integer is stored

in four bytes and pointer variable also takes four bytes. So if you see the memory requirement for

this array of seven integers is 28 bytes and the memory requirement for this linked list would be

eight into three where eight is the size of each node four for integer and four

bytes for the pointer variable. So this is 24 bytes. If we add one more element to the list

in the array we will just use one more position. While in linked list we will create one more node

and we'll take another eight bytes. So this will be 32 bytes. Linked list would fetch us a lot of

advantage if the data part is large in size. So in this case we had a linked list of

integers. So integer is only four bytes. What if we had a linked list in which a data part was

some complex type that took 16 bytes. So four bytes for the link and 16 bytes for the data

each node would have been 20 bytes. An array of seven elements, at 16

bytes for each element, would be 112 bytes and linked list of four would be only 80 bytes.
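The byte counts above can be checked with a small sketch in C. The function names are made up for the example, and the sizes are passed in explicitly because the lesson assumes a 4-byte integer and a 4-byte pointer. On many current machines pointers are 8 bytes and the compiler may add padding, so `sizeof` on a real node can give a larger number.

```c
#include <stddef.h>

/* An array reserves every slot up front, whether it is used or not. */
size_t array_bytes(size_t capacity, size_t elem_size)
{
    return capacity * elem_size;
}

/* A linked list pays for the data plus one pointer,
   but only for the nodes that actually exist. */
size_t list_bytes(size_t count, size_t elem_size, size_t ptr_size)
{
    return count * (elem_size + ptr_size);
}
```

For the numbers in the lesson: `array_bytes(7, 4)` is 28 while `list_bytes(3, 4, 4)` is 24 and `list_bytes(4, 4, 4)` is 32. With a 16-byte data part, `array_bytes(7, 16)` is 112 while `list_bytes(4, 16, 4)` is only 80.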

So it all depends. If the data part for the list takes a lot of memory linked list will definitely

consume lot less memory. Otherwise it depends what strategy we are choosing to decide the size of

the array. At any time how much array we keep unused. Now one more point with memory allocation

because arrays are created as one contiguous block of memory. Sometimes when we may want to create a

really really large array then maybe memory may not be available as one large block but if we are

using linked list memory may be available as multiple small blocks. So we will have this problem

of fragmentation in the memory. Sometimes we may get many small units of memory but may not get

one large block of memory. This may be a rare phenomenon but this is a possibility. So this is

also where linked list scores. Because arrays have fixed size once array gets filled and we

need more memory then there is no other option than to create a new array of larger size

and copy the content from the previous array into the new array. So this is also one cost

which is not there with linked list. So we need to keep these constraints and these requirements

in mind when we want to decide for one of these data structures for our requirement.
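The cost of a full array can be seen in a minimal sketch like the one below. The struct `IntList`, the function `list_append`, and the choice to double the capacity are all assumptions for illustration. The lesson only says that a larger array is created and the old content is copied into it.

```c
#include <stdlib.h>
#include <string.h>

/* A list kept in a heap-allocated array:
   `size` elements are in use out of `capacity` slots. */
struct IntList {
    int *items;
    size_t size;
    size_t capacity;
};

/* Append `value` at the end. Returns 1 on success, 0 if memory ran out. */
int list_append(struct IntList *list, int value)
{
    if (list->size == list->capacity) {
        /* Array is full: create a larger array and copy everything, O(n). */
        size_t new_capacity = list->capacity ? 2 * list->capacity : 4;
        int *bigger = malloc(new_capacity * sizeof *bigger);
        if (bigger == NULL)
            return 0;
        if (list->size > 0)
            memcpy(bigger, list->items, list->size * sizeof *bigger);
        free(list->items);
        list->items = bigger;
        list->capacity = new_capacity;
    }
    list->items[list->size++] = value;   /* free slot available: O(1) */
    return 1;
}
```

A list would start out as `struct IntList list = {NULL, 0, 0};` and the caller would `free(list.items)` when done with it.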

Now the third parameter that we want to talk about is cost of inserting an element in the list.

Remember when we are talking about arrays here we are also talking about the possible use

of array as dynamic list. So there can be three scenarios in insertion. First scenario will be

when we want to insert an element at the beginning of the list. Let's say we want to insert number

three at the beginning of the list. In the case of arrays we will have to shift each element

by one position towards the higher index. So the time taken will be proportional to the

size of the list. So this will be big O of n. Let's say n is the size of the list.

This will be big O of n in terms of time complexity. In the case of linked list inserting an element

in the beginning will mean only creating a new node and adjusting the head pointer and the link

of this new node. So the time taken will not depend upon the size of the list. It will be constant.
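The difference between the two insertions at the beginning can be sketched in C as follows. The function names are hypothetical, and the array version assumes the caller has already made sure there is at least one free slot.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Array: every one of the n existing elements moves one position
   towards the higher index before the new value can go in slot 0.
   Returns the new number of elements. */
int array_insert_front(int arr[], int n, int value)
{
    for (int i = n; i > 0; i--)
        arr[i] = arr[i - 1];
    arr[0] = value;
    return n + 1;
}

/* Linked list: create a node, point it at the old first node,
   and hand back its address as the new head. No loop at all. */
struct Node *list_insert_front(struct Node *head, int value)
{
    struct Node *node = malloc(sizeof *node);
    if (node == NULL)
        return head;        /* allocation failed: list unchanged */
    node->data = value;
    node->next = head;
    return node;
}
```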

So for linked list inserting an element at the beginning is big O of one in terms of

time complexity. Inserting an element at the end for an array. Let's say we are talking about dynamic array,

a dynamic list in which we create a new array if it gets filled. If there is space in the

array we just write to the next higher index of the list. So it will be constant time.

So time complexity is O(1) if array is not full. If array is full we will have to create a new array

and copy all the previous content into new array which will take O(n) time where n is the size of

the list. In the case of linked list adding an element inserting an element at the end will

mean traversing the whole list and then creating a new node and adjusting the links. So time taken

will be proportional to n. I'll use this color coding for linked list. Here n is the number of

elements in the list. Now the third case would be when we want to insert in the middle of the list

at some nth position or maybe some ith position. Again in the case of arrays we will have to shift

elements. Now for the average case we may want to insert at the mid position in the array. So we

will have to shift n by two elements where n is the number of elements in the list. So the time

taken is definitely proportional to n in average case. So complexity will be big O of n.

For linked list also we will have to traverse till that position and then only we can adjust

the links even though we will not have any shifting we will have to traverse till that point and in

the average case time taken will be proportional to n and the time complexity will be big O of n.
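For the array side of this third case, a small sketch in C shows where the n by two comes from. The function name `array_insert_at` and the idea of returning the number of shifts are only for illustration, and the caller is assumed to have left at least one free slot in the array.

```c
/* Insert `value` at 0-based index `pos` of an array that currently
   holds n elements (requires 0 <= pos <= n and capacity > n).
   Returns how many elements had to be shifted. */
int array_insert_at(int arr[], int n, int pos, int value)
{
    int shifts = 0;
    for (int i = n; i > pos; i--) {   /* move elements pos..n-1 up by one */
        arr[i] = arr[i - 1];
        shifts++;
    }
    arr[pos] = value;
    return shifts;
}
```

Inserting at the middle of an 8-element array shifts 4 elements, inserting at index 0 shifts all 8, and inserting at the end shifts none.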

If you can see deleting an element will also have these three scenarios and the time complexity

for deleting for these three scenarios will also be the same. And the final point the final parameter

that I want to talk about is which one is easy to use and implement. An array definitely is a lot

easier to use linked list implementation especially in C or C++ is more prone to errors like segmentation

fault and memory leak it takes good care to work with linked list. So this was arrays versus linked

list in our next lesson we will implement linked list in C or C++ we will get our hands dirty with

some real code. So this is it for this lesson thanks for watching. In our previous lessons we

described linked list we saw the cost of various operations in linked list and we also compared

linked list with arrays. So now let us implement linked list the implementation will be pretty

similar in C and C++ there will be slight differences that we will discuss. The prerequisite for this

lesson is that you should have a working knowledge of pointers in C C++ and you should also know the

concept of dynamic memory allocation. If you want to refresh any of these concepts check the description

of this video for additional resources. Okay so let's get started. As we know in a linked list

data is stored in multiple non-contiguous blocks of memory and we call each block of memory a

node in the linked list. So let me first draw a linked list here. So we have a linked list of

integers here with three nodes as we know each node has two fields or two parts one to store the

data and another to store the address of the next node what we can also call link to the next node.

So let's say the address of the first node is 200 and address of the second node is 100 and the

address of the third node is 300 for this linked list. This is only a logical view of the linked

list. So the address part of the first node will be 100 the address of the second node and we will

have 300 here. The address part of the last node will be null which is only a synonym or macro

for address 0. 0 is an invalid address. A pointer variable equal to 0 or

null means that the pointer variable does not point to a valid memory location. The memory block

the address of the memory block allocated to each of the nodes is totally random there is no relation

it's not a guarantee that the addresses will be in increasing order or decreasing order

or adjacent to each other. So that's why we need to keep these links.

Now the identity of the linked list that we always keep with us is the address of the first

node what we also call the head node. So we keep another variable that will be of type pointer to

node and this guy will store the address of the first node and we can name this pointer variable

whatever let's say this pointer variable is named a. The name of this particular pointer variable

that points to the head node or the first node can also be interpreted as the name for the linked

list also because this is the only identity of the linked list that we keep with us all the time.

So let us now see how this logical view can be mapped to a real program in C or C++.

In our program node will be a data type that will have two fields, one to store the data

and another to store the address. In C we can define such a data type as structure.

So let's say we define a structure named node with two fields first field to store the data

the type of the data here is integer. So this will be node for a linked list of integers.

If we wanted a linked list of say double this data type would be double. The second field will be

pointer to node, struct Node*. We can name this link or we can name this next or whatever.

This is C style of declaring node star or pointer to node. If this was C++ we could simply write

node star I would write it this way the C++ way it looks better to me. In our logical view here

this variable A is of type node star or pointer to node. Each of these three rectangles with two

fields are of type node and this field in the node the first field is of type integer and the

second field is of type pointer to node or node star. It is important to know which one is what

in the logical view we should have this visualization before we go on to implement linked list.
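The node type described above might be written in C like this. It is a minimal sketch: the capitalized tag `Node` is a naming choice for the example, and `link` is one of the field names the lesson suggests.

```c
#include <stddef.h>

/* Node for a linked list of integers. */
struct Node {
    int data;            /* first field: the value (double for a list of doubles) */
    struct Node *link;   /* second field: address of the next node, NULL if none */
};

/* In C the type of a pointer to node is spelled `struct Node *`.
   In C++ the same declaration could be shortened to `Node *`. */
```

In the logical view, each rectangle is a `struct Node`, and the variable A is a `struct Node *` holding the address of the first rectangle.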

Okay so let us now create this particular linked list of integers that we are showing here

through our code. To be able to do so we will have to implement two operations one to insert

a node into the linked list and another

operation to traverse the linked list. But before that the first thing that we want to do is that

we want to declare a pointer to the head node a variable that will store the address of the

head node for the sake of clarity I'll write head node here. So I have declared a pointer to node

named A. Initially when the list is empty when there is no element in the list this pointer should

point nowhere so we write a statement something like A is equal to null to say the same. Now with

these two statements what we have done is we have created a pointer to node named A and

this pointer points nowhere so the list is empty. Now let's say we want to insert a node in this list

so what we do is we first create a node creating a node is nothing but creating a memory block

to store a node in C we use the function malloc to create a memory block as argument we pass the

number of bytes that we want in the block so we say that give me a memory block that will be equal

to the size of a node so this call to malloc will create a memory block. This is a dynamically

allocated memory memory allocated during runtime and the only way to work with this kind of memory

is through reference to this memory location through pointers. Let us assume that this memory block

assigned here is at address 200. Now malloc returns a void pointer that gives us the address of

assigned memory block so we need to collect it in some variable so let's say we create a variable

named temp which is pointer to node so we can collect the return of malloc the address in this

particular variable we will need a type casting here because malloc returns void pointer and we are

having temp as pointer to node so now we have created one node in the memory. Now what we need

to do is fill in the data in this particular node and adjust the links which will mean writing the

correct address in the variable a and the link field of this newly created node to do so we will

have to dereference this particular pointer variable temp that we just created. As we know if we put

an asterisk sign in front of the pointer variable we mean dereferencing it to modify the value at

that particular address. Now in this case we have a node which has two fields so once we dereference

if we want to access each of the fields we need to put a dot and the field name after it, like (*temp).data

to access the data and (*temp).link to write to the link field so we will write a statement like this

to fill in value 2 here and we have this temp variable pointing to this right now

and the link part of this newly created node should be null because this is the first

and the last node and the final thing that we need to do is write the address of this newly created

node in a so we will write something like a is equal to temp okay temp was to temporarily

store the address of the node till the time we had not fixed all the links properly we can now

use temp for some other purpose our linked list is intact now it has one node these two lines that

we have written here for dereferencing and writing the values into the new node there is alternate

syntax for this. Instead of writing something like (*temp).data we could also write

temp->data, that is, temp followed by an arrow and data. The arrow is made of two characters, one hyphen and

one right angle bracket, so we can write something like this
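Putting the steps so far together, creating the first node could look roughly like this in C. The helper name `create_node` is an assumption made for the example, since the lesson writes these statements as loose lines rather than as a function. It uses both the dereference-and-dot form and the arrow form so the two can be compared.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *link;
};

/* Create one node holding `value` with its link set to NULL.
   Returns the address of the new node, or NULL if malloc fails. */
struct Node *create_node(int value)
{
    /* malloc returns a void pointer. The cast is optional in C
       but would be required if this were compiled as C++. */
    struct Node *temp = (struct Node *)malloc(sizeof(struct Node));
    if (temp == NULL)
        return NULL;

    (*temp).data = value;   /* dereference, then pick the field */
    temp->link = NULL;      /* arrow form: same as (*temp).link = NULL */
    return temp;
}
```

Starting from an empty list `struct Node *A = NULL;`, the statement `A = create_node(2);` leaves A pointing to a single node that is both the first and the last node.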

and we can rewrite the line below it the same way. To create a memory block in C++

we can use malloc as well as we can use the new operator so in c++ it gets very simple we could

simply write Node* temp = new Node, like this, and we would mean the same thing

this is a lot cleaner and new operator is always preferred over malloc so if you're using c++

new is recommended. So far through our program we created an empty list by creating this pointer

to the head node and assigning the value null to it initially then we created a node and we added

this first node in this linked list when the list is empty and we want to insert a node the logic

is pretty straightforward when the list is not empty we may want to insert a node at the beginning

of the list or we may want to insert a node at the end of the list or we may even want to insert a

node somewhere in the middle of the list at a particular position we will write separate functions

and routines for these different kind of insertions and we will see running code in a compiler

in our coming lessons let's just talk about the logic here in this whatever unstructured code I

have right now so I want to write a code to insert two more nodes each time at the end of the list

we actually want to create the linked list with three nodes having values two four and six

that was our initial example in the beginning okay so let us add two more nodes with values

four and six into this linked list at this stage in our code we already have a variable temp which

is pointing to this particular node we will create a new node and use the same variable name to collect

the address of this new node so we will write a statement like this so a new node is created

and temp now stores the address of this new node which is located at address 100 here once again

we can set the data and then because this is going to be the last node we need to set the link

as null now all we need to do is build the link of this particular node right the address of this

newly created node into the address field of this last node to do so we will have to traverse the

list and we will have to go to the end of the list to do so we will write something like this

we can create a new variable temp1 which will be a pointer to node and initially we can

point to the head node point this variable to the head node by a statement like this we can write

a loop like this now this is generic logic to reach the end of the list it will not be so clear

if we see this logic with only one node as we have in this example let's draw a list with multiple

nodes so we are pointing temp one to the first node here and if the link part of this node

is null we are at the last node else we can move to the next node so temp1 = temp1->link

will get us to the next node and we will go on till we reach the last node

for the last node this particular condition temp1->link != NULL will be false

because the link part will be null and we will exit this while loop so this is our code logic

for traversal of the list all the way till end if we want to print the elements in the list we

will write something like this and we will write print temp1->data inside this while loop but

right now we want to insert at the end of the list and we are only traversing the list

to reach the last node there is one more thing that I want to point out we are using this variable

temp1 and initially storing the address in a in it. We are not doing something like a = a->link

and using the variable a itself to traverse the list because if we modify a we will lose

on the address of the head node so a is never modified the address of the head node whatever

variable stores the address of the head node is never modified only these temporary variables are

modified to traverse the list so finally after all this we will write a statement like

temp1->link = temp. temp1 is pointing here so now this address part is updated and this link

is built so we have two nodes now in the list once again when we want to insert

node with number six in this list we will have to create a new node by this logic then we will

have to traverse the list by this logic so we will point temp one here first and then the loop will

move the temp one to the end let's say this new block is at address 300 so this last line finally

will adjust the link of the node at address 100 to insert a node at the end there is one logic in

these four lines if the list is empty and there is another logic in these remaining lines if the

list is not empty. Ideally we will be writing all this logic in a function
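As a rough preview of how the two cases might sit together in one function, here is a sketch in C. The name `insert_at_end` and the decision to return the head pointer are assumptions for the example. The variable names a, temp and temp1 follow the ones used in this lesson.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *link;
};

/* Add a node with `value` at the end of the list whose head is `a`.
   Returns the head of the list, which changes only if the list was empty. */
struct Node *insert_at_end(struct Node *a, int value)
{
    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return a;               /* allocation failed: list unchanged */
    temp->data = value;
    temp->link = NULL;          /* the new node will be the last node */

    if (a == NULL)              /* case 1: empty list, new node is the head */
        return temp;

    /* case 2: walk to the last node with a temporary pointer,
       so that the address of the head node is never lost. */
    struct Node *temp1 = a;
    while (temp1->link != NULL)
        temp1 = temp1->link;

    temp1->link = temp;         /* build the link from the old last node */
    return a;
}
```

Calling it three times with 2, 4 and 6, each time as `a = insert_at_end(a, value);`, builds the three-node list from the example.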

we will do that in our coming lessons we will implement separate methods to print all the nodes

in a linked list and to insert a node at the end we will implement a separate method to insert a

node at the beginning of the list and at a particular position in the list so this is all for this

lesson thanks for watching in our previous lesson we saw how we can map the logical view of a linked

list into a c or c++ program we saw how we can implement two basic operations one traversal of

the linked list and another inserting a node at the end of the linked list in this lesson we will

see a running code that will insert a node at the beginning of the linked list so let's get started

I will write a c program here the first thing that we want to do in our program is that we want

to define a node. A node will be a structure in C. It will have two fields, one to store the data

let's say we want to create a linked list of integers so our data type will be integer if we wanted

to create a linked list of characters then our type would be character here so we will have another

field that will be a pointer to node that will store the address of the next node we can name this

variable link or some people also like to name this variable next because it sounds more

intuitive this variable will store the address of the next node in the linked list

in c whenever we have to declare node or pointer to node we will have to write

struct node or struct node star in c++ we will have to write only node star and that's one difference

okay so this is the definition of our node now to create a linked list we will have to create

a variable that will be pointer to node and that will store the address of the first node in the

linked list what we also call the head node so I will create a pointer to node here

struct node star we can name this variable whatever often for the sake of understanding

we name this variable head now I have declared this variable as a global variable I have not

declared this variable inside any function and I'll come back to why I'm doing so now I'll write

the main method this is the entry point to my program the first thing that I want to do is I want to

say head is equal to null which will mean that this pointer variable points nowhere so right now

the list is empty so far what we have done here in our code is that we have created a global

variable named head which is of type pointer to node and the value in this pointer variable is null

so so far the list is empty now what I want to do in my program is that I want to ask the user

to input some numbers and I want to insert all these numbers into the linked list so I'll print

something like how many numbers let's say the user wants to input n numbers so I'll collect this

number in this variable n and then I'll define another variable I to run the loop and so I'm running

a loop here if it was c++ I could declare this integer variable right here inside the loop

now I'll write a print statement like this and I'll define another variable x and each time

I'll take this variable x as input from the user and now I will insert this particular number x

this particular integer x into the linked list by making a call to the method insert and then

each time we insert we will print all the nodes in the linked list the value of all the nodes in

the linked list by calling a function named print there will be no argument to this function print

of course we need to implement these two functions insert and print let me first write down the

definition of these two functions so let us implement these two functions insert and print

let us first implement the function insert that will insert a node at the beginning of the linked

list now in the insert function what we need to do is we first need to create a node in c we can

create a node using malloc function we have talked about this earlier malloc returns a pointer to

the starting address of the memory block. We have to typecast because malloc returns a

void pointer and we need a pointer to node, a variable that is pointer to node, and then only if we

dereference it using an asterisk sign then we will be able to access the fields of

the node so the data part will be x and we have an alternate syntax for this particular

syntax we could simply write something like temp and this arrow and it will mean the same thing

and this is more common with these two lines in the insert function all we are doing is we are

creating a node let's say we get this node and let's assume that the address that we get for this

node is hundred now there is a variable temp where we are storing the address we can do one thing

whenever we create a node we can set data to whatever we want to set and we can set the link field

initially to null and if needed we can modify the link field so I'll write one more statement

temp->next is equal to NULL. Remember temp is a pointer variable here and we are de-referencing

the pointer variable to modify the value at this particular node temp will also take some space in

the memory that's why I have shown this rectangular block for both the pointer variables head and temp

and node has two parts one for the pointer variables and one for the data so this part the link part

is null we can either write null here or we can write it like this it's the same thing logically it

means the same now if we want to insert this node in the beginning of the list there can be two

scenarios one when the list is empty like in this case so the only thing that we need to do is

we need to point head to this particular node instead of pointing to null so I will write a

statement like head is equal to temp and the value in head now will be address hundred and that's

what we mean when we say a pointer variable points to a particular node we store the address of that

node so this is our linked list after we insert the first node let us now see what we can do to

insert a node at the beginning if the list is not empty like what we have right now once again we

can create a node fill in the value x here that is passed as argument initially we may set the link

field as null and let's say this node gets address 115 the memory and we have this variable temp

through which we are referencing this particular memory block now unlike the previous case if we

just set head is equal to temp now this is not good enough because we also need to build this link

we need to set the next or the link of the newly created node to whatever the previous head was

so what we can do is we can write something like if head is not equal to null or if the list is

not empty first set temp->next equal to head so we first build this link the address here will be

hundred and then we say head equal temp so we cut this link and point head to this newly created

node and this is our modified linked list after insertion of this second node at the beginning of

the list now one final thing here this particular line the third line temp dot next equal null

this is getting used only when the list is empty if you see when the list is empty head is already

null so we can avoid writing two statements we can simply write this one statement temp->next

equal head and this will also cover the scenario when the list is empty now the only thing remaining

in this program to get this running is the implementation of this print function so let us

implement this print function now what i will do here is i'll create a local variable which is

a pointer to node named temp and i need to write struct node here i keep missing this in c you

need to write it like this and i want to set this as address of the head node so this global

variable has the address of the head node now i want to traverse the linked list so i will write

a loop like this while temp is not equal to null i'll keep going to the next node using this statement

temp is equal to temp dot next and at each stage i'll print the value in that node as temp dot data

now i'll write two more print statements one before this while loop and one after this while loop

to print an end of line now why did we use a temporary variable because we do not want to

modify head because we will lose the reference of the first node so first we collect the address

in head in another temporary variable and we are modifying the addresses in this temporary

variable using temp is equal to temp dot next to traverse the list now let us now run this program

and see what happens so this is asking how many numbers you want to insert in the list

let's say we want to insert five numbers initially the list is empty let's say the first number that

we want to insert is two at each stage we are printing the list so the list is now two the first

element and the last element is two we will insert another number the list is now five two five is

inserted at the beginning of the list again we inserted eight and eight is also inserted at

the beginning of the list okay let's insert number one the list is now one eight five two

and finally I inserted number 10 so the final list is 10 1 8 5 2 this seems to be working
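The insert and print functions described so far can be put together as a minimal sketch in C. It assumes the global head used in the lesson; the capitalized names Node, Insert and Print are illustrative choices, and the check on malloc's result is an addition that the lesson does not discuss.

```c
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Global pointer to the first node; NULL means the list is empty. */
struct Node *head = NULL;

/* Insert x at the beginning of the list. */
void Insert(int x) {
    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL) {
        return; /* allocation failed; the list is left unchanged (assumed policy) */
    }
    temp->data = x;
    temp->next = head; /* also correct for an empty list, where head is NULL */
    head = temp;
}

/* Print every value from the first node to the last. */
void Print(void) {
    struct Node *temp = head; /* walk with a copy so head is not modified */
    printf("List is:");
    while (temp != NULL) {
        printf(" %d", temp->data);
        temp = temp->next;
    }
    printf("\n");
}
```

Calling Insert with 2, 5, 8, 1 and 10 in that order, with a call to Print after each one, reproduces the run described above, because every new value lands in front of the previous head.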

now if we were writing this code in c++ we could have done a couple of things we could have written

a class and organized the code in an object oriented manner we could also have used new operator

in place of the malloc function and now coming back to the fact that we have declared this head

as global variable what if this was not a global variable this was declared inside this main function

as a local variable so I'll remove this global declaration now this head will not be accessible

in other functions so we need to pass address of the first node as argument to other functions

to both these functions print and insert so to this print method we will pass let's say we

name this argument as head now we can name this argument as head or a or temp or whatever

if we name this argument as head this head in print will be a local variable of print and will not be

this head in main these two heads will be different these two variables will be different when the

main function calls print passing its head then the value in this particular head in the main

function is copied to this another head in the print function so now in the print function we

may not use this temp variable what we can do is we can use this variable head itself to traverse the

list and this should be good we are not modifying this head here in the main similarly to the insert

function we will have to pass the address of the first node and this head again is just a copy

this is again a local variable so after we modify the linked list the head in main method should

also be modified there are two ways to do it one we can return the pointer to node from

this method so in the main method insert function will take another argument head and we will have

to collect the return into head again so that it is modified now this code will work fine

whoops i forgot to write a return here return head and we can run this program like before
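As a rough sketch of this first approach in C, with head as a local variable of the caller: Insert receives a copy of head and returns the new first node, and Print traverses using its own copy. The function names are illustrative, and the malloc check is an added assumption.

```c
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Insert x at the beginning and return the new head.
   The caller must store the result: head = Insert(head, x); */
struct Node *Insert(struct Node *head, int x) {
    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL) {
        return head; /* allocation failed; return the list unchanged */
    }
    temp->data = x;
    temp->next = head;
    head = temp; /* changes only this function's local copy */
    return head;
}

/* head here is a copy, so advancing it does not affect the caller's head. */
void Print(struct Node *head) {
    printf("List is:");
    while (head != NULL) {
        printf(" %d", head->data);
        head = head->next;
    }
    printf("\n");
}
```

If the caller writes Insert(head, x) without assigning the result, its own head keeps the old address and the new node is never reachable from it.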

we can give all the values and see that the list is building up correctly there was another way

of doing this instead of asking this insert function to return the address of head we could

have passed this particular variable head by reference so we could have passed insert ampersand

head head is already a pointer to node so in the insert function we will have to receive

pointer to pointer node star star and to avoid confusion let's name this variable something

else this time let's name this pointer to head so to get head we will have to write something like

we will have to dereference this particular variable and write asterisk pointer to head

everywhere and the return type will be void sometimes we want to name this variable as head

this local variable as head doesn't matter but we will have to take care of using it properly

now this code will also work as you can see here we can insert nodes and this seems to be going well
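The second approach can be sketched like this in C. The parameter name pointerToHead follows the naming suggested above; the caller passes the address of its own head, so no return value is needed. The malloc check is an added assumption.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Insert x at the beginning of the list whose head pointer lives at
   *pointerToHead. Called as Insert(&head, x). */
void Insert(struct Node **pointerToHead, int x) {
    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL) {
        return; /* allocation failed; the list is left unchanged */
    }
    temp->data = x;
    temp->next = *pointerToHead; /* old first node, or NULL if the list is empty */
    *pointerToHead = temp;       /* writes through to the caller's head */
}
```

Every use of the head inside the function goes through the dereference, which is the extra care mentioned above.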

if you do not understand this concept of scope you can refer to the description of this video for

additional resources so this was inserting a node at the beginning of the linked list thanks for watching

in our previous lesson we had written code to insert a node at the beginning of the linked list

now in this lesson we will write program to insert a node at any given position in the linked list

so let me first explain the problem in a logical view let's say we have a linked list of integers

here there are three nodes in this linked list let us say they are at addresses 200 and 250

respectively in the memory and we have a variable head and that is pointer to node that stores the

address of the first node in the list now let us say we number these nodes we number these positions

on a one based index so this is the first node in the list and this is the second node and this

is the third node and we want to write a function insert that will take the data to be inserted in

the list and the position at which we want to insert this particular data so we will be inserting

a node at that particular position with this data there will be a couple of scenarios the list could

be empty so this variable head will be null or this argument being passed to the insert function

the position n could be an invalid position for example 5 is an invalid position here

for this linked list the maximum possible position at which we can insert a node in this list will

be 4 if we want to insert at position 1 we want to insert at the beginning and if we want to

insert at position 4 we want to insert at end so our insert function should gracefully handle

all these scenarios let us assume for the sake of simplicity for the sake of simplifying our

implementation that we always give a valid position we will always give a valid position

so that we do not have to handle the error condition in case of invalid position

the implementation logic for this function will be pretty straightforward

we will first create a node let's say in this example we want to insert a node with value

8 at third position in the list so i'll set the data here in the node the data part is 8

now to insert a node at the nth position we will first have to go to the n minus 1th node

in this case n is equal to 3 so we will go to the second node

now the first thing that we will have to do is we will have to set the link field of this newly

created node equal to the link field of this n minus 1th node so we will have to build this link

let's say the address that we get for this newly created node is 150 once we build this link we

can break this link and set the link of

this n minus 1th node as address of this newly created node we may have special cases in our

implementation like the list may be empty or maybe we may want to insert a node at the beginning

let's say we will take care of special cases if any in our actual implementation

so now let's move on to implement this particular function in our program

in my c program the first thing that i need to do is i want to define a node so node will be a

structure and we have seen this earlier so node has these two fields one data of type integer and

another next of type pointer to node now to create a linked list the first thing that i need to create

is a pointer to node that will always store the address of the first node or the head node in

the linked list so i will create struct node star let's name this variable head and once again i

have created this variable as a global variable to understand linked list implementation we need

to understand what goes where what variable sits in what section of the memory and what is the scope

of these variables what goes in the stack section of the memory and what goes in the heap section

of the memory so this time as we write this code we will see what goes where in the main method

first i'll set this head as null to say that initially the list is empty so let us now see what

has gone where so far in our program in what section of the memory and the memory that is

allocated to our program or application is typically divided into these four parts or these four sections

we have talked about this in our lesson on dynamic memory allocation there is a link to our lesson

on dynamic memory allocation in the description of this video i'll quickly talk about what these

sections are one section of the applications memory is used to store all the instructions that need

to be executed another section is allocated to store the global variables that live for the entire

lifetime of the program of the application now one section of the memory which is called stack

is used to store all the information about function call executions to store all the local

variables and these three sections that we talked about are fixed in size their size is decided at

compile time the last section that we call heap or free store is not fixed and we can request

memory from the heap during runtime and that's what we do when we use malloc or new operator
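To make the sections concrete, here is a small sketch in C showing one variable of each kind. The names counter and MakeNumber are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>

/* Global variable: lives for the entire lifetime of the program. */
int counter = 0;

/* Copies value into a block requested from the heap at run time. */
int *MakeNumber(int value) {
    int local = value;                    /* local: lives in this call's stack frame */
    int *onHeap = malloc(sizeof *onHeap); /* block requested from the heap */
    if (onHeap != NULL) {
        *onHeap = local;
    }
    counter++;
    /* The variables local and onHeap disappear when the call returns, but the
       heap block stays allocated until the caller passes its address to free. */
    return onHeap;
}
```

The heap block has no variable name of its own; the returned address is the only way to reach it.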

now i have drawn these three sections of the memory stack heap and the section to store the

global variables in our program we have declared a global variable named head which is a pointer to

node so it will go and sit here and this variable is like anyone can access it initially value here

is null now in my program what i want to do is i first want to define two functions insert

and this function should take two arguments data and the position at which i want to insert a node

and insert that particular node at that position insert data at that position in the list

and another function print that will simply print all the numbers in the linked list

now in the main method i want to make a sequence of function calls first i want to insert number

two the list is empty right now so i can only insert at position one so after this insert the list

will be having this one number this particular number two and let's say again i want to insert

number three at position two so this will be our list after this insertion and i will make

two more insertions and finally i'll print the list so this is my main method i could have also

asked a user to input a number and position but let's say we go this way this time

now let us first implement insert i'll move this print above so the first thing that i want to do

in this method is i want to create a node so i will make a call to malloc in c++ we can simply

write new node in place of this call to malloc and this looks a lot cleaner let's go the c++ way this time

now what i can do is i can first set the data field and set the link initially

as null i have named this variable temp one because i want to use another temp variable

in this function i'll come to that in a while now we first need to handle one special case

when we want to insert at the head when we want to insert at the first position

so if n is equal to one we simply want to set the link field of the newly created node as whatever

the existing head is and then adjust this variable to point to the new head which will be this newly

created node and we will be done at this stage so we will not execute any further and return from

this function if you can see this will work even when the list is empty because the head will be

null in that case i'll show a simulation in the memory in a while so hold on till then

things will be pretty clear to you after that now for all other cases we will first need to go to

the n-1th node as we had discussed in our logic initially so what i'll do is i'll create another

pointer to node name this variable temp two and we will start at the head and then we will run a loop

and go to the n-1th node something like this we will run the loop n-2 times because right now we

are pointing to head which is the first node so if we do this temp two equal temp two dot next

n-2 times we will be pointing temp two to n-1th node and now the first thing that we need to do

is set the next or the link field of newly created node as the link field of this n-1th node and then

we can adjust the link of this n-1th node to point to our newly created node and now i'm writing this

print here i've written this print here we have used a temporary variable a temporary pointer to

node initially pointed it to head and we have traversed the whole list okay so let

us now run this program and see what happens we are getting this output which seems to be correct
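For reference, here is a sketch of the insert function in C. The lesson switches to the C++ new operator at this point; this sketch keeps malloc so that it stays plain C. It assumes a valid position n, as agreed earlier, and the malloc check is an addition.

```c
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct Node *head = NULL; /* global, as in the lesson */

/* Insert data at 1-based position n. Assumes 1 <= n <= length + 1. */
void Insert(int data, int n) {
    struct Node *temp1 = malloc(sizeof *temp1);
    if (temp1 == NULL) {
        return; /* allocation failed; the list is left unchanged */
    }
    temp1->data = data;
    temp1->next = NULL;
    if (n == 1) {
        temp1->next = head; /* works for an empty list too, since head is NULL */
        head = temp1;
        return;
    }
    struct Node *temp2 = head;
    for (int i = 0; i < n - 2; i++) {
        temp2 = temp2->next; /* after n-2 steps temp2 is the (n-1)th node */
    }
    temp1->next = temp2->next; /* build the new link first */
    temp2->next = temp1;       /* then redirect the (n-1)th node */
}

void Print(void) {
    struct Node *temp = head;
    printf("List is:");
    while (temp != NULL) {
        printf(" %d", temp->data);
        temp = temp->next;
    }
    printf("\n");
}
```

The lesson names only the first two calls, Insert(2, 1) and Insert(3, 2). Following them with Insert(4, 1) and Insert(5, 2) is one assumed pair of calls that produces the list 4 5 2 3.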

the list should be four five two three in this order now i have this code i'll run through this

code and show you what's happening in the memory when the program starts execution initially the

main method is invoked some part of the memory from the stack is allocated for execution of a

function all the local variables and the state of execution of this function is saved in this

particular section we also call this stack frame of a function here in this main method we have

not declared any local variable we just set head to null which we have already done here now the

next line is a call to function insert so the machine will set the execution of this particular

method main on hold and go on to execute this call to insert so insert comes into this stack

and insert has a couple of local variables it has two arguments data and this variable n

this stack frame will be a little larger because we will have a couple of local variables

and now we create this another local variable which is a pointer to node temp1 and we use the new operator

to create a memory block in the heap and this guy temp1 initially stores the address of this

memory block let's say this memory block is at address 150 so this guy stores the address 150

when we request some memory to store something on the heap using new or malloc we do not get a

variable name and the only way to access it is through a pointer variable so this pointer variable

is the remote control here kind of so here when we say temp1 dot data is equal to this much

through this pointer which is our remote we are going and writing this value to here

and then we are saying temp1 dot next equal null so null is nothing but address 0 so we are writing

address 0 here so we have created a node and in our first call n is equal to 1 so we will come

to this condition now we want to set temp1 dot next equal head temp1 dot next is

this section this second field and this is already equal to head head is null here and this is

already null null is nothing but 0 that is the only reason setting temp1 dot next equal to head works for

the empty case because head would be null and now we are saying head is equal to temp1

so head guy now points to this because it stores address 150 like temp1

and in this first call to insert after this we will return so the execution of insert will

finish and now the control returns to the main method we come to this line where we make another

call to insert with different arguments this time we pass number 3 to be inserted at position 2

now once again memory in the stack frame will be allocated

for this particular call to insert the stack frame allocation is corresponding to a particular

call so each time the function execution finishes all the local variables are gone from the memory

now once again in this call we create a new node we keep the address initially in this temporary

variable temp1 now let's say we get this node at address 100 this time now n is not equal to 1 we

will move on to create another temporary variable temp2 now we are not creating a new node and

storing the address in temp2 here we are saying temp2 is initially equal to head so we store the

address 150 so initially we make this guy point to the head node and now we want to run this loop

and want to keep going to the next node until we reach n minus 1th node in this case n is equal to

2 so this loop will not execute this statement even once n minus 1th node is the first node itself

now we execute these two lines the next of the newly created node will be set first so we will

build this link oops no temp2 dot next is 0 so even after we set temp1 dot next this will still be 0

and now we are setting temp2 dot next as temp1 so we are building this link and now this call to

insert will finish so we go back to the main method so this is how things will happen for

other calls also so after everything we have inserted when we will reach this print statement

in the main function our list will be something like this in the memory this is a little messy

i've chosen these addresses as per my convenience for the sake of example and now print will execute

and once again i'm using a temp variable in print by now it should have been clear to you

why we use temp variable again and again and why this variable head that stores the address of

the first node is so important now what if this head was not global what if we would have declared

this head inside the main method we have talked about this in our previous lesson head will not

be accessible everywhere so in each call to these functions in each call to insert we will have to

return some value from the function to update this head or we will have to pass this head by

reference we have talked about this in our previous lesson so this is it for this lesson

in our next lesson we will see program to delete a node at a particular position in the list

so thanks for watching in our previous lesson we wrote program to insert a node at nth position

or a given position in a list in a linked list now in this lesson we will write a program to delete

a node at any given position in a linked list so once again i have drawn a linked list here

we have four nodes in this list at addresses 100, 200, 150 and 250 respectively so this is

my example of a linked list of integers and let's say we number the positions on a one based index

so this is the first node in the list and this is the second node this is the third node and this

is the fourth node when we talk about deleting a node from the linked list we will have to do two

things first we will have to fix the links so that the node is no more a part of the list

let's say in this case we want to delete the node at third position so we will go to the second node

for nth node we will have to go to the n minus 1th node and we will have to set the link part of

the n minus 1th node as the link of the nth node which will be the n plus 1th node so we will cut

this link and now this node at address 150 is not part of the linked list because when we will

traverse the linked list we will go from address 100 to 200 and from 200 we will go to 250

this is one scenario for deletion in which we have a node before and a node after

there will be special cases like we may want to delete the node at the first position or the

head itself in that case we will have to point head to the second node we will have to build this

link now we will talk about all these special cases in our implementation let's first understand

the logic now fixing the links is not good enough because all that we do when we fix the links

is that we detach the node from the linked list so that it is no more accessible but it is still

occupying space in the memory as we know a node is allocated space from what we call the dynamic

memory or the heap section of the memory we have talked about this earlier in C or C plus plus

we have to explicitly free this memory when we are done using it because it is not automatically

deallocated and memory being a crucial resource we do not want to consume it unnecessarily when

we do not need it so the second thing that we will have to do is we will have to free the space

that's being taken by the node and that's when the node will actually be deleted from the memory

so let us now write code for this I'm writing my C program here the first thing that I have

done is I have defined a node which is a structure with two fields one to store data and another

to store address of the next node so the second field is a pointer to node now to create a linked

list we will have to first create a pointer to node a variable which is pointer to node and

that will store the address of the head node or the first node in the list and now I want to define

three functions first insert function that will take some value some data to be inserted into the

list and always insert this value at the end of the list then I want to define a print function

that will print all the elements in the list now we have defined this variable head as a global

variable so it will be accessible to all these functions and the third function that I want to

write is delete that will take the position n of the node to be deleted and delete the node at

that particular position we will go back to implementing these methods first I'll write the

main method so in the main method first what I'll do is I'll set head as null so at this stage

the list is empty and then I'll make a couple of calls to insert function to insert some integers

in the list so after this fourth insert the list will be two four six five because we are always

inserting at the end of the list this insert function will insert the node at the end of the list

now what I want to do in my main method is I want to ask a user for a position and I'll input this

position from the console and then I'll delete the node at this particular position and then

I'll print the whole linked list now let's also make a call to print after all the inserts

okay so this is what we want to do in our main method we want to insert four integers

in a linked list to create a list two four six five in this order and then I want to print the list

then I want to input a number from the console and delete the node at that particular position

now let us assume that we will always give a valid position and in my implementation also

I will not handle the error condition when position will not be valid

we have seen implementation of insert and print earlier so I will not go into their implementation

details what I'll do now is I'll implement delete function now in my delete function let's first

handle the case when there is a node before the node that we want to delete so we have an n minus

one-th node what I'll do is I'll first create a temporary variable that is a pointer to node

and point this to head and using this temporary variable we will go to n minus one-th node to go

to the n minus one-th node we will have to run a loop n minus two times and we will have to

do something like this temp1 is equal to temp1.next now what I'll do is I'll create a variable to

point to the nth node name this temp2 and this will be equal to temp1.next and now I can fix the link

I can say that adjust the link section the link part of n minus one-th node

to point to n plus one-th node which will be temp2.next now our link is fixed

and this variable temp2 stores the address of the nth node so we can make a call to free

function now the free function deallocates whatever memory was allocated through malloc if we were using

c++ and had used the new operator we should have said delete temp2

okay now we should be good this much code will work for scenarios when we have an n minus one-th

node and even if there is no n plus one-th node if n plus one-th position is null this will work

for that scenario I'll leave that as an exercise for you to validate now we have not handled

one special case when we want to delete the head node so if n is equal to one then what we want

to do is we just want to set head as temp1.next temp1 is right now equal to head and now head has

moved on to point to the second node and temp1 still points to the first node so links are fixed

and we can free the first node which is now detached from the linked list because head is now

pointing to the second node okay so this is our delete function I have missed one thing here

for n equal to one we should not execute this later section of the code so either we put an else

statement after this or what we can do is we can say return after we execute these statements

for this condition now this code should work if I've got everything right so let us now run this

and see what happens I have already written the insert and print functions I'll come back to this

main function this is my list 2 4 6 5 and I can enter any of the positions one two three or four

so let's first say we want to delete the head node and we are printing the list after deleting

a particular node so the list now is 4 6 5 this seems to be correct let us run this again and

this time I delete number five from position four the list is now 2 4 6 which is correct again

similarly if I enter position two the list is 2 6 5 which is correct so we seem to be good
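Here is a sketch in C of the delete function together with an insert that always appends at the end. The lesson does not show this version of insert, so its body here is an assumption, as are the capitalized function names. Delete assumes a valid position n, as stated earlier.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct Node *head = NULL; /* global, as in the lesson */

/* Append data at the end of the list (assumed implementation). */
void Insert(int data) {
    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL) {
        return; /* allocation failed; the list is left unchanged */
    }
    temp->data = data;
    temp->next = NULL;
    if (head == NULL) {
        head = temp;
        return;
    }
    struct Node *last = head;
    while (last->next != NULL) {
        last = last->next;
    }
    last->next = temp;
}

/* Delete the node at 1-based position n. Assumes 1 <= n <= length. */
void Delete(int n) {
    struct Node *temp1 = head;
    if (n == 1) {
        head = temp1->next; /* head now points to the second node, or NULL */
        free(temp1);
        return;             /* do not fall through to the general case */
    }
    for (int i = 0; i < n - 2; i++) {
        temp1 = temp1->next;        /* temp1 ends at the (n-1)th node */
    }
    struct Node *temp2 = temp1->next; /* the nth node */
    temp1->next = temp2->next;        /* (n+1)th node, or NULL at the end */
    free(temp2);
}
```

Fixing the link before calling free matters: once a node is freed its next field must not be read.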

I'll quickly walk you through this code in the logical view to make things further clear

let's say we first make a call to delete node from the first position that is we want to delete

the head node so in this code what we are doing is we are first creating a variable temp1 which is

a pointer to node initially temp1 is equal to head so it stores the address 100 so it points to the

head node now n is equal to 1 so we come to this instruction head is equal to temp1 dot next

actually this is temp1 arrow next but while reading we read this as temp1 dot next this is

nothing but syntactic sugar for this expression asterisk temp1 in parentheses dot next
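The two spellings can be compared side by side in a short C sketch. The function names NextViaArrow and NextViaDeref are hypothetical and exist only for this comparison.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
};

/* Reads the next field with the arrow operator. */
struct Node *NextViaArrow(struct Node *temp1) {
    return temp1->next;
}

/* Reads the same field by dereferencing first, then using dot.
   The parentheses are required: *temp1.next would parse as *(temp1.next). */
struct Node *NextViaDeref(struct Node *temp1) {
    return (*temp1).next;
}
```

Both functions return the same address for any valid node, which is why the arrow form is read aloud as a plain dot in these lessons.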

so we are de-referencing this pointer variable to go to this node and then accessing the next

field of this node now we are saying head is equal to temp1 dot next so head is now 200

so we are building this link and breaking this link and now in the next line we say free temp1

so we want to free the memory which is being pointed to by this variable temp1

temp1 still points to this node at address 100 so this node now will be cleared from the memory

and now we return so this function does not execute any further it finishes its execution

once the function execution finishes temp1 which was a local variable also gets cleared from the

memory head is a global variable so it does not get cleared this is how we know the linked list

this is the identity of the linked list this particular variable head let's run through this code

again and this time i want to delete the node at third position in the list i have drawn this

initial list so once again we create this variable temp1 we say that the address here is equal to

100 so it points to the head node or the first node and now n is not equal to 1 it is equal to

3 so we come to this particular loop n is equal to 3 so this loop will execute exactly once

this statement will execute exactly once so temp1 will now move to address 200 so temp1 is now pointing

to the second node this is what we wanted to do we wanted temp1 to point to n minus 1th node n

is 3 here now we create another variable another pointer to node temp2 and we set this guy as temp1

dot next temp1 dot next is 150 so we set this guy as 150 so this guy points to the nth node

or the third node now in the next line we are saying that temp1 dot next this value which is 150

right now is now temp2 dot next address of the n plus 1th node or fourth node so this guy will

now be 250 so we are building this link and we are breaking this link so we have fixed the links

and now finally we are saying that free the memory which is being pointed to by temp2

so now this third node the memory block will be deleted from the memory and once this function

execution finishes all the local variables temp1 and temp2 will be cleared and this is what the list

will be this node at address 250 will now be the third node so this was deleting a node at a

particular position in the linked list now we can also have a problem where we may want to delete

a node with a particular value now you can try implementing it in the coming lessons we will see

more problems on linked list so thanks for watching in our lessons on linked list so far we have

implemented some of the basic scenarios like inserting a node in linked list and deleting a node

from linked list in this lesson we will write code to reverse a linked list this is one of the most

favorite interview questions and this is a really interesting problem so let me first define the

problem let's say we have been given a linked list of integers like this so this is our input

we have four nodes in this linked list at addresses 100, 200, 150 and 250 respectively

I always write these addresses in the logical view because it's really important that we visualize

how things are in the memory and what is what like this first node that we also call the head node

is being pointed by this particular variable named head so this variable is basically storing the

address of the head node now this variable is only a pointer this is not the head node itself

and we do not have any other identity of the linked list except the address of the head node

so given a linked list like this if we have to reverse it and by reversing we do not mean moving

around data like we cannot move five at address 100 two at address 250 and do something like this

we actually have to adjust the links so our output should be something like this the head

pointer should point to this node at address 250 and we should go like 250 to 150 to 200 to 100

and this node at address 100 should have address 0 or null in each of these nodes this first field

in red is the data part and the second field is the address part so this is what we will get when

we will reverse the list there are two approaches to solve this problem one is an iterative approach

where we will be using a loop we will traverse through the list and at each step we will reverse

one of the links another solution is using recursion in this lesson we will try to understand

the iterative solution so coming back to our input list the iterative solution is relatively

easier to understand what we can do is we can traverse the whole list and as we go to each node

we can adjust the link part of that node to make it point to the previous node instead of the next

node so we will start at the first node at each step we want to reverse the link so we want to

make the node point to the previous node instead of the next node for the first node there is no

previous node so let's say the previous node is null and now we want to cut this particular link

and we want to build this particular link so we will simply change the address field to 0

and we have reversed the link part of this particular node and now we will go to the next

node in the list we will come to this node of course the question would be how would we

go to the next node if we have broken this link here we will come back to that in our implementation

details let's say we are able to traverse the list and go to each of the nodes at each step

let's say we store all the relevant information to do that in some temporary variables now at this

node again we will reverse the link so the address part will be set as 100 here now we will go to the

next node at address 150 once again to reverse the link we will set the address as 200 here

so we will break this link and basically we are building this link and now we will go to address

250 the next node we will set the address 150 here so we will cut this link and build this link

and finally when we have reached the last node we will adjust the address in this

head variable to 250 so this particular variable this particular pointer

will point to this node at address 250 and our linked list is reversed now

so let us implement this particular logic in a real c program I will redraw the original input

list in my C code I will define node as a structure like this this is how we have defined a node

in all our previous lessons so there will be two fields one to store the data which will be of type

integer and another to store the address of the next node we will name this field next and it

will be of type pointer to node and let's say head is a global variable so head is a pointer to node

head is a variable which is a pointer to node and it is a global variable so it is accessible to all

the functions we do not need to pass it around to functions now all I want to do in my code is

I want to write a reverse function that will reverse the linked list which is pointed to

by this particular pointer head as we said we will traverse the whole list and at each step

we will modify the link field of the node to make it point to the previous node instead of the next

node so how do we traverse the list we would traverse the list in our C code something like this

we will first take a variable which will be pointer to node let's say we will name it temp

then first we will set temp to head by saying this we will make temp point to

the first node and then we will run a loop like this we will say that while temp is not

equal to null take temp to the next address with a statement like temp is equal to temp dot next
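
The traversal loop just described can be sketched in C as below. This is a minimal sketch, assuming the node structure from the lesson (an integer data field and a next pointer). The function name CountNodes and the counter are illustrative additions that are not part of the lesson; they only give the loop an observable result.

```c
#include <stddef.h>

struct Node {
    int data;            /* value stored in the node */
    struct Node* next;   /* address of the next node, NULL for the last node */
};

/* Hypothetical helper: walks the list with a temp pointer, as described,
   and counts how many nodes were visited. */
int CountNodes(struct Node* head) {
    int count = 0;
    struct Node* temp = head;    /* temp points to the first node */
    while (temp != NULL) {       /* stop once we step past the last node */
        count++;
        temp = temp->next;       /* take temp to the next address */
    }
    return count;
}
```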

in our problem here we don't just have to traverse the list as we traverse the list we have to

reverse the link so we have to set the address field of a particular node as the address of the

previous node instead of the next node now in a linked list we would always know the address

of the next node but we would never know the address of the previous node so as we traverse the

list we will have to keep track of the previous node in another variable so what I will do here is

something like this I will also declare a variable named previous and initially set it to null

because for the first node or the head node the previous node is null and now in my loop we will

have to update both these variables the variable temp that will store the current node

and the variable prev that will store the address of the previous node and now in my loop I can do

something like this at each step if temp is our current node as we are traversing the list

then we will say that temp dot next is equal to previous so we will set the link part of the

current node as the address of the previous node in our example here at the first step we will say

that temp dot next will be 0 null is nothing but address 0 so we will cut this link and we will

build this link now we should be able to move temp to 200 now and we should be able to move

previous to 100 now in the next step but there is a problem as soon as we adjusted the link of

this particular node at address 100 to make it point to null we lost the address of the next

node so how do we move temp to this particular node at address 200 we cannot set temp equal temp

dot next now if we set temp equal temp dot next now we will go to null so this is a problem

so at each step in our iteration before we set the link field of the current node to make it point

to the previous node we should store the address of the next node in

another temporary variable so what I'll do here is something like this first of all I want to name

this particular variable temp as current to mean that this is the current node at any stage in my

iteration so we initially set current to head and then we are running the loop as while current is

not equal to null and then I've also declared one more temporary pointer variable named next

what I'll do at each step in my iteration inside the while loop is that first I'll say

something like next is equal to current dot next so first I'll store the address of the next node

in this particular variable next so in our example here for the first node initially things will look

something like this now we can set the link part of the current node as address of the previous

node with a statement like this so when we will write the address 0 here initially we will break

this link and create this link we will not lose the information about the next node now we can

redefine our previous and current so we will first move previous to current and then we will move

current to next please note that this particular variable next is a local variable in the reverse

function and when we say something like current dot next we mean the link field in the node while

when we simply say next we mean this particular local pointer variable so they're

different what we write in the code is not current.next actually it is current->next which is an alternate

syntax for (*current).next so we use the asterisk operator to dereference that address

and then we access the next field for the sake of saying we say current dot next or temp dot next

so with these two lines in our loop we are resetting our previous and current pointers

this is how we are traversing the list if you see in the next iteration current is 200 it is not

equal to null null is 0 so we will go to this particular statement next is equal to current dot

next so next we'll now store the address 150 and now we will say current dot next is equal to previous

so we will cut this link previous is 100 right now so we will set 100 here so basically we will

build this particular link and then we will move we will first move previous to current

and then move current to next and we will go on like this
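
Putting the loop together, a minimal sketch of the iterative reversal might look like this. It assumes head is a global pointer, as stated in the lesson, and uses the three local pointers current, prev and next. The final assignment to head is the step discussed right after this point.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node* next;
};

struct Node* head = NULL;   /* global pointer to the first node, as assumed in the lesson */

void Reverse(void) {
    struct Node *current, *prev, *next;
    current = head;
    prev = NULL;                  /* the first node has no previous node */
    while (current != NULL) {
        next = current->next;     /* save the next address before cutting the link */
        current->next = prev;     /* reverse the link of the current node */
        prev = current;           /* move prev to current */
        current = next;           /* move current to the saved next node */
    }
    head = prev;                  /* prev now holds the address of the old last node */
}
```

Note that the local variable next and the field current->next are different things: the first is a pointer local to Reverse, the second is the link field inside the node.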

so finally we will reach a stage like this when current will be equal to null we will come out of

the loop and when we will come out of the loop this particular variable previous this particular

pointer previous will store the address of the last node and there is one more thing remaining

here we need to adjust this particular variable head this link at this stage does not exist and

in my code I'll say head should now be equal to the address in variable previous so head is now

250 this is our new head and now our list is reversed there are a couple of things that I

want to point out here one thing is that we must see whether our implementation is working for

all test cases so we must also verify it for special or corner test cases in this case corner

test case will be when the list is empty in that case head will be null or when the list is having

only one node if you see this particular implementation will work for these two scenarios

give it some time and you should be able to figure it out let's now run this code with

complete implementation of all the functions to insert and print nodes in my code here i have

written reverse function to accept the address of the head node as argument and then return the

address of the head node after modification of the list after reversal of the list

and then i have written the main method in which i'm declaring head as a local variable

and then i'm using couple of insert functions i'm making couple of calls to insert function

insert function also takes two arguments the address of the head node and the data to be inserted

and it returns back the address of the head node it could either be modified or not modified

let's say we are inserting at the end of the list so initially our list will be two four six

eight and then we are making a call to the print function which i have written to

print the elements in the list and then i'm making a call to reverse and finally printing again

my logic of the reverse function remains the same except that i've changed the method signature

and in the end i'm returning head which will return the address of the head node

let's say we have written all the other functions insert and print correctly

these are the two functions insert and print so let's now run this code and see what happens
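
For reference, the changed signature of the reverse function could be sketched as below: it accepts the address of the head node and returns the address of the new head. This is a sketch under the assumptions stated in the lesson, not the exact code on screen. Insert and print are left out here and assumed to exist as described.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node* next;
};

/* Takes the head by value and returns the new head, so head can be a
   local variable in main instead of a global. */
struct Node* Reverse(struct Node* head) {
    struct Node* current = head;
    struct Node* prev = NULL;
    struct Node* next;
    while (current != NULL) {
        next = current->next;
        current->next = prev;
        prev = current;
        current = next;
    }
    head = prev;    /* for an empty list prev is still NULL, so NULL is returned */
    return head;
}
```

In main the call would then be written as head = Reverse(head). The two corner cases mentioned earlier fall out of the loop: with an empty list the loop body never runs, and with a single node the node's link stays null and the same address comes back.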

before the list is reversed the output is two four six eight and after the list is reversed

the output is eight six four two let us try this for the case when we have only one element in the list

so i'll remove i'll comment out these three insert statements and this also seems to be working

so this was reversal of linked list through iteration in the next lesson we will write code to reverse

linked list using recursion so thanks for watching

in our series on linked list so far we have implemented

some of the basic operations like insertion deletion and traversal now in this lesson we will write

code to traverse and print the elements of a linked list using recursion

prerequisite for this lesson is that you should understand recursion as a programming concept

recursive traversal of linked list actually helps us solve a couple of interesting problems

but in this lesson we will keep it simple we will just traverse and print all the elements in linked

list using recursion and we will write one simple variation to print all the elements in reversed

order using recursion we will actually not reverse the list we will just print the elements in reversed

order so once again i have taken example of a linked list of integers here we have four nodes

each rectangle here is a node it has two fields one to store the data and another to store the

address of the next node let's say we have four nodes at addresses 100 200 150 and 250 respectively

and of course we will also have a variable that will store the address of the head node let's name

this variable head programmatically in our c or c++ program a node will be defined something

like this we will have a structure with two fields one to store the data and another to store the

address of the next node what we want to do in this particular lesson is that we want to write

two functions first we want to write a function named print that will take address of a node as

argument we will pass this function the address of the head node so let's name this argument head

and in this function we will use recursion to print the elements in the list so for this particular

example here if we want to print a space separated list of all the elements our output will be

something like this and we also want to write another function named reverse print here also

we will take the address of a node so we will pass this guy the address of the head node and in this

function we will use recursion to print the elements in the list in reversed order so if we have to

print a space separated list for this example and our output will be something like this so let's first

implement the print function in my c code here i'll declare print function like this it will

take as argument the address of a node so the argument is of type pointer to node

initially we will pass the address of the head node we can name this argument head or we can

name this argument p we can name it whatever but we must understand that this will be a local

variable and let's not bother about other infrastructure in the code like how we would create a linked

list and how we would insert a node in the linked list let's assume that they are in place so let's

keep the name of this particular argument p now recursion is a function calling itself so we have

been passed the address of a node initially the head node so what we can do in our code is first

we can print the value at that particular node with a print f statement like this and then we can

make another call to the print function and this time we will pass the address of the next node

with a statement like this this next field is also a pointer to node so this will pass the address

of the next node there is one more important thing in recursion

that we should never forget and it's the exit condition from recursion we should not go on

making recursive calls infinitely so in this case if we go from the first node to the second node

and from the second node to the third node using recursion then finally at one stage p will be

equal to null in one of the calls at this stage we can avoid making a recursive call we will exit

we will show you a simulation of how things will happen in memory hold on for a while
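
A minimal sketch of the recursive print function described here, assuming the same node structure and the argument name p used in the lesson. Printing an end of line in the exit condition follows what the lesson does later in its walkthrough.

```c
#include <stdio.h>

struct Node {
    int data;
    struct Node* next;
};

void Print(struct Node* p) {
    if (p == NULL) {            /* exit condition: we have gone past the last node */
        printf("\n");
        return;
    }
    printf("%d ", p->data);     /* first print the value in this node */
    Print(p->next);             /* then recurse on the address of the next node */
}
```

Without the check for NULL the function would dereference a null pointer after the last node, so the exit condition is what stops the recursion.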

so once we will reach the end of the list p will be equal to null and we will exit

the recursion at that stage now i'll write the main method i've already written the insert function

here so i'll declare a variable head as null in the main method so head will be a local variable

once again we could name this particular variable a or b or whatever just because this variable

points to the head node or the first node in the list we named this variable as head

and then we will insert some nodes in the linked list by making calls to the insert function

that takes the address of the head node as argument initially head is set as null to say that the

list is empty and there should be two arguments to the function insert the address of the

head node and the value that needs to be inserted and why is it that this particular function insert

is returning a pointer it's because this head in the main method is a local variable and if we pass

it to the function we just pass a copy of the address of the head node in this head which will

be a local variable of insert function so this guy returns back the address of the modified head

so we can update it in the main function this function inserts a node at the end of the list
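
An insert function with this behaviour might be sketched as follows. The lesson does not show its body at this point, so the details here (the malloc call, the failure check, the walk to the last node) are assumptions. Only the signature and the insert-at-end behaviour come from the lesson.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node* next;
};

/* Inserts at the end of the list and returns the address of the head node.
   head here is a local copy of the caller's pointer, which is why the
   result has to be returned and assigned back in main. */
struct Node* Insert(struct Node* head, int data) {
    struct Node* node = (struct Node*)malloc(sizeof(struct Node));
    if (node == NULL) return head;    /* assumption: on allocation failure leave the list unchanged */
    node->data = data;
    node->next = NULL;
    if (head == NULL) return node;    /* empty list: the new node becomes the head */
    struct Node* temp = head;
    while (temp->next != NULL) {      /* walk to the last node */
        temp = temp->next;
    }
    temp->next = node;
    return head;                      /* head is unchanged in this case */
}
```

In main this is used as head = Insert(head, 2), head = Insert(head, 4) and so on, starting from head set to NULL.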

so initially when head is null head will be modified in the insert function for other cases

it will not be modified if we are inserting at the end so we will make four such calls to the insert

function to create a linked list of four integers two four six five and now we will make a call to

print function and pass it the address of the head node let us now run this code and see what

happens as you can see we have got this output two four six five the print function here in our

code which is a recursive implementation to print the lists is working now i'll make one

slight change in the print function instead of printing the value in the node and then making

a recursive call i'll first make a recursive call and then when the recursive call finishes

i'll print the value in the node and i'll not modify anything else in the code the main method

will remain the same and if we run this code we can see that the elements in the

list are printed in reversed order so we just implemented the reverse print function that we

had talked about let us now analyze these two recursive implementations in a logical view
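
The changed function can be sketched like this, assuming the same node structure. The only difference from the forward version is the order of the recursive call and the printf statement. The exit condition here simply returns, as the lesson describes for reverse print.

```c
#include <stdio.h>

struct Node {
    int data;
    struct Node* next;
};

void ReversePrint(struct Node* p) {
    if (p == NULL) return;      /* exit condition */
    ReversePrint(p->next);      /* first go all the way to the end of the list */
    printf("%d ", p->data);     /* print only after the recursive call has finished */
}
```

For the list 2 4 6 5 built in main, this prints 5 6 4 2. The list itself is not modified.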

in our example here if we want to print this particular list we will do something like from

the main method we will make a call to the print function passing it the address of the head node

so initially this print function is being called with p equal hundred now in the execution of this

function we will come here if p is equal to null null is address 0 and our argument is hundred so

control will not go inside this if condition we will come here we will print p arrow data p arrow

data means that we will first dereference the address so we will go to the address hundred and

then we will look at the data field there so on the console we will print the data field of

the node at address hundred and now we will make a recursive call we will make a call to

print function passing it address p arrow next which is 200 and the execution of this particular

call will not finish it will finish only after print 200 finishes we will come back to it now

print 200 once again prints the data at address 200 and then makes a recursive call to print function

passing address 150 and we will go on like this in this call to print with address 250

we will first print the data and then the value of p arrow next

is 0 which we can also call null so we will make a call like this now for this call

with argument null we have reached the exit condition recursion will not grow any further

so we will just print an end of line and return this particular structure that we have drawn here

is called recursion tree so print null function call will finish and control will return back

to print 250 there is no statement after this particular recursive call finishes

so we will simply exit this function call also and control will return back to print 150

and we will go on like this finally we will come back to the main method
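
To see the order of calls and returns for yourself, an instrumented variant of the print function can help. This is purely an illustrative sketch: the name TracedPrint, the depth parameter and the returned call count are additions that are not in the lesson.

```c
#include <stdio.h>

struct Node {
    int data;
    struct Node* next;
};

/* Hypothetical tracing version of Print. depth is only used to indent the
   output so that nested calls are visible. Returns the number of calls
   made, including the final call with NULL. */
int TracedPrint(struct Node* p, int depth) {
    printf("%*senter call at depth %d\n", depth * 2, "", depth);
    if (p == NULL) {
        printf("%*sexit condition reached\n", depth * 2, "");
        return 1;
    }
    printf("%*sdata = %d\n", depth * 2, "", p->data);
    int calls = 1 + TracedPrint(p->next, depth + 1);
    printf("%*sreturn from depth %d\n", depth * 2, "", depth);   /* runs only after the deeper call finishes */
    return calls;
}
```

For a list of four nodes the function is entered five times, once per node and once with null, and the returns happen in the opposite order of the calls.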

if you want to see how the recursion will execute in the memory then i'll have to draw a diagram

like this applications memory the memory that is allocated for the execution of a program

has these two sections all the details of function call execution and the local variables

they are stored in the stack section of the memory and any memory that is allocated

using the malloc function or the new operator in c++ they go into the heap section the memory

for the nodes in a linked list is allocated from the heap so that's why these four nodes

in our example are sitting in the heap if you want to know in detail about stack and heap

check the description of this video for a lesson on dynamic memory allocation

when the program will start executing first the main function will be invoked

anytime a function is invoked some amount of memory from the stack is allocated for the execution

of that particular function now it's called the stack frame of that function so let's say main

is executing we have already inserted some nodes in the linked list we have this variable head in

the main function so all the local variables sit in the stack frame of the function

so head will sit here now at this stage let's say main makes a call to print function

so main was executing and now it makes a call to print function execution of main will be paused

and we will go on to execute the print function the argument passed to the print function is

hundred which is stored in a local variable this argument p is a local variable in the print

function now print function again makes a recursive call now a stack frame is always allocated

corresponding to each call of a function so a function calling itself is not

different from a function calling another function at any time whatever function call is at the top

of the stack is executing finally when we reach the exit condition of the recursion the stack

will be something like this and then first this call where p is zero will finish we will come back

to this particular call and then this will finish and we will go on like this so this is how

recursion works this is how things will happen in the memory okay so now i'll clear this diagram

of stack and heap in the right and i'll make some change in my print function

what i've done is i have renamed my function print as reverse print and in my function

i'm first making a recursive call and after coming back from that recursive call i'm writing

a print statement and from the main function i'll make a call to reverse print let's write rp as

shortcut for reverse print and initially i'll pass the address of the head node

so i'll make a call like this reverse print hundred the control will come inside this function

p is hundred it is not equal to null and i've also drawn the console like before

now this particular function call does not print first it first makes a recursive call

so this guy will go ahead and make a recursive call to the reverse print function passing

it address 200 nothing will be written on the console and once again this particular function

will make a recursive call like this and once again this particular function will go ahead

and make a recursive call like this and finally we will have a recursive call where the function

is passed address null at this stage we will come to the exit condition in the recursion

the recursion will not grow any further we will simply return the control will return to

this particular call reverse print 250 so we will come here now to this printf statement

the data field at address 250 is 4 so 4 will be printed on the console and now this particular

function call will finish and we will go to reverse print 150 and now this call will print 5 and

exit and we will go on like this finally we will return back to the main function with this output

on the console the elements of the list printed in reversed order so this was recursive traversal

of linked list to print its elements i must point out here that for normal traversal of the linked

list not for the reverse print for the normal print an iterative approach will be a lot more

efficient than the recursive approach because in an iterative approach we will just use

one temporary variable while in recursion we will use space in the stack section of the memory for

so many function calls so there is implicit use of memory there for reverse print operation

we will anyway have to store elements in some structure so if we use recursion it's still okay
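
For comparison, the iterative forward print mentioned here can be sketched as below, assuming the same node structure. It needs only one temporary pointer, whatever the length of the list, whereas the recursive version uses one stack frame per node.

```c
#include <stdio.h>

struct Node {
    int data;
    struct Node* next;
};

void Print(struct Node* head) {
    struct Node* temp = head;       /* the single temporary variable */
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;
    }
    printf("\n");
}
```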

in the coming lessons we will solve more problems more interesting problems on linked list so

thanks for watching

in our previous lesson we saw how we can traverse a linked list using recursion

we wrote code to print the elements of linked list in forward as well as reverse order using

recursion we did not actually reverse the list we just printed the elements in reverse order now

in this lesson we will reverse a linked list using recursion this is yet another famous programming

interview question so if we have an input list like this we have a linked list of integers here

we have four nodes in the linked list each rectangular block here with two partitions is a node first

field is to store the data and another to store the address of the next node the second field stores

the address of the next node and of course we will have one variable to store the address

of the first node or head node we named that variable head we may name it anything i have

named it head so this is our input list and after reversal our output should be like this

this variable head should store the address of the last node in the original list the last node

in the original list was at address 250 and we will go like from 250 to 150 150 to 200

200 to 100 and 100 to null null is nothing but address 0 we have already seen how we can reverse

a linked list using iterative method in one of our previous lessons let us now see how we can solve

this problem using recursion in our solution we must reverse the list by adjusting the links by

reversing the links not by moving around data or something so let us first understand the logic

that we can use in our recursive approach if you remember from our previous lesson where we had

used recursion to print the list backward print the elements in reverse order then recursion gives

us a way to traverse the list backward in our c or c++ program programmatically

node will be a structure like this so let's first look at the function from the previous

lesson the recursive function that was used to print the list backward to this function we pass

the address of a node initially we pass the address of the head node and we have this exit

condition if the address passed is null then we simply return else we make a recursive call

and pass the address of the next node so main method will typically call reverse print passing

it the address of the head node and this guy will first make a recursive call and then

when this recursive call finishes then only it will print so i'm writing rp as shortcut for

reverse print so the recursion will go on like this and when it reaches this particular call when

argument is null it will return so this call will finish and again the control will come

to this call with an address 250 as argument and now we are printing the value of the node at

address 250 which will be 4 and then this guy finishes and then we go ahead and print 5

and similarly we then go on to print 6 and 2 so recursion kind of gives us a way to first

traverse the list in the forward direction and then traverse the list in the backward direction

so let us now see how we can implement reverse function using recursion let's say for the sake

of simplicity of implementation that head is a global variable so it is accessible to all the

functions now we will implement a function named reverse that will take the address of a node as

argument initially we will pass address of the head node to this function now i want to do

something like this in my recursion i want to go on till the end i want to go on making a recursive

call till i reach the last node for the last node the link part will be null so this is my

exit condition from recursion this exit condition is what will stop us from going on infinitely

in a recursion and what i'm doing here is something very simple as soon as i'm reaching the last

node i'm modifying the head pointer to make it point to this guy so the recursion will work like this

from the main method we will call the reverse function passing it the address of the head node

address 100 we will come and check this condition if p.next is equal to null no it is equal to 200

for the node at address 100 so this recursion will go on till we reach this call call to reverse

passing it address 250 and now we will come down and now we have come to this exit condition and now

head will be set as p and the list will look like this and now reverse 250 the call to reverse 250

will finish and we will come back to reverse 150 there is no statement here after this recursive

call to reverse function if there were some statements here then they would have executed

now for reverse 150 after we would have come from reverse 250 and that's how we actually traverse

the list in reverse order if you see when reverse 250 has finished the list till 250 is already

reversed because head is pointing to this node and the link part of this node is set as null

so till 250 we are already reversed now when we come to 150 we can make sure the list is reversed

till 150 when we finish the execution of reverse 150 to do that we can write statements like

this we will have to do two things we will have to cut this link and make this guy point to this

guy so we will build this link and we would have to cut this link and make this guy point to null

and that's how node till address 150 will be reversed after we finish this call

so i've written these three lines in my function that will execute after the recursive call
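
The function described so far might be sketched as follows, assuming head is a global pointer as the lesson does. The check for p being NULL at the top is an added guard for an empty list. The lesson's version tests only whether the link part of p is null.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node* next;
};

struct Node* head = NULL;   /* global, as assumed in the lesson */

void Reverse(struct Node* p) {
    if (p == NULL) return;        /* added guard (not in the lesson) for an empty list */
    if (p->next == NULL) {        /* exit condition: p is the last node */
        head = p;                 /* the last node becomes the new head */
        return;
    }
    Reverse(p->next);             /* go forward until the last node */
    /* the three lines below run while the recursion is folding up */
    struct Node* q = p->next;     /* q is the node that used to follow p */
    q->next = p;                  /* build the reversed link from q back to p */
    p->next = NULL;               /* cut the old forward link */
}
```

From main the call is Reverse(head). After the call for a given node finishes, the part of the list from the new head down to that node is already a valid reversed list ending in null.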

so they will execute when the recursion is folding up and we are traversing the list in the backward

direction so when we are executing reverse 150 and we have come back to it after recursion we are

at this particular line so p would be 150 and q would be p dot next so q would be 250 so this

guy is p and this guy is q and we are saying that set q dot next is equal to p so we will set this

particular field as 150 so we are building this link and cutting this link and now we are saying

that set p dot next equal null so we are building this link making p dot next null and now this call

to reverse 150 finishes and when this call has finished the list till 150 is reversed as you can

see head is 250 so from 250 we will go to 150 and from 150 we are going to null so till 150

we have a reversed list so this is how things will look like when the call to reverse 200 finishes

till 200 we have a reversed list and once again we come to execution of reverse 100

and this is how things will look like finally when reverse 100 will finish and we will return back

to the main function we had seen in the previous lesson that how things will happen in the memory

when recursion executes in recursion we save the state of execution of all the function calls

in stack section of the memory in this function all we are doing is basically we are storing the

addresses of node in a structure as we go forward in recursion and then we first work on the last

node to make it part of the reversed list and then we once again come back to the previous node and

we keep doing this watch the previous lesson for detailed explanation and simulation

of how things will happen in the memory for recursion there are a couple of more things here

one thing is that instead of writing these two lines i could write one line for these two lines

i could say something like p arrow next arrow next equal p and that would have meant the same

except that this statement is more obfuscated and there is one more thing we have assumed

that head is a global variable what if head is not a global variable then this reverse function

will have to return the address of the modified head i'll leave that as an exercise for you to do
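
One possible answer to that exercise is sketched below. It is only one way to do it, not the lesson's own solution: the function returns the address of the new head instead of writing to a global, and it uses the single statement p->next->next = p mentioned above in place of the two lines with q.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node* next;
};

/* Returns the head of the reversed list. No global variable is needed. */
struct Node* Reverse(struct Node* p) {
    if (p == NULL || p->next == NULL) {
        return p;                            /* empty list, or the last node: this is the new head */
    }
    struct Node* newHead = Reverse(p->next); /* reverse everything after p */
    p->next->next = p;                       /* make the node after p point back to p */
    p->next = NULL;                          /* p is now the tail of the reversed part */
    return newHead;                          /* pass the new head back up unchanged */
}
```

In main the call would be head = Reverse(head).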

so this was reversing a linked list using recursion thanks for watching

hello everyone

in our lessons in this series so far we have discussed linked list quite a bit

we have seen how we can create a linked list and how we can perform various operations with linked

list linked lists as we know are collections of entities that we call nodes so far in all

our implementations we have created linked lists in which each node would contain two fields one to

store data and another to store address of the next node let's say we have a linked list of integers

here so i'll fill in some values in data field of each node let's assume that these nodes are at

addresses 200, 250 and 350 respectively i'll also fill in the address field in each node the address

field in first node will be the address of second node which is 250 the address field in second

node will be address of third node which is 350 and address part in third node will be zero or

null the identity of a linked list that we always keep with us is the address of head node or

reference to head node let's say we have a variable named head only to store the address of the head

node remember this variable named head is only a pointer to the head node ideally we should have

named this something like head pointer it's only pointing to the head node it's not

the head node itself head node is this guy the first node in the linked list okay so right now

in the linked list that we are showing here each node has only one link a link to the next node

in a real program node for the linked list that i'm showing here will be defined like this
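
As a sketch, the definition being referred to would look like this in C, with the field names data and next as used throughout the lessons:

```c
struct Node {
    int data;            /* the value stored in the node */
    struct Node* next;   /* address of the next node, NULL (0) for the last node */
};
```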

this is how we have defined nodes so far in all our lessons we have two fields here one

of type integer to store data and another of type pointer to node struct node asterisk

i'm calling this field next when we say linked list by default we mean such a list that we can also

call singly linked list what we have here is a singly linked list what we want to talk about

in this lesson is idea of a doubly linked list the idea of a doubly linked list is really simple

in a doubly linked list each node would have two links one to the next node and another to

the previous node programmatically this is how we will define node for a doubly linked list in

c or c plus plus i have one more field here which once again is a pointer to node so i can store

the address of a node i can point to a node using this field and this field will be used to store

the address of the previous node in a logical representation i will draw my node like this now

i have one field to store data one to store address of previous node and one to store address of next

node let's say i want to create a doubly linked list of integers i have created three nodes here

let's say these nodes are at addresses 400 600 and 800 respectively i'll fill in some

data let's say the cell in the middle in each node is to store data the right most cell is

let's say to store the address of the next node so for first node this field will be 600

which means we have a link like this for second node this field will be 800 for third node this

field will be zero for first node there is no previous node so this leftmost cell

which is supposed to contain the address of the previous node will be zero or null

the previous node for second node will be 400 and the previous node for the third node is the node

at address 600 and of course we will have a variable to store the address of the head node

okay so what we have here is a doubly linked list of integers with three nodes okay so with this much

you already know doubly linked list if you have ever implemented a singly linked list

then it should not be very difficult implementing a doubly linked list one obvious question would be

why would we ever want to create a doubly linked list what are the advantages or use cases of a

doubly linked list first advantage is that now if we have a pointer to any node then we can do a

forward as well as reverse lookup with just one pointer we can look at the current node the next

node as well as the previous node i am showing a pointer named temp here if temp is a pointer

pointing to a node then temp arrow next is a pointer pointing to the next node it's the address of the

next node and temp dot previous or rather temp arrow previous this is actually a syntactical sugar

for asterisk temp dot prev so this guy temp arrow prev is previous node or in pure words pointer to

previous node the value stored in temp for this example right now is 600 temp arrow next is 800

and temp arrow prev is 400 in a singly linked list there is no way you can look at the previous node

with just one pointer you will have to use an extra pointer to keep track of the previous node

in a lot of scenarios the ability to look at the previous node makes our life easier even

implementation of some of the operations like deletion becomes a lot easier in a singly linked

list to delete a node you would need two pointers one to the node to be deleted and one to the previous

node but in our doubly linked list we can do so using only one pointer the pointer to the node to

be deleted all in all this ability that we can do a reverse lookup in the linked list is really

useful we can flow through the linked list in both directions disadvantage of doubly linked list

is that we are having to use extra memory for pointer to previous node for a linked list of integers

let's say integer takes four bytes in a typical architecture and pointer also takes

four bytes pointer variable also takes four bytes then in a singly linked list each node

will be eight bytes four for data and four for a link to the next node in a doubly linked list

each node will be 12 bytes we will take four bytes for data and eight bytes for links for a linked list

of integers we will take twice as much memory for links as for data with a doubly linked list we also need to be more

careful while resetting links while inserting or deleting we need to reset a couple of more links

than a singly linked list and so we are more prone to errors we will implement doubly linked list in

a c program in next lesson we will write basic operations like traversal insertion and deletion

this is it for this lesson thanks for watching in our previous lesson we saw what doubly linked

lists are now in this lesson we are going to implement doubly linked list in c we are going to

write simple operations like insertion traversal and deletion in a doubly linked list as we saw in

our previous lesson each node contains three fields i have drawn logical representation of

a doubly linked list here one to store data one to store address of next node and one to store

address of previous node for a linked list of integers node will be defined like this in a c

or c plus plus program in the logical representation i'll fill in some data in each node let's say

these nodes are at addresses 400 600 and 800 respectively i'll also fill in next and previous

fields and we must also have a pointer variable pointing to the head node quite often we

name this pointer variable head in my implementation i'm going to write these functions i'm going to

write a function to insert a node at beginning or head of linked list this function will take an

integer as argument i'll write another function to insert a node at tail of linked list i'll write

one function to print elements in linked list while traversing it from head to tail i'll write

another one to print the elements in reverse order while traversing the list from tail to head

reverse print function will validate whether reverse link for each node is created properly

or not let's now write these functions in a real c program in my c program here i have defined

node as a structure with three fields first field is of type integer to store data second field is

of type pointer to node to store reference of next node and the third field is a pointer to

node to store the reference of previous node i have defined a variable named head which once

again is a pointer to node and i have defined this variable in global scope head is a global

variable when we define a variable inside a function it's called a local variable the lifetime of a

local variable is lifetime of a function call it's created during a function call execution and

it's cleared from the memory when function call execution finishes but global variables

live in the memory for whole lifetime of an application they live till the time program is

executing global variables can be accessed everywhere in all functions local variables are not accessible

everywhere unless you access them through pointers in all our previous implementations we have mostly

declared head as global variable okay so let's now write the functions the first function that i want

to write is insert at head this function will take an integer as argument the first thing that we

want to do here is we want to create a node we can always declare a node like this just like

declaration of any other variable we can say struct node and then we can give an identifier

or name and now in this my node that i have created i can fill in all the fields

but the problem here is that when i'm creating a node like this i'm creating it as a local variable

and it will be cleared from memory when function call will finish a local variable lives in what we

call stack section of applications memory and we cannot control its lifetime it's cleared from

memory when function call finishes we do not want this our requirement is that a node should be

in memory unless we explicitly remove it so that's why we create a node in in dynamic memory or what

we call heap section of memory anything in heap is not cleared unless we explicitly free it to

create a node in heap we use malloc function in c or new operator in c++ all malloc function

does is it reserves some memory in heap and this memory can be used for writing anything

any variable any object access to this memory always happens through a pointer variable we have

talked about this concept quite a bit in our previous lessons but i keep on repeating because

this is really important concept so here with this statement i have created a node in dynamic

memory or heap that can be referenced through a variable which is pointer to node i have named

this variable temp now i can use this pointer variable to fill in values in various fields of

the node i'll have to dereference this pointer variable using asterisk operator and then i can

access various fields like data prev or next there is an alternate syntax for this asterisk

temp dot data we can simply write temp arrow data and similarly i can access other fields also

so to access prev field i can say temp arrow prev let's set this as null and let's also set the

next field as null if you want to understand or refresh the concept of stack and heap in memory

then you can check the description of this video for a link to our lesson on dynamic memory allocation

okay so in my function insert at head i have created a node in heap section of memory

and i'm referencing that node using this pointer variable named temp temp is not a very meaningful

name let's use a name like new node or new node pointer i would like to separate out this logic

of node creation these lines for node creation in a separate function i've written a function

here named get new node that will take an integer as argument create a node filling in data field as

x and setting both previous and next pointers as null this function will return a pointer to node

so i will return new node from here i'm writing a separate function because i can avoid duplicate

code by using a separate function for creation of node because i'm going to create a node

in function insert at head as well as in function insert at tail that i'll be writing

after some time now in insert at head function i can simply call this function get new node passing

it x this function is returning a pointer to newly created node that i'm going to receive in this

variable which once again is a pointer to node named temp we can name this variable also as new node

this new node in insert at head is different from this new node in get new node these are local

variables this new node is local to insert at head and this new node is local to get new node

now there will be two cases in insertion at head list could be empty so head will be equal to null

in this case we can simply set head as the address of new node and return or exit

things will be clear if i'll show everything in logical view also right now my linked list is

empty here in this logical view that i'm showing let's say i have made a call to insert at head

passing it number two get new node function will give me a new node let's say a new node is created

at address 400 with this statement head equal new node we are setting the address stored in

new node variable in head null is nothing but address 0 as soon as this function insert at head

will finish this variable new node will be cleared from memory but the node itself will not be cleared

if we would have created node like this struct node new node and in this declaration new node

is not pointer to node it's node and we are not saying struct node asterisk

so if we would have created node like this the node also would have been cleared okay coming back

to the function here let's write rest of the logic to insert a node when list is not empty

this is what i'll do now i'm making a call to insert at head passing it number four

once new node is created i'll first set the previous field of existing head node

as the address of this new node so i'm building this link then i'll set the next field of new node

as the address of current head and now i can break this link and build this link

so i'll set head as address of new node this is how things will look like finally

let's also quickly see how things will actually move in various sections of applications memory

the memory that is allocated to a program is typically divided into these four segments

we have seen this diagram quite a bit in our earlier lessons code or text segment stores

all the instructions to be executed there is a segment to store global variables there is a

section that we call stack that is used just like scratch pad or white board for function

call execution stack is where all the local variables go and not just local variables all

the information about function call execution heap is what we also call dynamic memory

i'm showing stack heap and global section separately here in our program we had declared head as a

global variable initially for an empty list we'll set head as null or zero now let's say we will do

that in main function now when a call to insert at head is made at this stage let's say i'm making

a call passing number two as argument let's say we are making a call to insert at head from main function

when program starts execution first main function is invoked whenever a function is invoked some

amount of memory from the stack is allocated for execution of that function that section is called

stack frame of that function and all the local variables of that function live inside its stack

frame when function call execution finishes the stack frame is reclaimed when main will make a

call to insert at head the execution of main will pause at the line where it's making a call

a stack frame will be allocated for execution of insert at head i'm writing shortcut i a h for

insert at head because i'm short of space here all the arguments of insert at head all the local

variables will live inside this stack frame we are creating a variable

named new node which is a pointer to node as local variable and we are making a call to get new node

function execution of insert at head will pause and we will go on to execute get new node we could

write get new node like this here i'm creating a node on stack x is a local variable in get new

node also then i'm creating a node filling in data as the value of x which is two i'm setting

previous and next fields as null or zero and then because i need to return a pointer to node i have

used ampersand operator here using ampersand operator gives us pointer to a variable let's say

this new node that we have in the stack frame of get new node has address 50 with this return when

get new node will finish the value in this new node of insert at head will be 50

please note that with this code this new node in get new node function is of type

struct node while this new node in insert at head is of type pointer to struct node so they are

different types we can return this address 50 that's fine but the stack frame for get new node will

be reclaimed once the function finishes so now even though you have the address 50

there is no node there we cannot control allocation and deallocation of memory on stack

it happens automatically that's why we use the memory on heap if i'm using this code for creation

of new node then what i'm doing is i'm declaring this variable new node not as struct node but as

struct node asterisk that is pointer to node i'm using malloc to create the actual node in heap

section let's say i'm getting address 400 for this node now for a section of memory in heap

for something in heap we cannot have a direct name the only way to access something in heap is

through a pointer if we will lose this pointer we will lose this node okay so now what we are doing

is using this pointer new node which is local to get new node function we are accessing this node

filling in data filling in address fields and now we are returning this address 400

now when get new node is finishing i'm collecting the returned address 400 in this variable in

this local variable new node we are returning back to insert at head function and at this line

head at this stage is null so now we are saying that set head equal to new node head is a

global variable it's not going to be cleared for whole lifetime of application and now we are

returning stack frame of insert at head will be cleared and this is what we finally have

when we will make another call to insert at head once again fresh stack frames will be allocated

in the execution of functions appropriate links will be created so our linked list will be modified

accordingly i hope all of this is making some sense with another call to insert at head when

everything will finish and control will return back to main we can have a picture like this

let's say i got a node at 600 right cell is for next node right cell is storing the

address of next node and left cell is storing the address of previous node so this is what

we will have let's now go and write rest of the functions print function will be same as print

for singly linked list we will take a temporary pointer to node initially set it to head and then

we will use this statement temp equal temp arrow next to go to the next node and we will keep on

printing in reverse print we will first go to the end node of the list using next pointer and then

we will traverse backward using this statement temp equal temp arrow prev so we will use the

previous pointer and while traversing backward we will print the data okay let's now test all

these functions that we have written so far in the main function i'm setting head as null to say

that the list is empty initially and now i'm writing couple of insert statements i'm making a

couple of calls to insert at head function and after each call i'm printing the list both in

forward as well as reverse direction let's run this code and see the output this is what i'm getting

and i think this is as expected there is one more function insert at tail that i had said i'll write

if you have understood things so far it should not be very difficult for you to write this function

insert at tail i'll leave this as an exercise for you i'll stop here now if you want to get this

source code check the description of this video for a link in coming lessons we are going to talk

about circular linked list and we will see some more interesting problems on linked list thanks

for watching in this lesson we are going to introduce you to stack data structure

data structures as we know are ways to store and organize data in computers so far in the series

we have discussed some of the data structures we have talked about arrays and linked lists now in

this lesson we are going to talk about stacks and we are going to talk about stack as abstract data

type or ADT when we talk about a data structure as abstract data type we talk only about the

features or operations available with the data structure we do not go into implementation details

so basically we define the data structure only as a mathematical or logical model

we will go into implementation of stack in later lessons in this lesson we are going to talk only

about stack ADT so we are only going to have a look at the logical view of stack stack as

a data structure in computer science is not very different from stack as a way of organizing

objects in real world here are some examples of stack from real world first figure is of a stack

of dinner plates second figure is of a mathematical puzzle called tower of hanoi where we have three

rods or three pegs and multiple disks and the game is about moving a stack of disks

from one peg to another with this constraint that a disk cannot go on top of a smaller disk

third figure is of a pack of tennis balls stack basically is a collection with this property that

an item in the stack must be inserted or removed from the same end that we call the top of stack

in fact this is not just a property this is a constraint or restriction only the top of a stack

is accessible and any item has to be inserted or removed from the top a stack is also called last

in first out collection most recently added item in a stack has to go out first in the first

example you will always pick up a dinner plate from top of the stack and if you will have to put

a plate back into the stack you will always put it back on top of the stack you can argue that

I can slip out a plate from in between without actually removing the plates on the top so the

constraint that I should take out a plate always from the top is not strictly enforced for the

sake of argument this is fine you can say this in other two examples when we have disks in a peg

and tennis balls in this box that can open only from one side there is no way you can take out

an item from in between any insertion or removal has to happen from top you cannot slip out an

item from in between you can take out an item but for that you will have to remove all the items

on top of that item let's now formally define stack as an abstract data type a stack is a list

or collection with the restriction that insertion and deletion can be performed only from one end

that we call the top of stack let's now define the interface or operations available with

stack adt there are two fundamental operations available with a stack an insertion is called

a push operation push operation can insert or push some item x onto the stack another operation

second operation is called pop pop is removing the most recent item from the stack most recent

element from the stack push and pop are the fundamental operations and there can be few more

typically there is one operation called top that simply returns the element at top of the stack

and there can be an operation to check whether a stack is empty or not so this operation will

return true if the stack is empty false otherwise so push is inserting an element on top of stack

and pop is removing an element from top of stack we can push or pop only one element at a time

all these operations that i have written here can be performed in constant time

or in other words the time complexity is big o of one remember an element that is pushed or

inserted last onto a stack is popped or removed first so stack is called last in first out structure

what goes in last comes out first last in first out in short is called LIFO logically a stack is

represented something like this as a three-sided figure as a container open from one side this is

representation of an empty stack let's name this stack s let's say this figure is representing

a stack of integers right now the stack is empty i will perform push and pop operations

to insert and remove integers from the stack i will first write down the operation here and then

show you what will happen in the logical representation let's first perform a push i want to push

number two onto the stack the stack is empty right now so we cannot pop anything after the push

stack will look something like this there is only one integer in the stack so of course

it's on top let's push another integer this time i want to push number 10

and now let's say we want to perform a pop the integer at top right now is 10 with a pop

it will be removed from the stack let's do few more push

i just pushed 7 and 5 onto the stack at this stage if i will call top operation it will return me

number five is empty will return me false at this stage a pop will remove five from the stack

as you can see the element the integer which is coming last is going out first

that's why we call stack last in first out data structure we can pop till the stack gets empty

one more pop and stack will be empty so this pretty much is stack data structure

now one obvious question can be what are the real scenarios where stack helps us let's list

down some of the applications of stack stack data structure is used for execution of function

calls in a program we have talked about this quite a bit in our lessons on dynamic memory

allocation and linked lists we can also say that stack is used for recursion because recursion is

also a chain of function calls it's just that all the calls are to the same function to know more

about this application you can check the description of this video for a link to mycodeschool's

lesson on dynamic memory allocation another application of stack is we can use it to implement

undo operation in an editor and we can perform undo operation in any text editor or image editor

right now i'm pressing ctrl z and as you can see some of the text that i have written

is getting cleared you can implement this using a stack stack is used in a number of important

algorithms like for example a compiler verifies whether parenthesis in a source code are balanced

or not using stack data structure corresponding to each opening curly brace or opening parenthesis

in a source code there must be a closing parenthesis at appropriate position and if parenthesis in a

source code are not put properly if they are not balanced compiler should throw error

and this check can be performed using a stack we will discuss some of these problems in detail

in coming lessons this much is good for an introduction in our next lesson we will discuss

implementation of stack this is it for this lesson thanks for watching in our previous lesson we

introduced you to stack data structure we talked about stack as abstract data type or ADT as we

know when we define a data structure as abstract data type we define it as a mathematical or

logical model we define only the features or operations available with the data structure

and do not bother about implementation now in this lesson we will see how we can implement

stack data structure we will first discuss possible implementations of stack and then we'll go ahead

and write some code okay so let's get started as we had seen a stack is a list or collection with

this restriction with this constraint that insertion and deletion that we call push and pop operations

in a stack must be performed one element at a time and only from one end that we call the top

of stack so if you see if we can add only this one extra property only this one extra constraint

to any implementation of a list that insertion and deletion must be performed only from one end

then we can get a stack there are two popular ways of creating lists we have talked about them

a lot in our previous lessons we can use any of them to create a stack we can implement stacks

using a) arrays and b) linked lists both these implementations are pretty intuitive let's first

discuss array based implementation let's say i want to create a stack of integers so what i can

do is i can first create an array of integers i'm creating an array of 10 integers here i'm naming

this array a now i'm going to use this array to store a stack what i'm going to say is that at

any point some part of this array starting index 0 till an index marked as top will be my stack

we can create a variable named top to store the index of top of stack for an empty stack

top is set as minus 1 right now in this figure top is pointing to an imaginary minus 1 index in

the array and insertion or push operation will be something like this i will write a function

named push that will take an integer x as argument in push function we will first increment top

and then we can fill in integer x at top index here we are assuming that

a and top will be accessible to push function even when they are not passed as arguments

in c we can declare them as global variables or in an object oriented implementation

all these entities can be members of a class i'm only writing pseudo code to explain

the implementation logic okay so for this example array that i'm showing here right now

top is set as minus 1 so my stack is empty let's insert something onto the stack i will have to make

call to push function let's say i want to insert number 2 onto the stack in a call to push first

top will be incremented and then the integer passed as argument will be written at top index

so 2 will be written at index 0 let's push one more number let's say i want to push number 10

this time once again top will be incremented 10 will now go at index 1 with each push the stack

will expand towards higher indices in the array to pop an element from the stack i'm writing a

function here for pop operation all i need to do is decrement top by one with a call to pop let's say

i'm making a call to pop function here top will simply be decremented whatever cells are in yellow

in this figure are part of my stack we do not need to reset this value before popping if a cell is

not part of stack anymore we do not care what garbage lies there next time when we will push

we will modify it anyway so let's say after this pop operation i want to perform a push i want to

insert number 7 onto the stack so top once again will be incremented and value at index 2 will be

overwritten the new value will be 7 these two functions push and pop that i have written here

will take constant time we have simple operations in these two functions and execution time will

not depend upon size of stack while defining stack ADT we had said that all the operations

must take constant time or in other words the time complexity should be big o of 1

in our implementation here both push and pop operations are big o of 1 one important thing here

we can push onto the stack only till array is not exhausted only till some space is left in the

array we can have a situation where stack would consume the whole array so top will be equal to

highest index in the array a further push will not be possible because it will result in an overflow

this is one limitation with array based implementation to avoid an overflow we can always

create a large enough array for that we will have to be reasonably sure that stack will not grow

beyond a certain limit in most practical cases large enough array works but irrespective of that

we must handle overflow in our implementation there are couple of things that we can do in case

of an overflow push function can check whether array is exhausted or not and it can throw an error

in case of an overflow so push operation will not succeed this will not be a really good behavior

we can do another thing we can use the concept of dynamic array we have talked about dynamic array

in initial lessons in the series what we can do is in case of an overflow we can create

a new larger array we can copy the content of stack from older filled up array into new array

if possible we can delete the smaller array the cost of copy will be big o of n or in simple words

time taken to copy elements from smaller array to larger array will be proportional to number of

elements in stack or the size of the smaller array because anyway stack will occupy the whole array

there must be some strategy to decide the size of larger array optimal strategy is that we should

create an array twice the size of smaller array there can be two scenarios in a push operation

in a normal push we will take constant time in case of an overflow we will first create

a larger array twice the size of smaller array copy all elements in time proportional to size

of the smaller array and then we will take constant time to insert the new element

the time complexity of push with this strategy will be big o of one in best case

and big o of n in worst case in case of an overflow time complexity will be big o of n

but we will still be big o of one in average case if we will calculate the time taken for

n pushes then it will be proportional to n remember n is the number of elements in stack

big o of n is basically saying that time taken will be very close to some constant times n

in simple words time taken will be proportional to n if we are taking c into n time for n pushes

to find out average we will divide by n average time taken for each push will be a constant

hence big o of one in average case i will not go into all the mathematics of why it's big o of

n for n pushes to know about it you can check the description of this video for some resources
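
For readers who want the missing arithmetic, here is one way to see it. It is a sketch under assumed conditions: the array starts with capacity 1 and doubles on every overflow, each element write or copy costs one unit, and n is at least 2. Copies happen only when an array of size 1, 2, 4 and so on up to 2^k fills up, where 2^k is the largest power of two below n.

```latex
\begin{aligned}
T(n) &= \underbrace{n}_{\text{writes}}
      + \underbrace{\sum_{j=0}^{k} 2^{j}}_{\text{copies}},
      \qquad k = \lfloor \log_2 (n-1) \rfloor \\
     &= n + 2^{k+1} - 1 \\
     &< n + 2n = 3n
\end{aligned}
\qquad\Longrightarrow\qquad
\frac{T(n)}{n} < 3 = O(1)
```

So n pushes cost less than 3n units in total, which is why the average cost per push stays constant even though a single push that triggers a copy costs time proportional to n.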

okay so this pretty much is core of our implementation we have talked about two more operations in

definition of stack ADT top operation simply returns the element at top of stack so top function

will look something like this we will simply return the element at top index to verify whether

stack is empty or not this is another operation that we had defined we can simply check the value

of top if it is equal to minus one we can say the stack is empty we can return true else we can

return false sometimes pop and top operations are combined together in that case pop will not

just remove an element from top of stack it will also return that element language libraries in a

lot of programming languages give us implementation of stack signature of functions in these implementations

can vary slightly okay now i will quickly show you a basic implementation of stack in C

in my C code here i'm going to write a simple array based implementation to create a stack of

integers the first thing that i'm going to do is i'm going to create an array of integers

as global variable and the size of this array is max size where max size is defined by this macro

as 101 i will declare another global variable named top and set it as minus one initially

remember top equal minus one means an empty stack when a variable is not declared inside any function

it's a global variable it can be accessed anywhere so you do not have to pass it as argument to functions

and now i will write all the operations this is my push function i'm first incrementing top

and then setting the value at top as x x is the integer to be inserted passed as argument

instead of writing these two statements i can write one statement like this and i will be good

i'm using pre increment operator so increment will happen before assignment i also want to handle

overflow we will have an overflow when top index is already equal to max size minus one the highest index

available in the array in case of an overflow i simply want to print an error message something

like this and return so in this implementation i'm not using a dynamic array in case of overflow

push will not succeed okay now this is my pop function i'm simply decrementing top here also we

must handle one error condition if stack is already empty we cannot pop so i'm writing these statements

here if top is equal to minus one we cannot pop i will print this error message that there is no

element to pop and simply return now let's write top operation top operation will simply

return the integer at top index so now my basic operations are all written here i have already

written push pop and top in main function i will make some calls to push and pop and i want to write

one more function named print and this is something that i'm going to write only to verify that push

and pop are happening properly i will simply print all the elements in the stack in my main

function after each push or pop operation i will make a call to print i'm writing multiple

function calls two function calls on same line here because i'm short of space remember print

function is not a typical operation available with stack i'm writing it only to test my implementation

so this pretty much is my code let's now run this program and see what happens
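The code on screen is not part of the transcript, so here is a reconstruction from the description above rather than the author's exact source. The names `A`, `MAX_SIZE`, `Push`, `Pop`, `Top`, `IsEmpty` and `Print` and the error messages are assumptions. It is written in the C style used in the lesson and compiles as C++.

```cpp
#include <cassert>
#include <cstdio>

#define MAX_SIZE 101

int A[MAX_SIZE];  // global array holding the stack
int top = -1;     // top == -1 means the stack is empty

// Inserts x at the top; refuses to push when the array is full.
void Push(int x) {
    if (top == MAX_SIZE - 1) {  // top is already the highest index
        std::printf("Error: stack overflow\n");
        return;
    }
    A[++top] = x;  // pre-increment: top moves first, then x is stored
}

// Removes the top element; refuses when the stack is empty.
void Pop() {
    if (top == -1) {
        std::printf("Error: no element to pop\n");
        return;
    }
    top--;
}

// Returns the element at the top index; the stack must not be empty.
int Top() {
    assert(top != -1);
    return A[top];
}

// Returns true when there is nothing in the stack.
bool IsEmpty() {
    return top == -1;
}

// Test helper only, not a standard stack operation.
void Print() {
    std::printf("Stack: ");
    for (int i = 0; i <= top; i++) std::printf("%d ", A[i]);
    std::printf("\n");
}
```

Calling Push(2), Push(5), Push(10), Pop() and Push(12) from main, with a Print() after each call, leaves 2, 5 and 12 in the stack.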

this is what i'm getting as output we are pushing three integers two five and ten and then we are

performing a pop so ten gets removed from the stack and then we are pushing 12 so this is a

basic implementation of stack in c this is not an ideal implementation an ideal implementation

should be something like we should have a data type called stack and we should be able to create

instances of it we can easily do it in an object oriented implementation we can do it in c also

using structures check the description of this video for link to source code of this implementation

as well as of an object oriented implementation in our next lesson we will discuss linked list

implementation of stack this is it for this lesson thanks for watching
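One possible shape for the "data type called stack" mentioned above is a small class. This is only a sketch and not the linked source code. The class name `Stack`, its member names and the choice to return false on failure instead of printing are all assumptions.

```cpp
#include <cassert>

// Hypothetical array based stack of ints as a data type.
// Each object carries its own array and its own top index.
class Stack {
public:
    // Returns false on overflow instead of printing an error.
    bool Push(int x) {
        if (top_ == kMaxSize - 1) return false;
        items_[++top_] = x;
        return true;
    }

    // Returns false when there is nothing to pop.
    bool Pop() {
        if (IsEmpty()) return false;
        --top_;
        return true;
    }

    // The stack must not be empty when Top is called.
    int Top() const {
        assert(!IsEmpty());
        return items_[top_];
    }

    bool IsEmpty() const { return top_ == -1; }

private:
    static constexpr int kMaxSize = 101;
    int items_[kMaxSize];
    int top_ = -1;  // -1 means an empty stack
};
```

With this, two stacks can exist side by side, for example `Stack a; Stack b;`, and a push on one does not affect the other.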

in our previous lesson we saw how we can implement stack using arrays now in this lesson we will

see how we can implement stack using linked list for this lesson i'm assuming that you already know

about both stack as well as linked list stack as we know from our discussion so far is called a

last in first out data structure whatever goes in last in a stack comes out first it's a list

with this restriction that insertion and deletion must be performed only from one end that we call

the top of stack an insertion in a stack is called push operation and deletion is called pop to

implement a stack all we need to do is enforce this behavior in any implementation of a list that

insertion and deletion must be performed only from one end and we can call that end top of stack

it's really easy to enforce this behavior in a linked list i have drawn a linked list of integers

here this is logical representation of a linked list a linked list is a collection of entities that

we call nodes each node contains two fields one to store data and another to store the address

of the next node let's assume that these nodes are at addresses 100 200 and 400 respectively

so i will fill up the address part as well the identity of a linked list is the address of the

first node that we also call the head node a variable stores the address of head node we often

name this variable as head unlike arrays linked lists are not of fixed size and elements in a

linked list are not stored in one contiguous block of memory we already know how to create a linked

list or insert and delete elements from a linked list from our previous lessons i'm just doing a

quick recap here to insert an element in a linked list we first create a new node which is basically

blocking some part of memory to store our data in this example here let's say for my new node i'm

getting address 350 we can set the data part of the node as whatever value i want to add

in the list and then i need to modify the address field of some of the existing nodes to link this

node in actual list now for a stack we want that insertion and deletion must always happen from the

same end we can use a linked list as a stack if we always insert and delete a node at same end we

have two options we can insert or delete from end of the list what we also call tail or

beginning of the list that we call head if you remember from our previous lessons inserting a

node at end of linked list is not a constant time operation the cost of both insertion and

deletion at end of linked list if we have to talk about the time complexity of it is big o of n

here in the definition of stack we are saying that push and pop operations should take constant

time or the time complexity should be big o of one but if we will insert and delete from end

time complexity will be big o of n to insert a new node in a linked list at the end we need to go

to the last node and set the address part of that node to make it point to the new node to traverse

a linked list and go to the last node we should start at the head or the first node from first node

we get the address of the second node so we go to the second node and from second node we get the

address of the third node it's like playing treasure hunt you go to the first guy ask the address of

the second guy and then you go to the second guy ask the address of the third guy and so on

now once i've reached this last node in my example here i can set its address part to make it point

to the newly created node all in all this operation will take time proportional to number of elements

in the linked list to delete a node from end once again we will have to traverse the whole list

we will have to go to the second last node break this link we will set the address field as zero

or null and then we can simply wipe off the last node removed from the list from computer's memory
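A minimal sketch of insertion and deletion at the tail, to show where the traversal cost comes from. It assumes a `Node` with fields `data` and `link`; the function names `InsertAtEnd` and `DeleteAtEnd` are hypothetical and are not taken from the video.

```cpp
#include <cassert>
#include <cstdlib>

struct Node {
    int data;
    Node* link;  // address of the next node, null for the last node
};

// Inserts x at the tail and returns the (possibly new) head.
Node* InsertAtEnd(Node* head, int x) {
    Node* node = static_cast<Node*>(std::malloc(sizeof(Node)));
    assert(node != nullptr);  // sketch only: treat allocation failure as fatal
    node->data = x;
    node->link = nullptr;
    if (head == nullptr) return node;
    Node* last = head;
    while (last->link != nullptr) last = last->link;  // walk to the last node: O(n)
    last->link = node;
    return head;
}

// Deletes the tail node and returns the (possibly new) head.
Node* DeleteAtEnd(Node* head) {
    if (head == nullptr) return nullptr;
    if (head->link == nullptr) {  // only one node in the list
        std::free(head);
        return nullptr;
    }
    Node* secondLast = head;
    while (secondLast->link->link != nullptr) {
        secondLast = secondLast->link;  // walk to the second last node: O(n)
    }
    std::free(secondLast->link);  // wipe the last node from memory
    secondLast->link = nullptr;   // break the link
    return head;
}
```

Both functions contain a loop that visits every node, which is why neither can serve as a constant time push or pop.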

once again the cost of traversal will be big o of n so inserting and deleting at end or tail is

not an option for us because we will not be able to do push and pop in constant time if we choose

to insert and delete from end the cost of inserting or deleting from beginning however

is big o of one it will take constant time to insert a node at beginning or delete a node from

beginning to insert a node at beginning we must create a new node in this example here once again

i have created a new node let's say the address of the new node is 350 i will insert some data in the

first field of this node okay so to insert this node at beginning we just need to build two links

first we need to build this link so we will set the address here as whatever the address of the

current head is and then we can break this link and make this guy the new head by setting its address

here in this variable named head to delete a node in this example here we will have to first cut this

link and build this link which will mean resetting the address in this variable head and then we can

free the memory allocated to this particular guy this particular node deletion from beginning

once again is a constant time operation so this is the thing if we will insert at beginning and

delete from beginning then all our conditions are satisfied so linked list implementation of stack

is pretty straightforward all we need to do is insert a node at the beginning and delete a node

from beginning so head of the linked list is basically the top of stack i would rather name

this variable top here i'll quickly write a basic implementation in c i'm defining node as a

structure in c i want to create a stack of integers so first field in the node is an integer another

field is pointer to node that will store the address of the next node we have seen this definition of

node in all our previous lessons on linked list the next thing that i'm doing is i'm declaring

a variable named top which is pointer to node and initially i'm setting the address in it as null

i'm using variable name top instead of head here when top is null our stack is empty by initializing

top as null i'm saying that initially my stack is empty now let's write push and pop functions

this is my push function push is taking an integer x as argument that must be inserted

onto the stack the first thing that we are doing in push function is that we are creating a node

using malloc let's say in this example in this logical representation that i'm showing here

i'm performing a push operation so i'm making a call to push function passing it number two as

argument so a node is created in what we call the dynamic memory or heap let's

say the address of this node is hundred this variable is basically a pointer pointing to this node

temp is a pointer pointing to this node in the next line we are setting the data field in this node

we are dereferencing temp to do so then we are setting the link part of this newly created node

as existing top so we are building this link and then we are saying top equal temp so we are building

this link this is simple insertion at beginning of a linked list we have one complete video in this

series on how to insert a node at beginning of linked list let's do one more push let's say i want

to push number five onto the stack this time once again a node will be created we will set the data

and then we will first point this guy to the existing top and then make this pointer variable

point to this guy the new top let's say the address of this guy is 250 so the address in

this variable top will be set as 250 after this second push this is how my stack will look like

top here is a global variable so we do not need to pass it as argument to functions it is accessible

to all the functions in an object oriented implementation it can be a private field

and we can set it as null in the constructor okay let's now see how pop function

will look like this is my pop function let's say for this example i'm making a call to pop function

the stack may already be empty we can check whether stack is empty or not by checking whether top

is null or not if top is null stack is empty in this case we can throw some error and return

for this example here stack is not empty we have two integers in the stack what we are first doing

is we are creating a pointer to node temp and pointing it to the top node and now we are breaking

this link we are setting the address in top as address of the next node and now using this

pointer variable temp we are freeing the memory allocated to the node being removed from the list

once i exit the pop function this is my stack so this pretty much is the core of our implementation
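The push and pop functions described above can be sketched as follows. The transcript does not include the on-screen code, so the field names `data` and `link` and the error message are assumptions. Allocation failure is handled with an assert only to keep the sketch short.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

struct Node {
    int data;
    Node* link;  // address of the next node
};

Node* top = nullptr;  // null means the stack is empty

// Inserts x at the beginning of the list, which is the top of the stack.
void Push(int x) {
    Node* temp = static_cast<Node*>(std::malloc(sizeof(Node)));
    assert(temp != nullptr);
    temp->data = x;
    temp->link = top;  // new node points to the existing top
    top = temp;        // new node becomes the top
}

// Removes the node at the beginning of the list.
void Pop() {
    if (top == nullptr) {
        std::printf("Error: stack is empty\n");
        return;
    }
    Node* temp = top;
    top = top->link;  // top now holds the address of the next node
    std::free(temp);  // release the removed node
}
```

Neither function has a loop, so both run in constant time regardless of how many nodes are in the list.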

i would encourage you to write rest of the stuff yourself you can write code for operations like

top and is empty linked list implementation of stack has some advantages one of the advantages

is that unlike array based implementation we do not need to worry about overflow

unless we exhaust the memory of the machine itself some amount of extra memory is used in

each node to store reference or address but the fact that we use memory when needed

and release when not needed is something that makes push and pop operations more graceful

so this is linked list based implementation of stack in our coming lessons we will solve

some problems using stack this is it for this lesson thanks for watching

in our previous lesson we saw how we can implement a stack we saw two popular implementations of stack

one using arrays and another using linked list a warrior should not just possess a weapon

he must also know when and how to use it as programmers we must know

in what all scenarios we can use a particular data structure in this lesson i'm going to talk

about one simple use case of stack a stack can be used to reverse a list or collection

or simply to traverse a list or collection in reverse order i'm going to talk about two problems

reversal of string and reversal of linked list and i'm going to solve both these problems

using stack let's first discuss reversal of string i have a string in the form of a

character array here i have this string hello a string is a sequence of characters

this is a c-style string in c a string must be terminated with a null character so this last

character is a null character reversal means characters in the array should be rearranged

like what i'm showing here on the right null character is used only to mark the end of string

it is not part of string okay there are couple of efficient ways in which we can reverse a string

let's first discuss how we can solve this problem using a stack and then we will see how

efficient it is what we can do is we can create a stack of characters i'm showing logical representation

of a stack here this is a stack of characters and right now it's empty and now what we can do is

we can traverse the characters in the string from left to right and start pushing them onto the stack

so first h goes into the stack then the next character is e then l then we have another l

and then the last character is o once all the characters in the string have gone into the stack

we can once again start at the zeroth index now we need to write the topmost character

in the stack at this index we can get the topmost character by calling top operation

and now we can perform a pop and now we can go to the next index fill in whatever is at top of stack

and perform a pop again we can go on doing this until stack is empty so all the positions

in the character array will be overwritten so finally we have reversed our string here

in a stack whatever goes in last comes out first so if we will push a bunch of items onto a stack

and once all items are pushed if we will start popping we will get the items in reverse order

first item pushed onto the stack will come out last let's quickly write code for this logic

i'm going to write c++ here things will be pretty similar in other languages so it doesn't really

matter what i'm going to do in my code is i'm going to create a character array to store a string

and then i will ask user to input a string once i input the string i will make a call to a function

named reverse passing it the array and length of string that i will get by making a call to string

length function and finally i'm printing the reversed string now i need to write the reverse

function in reverse function i want to use a stack a stack of characters we have already seen

how we can implement stack in c++ we can create a class named stack that would have

an array of characters and an integer variable named top to mark the top of stack in array

and these variables can be private and we can work upon the stack using these public functions

in reverse function we can simply create an object of stack and use it this class can be an

array based implementation of stack or a linked list based implementation of stack it doesn't really

matter in c++ and many other languages language libraries also give us implementation of stack

in this program i'm not going to write my own stack i'm going to use stack from what we call

standard template library in c++ i will have to use this include statement hash include stack

and now i have a stack class available to me to create an object of this class i need to write

stack and within angle brackets data type for which we want a stack then after space name or

identifier with this one statement here i have created a stack of characters let's now write the

core logic this n in the signature of reverse function is number of characters in string

and c is the array as we know an array in c or c++ is always passed to a function as a pointer to its first element this c

followed by brackets is only an alternate syntax for asterisk c it's interpreted like this by the

compiler okay so now what i'm going to do is i'm going to run a loop starting 0 till n minus 1

so i will traverse the string from left to right and as i traverse the string i will push

the character onto stack by calling push function i will use a statement like this

once push is done i'll do another loop for pop i will run a loop with this variable i starting

at 0 going till n minus 1 and i'll first set c i as top of stack and then i will perform a pop

operation if you want to know more about functions available with stack in stl like their signatures

and how to use them you can check the description of this video for some resources this is all i need

to do in my reverse function let's run this code and see what happens i need to enter a string
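A sketch of the reverse function as described, using `std::stack` from the standard library. The parameter names `C` and `n` follow the narration, while `ReverseCString` is a hypothetical convenience wrapper standing in for the call made from main with the string length.

```cpp
#include <cassert>
#include <cstring>
#include <stack>

// Reverses the first n characters of C using a stack of characters.
void Reverse(char* C, int n) {
    std::stack<char> S;
    // Loop for push: traverse left to right.
    for (int i = 0; i < n; i++) {
        S.push(C[i]);
    }
    // Loop for pop: overwrite positions 0 to n-1 with whatever is on top.
    for (int i = 0; i < n; i++) {
        C[i] = S.top();
        S.pop();
    }
    assert(S.empty());  // every pushed character has been written back
}

// Hypothetical helper: reverses a null terminated string in place.
void ReverseCString(char* C) {
    Reverse(C, static_cast<int>(std::strlen(C)));
}
```

The null character is not counted in n, so it stays where it is and the array remains a valid C-style string.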

let's enter hello this is what i get as output which seems to be correct let's run this again and

this time i want to enter mycodeschool this looks all right too so we seem to be good so this

function is solving my problem of reversal let's now see how efficient it is let's analyze its

time complexity we know that all operations on stack take constant time so all these statements

inside the two loops will take constant time the first loop is running n times and then the

second loop is also running n times first loop will execute in big o of n and the second loop

will also execute in big o of n the loops are not nested they are one after other so in such scenario

complexity of the whole function will also be big o of n time complexity is big o of n

but we are using some extra memory here for stack we are pushing all the characters in the string

onto the stack the extra space taken in stack will be proportional to number of characters

in the string will be proportional to n so we can say that space complexity of this function

is also big o of n in simple words extra space taken is directly proportional to n

there are efficient ways to reverse a string without using extra space the most efficient way

probably would be to use just two variables to mark the start and end index in the string

initially let's say i am using variables i and j initially i for this example is zero and j is

four while i is less than j we can swap the characters at these positions and once we have

swapped we can increment i and decrement j if i is less than j we can swap again

and once again increment i and decrement j now i is not less than j i is equal to j at this stage

we can stop swapping and we are done this algorithm has space complexity big o of one we are using

constant extra memory here time complexity of this approach once again is big o of n
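The two-variable approach can be sketched like this. The function name `ReverseInPlace` is an assumption; `i` and `j` are the start and end indices from the explanation above.

```cpp
#include <cassert>
#include <utility>

// Reverses the first n characters of C using only two index variables.
void ReverseInPlace(char* C, int n) {
    assert(C != nullptr || n == 0);
    int i = 0;      // start index
    int j = n - 1;  // end index
    while (i < j) {
        std::swap(C[i], C[j]);  // swap the characters at both positions
        i++;
        j--;
    }
}
```

For a five character string the loop runs for i = 0 and i = 1 and stops when i and j are both 2, so the middle character is left untouched.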

we will do n by two swaps so time taken will be proportional to n definitely because of space

complexity this approach is better than our stack approach sometimes when we know that our

input will be very small and time and space is not much of concern we use a particular algorithm

for ease of implementation for its being intuitive it's clearly not the case when we are using stack

to reverse a string but for this other problem reversal of linked list that we had said we will

discuss using a stack gives us a neat and intuitive solution i have drawn a linked list of integers

here as we know linked lists are collections of entities that we call nodes each node contains

two fields one to store data and other to store address of next node i have assumed that these

nodes in this example here are at addresses 100 150 250 and 300 respectively identity of a

linked list is address of the head node we typically store this address in a variable named head

in an array it takes constant time to access any element so whether it's the first element

or last element it takes constant time to access it it is so because array is stored as one

contiguous block of memory so if we know the starting address of the array let's say the starting

address of this array is 400 and size of each element in the array a character is one byte

so for this example each element is one byte then we can calculate the address of any element

so we know that a4 is at 400 plus 4 or 404 but in a linked list nodes are stored at

disjoint locations in memory to access any node we have to start at the head node

so we can't do something as simple as having two pointers at start and end and accessing the

elements we have already seen in the series two possible approaches that can be used to reverse

a linked list one was an iterative solution where we go on reversing links as we traverse the linked

list using some temporary variables another solution was using recursion the time complexity of

iterative solution is big o of n space complexity is big o of one in recursive solution we do not

create a stack explicitly but recursion uses the stack in computer's memory that is used to execute

function calls in such a case we say that we are using implicit stack stack is not being created

explicitly but still we are using an implicit stack i will come back to this and explain in detail

the time complexity of recursive solution once again is big o of n but the space complexity is

big o of n this time space complexity is also big o of n now let's see how we can use an

explicit stack to solve this problem once again i have drawn logical representation

of stack here right now the stack is empty in a program this will be a stack of type pointer to

node what i'm going to do now is i'm going to traverse this linked list using a temporary

pointer to node the temporary variable will initially point to head when we will go to a

particular node we will push the address of that node onto the stack so first 100 will go to stack

and now we will move to the next node now 150 will go in stack and now we will go to 250

and then to the last node at 300 we are showing addresses here in the stack but basically the

objects that we are pushing are pointers to node or in other words references to nodes

if node is defined like this in c++ we will have to use these statements to traverse the linked

list and push all the references let's say head is a pointer to node which i'm assuming is a global

variable that will store the address of head node i'm using a temporary variable that is

pointer to node initially i'm storing the address of head node in this temporary variable and then

i'm running a loop and i'm traversing the linked list and as i'm traversing i'm pushing the reference

onto stack once all the references are pushed onto stack we can start popping them and as we will pop

them we will get references to nodes in reverse order it would be like going through the list in

reverse order while traversing the list in reverse order we can build reverse links
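Going through the list in reverse order can be illustrated on its own before any links are changed. This is a sketch under the assumption that `Node` has fields `data` and `next`. The function name `ReverseOrder` is hypothetical, and collecting the values in a `std::vector` is done here only to make the order easy to inspect.

```cpp
#include <cassert>
#include <stack>
#include <vector>

struct Node {
    int data;
    Node* next;
};

// Returns the data of the list starting at head, last node first.
// The list itself is not modified.
std::vector<int> ReverseOrder(Node* head) {
    std::stack<Node*> S;
    // Traverse with a temporary pointer and push every node's address.
    for (Node* temp = head; temp != nullptr; temp = temp->next) {
        S.push(temp);
    }
    // Popping yields the addresses in reverse order.
    std::vector<int> out;
    while (!S.empty()) {
        out.push_back(S.top()->data);
        S.pop();
    }
    assert(S.empty());
    return out;
}
```

For a list holding 1, 2, 3 the returned values are 3, 2, 1.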

the first thing that i'll do is i'll take a temporary variable that will be a pointer to node

and store the address at the top of stack which right now is 300 now i will set head

as this address so head now becomes 300 and then i will pop i'm running you through this example

here as i'm writing code head and temp right now are both 300 and now i will run a loop like this

like what i have written here while stack is not empty this function empty returns true if stack

is empty i'm using stack from standard template library in c++ so while stack is not empty i'm

going to say that set temp dot next as address at top of stack basically i'm using this pointer to

node temp to dereference and set this particular address field right now top is 250 so i'm building

this reverse link next statement is a pop and in the next statement i'm saying temp equal temp dot

next which means temp will now point to this node at 250 stack is not empty so loop will execute

again we are writing address here now then we should pop and then move to 150 using this

statement temp equal temp dot next now we're building this link popping and then

oops this should have been 150 and with the next temp equal temp dot next we're going here

even though we have built this link by setting this field here this node is still pointing to

this guy because the stack is empty now we will exit the loop after the loop after exit from the

loop i have written one more line temp dot next equal null so i'm setting the last link part of

last node in reversed list as null finally this is my reverse function i have assumed that head is

a global variable and it's a pointer to node if you want the complete source code you can check

the description of this video for a link using a stack in this case is making our life easier

reversing a linked list is still a complex problem try to just print the elements of linked list in

reverse order if you will use a stack it will be really easy i will stop here for this lesson

if you want to know what i meant by implicit stack you can once again check the

description of this video for some resources so this is it for this lesson thanks for watching
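Putting the walkthrough together, the reverse function can be sketched as below. It follows the steps narrated above with `head` as a global pointer to node. The field names `data` and `next` and the early return for an empty list are assumptions, since the on-screen code is not in the transcript.

```cpp
#include <cassert>
#include <stack>

struct Node {
    int data;
    Node* next;
};

Node* head = nullptr;  // global: address of the head node

// Reverses the list by pushing all node addresses onto an explicit
// stack and rebuilding the links while popping.
void Reverse() {
    if (head == nullptr) return;  // nothing to reverse
    std::stack<Node*> S;
    Node* temp = head;
    while (temp != nullptr) {  // push the address of every node
        S.push(temp);
        temp = temp->next;
    }
    assert(!S.empty());
    temp = S.top();  // the last node becomes the new head
    head = temp;
    S.pop();
    while (!S.empty()) {
        temp->next = S.top();  // build the reverse link
        S.pop();
        temp = temp->next;
    }
    temp->next = nullptr;  // old head is now the last node
}
```

Time is proportional to n for the two loops, and the stack holds n addresses, so extra space is also proportional to n.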

in our previous lesson we saw one simple application of stack we saw that a stack can be

used to reverse a list or collection or maybe to simply traverse a list or collection in reverse

order now in this lesson we will discuss another famous problem that can be solved

using stack and this is also a popular programming interview question and the problem is given an

expression in the form of a string comprising of let's say constants variables operators

and parenthesis and when i say parenthesis i also want to include curly braces and brackets in

my definition of parenthesis so my expression or string can contain characters that can be

upper or lowercase letters symbols for operators and an opening or closing parenthesis or an opening

or closing curly brace or an opening or closing square bracket let's write down some expressions

here i'm going to write a simple expression we have one simple expression here with one pair

of opening and closing parenthesis here in this expression we have nested parenthesis

now given such expressions we want to write a program that would tell us whether parenthesis

in the expression are balanced or not and what do we really mean by balanced parenthesis what we

really mean by balanced parenthesis is that corresponding to each opening parenthesis or

opening curly brace or opening bracket we should have a closing counterpart in correct order

these two expressions here are balanced however this next expression is not balanced

a closing curly brace is missing here this next expression is also not balanced because we are

missing an opening square bracket here this next one is also not balanced because corresponding to

this opening curly brace we do not have a closing curly brace and corresponding to this closing

parenthesis we do not have an opening parenthesis if we are opening with a curly brace we should

also close with a curly brace these two won't count for each other checking for balanced parenthesis is

one of the tasks performed by a compiler when we write a program we often miss an opening or closing

curly brace or an opening or closing parenthesis compiler must check for this balancing and if symbols

are not balanced it should give you an error in this problem here what's inside a parenthesis

does not matter we do not want to check for correctness of anything that is

inside a parenthesis so in the string any character other than opening and closing parenthesis or

opening and closing curly brace or opening and closing square bracket can be ignored this problem

sometimes is better stated like this given a string comprising only of opening and closing

characters of parenthesis braces or brackets we want to check for balancing so only these characters

and their order is important while parsing a real expression we can simply ignore other

characters all we care about is these characters and their order okay so now how do we solve this

problem one straightforward thing that comes to mind is that because we should have a closing

counterpart for an opening parenthesis or opening curly brace or opening square bracket what we can

do is we can count the number of opening and closing symbols for each of these three types

and they should be equal so the number of opening parenthesis should be equal to number of closing

parenthesis and the number of opening curly braces should be equal to number of closing curly braces

and same should be true for square brackets as well but it will not be good enough this expression

here has one opening parenthesis and one closing parenthesis but it's not balanced this next one

is balanced but this one with same number of characters of each type as the second expression

is not balanced so this approach won't work apart from count being equal there are some other

properties that must be conserved every opening parenthesis must find a closing counterpart to

its right and every closing parenthesis must find an opening counterpart in its left which is not

true in the first expression and the other property that must be conserved is that a parenthesis can

close only when all the parenthesis opened after it are closed this parenthesis has been opened after

this square bracket so this square bracket cannot close unless this parenthesis has closed

anything that is opened last should be closed first well actually it should not

be last opened first closed in this example here this is getting opened last but this guy

that is opened previous to this is closed first and it is fine the property that must be conserved

is that as we scan the expression from left to right any closure should be for the previous

unclosed parenthesis any closure should be for the last unclosed let's scan some expressions

from left to right and see how it's true let's scan this last one we will go from left to right

first character is an opening of square bracket second one is an opening parenthesis

let's mark opening of unclosed parenthesis in red okay now we have a closure here

the third character is a closure this should be the closure for the last unclosed so this

should be the closure for this one this guy this opening parenthesis last unclosed now is this

guy next character once again is an opening parenthesis now we have two unclosed parenthesis

at this stage and this one is the last unclosed the next one is a closure so it should be closure

for the last unclosed now the last unclosed once again is the opening of square bracket

now when we have a closure it should be closure for this guy

we can use this approach to solve this problem what we can do is we can scan the expression

from left to right and as we scan at any stage we can keep track of all the unclosed parenthesis

basically what we can do is whenever we get an opening symbol an opening parenthesis an

opening curly brace or an opening square bracket we can add it to a list if we get a closing symbol

it should be the closure for the last element in the list in case of an inconsistency

like if the last opening symbol in the list is not of the same type as the closing symbol

or if there is no last opening symbol at all because the list is empty we can stop this whole

process and say that parenthesis are not balanced else we can remove the last opening symbol in the

list because we have got its counterpart and continue this whole process things will be

further clear if i will run through an example i will run through this last example once again

we are going to scan this expression from left to right and we will maintain a list to keep track

of all the open parenthesis that are not yet closed we will keep track of all the unclosed

parenthesis opened but not closed initially this list is empty the first character that we have got

is an opening of square bracket this will go into the list and we will move to the next character

the next character is an opening parenthesis so once again it should go to the list we should

always insert at end in the list the next character is a closing of parenthesis now we must look at

the last opening symbol in the list and if it is of the same type then we have got its counterpart

and we should remove this now we move on to the next character this is once again an opening

parenthesis it should go in the list at the end the next character is a closing of parenthesis

so we will look at the last element in the list it's an opening parenthesis so we can remove it

from the list and now we go to the last character which is a closing of square bracket once again

we need to look at the last element in the list we have one element only one element in the list

at this stage it's an opening of square bracket so once again we can remove it from the list

now we are done scanning the expression and the list is empty once again if everything is all right if

parenthesis are balanced we will always end with an empty list if in the end list is not empty

then some opening parenthesis has not found its closing counterpart and expression is not balanced
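The walkthrough above can be replayed in code. This is a minimal sketch, not the lesson's own implementation: the names `isMatchingPair` and `traceUnclosed` are made up for illustration, and a `std::string` stands in for the list because we only ever insert and remove at its end. It records what the list of unclosed symbols looks like after each character.

```cpp
#include <string>
#include <vector>

// True when `open` and `close` are an opening/closing pair of the same type.
bool isMatchingPair(char open, char close) {
    return (open == '(' && close == ')') ||
           (open == '{' && close == '}') ||
           (open == '[' && close == ']');
}

// Scans `expr` from left to right and records the list of unclosed opening
// symbols after each character. Stops at the first inconsistency.
// `balanced` receives the final verdict.
std::vector<std::string> traceUnclosed(const std::string& expr, bool& balanced) {
    std::string unclosed;                  // the list: insert and remove at the end only
    std::vector<std::string> snapshots;
    balanced = false;
    for (char c : expr) {
        if (c == '(' || c == '{' || c == '[') {
            unclosed.push_back(c);         // opening symbol goes to the end of the list
        } else if (c == ')' || c == '}' || c == ']') {
            if (unclosed.empty() || !isMatchingPair(unclosed.back(), c)) {
                return snapshots;          // closer does not match the last unclosed symbol
            }
            unclosed.pop_back();           // counterpart found, remove it
        }
        snapshots.push_back(unclosed);
    }
    balanced = unclosed.empty();           // leftovers mean some opener was never closed
    return snapshots;
}
```

For the expression `[()()]` the recorded lists are `[`, `[(`, `[`, `[(`, `[` and finally the empty list, which matches the steps traced by hand above.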

one thing worth noticing here is that we are always inserting or removing one element at a time from

the same end of the list in this whole process whatever is coming in last in the list is going

out first there is a special kind of list that enforces this behavior that element should be inserted

and removed from the same end and we call it a stack in a stack we can insert and remove an element

one at a time from the same end in constant time so what we can do is whenever we get an opening

symbol while scanning the expression we can push it onto the stack and when we get a closing symbol we can

check whether the opening symbol at the top of stack is of the same type as the closing symbol

if it's of the same type we can pop it if it's not of the same type we can simply say that parenthesis

are not balanced i will quickly write pseudo code for this logic i'm going to write a function

named check balanced parenthesis that will take an expression in the form of a string as argument

first of all i will store the number of characters in the string in a variable and then i will create

a stack of characters and now i will scan the expression from left to

right using a loop while scanning if the character is an opening symbol if it's an opening parenthesis

or opening curly brace or opening square bracket we can push that character onto the stack let's

say this function push will push a character onto s else if expression i or the character

at ith position while scanning is a closing symbol of any of the three types we can have two scenarios

if stack is empty or top of stack does not pair with the closing symbol if we have a closing of

parenthesis then the top of stack should be an opening of parenthesis it cannot be an opening

of curly brace in such a scenario we can conclude that the parenthesis are not balanced else we

can perform a pop finally once our scanning is over we can check whether stack is empty or not

if it's empty parenthesis are balanced if it's not they are not balanced so this is my pseudo code
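Here is one way the pseudo code could look in C++. Treat it as a sketch under assumptions: it uses `std::stack` from the standard library rather than a hand-written stack, and the helper name `arePair` is invented here. Characters that are not brackets are simply skipped.

```cpp
#include <stack>
#include <string>

// True when `opening` and `closing` form a pair of the same bracket type.
bool arePair(char opening, char closing) {
    return (opening == '(' && closing == ')') ||
           (opening == '{' && closing == '}') ||
           (opening == '[' && closing == ']');
}

// Returns true when every bracket in `exp` is closed in the right order.
bool checkBalancedParenthesis(const std::string& exp) {
    std::stack<char> s;
    for (char c : exp) {
        if (c == '(' || c == '{' || c == '[') {
            s.push(c);                                  // opening symbol: push
        } else if (c == ')' || c == '}' || c == ']') {
            // Empty stack or wrong type on top: not balanced.
            if (s.empty() || !arePair(s.top(), c)) {
                return false;
            }
            s.pop();                                    // matched: pop the opener
        }
    }
    return s.empty();   // anything left on the stack was never closed
}
```

With this sketch `checkBalancedParenthesis("{()()}")` returns true, while `")("` and `"[(])"` both return false.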

let's run through couple of examples and see whether this works for all scenarios or test cases or

not let's first look at this expression the first thing that we are doing in our code is that we are

creating a stack of characters i have drawn logical representation of a stack here okay now let's scan

this string let's say we have a zero based index and the string is just a character array we are

starting the scan we are going inside the loop this is a closing of parenthesis so this if statement

will not hold true so we will go to the else condition and now we will go inside the else

to check for this condition whether stack is empty or not or whether the top of stack pairs

with this closing symbol or not the stack is empty if the stack is empty there is no opening

counterpart for this closing symbol so we will simply return false returning means exiting the

function so we are simply concluding here that parenthesis are not balanced and exiting

let's go through this one now first we have an opening square bracket so we will go to the first

if and push next one is an opening parenthesis once again it will be pushed next one is a closing

square bracket so the condition for this else if will be true we will go inside this else if

now this time the top of stack is an opening parenthesis it should have been an opening square

bracket and then only we would have a pair so this time also we will have to return false

and exit okay now let's go through this one first we will have a push the next one

will also be a push now next one is a closer of parenthesis which pairs with the top of stack

which is opening of parenthesis so we will have a pop we will go to the next character and this

one once again is an opening parenthesis so there will be a push next one is a closing parenthesis

and the top is an opening parenthesis they pair so there will be a pop last character is a closing

curly brace so once again we will see whether top of stack is an opening curly brace or not do we

have a pair or not yes we have a pair so there will be a pop with this our scanning will finish

and finally stack should be empty it is empty so we have balanced parenthesis here

try implementing this pseudo code in a language of your choice and see whether it works for all

test cases or not if you want to look at my implementation you can check the description of

this video for a link in the coming lessons we will see some more problems on stack this is it for

this lesson thanks for watching hello everyone in this lesson we are going to talk about one

important and really interesting topic in computer science where we find application of stack

data structure and this topic is evaluation of arithmetic and logical expressions

so how do we write an expression I have written some simple arithmetic expressions here an expression

can have constants variables and symbols that can be operators or parenthesis and all these

components must be arranged according to a set of rules according to a grammar and we should be

able to parse and evaluate the expression according to this grammar all these expressions that I have

written here have a common structure we have an operator in between two operands

operand by definition is an object or value on which operation is performed in this expression

two plus three two and three are operands and plus is operator in the next expression

a and b are operands and minus is operator in the third expression this asterisk is for

multiplication operation so this is the operator the first operand p is a variable

and the second operand 2 is a constant this is the most common way of writing an expression

but this is not the only way this way of writing an expression in which we write

an operator in between operands is called infix notation operand doesn't always have to be

a constant or variable operand can be an expression itself in this fourth expression that I have

written here one of the operands of multiplication operator is an expression itself another operand

is a constant we can have a further complex expression in this fifth expression that I have

written here both the operands of multiplication operator are expressions we have three operators

in this expression here for this first plus operator p and q this variables p and q are operands

for the second plus operator we have r and s and for this multiplication operator the first

operand is this expression p plus q and the second operand is this expression r plus s

while evaluating expressions with multiple operators operations will have to be performed

in certain order like in this fourth example we will first have to perform the addition and then

only we can perform multiplication in this fifth expression first we will have to perform these

two additions and then we can perform the multiplication we will come back to evaluation but if you can

see in all these expressions operator is placed in between operands this is the syntax that we are

following one thing that I must point out here throughout this lesson we are going to talk only

about binary operators an operator that requires exactly two operands is called a binary operator

technically we can have an operator that may require just one operand or maybe more than two

operands but we are talking only about expressions with binary operators okay so let's now see what

all rules we need to apply to evaluate such expressions written in this syntax that we are

calling infix notation for an expression with just one operator there is no problem we can

simply apply that operator for an expression with multiple operators and no parenthesis like

this we need to decide an order in which operators should be applied in this expression if we will

perform the addition first then this expression will reduce to 10 into 2 and will finally evaluate

as 20 but if we will perform the multiplication first then this expression will reduce to 4 plus 12

and will finally evaluate to 16 so basically we can look at this expression in two ways

we can say that operands for addition operator are 4 and 6 and operands for multiplication are

this expression 4 plus 6 and this constant 2 or we can say that operands for multiplication are 6

and 2 and operands for addition operation are 4 and this expression 6 into 2 there is some ambiguity

here but if you remember your high school mathematics this problem is resolved by following operator

precedence rule in an algebraic expression this is the precedence that we follow first preference

is given to parenthesis or brackets next preference is given to exponents i'm using this symbol for

exponent operator so if i have to write 2 to the power 3 i'll be writing it something like this

in case of multiple exponentiation operator we apply the operators from right to left so

if i have something like this then first this rightmost exponentiation operator will be applied

so this will reduce to 512 if you will apply the left operator first then this will evaluate

to 64 after exponents next preference is given to multiplication and division and if it's between

multiplication and division operators then we should go from left to right after multiplication

and division we have addition and subtraction and here also we go from left to right if we have

an expression like this with just addition and subtraction operators then we will apply the

leftmost operator first because the precedence of these operators is same and this will evaluate to

3 if you will apply the plus operator first this will evaluate as 1 and that will be wrong
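The precedence order and the left to right or right to left rule can be written down as a small table in code. This is only a sketch: the names `OperatorInfo`, `operatorInfo` and `leftAppliesFirst` are hypothetical, the numeric precedence values are arbitrary as long as their order is right, and `^` is assumed as the exponent symbol because the transcript does not say which symbol was drawn on screen.

```cpp
// Precedence and direction for the binary operators discussed so far.
struct OperatorInfo {
    int precedence;       // larger value means applied earlier
    bool rightToLeft;     // true: among equals, apply the rightmost first
};

// Looks up the rule for one operator character.
// '^' is an assumed symbol for the exponent operator.
// Unknown characters get precedence 0.
OperatorInfo operatorInfo(char op) {
    switch (op) {
        case '^':           return {3, true};
        case '*': case '/': return {2, false};
        case '+': case '-': return {1, false};
        default:            return {0, false};
    }
}

// For an expression "x left y right z" without parenthesis:
// should the operator `left` be applied before `right`?
bool leftAppliesFirst(char left, char right) {
    OperatorInfo l = operatorInfo(left);
    OperatorInfo r = operatorInfo(right);
    if (l.precedence != r.precedence) {
        return l.precedence > r.precedence;   // higher precedence wins
    }
    return !l.rightToLeft;                    // equal precedence: use the direction rule
}
```

For 4 plus 6 into 2, `leftAppliesFirst('+', '*')` is false, so the multiplication goes first and we get 16. For two exponent operators in a row, `leftAppliesFirst('^', '^')` is also false, which is the right to left rule that gives 512.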

in this second expression 4 plus 6 into 2 that i have written here if we will apply operator

precedence then multiplication should be performed first if we want to perform the addition first

then we need to write this 4 plus 6 within parenthesis and now addition will be performed first because

precedence of parenthesis is greater i'll take example of another complex expression and try to

evaluate it just to make things further clear so i have an expression here in this expression

we have four operators one multiplication one division one subtraction and one addition

multiplication and division have higher precedence between these two multiplication and division

which have same precedence we will pick the left one first so we will first reduce this expression

like this and now we will perform the division and now we have only subtraction and addition

so we will go from left to right and this is what we will finally get this right to left and left

to right rule that i have written here for operators with equal precedence is better termed as operator

associativity if in case of multiple operators with equal precedence we go from left to right

then we say that the operators are left associative and if we go from right to left we say that the

operators are right associative while evaluating an expression in infix form we first need to

look at precedence and then to resolve conflict among operators with equal precedence we need to

see associativity all in all we need to do so many things just to parse and evaluate an

infix expression the use of parenthesis becomes really important because that's how we can control

the order in which operation should be performed parenthesis add explicit intent that operation

should be performed in this order and also improve readability of expression i have modified

this third expression we have some parenthesis here now and most often we write infix expressions

like this only using a lot of parenthesis even though infix notation is the most

common way of writing expressions it's not very easy to parse and evaluate an infix expression

without ambiguity so mathematicians and logicians studied this problem and came up with two other

ways of writing expressions that are parenthesis free and can be parsed without ambiguity without

requiring to take care of any of these operator precedence or associativity rules and these two

ways are postfix and prefix notations prefix notation was proposed earlier in the year 1924 by

a Polish logician prefix notation is also known as Polish notation in prefix notation operator

is placed before operands this expression two plus three in infix will be written as plus two

three in prefix plus operator will be placed before the two operands two and three p minus

q will be written as minus pq once again just like infix notation operand in prefix notation

doesn't always have to be a constant or variable operand can be a complex prefix expression itself

this expression a plus b asterisk c in infix form will be written like this in prefix form

i'll come back to how we can convert infix expression to prefix first have a look at this

third expression in prefix form for this multiplication operator the two operands are

variables b and c these three elements are in prefix syntax first we have the operator and then

we have the two operands the operands for addition operator are variable a and this prefix expression

asterisk b c in infix expression we need to use parenthesis because an operand can possibly be

associated with two operators like in this third expression in infix form b can be associated with

both plus and multiplication to resolve this conflict we need to use operator precedence

and associativity rules or use parenthesis to explicitly specify association but in prefix

form and also in post-fix form that we will discuss in some time an operand can be associated

with only one operator so we do not have this ambiguity while parsing and evaluating prefix

and post-fix expressions we do not need extra information we do not need all the operator

precedence and associativity rules i'll come back to how we can evaluate prefix notation

i'll first define post-fix notation post-fix notation is also known as reverse Polish notation

this syntax was proposed in 1950s by some computer scientists in post-fix notation operator is

placed after operands programmatically post-fix expression is easiest to parse and least costly

in terms of time and memory to evaluate and that's why this was actually invented prefix expression

can also be evaluated in similar time and memory but the algorithm to parse and evaluate post-fix

expression is really straightforward and intuitive and that's why it's preferred for

computation using machines i'm going to write post-fix for these expressions that i had written

earlier in other forms this first expression 2 plus 3 in post-fix will be 2 3 plus to separate

the operands we can use a space or some other delimiter like a comma that's how you would typically

store prefix or post-fix in a string when you'll have to write a program this second expression

in post-fix will be pq minus so as you can see in post-fix form we are placing the operator after

the operands this third expression in post-fix will be abc asterisk and then plus for this

multiplication operator operands are variables b and c and for this addition operands are variable

a and this post-fix expression bc asterisk we will see efficient algorithms to convert

in-fix to prefix or post-fix in later lessons for now let's not bother how we will do this

in a program let's quickly see how we can do this manually to convert an expression from

in-fix to any of these other two forms we need to go step by step just the way we would go in

evaluation i have picked this expression a plus b into c in in-fix form we should first convert

the part that should be evaluated first so we should go in order of precedence we can also first

put all the implicit parenthesis so here we will first convert this b into c so first we are doing

this conversion for multiplication operator and then we will do this conversion for addition operator

we will bring addition to the front so this is how the expression will transform we can use

parenthesis in intermediate steps and once we are done with all the steps we can erase the parenthesis

let's now do the same thing for post-fix we will first do the conversion for

multiplication operator and then in next step we will do it for addition
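One way to see that all three notations describe the same expression is to store the expression once and print it in different orders. The sketch below is an illustration that goes beyond what the lesson shows: it assumes a small tree type `Expr` (a name invented here) in which an operator holds its two operands, and each operand may itself be an expression. Operands are single characters to keep it short.

```cpp
#include <memory>
#include <string>
#include <utility>

// An operand (no children) or a binary operator with two operand expressions.
struct Expr {
    char symbol;
    std::unique_ptr<Expr> left;
    std::unique_ptr<Expr> right;
};

// Builds a single variable or constant.
std::unique_ptr<Expr> operand(char name) {
    return std::unique_ptr<Expr>(new Expr{name, nullptr, nullptr});
}

// Builds "l op r" where l and r can be whole expressions.
std::unique_ptr<Expr> binary(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    return std::unique_ptr<Expr>(new Expr{op, std::move(l), std::move(r)});
}

// Prefix: operator first, then both operands.
std::string toPrefix(const Expr& e) {
    if (!e.left) return std::string(1, e.symbol);
    return std::string(1, e.symbol) + toPrefix(*e.left) + toPrefix(*e.right);
}

// Postfix: both operands first, then the operator.
std::string toPostfix(const Expr& e) {
    if (!e.left) return std::string(1, e.symbol);
    return toPostfix(*e.left) + toPostfix(*e.right) + e.symbol;
}
```

Building a plus b into c as `binary('+', operand('a'), binary('*', operand('b'), operand('c')))` gives `+a*bc` from `toPrefix` and `abc*+` from `toPostfix`, the same results as the manual conversion, and no parenthesis are needed in either output.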

and now we can get rid of all the parenthesis parenthesis surely adds readability to any of

these expressions to any of these forms but if we are not bothered about human readability

then for a machine we are actually saving some memory that would be used to store parenthesis

information infix expression definitely is most human readable but prefix and postfix are good

for machines so this is infix prefix and postfix notation for you in next lesson we will discuss

evaluation of prefix and postfix notations this is it for this lesson thanks for watching

in our previous lesson we saw what prefix and postfix expressions are but we did not discuss how

we can evaluate these expressions in this lesson we will see how we can evaluate prefix and

postfix expressions algorithms to evaluate prefix and postfix expressions are similar but i'm going

to talk about postfix evaluation first because it's easier to understand and implement and then

i'll talk about evaluation of prefix okay so let's get started i have written an expression in

infix form here and i first want to convert this to postfix form as we know in infix form operator

is written in between operands and we want to convert to postfix in which operator is written

after operands we have already seen how we can do this in our previous lesson we need to go step

by step just the way we would go in evaluation of infix we need to go in order of precedence

and in each step we need to identify operands of an operator and we need to bring the operator

after the operands what we can actually do is we can first resolve operator precedence and

put parenthesis at appropriate places in this expression we'll first do this multiplication

this first multiplication then we'll do this second multiplication then we will perform this

addition and finally the subtraction okay now we will go one operator at a time

operands for this multiplication operator are a and b so this a asterisk b will become

a b asterisk now next we need to look at this multiplication this will transform to c d

asterisk and now we can do the change for this addition the two operands are these two expressions

in postfix so i'm placing the plus operator after these two expressions finally for this last

operator the operands are this complex expression and this variable e so this is how it will look

after the transformation finally when we are done with all the operators we can get rid of

all the parenthesis they are not needed in postfix expression this is how you can do the conversion

manually we will discuss efficient ways of doing this programmatically in later lessons

we will discuss algorithms to convert infix to prefix or postfix in later lessons in this lesson

we are only going to look at algorithms to evaluate prefix and postfix expressions

okay so we have this postfix expression here and we want to evaluate this expression

let's say for these values of variables a b c d and e so we have this expression in terms of

values to evaluate i'll first quickly tell you how you can evaluate a postfix expression manually

what you need to do is you need to scan the expression from left to right and find the

first occurrence of an operator like here multiplication is the first operator in postfix expression

operands of an operator will always lie to its left for the first operator the preceding two

entities will always be operands you need to look for the first occurrence of this pattern

operand operand operator in the expression and now you can apply the operator on these two operands

and reduce the expression so this is what i'm getting after evaluating two three asterisk

now we need to repeat this process till we are done with all the operators once again we need to

scan the expression from left to right and look for the first operator if the expression is correct

it will be preceded by two values so basically we need to look for first occurrence of this pattern

operand operand operator so now we can reduce this we have six and then we have five into four

twenty we are using space as delimiter here there should be some space in between two operands

okay so this is what i have now once again i look for the first occurrence of operand operand

and operator we will go on like this till we are done with all the operators

when i'm saying we need to look for first occurrence of this pattern operand operand and operator

what i mean by operand here is a value and not a complex expression itself the first operator

will always be preceded by two values and if you will give this some thought you will be able to

understand why if you can see in this expression we are applying the operators in the same order

in which we have them while parsing from left to right so first we are applying this left most

multiplication on two and three then we are applying the next multiplication on five and four then

we are performing the addition and then finally we are performing the subtraction and whenever

we are performing an operation we are picking the last two operands preceding the operator in the

expression so if we have to do this programmatically if we have to evaluate a postfix expression given

to us in a string like this and let's say operands and operators are separated by space we can have

some other delimiter like comma also to separate operands and operator now what we can do is we

can parse the string from left to right in each step in this parsing in each step in this scanning

process we can get a token that will either be an operator or an operand what we can do is as we

parse from left to right we can keep track of all the operands seen so far and i'll come back to

how it will help us so i'm keeping all the operands seen so far in a list the first entity that

we have here is two which is an operand so it will go to the list next we have three which once

again is operand so it will go into the list next we have this multiplication operator

now this multiplication should be applied to last two operands preceding it last two operands to

the left of it because we already have the elements stored in this list all we need to do is we

need to pick the last two from this list and perform the operation it should be two into three

and with this multiplication we have reduced the expression this two three asterisk has now become

six it has become an operand that can be used by an operator later we are at this stage right now

that i'm showing in the right i'll continue the scanning next we have an operand we'll push this

number five onto the list next we have four which once again will come to the list and now we have

the multiplication operator and it should be applied to the last two operands in the reduced

expression and we should put the result back into the list this is the stage where we are right now

so this list actually is storing all the operands in the reduced expression

preceding the position at which we are during parsing now for this addition we should take out

the last two elements from the list and then we should put the result back next we have an operand

we are at this stage right now next we have an operator this subtraction we will perform this

subtraction and put the result back finally when i'm done scanning the whole expression i'll have

only one element left in the list and this will be my final answer this will be my final result
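The list-based scan just described can be sketched as follows. The assumptions are mine, not the lesson's: tokens are separated by spaces, operands are integers, division is integer division, and the names `applyOperator` and `evaluatePostfixTokens` are invented for this example. A `std::vector` plays the role of the list, and we only touch its end.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Applies one binary operator to two operands.
int applyOperator(const std::string& op, int lhs, int rhs) {
    if (op == "+") return lhs + rhs;
    if (op == "-") return lhs - rhs;
    if (op == "*") return lhs * rhs;
    if (op == "/") return lhs / rhs;   // integer division, no check for zero
    throw std::invalid_argument("unknown operator: " + op);
}

// Evaluates a space-delimited postfix expression with integer operands.
int evaluatePostfixTokens(const std::string& expression) {
    std::vector<int> operands;         // operands of the reduced expression so far
    std::istringstream tokens(expression);
    std::string token;
    while (tokens >> token) {          // one token per step, left to right
        if (token == "+" || token == "-" || token == "*" || token == "/") {
            if (operands.size() < 2) {
                throw std::invalid_argument("operator needs two operands");
            }
            // The last two operands in the list belong to this operator.
            int rhs = operands.back(); operands.pop_back();
            int lhs = operands.back(); operands.pop_back();
            operands.push_back(applyOperator(token, lhs, rhs));
        } else {
            operands.push_back(std::stoi(token));   // operand goes into the list
        }
    }
    if (operands.size() != 1) {
        throw std::invalid_argument("malformed expression");
    }
    return operands.back();            // the single element left is the result
}
```

Calling `evaluatePostfixTokens("2 3 * 5 4 * + 9 -")` returns 17, and because tokens are split on spaces an operand with several digits such as 12 works too.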

this is an efficient algorithm we are doing only one pass on the string representing the expression

and we have our result the list that we are using here if you could notice is being used in a special

way we are inserting operands one at a time from one side and then to perform an operation we are

taking out operand from the same side whatever is coming in last is getting out first

this whole thing that we are doing here with the list can be done efficiently with a stack

which is nothing but a special kind of list in which elements are inserted

and removed from the same side in which whatever gets in last comes out first it's called a last

in first out structure let's do this evaluation again i have drawn logical representation of stack

here and this time i'm going to use this stack i'll also write pseudo code for this algorithm i'm

going to write a function named evaluate postfix that will take a string as argument let's name this

string expression exp for expression in my function here i'll first create a stack now for the sake

of simplicity let's assume that each operand or operator in the expression will be of only one

character so to get a token operand or operator we can simply run a loop from zero till

length of expression minus one so expression i will be my operand or operator if expression i

is operand i should push it onto the stack else if expression i is operator we should do

two pop operations in the stack store the value of the operands in some variable i'm using variables

named op1 and op2 let's say this pop function will remove an element from top of stack s

and also return this element once we have the two operands we can perform the operation i'm using

this variable to store the output let's say this function will perform the operation now the result

should be pushed back onto the stack if i have to run through this expression with whatever code i

have right now then first entity is two which is operand so it should be pushed onto the stack

next we have three once again this will go to the stack next we have this multiplication operator

so we will come to this else if part of the code i'll make first pop and i'll store three

in this variable op1 well actually this is the second operand so i should say this one is op2

and next one will be op1 once i have popped these two elements i can perform the operation

as you can see i'm doing the same stuff that i was doing with the list the only thing is that

i'm showing things vertically stack is being shown as a vertical list i'm inserting or taking

out from the top now i'll push the result back onto the stack now we will move to the next entity

which is operand it will go into the stack next four will also go into the stack and now we have

this multiplication so we will perform two pop operations after this operation is performed

result will be pushed back next we have addition so we will go on like this we have 26 pushed onto

the stack now now it's nine which will go in and finally we have this subtraction 26 minus nine

17 will be pushed onto the stack at this stage we will be done with the loop
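A possible C++ version of this pseudo code is below. It keeps the same simplification, every operand and operator is exactly one character, so operands are single digits. The use of `std::stack` and the helper name `performOperation` are assumptions made for this sketch, and any character that is neither a digit nor one of the four operators is ignored.

```cpp
#include <cctype>
#include <stack>
#include <stdexcept>
#include <string>

// Applies operator `op` to op1 (first operand) and op2 (second operand).
int performOperation(char op, int op1, int op2) {
    switch (op) {
        case '+': return op1 + op2;
        case '-': return op1 - op2;
        case '*': return op1 * op2;
        case '/': return op1 / op2;   // integer division, no check for zero
        default:  throw std::invalid_argument("unknown operator");
    }
}

// Evaluates a postfix expression made of single-digit operands and + - * /.
int evaluatePostfix(const std::string& exp) {
    std::stack<int> s;
    for (char c : exp) {                               // scan from left to right
        if (std::isdigit(static_cast<unsigned char>(c))) {
            s.push(c - '0');                           // operand: push its value
        } else if (c == '+' || c == '-' || c == '*' || c == '/') {
            if (s.size() < 2) {
                throw std::invalid_argument("operator needs two operands");
            }
            int op2 = s.top(); s.pop();   // popped first, so it is the second operand
            int op1 = s.top(); s.pop();
            s.push(performOperation(c, op1, op2));     // result goes back on the stack
        }
    }
    if (s.size() != 1) {
        throw std::invalid_argument("malformed expression");
    }
    return s.top();
}
```

For the expression used in the walkthrough, `evaluatePostfix("23*54*+9-")` returns 17.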

we are done with all the tokens all the operands and operators the top of stack can be returned as

final result at this stage we will have only one element in the stack and this element will be

my final result you will have to take care of some parsing logic in actual implementation

operand can be a number of multiple digits and then we will have delimiter like space or comma

so you'll have to take care of that parsing operand or operator will be some task if you want to see

my implementation you can check the description of this video for a link okay so this was post-fix

evaluation let's now quickly see how we can do prefix evaluation once again i've written this

expression in infix form and i'll first convert it to prefix we will go in order of precedence i first

put this parenthesis this two asterisk three will become asterisk two three this five into four will

become asterisk five four and now we will pick this plus operator whose operands are these two

prefix expressions finally for the subtraction operator this is the first operand and this is

the second operand in the last step we can get rid of all the parenthesis so this is what i have

finally let's now see how we can evaluate a prefix expression like this we will do it just like

post-fix this time all we need to do is we need to scan from right so we will go from right to left

once again we will use a stack if it's an operand we can push it on to the stack so here for this

example nine will go on to the stack and now we will go to the next entity in the left it's four

once again we have an operand it will go on to the stack now we have five

five will also be pushed on to the stack and now we have this multiplication operator at this stage

we need to pop two elements from the stack this time the first element popped will be the first

operand in post-fix the first element popped was the second operand this time the second element

popped will be the second operand for this multiplication first operand is five and second

operand is four this order is really important for multiplication the order doesn't matter but for

say division or subtraction this will matter result 20 will be pushed on to the stack
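The right to left scan and the operand order can be sketched like this. As before it is an illustration under assumptions: single-digit operands, one character per token, `std::stack` as the stack, and the names `applyBinary` and `evaluatePrefix` are made up here.

```cpp
#include <cctype>
#include <stack>
#include <stdexcept>
#include <string>

// Applies operator `op` to op1 (first operand) and op2 (second operand).
int applyBinary(char op, int op1, int op2) {
    switch (op) {
        case '+': return op1 + op2;
        case '-': return op1 - op2;
        case '*': return op1 * op2;
        case '/': return op1 / op2;   // integer division, no check for zero
        default:  throw std::invalid_argument("unknown operator");
    }
}

// Evaluates a prefix expression made of single-digit operands and + - * /.
int evaluatePrefix(const std::string& exp) {
    std::stack<int> s;
    // Scan from the right end of the string towards the left.
    for (auto it = exp.rbegin(); it != exp.rend(); ++it) {
        char c = *it;
        if (std::isdigit(static_cast<unsigned char>(c))) {
            s.push(c - '0');
        } else if (c == '+' || c == '-' || c == '*' || c == '/') {
            if (s.size() < 2) {
                throw std::invalid_argument("operator needs two operands");
            }
            int op1 = s.top(); s.pop();   // popped first, so it is the first operand
            int op2 = s.top(); s.pop();
            s.push(applyBinary(c, op1, op2));
        }
    }
    if (s.size() != 1) {
        throw std::invalid_argument("malformed expression");
    }
    return s.top();
}
```

The expression from this walkthrough written as one string is `-+*23*549`, and `evaluatePrefix("-+*23*549")` returns 17. The order of the two pops is the only real difference from the postfix version, and `evaluatePrefix("-93")` giving 6 rather than minus 6 shows why it matters.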

and we will keep moving left now we have three and two both will go on to the stack

and now we have this multiplication operation three and two will be popped and their product

six will be pushed now we have this addition the two elements at top are 20 and six they will be

popped and their sum 26 will be pushed finally we have this subtraction 26 and nine will be popped

out and 17 will be pushed and finally this is my answer prefix evaluation can be performed

in couple of other ways also but this is easiest and most straightforward okay so this was prefix

and post-fix evaluation using stack in coming lessons we will see efficient algorithms to

convert in-fix to prefix or post-fix this is it for this lesson thanks for watching in our

previous lesson we saw how we can evaluate prefix and post-fix expressions now in this lesson we

will see an efficient algorithm to convert in-fix to post-fix we already know of one way of doing

this we have seen how we can do this manually to convert an in-fix expression to post-fix we apply

operator precedence and associativity rules let's do the conversion for this expression that I have

written here the precedence of multiplication operator is higher so we will first convert this

part b asterisk c b asterisk c will become b c asterisk the operator will come after the

operands now we can do the conversion for this addition for addition the operands are a and this

post-fix expression in the final step we can get rid of all the parentheses so finally this is

my post-fix expression we can use this logic in a program also but it will not be very efficient

and the implementation will also be somewhat complex i'm going to talk about one algorithm

which is really simple and efficient and in this algorithm we need to parse the in-fix expression

only once from left to right and we can create the post-fix expression if you can see in in-fix

to post-fix conversion the positions of operands and operators may change but the order in which

operands occur from left to right will not change the order of operators may change this is an

important observation in both in-fix and post-fix forms here the order of operands as we go from

left to right is first we have a then we have b and then we have c but the order of operators

is different in in-fix first we have plus and then we have multiplication in post-fix first we

have multiplication and then addition in post-fix form we will always have the operators in the

same order in which they should be executed i'm going to perform this conversion once again but

this time i'm going to use a different logic what i'll do is i'll parse the in-fix expression from

left to right so i'll go from left to right looking at each token that will either be an operand

or an operator in this expression we will start at a a is an operand if it's an operand we can

simply append it in the post-fix string or expression that we are trying to create

at least for a it should be very clear that there is nothing that can come before a

okay so the first rule is that if it's an operand we can simply put it in the post-fix expression

moving on next we have an operator we cannot put the operator in the post-fix expression

because we have not seen its right operand yet while parsing we have seen only its left operand

we can place it only after its right operand is also placed so what i'm going to do is i'm going

to keep this operator in a separate list or collection and place it later in the post-fix expression

when it can be placed and the structure that i'm going to use for storage is stack a stack is only

a special kind of list in which whatever comes in last goes out first insertion and deletion

happen from the same end i have pushed plus operator onto the stack here moving on next we have b

which is an operand as we had said operand can simply be appended there is nothing that can come

before this operand the operator in the stack is anyway waiting for the operand to come now at

this stage can we place the addition operator in the post-fix string well actually what's after b

also matters in this case we have this multiplication operator after b which has higher precedence

and so the actual operand for addition is this whole expression b asterisk c we cannot perform

the addition until multiplication is finished so while parsing when i'm at b and i have not seen

what's ahead of b i cannot decide the fate of the operator in the stack so let's just move on now

we have this multiplication operator i want to make this expression further complex

to explain things better so i'm adding something at tail here in this expression

now i want to convert this expression to post-fix form i'm not having any parenthesis here we will

see how we can deal with parenthesis later let's look at an expression where parenthesis does not

override operator precedence okay so right now in this expression while parsing from left to right

we are at this multiplication operator the multiplication operator itself cannot go into

the post-fix expression because we have not seen its right operand yet and until its right

operand is placed in the post-fix expression we cannot place it the operator that we would be

looking at while parsing that operator itself cannot be placed right away but looking at that

operator we can decide whether something from the collection something from the stack

can be placed into the post-fix expression that we are constructing or not

any operator in the stack having higher precedence than the operator that we are looking at

can be popped and placed into the post-fix expression let's just follow this as rule for

now and i'll explain it later there is only one operator in the stack and it is not having

higher precedence than multiplication so we will not pop it and place it in the post-fix expression

multiplication itself will be pushed if an element in the stack has something on top of it that

something will always be of higher precedence so let's move on in this expression now now we are

at c which is an operand so it can simply go next we have an operator subtraction subtraction

itself cannot go but as we had said if there is anything on the stack having higher precedence

than the operator that we are looking at it should be popped out and should go

and the question is why we are putting these operators in the stack we are not placing them

in the post-fix expression because we are not sure whether we are done with their right

operand or not but after that operator as soon as i'm getting an operator of lower precedence

that marks the boundary of the right operand for this multiplication operator

c is my right operand it's this simple variable for addition b*c is my right operand because

subtraction has lower precedence anything on or after that cannot be part of my right operand

subtraction i should say has lower priority because of the associativity rule if you remember

the order of operation addition and subtraction have same precedence but the one that would occur

in left would be given preference so the idea is any time for an operator on the stack if i'm getting

an operator of lower priority we can pop the stacked operator and place it in the expression

here we will first pop multiplication and place it and then we can pop addition

and now we will push subtraction onto the stack let's move on now d is an operand

so it will simply go next we have multiplication there is nothing in the stack having higher

precedence than multiplication so we will pop nothing multiplication will go onto the stack

next we have an operand it will simply go now there are two ways in which we can find

the end of right operand for an operator (a) if we get an operator of lesser precedence

(b) if we reach the end of the expression now that we have reached end of expression we can simply

pop and place these operators so first multiplication will go and then subtraction will go

let's quickly write pseudo code for whatever i have said so far and then you can sit with some

examples and analyze the logic i'm going to write a function named InfixToPostfix that will take

a string exp for expression as argument for the sake of simplicity let's assume that

each operand or operator will be of one character only in an actual implementation you can assume

them to be tokens of multiple characters so in my pseudo code here the first thing that i'll do is

i'll create a stack of characters named s now i'll run a loop starting 0 till length of expression

minus 1 so i'm looking at each character that can either be an operand or operator

if the character is an operand we can append it to the post fix string well actually i should have

declared and initialized a string before this loop this is the result string in which i'll be

appending else if exp[i] is an operator we need to look for operators in the stack

having higher precedence so i'll say while stack is not empty and the top of stack has

higher precedence and let's say this function HasHigherPrecedence will take two arguments

two operators so if the top of stack has higher precedence than the operator that we are looking

at we can append the top of stack to the result which is the variable that will store the post fix

string and then we can pop that operator i'm assuming that this s is an object of some class that has these functions

top and pop and empty to check whether it's empty or not finally once i'm done with the popping

outside this while loop i need to push the current operator

s is an object of some class that will have these functions top pop and empty okay so this

is the end of my for loop at the end of it i may have some operators left in the stack i'll pop

these operators and append them to the post fix string i'll use this while loop i'll say that

while the stack is not empty append the operator at top and pop it and finally after this while loop

i can return the result string that will contain my post fix expression so this is my pseudo code for

whatever logic i've explained so far in my logic i've not taken care of parenthesis
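The pseudocode described above, still without any handling of parentheses, might look like this in C++. Treat it as a sketch under a few assumptions: every operand and operator is a single character, only `+ - * /` are supported, and `IsOperator` and `Precedence` are hypothetical helpers that are not named in the lesson. `HasHigherPrecedence` here also returns true for equal precedence, which is one way of applying the left-to-right associativity rule mentioned earlier.

```cpp
#include <cassert>
#include <cctype>
#include <stack>
#include <string>

bool IsOperator(char c) {
    return c == '+' || c == '-' || c == '*' || c == '/';
}

// Larger number means the operator binds tighter.
int Precedence(char op) {
    assert(IsOperator(op));
    return (op == '*' || op == '/') ? 2 : 1;
}

// True when `top` (already on the stack) should be placed before `current`.
// Equal precedence counts as "higher" because these operators are
// left-associative: the one that occurs on the left goes first.
bool HasHigherPrecedence(char top, char current) {
    return Precedence(top) >= Precedence(current);
}

std::string InfixToPostfix(const std::string& exp) {
    std::stack<char> s;   // operators waiting for their right operand
    std::string result;   // the postfix string being built

    for (char c : exp) {
        if (std::isalnum(static_cast<unsigned char>(c))) {
            result += c;  // operand: append right away
        } else if (IsOperator(c)) {
            // Pop everything whose right operand has now ended.
            while (!s.empty() && HasHigherPrecedence(s.top(), c)) {
                result += s.top();
                s.pop();
            }
            s.push(c);
        }
        // Any other character (for example a space) is ignored in this sketch.
    }

    // End of expression: pop and append the leftovers.
    while (!s.empty()) {
        result += s.top();
        s.pop();
    }
    return result;
}
```

For an input shaped like the one traced earlier, `InfixToPostfix("A+B*C-D*E")` produces `ABC*+DE*-`: multiplication and addition are popped when the subtraction arrives, and the remaining multiplication and subtraction are popped at the end of the expression.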

what if my infix expression would have parenthesis like this there will be slight change from what

we were doing previously with parenthesis any part of the expression within parenthesis

should be treated as independent complete expression in itself and no element outside

the parenthesis will influence its execution in this expression this part a plus b is one within

one parenthesis its execution will not be influenced by this multiplication or this subtraction

which is outside it similarly this whole thing is within the outer parenthesis so this multiplication

operator outside will not have any influence on execution of this part as a whole if parenthesis

are nested inner parenthesis is sorted out or resolved first and then only outer parenthesis

can be resolved with parenthesis we will have some extra rules we will still go from left to right

and we will still use stack and let's say i'm going to write the post fix part right here

as i create it now while parsing a token can be an operand an operator or an opening or closing of

parenthesis we will have some extra rules i'll first tell them and then i'll explain if it's an

opening parenthesis we can push it onto the stack the first token here in this example is an opening

parenthesis so it will be pushed onto the stack and then we will move on we have an opening parenthesis

once again so once again we will push it now we have an operand there is no change in rule

for operand it will simply be appended to the post fix part next we have an operator

remember what we were doing for operator earlier we were looking at top of stack and

popping as long as we were getting operator of higher precedence earlier when we were not using

parenthesis we could go on popping and empty the stack but now we need to look at top of stack and

pop only till we get an opening parenthesis because if we are getting an opening parenthesis

then it's the boundary of the last open parenthesis and this operator does not have any influence after

that outside that so this plus operator does not have any influence outside this opening parenthesis

i'll explain the scenario with some more examples later let's first understand the rule so the rule

is if i'm seeing an operator i need to look at the top of stack if it's an operator of higher

precedence i can pop and then i should look at the next top if it's once again an operator

of higher precedence i should pop again but i should stop when i see an opening parenthesis

at this stage we have an opening parenthesis at top so we do not need to look below it

nothing will be popped anyway addition however will go onto the stack remember after the whole

popping game we push the operator itself next we have an operand it will go and we will move on

next we have a closing of parenthesis when i'm getting a closing of parenthesis i'm getting a

logical end of the last opened parenthesis for part of the expression within that parenthesis

it's coming to the end and remember what we were doing earlier when we were reaching the end of

infix expression we were popping all the operators out and placing them so this time also we need to

pop all the operators out but only those operators that are part of this parenthesis that we are

closing so we need to pop all the operators until we get an opening parenthesis i'm popping this plus

and appending it next we have an opening of parenthesis so i'll stop but as last step i will

pop this opening also because we are done for this parenthesis okay so the rule for closing

of parenthesis pop until you're getting an opening parenthesis and then finally pop that

particular opening parenthesis also let's move on now next we have an operator we need to look at

top of stack it's an opening of parenthesis this operator will simply be pushed next we have an

operand next we have an operator once again we will look at the top we have multiplication

which is higher precedence so this should be popped and appended we will look at the top again it's

an opening of parenthesis so we should stop looking now minus will be pushed now next we have an operand

next we have closing of parenthesis so we need to pop until we get an opening minus will be appended

finally the opening will also be popped next we have an operator and this will simply go

next we have an operand and now we have reached the end of expression so everything in the stack

will be popped and appended so this finally is my post fix expression i'll take one more example

and convert it to make things further clear i want to convert this expression i'll start at

the beginning first we have an operand then this multiplication operator which will simply go onto

the stack the stack right now is empty there is nothing on the top to compare it with next we

have an opening parenthesis which will simply go next we have an operand it will be appended

and now we move on to this addition operator if this opening parenthesis was not there

the top of stack would have been the multiplication operator which has higher precedence so it would

have been popped but now we will look at the top and it's an opening parenthesis so we cannot look

below and we will simply have to move on next we have c i missed pushing the addition operator

last time okay after c we have this closure so we need to pop until we get an opening

and then we need to pop one opening also finally we have reached the end of expression so everything

in the stack will be popped and appended so this finally is my post fix part post fix form

in my pseudo code that i had written earlier only the part within this for loop will change

to take care of parenthesis in case we have an operator we need to look at

top of the stack and pop but only till we are getting an opening parenthesis so i have put this

extra condition in the while loop this condition will make sure that we stop

once we get an opening parenthesis right now in the for loop we are dealing with

operands and operators we will have two more conditions if it's an opening of parenthesis

we should push else if it's a closing parenthesis we can go on popping and appending let's say this

function IsOpeningParenthesis will check whether a character is opening of parenthesis or not

in fact we should use this function here also when i'm checking whether current token is

opening or not because it could be an opening curly brace or opening bracket also

this function will then take care of that and similarly for

this last else if we should use this function IsClosingParenthesis okay things are consistent

now after this while loop in the last else if we should do one extra pop and this extra pop will

pop the opening parenthesis and now we are done with this else if and this is the closure of my for loop

rest of this stuff will remain same after the for loop we can pop the leftovers and append to the

string and finally we can return so this is my final pseudo code you can check the description

of this video for a link to real implementation actual source code okay so i'll stop here
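As a rough illustration of the final pseudocode with parentheses, here is one way it could be written in C++. This is a sketch and not the linked source code: it assumes single-character operands, the operators `+ - * /`, and balanced input. `IsOpeningParenthesis`, `IsClosingParenthesis` and `HasHigherPrecedence` follow the names used in the lesson, while `IsOperator` and `Precedence` are hypothetical helpers.

```cpp
#include <cassert>
#include <cctype>
#include <stack>
#include <string>

bool IsOperator(char c) {
    return c == '+' || c == '-' || c == '*' || c == '/';
}

// Curly braces and square brackets are accepted too, as suggested in the lesson.
bool IsOpeningParenthesis(char c) { return c == '(' || c == '{' || c == '['; }
bool IsClosingParenthesis(char c) { return c == ')' || c == '}' || c == ']'; }

int Precedence(char op) {
    assert(IsOperator(op));
    return (op == '*' || op == '/') ? 2 : 1;
}

// Equal precedence counts as "higher" because the operators are left-associative.
bool HasHigherPrecedence(char top, char current) {
    return Precedence(top) >= Precedence(current);
}

std::string InfixToPostfix(const std::string& exp) {
    std::stack<char> s;
    std::string result;

    for (char c : exp) {
        if (std::isalnum(static_cast<unsigned char>(c))) {
            result += c;
        } else if (IsOperator(c)) {
            // The extra condition: stop as soon as an opening parenthesis is on top.
            while (!s.empty() && !IsOpeningParenthesis(s.top()) &&
                   HasHigherPrecedence(s.top(), c)) {
                result += s.top();
                s.pop();
            }
            s.push(c);
        } else if (IsOpeningParenthesis(c)) {
            s.push(c);
        } else if (IsClosingParenthesis(c)) {
            // Pop the operators that belong to the parenthesis being closed.
            while (!s.empty() && !IsOpeningParenthesis(s.top())) {
                result += s.top();
                s.pop();
            }
            assert(!s.empty() && "unbalanced parentheses");
            s.pop();  // the one extra pop removes the opening parenthesis
        }
    }

    while (!s.empty()) {
        result += s.top();
        s.pop();
    }
    return result;
}
```

With these assumptions, `InfixToPostfix("A*(B+C)")` gives `ABC+*`: the addition is not compared with the multiplication below it because the opening parenthesis sits between them on the stack.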

now this is it for this lesson thanks for watching hello everyone we have been talking

about data structures for some time now as we know data structures are ways to store and organize

data in computers so far in this series we have discussed some of the data structures like

arrays linked lists and in last couple of lessons we have talked about stack in this lesson we are

going to introduce you to queues we are going to talk about queue ADT just the way we did it for stacks

first we are going to talk about queue as abstract data type or ADT as we know when we talk about

a data structure as abstract data type we define only the features or operations available with

the data structure and do not go into implementation details we will see possible implementations in

later lessons in this lesson we are only going to discuss logical view of queue data structure okay so

let's get started queue data structure is exactly what we mean when we say queue in real world a queue is a

structure in which whatever goes in first comes out first in short we call queue a FIFO structure

earlier we had seen stack which is called a last in first

out structure or in short LIFO a stack is a collection in which both insertion and removal

happen from the same end that we call the top of stack in queue however an insertion must happen from

one end that we call rear or tail of the queue and any removal must happen from the other end

that we can call front or head of the queue if i have to define queue formally as an abstract data type

then a queue is a list or collection with the restriction or constraint that insertion can be and must be

performed at one end that we call the rear of queue or the tail of queue and deletion can be performed

at other end that we can call the front of queue or head of queue let's now define the interface or

operations available with queue just like stack we have two fundamental operations here

an insertion is called enqueue operation some people also like to name this operation push

enqueue operation should insert an element at tail or rear end of queue deletion is called

dequeue operation in some implementations people call this operation pop also

push and pop are more famous in context of stack enqueue and dequeue are more famous in context of queues

while implementing you can choose any of these names in your interface

dequeue should remove an element from front or head of the queue and dequeue typically also returns this element

that it removes from the head the signatures of enqueue and dequeue for a queue of integers can be something like

this enqueue is returning void here while dequeue is returning an integer this integer should be

the removed element from the queue you can design dequeue also to return void typically a third operation

front or peek is kept just to look at the element at the head just like the top operation that we

had kept in stack this operation should just return the element at front and should not delete

anything okay we can have few more operations we can have one operation to check whether queue is

empty or not if queue has a limited size then we can have one operation to check whether queue is full or

not why i'm calling out these alternate names for operations is also because most of the time

we do not write our own implementation of a data structure we use inbuilt implementations

available with language libraries interface can be different in different language libraries

for example if you would use the inbuilt queue in C++ the function to insert is push

while in C# it's Enqueue so we should not get confused i'll just keep more famous names here
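To make the naming point concrete, here is a small sketch with the inbuilt `std::queue` from the C++ standard library. The function name `EnqueueAllThenDequeueAll` is hypothetical and exists only for this illustration. Notice that `pop` in `std::queue` returns void, so the element is read with `front` before it is removed.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Pushes every value, then pops until the queue is empty,
// recording the order in which elements leave the queue.
std::vector<int> EnqueueAllThenDequeueAll(const std::vector<int>& values) {
    std::queue<int> q;
    for (int v : values) {
        q.push(v);                     // push plays the role of enqueue (insert at rear)
    }

    std::vector<int> removed;
    while (!q.empty()) {               // empty plays the role of IsEmpty
        removed.push_back(q.front());  // front only looks at the head
        q.pop();                       // pop plays the role of dequeue and returns void
    }

    assert(removed.size() == values.size());
    return removed;
}
```

Whatever the operations are called, the order of removal is the order of insertion, which is the first in first out behavior of a queue.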

okay so these are the operations that i have defined with queue ADT enqueue dequeue front and IsEmpty

we can insert or remove one element at a time from the queue using enqueue and dequeue front is only to look at

the element at head IsEmpty is only to verify whether queue is empty or not all these operations

that i have written here must take constant time or in other words their time complexity should be

big O of one logically a queue can be shown as a figure or container open from two sides

so an element can be inserted or enqueued from one side and an element can be removed or

dequeued from other side if you remember stack we show a stack as a container open from one side

so an insertion or what we call push in context of stack and removal or pop both must happen from

the same side in queue insertion and removal should happen from different sides

let's say i want to create a queue of integers let's say initially we have an empty queue

i will first write down one of the operations and then show you the simulation in logical view

let's say i first want to enqueue number two this figure that i'm showing here

right now is an empty queue of integers and i'm saying that i'm performing an enqueue operation

here in a program i would be calling an enqueue function passing it number two as argument

after this enqueue we have one element in the queue we have one integer in the queue

because we have only one element in the queue right now front and rear of the queue

are the same let's enqueue one more integer now i want to insert number five

five will be inserted at rear or tail of the queue let's enqueue one more

and now i want to call dequeue operation so we will pick two from head of the queue and it will go

out if dequeue is supposed to return this removed integer then we will get integer two as return

enqueue and dequeue are the fundamental operations available with queue in our design we can have some

more for our convenience like we have front and is empty here a call to front at this stage

will get us number five integer five as return no integer will be removed from the queue

calling IsEmpty at this stage can return us a boolean false or zero for false and one for true
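The simulation above can be replayed with a small class that exposes the four operations of the queue ADT by name. This is only a logical sketch: the class name `IntQueue` is hypothetical, and it simply borrows `std::deque` for storage, which is not one of the implementations discussed in the next lesson.

```cpp
#include <cassert>
#include <deque>

// Hypothetical queue of integers with the four ADT operations.
class IntQueue {
public:
    // Inserts x at the rear (tail) of the queue.
    void Enqueue(int x) { items_.push_back(x); }

    // Removes the element at the front (head) and returns it.
    int Dequeue() {
        assert(!items_.empty());
        int x = items_.front();
        items_.pop_front();
        return x;
    }

    // Returns the element at the front without removing anything.
    int Front() const {
        assert(!items_.empty());
        return items_.front();
    }

    bool IsEmpty() const { return items_.empty(); }

private:
    std::deque<int> items_;  // borrowed storage, an assumption of this sketch
};
```

If we enqueue 2, then 5, then one more integer (the lesson does not say which, so any value works), a call to `Dequeue` returns 2, a following call to `Front` returns 5 without removing it, and `IsEmpty` returns false.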

so this pretty much is how queue works now one obvious question can be what are the real scenarios

where we can use queue what are the use cases of queue data structure queue is most often used in

a scenario where there is a shared resource that's supposed to serve some requests but the resource

can handle only one request at a time it can serve only one request at a time in such a scenario

it makes most sense to queue up the requests the request that comes first gets served first

let's say we have a printer shared in a network any machine in the network can send a print request

to this printer printer can serve only one request at a time it can print only one document at a time

so if a request comes when it's busy it can't be like i'm busy request later that will be really

rude of the printer what really happens is that the program that manages the printer puts the

print request in a queue as long as there is something in the queue printer keeps picking up

a request from the front of the queue and serves it processor on your computer is also a shared

resource a lot of running programs or processes need time of the processor but the processor

can attend to only one process at a time processor is the guy who has to execute all the instructions

who has to perform all the arithmetic and logical operations so the processes are put in a queue

queues in general can be used to simulate wait in a number of scenarios we will discuss some

of these applications of queue in detail while solving some problems in later lessons this is

good for an introduction in next lesson we will see how we can implement queue this is it for this

lesson thanks for watching in our previous lesson we introduced you to queue data structure we talked

about queue as abstract data type or ADT as we know when we talk about the data structure as abstract

data type we define it as a mathematical or logical model we define only the features or operations

available with the data structure and do not go into implementation details in this lesson we are

going to discuss possible implementations of queue i will do a quick recap of what we have

discussed so far a queue is a list or collection with this restriction with this constraint

that insertion can be performed at one end that we call rear of queue or tail of queue and deletion

can be performed at other end that we call the front of queue or the head of queue an insertion

in queue is called enqueue operation and a deletion is called dequeue operation i have defined

queue ADT with these four operations that i have written here in an actual implementation all these

operations will be functions front operation should simply return the element at front of queue it

should not remove any element from the queue is empty should simply check whether queue is empty

or not and all these operations must take constant time enqueue dequeue or looking at the element

at front the time taken for any of these operations must not depend upon a variable like number of

elements in queue or in other words time complexity of all these operations must be big o of one

okay so let's get started we are saying that a queue is a special kind of list in which elements

can be inserted or removed one at a time and insertion and removal happen at different ends

of the queue we can insert an element at one end and we can remove an element from the other end

just the way we did it for stack we can add these constraints or extra properties of queue

to some implementation of a list and create a queue there are two popular implementations of

queue we can have an array based implementation and we can have linked list based implementation

let's first discuss array based implementation let's say we want to create a queue of integers

what we can do is we can first create an array of integers i have created an array of 10 integers

here i have named this array a now what i'm going to do is i'm going to use this array

to store my queue what i'm going to say is that at any point some part of the array starting an

index marked as front till an index marked as rear will be my queue in this array i'm showing front

of the queue towards left and rear towards right in earlier examples i was showing front towards

right and rear towards left doesn't really matter any side can be front and any side can be rear

it's just that an element must always be added from rear side and must always be removed from front

so if at any stage a segment of the array from an index marked as front till an index marked as

rear is my queue and rest of the positions in the array are free space that can be used to

expand the queue to insert an element that is to enqueue we can increment rear so we will add a new cell

in the queue towards rear end and in this cell we can write the new value element to be inserted

can come to this position i'll fill in some values here at these positions so we have these

integers in the queue and let's say we want to insert number five to insert we will increment

rear of course there should be an available cell in the right an available empty cell in the right

and now we can write value five here after insertion new rear index is seven and the value at index

seven is five now dequeue means we must remove an element from front of the queue in this example

here a dequeue operation should remove number two from the queue to dequeue we can simply increment

front because at any point only the cells starting front till rear are part of my queue

by incrementing front i have discarded index two from the queue and we do not care what value lies

in a cell that is not part of the queue when we will include a cell in the queue we will overwrite

the value in that cell anyway so just incrementing front is good enough for dequeue operation

let's quickly write pseudocode for whatever we have discussed so far

in my code i will have two variables named front and rear and initially i'll set them both as minus

one let's say for an empty queue both front and rear will be minus one to check whether

queue is empty or not we can simply check the value of front and rear and if they're both minus one

we can say that queue is empty i just wrote IsEmpty function here minus one is not a valid index

for an empty queue there will be no front and rear in our implementation we are saying that we will

represent empty state of queue by setting both front and rear as minus one

now let's write the enqueue function enqueue will take an integer x as argument there will be a couple

of conditions in enqueue if rear is already equal to maximum index available in array A we cannot insert

or enqueue an element in such scenario we can return and exit i would rather use a function named

IsFull to determine whether queue is full or not if queue is already full we cannot do much we should

simply exit else if queue is empty we can add a cell at index zero in the queue by setting both front and rear as zero

and now we can set the value at index rear as x in all other cases we can first

increment rear and then we can fill in value x at index rear i can get this statement A[rear]

equal x outside these two conditional statements because it's common to them so this is my enqueue

function in the example array that i'm showing here let's enqueue some integers i'll make calls to

enqueue function and show you the simulation in the figure here let's say first i want to insert

number two in the queue i'm making a call to enqueue function passing number two as argument

the queue is empty so we will set both front and rear as zero now we will come to this statement

we will write value two at index zero so this is my queue after one enqueue operation front and

rear of the queue are the same let's make another call to enqueue this time i want to insert number five

this time queue is not empty so rear will be incremented we have added a cell to the queue by incrementing

rear and now we will write the value five at the new rear index let's enqueue one more number i have

enqueued seven let's now write dequeue operation there will be couple of cases in dequeue if the queue is already empty

we cannot remove an element in this case we can simply print or throw an error and return

or exit there will be one more special case if the queue has only one element in this case front and

rear will not be minus one but they will both be equal because we are already checking for minus

one case in IsEmpty function in the previous if in this else if we can simply check whether

front is equal to rear or not if this is the case our dequeue will make the queue empty and to mark the queue

as empty we need to set both front and rear as minus one this is what we had said that we will

represent an empty queue by marking both front and rear as minus one in default or normal scenario

we will simply increment front we should really be careful about corner cases in any implementation

that's where most of the bugs come okay so this finally is my dequeue function
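Putting the IsEmpty, enqueue and dequeue pseudocode together, a plain array based queue could be sketched in C++ like this. A few things here are assumptions of the sketch and not taken from the lesson: the class name `ArrayQueue`, the constant `kMaxSize`, and the choice to return a bool for failure where the lesson prints or throws an error.

```cpp
#include <cassert>

constexpr int kMaxSize = 10;  // same size as the example array A

class ArrayQueue {
public:
    // An empty queue is represented by front and rear both being -1.
    bool IsEmpty() const { return front_ == -1 && rear_ == -1; }

    // With this simple logic the queue is full once rear is the last index.
    bool IsFull() const { return rear_ == kMaxSize - 1; }

    // Returns false (instead of printing an error) when the queue is full.
    bool Enqueue(int x) {
        if (IsFull()) {
            return false;
        } else if (IsEmpty()) {
            front_ = 0;          // first element: both markers move to index 0
            rear_ = 0;
        } else {
            rear_ = rear_ + 1;   // add one cell at the rear
        }
        a_[rear_] = x;           // common to both non-full cases
        return true;
    }

    // Returns false when there is nothing to remove.
    bool Dequeue() {
        if (IsEmpty()) {
            return false;
        } else if (front_ == rear_) {
            front_ = -1;         // removing the only element empties the queue
            rear_ = -1;
        } else {
            front_ = front_ + 1; // the old front cell is simply left behind
        }
        return true;
    }

    int Front() const {
        assert(!IsEmpty());
        return a_[front_];
    }

private:
    int a_[kMaxSize];
    int front_ = -1;
    int rear_ = -1;
};
```

Every operation here only reads or updates `front_`, `rear_` and one array cell, so each of them takes constant time. The values left in cells outside the front to rear segment are never read, which is why dequeue does not need to clear anything.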

in this example here at this stage let's say we want to perform a dequeue queue is not empty

and we do not have only one element in the queue so we will simply increment front before

incrementing we could set the value in this cell at index zero as something but the value in a cell

that is not part of queue anymore doesn't really matter at this stage it doesn't really matter

what we have at index zero or index three or any other index apart from the segment between

front and rear when we will add a cell in the queue we will overwrite the value in that cell

anyway let's now perform some more enqueues and dequeues i'm enqueuing three and then i'm enqueuing

one with each enqueue we are incrementing rear i just performed some more enqueues here

now let's perform a dequeue if i perform one more enqueue here rear will be equal to maximum index

available in the array let's enqueue one more now at this stage we cannot enqueue an element

anymore because we cannot increment rear enqueue operation will fail now there are two unused cells

right now but with whatever logic we have written we cannot use these two cells that are in the

left of front in fact this is a real problem as we will dequeue more and more all the cells left

of front index will never be used again they will simply be wasted can we do something to use these

cells well we can use the concept of a circular array circular array is an idea that we use in a

lot of scenarios the idea is very simple as we traverse an array we can imagine that there is no

end in the array from zero we can go to one from one we can go to two and finally when we will reach

the last index in the array like in this example when we are at index nine the next index for me

is index zero we can imagine this array something like this remember this is only a logical way of

looking at the array in circular interpretation of array if i'm pointing to a position and my

current position is i then the next position or next index will not simply be i plus one it will

be i plus one modulo the number of elements in array or the size of array let's say n is the number

of elements in array then the next position will be i plus one modulo n the modulo operation will

get us the remainder upon dividing by n for any i other than n minus one this modulo operation will

not have any effect but for i equal n minus one next position will be n modulo n which will be

equal to zero when you divide the number by itself the remainder is zero previous position in circular

interpretation of array will be i plus n minus one modulo n we could simply say i minus one modulo

n but to make sure this expression inside the parenthesis is never negative i'm adding n here

give this some thought you should be able to get why it should be i plus n minus one modulo n
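The two index formulas can be written as small C helpers. This is a sketch under assumed names (`NextIndex`, `PrevIndex`); the reason for adding n shows up clearly in C, where the `%` operator truncates toward zero, so `(0 - 1) % 10` evaluates to -1 rather than 9.

```c
#include <assert.h>

/* Next position in a circular array of n cells: wraps from n-1 back to 0. */
int NextIndex(int i, int n) {
    assert(n > 0 && i >= 0 && i < n);
    return (i + 1) % n;
}

/* Previous position: adding n keeps the left operand of % non-negative,
   so index 0 maps to n-1 instead of a negative value. */
int PrevIndex(int i, int n) {
    assert(n > 0 && i >= 0 && i < n);
    return (i + n - 1) % n;
}
```

For an array of size 10, `NextIndex(9, 10)` gives 0 and `PrevIndex(0, 10)` gives 9, while every other index simply moves by one.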

now with this interpretation of array we can increment rear in an enqueue operation as long as there

is any unused cell in the array i'm going to modify functions in my pseudo code now is empty

will remain the same we are still saying that for an empty queue front and rear will be minus one

let's scroll down and come to enqueue now in circular interpretation i will call my queue full when the

position next to rear in circular interpretation that we will calculate as rear plus one modulo

n will be equal to front so we will have a situation like this right now the next position to rear

in circular interpretation is front so there is no unused cell the complete array is exhausted

nothing will change in this condition if queue is empty we can simply set front and rear as zero

in the last else condition we will increment rear like this we will say rear is equal to

rear plus one modulo n where n is number of elements in the array with this much change my

enqueue function is good now let's make a call to enqueue and insert something in this array here i want

to insert number 15 we will come to this last else condition rear right now is nine so this

expression will be nine plus one modulo n n is 10 here the size of this array a is 10 here

this will evaluate to zero now my new rear is zero i will write number 15 here let's now see

what we need to do in dequeue function nothing will change in the first two conditions if queue is already

empty or if there is only one element in the queue we will handle these cases in same manner in the

final else when we are incrementing front we need to increment it in a circular manner so we will

say front equal front plus one modulo n where n is number of elements in the array total number

of elements in the array or size of array now let's perform a dequeue we will come to this condition

front right now is two so this will be two plus one modulo 10 one more cell is available to us now

this much is the core of our implementation front operation will be really straightforward

we simply need to return the element at front index here also we first need to check whether

queue is empty or not we should return the element at front only when front is not equal to minus one all these

operations all these functions that i have written here will take constant time their time complexity

will be big o of one we are performing simple arithmetic and assignments in the functions

and not doing anything costly like running a loop so time taken will not depend upon

size of queue or some other variable i leave this here it should not be very difficult converting this

pseudo code to a running program in a language of your choice if you want to see my code you can

check the description of this video for a link thanks for watching
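Putting the pseudo code of this lesson together, a circular-array queue might look like the sketch below. It is one possible conversion, not the code linked from the video: the names `N` and `A`, the `bool` return values, and the use of `assert` in `Front` are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define N 10             /* size of the array, as in the walkthrough */

int A[N];
int front = -1;          /* front == rear == -1 marks an empty queue */
int rear = -1;

bool IsEmpty(void) {
    return front == -1 && rear == -1;
}

/* Full when the position after rear, taken circularly, is front. */
bool IsFull(void) {
    return (rear + 1) % N == front;
}

bool Enqueue(int x) {
    if (IsFull()) return false;
    if (IsEmpty()) {
        front = rear = 0;
    } else {
        rear = (rear + 1) % N;       /* wraps from N-1 to 0 */
    }
    A[rear] = x;
    return true;
}

bool Dequeue(void) {
    if (IsEmpty()) return false;
    if (front == rear) {
        front = rear = -1;           /* removing the only element */
    } else {
        front = (front + 1) % N;     /* circular increment */
    }
    return true;
}

/* Caller is expected to check IsEmpty() first. */
int Front(void) {
    assert(!IsEmpty());
    return A[front];
}
```

Every function here is a fixed number of comparisons, assignments and one modulo, which is why each runs in constant time regardless of how many elements are stored.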

in our previous lesson we saw how we can implement queue using arrays now in this lesson we will see how

we can implement queue using linked list queue as we know from our previous discussions is a structure

in which whatever goes in first comes out first queue data structure is a list or collection with this

restriction that insertion can be performed at one end and deletion can be performed at other end

these are typical operations that we defined with queue an insertion is called enqueue operation

and a deletion is called dequeue front operation or front function should simply return the element

at front of list and is empty should check whether queue is empty or not and all these operations must

take constant time their time complexity should be big o of one when we were implementing queue with

arrays we used the idea of a circular array to implement queue then in this case we have a limitation

the limitation is that array will always have a fixed size and once all the positions in the

array are taken once the array is exhausted we have two options we can either deny insertion

so we can say that the queue is full and we cannot insert anything now or what we can do is we can

create a new larger array and copy the elements from previous array to the new larger array

which will be a costly process we can avoid this problem if we will use linked list to implement

queue please note that this representation of circular array that i'm showing here

is only a logical way of looking at an array we can show this array like this also as i was

saying in an array implementation we will have this question what if array gets filled and we

need to take care of this we can either say queue is full or we can create a new larger array and

copy elements from previous filled array into this new larger array the time taken for this copy

operation will be proportional to number of elements in filled array or in other words we

can say that the time complexity of this copy operation will be big o of n there is another

problem with array implementation we can have a large enough array and queue may not be using most of

it like right now in this array 90 percent of the memory is unused memory is an important resource

and we should always avoid blocking memory unnecessarily it's not that some amount of unused memory

will be a real problem in a modern day machine it's just that while designing solutions and

algorithms we should analyze and understand these implications let's now see how good we

will be with a linked list implementation i have drawn a logical view of a linked list of integers

here coming back to basic definition of queue as we know a queue is a list or collection with this

constraint with this property that an element must always be inserted from one side of the

queue that we call the rear of queue and an element must always be removed from the other side that we

call the front of queue it's really easy to enforce this property in a linked list a linked list as we

know is a collection of entities that we call nodes and these nodes are stored at non-contiguous

locations in memory each node contains two fields one to store data and another to store

address of the next node or reference to the next node let's assume that nodes in this figure

are at addresses hundred two hundred and three hundred respectively i have also filled in the

address fields the identity of linked list that we always keep with us is address of the head node

we often name the pointer or reference variable that would store this address head okay so now

we are saying that we want to use linked list to implement queue these are the typical operations that

we define with a queue we can use a linked list like a queue we can pick one side for insertion or enqueue

operation so a node in the linked list must always be inserted from this side the other side will

then be used for dequeue so if we are picking head side for enqueue operation a dequeue must always happen from

tail if we are picking tail for enqueue operation then dequeue must always happen from head

whatever side we are picking for whatever operation we need to be taking care of one requirement and

the requirement is that these operations must take constant time or in other words their time

complexity must be big o of one as we know from our previous lessons the cost of insertion or removal

from head side is big o of one but the cost of insertion or removal from tail side is big o of

n so here's the deal in a normal implementation of linked list if we will insert at one side and

remove from other side then one of these operations enqueue or dequeue depending on how we are picking the sides

will cost us big o of n but the requirement that we have is that both these operations must take

constant time so we definitely need to do something to make sure that both enqueue and dequeue operations

take constant time let's call this side front and this side rear so i want to enqueue a node from this

side and i want to dequeue from this side we are good for dequeue operation because removal from front will

take constant time but insertion or enqueue operation will be big o of n let's first see why insertion

at tail will be costly and then maybe we can try to do something to insert at rear end what we

will have to do is first we will have to create a node we have a new node here let's say i've got

this node at address 350 and the integer that i want to enqueue is 7 the address part of this node can

be set as null now what we need to do is we need to build this link we need to set the address part

of the last node as address of this newly created node and to do so we first need to have a pointer

pointing to this last node storing the address of this last node in a linked list the only identity

that we always keep with us is address of the head node to get a pointer to any other node

we need to start at head so we will first create a pointer temp and we will initially set it to

head and now in one step we can move this pointer variable to the next of whatever node it is pointing

to we use a statement like temp equal temp dot next to move to the next node

so from first node we will go to the second node and then from second we will go to the third node
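The traversal just described can be sketched in C as below. With only a head pointer, reaching the last node costs one `temp = temp->next` step per node, so the insertion is O(n). The function name `InsertAtTail` and the `hops` counter are assumptions added only to make the cost visible.

```c
#include <assert.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Walks from head to the last node and links a new node after it.
   Returns the head of the list (unchanged if allocation fails).
   *hops receives the number of temp = temp->next steps taken. */
struct Node *InsertAtTail(struct Node *head, int x, int *hops) {
    assert(hops != NULL);
    *hops = 0;

    struct Node *node = malloc(sizeof *node);
    if (node == NULL) return head;
    node->data = x;
    node->next = NULL;               /* new node will be the last one */

    if (head == NULL) return node;   /* empty list: new node is the head */

    struct Node *temp = head;        /* start at head */
    while (temp->next != NULL) {     /* move until temp is the last node */
        temp = temp->next;
        (*hops)++;
    }
    temp->next = node;               /* build the link */
    return head;
}
```

For a list that already holds three nodes, the loop takes two steps before the link can be written, and the step count keeps growing with the length of the list.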

in this example third node is the rear node and now using this pointer temp we can write the address

part of this node and build this link this whole traversal that we are having to get a pointer from

head to tail is what's taking all the time what we can do is we can avoid this whole traversal

we can have a pointer variable just like head that should always store the address of

rear node i can call this variable tail or rear let's call this rear and let's call this variable

that is storing the address of head node front in any insertion or removal we will have to update

both front and rear now but now when we will enqueue let's say i have got a node at address 450

and i want to insert this node at rear end now using the rear pointer we can update the address

field here so we are building this link and now we can update rear we will only have to modify

some address fields and time taken for enqueue operation will not depend upon number of nodes

in the linked list so now with this design both enqueue and dequeue operations will be constant

time operations the time complexity for both will be big o of one let's quickly see how real code

in c will look like for this design i have declared node as a structure with two fields

one to store data and another to store address of next node and now instead of declaring a pointer

variable named head a pointer to node named head i'm declaring two pointers a pointer to node

named front and another pointer to node named rear and initially i'm setting them both as null

let's say i'm defining these two variables in global scope so they will be accessible to all

functions my enqueue function will take an integer as argument in this function i'll first create a

node i'll use malloc in c or new operator in c plus plus to create a node in what we call dynamic

memory i'm pointing to the newly created node using this variable which is a pointer to node

named temp now we can have two cases in insertion or enqueue operation if there is no element in

the queue if the queue is empty in this case both front and rear will be null we will simply set

both front and rear as address of this new node being pointed to by temp and we will return or

exit else because we already have a pointer to rear node we will first set the address part of

current rear as the address of this newly created node and then we will modify the address in rear

variable to make it point to this newly created node while writing all of this i'm assuming that

you already know how to implement a linked list if you want to refresh your concepts you can check

earlier lessons in this series or you can check the description of this video for a link to lesson

on linked list implementation in c or c plus plus this code will be further clear if i'll

show things moving in a simulation let's say initially we have an empty queue so both front and

rear will be null null is only a macro for address zero at this stage let's say we are making a call

to enqueue function passing it number two now let's go through the enqueue function and see what will happen

first we will create a node data part of this node will be set as two and address part initially

will be set as null let's say we got this node at address hundred so a variable

named temp is storing this address this variable is pointing to this node right now front and

rear are both null so we will go inside this if condition and simply set both front and rear

as hundred when the function will finish execution temp which is a local variable will be cleared

from memory after setting both front and rear as address of this newly created node we are returning

so this is how the queue will look like after first enqueue let's say we are making another call to

enqueue function at this stage passing number four as argument once again a new node will be created

let's say i got the new node at address 200 this time the queue is not empty so in this function

we will first go to this statement rear dot next equal temp so we will set the next part of this

node at address hundred as the address of the newly created node which is 200 so we will build this

link and now we will store the address of the new rear node in this variable named rear so this is

how my queue will look like after this second enqueue let's do one more enqueue let's enqueue number six

let's say we got a new node this time at address 300 so this is how our queue will look like okay
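The enqueue function walked through above might look like this in C. It is a sketch of the described design with global `front` and `rear` pointers. The handling of a failed `malloc` (returning silently) and the `assert` on the rear node are assumptions that the lesson does not cover.

```c
#include <assert.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Both pointers are NULL while the queue is empty. */
struct Node *front = NULL;
struct Node *rear = NULL;

void Enqueue(int x) {
    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL) return;        /* assumption: drop the value on failure */
    temp->data = x;
    temp->next = NULL;

    if (front == NULL && rear == NULL) {
        front = rear = temp;         /* first node is both front and rear */
        return;
    }
    assert(rear != NULL && rear->next == NULL);
    rear->next = temp;               /* link the current rear to the new node */
    rear = temp;                     /* the new node becomes the rear */
}
```

After `Enqueue(2)`, `Enqueue(4)` and `Enqueue(6)`, `front` still points to the node holding 2 and `rear` points to the node holding 6. No call walks the list, so the work per call does not grow with the number of nodes.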

let's now write the dequeue function in dequeue function i'll first create a temporary pointer

to node in which i'll store the address of the current head or current front let's say for this

example at this stage i'm making a call to dequeue function we will have couple of cases in dequeue

also the queue could be empty so in this case we can print an error message and return in case of

empty queue front and rear will both be equal to null we can check any one of these and we will be

good in the case when front and rear will be equal we will simply set both front and rear as null

in all other cases we can simply make front point to the next node so we will simply do a front equal

front arrow next but why have we used this temporary pointer to node why have i declared

this temporary pointer to node in this code well simply incrementing front will not be good enough

in this example when i'm calling dequeue i'm first creating temp let's walk through whatever

code i've written so far so in the first line i'm creating temp and then because queue is not empty

and there is more than one element in the queue i'm setting front as address of the next node

so my queue is good now all the links are appropriately modified but this node which was

front previously is still in the memory anything in dynamic memory has to be explicitly freed

to free this node we will use free function and to this free function we should be passing

address of the node and that's why we had created temp with this free the node will be wiped off

from memory so these are enqueue and dequeue operations for you and if you can see there are simple

statements in these functions there are no loops so these functions will take constant time
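A C sketch of the dequeue function described above follows. A compact `Enqueue` is repeated only so the block compiles on its own. Printing the error to `stderr` and the closing `assert` are assumptions; the lesson only says an error message can be printed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct Node *front = NULL;   /* node that will be removed next */
struct Node *rear = NULL;    /* most recently inserted node */

void Enqueue(int x) {
    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL) return;
    temp->data = x;
    temp->next = NULL;
    if (front == NULL) { front = rear = temp; return; }
    rear->next = temp;
    rear = temp;
}

void Dequeue(void) {
    struct Node *temp = front;       /* remember the node so it can be freed */
    if (front == NULL) {
        fprintf(stderr, "Queue is empty\n");
        return;
    }
    if (front == rear) {
        front = rear = NULL;         /* removing the only node */
    } else {
        front = front->next;         /* next node becomes the front */
    }
    free(temp);                      /* release the old front node */
    assert((front == NULL) == (rear == NULL));
}
```

Without `temp`, the old front node would become unreachable once `front` moves on, and its memory could never be released.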

the time complexity will be big o of one in the beginning of this lesson we had also discussed

some limitations with array implementation like what if array gets filled and that of unused memory

we do not have these limitations in a linked list implementation we are using some extra memory

to store address of next node but apart from that there is no other major disadvantage

i'll stop here now you can write rest of the functions like front function to look at the

element at front or is empty function to check whether queue is empty or not yourself if you want

to get my source code then you can check the description of this video for a link so thanks for

watching hello everyone in this lesson we'll introduce you to an interesting data structure

that has got its application in a wide number of scenarios in computer science

and this data structure is tree so far in this series we have talked about what we can call

linear data structures array linked list stack and queue all of these are linear data structures

all of these are basically collections of different kinds in which data is arranged

in a sequential manner in all these structures that i'm showing here we have a logical start

and a logical end and then an element in any of these collections can have a next element and

a previous element so all in all we have linear or sequential arrangement now as we understand

these data structures are ways to store and organize data in computers for different kinds

of data we use different kinds of data structure our choice of data structure depends upon a number

of factors first of all it's about what needs to be stored a certain data structure can be best fit

for a particular kind of data then we may care for the cost of operations quite often we want to

minimize the cost of most frequently performed operations for example let's say we have a simple

list and we are searching for an element in the list most of the time then we may want to store

the list or collection as an array in sorted order so we can perform something like binary

search really fast another factor can be memory consumption sometimes we may want to minimize

the memory usage and finally we may also choose a data structure for ease of implementation

although this may not be the best strategy tree is one data structure that's quite often used to

represent hierarchical data for example let's say we want to show employees in an organization

and their positions in organizational hierarchy then we can show it something like this

let's say this is organizational hierarchy of some company in this company john is CEO

and john has two direct reports Steve and Rama then Steve has three direct reports Steve is

manager of Lee Bob and Ella they may be having some designation Rama also has two direct reports

then Bob has two direct reports and then Tom has one direct report this particular logical

structure that I've drawn here is a tree well you have to look at the structure upside down

and then it will resemble a real tree the root here is at top and we are branching out in

downward direction logical representation of tree data structure is always like this

root at top and branching out in downward direction okay so tree is an efficient way of storing

and organizing data that is naturally hierarchical but this is not the only application of tree

in computer science we will talk about other applications and some of the implementation

details like how we can create such a logical structure in computer's memory later first I

want to define tree as a logical model tree data structure can be defined as a collection of entities

called nodes linked together to simulate a hierarchy tree is a non-linear data structure it's a

hierarchical structure the topmost node in the tree is called root of the tree each node will

contain some data and this can be data of any type in the tree that I'm showing in right here

data is name of employee and designation so we can have an object with two string fields one to store

name and another to store designation okay so each node will contain some data and may contain link

or reference to some other nodes that can be called its children now I'm introducing you to some

vocabulary that we use for tree data structure what I'm going to do here is I'm going to number

these nodes in the left tree so I can refer to these nodes using these numbers I'm numbering

these nodes only for my convenience it's not to show any order okay coming back as I had said each

node will have some data we can fill in some data in these circles it can be data of any type it can

be an integer or a character or a string or we can simply assume that there is some data

filled inside these nodes and we are not showing it okay as we were discussing a node may have

link or reference to some other nodes that will be called its children each arrow in this structure

here is a link okay now as you can see the root node which is numbered 1 by me and once again this

number is not indicative of any order I could have called the root node node number 10 also so root

node has link to these two nodes number two and three so two and three will be called children of

one and node one will be called parent of nodes two and three I'll write down all these terms that

I am talking about we mentioned root children and parent in this tree one is

parent of two and three two is child of one and now four five and six are children of two

so node two is child of node one but parent of nodes four five and six children of same parent

are called siblings I'm showing siblings in same color here two and three are siblings then four

five and six are siblings then seven and eight are siblings and finally nine and ten are siblings

I hope you are clear with these terms now the topmost node in the tree is called root root would be the

only node without a parent and then if a node has a direct link to some other node then we have a

parent child relationship between the nodes any node in the tree that does not have a child is

called leaf node all these nodes marked in black here are leaves so leaf is one more term

all other nodes with at least one child can be called internal nodes

and we can have some more relationships like parent of parent can be called

grandparent so one is grandparent of four and four is grandchild of one in general if we can

go from node a to b walking through the links and remember these links are not bidirectional

we have a link from one to two so we can go from one to two but we cannot go from two to one

when we are walking the tree we can walk in only one direction okay so if we can go from node

a to node b then a can be called ancestor of b and b can be called descendant of a

let's pick up this node number 10 one two and five are all ancestors of 10 and 10 is a descendant

of all of these nodes we can walk from any of these nodes to 10 okay let me now ask you some

questions to make sure you understand things what are the common ancestors of four and nine

ancestors of four are one and two and ancestors of nine are one two and five so common ancestors

will be one and two okay next question are six and seven siblings siblings must have same parent

six and seven do not have same parent they have same grandparent one is grandparent of both

nodes not having same parent but having same grandparent can be called cousins so six and seven

are cousins and these relationships are really interesting we can also say that node number three

is uncle of node number six because it's sibling of two which is father of six

or i should say parent of six so we have quite some terms in vocabulary of tree
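The vocabulary above can be checked mechanically on the numbered example tree. The sketch below stores, for each node, the number of its parent. That array is purely an assumption for illustration (the links in the figure run from parent to child), and node 11 as the only child of node 7 is inferred from the depth and height discussion elsewhere in this lesson rather than stated outright.

```c
#include <assert.h>
#include <stdbool.h>

#define NODE_COUNT 11

/* parent[i] is the parent of node i; 0 means "no parent". Index 0 is unused.
   Tree: 1 -> {2, 3}, 2 -> {4, 5, 6}, 3 -> {7, 8}, 5 -> {9, 10}, 7 -> {11}. */
const int parent[NODE_COUNT + 1] = {0, 0, 1, 1, 2, 2, 2, 3, 3, 5, 5, 7};

/* a is an ancestor of b if walking up from b reaches a. */
bool IsAncestor(int a, int b) {
    assert(a >= 1 && a <= NODE_COUNT && b >= 1 && b <= NODE_COUNT);
    for (int p = parent[b]; p != 0; p = parent[p]) {
        if (p == a) return true;
    }
    return false;
}

/* Siblings are distinct nodes with the same parent. */
bool AreSiblings(int a, int b) {
    return a != b && parent[a] != 0 && parent[a] == parent[b];
}

/* Cousins have different parents but the same grandparent. */
bool AreCousins(int a, int b) {
    int pa = parent[a], pb = parent[b];
    return pa != 0 && pb != 0 && pa != pb &&
           parent[pa] != 0 && parent[pa] == parent[pb];
}

/* A leaf is a node that is nobody's parent. */
bool IsLeaf(int x) {
    for (int i = 1; i <= NODE_COUNT; i++) {
        if (parent[i] == x) return false;
    }
    return true;
}
```

With this data, nodes 1 and 2 come out as the common ancestors of 4 and 9, and `AreCousins(6, 7)` is true while `AreSiblings(6, 7)` is false, matching the answers given in the lesson.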

okay now i'll talk about some properties of tree tree can be called a recursive data structure

we can define tree recursively as a structure that consists of a distinguished node called root

and some subtrees and the arrangement is such that root of the tree contains links

to roots of all the subtrees t1 t2 and t3 in this figure are subtrees in the tree that i have drawn

in left here we have two subtrees for root node i'm showing the root node in red the left subtree

in brown and the right subtree in yellow we can further split the left subtree and look at it like

node number two is root of this subtree and this particular tree with node number two as root has

three subtrees i'm showing the three subtrees in three different colors recursion basically is

reducing something in a self-similar manner this recursive property of tree will be used

everywhere in all implementation and uses of tree the next property that i want to talk about

is in a tree with n nodes there will be exactly n minus one links or edges each arrow in this figure

can be called a link or an edge all nodes except the root node will have exactly one incoming edge

if you can see i'll pick this node number two there is only one incoming link this is incoming

link and these three are outgoing links there will be one link for each parent child relationship

so in a valid tree if there are n nodes there will be exactly n minus one edges one incoming edge

for each node except the root okay now i want to talk about these two properties called depth

and height depth of some node x in a tree can be defined as length of the path from root to node

x each edge in the path will contribute one unit to the length so we can also say number of edges

in path from root to x the depth of root node will be zero let's pick some other node

for this node number five we have two edges in the path from root so the depth of this

node is two in this tree here depth of nodes two and three is one depth of nodes four five six

seven and eight is two and the depth of nodes nine ten and eleven is three okay now height of

a node in a tree can be defined as number of edges in longest path from that node to a leaf node

so height of some node x will be equal to number of edges in longest path from x to a leaf in this

figure for node three the longest path from this node to any leaf has two edges so height of node three is

two node eight is also a leaf node i'll mark all the leaf nodes here a leaf node is a node with zero

children the longest path from node three to any of the leaf nodes has two edges so the height of node three is

two height of leaf nodes will be zero so what will be the height of root node in this tree

we can reach all the leaves from root node number of edges in longest path is three so height of the

root node here is three we also define height of a tree height of tree is defined as height of root

node height of this tree that i'm showing here is three height and depth are different properties

and height and depth of a node may or may not be same we often confuse the two
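Depth, height and the n minus one edges property can be computed for the example tree with a few lines of C. As a simplifying assumption, the tree is stored as an array giving each node's parent. The shape (in particular node 11 under node 7) is reconstructed from the depths and heights quoted above, and the function names are hypothetical.

```c
#include <assert.h>

#define NODE_COUNT 11

/* parent[i] is the parent of node i; 0 means "no parent". Index 0 is unused.
   Tree: 1 -> {2, 3}, 2 -> {4, 5, 6}, 3 -> {7, 8}, 5 -> {9, 10}, 7 -> {11}. */
const int parent[NODE_COUNT + 1] = {0, 0, 1, 1, 2, 2, 2, 3, 3, 5, 5, 7};

/* Depth: number of edges on the path from the root down to x. */
int Depth(int x) {
    assert(x >= 1 && x <= NODE_COUNT);
    int d = 0;
    for (int p = parent[x]; p != 0; p = parent[p]) d++;
    return d;
}

/* Height: number of edges on the longest path from x down to a leaf.
   A leaf has no children, so the loop never updates h and 0 is returned. */
int Height(int x) {
    assert(x >= 1 && x <= NODE_COUNT);
    int h = 0;
    for (int i = 1; i <= NODE_COUNT; i++) {
        if (parent[i] == x) {
            int viaChild = 1 + Height(i);
            if (viaChild > h) h = viaChild;
        }
    }
    return h;
}

/* One incoming edge for every node except the root. */
int EdgeCount(void) {
    int edges = 0;
    for (int i = 1; i <= NODE_COUNT; i++) {
        if (parent[i] != 0) edges++;
    }
    return edges;
}
```

Here node 3 has depth 1 and height 2, which illustrates that the two properties of the same node need not agree. `Height(1)` is 3, the height of the whole tree, and `EdgeCount()` is 10 for 11 nodes.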

based on properties trees are classified into various categories there are different kinds

of trees that are used in different scenarios simplest and most common kind of tree is a tree

with this property that any node can have at most two children in this figure node two has three

children i'm getting rid of some nodes and now this is a binary tree binary tree is most famous

and throughout this series we will mostly be talking about binary trees the most common way of

implementing tree is dynamically created nodes linked using pointers or references just the way

we do for linked list we can look at the tree like this in this structure that i have drawn in right

here node has three fields one of the fields is to store data let's say middle cell is to store

data the left cell is to store the address of the left child and the right cell is to store address

of right child because this is a binary tree we cannot have more than two children we can call

one of the children left child and another right child programmatically in c or c++ we can define

node as a structure like this we have three fields here one to store data let's say data type is

integer i have filled in some data in these nodes so in each node we have three fields we have an

integer variable to store the data and then we have two pointers to node one to store the address of

the left child that will be the root of the left subtree and another to store the address of the

right child we have kept only two pointers because we can have at most two children in binary

tree this particular definition of node can be used only for a binary tree for generic trees

that can have any number of children we use some other structure and i'll talk about it

in later lessons in fact we will discuss implementation in detail in later lessons this is just to give

you a brief idea of how things will be like in implementation okay so this is cool we understand

what a tree data structure is but in the beginning we had said that storing naturally hierarchical

data is not the only application of tree so let's quickly have a look at some of the applications

of tree in computer science first application of course is storing naturally hierarchical data

for example the file system on your disk drive the file and folder hierarchy is naturally hierarchical

data it's stored in the form of tree next application is organizing data organizing collections

for quick search insertion and deletion for example binary search tree that we'll be discussing a lot

in next couple of lessons can give us order of log n time for searching an element in it when the tree is balanced

a special kind of tree called trie is used to store dictionary it's really fast and efficient

and is used for dynamic spell checking tree data structure is also used in network routing

algorithms and this list goes on we'll talk about different kinds of trees and their applications

in later lessons i'll stop here now this is good for an introduction in next couple of lessons

we'll talk about binary search tree and its implementation this is it for this lesson thanks

for watching in our previous lesson we introduced you to tree data structure we discussed tree as

a logical model and talked briefly about some of the applications of tree now in this lesson

we will talk a little bit more about binary trees as we had seen in our previous lesson binary tree

is a tree with this property that each node in the tree can have at most two children we will first

talk about some general properties of binary tree and then we can discuss some special kind of

binary trees like binary search tree which is a really efficient structure for storing ordered data

in a binary tree as we were saying each node can have at most two children in this tree that

I've drawn here nodes have either zero or two children we could have a node with just one child

I have added one more node here and now we have a node with just one child because each node

in a binary tree can have at most two children we call one of the children left child and another

right child for the root node this particular node is left child and this one is right child

a node may have both left and right child and these four nodes have both left and right child

or a node can have either of left and right child this one has got a left child but has not got

right child I'll add one more node here now this node has a right child but does not have a left

child in a program we would set the reference or pointer to left child as null so we can say

that for this node left child is null and similarly for this node we can say that the right child is

null for all the other nodes that do not have children that are leaf nodes a node with zero

child is called leaf node for all these nodes we can say that both left and right child are null

based on properties we classify binary trees into different types I'll draw some more binary trees

here if a tree has just one node then also it's a binary tree this structure is also a binary tree

this is also a binary tree remember the only condition is that a node cannot have more than

two children a binary tree is called strict binary tree or proper binary tree if each node can have

either two or zero children this tree that I'm showing here is not a strict binary tree because

we have two nodes that have one child I'll get rid of two nodes and now this is a strict binary tree

we call a binary tree complete binary tree if all levels except possibly the last level are

completely filled and all nodes are as far left as possible all levels except possibly the last

level will anyway be filled so the nodes at the last level if it's not filled completely must be

as far left as possible right now this tree is not a complete binary tree nodes at same depth

can be called nodes at same level root node in a tree has depth zero depth of a node is defined as

length of path from root to that node in this figure let's say nodes at depth zero are nodes at level

zero I can simply say L0 for level zero now these two nodes are at level one these four nodes are

at level two and finally these two nodes are at level three the maximum depth of any node in the

tree is three maximum depth of a tree is also equal to height of the tree if we will go numbering

all the levels in the tree like L0 L1 L2 and so on then the maximum number of nodes that we can

have at some level i will be equal to two to the power i at level zero we can have one node

two to the power zero is one then at level one we can have at max two nodes at level two we can

have two to the power two nodes at max which is four so in general at any level i we can have

at max two to the power i nodes you should be able to see this very clearly because each node can

have two children so if we have x nodes at a level then each of these x nodes can have two children

so at next level we can have at most two x children here in this binary tree we have four nodes at

level two which is the maximum for level two now each of these nodes can possibly have two children

i'm just drawing the arrows here so at level three we can have max two times four that is eight

nodes now for a complete binary tree all the levels have to be completely filled we can give

exception to the last level or the deepest level it doesn't have to be full but the nodes have to be

as left as possible this particular tree that i'm showing here is not a complete binary tree

because we have two vacant node positions in left here i'll do slight change in this structure

now this is a complete binary tree we can have more nodes at level three but there should not be a

vacant position in left i have added one more node here and this still is a complete binary tree

if all the levels are completely filled such a binary tree can also be called perfect binary tree

in a perfect binary tree all levels will be completely filled if h is the height of a perfect

binary tree remember height of a binary tree is length of longest path between root to any of the

leaf nodes or i should say number of edges in longest path from root to any of the leaf nodes

height of a binary tree will also be equal to max depth here for this binary tree height or max depth

is three maximum number of nodes in a tree with height h will be equal to we'll have two to the

power zero nodes at level zero two to the power one node at level one and we'll go on summing for

height h we'll go till two to the power h at the deepest level we will have two to the power h nodes

now this will be equal to two to the power h plus one minus one h plus one is number of levels here

we can also say two to the power number of levels minus one in this tree number of levels is four

we have l zero till l three so number of nodes maximum number of nodes will be two to the power

four minus one which is 15 so a perfect binary tree will have maximum number of nodes possible

for a height because all levels will be completely filled well i should say maximum number of nodes

in a binary tree with height h okay i can ask you this also what will be height of a perfect

binary tree with n nodes let's say n is number of nodes in a perfect binary tree to find out the

height we'll have to solve this equation n equal two to the power h plus one minus one because if height

is h number of nodes will be two to the power h plus one minus one we can solve this equation

and the result will be this remember n is number of nodes here i leave the maths for you to understand

height will be equal to log n plus one to the base two minus one in this perfect binary tree

that i'm showing here number of nodes is 15 so n is 15 n plus one will be 16 so the h will be

log 16 to the base two minus one log 16 to the base two will be four so the final value will be

four minus one equal three in general for a complete binary tree we can also calculate height as

floor of log n to the base two so we need to take integral part of log n to the base two

perfect binary tree is also a complete binary tree here n is 15 log of 15 to base two is

3.906891 if we'll take the integral part then this will be three i'll not go into proof of how

height of complete binary tree will be log n to the base two we'll try to see that later
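
The height formula quoted above follows from summing the level sizes. A short derivation, using the lesson's own symbols n (number of nodes) and h (height counted in edges), might look like this:

```latex
\begin{align*}
n &= 2^0 + 2^1 + \dots + 2^h = 2^{h+1} - 1 \\
n + 1 &= 2^{h+1} \\
h &= \log_2(n + 1) - 1
\end{align*}
% Check with the 15-node perfect tree: \log_2 16 - 1 = 4 - 1 = 3.
% Complete binary tree with n nodes: h = \lfloor \log_2 n \rfloor,
% for example \lfloor \log_2 15 \rfloor = \lfloor 3.9069 \rfloor = 3.
```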

all this maths will be really helpful when we will analyze cost of various operations on binary

tree cost of a lot of operations on tree in terms of time depends upon the height of tree for example

in binary search tree which is a special kind of binary tree the cost of searching inserting or

removing an element in terms of time is proportional to the height of tree so in such case we would

want the height of the tree to be less height of a tree will be less if the tree will be dense if

the tree will be close to a perfect binary tree or a complete binary tree minimum height of a tree

with n nodes can be log n to the base two when the tree will be a complete binary tree if we will

have an arrangement like this then the tree will have maximum height with n nodes minimum height

possible is floor of or integral part of log n to the base two and maximum height possible

with n nodes is n minus one when we will have a sparse tree like this which is as good as a linked

list now think about this if i'm saying that time taken for an operation

is proportional to height of the tree or in other words i can say that if time complexity

of an operation is big o of h where h is height of the binary tree then for a complete or perfect

binary tree my time complexity will be big o of log n to the base two and in worst case for this

sparse tree my time complexity will be big o of n order of log n is almost best running time possible

for n as high as two to the power hundred log n to the base two is just hundred with order of

n running time if n will be two to the power hundred we won't be able to finish our computation

in years even with most powerful machines ever made so here's the thing quite often we want to

keep the height of a binary tree minimum possible or most commonly we say that we try to keep a

binary tree balanced we call a binary tree balanced binary tree if for each node the difference

between height of left and right sub tree is not more than some number k mostly k would be one

so we can say that for each node difference between height of left and right sub tree

should not be more than one there is something that i want to talk about height of a tree we had

defined height earlier as number of edges in longest path from root to a leaf height of a tree

with just one node where the node itself will be a leaf node will be zero we can define an empty tree

as a tree with no node and we can say that height of an empty tree is minus one so height of tree

with just one node is zero and height of an empty tree is minus one quite often people calculate

height as number of nodes in longest path from root to a leaf in this figure i have drawn one

of the longest paths from root to a leaf we have three edges in this path so the height is three

if we will count number of nodes in the path height will be four this looks very intuitive and i have

seen this definition of height at a lot of places if we will count the nodes height of tree with just

one node will be equal to one and then we can say height of an empty tree will be zero but this is

not the correct definition and we are not going to use this assumption we are going to say

that height of an empty tree is minus one and height of tree with one node is zero the difference

between heights of left and right sub trees of a node can be calculated as absolute value of height

of left sub tree minus height of right sub tree and in this calculation height of a sub tree can be

minus one also for this leaf node here in this figure both left and right sub trees are empty

so both h left or height of left sub tree and h right or height of right sub tree will be minus

one but the difference overall will be zero for all nodes in a perfect tree difference will be zero
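
The height convention and the diff calculation described here can be sketched in C++. This is only an illustration under the lesson's convention (empty tree has height minus one, height counted in edges); the names `Node`, `height` and `isBalanced` are chosen for this sketch and are not taken from the lesson.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Height counted in edges: an empty tree is -1, a single node is 0.
int height(const Node* root) {
    if (root == nullptr) return -1;
    return 1 + std::max(height(root->left), height(root->right));
}

// True if, for every node, |height(left) - height(right)| <= k.
bool isBalanced(const Node* root, int k = 1) {
    assert(k >= 0);
    if (root == nullptr) return true;
    int diff = std::abs(height(root->left) - height(root->right));
    return diff <= k && isBalanced(root->left, k) && isBalanced(root->right, k);
}
```

This version recomputes heights at every node, so it is simple rather than fast; a single bottom-up pass would avoid the repeated work.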

i have got rid of some nodes in this tree and now by the side of each node i have written

the value of diff this is still a balanced binary tree because the maximum diff for any

node is one let's get rid of some more nodes in this tree and now this is not balanced because

one of the nodes has diff two for this particular node height of left sub tree is one and height of

right sub tree is minus one because right sub tree is empty so the absolute value of difference is

two we try to keep a tree balanced to make sure it's dense and its height is minimized if height

is minimized cost of various operations that depend upon height are minimized okay the next thing that

i want to talk about very briefly is how we can store binary trees in memory one of the ways that

we had seen in our previous lesson which is most commonly used is dynamically created nodes

linked to each other using pointers or references for a binary tree of integers in c or c plus plus

we can define a node like this data type here is integer so we have a field to store data

and we have two pointer variables one to store address of left child and another to store address

of right child this of course is the most common way nodes dynamically created at random locations in

memory linked together through pointers but in some special cases we use arrays also arrays are

typically used for complete binary trees i have drawn a perfect binary tree here let's say this is

a tree of integers what we can do is we can number these nodes from zero starting at root and going

level by level from left to right so we'll go like zero one two three four five and six now i can

create an array of seven integers and these numbers can be used as indices for these nodes

so at zeroth position i'll fill two at first position i'll fill four at second position we'll have

one and i'll go on like this we have filled in all the data in the array but how will we store

the information about the links how will we know that the left child of root has value four and the

right child of root has value one well in case of complete binary tree if we will number the nodes

like this then for a node at index i the index of left child will be two i plus one and the index

of right child will be two i plus two and remember this is true only for a complete binary tree

for zero left child is two i plus one for i equals zero will be one and two i plus two will be two

now for one left child is at index three right child is at index four for i equal two two i plus

one will be five and two i plus two will be six we will discuss our implementation in detail when

we will talk about a special kind of binary tree called heap arrays are used to implement heaps
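
The index arithmetic above can be written down directly. A minimal sketch, assuming the tree is complete and stored level by level with the root at index 0; the helper names are made up for this example, and the `parent` formula is not stated in the lesson but follows from the two child formulas.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Children of the node stored at index i (root is at index 0).
std::size_t leftChild(std::size_t i)  { return 2 * i + 1; }
std::size_t rightChild(std::size_t i) { return 2 * i + 2; }

// Parent index, derived from the child formulas above.
std::size_t parent(std::size_t i) {
    assert(i > 0);  // the root has no parent
    return (i - 1) / 2;
}

// A child exists only if its index falls inside the array.
bool hasLeft(const std::vector<int>& tree, std::size_t i) {
    return leftChild(i) < tree.size();
}

bool hasRight(const std::vector<int>& tree, std::size_t i) {
    return rightChild(i) < tree.size();
}
```

No pointers are stored at all here: the links are implied by the positions, which is why this layout only works when there are no gaps in the level order.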

i'll stop here now in our next lesson we will talk about binary search tree which is also a special

kind of binary tree that gives us a really efficient storing structure in which we can

search something quickly as well as update it quickly this is it for this lesson thanks for watching

in our previous lesson we talked about binary trees in general now in this lesson we're going

to talk about binary search tree a special kind of binary tree which is an efficient structure

to organize data for quick search as well as quick update but before i start talking about

binary search trees i want you to think of a problem what data structure will you use

to store a modifiable collection so let's say you have a collection and it can be a collection

of any data type records in the collection can be of any type now you want to store this collection

in computer's memory in some structure and then you want to be able to quickly search for a

record in the collection and you also want to be able to modify the collection you want to be able

to insert an element in the collection or remove an element from the collection so what data structure

will you use well you can use an array or a linked list these are two well-known data structures

in which we can store a collection now what will be the running time of these operations

search insertion or removal if we will use an array or a linked list let's first talk about

arrays and for sake of simplicity let's say we want to store integers to store a modifiable list

or collection of integers we can create a large enough array and we can store the records in some

part of the array we can keep the end of the list marked in this array that i'm showing here

we have integers from 0 till 3 we have records from 0 till 3 and rest of the array is available

space now to search some x in the collection we will have to scan the array from index 0 till

end and in worst case we may have to look at all the elements in the list if n is the number of

elements in the list time taken will be proportional to n or in other words we can say that time

complexity of this operation will be big o of n okay now what will be the cost of insertion

let's say we want to insert number five in this list so if there is some available space

all these cells in yellow are available we can add one more cell by incrementing this marker

end and we can fill in the integer to be added the time taken for this operation will be constant

running time will not depend upon number of elements in the collection so we can say that

time complexity will be big o of 1 okay now what about removal let's say we want to remove one

from the collection what we'll have to do is we'll have to shift all records to the right of one

by one position to the left and then we can decrement end the cost of removal in worst case

once again will be big o of n in worst case we will have to shift n minus one elements
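
The three array operations discussed so far can be sketched as follows. This assumes an unsorted list of integers kept at the front of a fixed-size array with an `end` marker, as in the figure; the capacity of 100 and all names (`ArrayList`, `search`, `insert`, `removeValue`) are arbitrary choices for the illustration.

```cpp
#include <cassert>

const int CAPACITY = 100;  // arbitrary size chosen for this sketch

struct ArrayList {
    int data[CAPACITY];
    int end = -1;  // index of the last record; -1 means the list is empty
};

// O(n): scan from index 0 to end. Returns the index of x, or -1 if absent.
int search(const ArrayList& list, int x) {
    for (int i = 0; i <= list.end; ++i)
        if (list.data[i] == x) return i;
    return -1;
}

// O(1) while there is free space: advance the marker and fill the cell.
void insert(ArrayList& list, int x) {
    assert(list.end + 1 < CAPACITY);  // the sketch assumes the array is large enough
    list.data[++list.end] = x;
}

// O(n): shift every record right of x one position left, then decrement end.
bool removeValue(ArrayList& list, int x) {
    int pos = search(list, x);
    if (pos == -1) return false;
    for (int i = pos; i < list.end; ++i)
        list.data[i] = list.data[i + 1];
    --list.end;
    return true;
}
```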

here the cost of insertion will be big o of 1 if the array will have some available space

so the array has to be large enough if the array gets filled what we can do is we can create a new

larger array typically we create an array twice the size of the filled up array so we can create a

new larger array and then we can copy the content of the filled up array into this new larger array

the copy operation will cost us big o of n we have discussed this idea of dynamic array quite

a bit in our previous lessons so insertion will be big o of 1 if array is not filled up and it will

be big o of n if array is filled up for now let's just assume that the array will always be large

enough let's now discuss the cost of these operations if we will use a linked list

if we would use a linked list i have drawn a linked list of integers here data type can be

anything the cost of search operation once again will be big o of n where n is number of

records in the collection or number of nodes in the linked list to search in worst case we will

have to traverse the whole list we will have to look at all the nodes the cost of insertion in a

linked list is big o of 1 at head and it's big o of n at tail we can choose to insert at head

to keep the cost low so running time of insertion we can say is big o of 1 or in other words we will

take constant time removal once again will be big o of n we will first have to traverse the linked

list and search the record and in worst case we may have to look at all the nodes

okay so this is the cost of operations if we are going to use array or linked list

insertion definitely is fast but how good is big o of n for an operation like search

what do you think if we are searching for a record x then in the worst case we will have to compare

this record x with all the n records in the collection let's say our machine can perform

a million comparisons in one second so we can say that machine can perform 10 to the power

6 comparisons in one second so cost of one comparison will be 10 to the power minus 6 second

machines in today's world deal with really large data it's very much possible for real world data

to have 100 million or billion records a lot of countries in this world have population more than

100 million two countries have more than a billion people living in them if we will have data about

all the people living in a country then it can easily be 100 million records okay so if we are

saying that the cost of one comparison is 10 to the power minus 6 second if n will be 100 million

time taken will be 100 seconds 100 seconds for a search is not reasonable and search may be a

frequently performed operation can we do something better can we do better than big o of n well in

an array we can perform binary search if it's sorted and the running time of binary search

is big o of log n which is the best running time to have i have drawn this array of integers here

records in the array are sorted here the data type is integer for some other data type for some

complex data type we should be able to sort the collection based on some property or some key

of the records we should be able to compare the keys of records and the comparison logic will be

different for different data types for a collection of strings for example we may want to have the

records sorted in dictionary or lexicographical order so we will compare and see which string will come

first in dictionary order now this is the requirement that we have for binary search

the data structure should be an array and the records must be sorted okay so the cost of search

operation can be minimized if we will use a sorted array but in insertion or removal we will have to

make sure that the array is sorted afterwards in this array if i want to insert number five at this

stage i can't simply put five at index six what i'll have to do is i'll first have to find the

position at which i can insert five in the sorted list we can find the position in order of log n

time using binary search we can perform a binary search to find the first integer greater than five

in the list so we can find the position quickly in this case it's index two but then we will have to

shift all the records starting this position one position to the right and now i can insert five
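
Insertion into a sorted array can be sketched with the standard library. This is one possible way to do it, assuming a `std::vector<int>` in place of the raw array in the figure; `sortedInsert` is a name invented for the example. `std::upper_bound` performs the binary search for the first element greater than x, and `vector::insert` does the shifting.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Insert x into an already sorted vector and keep it sorted.
// Finding the position is O(log n); the shift done by insert() is O(n).
// Returns the index at which x was placed.
std::size_t sortedInsert(std::vector<int>& sorted, int x) {
    // Precondition check only; it is compiled out when NDEBUG is defined.
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    // Binary search for the first element strictly greater than x.
    auto pos = std::upper_bound(sorted.begin(), sorted.end(), x);
    std::size_t index = static_cast<std::size_t>(pos - sorted.begin());
    sorted.insert(pos, x);  // shifts the later elements one position right
    return index;
}
```

Even though the search part is fast, the overall cost stays linear because of the shift, which is exactly the limitation the lesson is pointing at.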

so even though we can find the position at which a record should be inserted

quickly in big o of log n this shifting in worst case will cost us big o of n so the running

time overall for an insertion will be big o of n and similarly the cost of removal will also be

big o of n we will have to shift some records okay so when we are using sorted array cost of search

operation is minimized in binary search for n records we will have at max log n to the base

two comparisons so if we can compare if we can perform million comparisons in a second

then for n equal two to the power 31 which is greater than two billion we are going to take

only 31 microseconds log of two to the power 31 to base two will be 31 okay we are fine with

search now we will be good for any practical value of n but what about insertion and removal

they are still big o of n can we do something better here well if we will use this data structure

called binary search tree i'm writing it in short bst for binary search tree then the cost of all

these three operations can be big o of log n in average case the cost of all the operations will

be big o of n in worst case but we can avoid the worst case by making sure that the tree is always

balanced we have talked about balanced binary tree in our previous lesson binary search tree is

only a special kind of binary tree to make sure that the cost of these operations is always big

o of log n we should keep the binary search tree balanced we'll talk about this in detail later

let's first see what a binary search tree is and how cost of these operations is minimized when

we use a binary search tree binary search tree is a binary tree in which for each node value of all

the nodes in left subtree is lesser and value of all the nodes in right subtree is greater
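
This definition can be checked recursively by passing down the range of values a subtree is allowed to hold. A minimal sketch, assuming integer data and assuming equal values are permitted on the left side; `Node`, `isBstInRange` and `isBinarySearchTree` are illustrative names, not from the lesson.

```cpp
#include <cassert>
#include <climits>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// True if every value in this subtree lies in [lo, hi] and the
// ordering rule holds at every node below.
bool isBstInRange(const Node* root, long long lo, long long hi) {
    if (root == nullptr) return true;  // an empty tree is a valid BST
    if (root->data < lo || root->data > hi) return false;
    // Left subtree may hold values <= data, right subtree values > data.
    return isBstInRange(root->left, lo, root->data) &&
           isBstInRange(root->right, static_cast<long long>(root->data) + 1, hi);
}

bool isBinarySearchTree(const Node* root) {
    return isBstInRange(root, INT_MIN, INT_MAX);
}
```

Comparing a node only with its two children would not be enough: the rule is about all nodes in each subtree, which is why the allowed range is carried down from the ancestors.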

i have drawn binary tree as a recursive structure here as we know in a binary tree each node can

have at most two children we can call one of the children left child if we will look at the tree

as a recursive structure left child will be the root of left subtree and similarly right child

will be the root of right subtree now for a binary tree to be called binary search tree

value of all the nodes in left subtree must be lesser or we can say lesser or equal to handle

duplicates and the value of all the nodes in right subtree must be greater and this must be true for

all the nodes so in this recursive structure here both left and right subtrees must also be

binary search trees i'll draw a binary search tree of integers now i have drawn a binary search

tree of integers here let's see whether this property that for each node value of all the

nodes in left subtree must be lesser or equal and value of all the nodes in right subtree must be

greater is true or not let's first look at the root node nodes in the left subtree have values

10 8 and 12 so they're all lesser than 15 and in right subtree we have 17 20 and 25 they're all

greater than 15 so we are good for the root node now let's look at this node with value 10

in left we have 8 which is lesser in right we have 12 which is greater so we are good we are

good for this node too having value 20 and we don't need to bother about leaf nodes because they do

not have children so this is a binary search tree now what if i change this value 12

to 16 now is this still a binary search tree well for node with value 10 we are good

the node with value 16 is in its right so not a problem but for the root node we have a node in

left subtree with higher value now so this tree is not a binary search tree i'll revert back and

make the value 12 again now as we were saying we can search insert or delete in a binary search

tree in big o of log n time in average case how is it really possible let's first talk about search

if these integers that i have here in the tree were in a sorted array we could have performed

binary search and what do we do in binary search let's say we want to search number 10 in this array

what we do in binary search is we first define the complete list as our search space

the number can exist only within the search space i'll mark search space using these two pointers

start and end now we compare the number to be searched or the element to be searched with

mid element of the search space or the median and if the record being searched if the element

being searched is lesser we go searching in the left half else we go searching in the right half

in case of equality we have found the element in this case 10 is lesser than 15

so we will go searching towards left our search space is reduced now to half once again we will

compare to the mid element and bingo this time we have got a match in binary search we start with

n elements in search space and then if mid element is not the element that we are looking for we

reduce the search space to n by 2 and we go on reducing the search space to half till we either

find the record that we are looking for or we get to only one element in search space

and be done in this whole reduction if we will go from n to n by 2 to n by 4 to n by 8 and so on

we will have log n to the base two steps if we are taking k steps then n upon 2 to the power k

will be equal to 1 which will imply 2 to the power k will be equal to n and k will be equal to log

n to the base 2 so this is why running time of binary search is big o of log n now if we'll use

this binary search tree to store the integers search operation will be very similar let's say

we want to search for number 12 what we'll do is we'll start at root and then we will compare the

value to be searched the integer to be searched with value of root if it's equal we are done with

the search if it's lesser we know that we need to go to the left subtree because in a binary search

tree all the elements in left subtree are lesser and all the elements in right subtree are greater

now we'll go and look at the left child of node with value 15 we know that number 12 that we are

looking for can exist in this subtree only and anything apart from this subtree is discarded so

we have reduced the search space to only these three nodes having value 10, 8 and 12 now once

again we'll compare 12 with 10 we are not equal 12 is greater so we know that we need to go

looking in right subtree of this node with value 10 so now our search space is reduced

to just one node once again we will compare the value here at this node and we have a match
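
The walk just described can be sketched as a loop. This is an illustration only, assuming integer data; the names `Node` and `search` are chosen for the example.

```cpp
#include <cassert>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Start at the root and, at each step, discard one subtree.
// Returns true if value is present in the tree.
bool search(const Node* root, int value) {
    const Node* current = root;
    while (current != nullptr) {
        if (value == current->data) return true;  // match found
        // Lesser values can only be in the left subtree, greater in the right.
        current = (value < current->data) ? current->left : current->right;
    }
    return false;  // ran off the tree: the search space is empty
}
```

The number of loop iterations is at most the height of the tree plus one, which is why the shape of the tree decides the running time.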

so searching an element in binary search tree is basically this traversal in which

at each step we will go either towards left or right and hence at each step we will discard

one of the subtrees if the tree is balanced we call a tree balanced if for all nodes

the difference between the heights of left and right subtrees is not greater than one so if

the tree is balanced we will start with a search space of n nodes and when we will discard one of

the subtrees we will discard n by two nodes so our search space will be reduced to n by two

and then in next step we will reduce the search space to n by four we will go on reducing like this

till we find the element or till our search space is reduced to only one node when we will be done

so the search here is also a binary search and that's why the name binary search tree

this tree that i'm showing here is balanced in fact this is a perfect binary tree but with

same records we can have an unbalanced tree like this this tree has got the same integer values

as we had in the previous structure and this is also a binary search tree but this is unbalanced

this is as good as a linked list in this tree there is no right subtree for any of the nodes

search space will be reduced by only one at each step from n nodes in search space we will go to n

minus one nodes and then to n minus two nodes all the way till one so there will be n steps in binary search

tree in average case cost of search insertion or deletion is big o of log n and in worst case

this is the worst case arrangement that i'm showing you running time will be big o of n

we always try to avoid the worst case by trying to keep the binary search tree balanced

with same records in the tree there can be multiple possible arrangements for these integers in this

tree another arrangement is this for all the nodes we have nothing to discard in left subtree

in a search this is another arrangement this is still balanced because for all the nodes the

difference between the heights of left and right subtrees is not greater than one but this is the

best arrangement when we have a perfect binary tree at each step we will have exactly n by two

nodes to discard okay now to insert some record in binary search tree we will first have to find

the position at which we can insert and we can find the position in big o of log n time let's say

we want to insert 19 in this tree what we will do is we will start at the root if the value to be

inserted is lesser or equal if there is no left child insert as left child or go left if the value is

greater and there is no right child insert as right child or go right in this case 19 is greater so

we will go right now we are at 20 19 is lesser and left subtree is not empty we have a left child

so we will go left now we are at 17 19 is greater than 17 so it should go in right of 17 there is no

right child of 17 so we will create a node with value 19 and link it to this node with value 17

as right child because we are using pointers or references here just like linked list no shifting

is needed like an array creating a link will take constant time so overall insertion will also

cost us like search to delete also we will first have to search the node search once again will be

big o of log n and deleting the node will only mean adjusting some links so removal also is going to

be like search big o of log n in average case binary search tree gets unbalanced during insertion

and deletion so often during insertion and deletion we restore the balancing there are ways to do it

and we will talk about all of this in detail in later lessons in next lesson we will discuss

implementation of binary search tree in detail this is it for this lesson thanks for watching
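
The insertion walk described in this lesson can be sketched as follows. It is a hedged illustration, not the implementation from the next lesson: integer data is assumed, duplicates go to the left as in the rule above, no rebalancing is done, and the names `Node` and `insert` are made up for the example.

```cpp
#include <cassert>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Inserts value and returns the root of the tree (a new root if the tree was empty).
Node* insert(Node* root, int value) {
    Node* fresh = new Node{value, nullptr, nullptr};
    if (root == nullptr) return fresh;
    Node* current = root;
    while (true) {
        if (value <= current->data) {
            // Lesser or equal: insert as left child if the slot is free, else go left.
            if (current->left == nullptr) { current->left = fresh; break; }
            current = current->left;
        } else {
            // Greater: insert as right child if the slot is free, else go right.
            if (current->right == nullptr) { current->right = fresh; break; }
            current = current->right;
        }
    }
    return root;
}
```

Only one link is written at the end, so unlike the sorted array no records are shifted; the cost is the length of the path walked.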

in our previous lesson we saw what binary search trees are now in this lesson we are going to

implement binary search tree we will be writing some code for binary search tree prerequisites

for this lesson is that you must understand the concepts of pointers and dynamic memory allocation

in c/c++ if you have already followed this series and seen our lessons on linked list

then implementation of binary search tree or binary tree in general it's not going to be very

different we will have nodes and links here as well okay so let's get started binary search tree or

bst as we know is a binary tree in which for each node value of all the nodes in left subtree is

lesser or equal and value of all the nodes in right subtree is greater we can draw bst

as a recursive structure like this value of all the nodes in left subtree must be lesser or equal

and value of all the nodes in right subtree must be greater and this must be true for all nodes

and not just a root node so in this recursive definition here both left and right subtrees must

also be binary search trees i have drawn a binary search tree of integers here now the question is

how can we create this non-linear logical structure in computer's memory i had talked about this

briefly when we had discussed binary trees the most popular way is dynamically created nodes

linked to each other using pointers or references just the way we do it for linked lists because in

a binary search tree or in a binary tree in general each node can have at most two children we can

define node as an object with three fields something like what i'm showing here we can have a field to

store data another to store address or reference to left child and another to store address or

reference to right child if there is no left or right child for a node reference can be set as null

in c or c++ we can define node like this there is a field to store data here the data type is

integer but it can be anything there is one field which is pointer to node node asterisk means

pointer to node this one is to store the address of left child and we have another one to store

the address of right child this definition of node is very similar to definition of node

for doubly linked list remember in doubly linked list also each node had two links

one to previous node and another to next node but doubly linked list was a linear arrangement
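
As a rough sketch, the node definition being described could be written like this in C++. The name BstNode is the one the narration settles on a little later, and the field names data, left and right are assumed from the spoken description.

```cpp
// Node of a binary tree: one data field and two links.
struct BstNode {
    int data;        // value stored in the node; the type could be anything
    BstNode* left;   // address of the left child, nullptr if there is none
    BstNode* right;  // address of the right child, nullptr if there is none
};
```

In C the two link fields would have to be declared as struct BstNode* rather than just BstNode*.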

this definition of node is for a binary tree we could also name this something like bst node

but node is also fine let's go with node now in our implementation just like linked list

all the nodes will be created in dynamic memory or heap section of applications memory

using malloc function in c or new operator in c++ we can use malloc in c++ as well

now as we know any object created in dynamic memory or heap section of applications memory

cannot have a name or identifier it has to be accessed through a pointer malloc or new operator

returns a pointer to the object created in heap if you want to revise some of these concepts of

dynamic memory allocation you can check the description of this video for

link to a lesson it's really important that you understand this concept of stack and heap in

applications memory really well now for a linked list if you remember the information that we always

keep with us is address of the head node if we know the head node we can access all other nodes

using links in case of trees the information that we always keep with us is address of the

root node if we know the root node we can access all other nodes in the tree using links to create

a tree we first need to declare a pointer to bst node i'll rather call node bst node here bst

for binary search tree so to create a tree we first need to declare a pointer to bst node that will

always store the address of root node i have declared a pointer to node here named root ptr

for pointer in c you can't just write bst node asterisk root ptr you will have to write

struct space bst node asterisk you will have to write struct here as well i'm gonna write c++

here but anyway right now i'm trying to explain the logic we will not bother about minute details

of implementation in this logical structure of tree that i'm showing here each node as you can see

has three fields three cells leftmost cell is to store the address of left child and rightmost

cell is to store address of right child let's say root node is at address 200 in memory and i'll

assume some random addresses for all other nodes as well now i can fill in these left and right

cells for each node with addresses of left and right children in our definition data is first field

but in this logical structure i'm showing data in the middle okay so for each node i have filled in

addresses for both left and right child address is zero or null if there is no child now as we were

saying identity of the tree is address of the root node we need to have a pointer to node in

which we can store the address of the root node we must have a variable of type pointer to node

to store the address of root node all these rectangles with three cells are nodes they are created using

malloc or new operator and live in heap section of applications memory we cannot have name or

identifier for them they are always accessed through pointers this root PTR root pointer has

to be a local or global variable we will discuss this in a little more detail in some time

quite often we like to name this root pointer just root we can do so but we must not forget that

this is pointer to root and not the root itself to create a BST as i was saying we first need to

declare this pointer initially we can set this pointer as null to say that the tree is empty

a tree with no node can be called empty tree and for empty tree root pointer should be set as null

we can do this declaration and setting the root as null in main function in our program

actually let's just write this code in a real compiler i'm writing c++ here as you can see in

the main function i have declared this pointer to node which will always store the address of root

node of my tree and i'm initially setting this as null to say that the tree is empty with this

much of code we have created an empty tree but what's the point of having an empty tree we should

have some data in it so what i want to do now is i want to write a function to be able to insert

a node in the tree i will write a function named insert that will take address of the root node

and data to be inserted as argument and this function will insert a node with this data

at appropriate position in the tree in the main function i'll make calls to this insert function

passing it address of root and the data to be inserted let's say first i want to insert

number 15 and then i want to insert number 10 and then number 20 we can insert more but let's first

write the logic for insert function before i write the logic for insert function i want to

write a function to create a new node in dynamic memory or heap this function get new node

should take an integer the data to be inserted as argument create a node in heap using

new or malloc and return back the address of this new node i'm creating the new node here

using this new operator the operator will return me the address of the newly created node

which i'm collecting in this variable of type pointer to bst node in c instead of new operator

we will have to use malloc we can use malloc in c++ as well c++ is for the most part a superset of c

malloc will also do here now anything in dynamic memory or heap is always accessed

through pointer now using this pointer new node we can access the fields of the newly created node
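
Put together, the get new node function being built here might look like the sketch below. The spelling GetNewNode and the local name newNode are assumed from the narration. The first assignment uses the explicit dereference form and the other two use the arrow form, which the narration goes on to explain, so that both can be compared.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Creates a node in heap memory and returns its address.
BstNode* GetNewNode(int data) {
    BstNode* newNode = new BstNode();  // new returns the address of the node
    (*newNode).data = data;            // dereference first, then access the field
    newNode->left = nullptr;           // arrow form, same meaning as (*newNode).left
    newNode->right = nullptr;
    return newNode;                    // the caller keeps this address
}
```

The node has no name of its own. The only way to reach it afterwards is through the returned pointer.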

i'll have to dereference this pointer using asterisk operator so i'm writing asterisk new node

and now i can access the fields we have three fields in node data and two pointers to node left

and right i've set the data here instead of writing asterisk new node dot data we have

this alternate syntax that we could use you could simply write new node

arrow data and this will mean the same we have used this syntax quite a lot in our lessons on linked

list now for the new node we can set the left and right child as null and finally we can return the

address of the new node okay coming back to the insert function we can have a couple of cases

in insertion first of all tree may be empty for this first insertion when we are inserting this

number 15 tree will be empty if tree is empty we can simply create a new node and set it as root

with this statement root equal get new node i'm setting root as address of the new node

but there is something not all right here this root is local variable of insert function

and its scope is only within this function we want this root root in main to be modified

this guy is a local variable of main function there are two ways of doing this

we can either return the address of the new root so return type of insert function will be

pointer to bst node and not void and here in the main function we will have to write statement

like root equal insert and the arguments so we will have to collect the return and update our root

in main function another way is that we can pass the address of this root of main to the insert

function this root is already a pointer to node so its address can be collected in a pointer to

pointer so in insert function first argument will be a pointer to pointer and here

we can pass the address we'll say ampersand root to pass the address we can name this argument

root or we can name this argument root ptr we can name this whatever now what we need to do is we

need to dereference this using asterisk operator to access the value in root of main and we can

also set the value in root of main so here with this statement we are setting the value

and the return type now can be void this pointer to pointer thing gets a little tricky
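
For comparison, here is a minimal sketch of the pointer to pointer variant, which the lesson mentions but does not pursue. The parameter name rootPtr is one of the names suggested in the narration. The two recursive branches are my own completion, following the lesser or equal goes left rule from the definition of a BST.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

BstNode* GetNewNode(int data) {
    return new BstNode{data, nullptr, nullptr};
}

// rootPtr holds the address of the caller's own root pointer,
// so writing through *rootPtr changes the caller's variable.
void Insert(BstNode** rootPtr, int data) {
    if (*rootPtr == nullptr) {
        *rootPtr = GetNewNode(data);        // empty (sub)tree: link the new node here
    } else if (data <= (*rootPtr)->data) {
        Insert(&(*rootPtr)->left, data);    // pass the address of the left link
    } else {
        Insert(&(*rootPtr)->right, data);   // pass the address of the right link
    }
}
```

A call from main would then look like Insert(&root, 15), with nothing to collect from the return.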

i'll go with the former approach actually there is another way instead of declaring

root as a local variable in main function we can declare root as global variable

global variable as you know has to be declared outside all the functions if root would be global

variable it would be accessible to all the functions and we will not have to pass the address stored

in it as argument anyway coming back to the logic for insertion as we were saying if the tree is

empty we can simply create a new node and we can simply set it as root at this stage we wanted to

insert 15 if we will make call to the insert function address of root is 0 or null null is

only a macro for 0 and the second argument is the number to be inserted in this call to insert function

we will make call to get new node let's say we got this new node at address 200 get new node

function will return us address 200 which we can set as root here but this root is a local variable

we will return this address 200 back to the main function and in the main function we are actually

doing this root equal insert so in the main function we are building this link

okay our next call in the main function was to insert number 10 at this stage root is 200

the address in root is 200 and the value to be inserted is 10 now the tree is not empty so what

do we do if the tree is not empty we can basically have two cases if the data to be inserted is lesser

or equal we need to insert it in the left subtree of root and if the data to be inserted is greater

we need to insert it in right subtree of the root so we can reduce this problem in a self-similar

manner in a recursive manner recursion is one thing that we are going to use almost all the time while

working with trees in this function i'll say that if the data to be inserted is less than or equal to

the data in root then make a recursive call to insert data in left subtree the root of the left

subtree will be the left child so in this recursive call we are passing address of left child and

data as argument and after the data is inserted in left subtree the root of the left subtree can

change insert function will return the address of the new root of the left subtree and we need to

set it as left child of the current node in this example tree here right now both left and right

subtree are empty we are trying to insert number 10 so we have made call to this function insert

from main function we have called insert passing it address 200 and value or data 10 now 10 is

lesser than 15 so control will come to this line and a call will be made to insert data in left

subtree now left subtree is empty so address of root for left subtree is 0 data passed data to be

inserted passed as argument is 10 now this first insert call will wait for this insert below to finish

and return for this last insert call root is null let's say we got this node at address 150

now this insert call will return back 150 and execution of first insert call will resume at this

line and now this particular address will be set as 150 so we will build this link

and now this insert call can finish it can return back the current root actually this return root

should be there for all cases so i'm taking it out and i have it after all these conditions

of course we will have one more else here if the data is greater we need to go insert in right subtree
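
Assembled from the pieces described so far, the recursive insert might read as follows. This is a sketch that assumes the names Insert and GetNewNode and the approach of returning the root address, which is the one the lesson goes with.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

BstNode* GetNewNode(int data) {
    return new BstNode{data, nullptr, nullptr};
}

// Inserts data into the tree rooted at root and returns the
// (possibly new) address of the root of that tree.
BstNode* Insert(BstNode* root, int data) {
    if (root == nullptr) {
        root = GetNewNode(data);                  // empty tree: new node becomes root
    } else if (data <= root->data) {
        root->left = Insert(root->left, data);    // lesser or equal goes left
    } else {
        root->right = Insert(root->right, data);  // greater goes right
    }
    return root;                                  // returned in all cases
}
```

In main the result has to be collected each time, as in root = Insert(root, 15), because the root inside Insert is only a local copy.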

the third call to insert function in main was to insert number 20 now this time we will go to this

else statement this statement in else let's say we got this new node at address 300 so this

guy will return 300 for this node at 200 right child will be set as 300 and now this call

to insert can finish the return will be 200 okay at this stage what if a call is made to insert number

25 we are at root right now the node with address 200 25 is greater so we need to go and insert

in right subtree right subtree is not empty this time so once again for this call also we will come

to this else last else because 25 is greater than 20 now in the next call root is null so we will go to the first if

a node will be created let's say we got this node in heap at address 500 this particular call

insert with arguments 0 and 25 will return 500 and finish now for the node at 300 right child will be set as 500 so this

link will get built now this guy will return 300 the root for this subtree has not changed and this

first call to insert will also wrap up it will return 200 so we are looking good for all cases

this insert function will work for all cases we could write this insert function without using

recursion i encourage you to do so you will have to use some temporary pointer to node and loops
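
One possible answer to this exercise is sketched below. The lesson does not show a non-recursive version, so the name InsertIterative and the temporary pointer current are my own choices. It walks down with a loop until it finds an empty link and attaches the new node there.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

BstNode* GetNewNode(int data) {
    return new BstNode{data, nullptr, nullptr};
}

// Loop based insert; returns the address of the root.
BstNode* InsertIterative(BstNode* root, int data) {
    BstNode* newNode = GetNewNode(data);
    if (root == nullptr) {
        return newNode;                  // empty tree: new node is the root
    }
    BstNode* current = root;             // temporary pointer used to walk down
    while (true) {
        if (data <= current->data) {
            if (current->left == nullptr) {
                current->left = newNode; // found the empty spot on the left
                break;
            }
            current = current->left;
        } else {
            if (current->right == nullptr) {
                current->right = newNode; // found the empty spot on the right
                break;
            }
            current = current->right;
        }
    }
    return root;                         // root itself never changes here
}
```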

recursion is very intuitive here and recursion is intuitive in pretty much everything that we do

with trees so it's really important that we understand recursion really well okay i'll write one more

function now to search some data in bst in the main function here i have made some more calls to

insert now i want to write a function named search that should take as argument address of the root

node and the data to be searched and this function should return me true if data is there in the tree

false otherwise once again we will have couple of cases if the root is null then we can return

false if the data in root is equal to the data that we are looking for then we can return true

else we can have two cases either we need to go and search in the left subtree or we need to go in

the right subtree so once again i'm using recursion here i am making recursive call to search function

in these two cases if you have understood the previous recursion then this is very similar
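
The search function described above could be sketched like this. The name Search and the bool return type follow the narration. Insert is repeated only so that the block stands on its own.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

BstNode* Insert(BstNode* root, int data) {
    if (root == nullptr) {
        root = new BstNode{data, nullptr, nullptr};
    } else if (data <= root->data) {
        root->left = Insert(root->left, data);
    } else {
        root->right = Insert(root->right, data);
    }
    return root;
}

// Returns true if data is present in the tree rooted at root.
bool Search(BstNode* root, int data) {
    if (root == nullptr) {
        return false;                     // empty (sub)tree: nothing to find
    } else if (root->data == data) {
        return true;                      // found at the current node
    } else if (data <= root->data) {
        return Search(root->left, data);  // can only be in the left subtree
    } else {
        return Search(root->right, data); // can only be in the right subtree
    }
}
```

With the values 15, 10, 20, 25, 8 and 12 inserted (the exact values used on screen are an assumption), Search(root, 8) gives true and Search(root, 22) gives false.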

let's test this code now what i've done here is i've asked the user to enter a number to be searched

and then i'm making call to the search function and if this function is returning me true

i'm printing found else i'm printing not found let's run this code and see what happens i have

moved multiple insert statements in one line because i'm short of space here let's say we want

to search for number eight eight is found and now let's say we want to search for 22 22 is not found

so we are looking good i'll stop here now you can check the description of this video for

link to all the source code we will do a lot more with trees in coming lessons in our next lesson

we will go a little deeper and try to see how things move in various sections of applications

memory how things move in stack and heap sections of memory when we execute these functions it will

give you a lot of clarity this is it for this lesson thanks for watching in our previous lesson we wrote

some code for binary search tree we wrote functions to insert and search data in bst now in this lesson

we will go a little deeper and try to understand how things move in various sections of applications

memory when these functions get executed and this will give you a lot of clarity this will give you

some general insight into how memory is managed for execution of a program and how recursion

which is so frequently used in case of trees works the concepts that i'm going to talk about in this

lesson have been discussed earlier in some of our previous lessons but it will be good to go through

these concepts again when we are implementing trees so here is the code that we have written

we have this function get new node to create a new node in dynamic memory and then we have this

function insert to insert a new node in the tree and then we have this function to search some

data in the tree and finally this is the main function you can check the description of this

video for link to this source code now in main function here we have this pointer to bst node

named root to store the address of root node of my tree and i'm initially setting it as null

to create an empty tree and then i'm making some calls to insert function to insert some data in

the tree and finally i'm asking user to input a number and i'm making call to search function

to find this number in the tree if the search function is returning me true i'm printing found

else i'm printing not found let's see what will happen in memory when this program will execute

the memory that is allocated to a program or application for its execution in a typical

architecture can be divided into these four segments there is one segment called text segment

to store all the instructions in the program the instructions would be compiled instructions

in machine language there is another segment to store all the global variables a variable that is

declared outside all the functions is called global variable it is accessible to all the functions

the next segment stack is basically scratch space for function call execution

all the local variables the variables that are declared within functions live in stack and finally

the fourth section heap which we also call the free store is the dynamic memory that can grow

or shrink as per our need the size of all other segments is fixed the size of all other segments

is decided at compile time but heap can grow during runtime and we cannot control allocation or

deallocation of memory in any other segment during runtime but we can control allocation

and deallocation in heap we have discussed all of this in detail in our lesson on dynamic memory

allocation you can check the description for a link now what i'm going to do here is i'm going

to draw stack and heap sections as these two rectangular containers i'm kind of zooming into

these two sections now i'll show you how things will move in these two sections of applications

memory when this program will execute when this program will start execution first the main function

will be called now whenever a function is called some amount of memory from the stack is allocated

for its execution the allocated memory is called stack frame of the function call all the local

variables and the state of execution of the function call would be stored in the stack frame

of the function call in the main function we have this local variable root which is pointer to bst

node so i'm showing root here in this stack frame we will execute the instructions sequentially

in the first line in main function we have declared root and we are initializing it and setting it as

null null is only a macro for address 0 so here in this figure i'm setting address in root as 0

now in the next line we are making a call to insert function so what will happen is execution

of main will pause at this stage and a new stack frame will be allocated for execution of insert

main will wait for this insert above to finish and return once this insert call finishes main will

resume at line 2 we have these two local variables root and data in insert function in which we are

collecting the arguments now for this call to insert function we will go inside the first if

condition here because root is null at this line we will make call to get new node function

so once again execution of this insert call will pause and a new stack frame will be allocated

for execution of get new node function we have two local variables in get new node data in which

we are collecting the argument and this pointer to bst node named new node now in this function we are

using new operator to create a bst node in heap now let's say we got a new node at address 200

new operator will return us this address 200 so this address will be set here

in new node so we have this link here and now using this pointer new node we are setting value

in these three fields of node let's say the first field is to store data so we are setting value 15

here and let's say this second cell is to store address of left child this is being set as null

and the address of right child is also being set as null and now get new node will return

the address of new node and finish its execution whenever a function call finishes the stack frame

allocated to it is reclaimed call to insert function will resume at this line and the return

of get new node address 200 will be set in this root which is local variable for insert call

and now insert function this particular call to insert function will return the address of root

the address stored in this variable root which is 200 now and finish and now main will resume

at this line and root of main will be set as 200 the return of this insert call insert root 15

will be set here now in the execution of main control will go to the next line

and we have this call to insert function to insert number 10 once again execution of main will be

paused and a stack frame will be allocated for execution of insert now this time for insert call

root is not null so we will not go inside the first if we will access the data field of this node

at address 200 using this pointer named root in insert function and we will compare it with

this value 10 10 is lesser than 15 so we will go to this line and now we are making a recursive call

here recursion is a function calling itself and a function calling itself is not any different from

a function a calling another function b so what will happen here is that execution of this particular

insert call will be paused and a new stack frame will be allocated for execution of this another

insert call to which the arguments passed are address 0 in this local variable root left child

of node at address 200 is null so we are passing 0 in root and in data we are passing 10 now for

this particular insert call control will go inside first if and we will make a call to get new node

function at this line so execution of this insert will pause and we will go to get new node function

here we are creating a new node in heap let's say we got this new node at address 150 now get new

node will return 150 and finish execution of this call to insert will resume at this line return of

get new node will be set here and now this call to insert will return address 150 and finish insert

below will resume at this line and now in this insert call left child of this node at address 200

will be set as return of the previous insert call which is 150 so now these two nodes are linked
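
One way to watch these stack frames pile up and get reclaimed is to count how many Insert calls are active at the same moment. The two global counters below, activeInsertCalls and deepestInsertCalls, are hypothetical additions for illustration and are not part of the lesson's code.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

int activeInsertCalls = 0;   // Insert frames currently on the stack
int deepestInsertCalls = 0;  // largest number of frames seen at once

BstNode* Insert(BstNode* root, int data) {
    ++activeInsertCalls;                       // a frame was allocated for this call
    if (activeInsertCalls > deepestInsertCalls) {
        deepestInsertCalls = activeInsertCalls;
    }
    if (root == nullptr) {
        root = new BstNode{data, nullptr, nullptr};
    } else if (data <= root->data) {
        root->left = Insert(root->left, data);   // this call pauses until the inner one returns
    } else {
        root->right = Insert(root->right, data);
    }
    --activeInsertCalls;                       // the frame is reclaimed on return
    return root;
}
```

Inserting 15 into an empty tree needs one Insert frame. Inserting 10 after that peaks at two, one for the node at the root and one for its empty left subtree, and the active count is back to zero once control returns to main.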

and finally this insert call will finish control will return back to main at this line root will

be rewritten as 200 but earlier also it was 200 it's not changing next in the main function we

have a call to insert number 20 i'm not going to show the simulation for this one once again the

allocated memory in stack will grow and shrink and finally when the control will return back to

main function after this insert call is over we will have a node in heap with value 20 set as right

child of this node at 200 let's say we got this new node with value 20 at address 300 so as you can

see the address of right child in node at address 200 is set as 300 now next one is to insert number

25 this one is interesting let's see what will happen for this one main will be paused and we will

go to this call to insert in the root which is local to this call address passed is 200 and we

have passed number 25 in data now here 25 is greater than the value in this node at address 200 so

we will go inside this last else condition we need to insert in the right sub tree so another call

to insert will be made we will pass address 300 as root and data passed will be 25 only now for this

call root is 300 and once again the value in node at 300 which is 20 is lesser than 25 25 is greater

than 20 so once again we will come to this last else and make a recursive call to insert

in the right subtree the right subtree is empty this time so for this insert call at top the address

in root here will be 0 so for this call we will go to the first if and make a call to get new node

let's say get new node returns us node at address 100 i'm short of space so i'm not showing everything

in get new node's stack frame here we will return back to this insert call at top and now this root

is set as 100 address of the newly created node and now this call to insert will finish we will

come back to this insert below and this insert will resume at this line inside the last else

and the right child of node at address 300 will be set as 100 and now this insert will return back

address 300 whatever is set in its root and this insert below will resume at this line

inside the last else right child of node at address 200 will be set as 300 it was 300

previously also so even after overwriting it will not change and this insert will finish now finally

main will resume at this line root of main will be set as return of this insert call it will only

be overwritten with same value it's really important that this root in main and all the links in nodes

are properly updated quite often because of bugs in our code we lose some links or some unwanted

links are created now as you can see we are creating all the nodes in heap here heap gives us this

flexibility that we can decide the creation of node during runtime and we can control the lifetime

of anything in heap any memory claimed in heap has to be explicitly deallocated using free in

c or delete operator in c++ else the memory in heap remains allocated as long as the program is running

the memory in stack as you can see gets deallocated when function call finishes
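
The lesson's program never frees its nodes. As an illustration of explicit deallocation, a tree could be released with a function like the one below. The name DeleteTree and the returned count of freed nodes are my own additions. Both subtrees are released before the node itself, since a node's links are needed to reach its children.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Frees every node of the tree rooted at root and returns how many were freed.
int DeleteTree(BstNode* root) {
    if (root == nullptr) {
        return 0;                                // nothing to free
    }
    int freedLeft = DeleteTree(root->left);      // release the left subtree first
    int freedRight = DeleteTree(root->right);    // then the right subtree
    delete root;                                 // finally the node itself
    return freedLeft + freedRight + 1;
}
```

After such a call the caller's root pointer still holds the old address, so it would normally be set back to null.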

the rest of the function calls here in main function will execute in similar manner

i'll leave it for you to see and think about right now we have this tree in the heap logically

memory itself is a linear structure and this is how tree which is a non-linear structure

which is logically a non-linear structure will fit in it the way i'm showing the nodes

at random locations linked to each other in this heap i hope this explanation gave you some clarity

in coming lessons we will solve some problems on tree this is it for this lesson thanks for

watching in our previous lessons we wrote some basic code for binary search tree but to solidify

our concepts we need to write some more code so i've picked this simple problem for you

given a binary search tree we want to find minimum and maximum element in it

let's see how we can solve this problem i have drawn logical representation of a binary search tree

of integers here as we know in a binary search tree for all nodes value of nodes in left sub tree

is lesser and value of nodes in right sub tree is greater this is how we can define node for a

binary search tree in c/c++ we can have a structure with three fields one to store data

another to store address of left child and another to store address of right child as we had seen

earlier in bst implementation identity of the tree that we always keep with us that we pass to

functions is address of the root node so what i want to do here is i first want to write

a function named find min that should take address of the root node as argument and return me the

minimum element in the tree and just like find min we can write another function named find max

that can return us the maximum element in bst let's first see how we can find the minimum element

there are two possible approaches here we can write an iterative solution in which we can use

a simple loop to find the minimum element or we can use recursion let's first see the iterative

solution if we have a pointer to the root node and we want to find the minimum element in bst

then from root we need to go left as long as it's possible to go using the left links

because in a bst for all nodes nodes in left have lesser value and nodes in right have greater value

so we need to go left as long as it's possible we can start with a temporary pointer to root

node we can name this pointer temp or we can name this pointer current to say that we are currently

pointing to this node in my function here i have declared this pointer to bst node named current and

initially i'm setting the address of root in it and with this pointer we can go to the left child

with a statement like current equal current arrow left we first need to check if there is a left child

and then we need to move the pointer we can use a while loop like this if the left child of current

node is not null we can move this pointer current to the left child with this statement current equal

current arrow left here in this example currently we are pointing to this node with value 15

it has a left child so we can move to this node with value 10 once again this node too has a left

child so we can go left again now this node with value 8 does not have a left child so we cannot go

towards left any further we will come out of the while loop and at this point the node that we are

pointing to has minimum value so we can return the data in that node there is one case that we are

missing in this function if the tree is empty we can throw some error or we can return some

value indicative of empty tree if i know that the tree would have only positive values i can

return something like minus one so here in my function i have added this condition if root

is equal to null that is if the tree is empty print this error and return minus one one more thing

we do not need to use this extra pointer to bst node named current root here is a local variable

and we can use this root itself so we can write our code like this while left of root is not

equal to null we can go left with this statement root equal root arrow left and finally we can return

root arrow data which is only an alternate syntax for asterisk root dot data modifying

this local root is not going to modify my root in main function or whatever function i'm calling

this find min function from so this is our iterative solution to find minimum element in bst
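
Written out, the iterative version described above might look like this sketch. The name FindMin follows the narration. The wording of the error message and the use of std::cerr are assumptions, and returning -1 only makes sense if the tree is known to hold positive values.

```cpp
#include <iostream>

struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Returns the minimum value in the BST, or -1 if the tree is empty.
int FindMin(BstNode* root) {
    if (root == nullptr) {
        std::cerr << "Error: tree is empty\n";
        return -1;                    // sentinel for an empty tree
    }
    while (root->left != nullptr) {   // keep going left while a left child exists
        root = root->left;            // only the local copy of root is moved
    }
    return root->data;                // leftmost node holds the minimum
}
```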

the logic for finding maximum is similar the only difference will be that instead of going left

we will have to go right all the time i leave it for you to implement let's now see how we can

find minimum element using recursion if we want to reduce this problem in a recursive manner in a

self-similar manner then what we can say is if the left subtree is not empty then we can reduce the

problem to finding minimum in left subtree if left subtree is empty we already know the minimum

it is the data in root because nodes in right subtree are all greater here is the recursion that we can write root being

null is a corner case if root is null that is if the tree is empty we can throw error else if left

child of root is null we can return the data in root else if left child is not null or in other

words if the left subtree is not empty we can reduce the problem to searching minimum in the left

subtree so we are making this recursive call to find min passing it address of the left child

passing it address of the root of left subtree left child would be the root of left subtree

this second else if is our base condition to exit from recursion if you had understood the

recursion that we had written earlier to insert a node in bst then this recursion should not

be very difficult for you to understand so here is our recursive solution to find minimum in a

bst to find maximum element all we need to do is we need to go searching in right subtree
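
A sketch of the recursive version follows, together with a mirrored FindMax. The lesson leaves FindMax as an exercise, so that function is my own completion. The error message and the -1 sentinel for an empty tree are assumptions carried over from the iterative version.

```cpp
#include <iostream>

struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Recursive minimum: reduce the problem to the left subtree.
int FindMin(BstNode* root) {
    if (root == nullptr) {
        std::cerr << "Error: tree is empty\n";
        return -1;                      // corner case: empty tree
    } else if (root->left == nullptr) {
        return root->data;              // base condition: no left subtree
    }
    return FindMin(root->left);         // minimum lies in the left subtree
}

// Recursive maximum: same idea, going right instead of left.
int FindMax(BstNode* root) {
    if (root == nullptr) {
        std::cerr << "Error: tree is empty\n";
        return -1;
    } else if (root->right == nullptr) {
        return root->data;              // base condition: no right subtree
    }
    return FindMax(root->right);
}
```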

okay i'll stop here now in coming lessons we will solve some more interesting problems on bst

thanks for watching in this lesson we're going to write code to find height or what we can also call

maximum depth of a binary tree we have already discussed depth and height in our first introductory

lesson on trees but i'll do a quick recap here first of all i've drawn a binary tree here

i've not filled in any data in the nodes data can be anything binary tree as we know is a tree

in which each node can have at most two children so a node can have zero one or two children i'll

just number these nodes so i can refer to them i'll say this root node is number one and i'll go

level by level from left to right counting two three four and so on now height of a tree

is defined as number of edges in longest path from root to a leaf node

in this example tree four six seven eight and nine are leaf nodes a leaf node is a node with

zero children number of edges in longest path from root to a leaf node is three for both eight

and nine number of edges in path from root is three so height of the tree is three actually we can

define height of a node in the tree as number of edges in longest path from that node to a leaf

node so height of a tree basically is height of the root node in this example tree height of node

three is one height of node two is two and height of node one is three and because this is the root

node this is also the height of the tree height of a leaf node would be zero so if a tree has only

one node then the root node itself would be a leaf node and so height of the tree would be zero

so this is definition of height of a tree we often also talk about

depth and we often confuse between depth and height but these two are different properties

depth of a node is defined as number of edges in path from root to that node

basically depth is distance from root and height is distance from the farthest accessible leaf node

for node two in this example tree depth is one and height is two for node number nine which is

a leaf node depth is three and height is zero for root node depth is zero and height is three

height of a tree would be equal to maximum depth of any node in the tree so height

and max depth these two terms are used interchangeably okay let's now see how we can calculate height

or max depth of a binary tree i'm going to write a function named find height that will take

reference or address of the root node as argument and return me the height of the binary tree

now the logic to calculate height can be something like this for any node if we can somehow calculate

the height of its left subtree and also the height of its right subtree then the height of that node

would be greater of the heights of left and right subtrees plus one for the root node in this tree

height of the left subtree is two and height of the right subtree is one so height of the root

node would be greater of these two values plus one plus one for the edge connecting the root node

to the subtree so height of the root node which would also be the height of the tree is three here

in our code we can calculate height of left and right subtrees using recursion what i'll do here

in find height function is i'll first make a recursive call to find height of the left subtree

we can say to find height of left subtree or to find height of left child both will mean the same

i'm collecting the return of this recursive call in a variable named left height

and now i'll make another recursive call to calculate height of right subtree or right child

now height of the tree or height of whatever node for which we have made this function call

would be greater of these two values left height and right height plus one now there is only one

more thing missing in this recursion we need to write the base or exit condition we cannot go into

recursion infinitely what we can do is we can go on till we make a recursive call with root equal

null and if root is null that is if the tree or subtree is empty we can return something what

should we return here give this some thought if i have made a call to find height of let's say

this leaf node this node with number seven then for this guy both left and right children are null

in call for this node number seven we will make two recursive calls passing null in both the calls

so what should we return should we return zero if these two calls will return zero then height of

seven will be one because in the return statement here we're saying max of left and right height

plus one but as we had discussed earlier height of a leaf node should be zero so if we are returning

zero for root equal null it's not all right what we can do is we can return minus one

when we are returning minus one then this edge to null that does not exist but still was getting

counted will be balanced with this minus one i hope this is making sense and going by convention

also height of an empty tree is said to be minus one so this is pseudocode for my function to find

height of a binary tree some people define height as number of nodes in longest path from root to a

leaf node we are counting edges here and this is the right definition if you want to count

number of nodes then for a leaf node height would be one and for empty tree height would be zero

so all you need to do is return zero here and this is the code if you want to count

number of nodes but i think the right definition is number of edges so i'll return minus one here
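
Here is a short sketch of the find height function as described, counting edges. The struct layout and the `int` data type are assumptions, since the lesson leaves the node data unspecified. `std::max` plays the role of the max function mentioned in the lesson.

```cpp
#include <algorithm>

// Binary tree node (data type is an assumption; data can be anything).
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Height in edges: empty tree is -1, a single leaf is 0.
int FindHeight(const Node* root) {
    if (root == nullptr) {
        return -1;  // balances the non-existent edge to null
    }
    int leftHeight = FindHeight(root->left);
    int rightHeight = FindHeight(root->right);
    return std::max(leftHeight, rightHeight) + 1;
}
```

If you prefer the definition that counts nodes instead of edges, returning 0 in the null case is the only change needed.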

time complexity of this function is big o of n where n is number of nodes in the tree we will

make one recursive call corresponding to each node in the tree so we are kind of visiting each

node in the tree once and so running time will be proportional to number of nodes i'll skip detailed

analysis of running time in this lesson this is what my find height function will look like in

c or c plus plus max here is a function that will return greater of two values passed to it as

arguments so this is it for this lesson thanks for watching in this lesson we are going to talk

about binary tree traversal when we are working with trees we may often want to visit all the

nodes in the tree now tree is not a linear data structure like array or linked list

in a linear data structure there would be a logical start and a logical end so we can start

with a pointer at one of the ends and keep moving it towards the other end for a linear

data structure like linked list for each node or element we would have only one next element

but tree is not a linear data structure i have drawn a binary tree here data type is

character this time i'll fill in these characters in the nodes now for a tree at any time if

we are pointing to a particular node then we can have more than one possible direction we can have

more than one possible next node in this binary tree for example if we will start with a pointer

at root node then we have two possible directions from f we can either go left to d or we can go

right to j and of course if we will go in one direction then we will somehow have to come back

and go into the other direction later so tree traversal is not so straightforward

and what we are going to discuss in this lesson is algorithms for tree traversal tree traversal

can formally be defined as the process of visiting each node in the tree exactly once in some order

and by visiting a node we mean reading or processing data in the node for us in this lesson

visit will mean printing the data in the node based on the order in which nodes are visited

tree traversal algorithms can broadly be classified into two categories we can either go breadth first

or we can go depth first breadth first traversal and depth first traversal are general techniques

to traverse or search a graph graph is a data structure and we have not talked about graph

so far in this series we will discuss graph in later lessons for now just know that tree is only

a special kind of graph and in this lesson we are going to discuss breadth first and depth first

traversal in context of trees in a tree in breadth first approach we would visit all the nodes at

same depth or level before visiting the nodes at next level in this binary tree that i'm showing

here this node with value f which is the root node is at level 0 i'm writing l0 here for level 0

depth of a node is defined as number of edges in path from root to that node root node would have

depth 0 these two nodes d and j are at depth 1 so we can say that these nodes are at level 1

now these four nodes are at level 2 these three nodes are at level 3 and finally this node with

value h is at level 4 so what we can do in breadth first approach is that we can start at level 0

we would have only one node at level 0 the root node so we can visit the root node i'll write the

value in the node as i'm visiting it now level 0 is done now i can go to level 1 and visit the

nodes from left to right so after f we would visit d and then we would visit j and now we are done

with level 1 so we can go to level 2 now we will go like b then e then g and then k and now we can

go to level 3 a c and i and finally i can go to level 4 this kind of breadth first traversal in case

of trees is called level order traversal and we will discuss how we can do this programmatically

in some time but this is the order in which we would visit the nodes we would go level by level

from left to right in breadth first approach for any node we visit all its children before visiting

any of its grandchildren in this tree first we are visiting f and then we are visiting d

and then we are not going to any child of d like b or e along the depth next we are going to j

but in depth first approach if we would go to a child we would complete the whole subtree of the

child before going to the next child in this example tree here from f the root node if we are going

left to d then we should visit all the nodes in this left subtree that is we should finish this

left subtree in its complete depth or in other words we should finish all the grandchildren of f

along this path before going to right child of f j and once again when we will go to j we will

visit all the grandchildren along this path so basically we will visit the complete right subtree

in depth first approach the relative order of visiting the left subtree the right subtree

and the root node can be different for example we can first visit the right subtree and then the

root and then the left subtree or we can do something like we can first visit the root and then the

left subtree and then the right subtree so the relative order can be different but the core idea

in depth first strategy is that visiting a child is visiting the complete subtree in that path

and remember visiting a node is reading processing or printing the data in that node

based on the relative order of left subtree right subtree and the root there are three popular

depth first strategies one way is that we can first visit the root node then the left subtree

and then the right subtree left and right subtrees will be visited recursively in

same manner such a traversal is called pre-order traversal another way is that we can first

visit the left subtree then the root and then the right subtree such a traversal is called

in order traversal and if root is visited after left and right subtrees then such a traversal

is called post order traversal in total there are six possible permutations for left right and root

but conventionally a left subtree is always visited before the right subtree so these are

the three strategies that we use only the position of root is changing here if it's before left and

right then it's pre-order if it's in between it's in order and if it's after left and right subtrees

then it's post order there is an easy way to remember these three depth first algorithms

if we can denote visiting a node or reading the data in that node with letter d going to the left

subtree as l and going to the right subtree as r so if we can say d for data l for left and r for

right then in pre-order for each node we will go d l r first we will read the data in that node

then we will go left and once the left subtree is done we will go right in in order traversal

first we will finish the left subtree then we will read the data in current node

and then we will go right in post order for each node first we will go left once left subtree is

done we will go right and then we will read the data in current node so pre-order is data left

right in order is left data right and post-order is left right and then data
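
As a rough sketch, the D L R, L D R and L R D orders map directly onto the order of three statements in a recursive function. Two things here are assumptions made so the result can be checked: visiting a node appends its character to a string instead of printing it, and the helpers `MakeNode` and `BuildExampleTree` are hypothetical names that rebuild the example tree from this lesson.

```cpp
#include <string>

struct Node {
    char data;
    Node* left;
    Node* right;
};

// Pre-order: data, left, right.
void Preorder(const Node* root, std::string& out) {
    if (root == nullptr) return;
    out += root->data;            // D
    Preorder(root->left, out);    // L
    Preorder(root->right, out);   // R
}

// In-order: left, data, right.
void Inorder(const Node* root, std::string& out) {
    if (root == nullptr) return;
    Inorder(root->left, out);     // L
    out += root->data;            // D
    Inorder(root->right, out);    // R
}

// Post-order: left, right, data.
void Postorder(const Node* root, std::string& out) {
    if (root == nullptr) return;
    Postorder(root->left, out);   // L
    Postorder(root->right, out);  // R
    out += root->data;            // D
}

// Hypothetical helper; nodes are never freed in this short sketch.
Node* MakeNode(char data, Node* left = nullptr, Node* right = nullptr) {
    return new Node{data, left, right};
}

// Hypothetical helper: the example tree with root F used in this lesson.
Node* BuildExampleTree() {
    Node* b = MakeNode('B', MakeNode('A'), MakeNode('C'));
    Node* d = MakeNode('D', b, MakeNode('E'));
    Node* i = MakeNode('I', MakeNode('H'));
    Node* g = MakeNode('G', nullptr, i);
    Node* j = MakeNode('J', g, MakeNode('K'));
    return MakeNode('F', d, j);
}
```

Only the position of the `out += root->data` line changes between the three functions, which is the same observation as "only the position of root is changing".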

pre-order in order and post-order are really easy and intuitive to implement

using recursion but we will discuss implementation later let's now see what will be the pre-order

in order and post-order traversal for this tree that i've drawn here let's first see what will be

the pre-order traversal for this binary tree we need to start at root node and for each node

we first need to read the data or in other words visit that node in fact instead of d l r we could

have said v l r here v for visit we can use any of these notations v for visit or d for data

i will go with d l r here so let's start at the root for each node we first need to read the data

i'm writing f here the data that i just read and now i need to go left and finish the complete

left subtree and once all the nodes in the left subtree are visited then only i can go to the right

subtree the problem here is actually getting reduced in a self-similar or recursive manner

now we need to focus on this left subtree now we are at d root of this left subtree of f

once again for this node we will first read the data and now we can go left we will go towards e

only when these three nodes a b and c will be done now we are focusing on this subtree

comprising of these three nodes now we are at b we can read the data and now we can go left

to a there is nothing in left of a so we can say that for a left subtree is done

and there is nothing in right as well so we can say right is also done now for b left subtree is done

so we can go right to c and left and right of c are null and now for d left subtree is done so we

can go right once again for e left and right are null and now at this stage for f complete left

subtree is visited so we can go right now we need to go left of j and there is nothing in left of

g so we can go right and now we can go left of i for h there is nothing in left and right now

at this stage left subtree of i is done and right subtree is null and now we can go back to j

the left subtree for j is done so we can go to its right subtree finally we have k here and we are

done with all the nodes this is how we can perform a preorder traversal manually actual implementation

would be a simple recursion and we will discuss it later let's now see what will be the in order

traversal for this binary tree in in order traversal we will first finish visiting the left subtree

then visit the current node and then go right once again we will start at the root and we will

first go left now we will first finish this subtree once again for d we will first go left to b

and from b we will go to a now for a there is nothing in left so we can say that for this guy

left subtree is done so we can read the data and now we can go to its right but there is nothing

in right as well so this guy is done now for b left subtree is done so we can read the data

and now for b we can go right for c once again there is nothing in left so we can read the data

and there is nothing in right as well now left of d is completely done so we can visit it read

the data here now we can go to its right to e for e once again left and right are null at this

stage left subtree of f is done so we can read the data and now we can go to right of f if we

will go on like this this finally will be my in order traversal this tree that i'm showing here

is actually a binary search tree for each node the value of nodes in left is lesser

and the value of nodes in right is greater so if we are printing in this order left subtree

and then the current node and then the right subtree then we would get a sorted list in order

traversal of a binary search tree would give you a sorted list okay now you should be able to figure

out the post order traversal this is what we will get for post order traversal i leave it for you

to see whether this is correct or not i'll stop here now in next lesson we will discuss

implementation of these tree traversal algorithms thanks for watching in this lesson we are going to

write code for level order traversal of a binary tree as we have discussed in our previous lesson

in level order traversal we visit all nodes at a particular depth or level in the tree before

visiting the nodes at next deeper level for this binary tree that i'm showing here if i have to

traverse the tree and print the data in nodes in level order then this is how we will go

we will start at level 0 and print f and now we are done with level 0 so we can go to level 1

and we can visit the nodes at level 1 from left to right from f we will go to d and from d we

will go to j now level 1 is done so we can go to level 2 so we will go like b e g and then k

and now we can go to next level a c i and finally we will be done at h this is the order in which

we should visit the nodes but the question is how can we visit the nodes in this order in a program

unlike in a linked list we can't just have one pointer and keep moving it if i'll start with a pointer

at root let's say i have a pointer named current to point to the current node that i'm visiting

then it's possible for me to go from f to d using this pointer because there is a link so i can go

left to d but from d i cannot go to j because there is no link from d to j the only way we can

go to j is from f and once we have moved the pointer to d we can't even go back to f because

there is no backward link from d to f so what can we do to traverse the nodes in level order clearly

we can't go with just one pointer what we can do is as we visit a node we can keep reference or

address of all its children in a queue so we can visit them later a node in the queue can be called

discovered node whose address is known to us but we have not visited it yet initially we can start

with the address of root node in the queue to mean that initially this is the only discovered node

let's say for this example tree address of the root node is 400 and i'll assume some random

addresses for other nodes as well i will mark a discovered node in yellow okay now initially i'm

enqueuing the root node and by storing a node in the queue i'll mean storing the address of the

node in the queue so initially we are starting with one discovered node now as long as the queue

has some discovered node at least one discovered node that is as long as the queue is not empty

we can take out a node from the front visit it and then enqueue its children visiting a node for

us is printing the value in that node so i'll write f here and now i'll enqueue the children

of this root node first i'll enqueue the left child and then the right child i'll mark visited

node in another color okay now we have one visited node and two discovered nodes

and now once again we can take out the node at front of the queue visit it and enqueue its children

by using a queue we are doing two things here first of all as we are moving from a node

we are not losing reference to its children because we are storing the references and then

because queue is a first in first out structure so a node that is discovered first that is inserted

first will be visited first so we will get this order that we are desiring give this some thought

and it's not very difficult to see okay so now we can dequeue and visit this node at address 200

and once again before i move on from this node i need to enqueue its children

so now at this stage we have two visited nodes three discovered nodes

and six undiscovered nodes and now we can take out the next node from front of queue

we'll visit it and enqueue its children if we will go on like this all the nodes will be visited

in the order that we are desiring at this stage we can dequeue node at 120 visit it

and enqueue its children so we will go on like this until all the nodes are visited and the queue

is empty after b we will have e here nothing will go into the queue this time next we will have g here

and the address of i will go into the queue now k will be visited now at this stage we have

reference to three nodes in the queue now we will visit this node at 320 with value a

then we have c and now we will print i and the node with value h the node with data h will go into

the queue finally we will visit this node and now we are done with all the nodes the queue is empty

once the queue is empty we are done with our traversal so this is the algorithm for level

order traversal of a binary tree as you saw in this approach at any time we are keeping a

bunch of addresses in the memory in the queue instead of using just one pointer to move around

so of course we are using a lot of extra memory and i'll talk about the space complexity of

this algorithm in some time i hope you got the core logic right let's now write code for this

algorithm i'm going to write c plus plus here this is how i'm defining node for my binary tree

i have a structure here with three fields one to store data and the data type is character this

time because in the example tree that we were showing earlier data type was character and we

have two more fields that are pointers to node one to store the address of left child and another

to store the address of right child now what i want to do here is i want to write a function

named level order that should take address of the root node as argument and print the data in the

nodes in level order now to test this function i'll have to write a lot of code to create and insert

nodes in a binary tree i'll have to write some more functions i'll skip writing all that code

you can pick the code for creation and insertion from previous lessons all i'll write is this function

level order now in this function here i'll first take care of one corner case if the tree is empty

that is if root is null we can simply return else if the tree is not empty we need to create

a queue i'm not going to write my own implementation of queue here in c plus plus we can use the queue

in standard template library and to use it first we'll have to write a statement like hash include

queue here and now i can create a queue of any type in this function i'll create a queue of pointer

to node with a statement like this now as we had discussed earlier initially we start with one

discovered node in the queue the only node known to us initially is the root node with this statement

queue dot push root i have inserted the address of root node in the queue and now i'll run this

while loop for which the condition is that the queue should not be empty and what i really mean

here is that while there is at least one discovered node we should go inside the loop and inside the

loop we should take out a node from the front this function front returns the element at front of the

queue and because the data type is pointer to node i'm collecting the return of this function in

this pointer to node named current now i can visit this node being pointed by current and by

visiting if we mean reading the data in that node i'll simply print the data and now we want to push

the addresses of children of this node into the queue so i'll say that if the left child is not null

insert it into the queue and similarly if right child is not null push it into the queue or

rather push its address into the queue and i'll write one more statement to remove the element

from front of the queue with call to front the element is not removed from the queue with this

call pop we are removing the element okay so this is implementation of level order traversal in

c++ you can check the description of this video for a link to source code and there you can also

find all the extra code to test this function let's now talk about time and space complexity

of level order traversal if there are n nodes in the tree and in this algorithm visit to a node

is reading the data in that node and inserting its children in the queue then a visit to a node

will take constant time and each node will be visited exactly once so time taken will be

proportional to the number of nodes or in other words we can say that the time complexity is big

o of n for all cases irrespective of the shape of the tree time complexity of level order traversal

will be big o of n now let's talk about space complexity space complexity as we know is the

measure of rate of growth of extra memory used with input size we are not using constant amount

of extra memory in this algorithm we have this queue that will grow and shrink while executing

this algorithm assuming that the queue is dynamic maximum amount of extra memory used will depend

upon maximum number of elements in the queue at any time we can have couple of cases in some cases

extra memory used will be lesser and in some cases extra memory used will be greater

for a tree like this where each node has only one child we will have maximum one element in

the queue at any time during each visit one node will be taken out from the queue and one node

will be inserted so the amount of extra memory taken will not depend upon the number of nodes

space complexity will be big o of one but for a tree like this the amount of extra memory

used will depend upon the number of nodes in the tree this is a perfect binary tree all the levels

are full if you can see as the algorithm will execute at some point for each level all the

nodes in that level will be in the queue in a perfect binary tree we will have about n by 2 nodes at

the deepest level so maximum number of nodes in the queue is going to be at least n by 2

so basically extra memory used is proportional to n the number of nodes so space complexity

will be big o of n for this case i'm not going to prove it but for average case space complexity

will be big o of n so for both worst and average cases we will be big o of n in terms of space

complexity and when we are saying best average and worst cases here it's only going by space

complexity time complexity will be big o of n for all cases so this is time and space complexity

analysis of level order traversal i'll stop here now in next lesson we will discuss depth first

traversal algorithms pre-order in order and post-order this is it for this lesson thanks for watching

in our previous lesson we talked about level order traversal of binary tree which is basically

breadth first traversal now in this lesson we are going to discuss these three depth first

algorithms pre-order in order and post-order i have drawn a binary tree here data type filled

in the nodes is character now as we had discussed in earlier lessons in depth first traversal of

binary tree if we go in one direction then we visit all the nodes in that direction or in other

words we visit the complete subtree in that direction and then only we go in other direction

in this example tree that i've drawn here if i'm at root and i'm going left then i'll visit

all the nodes in this left subtree and then only i can go right and once again when i'll go right

i'll visit all the nodes in this right subtree if you can see in this approach we are reducing

the problem in a self-similar or recursive manner we can say that in total visiting all

the nodes in the tree is visiting the root node visiting the left subtree and visiting the right

subtree remember by visiting a node we mean reading or processing the data in that node

and by visiting a subtree we mean visiting all the nodes in the subtree in depth first strategy

relative order of visiting the left subtree right subtree and the root can be different

for example we can first visit the right subtree then the root and then the left subtree

or we can first visit the root and then the left subtree and then the right subtree

conventionally left subtree is always visited before right subtree with this constraint we will

have three permutations we can first visit the root and then the left subtree and then the

right subtree and such a traversal will be called pre-order traversal or we can first visit the

left subtree then the root and then the right subtree and such a traversal will be called

in order traversal and we can also go left right and then root and such a traversal will be called

post-order traversal left and right subtrees will be visited recursively in same manner as the

original tree so in pre-order once again for the subtrees we will go root left and then right

in in order we'll keep going left root and then right the actual implementation of these

algorithms is really easy and intuitive let's first see code for pre-order traversal

I've first written the algorithm in words here in pre-order traversal we first need to visit

the root then the left subtree and then the right subtree now I want to write a function that should

take pointer or reference to root node as argument and print data in all the nodes in pre-order

let's say visiting a node for us is printing the data in that node in c or c++ my method signature

will look something like this this function will take address of the root node as argument

argument type is pointer to node I'll define node as a structure with three fields like this

data type in this definition is character and there are two fields to store the addresses of

left and right children now in pre-order function I'll first visit or print the data in root node

and now I'll make a recursive call to visit the left subtree I have made a recursive call here

and to this call I'm passing address of the left child of my current root because left child will be

the root of left subtree and I'll have another call like this to visit the right subtree there is one

more thing that we need to add in this function and we will be done we cannot go into recursion

infinitely we need to have a base condition where we should exit if a tree or subtree is empty or

in other words for any call if root is null we can return or exit now with this much of code I'm

done with my pre-order function this will work fine in c or c++ actually in c make sure you write

struct space node instead of writing just node rest of the things are fine it will be good to

visualize this recursion so let's now quickly see how this pre-order function will work if this

example tree that I'm showing on the right here is passed to it I'll redraw this tree and show it

like this here I'm depicting node as a structure with three fields let's say the leftmost cell here

is to store the address of left child the cell in middle is to store the data and the rightmost

cell is to store the address of right child now let's assume some addresses for these nodes

let's say the root node is at address 200 and I'll assume some random addresses for other nodes as

well and now I can fill in left and right fields for each node and as we know the identity of tree

that we always keep with us is reference or address of the root node this is what we pass to all the

functions in our implementation we often use a variable of type pointer to node named root to

store the address of root node we can name this variable anything we can name this variable root

or we can name this variable root ptr but this is just a pointer this particular block that I'm

showing here is for pointer to node and all these rectangles with three cells are nodes

this is how things are organized in memory now for this tree let's say we are making a call

to this pre-order function I'll make a call to pre-order passing it address 200 for this call

root is not null so we will not return at first line in this function we will go ahead and print

the data in this node at address 200 I'll write output for all print statements here and now this

function will make a recursive call execution of this particular function call will pause it will

resume only after this recursive call pre-order 150 finishes this second call is to visit this

left subtree this call pre-order 150 is to visit this left subtree address of the left child of

node at 200 is 150 once again for this call root is not null so we will go ahead and print the data

in node at 150 which is d and now once again there will be a recursive call with this call pre-order 400

we are saying that we're going to visit this subtree once again we will print the data and

make another recursive call now we have made a call to visit this particular subtree with just

one node for this call we will print the data and now for node at 250 address of left child is zero

or null we will make a call pre-order zero but for this call we will simply return

because the address in this variable root will be null we have hit the base condition for our

recursion call to pre-order zero we'll finish and pre-order 250 will resume now in this particular

function call we'll make another call for right subtree for node at 250 even the right child is

null we will have another recursive call passing address zero but this once again will simply return

and now call to pre-order 250 will finish and call to pre-order 400 will resume

now in call to pre-order 400 we will make another recursive call to pre-order 180

with this call pre-order 180 we are visiting this particular subtree with just one node

for this call first we will print the data and then we will make a recursive call to pre-order zero

now pre-order zero will simply return and then we will have another call to pre-order zero

for right child of 180 the recursion will go on like this there is one thing that i want to talk

about here that's happening in this whole process even though we are not using any extra memory

explicitly in our function because of the recursion we are growing the function call stack

we have discussed memory management a number of times in our earlier lessons

you can check description of this video for a link to one of those lessons as we know for

each function call we allocate some amount of memory in what we call stack section of applications

memory and this allocated memory is reclaimed when the function call finishes at this stage

of execution of my recursion for this example my call stack will look something like this i'm writing

p as shortcut for pre-order because i'm short of space here let's say we made a call to pre-order

passing it address 200 from main function main function will be at bottom of stack at any time

only the call at top of stack will be executing and all other calls will be paused call stack keeps

growing and shrinking during execution of a program because memory is allocated for a new

function call and it's reclaimed when a function call finishes so even though we are not using any

extra memory explicitly here we are using memory implicitly in the call stack so space complexity

which is measure of rate of growth of extra memory used with input will depend upon the maximum

amount of extra memory used in the call stack i'll talk about space complexity once more later

for now let's come back to this recursion that i was executing call to this pre-order 0 will

finish and pre-order 180 will resume memory allocated for execution of pre-order 0 will be reclaimed

now for pre-order 180 both recursive calls have finished so this guy will also finish
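The growing and shrinking of the call stack described above can be made visible with a small sketch that records the longest chain of calls on non-empty nodes that are active at the same time. The names TrackDepth and DeepestCallChain and the two tiny trees are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>

struct Node {
    char data;
    Node* left;
    Node* right;
};

// Same recursion shape as pre-order, but instead of printing it records
// how many calls on non-null nodes are active at the deepest point.
void TrackDepth(const Node* root, int depth, int& deepest) {
    if (root == nullptr) return;             // this call returns immediately
    deepest = std::max(deepest, depth);      // depth = calls currently paused or running
    TrackDepth(root->left, depth + 1, deepest);
    TrackDepth(root->right, depth + 1, deepest);
}

int DeepestCallChain(const Node* root) {
    int deepest = 0;
    TrackDepth(root, 1, deepest);
    return deepest;
}

void CallChainExample() {
    // Balanced tree: B with children A and C.
    Node a{'A', nullptr, nullptr};
    Node c{'C', nullptr, nullptr};
    Node b{'B', &a, &c};
    assert(DeepestCallChain(&b) == 2);

    // Skewed tree with the same number of nodes: X -> Y -> Z, all right children.
    Node z{'Z', nullptr, nullptr};
    Node y{'Y', nullptr, &z};
    Node x{'X', nullptr, &y};
    assert(DeepestCallChain(&x) == 3);
}
```

Both trees have three nodes, yet the skewed one keeps more calls alive at once, because the chain of paused calls follows the longest path from the root down to a leaf.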

even for pre-order 400 both calls have finished so pre-order 150 will resume now this guy will

make a recursive call to pre-order function passing it address 450 address of its right

child memory in the stack will be allocated for execution of pre-order 450 now in this

call we will first print the data and then we will make two recursive calls to pre-order passing

address 0 each time because for this node at 450 both children are null both calls will simply

return and then pre-order 450 will finish and now pre-order 150 will also be done if you can see

the call stack will grow only till we reach a leaf node a node with no children and then it will

start shrinking again maximum growth of call stack due to this recursion will depend upon

maximum depth or height of the tree we can say that extra space used will be proportional to

height of the tree or in other words space complexity of this algorithm is big o of h

where h is height of the tree okay coming back to the recursion we are done with pre-order 150

so pre-order 200 will resume and now we will make a call to visit this particular sub tree

in this call we will print j and then we will make a call passing address 60 so now we are

visiting this particular sub tree here we will first print g and then this guy will make a call

to pre-order 0 which will simply return and then there will be another call to pre-order 500

here we will print i and then we will make two recursive calls passing address 0 every time because

node at 500 is a leaf node with no children after this guy finishes pre-order 60 will resume

now this guy will also finish and pre-order 350 will resume and now we will have a call to

pre-order 700 which once again is a leaf node so k which is data in this node will be printed

and then we will make two calls passing address 0 which will simply return now at this stage

all these calls can finish we are done visiting all the nodes finally we will return back to the

caller of pre-order 200 which probably would be the main function so this is pre-order traversal

for you i hope you got how this recursion works code for in-order and post-order will be very

similar in in-order traversal my base case will be the same so i'll say if root is null then return

or exit if root is not null i first need to visit the left sub tree i am visiting the left

sub tree with this recursive call then i need to visit the root so now i'm writing the printf

statement to print the data and now i can visit the right sub tree so this second recursive call

and this is my in-order function in-order traversal of this example tree that i have drawn here

will be this this particular binary tree is actually also a binary search tree and in-order

traversal of a binary search tree would give us elements in the tree in sorted order
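A minimal sketch of the in-order function described above, assuming integer data and collecting the visited values into a vector instead of printing them. The five-node binary search tree is a made-up example, not the lesson's figure.

```cpp
#include <cassert>
#include <vector>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// In-order: left subtree first, then the root, then the right subtree.
void Inorder(const Node* root, std::vector<int>& out) {
    if (root == nullptr) return;   // same base case as pre-order
    Inorder(root->left, out);      // visit the left subtree
    out.push_back(root->data);     // visit the root
    Inorder(root->right, out);     // visit the right subtree
}

// Hypothetical binary search tree: 7 at the root, 4 and 9 as its children,
// 1 and 6 as the children of 4.
std::vector<int> InorderExample() {
    Node n1{1, nullptr, nullptr};
    Node n6{6, nullptr, nullptr};
    Node n4{4, &n1, &n6};
    Node n9{9, nullptr, nullptr};
    Node n7{7, &n4, &n9};

    std::vector<int> out;
    Inorder(&n7, out);
    assert((out == std::vector<int>{1, 4, 6, 7, 9}));  // sorted, since the tree is a BST
    return out;
}
```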

okay let's now write code for post-order for this function once again the base case will

be the same so i'll say if root is null return or exit if root is not null i first need to visit

the left sub tree so i have made this recursive call then the right sub tree so i'll have this

another recursive call and now i can visit the root node post-order traversal for this example tree

will be this so this is pre-order in-order and post-order for you you can check the description

of this video for a link to all the source code let's now quickly talk about time and space

complexity of these algorithms time complexity of all these three algorithms is big o of n

if you could see then there was one function call corresponding to each node where we were

actually visiting that node where we were actually printing the data in that node so running time

should actually be proportional to number of nodes there is a better formal and mathematical

way of proving that time complexity of these algorithms is big o of n you can check the description

of this video for link to that space complexity as we had discussed earlier will be big o of h

where h is height of the tree height of a tree in worst case will be n minus 1 so in worst case

space complexity of these algorithms can be big o of n in best or average case height of a tree

will be big o of log n to the base two so we can say that in best or average case

space complexity will be big o of log n i'll stop here now in coming lessons we will solve

some problems on binary tree thanks for watching

in this lesson we are going to solve a simple

problem on binary tree which is also a famous programming interview question and the problem is

given a binary tree we need to check if the binary tree is a binary search tree or not

as we know a binary tree is a tree in which each node can have at most two children

all these trees that i have drawn here are binary trees but not all of them are binary search trees

binary search tree as we know is a binary tree in which for each node value of all the nodes in

left subtree is lesser and if we want to allow duplicates we can say lesser or equal and value

of all the nodes in right subtree is greater we can define binary search tree as a recursive

structure like this elements in left subtree must be lesser or equal and elements in right

subtree must be greater and this should be true for all nodes and not just the root node

so left and right subtrees should themselves also be binary search trees

of these binary trees that i'm showing here a and c are binary search trees but b and d

are not in b for the root node with value 10 we have 11 in its left subtree which is

greater than 10 and in a binary search tree for any node all values in its left subtree must be lesser

in d we are good for the root node the value in root node is 5 and we have 1 in left subtree

which is lesser and we have 8 9 and 12 in right subtree which are greater so we are good for the

root node but for this node with value 8 we have 9 in its left so this tree is not a binary search

tree so how should we go about solving this problem basically i want to write a function

that should take pointer or reference to root node of a binary tree as argument and the function

should return true if the binary tree is bst false otherwise this is how my method signature will

look like in c plus plus in c before c99 we do not have a boolean type so return type here can be int

we can return 1 for true and 0 for false i'll also write the definition of node here

for a binary tree node would be a structure with 3 fields one to store data and two to store

addresses of left and right children in my definition of node here data type is integer

and we have two pointers to node to store addresses of left and right children okay coming back to

the problem there are multiple approaches and we are going to talk about all of them the first

approach that i'm going to talk about is easy to think of but it's not so efficient

but let's discuss it anyway we are saying that for a binary tree to be called binary search tree

it should have a recursive structure like this for the root node all the elements in left

sub tree must be lesser or equal and all the elements in right sub tree must be greater

and left and right sub trees should themselves also be binary search trees

so let's just check for all of this i'm going to write a function named is sub tree lesser

that will take address of root node of a binary tree or sub tree and an integer value as argument

and this function will return true if all the elements in the sub tree are lesser than this value

and similarly i'll write another function named is sub tree greater that will return true if all

the elements in a sub tree are greater than a given value i have just declared these functions

i'll write body of these functions later let's come back to this function is binary search tree

in this function i'm going to say that if all elements in left sub tree are lesser

and i'll verify this by making a call to is sub tree lesser function passing it address of

left child of my current root left child would be the root of left sub tree and the data in root

this function call will return true if all the elements in left sub tree would be lesser than

the data in root now the next thing that i want to check for is if elements in right sub tree

are greater than the data in root or not these two conditions are not sufficient

we also need to check if left and right sub trees are binary search trees or not

so i'll add two more conditions here i have made a recursive call to is binary search tree

function passing it address of left child and i have made another call passing address of right

child and if all these four function calls is sub tree lesser is sub tree greater and is binary

search tree for left and right sub trees return true if all these four checks pass then our tree is

a binary search tree we can return true else we need to return false there is only one thing

that we are missing in this function now we are missing the base case if root is null that is

if the tree or sub tree is empty we can return true this is the base case for our recursion

where we should stop with this much of code is binary search tree function is complete

but let's also write is sub tree lesser and is sub tree greater functions because

they're also part of our logic this function has to be a generic function that should check

if all the elements in a given tree are lesser than a given value or not we will have to traverse

the complete tree or sub tree and see value in all the nodes and compare these values against

this given integer i'll first handle the base case in this function if the tree is empty we can

return true else we need to check if the data in root is less than or equal to the given value

and we also need to recursively check if left and right sub trees of the current root have lesser

value or not so i'm adding two more conditions here i'm making two recursive calls one for the

left sub tree and another for the right sub tree if all these three conditions are true

then we are good else we can return false is sub tree greater function will be very similar

instead of writing these two functions is sub tree lesser and is sub tree

greater we could also do something like this we could find the maximum in left sub tree and compare

it with the data in root if maximum of a sub tree is lesser then all the elements are lesser and

similarly if the minimum of a sub tree is greater all the elements are greater for the right sub tree

we could find the minimum so instead of writing these two functions is sub tree lesser and is

sub tree greater we could write something like find max and find min and this would also fit
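Put together, the first approach described above might look like the sketch below. The function names follow the spoken names, and the two test trees in FirstApproachExample are assumptions: one made-up valid tree, and one shaped like tree d as the lesson describes it (the placement of 12 is a guess).

```cpp
#include <cassert>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// True if every value in the subtree is lesser than or equal to value.
bool IsSubtreeLesser(const Node* root, int value) {
    if (root == nullptr) return true;            // empty subtree: nothing violates
    return root->data <= value
        && IsSubtreeLesser(root->left, value)
        && IsSubtreeLesser(root->right, value);
}

// True if every value in the subtree is strictly greater than value.
bool IsSubtreeGreater(const Node* root, int value) {
    if (root == nullptr) return true;
    return root->data > value
        && IsSubtreeGreater(root->left, value)
        && IsSubtreeGreater(root->right, value);
}

// Four checks per node: left is lesser, right is greater,
// and both subtrees are themselves binary search trees.
bool IsBinarySearchTree(const Node* root) {
    if (root == nullptr) return true;            // base case: empty tree
    return IsSubtreeLesser(root->left, root->data)
        && IsSubtreeGreater(root->right, root->data)
        && IsBinarySearchTree(root->left)
        && IsBinarySearchTree(root->right);
}

void FirstApproachExample() {
    // Hypothetical valid tree: 7 at root, children 4 and 9, 4 has children 1 and 6.
    Node a1{1, nullptr, nullptr};
    Node a6{6, nullptr, nullptr};
    Node a4{4, &a1, &a6};
    Node a9{9, nullptr, nullptr};
    Node a7{7, &a4, &a9};
    assert(IsBinarySearchTree(&a7));

    // Like tree d: fine at root 5, but node 8 has 9 in its left.
    Node d9{9, nullptr, nullptr};
    Node d12{12, nullptr, nullptr};
    Node d8{8, &d9, &d12};
    Node d1{1, nullptr, nullptr};
    Node d5{5, &d1, &d8};
    assert(!IsBinarySearchTree(&d5));
}
```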

so this is our solution using one of the approaches let's quickly run this code on an

example binary tree and see how it will execute i have drawn a very simple binary tree here

which actually is a binary search tree let's assume some addresses for these nodes in the tree

let's say the root node is at address 200 and i'll assume some random addresses for other nodes as well

to check if this binary tree is a binary search tree or not we will make a call to

is binary search tree function i'm writing ibst here as shortcut for is binary search tree

because i'm short of space here so i'll make a call to this function maybe from the main function

passing address 200 address of the root node for this function call address in this local variable

address collected in this local variable root will be 200 root is not null null is only a macro for

address 0 for this call root is not null so we will not return true at this line we will go to the

next if now here we will make a call to is subtree lesser function arguments passed will be address

of left child which is 150 and seven the data in node at 200 execution of the calling function

will pause and will resume only after the called function returns now in this call to is subtree

lesser root is not null so we will not return true at first line we will go to the next if

now here the first condition is if data in root and the root this time is 150

because this call is for this left subtree and for this left subtree address of root is 150

data in root is 4 which is lesser than 7 so the first condition is true and we can go to the

second condition which is a recursive call this call will pause and we will go to the next call

here once again the data in node at 180 which is 1 is lesser than 7 so first condition is true and we will

make a recursive call left subtree for node at 180 is null there is no left child so we will return

at first line root is null this time this particular call will simply return true now in this previous

call when root is 180 second condition for if is also true so we will make another call for right

subtree once again address passed will be 0 and we will simply return true and now for this call

is subtree lesser with 180 and 7 all three conditions are true so this guy can also return true and now this

call ISL with 150 and 7 will resume now this guy will make a recursive call for the right subtree and this guy

after everything will also return true now for this call because all three conditions in the if

statement are true this guy will also return true and now is binary search tree function will resume

for this call we have evaluated the first condition we have got true now this guy will make another

call to is subtree greater passing address of right child and value 7 this guy after everything

will return true and now we will have two recursive calls to check if left and right subtrees are

binary search trees or not we will first have a call for the left subtree the execution will go

on like this but i want you to see something in each call to binary search tree function we are

comparing the data in root with all the elements in left subtree and then all the elements in right

subtree this example tree could be really large then in that case in the first call to is binary

search tree for this complete tree we would recursively traverse this whole left subtree

to see whether all the values in this subtree are less than seven or not and then we will traverse

all nodes in this right subtree to see if values are greater than seven or not and then in next

call to is binary search tree when we would be validating whether this particular subtree

is BST or not we would recursively traverse this subtree to see if values are lesser than four or not

and this subtree to see if values are greater than four or not so all in all during this whole

process there will be a lot of traversal data in nodes will be read and compared multiple times

if you can see all nodes in this particular subtree will be traversed once in call to

is binary search tree for 200 when we will compare value in these nodes with seven and then these

nodes will once again be traversed in call to is binary search tree for 150 when they will be

compared with four they will be traversed in call to is subtree lesser all in all these two functions

is subtree lesser and is subtree greater are very expensive for each node we are looking at

all nodes in its subtrees there is an efficient solution in which we do not need to compare

data in a node with data in all nodes in its subtrees and let's see what the solution is

what we can do is we can define a permissible range for each node and data in that node must

be in that range we can start at the root node with range minus infinity to infinity because

for the root node there is no upper and lower limit and now as we are traversing we can set a

range for other nodes when we are going left we need to reset the upper bound so for this node at

150 data has to be between minus infinity and seven data in left child cannot be greater than

data in root if we are going right we need to set the lower bound for this node at 300 range

would be seven to infinity seven is not included in the range data has to be strictly greater than

seven for this node at 180 the range will be minus infinity to four for this node with value six

lower bound will be four and upper bound would be seven now my code will go like this my function

is binary search tree will take two more arguments an integer to mark the lower bound or min value

and another integer to mark the upper bound or max value and now instead of checking whether

all the elements in left subtree are lesser than the data in root and all the elements in

right subtree are greater than the data in root or not we will simply check whether data in root

is in this range or not so i'll get rid of these two function calls is subtree lesser and is

subtree greater which are really expensive and i'll add these two conditions data in root must be

greater than min value and data in root must be less than max value these two checks will take

constant time is subtree lesser and is subtree greater functions were not taking constant time

running time for them was proportional to number of nodes in the subtree okay now these two recursive

calls should also have two more arguments for the left child lower bound will not change

upper bound will be the data in current node and for the right child upper bound will not change

and lower bound will be the data in current node this recursion looks good to me we already have

the base case written the only thing is that the caller of is binary search tree function

may only want to pass the address of root node so what we can do is instead of naming this function

is binary search tree we can name this function as a utility function like is BST util and we can

have another function named is binary search tree in which we can take only the address of root

node and this function can call is BST util function passing address of root minimum possible

value in integer variable for minus infinity and maximum possible value in integer variable

for plus infinity int min and int max here are macros for minimum and maximum possible values in

int so this is our solution using second approach which is quite efficient in this recursion we

will go to each node once and at each node we will take constant time to see whether data in

that node is in a defined range or not time complexity would be big o of n where n is number

of nodes in the binary tree for the previous algorithm time complexity was big o of n square

one more thing in this code i have not handled the case that binary search tree can have duplicates

i'm saying that elements in left subtree must be strictly lesser and elements in right subtree

must be strictly greater i'll leave it for you to see how you will allow duplicates
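The range-based solution described above could be sketched as follows. The identifier spellings IsBstUtil, minValue and maxValue and the test trees are assumptions; the strict comparisons match the lesson's version that does not allow duplicates.

```cpp
#include <cassert>
#include <climits>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Data in each node must lie strictly between minValue and maxValue.
bool IsBstUtil(const Node* root, int minValue, int maxValue) {
    if (root == nullptr) return true;                       // base case
    return root->data > minValue
        && root->data < maxValue
        && IsBstUtil(root->left, minValue, root->data)      // going left: reset upper bound
        && IsBstUtil(root->right, root->data, maxValue);    // going right: reset lower bound
}

// Wrapper so the caller only passes the address of the root node.
bool IsBinarySearchTree(const Node* root) {
    return IsBstUtil(root, INT_MIN, INT_MAX);
}

void RangeApproachExample() {
    // Hypothetical valid tree: 7 at root, children 4 and 9, 4 has children 1 and 6.
    Node a1{1, nullptr, nullptr};
    Node a6{6, nullptr, nullptr};
    Node a4{4, &a1, &a6};
    Node a9{9, nullptr, nullptr};
    Node a7{7, &a4, &a9};
    assert(IsBinarySearchTree(&a7));

    // Same shape with 8 in place of 6: fine next to its parent 4,
    // but outside the range 4 to 7 imposed by the root.
    Node b1{1, nullptr, nullptr};
    Node b8{8, nullptr, nullptr};
    Node b4{4, &b1, &b8};
    Node b9{9, nullptr, nullptr};
    Node b7{7, &b4, &b9};
    assert(!IsBinarySearchTree(&b7));
}
```

One caveat of using INT_MIN and INT_MAX as stand-ins for infinity with strict comparisons: a node that actually stores INT_MIN or INT_MAX would be rejected. Wider bounds or optional bounds would be one way around that.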

there is another solution to this problem you can perform in order traversal of binary tree

and if the tree is binary search tree you would read the data in sorted order

in order traversal of a binary search tree gives a sorted list you can do some hack

while performing in order traversal and check if you are getting the elements in sorted order or

not during the whole traversal you only need to keep track of previously read node and at any time

data in a node that you are reading must be greater than data in previously read node try

implementing this solution it will be interesting okay i'll stop here now in coming lessons we will

discuss some more problems on binary tree thanks for watching
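The lesson leaves the in-order based check as an exercise. One possible sketch, assuming a pointer to the previously read node is threaded through the recursion by reference, is shown here. The names InorderCheck and prev are hypothetical, and other ways of tracking the previous node would work just as well.

```cpp
#include <cassert>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// In-order traversal that stops as soon as a node is not greater than
// the node read just before it. prev is null until the first node is read.
bool InorderCheck(const Node* root, const Node*& prev) {
    if (root == nullptr) return true;
    if (!InorderCheck(root->left, prev)) return false;       // left subtree first
    if (prev != nullptr && root->data <= prev->data) return false;
    prev = root;                                             // this node is now the previous one
    return InorderCheck(root->right, prev);                  // then the right subtree
}

bool IsBinarySearchTree(const Node* root) {
    const Node* prev = nullptr;
    return InorderCheck(root, prev);
}

void InorderCheckExample() {
    // Hypothetical valid tree: in-order reads 1 4 6 7 9.
    Node a1{1, nullptr, nullptr};
    Node a6{6, nullptr, nullptr};
    Node a4{4, &a1, &a6};
    Node a9{9, nullptr, nullptr};
    Node a7{7, &a4, &a9};
    assert(IsBinarySearchTree(&a7));

    // With 8 in place of 6 the in-order read is 1 4 8 7 9, which is not sorted.
    Node b1{1, nullptr, nullptr};
    Node b8{8, nullptr, nullptr};
    Node b4{4, &b1, &b8};
    Node b9{9, nullptr, nullptr};
    Node b7{7, &b4, &b9};
    assert(!IsBinarySearchTree(&b7));
}
```

Like the range-based version, this visits each node once and, with the strict comparison, treats duplicates as invalid.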

in this lesson we are going to write code to delete a node from binary search tree

in most data structures deletion is tricky in case of binary search trees too it's not so

straightforward so let's first see what all complications we may have while trying to delete

a node from binary search tree i have drawn a binary search tree of integers here as we know

in a binary search tree for each node value of all nodes in its left subtree is lesser

and value of all nodes in its right subtree is greater for example in this tree if i'll pick

this node with value five then we have three and one in its left subtree which are lesser

and we have seven and nine in its right subtree which are greater and you can pick any other

node in the tree and this property will be true else the tree is not a bst now when we need to

delete a node this property must be conserved let's try to delete some nodes from this example

tree and see if we can rearrange things and conserve this property of binary search tree or not

what if i want to delete this node with value 19 to delete a node from tree we need to do two

things we need to remove the reference of the node from its parent so the node is detached

from the tree here we will cut this link we will set right child of this node with value 17 as null

and the second thing that we need to do is reclaim the memory allocated to the node being

deleted that is wipe off the node object from memory this particular node with value 19 that

we are trying to delete here is a leaf node it has no children and even if we take this guy out

by simply cutting this link that is removing its reference from its parent and then wiping it off

from memory there is no problem property of binary search tree that for each node value of nodes

in left should be lesser and value of nodes in right should be greater is conserved so deleting a

leaf node a node with no children is really easy in this tree these four nodes with values

one nine thirteen and nineteen are leaf nodes to delete any of these we just need to cut the link

and wipe off the node that is clear it from memory but what if we want to delete a non-leaf node

what if in this example we want to delete this node with value 15 i can't just cut this link

because if i'll cut this link we will detach not just the node with value 15 but this complete

sub tree we have two more nodes in this sub tree we could have had a lot more we need to make sure

that all other nodes except the node with value 15 that's being deleted remain in the tree so what

do we do now this particular node that we are trying to delete here has two children or two

subtrees i'll come back to case of node with two children later because this is not so easy to crack

what i want to discuss first is the case when the node being deleted would have only one child

if the node being deleted would have only one child like in this example this node with

value seven this guy has only one child this guy has a right child but does not have a left child

for such a node what we can do is we can link its parent to this only child so the child and

everything below the child we could have some more nodes below nine as well will remain attached

to the tree and only the node being deleted will be detached now we are not losing any other node

than the node with value seven this is my tree after the deletion is it still a binary search

tree yes it is only the right subtree of node with value five has changed earlier we had seven and

nine in right subtree of five and now we have nine which is fine what if we were having some more

nodes below nine here in this tree i can have a node in the left of nine and the value in this

node has to be lesser than twelve greater than five greater than seven and lesser than nine

we are left with only one choice we can only have eight here in right we can have something

lesser than twelve and greater than five seven and nine all in all between nine and twelve

okay so if the original tree was this much after deletion this is how my tree will look like

okay so are we good now is the tree in right a bst well yes it is when we are setting this node

with value nine as right child of the node with value five we are basically setting this particular

subtree as right subtree of the node with value five now this subtree is already in right of five

so value of all nodes in this subtree is already greater than five and the subtree itself of course

is a binary search tree any subtree in a binary search tree will also be a binary search tree

so even after deletion even after the rearrangement property of the tree that for each node nodes in

left should be lesser and nodes in right should be greater in value is conserved so this is what

we need to do to delete a node with just one child or a node with just one subtree connect

its parent to its only child and then wipe it off from memory there are only two nodes in this tree

that have only one child let's try to delete this other one with value three all we need to do here

is set one as left child of five once again if there were some more nodes below one then also

there was no issue okay so now we are good for two cases we're good for leaf nodes and we are good

for nodes with just one child and now we should think about the third case what if a node has two

children what should we do in this case let's come back to this node with value 15 that we were

trying to delete earlier with two children we can't do something like connect parent to one

of the children while trying to delete 15 if we will connect 12 to 13 if we will make 13 the right

child of 12 then we will include 13 and anything below 13 that is we will include the left subtree

of 15 but we will lose the right subtree of 15 that is 17 and anything below 17 similarly if

we will make 17 the right child then we will lose the left subtree of 15 that is 13 and anything below

13 actually this case is tricky and before I talk about a possible solution I want to insert some

more nodes here I want to have some more nodes in subtrees of 13 and 17 the reason I'm inserting

some more nodes here is because I want to discuss a generic case and that's why I want these two

subtrees to have more than one node okay coming back when I'm trying to delete this node my intent

basically is to remove this value 15 from the tree my delete function will have signature something

like this it will take pointer or reference to the root node and value to be deleted as argument

so here I'm deleting this particular node because I want to remove 15 from the tree

what I'm going to do now is something with which I can reduce case three to either case one or case

two I'll wipe off 15 from this node and I'll fill in some other value in this node of course I can't

fill in any random value what I'll do is I'll look for the minimum in right subtree of this node

and I'll fill in that value here minimum in right subtree of this node is 17 so I have filled 17 here

we now have two nodes with value 17 but notice that this node has only one child we can delete

this node because we know how to delete a node with one child and once this node is deleted my

tree will be good the final arrangement will be a valid arrangement for my BST but why minimum in

right subtree why not value in any other leaf node or any other node with one child well we

also need to conserve this property that for each node nodes in left should have lesser value

and nodes in right should have greater value for this node if I'm bringing in the minimum from

its right subtree then because I'm bringing in something from its right subtree it will be

greater than the previous value 17 is greater than 15 so all the elements in left of course will be

lesser and because it's the minimum in right subtree all the elements in right of this guy

would either be greater or equal we'll have a duplicate that will be equal once the duplicate is

removed everything else will be fine in a tree or sub tree if a node has minimum value it won't

have a left child because if there is a left child there is something lesser and this is another

property that we are exploiting give this some thought in a tree or sub tree node with minimum

value will not have a left child there may or may not be a right child if we would have a right

child like here we have a right child so here we are reducing case three to case two if there was

no child we would have reduced case three to case one okay so let's get rid of the duplicate

I'll build a link like this and after deletion this is what my tree will look like so this is

what we need to do in case three we need to find the minimum in right subtree of the targeted node

then copy or fill in this value and finally we need to delete the duplicate or the node with

minimum value from right subtree there was another possible approach here and I must talk about it

instead of going for minimum in right we could also go for maximum in left subtree maximum in

left subtree would of course be greater than or equal to all the values in left maximum in left

subtree of node with value 15 is 14 I'm copying 14 here now all the nodes in left are lesser than

or equal to 14 and because we are picking something from left subtree it will still be lesser than

the value being deleted 14 is less than 15 so all the nodes in this right subtree will still be

greater and if we are picking maximum in a tree or subtree then that node will not have a right

child because if we have something in right we have something greater so the value can't be maximum

the node may have a left child in this case node with value 14 doesn't have a left child

so we are basically reducing case three to case one I'll simply get rid of this node

so we are looking good even after deletion in case three we can apply any of these methods

and this is all in the logic part let's now write code for this logic I'll write C++ and we will use

recursion if you're not very comfortable applying recursion on trees then make sure you watch earlier

lessons in this series you can find link to them in description of this video

in my code here I have defined node as a structure with three fields we have one field to store data

and we have two fields that are pointers to node to store addresses of left and right children

and I want to write a function named delete that should take pointer to root node and the data

to be deleted as argument and this function should return pointer to root node because the root may

change after deletion what we're passing to delete function is only a local copy of a root's address

if the address is changing we need to return it back to delete a given value or data we first need

to find it in the tree and once we find the node containing that data we can try to delete it

remember the only identity of the tree that we pass to functions is the address of the root node and to

perform any action on the tree we need to start at root so let's first search for the node with

this data first I'll cover a corner case if root is null that is if the tree is empty

we can simply return I can say return root or return null here they will mean the same

because root is null else if the data that we are looking for is less than the data in root

then it's in the left subtree the problem can be reduced to deleting the data from left subtree

we need to go and find the data in left subtree so we can make a recursive call to delete function

passing address of the left child and the data to be deleted now the root of the left subtree

that is the left child of this current node may change after deletion but the good thing is

delete function will return address of the modified root of the left subtree so we can set the return

as left child of the current node now if data that we are trying to delete is greater than

the data in root we need to go and delete the data from right subtree and if the data is neither

greater nor lesser that is if it's equal then we can try deleting the node containing that data

now let's handle the three cases one by one if there is no child we can simply delete that node

what I'll do here is I'll first wipe off the node from memory and this is how I'll do it

what we have in root right now is address of the node to be deleted I'm using delete operator here

and that's used to deallocate memory of an object in heap in c you would use free function

now root is a dangling pointer because the object in heap is deleted but root still has its address

so we can set root as null and now we can return root reference of this node in its parent will

not be fixed here once this recursive call finishes then somewhere in these two statements in any of

these two else ifs the link will be corrected I hope this is making

sense okay now let's handle other cases if only the left child is null then what I want to do is

I first want to store the address of current node that I'm trying to delete in a temporary

pointer to node and now I want to move the root this pointer named root to the right child

so the right child becomes the root of this sub tree and now we can delete the node

that is being pointed to by temp we will use delete operator in c we would be using free

function and now we can return root similarly if the right child is null I'll first store the

address of current root in a temporary pointer to node then I'll make the left child new root of

the sub tree so we'll move to the left child and then I'll delete the previous root whose address

I have in temp and finally I'll return root actually we need to return root in all cases so I'll remove

this return root statement from all this if and else if and write one return root after everything

let's talk about the third case now in case of two children what we need to do is we need to

search for minimum element in right sub tree of the node that we are trying to delete

let's say this function find min will give me address of the node with minimum value in a tree

or sub tree so I'm calling this function find min and I'm collecting the return in a pointer to

node named temp now I should set the data in current node that I'm trying to delete as

this minimum value and now the problem is getting reduced to deleting this minimum value from the

right sub tree of current node with this much code I think I'm done with delete function
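
A minimal C++ sketch of the delete logic walked through above, assuming integer data. GetNewNode and Insert are helpers added here only so the example is self-contained, and FindMin is the helper the lesson names, written here as a simple loop. This is one way to write it and may differ in detail from the code shown on screen in the video.

```cpp
#include <cassert>

// Node layout as described in the lesson: data plus left/right child pointers.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Helper assumed for this sketch: allocate a leaf node on the heap.
Node* GetNewNode(int data) {
    return new Node{data, nullptr, nullptr};
}

// Helper assumed for this sketch: plain BST insert, returns the subtree root.
Node* Insert(Node* root, int data) {
    if (root == nullptr) return GetNewNode(data);
    if (data < root->data) root->left = Insert(root->left, data);
    else root->right = Insert(root->right, data);
    return root;
}

// Leftmost node of a non-empty subtree, i.e. the node with the minimum value.
Node* FindMin(Node* root) {
    assert(root != nullptr);
    while (root->left != nullptr) root = root->left;
    return root;
}

// Removes one node holding `data` (if present) and returns the root of the
// modified subtree, so the caller can re-link it.
Node* Delete(Node* root, int data) {
    if (root == nullptr) return root;            // empty tree or value not found
    if (data < root->data) {
        root->left = Delete(root->left, data);   // link fixed when the call returns
    } else if (data > root->data) {
        root->right = Delete(root->right, data);
    } else {
        if (root->left == nullptr && root->right == nullptr) {
            delete root;                         // case 1: no child
            root = nullptr;                      // avoid a dangling pointer
        } else if (root->left == nullptr) {
            Node* temp = root;                   // case 2: only a right child
            root = root->right;
            delete temp;
        } else if (root->right == nullptr) {
            Node* temp = root;                   // case 2: only a left child
            root = root->left;
            delete temp;
        } else {
            Node* temp = FindMin(root->right);   // case 3: two children
            root->data = temp->data;             // copy the minimum of the right subtree
            root->right = Delete(root->right, temp->data);  // remove the duplicate
        }
    }
    return root;
}
```

The caller is expected to write root = Delete(root, value) so that a changed root address is not lost.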

this looks good to me let's quickly run this code on an example tree and see if this works or not

I have drawn a binary search tree here let's say these values outside these nodes are addresses of

the nodes now I want to delete number 15 from this tree so I'll make a call to delete function

passing address of the root which is 200 and 15 the value to be deleted in delete function for

this particular call control will come to this line a recursive call will be made execution of

this call delete 200 comma 15 will pause and it will resume only after this function below delete

350 comma 15 returns now for this call below we will go inside the third else in case 3

here we will find the node with minimum value in right which is 17 which is 400 the value is 17

address is 400 first we will set the data in node at 350 as 17 and now we are making a recursive

call to delete 17 from right sub tree of 350 we have only one node in right sub tree of 350

here we have case 1 in this call we will simply delete the node at 400 and return null remember

root will be returned in all calls in the end now delete 350 comma 15 will resume and in this

resumed call we will set the address of right child of node at 350 as null as you can see the link

in parent is being corrected when the recursion is unfolding and the function call corresponding to

the parent is resuming and now this guy can return and now in this call we will resume at this line

so right child of node at 200 will be set as 350 it's already 350 but it will be written again

and now this call can also finish so I hope you got some sense of how this recursion is working

you can find link to all the source code and code to test the delete function

in description of this video this is it for this lesson thanks for watching

in this lesson we are going to solve one other interesting problem on binary search tree

and the problem is given a node in a binary search tree we need to find its in-order

successor that is the node that would come immediately after the given node in in-order

traversal of the binary search tree as we know in in-order traversal of a binary tree we first

visit the left subtree then the root and then the right subtree left and right subtrees are

visited recursively in same manner so for each node we first visit its left subtree

then the node itself and then its right subtree we have already discussed in-order traversal

in detail in a previous lesson in the series you can check the description of this video for

a link to it in-order implementation will basically be a recursive function something like what i'm

showing in right here there are two recursive calls in this function one to visit the left subtree

and another to visit the right subtree time complexity of in-order traversal is big o of n

where n is number of nodes in the tree we visit each node exactly once so time taken is proportional

to number of nodes in the tree i have drawn a binary search tree of integers here binary search

tree as we know is a binary tree in which for each node value of nodes in left is lesser and value

of nodes in right is greater let's quickly see what will be the in-order traversal for this

binary search tree we'll start at root of the tree now for any node we first need to visit all

nodes in its left and then only we can visit that node so we will have to go left basically we will

make a recursive call to go to left child of this node for this guy once again we have something

in left so we will make another recursive call and go to its left child now we are at this node

with value eight and we will have to go left one more time and now for this node with value six

which is a leaf node we have nothing in left so we can simply say that its left subtree is done

and hence we can visit this guy visiting for me is reading the data in that node

i'll write the data here and now for this node there's nothing in right as well so

we can simply say that its right is also done and now we're completely done for this guy

so recursive call corresponding to this node will finish and we will go back to call corresponding

to its parent if we will come back to a node from its left child then it will be unvisited

because we can't visit a node until its left is done so when we are coming back to eight eight

is unvisited so we can simply visit this node that is read the data in this node

when i'll visit a node i'll paint it in yellow and now there's nothing in right of this node so

we can simply say that right is done now we are done with this node so call corresponding to this

node will finish and we will go back to its parent once again we're coming back to the parent from

left so the parent that is this node with value 10 is unvisited if we would come back to a node from

right then it would already be visited so i'm visiting 10 and now we can go to right of 10

so far we have visited three nodes we first visited node with value six and then we visited

node with value eight so eight is successor of six and then 10 is successor of eight

now let's see what will be the successor of 10 for nodes with values six and eight there was

nothing in right so we were unwinding and going to the parent but for a node if there would be

something in right that is if there would be a right sub tree then its successor would definitely

be in its right sub tree because after visiting that node we will go right now at this stage we are

at this node with value 12 for this guy we will first go left and now we are at node with value 11

which is a leaf node there's nothing in left so we can simply say that left is done and we can

print the data that is visit this node so in order successor of 10 is 11 now for node with value 11

there's nothing in right so we will go back to its parent and now we can visit this guy so after 11

we have 12 there's nothing in right of 12 so call for this guy will finish and we will go to its

parent now we're coming back to 10 again but this time from right so this guy is already visited

so we need not do anything we can simply go to its parent and now we are at this node with value 15

we are coming from left this guy is unvisited so we can visit it and now we can go to its right

we will go on like this successor of 15 would be 16 and after 16 we will print 17

then after 17 we will print 20 then 25 and the last element would be 27
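
The traversal simulated above can be sketched in C++ as follows. The Insert helper and the insertion order used to rebuild the example tree are assumptions of this sketch, and "visiting" a node is modelled as appending its value to a vector rather than printing it.

```cpp
#include <cassert>
#include <vector>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Helper assumed for this sketch: plain BST insert, returns the subtree root.
Node* Insert(Node* root, int data) {
    if (root == nullptr) return new Node{data, nullptr, nullptr};
    if (data < root->data) root->left = Insert(root->left, data);
    else root->right = Insert(root->right, data);
    return root;
}

// In-order traversal: left subtree, then the node itself, then right subtree.
// Every node is visited exactly once, so the running time is O(n).
void Inorder(const Node* root, std::vector<int>& visited) {
    if (root == nullptr) return;
    Inorder(root->left, visited);
    visited.push_back(root->data);   // "visit" = read the data in the node
    Inorder(root->right, visited);
}
```

Inserting 15, 10, 20, 8, 12, 17, 25, 6, 11, 16, 27 in that order gives a tree consistent with the walkthrough (root 15, with 10 and 20 as its children), and Inorder then yields the values in ascending order.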

so this is in order traversal of this binary search tree notice that we have printed the

integers in sorted order when we perform in order traversal on a binary search tree then elements

are visited in sorted order now the problem that we want to solve is given a value in the tree

we want to find its in order successor in a binary search tree it would be the next higher value in

the tree but what's the big deal here can't we just perform in order traversal and while performing

the traversal figure out the successor well we can do so but it will be expensive running time of

in order is big o of n and we may want to do better finding next and previous element in some data

could be a frequently performed operation and good thing about binary search tree is that

frequently performed operations like insertion, deletion and search happen in big o of h where

h is height of the tree so it would be good if we are able to find successor and predecessor

in big o of h we always try to keep a tree balanced to minimize its height height of a balanced

binary tree is of the order of log n to the base 2 and big o of log n running time for any operation is almost

the best running time that we can have so can we find in order successor in big o of h i have

redrawn the example tree here let's see what we can do in various cases what node would we

visit after this node with value 10 can we deduce this logically well if you remember the simulation

of in order traversal that we had done earlier then if we have already visited this node

then we are done with its left subtree and we have read the data in this node and we need to go

right now in the right subtree we will have to go left as long as it's possible to go

and if we can't go left anymore like here there is nothing in left of this node with value 11

then this is the node that i'm visiting next so for a node if there is a right subtree

then in order successor would be the leftmost node in its right subtree in a bst it would be

the node with minimum value in its right subtree i would say this is case one in this case all we

need to do is we need to go as left as possible in right subtree in a bst it will also mean finding

the minimum in right subtree leftmost node will also be the minimum in the subtree

now this is one case our node here had a right subtree what would be the in-order successor if there

would be no right subtree what node would we visit after this node with value 8 this guy does not

have a right subtree if we have already visited this guy then we have visited its left and this

node itself and there is nothing in right so we can say that right is also visited but we have not

found the successor yet now where do we go from here well if you remember the simulation that we

had done earlier we need to go to the parent of this node and if we are going to the parent from

left which is the case here then the parent would be unvisited for this node with value 10 we just

finished its left subtree and we are coming back so now we can visit this node so this is my successor

let's now pick another node with no right subtree what would be in order successor of this node

with value 12 what node would we visit next now here once again we do not have a right subtree

for this node so we must go back to its parent and see if it's unvisited but if we are going

to the parent from right if the node that we just visited is a right child which is the case here

then the parent would already be visited because we are coming back after visiting its right subtree

this node must have been visited before going right so what should we do now the recursion will

roll back further and we need to go to parent of 10 and now we are going to 15 from left so this

guy is unvisited so we can visit this node and this is my successor if the node does not have a

right subtree we need to go to the nearest ancestor for which given node would be in its left subtree

here for 12 we first went to 10 but 12 is in right subtree of 10 so we went to the next ancestor

15 and 12 is in left of 15 so this is the nearest ancestor for which 12 is in left

and hence this is my in order successor this algorithm works fine but there is an issue

how do we go from a node to its parent well we can design our tree such that node can have

reference to its parent so far in most lessons we have defined node as a structure with three

fields something like this this is how we would define node in c or c plus plus we have one field

to store data and we have two pointers to node to store reference or addresses of left and right

children often it makes a lot of sense to have one more field to store the address of parent we

can design a tree like this and then we will not have problem walking the tree up using parent link
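
A hedged sketch of the parent-link variant in C++. The names PNode, InsertWithParent and SuccessorViaParent are made up for this illustration and do not come from the lesson.

```cpp
#include <cassert>

// Node with one extra field: the address of its parent (nullptr for the root).
struct PNode {
    int data;
    PNode* left;
    PNode* right;
    PNode* parent;
};

// Hypothetical helper: BST insert that also records the parent link.
PNode* InsertWithParent(PNode* root, int data, PNode* parent = nullptr) {
    if (root == nullptr) return new PNode{data, nullptr, nullptr, parent};
    if (data < root->data) root->left = InsertWithParent(root->left, data, root);
    else root->right = InsertWithParent(root->right, data, root);
    return root;
}

// In-order successor using parent links, O(h) where h is the height.
PNode* SuccessorViaParent(PNode* node) {
    if (node == nullptr) return nullptr;
    if (node->right != nullptr) {
        // Case 1: leftmost node in the right subtree.
        PNode* cur = node->right;
        while (cur->left != nullptr) cur = cur->left;
        return cur;
    }
    // Case 2: climb while we arrive from a right child; the first ancestor
    // reached from its left child is still unvisited, so it is the successor.
    PNode* child = node;
    PNode* ancestor = node->parent;
    while (ancestor != nullptr && child == ancestor->right) {
        child = ancestor;
        ancestor = ancestor->parent;
    }
    return ancestor;   // nullptr when the node holds the maximum value
}
```

The extra pointer costs one field per node and has to be kept up to date by every operation that changes the tree.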

we can easily go to the ancestors but what if there is no link to parent in this case what we

can do is we can start at root and walk the tree from root to the given node in a bst this is really

easy for 12 we will start at root 12 is lesser than value in root so we need to go left and now

we are at 10 now 12 is greater than 10 so we need to go right and now we are at 12 if we will walk

the tree from root to the given node we will go through all the ancestors of the given node

in order successor would be the deepest node or the deepest ancestor in this path

for which given node would be in left subtree 12 has only two ancestors we have 10 but 12 is in

right of 10 and then we have 15 and 12 is in left of 15 so 15 is my successor now let's use this

technique to find successor of 6 we will first walk down from root to this node 6 is in left for

all the ancestors but the deepest ancestor for which 6 is in left is this node with value 8 so this is

my successor remember we need to look at ancestors only if there is no right subtree

for 6 there is no right subtree okay so the algorithm looks good let's now write code for this

in my C++ code here i'm going to write a function named get successor that will take address of

root node and address of another node for which we need to find the successor and this function

will return address of the successor node we could design this function differently instead of taking

pointer to the node for which we want to find the successor as argument we could just take

the data as argument and for this data for this element we can find the successor node and return

its address and that's why the return type here is struct node asterisk because we will be

returning address in a pointer or what we can also do is we can return the element itself the

successor element itself we can implement with any of these signatures let's implement this one

we will pass the data in current node and we will return back the address of the successor

now the first thing that we need to do is we need to search the node with this data

i'm going to make call to a function named find that will take address of the root node and the data

and will return me pointer to the node with this data if this function returns me null that is if

the data is not found in the tree we can simply return null else we have the address of the current

node in this pointer to node that we have named current now in a bst this search operation will

cost us big o of h where h is height of the tree search in a bst is not very expensive

we could have avoided this search if we would have passed address of the current node instead of

passing the data as this second argument but let's go with this let's now find the successor of this

node if this node has a right sub tree that is if the right sub tree is not null we need to go to the

leftmost node in the right sub tree i have declared a temporary pointer to node here and initially

i've set it to current dot right and with this while loop i'll go to the leftmost node

while there is something in the left keep going and finally when i'll come out of this loop

i'll have address of leftmost node in the right sub tree and i can return this address

this particular node will also be the node with minimum value in right sub tree i'll move this

code in another function i have written this function named find min that will return node with

minimum value in a tree or sub tree in get successor function i'll simply say return find min and i'll

pass the address of right child of current node so basically i'm passing the right sub tree here

okay now let's talk about case two if there is no right sub tree what we need to do is we need to

walk the tree from root till current node and we need to find the deepest ancestor for which

current node will be in its left subtree what i'm going to do here is i'm going to declare

a pointer to node named successor and initially i'll set it as null and i'll have another

pointer to node named ancestor and initially i'll set this as root and with this while loop we will

walk the tree until we reach the current node to walk the tree we will use the property of

binary search tree that for each node value of nodes in left is lesser and value of nodes in right

is greater if data in current node is less than the data in ancestor then first of all this ancestor

may be my in order successor because the current node is in its left so what we can do is we can

set this guy as successor and we can go left while traversing if we will find a deeper node

with this property that current node will be in its left sub tree then successor will be updated

else if the current node lies in right we simply need to move right when we'll come out of this

while loop successor will either be null or it will be the address of some node not all nodes in the

tree will have a successor node with maximum value will not have a successor after coming out of this

while loop we can return the successor so this is my get successor function and i think this

should work you can find link to complete source code in description of this video overall time

complexity of this function will be big o of h and this is what we wanted we wanted to find

successor in big o of h here we are already performing the search in big o of h finding

minimum will also take big o of h and walking the tree from root to a node in bst will also take

big o of h so overall this is big o of h if you have understood this code this logic then it should

be very easy for you writing function to find in order predecessor i encourage you to write it
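
The get successor function described above can be sketched in C++ as below, for a tree without parent links. Find, FindMin and Getsuccessor follow the names used in the lesson; Insert is an extra helper assumed here so the example is self-contained, and the exact code in the video may differ.

```cpp
#include <cassert>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Helper assumed for this sketch: plain BST insert, returns the subtree root.
Node* Insert(Node* root, int data) {
    if (root == nullptr) return new Node{data, nullptr, nullptr};
    if (data < root->data) root->left = Insert(root->left, data);
    else root->right = Insert(root->right, data);
    return root;
}

// Search in O(h): returns the node holding `data`, or nullptr if absent.
Node* Find(Node* root, int data) {
    while (root != nullptr && root->data != data)
        root = (data < root->data) ? root->left : root->right;
    return root;
}

// Leftmost node of a subtree, i.e. the node with the minimum value.
Node* FindMin(Node* root) {
    if (root == nullptr) return nullptr;
    while (root->left != nullptr) root = root->left;
    return root;
}

// In-order successor of the node holding `data`, or nullptr if there is none.
Node* Getsuccessor(Node* root, int data) {
    Node* current = Find(root, data);
    if (current == nullptr) return nullptr;      // data not in the tree
    if (current->right != nullptr)               // case 1: right subtree exists
        return FindMin(current->right);
    // Case 2: walk from the root to current and remember the deepest
    // ancestor that has current in its left subtree.
    Node* successor = nullptr;
    Node* ancestor = root;
    while (ancestor != current) {
        if (current->data < ancestor->data) {
            successor = ancestor;                // candidate, may be replaced by a deeper one
            ancestor = ancestor->left;
        } else {
            ancestor = ancestor->right;
        }
    }
    return successor;
}
```

Each of the three steps (search, find minimum, walk from the root) follows a single root-to-node path, which is where the overall O(h) bound comes from.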

i'll stop here now in coming lessons we will solve some more interesting problems on binary

trees and binary search trees thanks for watching hello everyone so far in this series on data

structures we have talked about some of the linear data structures like array linked list stack

and queue in all these structures data is arranged in a linear or sequential manner so we can call

them linear data structures and we have also talked about tree which is a non-linear data structure

tree is a hierarchical structure now as we understand data structures are ways to store

and organize data and for different kinds of data we use different kinds of data structures

in this lesson we are going to introduce you to another non-linear data structure

that has got its application in a wide number of scenarios in computer science it is used to model

and represent a variety of systems and this data structure is graph when we study data structures

we often first study them as mathematical or logical models here also we will first study graph

as a mathematical or logical model and we will go into implementation details later okay so let's

get started a graph just like a tree is a collection of objects or entities that we call

nodes or vertices connected to each other through a set of edges but in a tree connections are bound

to be in a certain way in a tree there are rules dictating the connection among the nodes

in a tree with n nodes we must have exactly n minus one edges one edge for each parent child

relationship as we know an edge in a tree is for a parent child relationship and all nodes in a tree

except the root node would have a parent would have exactly one parent and that's why if there

are n nodes there must be exactly n minus one edges in a tree all nodes must be reachable from

the root and there must be exactly one possible path from root to a node now in a graph there are

no rules dictating the connection among the nodes a graph contains a set of nodes and a set of edges

and edges can be connecting nodes in any possible way tree is only a special kind of graph now

graph as a concept has been studied extensively in mathematics if you have taken a course on discrete

mathematics then you must be knowing about graphs already in computer science we basically study

and implement the same concept of graph from mathematics the study of graphs is often referred

to as graph theory in pure mathematical terms we can define graph something like this a graph g

is an ordered pair of a set v of vertices and a set e of edges now i'm using some mathematical

jargon here an ordered pair is just a pair of mathematical objects in which the order of objects

in the pair matters this is how we write and represent an ordered pair objects separated by

comma put within parenthesis now because the order here matters we can say that v is the first object

in the pair and e is the second object an ordered pair (a, b) is not equal to (b, a) unless a and b are

equal in our definition of graph here first object in the pair must always be a set of vertices

and the second object must be a set of edges that's why we are calling the pair an ordered pair

we also have concept of unordered pair an unordered pair is simply a set of two elements order is

not important here we write an unordered pair using curly brackets or braces because the order

is not important here unordered pair {a, b} is equal to {b, a} it doesn't matter which object is first and

which object is second okay coming back so a graph is an ordered pair of a set of vertices

and a set of edges and G = (V, E) is a formal mathematical notation that we use to define a

graph now i have a graph drawn here in the right this graph has eight vertices and ten edges what

i want to do is i want to give some names to these vertices because each node in a graph must have

some identification it can be a name or it can be an index i'm naming these vertices as v1 v2 v3 v4

v5 and so on and this naming is not indicative of any order there is no first second and third

node here i could give any name to any node so my set of vertices here is this we have eight

elements in the set v1 v2 v3 v4 v5 v6 v7 and v8 so this is my set of vertices for this

graph now what's my set of edges to answer this we first need to know how to represent an edge

an edge is uniquely identified by its two end points so we can just write the names of the two

end points of an edge as a pair and it can be a representation for the edge but edges can

be of two types we can have a directed edge in which connection is one way or we can have an

undirected edge in which connection is two way in this example graph that i'm showing here edges

are undirected but if you remember the tree that i had shown earlier then we had directed edges in

that tree with this directed edge that i'm showing you here we are saying that there is a link or

path from vertex u to v but we cannot assume a path from v to u this connection is one way for a

directed edge one of the end points would be the origin and the other end point would be the

destination and we draw the edge with an arrowhead pointing towards the destination

for our edge here origin is u and destination is v a directed edge can be represented as an

ordered pair first element in the pair can be the origin and second element can be the destination

so with this directed edge represented as ordered pair uv we have a path from u to v

if we want a path from v to u we need to draw another directed edge here with v as origin

and u as destination and this edge can be represented as ordered pair vu the upper one here is uv and

the below one is vu and they are not same now if the edge is undirected the connection is two way

an undirected edge can be represented as an unordered pair here because the edge is bidirectional

origin and destination are not fixed we only need to know what two end points are being connected

by the edge so now that we know how to represent edges we can write the set of edges for this

example graph here we have an undirected edge between v1 and v2 then we have one between v1

and v3 then we have v1 v4 this is really simple i'll just go ahead and write all of them
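
To make the ordered versus unordered pair distinction concrete, here is a small C++ sketch. The type and function names are invented for this illustration, and only the three edges actually named in the transcript are filled in, since the remaining seven edges of the example graph are not listed.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

using Vertex = std::string;
using Edge = std::pair<Vertex, Vertex>;

// Directed edge: an ordered pair (origin, destination).
Edge DirectedEdge(const Vertex& origin, const Vertex& destination) {
    return Edge{origin, destination};
}

// Undirected edge: an unordered pair. Storing the smaller name first makes
// {a, b} and {b, a} compare equal.
Edge UndirectedEdge(const Vertex& a, const Vertex& b) {
    return (a < b) ? Edge{a, b} : Edge{b, a};
}

// A graph G = (V, E): a set of vertices and a set of edges.
struct Graph {
    std::set<Vertex> V;
    std::set<Edge> E;
};

// Builds the vertex set v1..v8 and only the edges named in the lesson.
Graph ExampleGraph() {
    Graph g;
    for (int i = 1; i <= 8; ++i) g.V.insert("v" + std::to_string(i));
    g.E.insert(UndirectedEdge("v1", "v2"));
    g.E.insert(UndirectedEdge("v1", "v3"));
    g.E.insert(UndirectedEdge("v1", "v4"));
    return g;
}
```

With this representation DirectedEdge("u", "v") and DirectedEdge("v", "u") are different edges, while UndirectedEdge("u", "v") and UndirectedEdge("v", "u") are the same one.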

so this is my set of edges typically in a graph all edges would either be directed

or undirected it's possible for a graph to have both directed and undirected edges

but we are not going to study such graphs we are only going to study graphs in which

all edges would either be directed or undirected a graph with all directed edges is called a

directed graph or digraph and a graph with all undirected edges is called an undirected graph

there is no special name for an undirected graph usually if the graph is directed

we explicitly say that it's a directed graph or digraph so these are two types of graph directed

graph or digraph in which edges are unidirectional or ordered pairs and undirected

graph in which edges are bi-directional or unordered pairs now many real world systems and problems

can be modeled using a graph graphs can be used to represent any collection of objects

having some kind of pairwise relationship let's have a look at some of the interesting examples

a social network like facebook can be represented as an undirected graph

a user would be a node in the graph and if two users are friends there would be an edge connecting

them a real social network would have millions and billions of nodes i can show only few in my

diagram here because i'm short of space now social network is an undirected graph because

friendship is a mutual relationship if i'm your friend you are my friend too so connections have

to be two-way now once a system is modeled as a graph a lot of problems can easily be solved

by applying standard algorithms in graph theory like here in this social network let's say we want

to do something like suggest friends to a user let's say we want to suggest some connections to

rama one possible approach to do so can be suggesting friends of friends who are not connected

already rama has three friends ela bob and katy and friends of these three that are not connected to

rama already can be suggested there is no friend of ela which is not connected to rama already

bob however has three friends tom sam and lee that are not friends with rama so they can

be suggested and katy has two friends lee and swati that are not connected to rama we have

counted lee already so in all we can suggest these four users to rama now even though we

described this problem in context of a social network this is a standard graph problem the problem

here in pure graph terms is finding all nodes having length of shortest path from a given node

equal to two standard algorithms can be applied to solve this problem we'll talk about concepts

like path in a graph in some time for now just know that the problem that we just described in

context of a social network is a standard graph problem okay so a social network like facebook

is an undirected graph now let's have a look at another example
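
The friend suggestion idea can be sketched in C++ as a breadth-first search that stops at distance two. The adjacency below contains only the friendships stated in the lesson, the spellings "Katy" and "Tom" are guesses at names the captions garble, and the function names are made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Undirected graph as an adjacency map: user -> set of friends.
using Graph = std::map<std::string, std::set<std::string>>;

// Friendship is mutual, so the edge is stored in both directions.
void AddFriendship(Graph& g, const std::string& a, const std::string& b) {
    g[a].insert(b);
    g[b].insert(a);
}

// Returns, in alphabetical order, all users whose shortest path from `user`
// has length exactly two (friends of friends who are not already friends).
std::vector<std::string> SuggestFriends(const Graph& g, const std::string& user) {
    std::map<std::string, int> dist;
    dist[user] = 0;
    std::queue<std::string> pending;
    pending.push(user);
    std::vector<std::string> suggestions;
    while (!pending.empty()) {
        std::string u = pending.front();
        pending.pop();
        if (dist[u] == 2) {              // far enough: record, do not expand
            suggestions.push_back(u);
            continue;
        }
        auto it = g.find(u);
        if (it == g.end()) continue;     // unknown user, no friends
        for (const std::string& v : it->second) {
            if (dist.count(v) == 0) {    // first time seen = shortest distance
                dist[v] = dist[u] + 1;
                pending.push(v);
            }
        }
    }
    std::sort(suggestions.begin(), suggestions.end());
    return suggestions;
}

// The friendships mentioned in the lesson (names as assumed above).
Graph ExampleNetwork() {
    Graph g;
    AddFriendship(g, "Rama", "Ela");
    AddFriendship(g, "Rama", "Bob");
    AddFriendship(g, "Rama", "Katy");
    AddFriendship(g, "Bob", "Tom");
    AddFriendship(g, "Bob", "Sam");
    AddFriendship(g, "Bob", "Lee");
    AddFriendship(g, "Katy", "Lee");
    AddFriendship(g, "Katy", "Swati");
    return g;
}
```

Calling SuggestFriends(ExampleNetwork(), "Rama") returns Lee, Sam, Swati and Tom, the same four users counted in the lesson, with Lee reported once even though two of Rama's friends know him.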

interlinked web pages on the internet or the world wide web can be represented as a directed

graph a web page that would have a unique address or url would be a node in the graph and we can

have a directed edge if a page contains link to another page now once again there are billions of

pages on the web but i can show only few here the edges in this graph are directed because

the relationship is not mutual this time if page a has a link to page b then it's not necessary that

page b will also have a link to page a let's say one of the pages on mycodeschool.com has a

tutorial on graph and on this page i have put a link to wikipedia article on graph let's assume

that in this example graph that i'm showing you here page p is my mycodeschool tutorial on graph

with this address or url mycodeschool.com/videos/graph and let's say page q is the wikipedia

article on graph with this url wikipedia.org/wiki/graph now on my page that is page p i have put a link to

wikipedia page on graph if you are on page p you can click on this link and go to page q

but wikipedia has not reciprocated my favor by putting a link back to my page

so if you are on page q you cannot click on a link and come to page p connection here is one way

and that's why we have drawn a directed edge here okay now once again if we are able to represent

web as a directed graph we can apply standard graph theory algorithms to solve problems and

perform tasks one of the tasks that search engines like google perform very regularly

is web crawling search engines use a program called web crawler that systematically

browses the world wide web to collect and store data about web pages search engines can then

use this data to provide quick and accurate results against search queries now even though

in this context we are using a nice and heavy term like web crawling web crawling is basically

graph traversal or in simpler words act of visiting all nodes in a graph and no prizes for

guessing that there are standard algorithms for graph traversal we'll be studying graph traversal

algorithms in later lessons okay now the next thing that i want to talk about is concept of a

weighted graph sometimes in a graph all connections cannot be treated as equal some connections can

be preferable to others like for example we can represent intercity road network that is the

network of highways and freeways between cities as an undirected graph i'm assuming that all highways

would be bidirectional intra city road network that is road network within a city would definitely

have one way roads and so intra city road network must be represented as a directed graph but

intercity road network in my opinion can be represented as an undirected graph now clearly

we cannot treat all connections as equal here roads would be of different lengths and to perform a

lot of tasks to solve a lot of problems we need to take length of roads into account in such cases

we associate some weight or cost with every edge we label the edges with their weights in this case

weight can be length of the roads so what i'll do here is i'll just label these edges with some

values for their lengths and let's say these values are in kilometers and now edges in this

graph are weighted and this graph can be called a weighted graph let's say in this graph we want

to pick the best route from city A to city D have a look at these four possible routes i'm showing

them in different colors now if i would treat all edges as equal then i would say that the green

route through B and C and the red route through E and F are equally good both these paths have

three edges and this yellow route through E is the best because we have only two edges in this path

but with different weights assigned to the connections i need to add up weights of edges in a path

to calculate total cost when i'm taking weight into account shortest route is through B and C

connections have different weights and this is really important here in this graph
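Adding up the edge weights along a route, as described above, could look something like this. The road lengths here are made up for illustration because the figure's actual values are not in the transcript; they are only chosen so that the route through B and C comes out cheapest. The names RoadMap, route_cost and example_roads are likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Undirected weighted graph: each road is keyed by its two cities
// (smaller letter first) and maps to its length in kilometers.
using RoadMap = std::map<std::pair<char, char>, int>;

// Total length of a route given as a string of city letters, e.g. "ABCD".
// Returns -1 if two consecutive cities are not joined by a road.
int route_cost(const RoadMap& roads, const std::string& route) {
    assert(!route.empty());  // a route names at least one city
    int total = 0;
    for (std::size_t i = 0; i + 1 < route.size(); ++i) {
        char a = route[i];
        char b = route[i + 1];
        if (a > b) std::swap(a, b);  // undirected: normalize the key order
        auto it = roads.find(std::make_pair(a, b));
        if (it == roads.end()) return -1;
        total += it->second;
    }
    return total;
}

// Assumed lengths, not the ones drawn in the lesson.
RoadMap example_roads() {
    RoadMap roads;
    roads[{'A', 'B'}] = 5;
    roads[{'B', 'C'}] = 4;
    roads[{'C', 'D'}] = 6;
    roads[{'A', 'E'}] = 10;
    roads[{'E', 'F'}] = 7;
    roads[{'D', 'F'}] = 8;
    roads[{'D', 'E'}] = 12;
    return roads;
}
```

With these assumed lengths the route A B C D costs 15, A E D costs 22 and A E F D costs 25, so the route with the fewest edges is not the shortest one.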

actually we can look at all the graphs as weighted graphs an unweighted graph can basically be

seen as a weighted graph in which weight of all the edges is same and typically we assume the weight

as one okay so we have represented intercity road network as a weighted undirected graph

social network was an unweighted undirected graph and worldwide web was an unweighted directed

graph and this one is a weighted undirected graph now this was intercity road network i think

intra city road network that is road network within a city can be modeled as a weighted directed

graph because in a city there would be some one ways intersections in intra city road network

would be nodes and road segments would be our edges and by the way we can also draw an undirected

graph as directed it's just that for each undirected edge we'll have two directed edges

we may not be able to redraw a directed graph as undirected but we can always redraw an undirected

graph as directed okay i'll stop here now this much is good for an introductory lesson

in next lesson we'll talk about some more properties of graph this is it for this lesson

thanks for watching in our previous lesson we introduced you to graphs we defined graph as a

mathematical or logical model and talked about some of the properties and applications of graph

now in this lesson we will discuss some more properties of graph but first i want to do a

quick recap of what we have discussed in our previous lesson a graph can be defined as an

ordered pair of a set of vertices and a set of edges we use this formal mathematical notation

g = (v, e) to define a graph here v is set of vertices and e is set of edges ordered pair is

just a pair of mathematical objects in which order of objects in the pair matters it matters

which element is first and which element is second in the pair now as we know to denote number of

elements in a set that we also call cardinality of a set we use the same notation that we use

for modulus or absolute value so this is how we can denote number of vertices and number of edges

in a graph number of vertices would be number of elements in set v and number of edges would be

number of elements in set e moving forward this is how i'm going to denote number of vertices

and number of edges in all my explanations now as we had discussed earlier edges in a graph

can either be directed that is one way connections or undirected that is two way connections

a graph with only directed edges is called a directed graph or digraph and a graph with only

undirected edges is called an undirected graph now sometimes all connections in a graph cannot

be treated as equal so we label edges with some weight or cost like what i'm showing here

and a graph in which some value is associated to connections as cost or weight is called a weighted

graph a graph is unweighted if there is no cost distinction among edges okay now we can also have

some special kind of edges in a graph these edges complicate algorithms and make working

with graphs difficult but i'm going to talk about them anyway an edge is called a self loop or

self edge if it involves only one vertex if both end points of an edge are same then it's called

a self loop we can have a self loop in both directed and undirected graphs but the question is

why would we ever have a self loop in a graph well sometimes if edges are depicting some

relationship or connection that's possible with the same node as origin as well as destination

then we can have a self loop for example as we had discussed in our previous lesson

interlinked web pages on the internet or the worldwide web can be represented as a directed graph

a page with a unique URL can be a node in the graph and we can have a directed edge if a page

contains link to another page now we can have a self loop in this graph because it's very much

possible for a web page to have a link to itself have a look at this web page

mycodeschool.com/videos in the header we have links for workouts page problems page and videos page

right now i'm already on videos page but i can still click on videos link and all that will

happen with the click is a refresh because i'm already on videos page my origin and

destination are same here so if i'm representing worldwide web as a directed graph the way we just

discussed then we have a self loop here now the next special type of edge that i want to talk about

is multi edge an edge is called a multi edge if it occurs more than once in a graph once again

we can have a multi edge in both directed and undirected graphs first multi edge that i'm showing you

here is undirected and the second one is directed now once again the question why should we ever

have a multi edge well let's say we are representing flight network between cities as a graph a city

would be a node and we can have an edge if there is a direct flight connection between any two cities

but then there can be multiple flights between a pair of cities these flights would have different

names and may have different costs if i want to keep the information about all the flights

in my graph i can draw multi edges i can draw one directed edge for each flight and then i can

label an edge with its cost or any other property i just labeled edges here with some random flight

numbers now as we were saying earlier self loops and multi edges often complicate working with

graphs their presence means we need to take extra care while solving problems if a graph contains

no self loop or multi edge it's called a simple graph in our lessons we will mostly be dealing

with simple graphs now i want you to answer a very simple question given number of vertices

in a simple graph that is a graph with no self loop or multi edge what would be maximum possible

number of edges well let's see let's say we want to draw a directed graph with four vertices

i have drawn four vertices here i'll name these vertices v1 v2 v3 and v4 so this is my set of vertices

number of elements in set v is four now it's perfectly fine if i choose not to draw any edge here

this will still be a graph set of edges can be empty nodes can be totally disconnected

so minimum possible number of edges in a graph is zero now if this is a directed graph

what do you think can be maximum number of edges here well each node can have directed edges to

all other nodes in this figure here each node can have directed edges to three other nodes

we have four nodes in total so maximum possible number of edges here is four into three that is

twelve i have shown edges originating from a vertex in same color here this is the maximum

that we can draw if there is no self loop or multi edge in general if there are n vertices

then maximum number of edges in a directed graph would be n into n minus one so in a simple

directed graph number of edges would be in this range zero to n into n minus one now what do you

think would be the maximum for an undirected graph in an undirected graph we can have only one

bi-directional edge between a pair of nodes we can't have two edges in different directions

so here the maximum would be half of the maximum for directed so if the graph is simple and undirected

number of edges would be in the range zero to n into n minus one by two remember this is true only

if there is no self loop or multi edge now if you can see number of edges in a graph can be

really really large compared to number of vertices for example if number of vertices in a directed

graph is equal to ten maximum number of edges would be ninety if number of vertices is hundred

maximum number of edges would be ninety nine hundred maximum number of edges would be close

to square of number of vertices a graph is called dense if number of edges in the graph is close to

maximum possible number of edges that is if the number of edges is of the order of square of

number of vertices and a graph is called sparse if the number of edges is really less typically

close to number of vertices and not more than that there is no defined boundary for what can

be called dense and what can be called sparse it all depends on context but this is an important

classification while working with graphs a lot of decisions are made based on whether the graph is

dense or sparse for example we typically choose a different kind of storage structure in computer's

memory for a dense graph we typically store a dense graph in something called adjacency matrix

and for a sparse graph we typically use something called adjacency list
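The edge-count limits for a simple graph discussed above can be written down as a small helper. This is just a sketch; the function names are made up, and since the lesson says there is no defined boundary between dense and sparse, the code only reports what fraction of the possible edges is present and leaves the judgment to the caller.

```cpp
#include <cassert>
#include <cstdint>

// Maximum number of edges in a simple graph (no self loops, no multi edges)
// with n vertices: n(n-1) if directed, n(n-1)/2 if undirected.
std::uint64_t max_edges(std::uint64_t n, bool directed) {
    if (n == 0) return 0;
    std::uint64_t ordered_pairs = n * (n - 1);
    return directed ? ordered_pairs : ordered_pairs / 2;
}

// Fraction of the possible edges that are actually present (0.0 to 1.0).
// A value near 1.0 suggests a dense graph; a value near 0.0 a sparse one.
double density(std::uint64_t n, std::uint64_t edge_count, bool directed) {
    std::uint64_t limit = max_edges(n, directed);
    assert(edge_count <= limit);  // a simple graph cannot exceed the limit
    if (limit == 0) return 0.0;
    return static_cast<double>(edge_count) / static_cast<double>(limit);
}
```

For example max_edges(4, true) is 12 and max_edges(100, true) is 9900, matching the numbers worked out in the lesson.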

i'll be talking about adjacency matrix and adjacency list in next lesson okay now the next concept

that i want to talk about is concept of path in a graph a path in a graph is a sequence of vertices

where each adjacent pair in the sequence is connected by an edge i'm highlighting a path here in this

example graph the sequence of vertices a b f h is a path in this graph now we have an undirected

graph here edges are bidirectional in a directed graph all edges must also be aligned in one direction

the direction of the path a path is called simple path if no vertices are repeated and if vertices

are not repeated then edges will also not be repeated so in a simple path both vertices and edges are

not repeated this path a b f h that i have highlighted here is a simple path but we could also have a

path like this here start vertex is a and end vertex is d in this path one edge and two vertices

are repeated in graph theory there is some inconsistency in use of this term path most of

the time when we say path we mean a simple path and if repetition is possible we use this term

walk so a path is basically a walk in which no vertices or edges are repeated a walk is called

a trail if vertices can be repeated but edges cannot be repeated i'm highlighting a trail here

in this example graph okay now i want to say this once again walk and path are often used as synonyms

but most often when we say path we mean simple path a path in which vertices and edges are not

repeated between two different vertices if there is a walk in which vertices or edges are repeated

like this walk that i'm showing you here in this example graph then there must also be a path

or simple path that is a walk in which vertices or edges would not be repeated in this walk that

i'm showing you here we are starting at a and we are ending our walk at c there is a simple path

from a to c with just one edge all we need to do is we need to avoid going to b e h d and then

coming back again to a so this is why we mostly talk about simple path between two vertices

because if any other walk is possible simple path is also possible and it makes most sense to look

for a simple path so this is what i'm going to do throughout our lessons i'm going to say path

and by path i'll mean simple path and if it's not a simple path i'll say it explicitly
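The difference between a walk, a trail and a simple path can be checked mechanically for a given sequence of vertices. Below is a rough sketch for an undirected graph stored as a boolean matrix. The edge set is an assumption (vertices a to h numbered 0 to 7, with edges chosen to be consistent with the sequences mentioned above), and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using AdjMatrix = std::vector<std::vector<bool>>;

// Assumed undirected example graph on a..h (indices 0..7).
AdjMatrix example_graph() {
    const std::vector<std::pair<int, int>> edges = {
        {0, 1}, {0, 2}, {0, 3}, {1, 4}, {1, 5},
        {2, 6}, {3, 7}, {4, 7}, {5, 7}, {6, 7}};
    AdjMatrix adj(8, std::vector<bool>(8, false));
    for (const auto& e : edges) {
        adj[e.first][e.second] = true;
        adj[e.second][e.first] = true;  // undirected: mark both directions
    }
    return adj;
}

// Walk: every adjacent pair in the sequence is connected by an edge.
bool is_walk(const AdjMatrix& adj, const std::vector<int>& seq) {
    for (int v : seq) {
        assert(v >= 0 && static_cast<std::size_t>(v) < adj.size());
    }
    for (std::size_t i = 0; i + 1 < seq.size(); ++i) {
        if (!adj[seq[i]][seq[i + 1]]) return false;
    }
    return !seq.empty();
}

// Trail: a walk in which no edge is repeated (vertices may repeat).
bool is_trail(const AdjMatrix& adj, const std::vector<int>& seq) {
    if (!is_walk(adj, seq)) return false;
    std::set<std::pair<int, int>> used;
    for (std::size_t i = 0; i + 1 < seq.size(); ++i) {
        int u = std::min(seq[i], seq[i + 1]);
        int v = std::max(seq[i], seq[i + 1]);
        if (!used.insert(std::make_pair(u, v)).second) return false;  // edge seen before
    }
    return true;
}

// Simple path: a walk in which no vertex is repeated.
bool is_simple_path(const AdjMatrix& adj, const std::vector<int>& seq) {
    if (!is_walk(adj, seq)) return false;
    std::set<int> seen(seq.begin(), seq.end());
    return seen.size() == seq.size();
}
```

In this assumed graph the sequence a b f h is a simple path, while a b e h d a c is a trail but not a simple path because a is visited twice.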

a graph is called strongly connected if in the graph there is a path from any vertex to any

other vertex if it's an undirected graph we simply call it connected and if it's a directed graph

we call it strongly connected in leftmost and rightmost graphs that i'm showing you here

we have a path from any vertex to any other vertex but in this graph in the middle we do not have a

path from any vertex to any other vertex we cannot go from vertex c to a we can go from a to c but

we cannot go from c to a so this is not a strongly connected graph remember if it's an undirected

graph we simply say connected and if it's a directed graph we say strongly connected

if a directed graph is not strongly connected but can be turned into connected graph

by treating all edges as undirected then such a directed graph is called weakly connected
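One straightforward way to test these definitions is to run a search from every vertex and see whether it reaches all the others. The sketch below does exactly that. It is a simple illustration rather than the most efficient method, and the names Graph, count_reachable, is_strongly_connected and is_weakly_connected are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Directed graph as an adjacency list: adj[u] lists every v with an edge u -> v.
using Graph = std::vector<std::vector<int>>;

// Number of vertices reachable from `start` (including start itself),
// found with an iterative depth-first search.
std::size_t count_reachable(const Graph& adj, int start) {
    assert(start >= 0 && static_cast<std::size_t>(start) < adj.size());
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> stack{start};
    visited[start] = true;
    std::size_t count = 0;
    while (!stack.empty()) {
        int u = stack.back();
        stack.pop_back();
        ++count;
        for (int v : adj[u]) {
            if (!visited[v]) {
                visited[v] = true;
                stack.push_back(v);
            }
        }
    }
    return count;
}

// Strongly connected: there is a path from any vertex to any other vertex.
// One search per vertex, so roughly O(V * (V + E)) time.
bool is_strongly_connected(const Graph& adj) {
    for (std::size_t s = 0; s < adj.size(); ++s) {
        if (count_reachable(adj, static_cast<int>(s)) != adj.size()) return false;
    }
    return true;
}

// Weakly connected: connected once all edges are treated as undirected.
bool is_weakly_connected(const Graph& adj) {
    if (adj.empty()) return true;
    Graph undirected(adj.size());
    for (std::size_t u = 0; u < adj.size(); ++u) {
        for (int v : adj[u]) {
            undirected[u].push_back(v);
            undirected[v].push_back(static_cast<int>(u));
        }
    }
    return count_reachable(undirected, 0) == adj.size();
}
```

For an undirected graph stored with both directions of every edge, is_strongly_connected is the same as the plain connected check.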

if we just ignore the directions of the edges here this is connected but i would recommend that

you just remember connected and strongly connected this leftmost undirected graph is connected

i removed one of the edges and now this is not connected now we have two disjoint connected

components here but the graph overall is not connected connectedness of a graph is a really

important property if you remember intra city road network road network within a city that would

have a lot of one ways can be represented as a directed graph now an intra city road network

should always be strongly connected we should be able to reach any street from any street

any intersection to any intersection okay now that we understand concept of a path

next i want to talk about cycle in a graph a walk is called a closed walk if it starts and ends

at same vertex like what i'm showing here and there is one more condition the length of the walk

must be greater than zero length of a walk or path is number of edges in the path

like for this closed walk that i'm showing you here length is five because we have five edges in

this walk so a closed walk is walk that starts and ends at same vertex and the length of which

is greater than zero now some may call closed walk a cycle but generally we use the term cycle for

a simple cycle a simple cycle is a closed walk in which other than start and end vertices

no other vertex or edge is repeated right now what i'm showing you here in this example graph

is a simple cycle or we can just say cycle a graph with no cycle is called an acyclic graph

a tree if drawn with undirected edges would be an example of an undirected acyclic graph

here in this tree we can have a closed walk but we cannot have a simple cycle in this closed walk

that i'm showing you here an edge is repeated there would be no simple cycle in a tree

and apart from tree we can have other kind of undirected acyclic graphs also
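Whether an undirected graph is acyclic can be checked in several ways. The sketch below uses a union-find (disjoint set) structure, which is a technique not covered in this lesson and is used here only as one convenient option. The function name and the edge-pair representation are assumptions for the example.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Returns true if the undirected graph with vertices 0..n-1 and the given
// edges contains a cycle. A self loop or a repeated edge also counts as a
// cycle here, so the check is most meaningful for simple graphs.
bool has_cycle_undirected(int n, const std::vector<std::pair<int, int>>& edges) {
    assert(n >= 0);
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);  // each vertex starts as its own component

    // Finds the representative of x's component, shortening the chain as it goes.
    auto find_root = [&parent](int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    };

    for (const auto& e : edges) {
        int a = find_root(e.first);
        int b = find_root(e.second);
        if (a == b) return true;  // endpoints already connected: this edge closes a cycle
        parent[a] = b;            // otherwise merge the two components
    }
    return false;
}
```

A tree with four vertices and three edges gives false, and so does a graph made of two separate trees, which is one of the other kinds of undirected acyclic graph mentioned above.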

a tree also has to be connected now we can also have a directed acyclic graph

as you can see here also we do not have any cycle you cannot have a path of length

greater than zero starting and ending at the same vertex a directed acyclic graph is often called

a dag cycles in a graph cause a lot of issues in designing algorithms for problems like

finding shortest route from one vertex to another and we will talk about cycles a lot when we will

study some of these advanced algorithms in coming lessons for this lesson i'll stop here now

in our next lesson we will discuss ways of creating and storing graph in computer's memory

this is it for this lesson thanks for watching hello everyone in our previous lessons we introduced

you to graphs and we also looked at and talked about some of the properties of graph but so far

we have not discussed how we can implement graph how we can create a logical structure like graph

in computer's memory so let us try to discuss this a graph as we know contains a set of vertices

and a set of edges and this is how we define graph in pure mathematical terms a graph g

is defined as an ordered pair of a set v of vertices and a set e of edges now to create

and store a graph in computer's memory the simplest thing that we probably can do is that

we can create two lists one to store all the vertices and another to store all the edges

for a list we can use an array of appropriate size or we can use an implementation of a dynamic

list in fact we can use a dynamic list available to us in language libraries something like vector

in c++ or array list in Java now a vertex is identified by its name so the first list the list

of vertices would simply be a list of names or strings i just filled in names of all the vertices

for this example graph here now what should we fill in this edge list here an edge is identified

by its two endpoints so what we can do is we can create an edge as an object with two fields

we can define edge as a structure or class with two fields one to store the start vertex

and another to store the end vertex edge list would basically be an array or list of this type

struct edge in these two definitions of edge that i have written here in the first one i have used

character pointers because in c we typically use character pointers to store or refer to strings

we could use character array also in c++ or Java where we can create classes we have string

available to us as a data type so we can use that also so we can use any of these for the fields

we can use character pointer or character array or string data type if it's available depends on

how you want to design your implementation now let's fill this edge list here for this example graph

each row now here has two boxes let's say the first one is to store the start vertex and the

second one is to store the end vertex the graph that we have here is an undirected graph so any

vertex can be called start vertex and any vertex can be called end vertex

order of the vertices is not important here we have nine edges here one between a and b

another between a and c another between a and d and then we have b-e and b-f instead of having b-f

as an entry we could also have f-b but we just need one of them and then we have c-g d-h

e-h and f-h actually there's one more we also have g-h we have 10 edges in total here and not nine

now once again because this is an undirected graph if we are saying that there is an edge from f

to h we are also saying that there is an edge from h to f there is no need to have another

entry as h-f we will unnecessarily be using extra memory if this was a directed graph f-h and h-f

would have meant two different connections which is the start vertex and which is the end vertex

would have mattered maybe in case of undirected graphs we should name the fields as first vertex

and second vertex and in case of directed graphs we should name the fields as start vertex and

end vertex now our graph here could also be a weighted graph we could have some cost or weight

associated with the edges as we know in an unweighted graph cost of all the connections is equal

but in a weighted graph different connections would have different weight or different cost

now in this example graph here i have associated some weights to these edges now how do you think

we should store this data the weight of edges well if the graph is weighted we can have one more field

in the edge object to store the weight now an entry in my edge list has three fields

one to store the start vertex one to store the end vertex and one more to store the weight
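The two-list storage described so far, a vertex list of names and an edge list of objects with start vertex, end vertex and weight, might be sketched in C++ like this. The struct and function names and all the weights are assumptions for illustration; the ten edges follow the ones listed for the example graph.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One entry of the edge list: two endpoint names and a weight.
struct Edge {
    std::string start_vertex;
    std::string end_vertex;
    int weight;
};

// The whole graph: a list of vertex names plus a list of edges.
struct EdgeListGraph {
    std::vector<std::string> vertices;
    std::vector<Edge> edges;
};

// Appends an edge; both endpoints are expected to be in the vertex list already.
void add_edge(EdgeListGraph& g, const std::string& from, const std::string& to, int weight) {
    assert(std::find(g.vertices.begin(), g.vertices.end(), from) != g.vertices.end());
    assert(std::find(g.vertices.begin(), g.vertices.end(), to) != g.vertices.end());
    g.edges.push_back({from, to, weight});
}

// Example graph with 8 vertices and 10 undirected edges (weights are made up).
EdgeListGraph example_graph() {
    EdgeListGraph g;
    g.vertices = {"A", "B", "C", "D", "E", "F", "G", "H"};
    add_edge(g, "A", "B", 5);
    add_edge(g, "A", "C", 7);
    add_edge(g, "A", "D", 3);
    add_edge(g, "B", "E", 2);
    add_edge(g, "B", "F", 10);
    add_edge(g, "C", "G", 1);
    add_edge(g, "D", "H", 11);
    add_edge(g, "E", "H", 9);
    add_edge(g, "F", "H", 4);
    add_edge(g, "G", "H", 6);
    return g;
}
```

Because the graph is undirected, each connection is stored once; an entry for F and H also stands for the connection from H to F.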

so this is one possible way of storing a graph we can simply create two lists one to store the

vertices and another to store the edges but this is not very efficient for any possible way of

storing and organizing data we must also see its cost and when we say cost we mean two things

time cost of various operations and the memory usage typically we measure the rate of growth of

time taken with size of input or data what we also call time complexity

and we measure the rate of growth of memory consumed with size of input or data what we also

call space complexity time and space complexities are most commonly expressed in terms of what we

call big o notation for this lesson i'm assuming that you already know about time and space complexity

analysis and big o notation if you want to revise some of these concepts then you can check the

description of this video for link to some lessons we always want to minimize the time cost of most

frequently performed operations and we always want to make sure that we do not consume

unreasonably high memory okay so let's now analyze this particular structure that we are

trying to use to store our graph let's first discuss the memory usage for the first list

the vertex list least number of rows needed or consumed would be equal to number of vertices

now each row here in this vertex list is a name or string and string can be of any length

right now all strings have just one character because i simply named the nodes a b c and so on

but we could have names with multiple characters and because strings can be of different lengths

all rows may not be consuming the same amount of memory like here here i'm showing an

intercity road network as a weighted graph cities are my nodes and road distances are my weights

now for this graph as you can see names are of different lengths so all rows in vertex list or

all rows in edge list would not cost us same more characters will cost us more bytes

but we can safely assume that the names will not be too long we can safely assume that in almost

all practical scenarios average length of strings will be a really small value if we

assume it to be always lesser than some constant then the total space consumed in this vertex list

will be proportional to the number of rows consumed that is the number of vertices or in other words

we can say that space complexity here is big o of number of vertices this is how we write number

of vertices with two vertical bars what we basically mean here is number of elements in set V

now for the edge list once again we are storing strings in first two fields of the edge object

so once again each row here will not consume same amount of memory but if we are just storing

the reference or pointer to a string like here in the first row instead of having values filled

in these two fields we could have references or pointers to the names in the vertex list

if we will design things like this each row will consume same memory this in fact is better

because references in most cases would cost us a lot lesser than a copy of the name and as

reference we can have the actual address of the string and that's what we are doing when we are

saying that start vertex and end vertex can be character pointers or maybe a better design would be

simply having the index of the name or string in vertex list let's say a is at index 0 in the

vertex list and b is at index 1 and c is at index 2 and i'll go on like this

now for start vertex and end vertex we can have two integer fields

as you can see in both my definitions of edge start vertex and end vertex are of type int now

and in each row of edge list first and second field are filled with integer values
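The index-based design just described, where an edge stores positions in the vertex list instead of names, could look roughly like this. The names Edge, Graph, vertex_name and example_graph are hypothetical and the weights are made up; only the vertex order a to h and the ten edges come from the lesson.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Every edge now has a fixed size: two indices into the vertex list and a weight.
struct Edge {
    int start_vertex;
    int end_vertex;
    int weight;
};

struct Graph {
    std::vector<std::string> vertices;  // vertices[i] is the name of vertex i
    std::vector<Edge> edges;
};

// Going from an index to a name is a constant-time array access.
const std::string& vertex_name(const Graph& g, int index) {
    assert(index >= 0 && static_cast<std::size_t>(index) < g.vertices.size());
    return g.vertices[index];
}

// A is at index 0, B at index 1, and so on (weights are assumed values).
Graph example_graph() {
    Graph g;
    g.vertices = {"A", "B", "C", "D", "E", "F", "G", "H"};
    g.edges = {
        {0, 1, 5}, {0, 2, 7}, {0, 3, 3}, {1, 4, 2}, {1, 5, 10},
        {2, 6, 1}, {3, 7, 11}, {4, 7, 9}, {5, 7, 4}, {6, 7, 6}};
    return g;
}
```

Memory use is one name per vertex plus one fixed-size record per edge, which is where the big o of number of vertices plus number of edges estimate comes from.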

i have filled in appropriate values of indices this definitely is a better design and if you can

see now each row in edge list would cost us the same amount of memory so overall space consumed

in edge list would be proportional to number of edges or in other words space complexity here is

big o of number of edges okay so this is analysis of our memory usage overall space complexity of

this design would be big o of number of vertices plus number of edges is this memory usage

unreasonably high well we cannot do a lot better than this if we want to store a graph

in computer's memory so we are all right in terms of memory usage now let's discuss time cost of

operations what do you think can be most frequently performed operations while working with graph

one of the most frequently performed operations while working with graph would be finding all

nodes adjacent to a given node that is finding all nodes directly connected to a given node

what do you think would be time cost of finding all nodes directly connected to a given node

well we will have to scan the whole edge list we will have to perform a linear search we will

have to go through all the entries in the list and see if the start or end node in the entry is

our given node for a directed graph we would see if the start node in the entry is our given node

or not and for an undirected graph we would see both the start as well as the end node running time

would be proportional to number of edges or in other words time complexity of this operation would

be big o of number of edges okay now another frequently performed operation can be

finding if two given nodes are connected or not in this case also we will have to perform

a linear search on the edge list in worst case we will have to look at all the entries in the edge

list so worst case running time would be proportional to number of edges so for this operation too

time complexity is big o of number of edges now let's try to see how good or bad

this running time big o of number of edges is if you remember this discussion from our previous

lesson in a simple graph in a graph with no self loop or multi-edge if number of vertices

that is the number of elements in set v is equal to n then maximum number of edges

would be n into n minus one if the graph is directed each node will be connected to every

other node and of course minimum number of edges can be zero we can have a graph with no edge

maximum number of edges would be n into n minus one by two if the graph is undirected
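The two edge-list operations discussed above, finding the nodes adjacent to a given node and checking whether two nodes are connected, both come down to a linear scan. Here is a minimal sketch assuming edges hold integer indices; the function names and the directed flag are choices made for this example.

```cpp
#include <cassert>
#include <vector>

struct Edge {
    int start_vertex;
    int end_vertex;
};

// Scans the whole edge list: O(E). For a directed graph only the start
// vertex is compared; for an undirected graph both endpoints are.
std::vector<int> adjacent_nodes(const std::vector<Edge>& edges, int node, bool directed) {
    assert(node >= 0);
    std::vector<int> result;
    for (const Edge& e : edges) {
        if (e.start_vertex == node) {
            result.push_back(e.end_vertex);
        } else if (!directed && e.end_vertex == node) {
            result.push_back(e.start_vertex);
        }
    }
    return result;
}

// Also a linear search: in the worst case every entry is examined.
bool are_connected(const std::vector<Edge>& edges, int u, int v, bool directed) {
    for (const Edge& e : edges) {
        if (e.start_vertex == u && e.end_vertex == v) return true;
        if (!directed && e.start_vertex == v && e.end_vertex == u) return true;
    }
    return false;
}

// The ten edges of the example graph, with a..h numbered 0..7.
std::vector<Edge> example_edges() {
    return {{0, 1}, {0, 2}, {0, 3}, {1, 4}, {1, 5},
            {2, 6}, {3, 7}, {4, 7}, {5, 7}, {6, 7}};
}
```

Both functions touch every entry in the worst case, so their cost grows with the number of edges, which can approach the square of the number of vertices.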

but all in all if you can see number of edges can go almost up to square of number of vertices

number of edges can be of the order of square of number of vertices let's denote number of vertices

here as small v so number of edges can be of the order of v square in a graph typically any

operation running in order of number of edges would be considered very costly we try to keep

things in order of number of vertices when we are comparing the two running times this is very obvious

big o of v is a lot better than big o of v square all in all this vertex list and edge

list kind of representation is not very efficient in terms of time cost of operations we should

think of some other efficient design we should think of something better we will talk about

another possible way of storing and representing graph in next lesson this is it for this lesson

thanks for watching so in our previous lesson we discussed one possible way of

storing and representing a graph in which we used two lists one to store the vertices

and another to store the edges a record in vertex list here is name of a node and a

record in edge list is an object containing references to the two end points of an edge

and also the weight of that edge because this example graph that i'm showing you here is

a weighted graph we called this kind of representation edge list representation

but we realized that this kind of storage is not very efficient in terms of time cost of most

frequently performed operations like finding nodes adjacent to a given node

or finding if two nodes are connected or not to perform any of these operations we need to scan

the whole edge list we need to perform a linear search on the edge list so the time complexity is

big o of number of edges and we know that number of edges in a graph can be really really large

in worst case it can be close to square of number of vertices in a graph anything running in order

of number of edges is considered very costly we often want to keep the cost in order of number

of vertices so we should think of some other efficient design we should think of something

better than this one more possible design is that we can store the edges in a two-dimensional

array or matrix we can have a two-dimensional matrix or array of size v cross v where v is

number of vertices as you can see i have drawn an 8 cross 8 array here because number of vertices

in my example graph here is 8 let's name this array a now if we want to store a graph that is

unweighted let's just remove the weights from this example graph here and now our graph is

unweighted and if we have a value or index between zero and v minus one for each vertex

which we have here if we are storing the vertices in a vertex list then we have an index between

zero and v minus one for each vertex we can say that a is 0th node b is 1st node c is 2nd node and

so on we are picking up indices from the vertex list okay so if the graph is unweighted and each

vertex has an index between zero and v minus one then in this matrix or 2d array we can set

ith row and jth column that is aij as one or boolean value true if there is an edge from i to j

zero or false otherwise if I have to fill this matrix for this example graph here then I'll go

vertex by vertex vertex zero is connected to vertex one two and three vertex one is connected

to zero four and five this is an undirected graph so if we have an edge from zero to one

we also have an edge from one to zero so 1st row and 0th column should also be set as

one now let's go to node two it's connected to zero and six three is connected to zero and seven

four is connected to one and seven five once again is connected to one and seven

six is connected to two and seven and seven is connected to three four five and six

all the remaining positions in this array should be set as zero
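Filling the matrix as just described can be sketched as follows. The function and type names are assumptions for this example; the edges are the ten connections read out above, with a to h numbered 0 to 7.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using AdjMatrix = std::vector<std::vector<int>>;

// Builds a V x V matrix where a[i][j] is 1 if there is an edge from i to j
// and 0 otherwise. For an undirected graph both a[i][j] and a[j][i] are set.
AdjMatrix build_adjacency_matrix(std::size_t v,
                                 const std::vector<std::pair<int, int>>& edges,
                                 bool directed) {
    AdjMatrix a(v, std::vector<int>(v, 0));  // all positions start as zero
    for (const auto& e : edges) {
        assert(e.first >= 0 && static_cast<std::size_t>(e.first) < v);
        assert(e.second >= 0 && static_cast<std::size_t>(e.second) < v);
        a[e.first][e.second] = 1;
        if (!directed) a[e.second][e.first] = 1;
    }
    return a;
}

// The example graph's edges.
std::vector<std::pair<int, int>> example_edges() {
    return {{0, 1}, {0, 2}, {0, 3}, {1, 4}, {1, 5},
            {2, 6}, {3, 7}, {4, 7}, {5, 7}, {6, 7}};
}
```

With directed set to false, row 0 has ones in columns 1, 2 and 3, and row 7 has ones in columns 3, 4, 5 and 6.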

Notice that this matrix is symmetric. For an undirected graph this matrix would be symmetric

because A[i][j] would be equal to A[j][i]. We would have two positions filled for each edge.

In fact, to see all the edges in the graph we need to go through only one of these two halves.

Now this would not be true for a directed graph. Only one position will be filled for each edge

and we will have to go through the entire matrix to see all the edges. Okay, now this kind of representation

of a graph in which edges or connections are stored in a matrix or 2D array is called

adjacency matrix representation. This particular matrix that I have drawn here is an adjacency matrix.

now with this kind of storage or representation what do you think would be the time cost of finding

all nodes adjacent to a given node let's say given this vertex list and adjacency matrix

we want to find all nodes adjacent to the node named F. If we are given the name of a node, then we first need

to know its index, and to know the index we will have to scan the vertex list. There is no other way.

Once we figure out the index, like for F the index is 5, then we can go to the row with that index in

the adjacency matrix and we can scan this complete row to find all the adjacent nodes. Scanning the

vertex list to figure out the index in worst case will cost us time proportional to the number of

vertices because in worst case we may have to scan the whole list and scanning a row in the

adjacency matrix would once again cost us time proportional to number of vertices because in a row

we would have exactly v columns where v is number of vertices so overall time cost of this operation

is big O of V. Now, most of the time while performing operations we must pass indices to avoid scanning

the vertex list all the time. If we know an index, we can figure out the name in constant time

because in an array we can access the element at any index in constant time. But if we know a name and

want to figure out the index, then it will cost us big O of V. We will have to scan the vertex list,

we will have to perform a linear search on it. Okay, moving on. Now what would be the time cost of

finding if two nodes are connected or not now once again the two nodes can be given to us as indices

or names if the nodes would be passed as indices then we simply need to look at value in a particular

row and particular column we simply need to look at aij for some values of i and j and this will

cost us constant time you can look at value in any cell in a two-dimensional array in constant

time so if indices are given time complexity of this operation would be big o of 1 which simply

means that we will take constant time but if names are given then we also need to do the scanning

to figure out the indices which will cost us big o of v overall time complexity would be big o of v
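The two operations analyzed so far can be sketched like this. It is a rough illustration rather than the lesson's own code. The matrix is assumed to be a square `std::vector` of rows, and `indexOf`, `neighbors`, `isConnected` and `isConnectedByName` are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // assumed square, V x V

// Linear search on the vertex list: O(V). Returns -1 if the name is absent.
int indexOf(const std::vector<std::string>& vertexList, const std::string& name) {
    for (std::size_t i = 0; i < vertexList.size(); ++i)
        if (vertexList[i] == name) return static_cast<int>(i);
    return -1;
}

// Finding all adjacent nodes means scanning one complete row: O(V).
std::vector<int> neighbors(const Matrix& a, int i) {
    std::vector<int> result;
    for (std::size_t j = 0; j < a[i].size(); ++j)
        if (a[i][j] != 0) result.push_back(static_cast<int>(j));
    return result;
}

// With indices given, checking a connection is a single cell read: O(1).
bool isConnected(const Matrix& a, int i, int j) { return a[i][j] != 0; }

// With names given, the two vertex-list scans dominate: O(V) overall.
bool isConnectedByName(const std::vector<std::string>& vertexList, const Matrix& a,
                       const std::string& from, const std::string& to) {
    const int i = indexOf(vertexList, from);
    const int j = indexOf(vertexList, to);
    return i >= 0 && j >= 0 && isConnected(a, i, j);
}
```

The by-name version does exactly the same cell read as the by-index version. All of its extra cost comes from `indexOf`.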

the constant time access would not mean anything the scanning of vertex list all the time to figure

out the indices can be avoided we can use some extra memory to create a hash table with names

and indices as key value pairs and then the time cost of finding index from name would also be big

o of 1 that is constant hash table is a data structure and i have not talked about it in

any of my lessons so far if you do not know about hash table just search online for a basic idea of

it. Okay, so as you can see, with adjacency matrix representation our time cost of some of the most

frequently performed operations is in order of number of vertices and not in order of number of

edges, which can be as high as square of number of vertices. Okay, now if we want to store a weighted

graph in adjacency matrix representation, then A[i][j] in the matrix can be set as the weight of an edge.

For non-existent edges we can have a default value like a really large or maximum possible

integer value that is never expected to be an edge weight i have just filled in infinity here

to mean that we can choose the default as infinity minus infinity or any other value that would never

ever be a valid edge weight okay now for further discussion i'll come back to an unweighted graph
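A weighted adjacency matrix with such a default value might look like the sketch below. The sentinel here is the largest `int`, which is an assumption of this sketch: it only works if no real edge ever carries that weight. `NO_EDGE` and `WeightedMatrix` are illustrative names.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Sentinel standing in for "infinity": assumed never to be a valid edge weight.
constexpr int NO_EDGE = std::numeric_limits<int>::max();

struct WeightedMatrix {
    std::vector<std::vector<int>> w;  // w[i][j] holds the weight of edge i -> j

    // Every cell starts as NO_EDGE, so only real edges need to be written.
    explicit WeightedMatrix(int v) : w(v, std::vector<int>(v, NO_EDGE)) {}

    void addUndirectedEdge(int i, int j, int weight) {
        assert(weight != NO_EDGE);  // the sentinel must stay unambiguous
        w[i][j] = weight;
        w[j][i] = weight;
    }

    bool hasEdge(int i, int j) const { return w[i][j] != NO_EDGE; }
};
```

One thing to watch with this particular sentinel: adding anything to it overflows, so code that sums weights along a path has to check `hasEdge` first.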

Adjacency matrix looks really good, so should we not use it always? Well, with this design we have

improved on time, but we have gone really high on memory usage. Instead of using memory units

exactly equal to number of edges, which is what we were doing with an edge list kind of storage,

here we are using exactly V square units of memory. We are using big O of V square space.

We are not just storing the information that these two nodes are connected, we are also storing

the not of it, that is, these two nodes are not connected, which probably is redundant information.

If a graph is dense, if the number of edges is really close to V square, then this is good, but if

the graph is sparse, that is, if number of edges is a lot less than V square, then we are wasting

a lot of memory in storing these zeros. Like for this example graph that I have drawn here, in the edge

list we were consuming 10 units of memory, we had 10 rows consumed in the edge list, but here we are

consuming 64 units. Most graphs with a really large number of vertices would not be very dense,

would not have number of edges anywhere close to V square. Like for example, let's say we are

modeling a social network like Facebook as a graph such that a user in the network is a node

and there is an undirected edge if two users are friends. Facebook has a billion users,

but i'm showing only a few in my example graph here because i'm short of space

let's just assume that we have a billion users in our network so number of vertices in our graph is

10 to the power 9 which is a billion now do you think number of connections in our social network

can ever be close to square of number of users that will mean everyone in the network is a friend

of everyone else a user of our social network will not be friend to all other billion users

we can safely assume that a user on an average would not have more than a thousand friends

with this assumption we would have 10 to the power 12 edges in our graph

actually this is an undirected graph so we should do a divide by two here so that we do not count

an edge twice so if average number of friends is thousand then total number of connections in

my graph is 5 into 10 to the power 11 now this is a lot lesser than square of number of vertices

so basically if we would use an adjacency matrix for this kind of a graph we would waste a hell lot

of space and moreover even if we are not looking in relative terms 10 to the power 18 units of memory

even in absolute sense is a lot 10 to the power 18 bytes would be about a thousand petabytes

now this really is a lot of space this much data would never ever fit on one physical disc

5 into 10 to the power 11 bytes on the other hand is just 0.5 terabytes
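The numbers above can be checked with a few lines of C++. This is only the arithmetic from the lesson written out under its own assumptions: one unit of memory per matrix cell or per edge, a billion users, and a thousand friends on average. The names are made up for the sketch.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kUsers = 1000000000ULL;  // V = 10^9
constexpr std::uint64_t kAvgFriends = 1000ULL;   // assumed average degree

// Cells in a V x V adjacency matrix.
constexpr std::uint64_t matrixCells(std::uint64_t v) { return v * v; }

// Edges in an undirected graph: each friendship is counted once, hence / 2.
constexpr std::uint64_t undirectedEdges(std::uint64_t v, std::uint64_t avgDegree) {
    return v * avgDegree / 2;
}

// 10^18 cells for the matrix versus 5 * 10^11 edges actually present.
static_assert(matrixCells(kUsers) == 1000000000000000000ULL, "V^2 = 10^18");
static_assert(undirectedEdges(kUsers, kAvgFriends) == 500000000000ULL, "E = 5 * 10^11");
```

Under these assumptions the matrix needs two million times as many cells as there are edges.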

a typical personal computer these days would have this much of storage so as you can see for

something like a large social graph adjacency matrix representation is not very efficient

Adjacency matrix is good when a graph is dense, that is, when the number of edges is close to

square of number of vertices, or sometimes when total number of possible connections, that is V

square, is so small that wasted space would not even matter. But most real world graphs would be sparse

and adjacency matrix would not be a good fit. Let's think about another example. Let's think about

the World Wide Web as a directed graph. If you can think of web pages as nodes in a graph and hyperlinks

as directed edges, then a web page would not have links to all other web pages, and once again

the number of web pages would be in order of millions. A web page would have links to only a few other

web pages so the graph would be sparse most real world graphs would be sparse

and adjacency matrix even though it's giving us good running time for most frequently performed

operations would not be a good fit because it's not very efficient in terms of space

so what should we do well there's another representation that gives us similar or maybe

even better running time than adjacency matrix and does not consume so much space

it's called adjacency list representation and we will talk about it in our next lesson

This is it for this lesson. Thanks for watching.

So in our previous lesson we talked about adjacency

matrix as a way to store and represent a graph and as we discussed and analyzed this data structure

we saw that it's very efficient in terms of time cost of operations with this data structure it costs

big o of 1 that is constant time to find if two nodes are connected or not and it costs big o of

v where v is number of vertices to find all nodes adjacent to a given node but we also saw that

adjacency matrix is not very efficient when it comes to space consumption we consume space

in order of square of number of vertices in adjacency matrix representation as you know

we store edges in a two-dimensional array or matrix of size v cross v where v is number of

vertices in my example graph here we have eight vertices that's why i have an eight cross eight

matrix here we are consuming eight square that is 64 units of space here now what's basically

happening is that for each vertex for each node we have a row in this matrix where we are storing

information about all its connections this is the row for the zeroth node that is a this is the row

for the one-th node that is b this is for c and we can go on like this so each node has got a row

and a row is basically a one-dimensional array of size equal to number of vertices that is v

and what exactly are we storing in a row let's just look at this first row in which we are storing

connections of node a this two-dimensional matrix or array that we have here is basically an array

of one-dimensional arrays so each row has to be a one-dimensional array so how are we storing

the connections of node a in these eight cells in this one-dimensional array of size eight

A zero in the zeroth position means that there is no edge starting at A and ending at the

zeroth node, which again is A. An edge starting and ending at itself is called a self loop,

and there is no self loop on A. A one in the one-th position here means that there is an edge

from A to the one-th node, that is B. The way we are storing information here is that

index or position in this one-dimensional array is being used to represent the end point of an edge.

for this complete row for this complete one-dimensional array

start is always the same it's always the zeroth node that is a in general in the

adjacency matrix row index represents the start point and column index represents the end point

now here when we are looking only at the first row start is always a and the indices 0 1 2 and so

on are representing the end points and the value at a particular index or position tells us whether

we have an edge ending at that node or not one here means that the edge exists 0 would have meant

that the edge does not exist now when we are storing information like this if you can see

we are not just storing that b c and d are connected to a we are also storing the not of it

we are also storing the information that a e f g and h are not connected to a

if we are storing what all nodes are connected through that we can also deduce what all nodes

are not connected these zeros in my opinion are redundant information causing extra consumption

of memory most real-world graphs are sparse that is number of connections is really small compared to

total number of possible connections so most often there would be too many zeros and very few ones

think about it let's say we are trying to store connections in a social network like Facebook

in an adjacency matrix which would be the most impractical thing to do in my opinion

but anyway, for the sake of argument, let's say we are trying to do it. Just to store connections

of one user I would have a row or one-dimensional array of size 10 to the power nine. On an average

in a social network you would not have more than thousand friends. If I have thousand friends, then

in the row used to store my connections I would only have thousand ones, and the rest, that is 10 to the

power nine minus thousand, would be zeros. And I'm not trying to force you to agree, but just like me,

if you also think that these zeros are storing redundant information and are extra consumption

of memory, then even if we are storing these ones and zeros in just one byte as boolean values,

these many zeros here is almost one gigabyte of memory. The ones are just one kilobyte. So given

this problem, let's try to do something different here. Let's just try to keep the information that

these nodes are connected and get rid of the information that these nodes are not connected,

because it can be inferred, it can be deduced. And there are a couple of ways in which we can do this.

here to store connections of a instead of using an array such that index represents end point of

an edge and value at that particular index represents whether we have an edge ending there or not

we can simply keep a list of all the nodes to which we are connected this is the list or set of

nodes to which a is connected we can represent this list either using the indices or using

the actual names for the nodes let's just use indices because names can be long and may consume

more memory you can always look at the vertex list and find out the name in constant time now in our

machine we can store this set of nodes which basically is a set of integers in something as

simple as an array and this array as you can see is a different arrangement from our previous array

in our earlier arrangement index was representing index of a node in the graph and value was

representing whether there was a connection to that node or not here index does not represent

anything and the values are the actual indices of the nodes to which we are connected
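To make the difference between the two arrangements concrete, here is a small sketch that turns one matrix row into the compact list of neighbor indices. The function name `rowToNeighborList` is hypothetical and used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Input: one adjacency-matrix row, where the position is the end point
// and the value (0 or 1) says whether the edge exists.
// Output: only the end points that are present. Positions in the output
// carry no meaning, the values are the neighbor indices themselves.
std::vector<int> rowToNeighborList(const std::vector<int>& row) {
    std::vector<int> list;
    for (std::size_t j = 0; j < row.size(); ++j)
        if (row[j] != 0) list.push_back(static_cast<int>(j));
    return list;
}
```

For the row of node A in the example graph, eight cells shrink to the three values 1, 2 and 3.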

now instead of using an array here to store this set of integers we can also use a linked list

and why just array or linked list i would argue that we can also use a tree here in fact a binary

search tree is a good way to store a set of values there are ways to keep a binary search tree

balanced and if you always keep a binary search tree balanced you can perform search insertion

and deletion all three operations in order of log of number of nodes we will discuss cost of

operations for any of these possible ways in some time right now all i want to say is that there are

a bunch of ways in which we can store connections of a node for our example graph that we started with

instead of an adjacency matrix we can try to do something like this we are still storing

the same information. We are still saying that the 0th node is connected to the 1st, 2nd and 3rd nodes,

the 1st node is connected to the 0th, 4th and 5th nodes, the 2nd node is connected to the 0th and 6th nodes, and so on,

but we are consuming a lot less memory here. Programmatically, this adjacency matrix here

is just a two-dimensional array of size 8 cross 8, so we are consuming 64 units of space in total,

but this structure on the right does not have all the rows of the same size. How do you think we can create

such a structure programmatically? Well, it depends. In C or C++, if you understand pointers,

then we can create an array of pointers of size 8 and each pointer can point to a one-dimensional

array of different size. The 0th pointer can point to an array of size 3 because the 0th node has

three connections and we need an array of size 3. The 1st pointer can point to an array of size 3

because the 1st node also has three connections. The 2nd node however has only two connections,

so the 2nd pointer should point to an array of size 2, and we can go on like this. The 7th node has

four connections, so the 7th pointer should point to an array of size 4.
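One possible way to build this array of pointers is sketched below. It swaps the raw pointers described in the lesson for `std::unique_ptr<int[]>` so that the rows are freed automatically. That swap, and the names `JaggedGraph`, `buildJagged` and `totalCells`, are choices made for this sketch and not the lesson's code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct JaggedGraph {
    std::vector<std::unique_ptr<int[]>> rows;  // rows[i] points to the neighbors of vertex i
    std::vector<std::size_t> sizes;            // sizes[i] is the length of rows[i]
};

// Builds one exactly-sized row per vertex from an undirected edge list.
JaggedGraph buildJagged(int v, const std::vector<std::pair<int, int>>& edges) {
    JaggedGraph g;
    g.sizes.assign(v, 0);
    for (const auto& e : edges) {  // first pass: count connections per vertex
        ++g.sizes[e.first];
        ++g.sizes[e.second];
    }
    for (int i = 0; i < v; ++i)    // allocate each row with its own size
        g.rows.push_back(std::unique_ptr<int[]>(new int[g.sizes[i]]));
    std::vector<std::size_t> filled(v, 0);
    for (const auto& e : edges) {  // second pass: write both directions
        g.rows[e.first][filled[e.first]++] = e.second;
        g.rows[e.second][filled[e.second]++] = e.first;
    }
    return g;
}

// Total cells used across all rows: 2 * E for an undirected graph.
std::size_t totalCells(const JaggedGraph& g) {
    std::size_t total = 0;
    for (std::size_t s : g.sizes) total += s;
    return total;
}
```

For the example graph with its ten edges this uses 20 cells in total, compared with 64 for the 8 cross 8 matrix.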

If you do not understand any of this pointer thing that I'm doing right now,

you can refer to mycodeschool's lesson titled pointers and arrays, the link to which you can find

in the description of this video. But think about it, the basic idea is that each row can be a one-dimensional

array of different size, and you can implement this with whatever tools you have in your favorite

programming language. Now let's quickly see what are the pros and cons of this structure on the right

in comparison to the matrix on the left. We are definitely consuming less memory with the structure

on the right. With adjacency matrix our space consumption is proportional to square of number of vertices,

while with the second structure space consumption is proportional to number of edges,

and we know that most real-world graphs are sparse, that is, the number of edges is really small in

comparison to square of number of vertices. Square of number of vertices is basically

total number of possible edges, and for us to reach this number every node should be connected to

every other node. In most graphs a node is connected to few other nodes and not all other nodes.

In this second structure we are avoiding this typical problem of too much space consumption in

an adjacency matrix by only keeping the ones and getting rid of the redundant zeros.

here for an undirected graph like this one we would consume exactly two into number of edges

units of memory and for a directed graph we would consume exactly e that is number of edges units

of memory but all in all space consumption will be proportional to number of edges

or in other words space complexity would be big o of e so the second structure is definitely

better in terms of space consumption but let's now also try to compare these two structures

for time cost of operations. What do you think would be the time cost of finding if two nodes are

connected or not? We know that it's constant time or big O of 1 for an adjacency matrix because

if we know the start and end point, we know the cell in which to look for 0 or 1. But in the second

structure we cannot do this. We will have to scan through a row. So if I ask you something like

can you tell me if there is a connection from node 0 to 7, then you will have to scan this

zeroth row. You will have to perform a linear search on this zeroth row to find 7. Right now

all the rows in this structure are sorted. You can argue that I can keep all the rows sorted and then

I can perform a binary search, which would be a lot less costly. That's fine, but if you just

perform a linear search, then in worst case we can have exactly V, that is number of vertices,

cells in a row, so if we perform a linear search, in worst case we will take

time proportional to number of vertices. And of course the time cost would be

big O of log V if we would perform a binary search. Logarithmic run times are really good,

but to get this here we always need to keep our rows sorted. Keeping an array always sorted

is costly in other ways and I'll come back to it later. For now let's just say that this would cost

us big O of V. Now what do you think would be the time cost of finding all nodes adjacent

to a given node that is finding all neighbors of a node well even in case of a adjacency matrix

we now have to scan a complete row so it would be big o of v for the matrix as well as this second

structure here because here also in worst case we can have v cells in a row equivalent to having

all ones in a row in an adjacency matrix when we try to see the time cost of an operation

we mostly analyze the worst case so for this operation we are big o of v for both so this is

the picture that we are getting looks like we are saving some space with this second structure

but we are not saving much on time well I would still argue that it's not true when we analyze time

complexity we mostly analyze it for the worst case but what if we already know that we are not

going to hit the worst case if we can go back to our previous assumption that we are dealing with

a sparse graph that we are dealing with a graph in which a node would be connected to

few other nodes and not all other nodes then the second structure will definitely save us time

things would look better once again if we would analyze them in context of a social network

I'll set some assumptions let's say we have a billion users in our social network

and the maximum number of friends that anybody has is 10,000 and let's also assume computational

power of our machine let's say our machine or system can scan or read 10 to the power 6

cells in a second this is a reasonable assumption because machines often execute a couple of millions

instructions per second. Now what would be the actual cost of finding all nodes adjacent to a

given node in adjacency matrix? Well, we will have to scan a complete row in the matrix. That would be

10 to the power 9 cells because in a matrix we would always have cells equal to number of vertices,

and if we would divide this by a million we would get the time in seconds

to scan a row of 10 to the power 9 cells we would take 1000 seconds which is also 16.66 minutes

this is unreasonably high but with the second structure maximum number of cells in a row

would be 10,000 because the number of cells would exactly be equal to number of connections

and this is the maximum number of friends or connections a person in the network has

so here we would take 10 to the power 4 upon 10 to the power 6 that is 10 to the power minus

two seconds, which is equal to 10 milliseconds. 10 milliseconds is not unreasonable. Now let's try

to deduce the cost for the second operation, finding if two nodes are connected or not. In case of

adjacency matrix we would know exactly what cell to read, we would know the memory location of that

specific cell, and reading that one cell would cost us one upon 10 to the power 6 seconds,

which is one microsecond. In the second structure we would not know the exact cell,

we will have to scan a row, so once again maximum time taken would be 10 milliseconds,

just like finding adjacent nodes. So now, given this analysis, if you would have to design a social

network, what structure would you choose? No brainer, isn't it? A machine cannot make a user

wait for 16 minutes. Would you ever use such a system? Milliseconds is fine, but minutes, it's just

too much so now we know that for most real world graphs this second structure is better because

it saves us space as well as time remember i'm saying most and not all because for this logic to

be true for my reasoning to be valid graph has to be sparse number of edges has to be significantly

lesser than square of number of vertices so now having analyzed space consumption and time cost

of at least two most frequently performed operations looks like this second structure would be better

for most graphs well there can be a bunch of operations in a graph and we should account for

all kind of operations so before making up my mind i would analyze cost of few more operations

what if after storing this example graph in computer's memory in any of these structures

we decide to add a new edge let's say we got a new connection in the graph from a to g

then how do you think we can store this new information this new edge in both these structures

the idea here is to assess that once the structures are created in computer's memory

how would we do if the graph changes how would we do if a node or edge is inserted or deleted

If a new edge is inserted, in case of an adjacency matrix we just need to go to a specific cell

and flip the zero at that cell to one. In this case we would go to the zeroth row

and sixth column and overwrite it with value one, and if it was a deletion then we would go to a

specific cell and make the one zero. Now how about this second structure, how would you do it here?

We need to add a six in the first row,

and if you have followed this series on data structures then you know that it's not possible

to dynamically increase the size of an existing array. This would not be so straightforward.

We will have to create a new array of size four for the zeroth row, then we will have to copy

the content of the old array, write the new value and then wipe off the old one from the memory.
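The create, copy, write and wipe-off steps can be sketched as a single function. This is a minimal illustration, assuming the row was allocated with `new[]`. The name `appendByReallocating` is made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Returns a new array one cell larger than `old`, with `value` written at the end.
// The old array is released here. The caller owns the returned array and must
// eventually release it with delete[].
int* appendByReallocating(int* old, std::size_t oldSize, int value) {
    int* bigger = new int[oldSize + 1];        // create a new array of size + 1
    for (std::size_t k = 0; k < oldSize; ++k)  // copy the old content: O(oldSize)
        bigger[k] = old[k];
    bigger[oldSize] = value;                   // write the new value
    delete[] old;                              // wipe off the old array
    return bigger;
}
```

For the zeroth row holding 1, 2 and 3, appending 6 this way touches every existing cell once, which is why each insertion costs time proportional to the row's length.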

it's tricky implementing a dynamic or changing list using arrays this creation of new array and

copying of old data is costly and this is the precise reason why we often use another data

structure to store dynamic or changing lists and this another data structure is linked list

so why not use a linked list why can't each row be a linked list something like this

logically we still have a list here but concrete implementation wise we are no more using an

array that we need to change dynamically we are using a linked list it's a lot easier to do

insertions and deletions in a linked list now programmatically to create this kind of structure

in computers memory we need to create a linked list for each node to store its neighbors

so what we can do is we can create an array of pointers just like what we had done when we were

using arrays the only difference would be that this time each of these pointers would point to

head of a linked list that would be a node i have defined node of a linked list here

node of a linked list would have two fields one to store data and another to store address

of the next node a0 would be a pointer to head or first node of linked list for a

a1 would be a pointer to head of linked list for b and we will go on like a2 for c a3 for d and so

on actually i have drawn the linked lists here in the left but i have not drawn the array of pointers

let's say this is my array of pointers now a0 here this one is a pointer to node and it points

to the head of linked list containing the neighbors of A. Let's assume that head of linked list for

A has address 400, so in A[0] we would have 400. It's really important to understand what is what

here in this structure this one a0 is a pointer to node and all a pointer does is store an address

or reference this one is a node and it has two fields one to store data and another a pointer

to node to store the address of next node let's assume that the address of next node in this first

linked list is 450 then we should have 450 here and if the next one is at let's say 500

then we should have 500 in address part of the second node the address in last one would be

0 or null now this kind of structure in which we store information about neighbors of a node

in a linked list is what we typically call an adjacency list what i have here is an adjacency list

for an undirected unweighted graph to store a weighted graph in an adjacency list i would have

one more field in node to store weight i have written some random weights next to the edges

in this graph and to store this extra information i have added one extra field in node

both in logical structure and the code all right now finally with this particular

structure that we are calling a adjacency list we should be fine with space consumption

space consumed will be proportional to number of edges and not to square of number of vertices
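The adjacency list described above, an array of pointers where each one points to the head of a linked list of neighbors, might be sketched as follows. The `Node` fields follow the description in the lesson (data, weight, next). The class name, the method names and the choice to insert at the head of each list are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    int data;    // index of the neighbor
    int weight;  // extra field, only meaningful for a weighted graph
    Node* next;  // address of the next node, nullptr in the last one
};

class AdjacencyList {
public:
    explicit AdjacencyList(int v) : heads_(v, nullptr) {}
    AdjacencyList(const AdjacencyList&) = delete;
    AdjacencyList& operator=(const AdjacencyList&) = delete;
    ~AdjacencyList() {
        for (Node* n : heads_)
            while (n != nullptr) { Node* next = n->next; delete n; n = next; }
    }

    // Inserting at the head needs no copying: O(1) per direction.
    void addUndirectedEdge(int i, int j, int weight = 1) {
        heads_[i] = new Node{j, weight, heads_[i]};
        heads_[j] = new Node{i, weight, heads_[j]};
    }

    // Walks the list of i: time proportional to the number of neighbors of i.
    bool isConnected(int i, int j) const {
        for (const Node* n = heads_[i]; n != nullptr; n = n->next)
            if (n->data == j) return true;
        return false;
    }

    std::size_t degree(int i) const {
        std::size_t count = 0;
        for (const Node* n = heads_[i]; n != nullptr; n = n->next) ++count;
        return count;
    }

private:
    std::vector<Node*> heads_;  // heads_[i] plays the role of A[i] in the lesson
};
```

Adding the new connection from A to G is then just `addUndirectedEdge(0, 6)`, with no array to recreate and no old data to copy.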

most graphs are sparse and number of edges in most cases is significantly lesser than

square of number of vertices ideally for space complexity i should say big o of number of edges

plus number of vertices because storing vertices will also consume some memory

but if we can assume that number of vertices will be significantly lesser in comparison to

number of edges then we can simply say big o of number of edges but it's always good if we do

the counting right now for time cost of operations the argument that we were earlier making using a

sparse graph like social network is still true adjacency list would overall be better than adjacency

matrix finally let's come back to the question how flexible are we with this structure

if we need to add a new connection or delete an existing connection and is there any way we can

improve upon it well i'll leave this for you to think but i'll give you a hint what if instead of

using a linked list to store information about all the neighbors we use a binary search tree

do you think we would do better for some of these operations i think we would do better because

the time cost for searching inserting and deleting a neighbor would reduce with this thought i'll

sign off, this is it for this lesson, thanks for watching.
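As one way to explore the hint, here is a sketch that keeps the neighbors of each node in a `std::set<int>`. The C++ standard guarantees logarithmic insertion, removal and lookup for `std::set`, and it is typically implemented as a balanced binary search tree, though that detail is left to the library. The class `SetGraph` and its method names are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <vector>

class SetGraph {
public:
    explicit SetGraph(int v) : adj_(v) {}

    // Each insert is logarithmic in the number of neighbors of that vertex.
    void addUndirectedEdge(int i, int j) {
        adj_[i].insert(j);
        adj_[j].insert(i);
    }

    // Each erase is logarithmic as well.
    void removeUndirectedEdge(int i, int j) {
        adj_[i].erase(j);
        adj_[j].erase(i);
    }

    // Lookup is logarithmic, instead of a linear scan of a list.
    bool isConnected(int i, int j) const { return adj_[i].count(j) > 0; }

    // Neighbors come out in sorted order when iterated.
    const std::set<int>& neighbors(int i) const { return adj_[i]; }

private:
    std::vector<std::set<int>> adj_;  // adj_[i] is the neighbor set of vertex i
};
```

The trade-off is extra memory per stored neighbor for the tree's bookkeeping, so whether this beats a linked list depends on how often the graph is searched and modified.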
