This content provides a comprehensive introduction to data structures, focusing on fundamental concepts like arrays, linked lists, stacks, queues, trees, graphs, and their respective operations and implementations. It emphasizes the importance of choosing the right data structure for efficient software development by analyzing time and space complexities.
In this lesson and in this series of lessons, we will introduce you to the
concept of data structures. Data structures are the most fundamental
building-block concept in computer science, and good knowledge of data
structures is a must to design and develop efficient software systems. Okay, so let's
get started. We deal with data all the time and how we store, organize and group
our data together matters. Let's pick up some examples from our day-to-day life
where organizing data in a particular structure helps us. We are able to search
a word quickly and efficiently in a language dictionary because the words in
the dictionary are sorted. What if the words in the dictionary were not sorted?
It would be impractical and impossible to search for a word among millions of
words. So a dictionary is organized as a sorted list of words. Let's pick up
another example. In something like a city map, data such as the position of a
landmark and the road network connections is organized in the form
of geometries. We show the map data in the form of these geometries on a
two-dimensional plane. So map data needs to be structured like this so that we have
scales and directions and we are effectively able to search for a landmark
and get a route from one place to another. And I'll pick one more example for
something like daily cash in and cash out statement of a business, what we
also call a cash book in accounts. It makes most sense to organize and store the
data in the form of a tabular schema. It is very easy to aggregate data and
extract information if the data is organized in these columns in these
tables. So different kinds of structures are needed to organize different kinds of
data. Now computers work with all kinds of data. Computers work with text, images,
videos, relational data, geospatial data and pretty much any kind of data that we
have on this planet. How we store, organize and group data in computers
matters because computers deal with really really large data and even with
the computational power of machines, if we do not use the right kind of
structures, the right kind of logical structures, then our software systems
will not be efficient. A formal definition of a data structure would be that a
data structure is a way to store and organize data in a computer so that the
data can be used efficiently. When we study data structures as ways to store and
organize data, we study them in two ways. First, we talk about them as
mathematical and logical models. When we talk about them as mathematical and
logical models, we just look at an abstract view of them. We just look, from a
high level, at what features and what operations define that particular data
structure. An example of an abstract view from the real world: the abstract
view of a device named television can be that it is an electrical device that
can be turned on and off, and that can receive signals for satellite programs and
play the audio and video of the program. As long as I have a device like this,
I do not bother how circuits are embedded to create this device or which company
makes this device. So this is an abstract view. So when we study data structures
as mathematical or logical models, we just define their abstract view, or in
other words, and we have a term for this, we define them as abstract data types.
An example of an abstract
data type can be, I want to define something called a list that should be
able to store a group of elements of a particular data type and we should be
able to read the elements by their position in the list and we should be
also able to modify the element at a particular position in the list. More
precisely, I would say it should store a given number of elements of any data
type. So we are just defining a
model. Now we can implement this in a programming language in a number of ways.
So this is a definition of an abstract data type. We also call an abstract data
type an ADT, and if you see, all high-level languages already have a
concrete implementation of such an ADT in the form of arrays. So arrays give
us all these functionalities. So arrays are data types which are concrete
implementations. So the second way of talking about data structures is talking
about their implementation. So implementations would be some concrete types
and not an abstract data type. We can implement the same ADT in multiple ways
in the same language. For example in C or C++ we can implement this list ADT as a
data structure named linked list, and if you have not heard about it, we will be
talking about linked lists a lot in the
coming lessons. Okay so let's define an abstract data type formally because this
is one term that we will encounter quite often. Abstract data types are entities
that are definitions of data and operations but do not have implementations.
So they do not have any implementation details. We will be talking about a lot
of data structures in this course. We will be talking about them as abstract
data types and we will also be looking at how to implement them. Some of the
data structures that we will talk about are arrays, linked lists, stacks, queues, trees,
graphs, and the list goes on. There are many more to study. So when we study
these data structures we will study their logical view. We will study what
operations are available to us with these data structures. We will study the cost
of these operations mostly in terms of time and then definitely we will study
the implementation in a programming language. So we will be studying all these
data structures in the coming lessons and this is all for this introductory
lesson. Thanks for watching.

In our previous lesson we introduced you to the
concept of data structures and we saw how we can talk about data structures in
two ways: one, as a mathematical and logical model that we also term
an abstract data type or ADT, and then we also study data structures
as concrete implementations. In this lesson we will study one simple data
structure. We will first define an abstract view of it. We will first define
it as an abstract data type and then we will see the possible implementations and
this data structure is list. List is a common real world entity. List is nothing
but a collection of objects of the same type. We can have a list of words, we can
have a list of names or we can have a list of numbers. So let us first define
list as an abstract data type. So when we define an abstract data type we just
define the data that it will store and we define the operations available with the
type, and we do not go into the implementation details. Let us first define
a very basic list. I want a list that can store a given number of elements of a
given data type. This would be a static list. The number of elements in the list
will not change and we will know the number of elements before creating the
list. We should be able to write or modify element at any position in the
list and of course we should be able to read element at a particular position
in the list. So if I ask you for an implementation of such a list and you
have taken a basic introductory course in programming, then you'll
be like, hey, I know this: an array gives us all these features. All these
operations are available with an array. We can create an array of any data type,
so let us say if we want to create a list of integers, then we declare the array
type as integer and then we can give the size as a parameter in the declaration. I can
write or modify the element at a particular position. The elements are A[0], A[1] and
so on, and are accessed something like this. We all know about arrays. And then we can also
read elements at a particular position. The element at the ith position is accessed
as A[i]. So an array is a data structure that gives us an implementation for this list.
Now I want a list that should have many more features. I want it to handle more
scenarios for me. So I'll redefine this list here. I do not want a static list, a
static collection with a fixed size. I want a dynamic list that should grow as
per my need. So the features of my list are that I'll call my list empty if there
are no elements in the list. I'll say the size of the list is 0 when it is empty,
and then I can insert an element into the list, and I can insert an element at any
position in an existing list. I can remove an element from the list.
I can count the number of elements in the list, and I should be able to read
or modify the element at a particular position in the list, and I
should also be able to specify the data type for the list. So
while creating the list I should be able to say whether this is a list of integers
or whether this is a list of strings or floats or whatever. Now I want a data
structure which is implementation of this dynamic list. So how do I get it?
Well actually we can implement such a dynamic list using arrays. It's just that
we will have to write some more operations on top of arrays to provide
for all these functionalities. So let us see how we can implement this
particular list using arrays. Let's for the sake of simplicity of design assume
that the data type for the list is integer. So we are creating a
dynamic list of integers. To implement such a list, we can
declare a really large array. We will define some max size and declare an array
of this max size. Now as we know, the elements in the array are indexed as A[0],
A[1], A[2] and we go on like this. So what I'll do is I'll define a variable that
will mark the end of the list in this array. So if the list is empty we can
initialize this variable or we can set this variable as minus 1 because the
lowest index possible is 0. So if end is minus 1 the list is empty. At any time
a part of the array will store the list. Okay, so let's say initially, when the
list is empty, this pointer end is pointing to index minus 1, which is not
valid, which does not exist. And now I insert an integer into this array, and let's
say if we do not give the position at which the number is to be inserted, the
number is always inserted towards the tail of the list, towards the end of the
list. So the list will be like this: we will have an element at position 0 and now
end is index 0. So at any time this variable end marks the end of
the list in this array. Now if I want to insert something in the list at a
particular position let's say I want to insert number 5 at index 2 then to
accommodate 5 here at this particular position we will have to shift all the
elements one unit towards the right. We need to
shift all the elements starting at index 2 towards the right. Okay, I just inserted
some elements into the list; let me also write the function calls for these. Let's
say we went in this order: we inserted 2, then we inserted 4, and then,
in the end, we are inserting 5, and we will also give the position at which we
want to insert. So this insert with two arguments would be the call to insert an
element at a particular position. So after all these operations, after all these
insertions this is what the list will look like. This arrow here marks the end
of the list in the array. Now if I want to remove an element from a particular
position, let's say I make a call to the remove function. I want to
remove the element 2. So I'll pass the index 0 here; I want to remove the element
at index 0. To do so, all the elements after index 0 will be shifted one unit
towards the left, or towards the lower indices, and 2 will go away. Now this end
variable here is being adjusted after each insertion that we are making. So after
this insertion end will be 0, after this 1, 2, 3 and so on; after this remove end
will be 4 again. Okay, looks like we pretty much have an implementation of
this list on the left that is described as an abstract data type. We have a logic
of calling the list empty when we have this variable end equal to minus 1. We
can insert element at a particular position in the list we can remove
element. It's just that we have to perform some shifts in the array. We can
count the number of elements in the list. It will be equal to the value
in the variable end plus 1. We can read or modify the element at a position; well,
this is an array, so we can definitely read or modify the element at a particular
position. If we wanted to choose the data type, it was just choosing the array of
that particular data type. Now this looks like a cool implementation except that
we have one problem. We said that the array will be of some large size, some
max size. But what is a good max size? The list can
always grow to exhaust the array. There is no good max size. So we need to have a
strategy for the scenario when the list will fill up the whole array. So what do
we do in that case? We need to build that into our design. We cannot extend the same
array. It is not possible to do so. So we will have to create a new array a
larger array. So when the array is full we will create a new larger array and
copy all the elements from the previous array into the new array. And then we can
free the memory for the previous array. Now the question is by how much should we
increase the size of the new array? This whole operation of creating a new array
and copying all the elements from the previous array into the new array is
costly in terms of time and definitely a good design would be to avoid such big
cost. So the strategy that we choose is that each time the array is full we create
a new larger array of double the size of the previous array. And why this is the
best strategy is something that we will not discuss in this lesson. So we will
create a larger array of double size and copy elements from previous array into
this new array. This looks like a cool implementation. The study of data structures
is not just about studying the operations and the implementation of these operations.
It's also about analyzing the cost of these operations. So let us see what are
the costs in terms of time for all these operations that we have in the dynamic
list. The access to any element in this dynamic list, if we want to access
it using an index for read or write, will take constant time because we
have an array here. In an array, elements are arranged in one contiguous block of
memory. Using the starting address or the base address of the block of
memory and the index or the position of the element, we
can calculate the address of that particular element and access it in
constant time. Big O notation is used to describe the time complexity of
operations; for constant time, the time
complexity is written as Big O of 1. If we wanted
to insert element at the end of the array end of the list then that again will be
constant time but if we would insert element at a particular position in the
list then we will have to shift elements towards higher indices. In the worst case
we will have to shift all the elements to the right when we will be inserting at
the first position. So the time taken for insertion will be proportional to the
length of the list let's say the length of the list is n or in other words we
will say that insertion will be Big O of n in terms of time complexity. If you do
not know about Big O notation do not bother just understand that inserting an
element at a particular position will be a linear function in terms of the size
of the list. Removing an element will again be Big O of n. Time taken will be
proportional to the current size of the list; n is the size of the list here. Okay,
now, inserting an element at the end: we just said that it will happen in constant
time. It is not so if the array is full; then we will create a new array. Let's
call inserting an element at the end adding an element. Adding an element will take
constant time if the array is not full, but it will take time proportional to the
size of the array if the array is full. So adding in the worst
case will be Big O of n again. As we said, when the array is full we create a new
array of double the size of the previous array and then we copy
the elements from the previous array into the new array. So prima facie, what looks
like the good thing with this kind of implementation? Well, the good thing is
that we can access elements at any index in constant time which is the
property of the array but if we have to insert some element in between and if we
have to remove an element from the list then it is costly. If the list grows and
shrinks a lot then we will also have to create a new array and go through all this
work of copying elements from the previous array into the new array again and again. And
one more problem is that a lot of the time a lot of the array would be unused; the
memory there is of no use. So definitely the use of an array as a dynamic list is not
efficient in terms of memory.
This leads us to think: can we have a data structure that will
give us a dynamic list and use the memory more efficiently. We have one
data structure that gives us good utilization of the memory and this
data structure is linked list, and we will study linked lists in the
next lesson. So that's it for this lesson. Thanks for watching.

In this lesson we
will introduce you to the linked list data structure. In our previous lesson we
tried to implement a dynamic list using arrays and we had some issues there. It
was not the most efficient in terms of memory usage, in terms of memory consumption.
When we use arrays we have some limitations. To be able to understand
linked list well we need to understand these limitations. So I am going to tell
you a simple story to help you understand this. Now let us say this is computer's
memory and each partition here is one byte of memory. Now as we know each byte of
memory has an address we are showing only a section of the memory that's why it is
extending towards the bottom and the top. Let's say the address increases from
bottom to top. So if this byte is address 200 the next byte would be address 201
and next byte would be address 202 and so on. What I want to do is I want to draw
this memory from left to right horizontally instead of drawing it from
bottom to top like this. This looks better. Let's say this byte here is
address 200 and as we go towards the right the address increases. So this is
like 201 and we go on like 202, 203 and so on. It doesn't really matter whether we
show memory from bottom to top or left to right. These are just logical ways to
look at the memory. So coming back to our story memory is a crucial resource and
all the applications keep asking for it. So Mr. Computer has given this job of
managing the memory to one of his components to one of his guys who he
calls the memory manager. Now this guy keeps track of what part of the memory
is free and what part of the memory is allocated and anyone who needs memory
to store something needs to talk to this guy. Albert is our programmer and he is
building an application. He needs to store some data in the memory so he
needs to talk to the memory manager. He can talk to the memory manager in a
high level language like C. Let us say that he is using C to talk to the memory
manager. First he wants to store an integer in the memory. So he communicates
this to memory manager by declaring an integer variable something like this.
The memory manager sees this declaration and he says that okay you need to store
an integer variable. So I need to give you four bytes of memory because integer
variable is stored in four bytes in a typical architecture and let us say in
this architecture it is stored in four bytes. So the memory manager looks for
four bytes of free space in the memory and assigns it or allocates it for
variable X. The address of a block of memory is the address of the first byte in the
block. So let us say this first byte of memory here is at address 217. So variable
X is at address 217. So memory manager kind of communicates it back to Albert
that hey I have assigned address 217 for your variable X you can store whatever
you want there and Albert can fill in any data into this variable. Now Albert
needs to store a list of integers a list of numbers and he thinks that the
maximum number of integers in this list will be four. So he asks the memory
manager for an integer array of size four named A. Now array is always stored in
memory as one contiguous block of memory. So memory manager is like okay I need to
look for a block of memory of 16 bytes for this variable this array A. So the
memory manager allocates this block starting address 201 and ending address
216 for this variable A which is an array of four integers. Because array is
stored as one contiguous block of memory and memory manager conveys the starting
address of this block, whenever Albert tries to access any of the elements in
the array, let's say he tries to write some value at
the fourth element in the array, which he accesses as A[3], Albert's application
knows where to write this particular value because it knows the base address,
the starting address of the array A, and from the base address, using the
index, which is three here, it calculates the address of A[3] as 201 plus 3 times
4 bytes. So it knows that A[3]
is at address 213. So to access any of the elements in the array the application
takes constant time, and this is one awesome thing about arrays: that
irrespective of the size of the array, an application can
access any of the elements in the array in constant time. Now let's say Albert
uses this array of four integers to store his list. So I'll fill in some
values here at these positions let's say this is 8, this is 2, this is 6, this is
5, this is 4. Now Albert at some point feels that okay I need to have one more
element in this list. Now he has declared an array of size 4 and he wants to add
a fifth element in the array. So he asks the memory manager that hey I want to
extend my array A is it possible to do so I want to extend the same block and the
memory manager is like when I allocate memory for an array I do not expect that
you will ask for an extension. So I use whatever memory is available adjacent to
that block for other variables. In some cases I may extend the same block but in
this case I have a variable X next to your block so I cannot
give you an extension. So Albert is like what all options do I have? Memory
manager is like you can tell me the new size and I can recreate a new block at
some new address and we will have to copy all the elements from the previous block
to the new block. So Albert says that okay let's do it but the memory manager
is like you still need to give me the size of the new block. Albert thinks that
this time he'll give a really large size for the new array or the new block so
that it does not fill up. This new block starting at address 224 is allocated.
Albert asks memory manager to free the previous block, and there is some cost: he
has to copy all the numbers from the previous block into the
new block. Now he can add one more element to this list, and he has kept his
array large this time just in case he needs more numbers in the list. So the
only option that Albert had was to create an entirely new block, an
entirely new array, and Albert is still feeling bad because if the list is too
small he is not using some part of the array and so memory is getting wasted
and if the list again grows too much he will again have to create a new array a
new block and he will again have to copy all the elements from the previous block
into the new block. Albert is desperately seeking a solution to this problem and
the solution to this problem is a data structure named linked list. So let us
now try to understand linked list data structure and see how it solves Albert's
problem. What Albert can do is that instead of asking the memory manager for
an array, which will be one large contiguous block of memory, he can ask for memory for
one unit of data at a time for one element at a time in a separate request.
I am cleaning up the memory here once again let's say Albert wants to store
this list of four integers in the memory. What if he requests memory for one
integer at a time. So first he pings memory manager for some memory to store
number six memory manager will be like okay you need space to store an integer
so you get this block of four bytes at address 204. So Albert can store number
six here. Now Albert makes another request, a separate request, for number five. Let's
say he gets this block starting at address 217 for number five. Because he makes a
separate request, he may or may not get memory adjacent to number six; the higher
probability is that he will not get an adjacent memory location. So similarly
Albert makes separate requests for numbers four and two. So let's say he gets these
two blocks at addresses 232 and 242 respectively for numbers four and two. So
as you can see, when Albert makes separate requests for each integer, instead of
getting one contiguous block of memory he gets these disjoint non-contiguous
blocks of memory. So we need to store some more information here we need to
store the information that this is the first element in the list and this is the
second element in this list. So we need to link these blocks together somehow. With
an array it was very simple. We had one contiguous block of memory, so we knew
where a particular element is by calculating its address using the
starting address of the block and the position of the element in the array but
here we need to store the information that this is the first block which stores
the first element and this is the second block which stores the second element
and so on. To link these blocks together and to store the information that this
is the first block in the list and this is the second block in the list what we
can do is that we can store some extra information with each block. So what if
we can have two parts in each block something like this and in one part of
the block we store the data or the value and in the other part of the block we
store the address of the next block. In this example, in the first block the
address part would be 217, the address of the next block that stores 5, and in
this next block, or the second block, the address part would be 232. In the block
at address 232 we will store the address 242, the address of the next
block that stores number 2, and the block at 242 is the last block; there is no
next block after this. So in the address part we can have the address as 0. 0 is an
invalid address; 0 can be used to mark that this is the end of the list, there is no
link to the next node or next block after this particular block. So Albert now
actually has to request memory manager for a block of memory that will store
two variables: one, an integer variable that will store the value of our element,
and one, a pointer variable that will store the address of the next block or
the next node in the list. In C he can define a type named node like this: he
will have two fields in the node one to store the data this field will be an
integer and one more field to store the address of the next node in the list. So
Albert will ask the memory manager for memory for a node,
and the memory manager will be like, okay, you need a node that needs four bytes
for an integer variable and four more bytes for the pointer variable that will
store the address; a pointer variable also, in a typical architecture, is stored in
four bytes. So now memory manager gives us a block of eight bytes and we call
this block a node. Now notice that the second field in the node structure is
node star, which means pointer to node, so this field will only store the address
of the next node in the list. So if we store the list like this in the memory as
these non-contiguous nodes connected to each other then this is a linked list
data structure. Logical view of the linked list data structure will be
something like this. Data is stored in these nodes and each node stores the data
as well as the link to the next node. So each node kind of points to the next
node. The first node is also called the head node and the only information about
the list that we keep all the time is address of the head node or address of
the first node. So address of the head node kind of gives us access to the
complete list. The address in the last node is null or 0 which means that the
last node does not point to any other node. Now if we want to traverse the
linked list the only way to do it is we start at the head and we go to the first
guy and then we ask the first guy the address of the next guy address of the
next node and then we go to the next node and ask the address of the next node and
this is the only way to access the elements in the linked list. If we want to
insert a node in the linked list let's say we want to add number 3 at the end
of the linked list, then all we need to do is first create a node
independently and separately; it will get
some memory location. So we created this node with value 3. Now all we need to do
is fill the address properly adjust these links properly. So the address of this
particular node will be filled in this node with value 2 and this node the
address part can be null. So it is the last node it does not point to any other
node. Let's also show these nodes in the memory here. So I've written the address
of each node in brown at top of these nodes and I've also filled in this address
field of each node. Let's say the node for value 3 gets address 252. So this is
how things will be in the memory and this is how the logical view will be. The
linked list is always identified by the address of the first node and unlike
arrays we cannot access any of the elements in constant time. In the case of
arrays, using the starting address of the block of memory and the position of
the element in the list, we could calculate the address of the element, but in this
case we have to start at the head and we have to ask this element for the next
element and then ask the next element who is your next. It's like playing
treasure hunt. You go to the first guy and then you get the address for the
second guy and then you go to the second guy and you get the address for the
third guy. So the time taken to access elements will be proportional to the size
of the list. Let's say the size of the list is n. There are n elements in the
list. In the worst case, to reach the last element you will go through all
the elements. So time taken to access elements is proportional to n or in other
words we say that this operation will cost us or rather the time complexity of
this operation is big O of n. Insertion into the list. We can insert anywhere in
the list. We first need to create a node and just adjust these links properly. Like
say I want n at third position in the list. So all we need to do is create a node,
store the value 10 in the data part, something like this. Let's say we get the
node 10 at address 310. So we will adjust the address field in the second node to
point to and this node with address 310 and this node will point to the node with
value 4. Now to insert also we will have to traverse the list and go to that
particular position and so this will be big O of n again in terms of time
complexity. The only thing is that the insertion will be a simple operation. We
will not have to do all the shifts as we had to do in an array to insert
something in between. We had to shift all the elements by one position towards
higher indices. Similarly, deleting something from this list will also be big O of n.
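The insertion just described can be sketched in C roughly as follows. This is only an illustration under assumptions: the names `struct Node`, `next` and `insert_at`, the 1-based position, and the choice to leave the list unchanged when the position is out of range are all mine, not something shown in the lesson.

```c
#include <assert.h>
#include <stdlib.h>

/* One node of a singly linked list of integers (hypothetical names). */
struct Node {
    int data;
    struct Node *next;
};

/* Insert `value` so that it becomes the node at 1-based `position`.
 * Returns the (possibly new) head. The list is left unchanged if the
 * position is past the end of the list or if allocation fails. */
struct Node *insert_at(struct Node *head, int value, int position) {
    assert(position >= 1);

    struct Node *node = malloc(sizeof *node);
    if (node == NULL) return head;
    node->data = value;

    if (position == 1) {            /* new first node */
        node->next = head;
        return node;
    }

    /* Walk to the node just before the target position: O(n) hops. */
    struct Node *prev = head;
    for (int i = 1; i < position - 1 && prev != NULL; i++)
        prev = prev->next;

    if (prev == NULL) {             /* position out of range */
        free(node);
        return head;
    }

    node->next = prev->next;        /* new node points to the old successor */
    prev->next = node;              /* predecessor points to the new node */
    return head;
}
```

Notice that the link adjustment itself is only two assignments; the cost that grows with n is the walk to the position.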
So we can see some good things about linked lists. There is no extra use of
memory in the sense of memory being reserved but left unused.
We are using some extra memory to store the addresses, but we have the benefit that we
create nodes as and when we want and we can also free the nodes as and when we want.
We do not have to guess the size of the list beforehand like in the case of arrays.
Now we will discuss all the operations on linked list and the cost of these
operations as well as comparison with arrays in our next lessons. We will also be implementing
linked list in C or C++. So this is all for a basic introduction to linked list. Thanks for watching.
In our previous lesson we introduced you to linked list data structure and we saw how linked
lists solve some of the problems that we have with arrays. So now the obvious question would be,
which one is better, an array or a linked list? Well, there is no such thing as one data structure
being better than another data structure. One data structure may be really good for one kind of
requirement while another data structure can be really good for another kind of requirement.
So it all depends upon factors like what is the most frequent operation that you want to
perform with the data structure or what is the size of the data and there can be other factors as well.
So in this lesson we will compare these two data structures based on some parameters based on
the cost of operations that we have with these data structures. So all in all we will comparatively
study the advantages and disadvantages and try to understand in which scenario we should use an array
and in which scenario we should use a linked list. So I will draw two columns here, one for array
and another for linked list, and the first parameter that we want to talk about is the cost of accessing
an element. Irrespective of the size of an array, it takes constant time to access an element in the
array. Now this is because an array is stored as one contiguous block of memory. So if we know the
starting address or the base address of this block of memory, let us say what we have here is an integer
array and the base address is 200, so the first byte in this block is at address 200, then if
we want to calculate the address of the element at index i, it will be equal to 200 plus i times the
size of an integer in bytes. The size of an integer is typically 4 bytes, so it will be 200
plus 4 times i. So if the 0th element is at address 200 and we want to calculate the address of the element
at index 6, it will be 200 plus 6 times 4, which will be equal to 224. So knowing the address of any
element in an array is just this calculation for our application. In terms of big O notation, constant
time is also called big O of 1. So accessing an element in an array is big O of 1 in terms of
time complexity. If you are not aware of big O notation, check the description of this video for
a tutorial on time complexity analysis. Now in a linked list data is not stored in a contiguous
block of memory. So if we have a linked list something like this let's say we have a linked
list of integers here then we have multiple blocks of memory at different addresses. Each block in
the linked list is called a node and each node has two fields one to store the data and one to store
the address of the next node. So we call the second field the link field. The only information
that we keep with us about a linked list is the address of the first node which we also call
the head node and this is what we keep passing to all the functions also the address of the head
node. To access an element in the linked list at a particular position we first need to start at
the head node or the first node and at the first node we need to see the address of the second node
and then we go to the second node and see the address of the third node. In the worst case to
access the last element in the list we will be traversing all the nodes in the list. In the average
case we will be accessing the middle element maybe. So if n is the size of the linked list,
that is, the number of elements in the linked list, then we will traverse n by two elements.
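To put the two access patterns side by side, here is a small sketch in C. It is only an illustration: addresses are modelled as plain numbers like the 200 and 224 in the example, and the names `element_address`, `node_at` and the `hops` counter are assumptions of mine, not code from the lesson.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
};

/* Address of the element at `index` in an array that starts at `base`:
 * one multiplication and one addition, whatever the size of the array. */
unsigned long element_address(unsigned long base, unsigned long index,
                              unsigned long elem_size) {
    return base + index * elem_size;
}

/* Return the node at 0-based `index`, or NULL if the list is shorter.
 * `*hops` receives the number of links that had to be followed. */
struct Node *node_at(struct Node *head, int index, int *hops) {
    assert(index >= 0);
    struct Node *cur = head;
    *hops = 0;
    while (cur != NULL && *hops < index) {
        cur = cur->next;            /* ask the current node for the next one */
        (*hops)++;
    }
    return cur;
}
```

With the numbers from the example, `element_address(200, 6, 4)` gives 224 in a single step, while `node_at` needs one hop per position, which is why its cost grows with n.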
So the time taken in the average case also is proportional to the number of elements in the linked
list. So we can say that the time complexity in the average case is big O of n. So on this parameter,
cost of accessing an element, arrays score heavily over linked lists. So if you have a
requirement where you want to access elements in the list all the time then definitely array is a
better choice. Now the second parameter that we want to talk about is memory requirement or
memory usage. With an array we need to know the size of the array before creating it
because an array is created as one contiguous block of memory. So an array has a fixed size.
What we typically do is create a large enough array, and some part of the array stores our list
and some part of the array is vacant or empty so that we can add more elements to the list.
For example, we have an array of seven integers here and we have only three integers in the list.
The remaining four positions are unused. There would be some garbage value there.
With a linked list, let's say we have this linked list of integers.
There is no unused memory. We ask memory for one node at a time so we do not keep any reserved
space, but we use extra memory for pointer variables. And this extra memory requirement for pointer
variables in a linked list cannot be ignored. In a typical 32-bit architecture, let's say an integer is stored
in four bytes and a pointer variable also takes four bytes. So if you see, the memory requirement for
this array of seven integers is 28 bytes, and the memory requirement for this linked list would be
eight times three, where eight is the size of each node, four bytes for the integer and four
bytes for the pointer variable. So this is 24 bytes. If we add one more element to the list,
in the array we will just use one more position, while in the linked list we will create one more node
and take another eight bytes. So this will be 32 bytes. A linked list would fetch us a lot of
advantage if the data part is large in size. In this case we had a linked list of
integers, and an integer is only four bytes. What if we had a linked list in which the data part was
some complex type that took 16 bytes? With four bytes for the link and 16 bytes for the data,
each node would have been 20 bytes. An array of seven elements at 16
bytes for each element would be 112 bytes, and a linked list of four would be only 80 bytes.
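The arithmetic above can be checked with a few lines of C. This is just a sketch: it hard-codes the 4-byte integer and 4-byte pointer assumed in the example (real sizes depend on the platform), and the names `array_bytes` and `list_bytes` are hypothetical.

```c
#include <assert.h>

/* Sizes assumed in the example; real sizes depend on the platform. */
enum { INT_BYTES = 4, POINTER_BYTES = 4 };

/* An array reserves every slot up front, whether it is used or not. */
int array_bytes(int capacity, int data_bytes) {
    assert(capacity >= 0 && data_bytes > 0);
    return capacity * data_bytes;
}

/* A linked list pays for the data plus one link per stored element. */
int list_bytes(int count, int data_bytes) {
    assert(count >= 0 && data_bytes > 0);
    return count * (data_bytes + POINTER_BYTES);
}
```

Under these assumptions `array_bytes(7, INT_BYTES)` is 28, `list_bytes(3, INT_BYTES)` is 24 and `list_bytes(4, INT_BYTES)` is 32, while for 16-byte data `array_bytes(7, 16)` is 112 against `list_bytes(4, 16)` at 80.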
So it all depends. If the data part for the list takes a lot of memory, a linked list will likely
consume a lot less memory than an array that keeps positions unused. Otherwise it depends on what strategy we are choosing to decide the size of
the array, that is, how much of the array we keep unused at any time. Now one more point about memory allocation.
Because arrays are created as one contiguous block of memory, sometimes when we want to create a
really large array, memory may not be available as one large block. But if we are
using a linked list, the memory only needs to be available as multiple small blocks. This is the problem
of fragmentation in the memory: sometimes we may get many small units of memory but may not get
one large block of memory. This may be a rare phenomenon, but it is a possibility. So this is
also where linked list scores. Because arrays have fixed size, once an array gets filled and we
need more memory then there is no other option than to create a new array of larger size
and copy the content from the previous array into the new array. So this is also one cost
which is not there with linked list. So we need to keep these constraints and these requirements
in mind when we want to decide for one of these data structures for our requirement.
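The cost of growing a full array can be sketched like this in C. Treat it as an illustration only: the struct `IntList`, the function `append`, the starting capacity of 4 and the doubling rule are all assumptions of mine; the lesson only says that a larger array is created and the content is copied.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A list of integers stored in a resizable array (hypothetical type). */
struct IntList {
    int *items;
    int count;      /* slots in use */
    int capacity;   /* slots allocated */
};

/* Append `value`. Returns 1 on success, 0 if memory could not be allocated. */
int append(struct IntList *list, int value) {
    assert(list != NULL);
    if (list->count == list->capacity) {
        /* Full: allocate a larger block and copy everything over, O(n). */
        int new_capacity = list->capacity == 0 ? 4 : list->capacity * 2;
        int *bigger = malloc((size_t)new_capacity * sizeof *bigger);
        if (bigger == NULL) return 0;
        if (list->count > 0)
            memcpy(bigger, list->items, (size_t)list->count * sizeof *bigger);
        free(list->items);
        list->items = bigger;
        list->capacity = new_capacity;
    }
    list->items[list->count++] = value;   /* room available: O(1) */
    return 1;
}
```

Starting from an empty `struct IntList list = {NULL, 0, 0};`, most calls to `append` only write one slot; the copy happens only on the calls that find the array full.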
Now the third parameter that we want to talk about is cost of inserting an element in the list.
Remember, when we are talking about arrays here we are also talking about the possible use
of an array as a dynamic list. So there can be three scenarios in insertion. The first scenario will be
when we want to insert an element at the beginning of the list. Let's say we want to insert number
three at the beginning of the list. In the case of arrays we will have to shift each element
by one position towards the higher index. So the time taken will be proportional to the
size of the list. So this will be big O of n. Let's say n is the size of the list.
This will be big O of n in terms of time complexity. In the case of linked list inserting an element
in the beginning will mean only creating a new node and adjusting the head pointer and the link
of this new node. So the time taken will not depend upon the size of the list. It will be constant.
So for linked list inserting an element at the beginning is big O of one in terms of
time complexity. Now, inserting an element at the end. For an array, let's say we are talking about a dynamic array,
a dynamic list in which we create a new array if it gets filled. If there is space in the
array we just write to the next higher index of the list. So it will be constant time.
So time complexity is big O of 1 if the array is not full. If the array is full we will have to create a new array
and copy all the previous content into the new array, which will take big O of n time, where n is the size of
the list. In the case of a linked list, inserting an element at the end will
mean traversing the whole list, since we keep only the address of the head node, and then creating a new node and adjusting the links. So time taken
will be proportional to n. I'll use this color coding for linked list. Here n is the number of
elements in the list. Now the third case would be when we want to insert in the middle of the list
at some ith position. Again in the case of arrays we will have to shift
elements. Now for the average case we may want to insert at the mid position in the array. So we
will have to shift n by two elements, where n is the number of elements in the list. So the time
taken is definitely proportional to n in the average case. So complexity will be big O of n.
For a linked list also we will have to traverse till that position, and only then can we adjust
the links. Even though we will not have any shifting, we will have to traverse till that point, and in
the average case time taken will be proportional to n and the time complexity will be big O of n.
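The shifting that makes array insertion cost big O of n can be sketched as below. The function name `insert_with_shift`, its parameters and the returned shift count are illustrative assumptions, and the sketch assumes the caller has already made sure there is a free slot.

```c
#include <assert.h>

/* Insert `value` at `index` in an array that holds `count` elements and has
 * room for `capacity`. Returns how many elements had to be shifted. */
int insert_with_shift(int arr[], int count, int capacity, int index, int value) {
    assert(count < capacity);               /* at least one free slot */
    assert(index >= 0 && index <= count);
    int shifted = 0;
    for (int i = count; i > index; i--) {   /* move towards higher indices */
        arr[i] = arr[i - 1];
        shifted++;
    }
    arr[index] = value;
    return shifted;
}
```

Inserting at index 0 shifts every element, inserting at the middle shifts about half of them, and inserting at index `count` shifts nothing, which matches the three scenarios discussed above.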
If you can see deleting an element will also have these three scenarios and the time complexity
for deleting for these three scenarios will also be the same. And the final parameter
that I want to talk about is which one is easy to use and implement. An array definitely is a lot
easier to use. Linked list implementation, especially in C or C++, is more prone to errors like segmentation
faults and memory leaks. It takes good care to work with linked lists. So this was arrays versus linked
lists. In our next lesson we will implement linked list in C or C++; we will get our hands dirty with
some real code. So this is it for this lesson. Thanks for watching. In our previous lessons we
described linked list we saw the cost of various operations in linked list and we also compared
linked list with arrays. So now let us implement linked list. The implementation will be pretty
similar in C and C++; there will be slight differences that we will discuss. The prerequisite for this
lesson is that you should have a working knowledge of pointers in C or C++ and you should also know the
concept of dynamic memory allocation. If you want to refresh any of these concepts check the description
of this video for additional resources. Okay so let's get started. As we know in a linked list
data is stored in multiple non-contiguous blocks of memory and we call each block of memory a
node in the linked list. So let me first draw a linked list here. So we have a linked list of
integers here with three nodes. As we know, each node has two fields or two parts, one to store the
data and another to store the address of the next node, what we can also call the link to the next node.
So let's say the address of the first node is 200, the address of the second node is 100, and the
address of the third node is 300 for this linked list. This is only a logical view of the linked
list. So the address part of the first node will be 100, the address of the second node, and in the second node we will
have 300. The address part of the last node will be null, which is only a synonym or macro
for address 0. 0 is an invalid address; a pointer variable equal to 0 or
null does not point to a valid memory location. The
address of the memory block allocated to each of the nodes is totally random. There is no relation;
it's not a guarantee that the addresses will be in increasing order or decreasing order
or adjacent to each other. So that's why we need to keep these links.
Now the identity of the linked list that we always keep with us is the address of the first
node what we also call the head node. So we keep another variable that will be of type pointer to
node and this guy will store the address of the first node and we can name this pointer variable
whatever let's say this pointer variable is named a. The name of this particular pointer variable
that points to the head node or the first node can also be interpreted as the name for the linked
list also because this is the only identity of the linked list that we keep with us all the time.
So let us now see how this logical view can be mapped to a real program in C or C++.
In our program node will be a data type that will have two fields, one to store the data
and another to store the address. In C we can define such a data type as a structure.
So let's say we define a structure named node with two fields, the first field to store the data.
The type of the data here is integer, so this will be the node for a linked list of integers.
If we wanted a linked list of, say, doubles, this data type would be double. The second field will be
a pointer to node, struct node star; we can name this link or we can name this next or whatever.
This is the C style of declaring node star or pointer to node. If this was C++ we could simply write
node star. I would write it this way, the C++ way; it looks better to me. In our logical view here
this variable A is of type node star or pointer to node. Each of these three rectangles with two
fields are of type node and this field in the node the first field is of type integer and the
second field is of type pointer to node or node star. It is important to know which one is what
in the logical view we should have this visualization before we go on to implement linked list.
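As a rough sketch of how this logical view maps to C, the node type could look like the following. The capitalised name `Node` and the helper `third_value` are my own choices for illustration; the helper builds the three nodes with values 2, 4 and 6 as ordinary local variables, only to show which thing has which type, before any dynamic allocation comes in.

```c
#include <assert.h>
#include <stddef.h>

/* A node in a linked list of integers. */
struct Node {
    int data;            /* the value stored in this node */
    struct Node *link;   /* address of the next node, NULL for the last node */
};

/* Build the three-node list from the logical view and return the value
 * reached by following two links from the head. */
int third_value(void) {
    struct Node third  = {6, NULL};
    struct Node second = {4, &third};
    struct Node first  = {2, &second};
    struct Node *A = &first;               /* A is of type pointer to node */
    assert(A->link->link->link == NULL);   /* the last node points nowhere */
    return A->link->link->data;
}
```

Here `A` is a `struct Node *`, each of `first`, `second` and `third` is a `struct Node`, `data` is an `int`, and `link` is again a `struct Node *`.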
Okay so let us now create this particular linked list of integers that we are showing here
through our code. To be able to do so we will have to implement two operations, one
to insert a node into the linked list and another
operation to traverse the linked list. But before that, the first thing that we want to do is
declare a pointer to the head node, a variable that will store the address of the
head node. For the sake of clarity I'll write head node here. So I have declared a pointer to node
named A. Initially when the list is empty, when there is no element in the list, this pointer should
point nowhere, so we write a statement something like A is equal to null to say the same. Now with
these two statements what we have done is we have created a pointer to node named A, and
this pointer points nowhere, so the list is empty. Now let's say we want to insert a node in this list.
So what we do is we first create a node. Creating a node is nothing but creating a memory block
to store a node. In C we use the function malloc to create a memory block. As argument we pass the
number of bytes that we want in the block, so we say, give me a memory block that will be equal
to the size of a node. So this call to malloc will create a memory block. This is dynamically
allocated memory, memory allocated during runtime, and the only way to work with this kind of memory
is through reference to this memory location, through pointers. Let us assume that this memory block
assigned here is at address 200. Now malloc returns a void pointer that gives us the address of the
assigned memory block, so we need to collect it in some variable. So let's say we create a variable
named temp which is pointer to node, so we can collect the return of malloc, the address, in this
particular variable. We write a type cast here because malloc returns a void pointer and
temp is a pointer to node; C++ requires this cast, while in C the conversion is implicit and the cast is optional. So now we have created one node in the memory. Now what we need
to do is fill in the data in this particular node and adjust the links, which will mean writing the
correct address in the variable A and the link field of this newly created node. To do so we will
have to dereference this pointer variable temp that we just created. As we know, if we put
an asterisk sign in front of the pointer variable we mean dereferencing it to access or modify the value at
that particular address. Now in this case we have a node which has two fields, so once we dereference,
if we want to access each of the fields we need to put something like dot data here
to access the data and dot link to write to the link field. So we will write a statement like this
to fill in value 2 here, and we have this temp variable pointing to this node right now.
And the link part of this newly created node should be null because this is the first
and the last node. And the final thing that we need to do is write the address of this newly created
node in A, so we will write something like A is equal to temp. Okay, temp was to temporarily
store the address of the node till the time we had not fixed all the links properly. We can now
use temp for some other purpose. Our linked list is intact now; it has one node. For these two lines that
we have written here for dereferencing and writing the values into the new node, there is an alternate
syntax. Instead of writing something like star temp in brackets dot data we could also write
temp followed by this arrow and data. We will have two characters to make this arrow, one hyphen and
one right angle bracket. So we can write something like this.
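Putting the steps so far together, inserting the first node into an empty list might look like the sketch below. Wrapping the statements in a function named `insert_first` is my own assumption for the sake of a compilable example; the lesson writes them as loose statements. Both access syntaxes are shown, one on each field.

```c
#include <assert.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *link;
};

/* Insert the first node into an empty list. `A` is the current head
 * pointer and must be NULL. Returns the new head, or NULL if malloc fails. */
struct Node *insert_first(struct Node *A, int value) {
    assert(A == NULL);   /* this sketch only covers the empty-list case */

    /* Cast written as in the lesson; it is optional in C, required in C++. */
    struct Node *temp = (struct Node *)malloc(sizeof(struct Node));
    if (temp == NULL) return NULL;

    (*temp).data = value;   /* dereference, then pick the field with a dot */
    temp->link = NULL;      /* arrow syntax: first and last node */

    A = temp;               /* the head now stores the new node's address */
    return A;
}
```

A call such as `struct Node *A = insert_first(NULL, 2);` leaves `A` pointing at a single node holding 2 whose link is null.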
And for the link field below we can write the same kind of thing. To create a memory block in C++
we can use malloc, or we can use the new operator. So in C++ it gets very simple: we could
simply write node star temp is equal to new node, like this, and we would mean the same thing.
This is a lot cleaner, and the new operator is generally preferred over malloc, so if you're using C++,
new is recommended. So far through our program we created an empty list by creating this pointer
to the head node and assigning the value null to it initially. Then we created a node and we added
this first node in this linked list. When the list is empty and we want to insert a node, the logic
is pretty straightforward. When the list is not empty we may want to insert a node at the beginning
of the list, or we may want to insert a node at the end of the list, or we may even want to insert a
node somewhere in the middle of the list at a particular position. We will write separate functions
and routines for these different kinds of insertion, and we will see running code in a compiler
in our coming lessons. Let's just talk about the logic here in whatever unstructured code I
have right now. So I want to write code to insert two more nodes, each time at the end of the list.
We actually want to create the linked list with three nodes having values two, four and six;
that was our initial example in the beginning. Okay, so let us add two more nodes with values
four and six into this linked list. At this stage in our code we already have a variable temp which
is pointing to this particular node. We will create a new node and use the same variable name to collect
the address of this new node, so we will write a statement like this. So a new node is created
and temp now stores the address of this new node, which is located at address 100 here. Once again
we can set the data, and then because this is going to be the last node we need to set the link
as null. Now all we need to do is build the link to this particular node: write the address of this
newly created node into the address field of the current last node. To do so we will have to traverse the
list and go to the end of the list, so we will write something like this.
We can create a new variable temp1 which will be a pointer to node, and initially we can
point this variable to the head node by a statement like this. Then we can write
a loop like this. Now this is generic logic to reach the end of the list. It will not be so clear
if we see this logic with only one node as we have in this example, so let's draw a list with multiple
nodes. So we are pointing temp1 to the first node here, and if the link part of this node
is null we are at the last node, else we can move to the next node. So temp1 equal temp1 arrow
link will get us to the next node, and we will go on till we reach the last node.
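The loop described here can be sketched as a small C function. The function name `last_node` is a hypothetical choice of mine; `A` and `temp1` follow the names used in the lesson, and the sketch assumes the list has at least one node.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    int data;
    struct Node *link;
};

/* Return the last node of a non-empty list whose head pointer is A. */
struct Node *last_node(struct Node *A) {
    assert(A != NULL);               /* the loop assumes at least one node */
    struct Node *temp1 = A;          /* start at the head node */
    while (temp1->link != NULL)
        temp1 = temp1->link;         /* step to the next node */
    return temp1;                    /* link is NULL: this is the last node */
}
```

With a single node the loop body never runs and the head itself is returned, which is why the logic is easier to see on a list with several nodes.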
For the last node this particular condition, temp1 arrow link not equal to null, will be false
because the link part will be null, and we will exit this while loop. So this is our code logic
for traversal of the list all the way till the end. If we want to print the elements in the list we
will write something like this, and we will print temp1 arrow data inside this while loop. But
right now we want to insert at the end of the list and we are only traversing the list
to reach the last node. There is one more thing that I want to point out. We are using this variable
temp1 and initially storing the address from A in it. We are not doing something like A equal A arrow link
and using the variable A itself to traverse the list, because if we modify A we will lose
the address of the head node. So A is never modified. Whatever
variable stores the address of the head node is never modified; only these temporary variables are
modified to traverse the list. So finally after all this we will write a statement like temp1 arrow
link is equal to temp. temp1 is pointing here, so now this address part is updated and this link
is built. So we have two nodes now in the list. Once again, when we want to insert the
node with number six in this list, we will have to create a new node by this logic, then we will
have to traverse the list by this logic. So we will point temp1 here first and then the loop will
move temp1 to the end. Let's say this new block is at address 300. So this last line finally
will adjust the link of the node at address 100. To insert a node at the end there is one logic in
these four lines if the list is empty, and there is another logic in these remaining lines if the
list is not empty. Ideally we will be writing all this logic in a function.
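One possible shape for such a function is sketched below. It is not the function from the coming lessons: the name `insert_at_end`, the pointer-to-pointer parameter and the 1/0 return value are assumptions I am making so that both cases, empty and non-empty list, fit in one self-contained example.

```c
#include <assert.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *link;
};

/* Append `value` to the list whose head pointer is stored at `*head`.
 * Returns 1 on success, 0 if malloc fails. */
int insert_at_end(struct Node **head, int value) {
    assert(head != NULL);

    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL) return 0;
    temp->data = value;
    temp->link = NULL;               /* the new node will be the last one */

    if (*head == NULL) {             /* empty list: the new node is the head */
        *head = temp;
        return 1;
    }

    struct Node *temp1 = *head;      /* walk with a temporary pointer */
    while (temp1->link != NULL)
        temp1 = temp1->link;
    temp1->link = temp;              /* old last node now points to the new node */
    return 1;
}
```

Starting from `struct Node *A = NULL;`, three calls with 2, 4 and 6 build the example list, and `A` itself changes only on the first call.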
We will do that in our coming lessons. We will implement separate methods to print all the nodes
in a linked list and to insert a node at the end. We will also implement separate methods to insert a
node at the beginning of the list and at a particular position in the list. So this is all for this
lesson. Thanks for watching.
In our previous lesson we saw how we can map the logical view of a linked
list into a C or C++ program. We saw how we can implement two basic operations, one traversal of
the linked list and another inserting a node at the end of the linked list. In this lesson we will
see running code that will insert a node at the beginning of the linked list. So let's get started.
I will write a C program here. The first thing that we want to do in our program is
define a node. A node will be a structure in C. It will have two fields, one to store the data.
Let's say we want to create a linked list of integers, so our data type will be integer. If we wanted
to create a linked list of characters then our type would be character here. Then we will have another
field that will be a pointer to node, which will store the address of the next node. We can name this
variable link, or some people also like to name this variable next because it sounds more
intuitive. This variable will store the address of the next node in the linked list.
In C, whenever we have to declare a node or pointer to node we will have to write
struct node or struct node star. In C++ we will have to write only node star, and that's one difference.
Okay, so this is the definition of our node. Now to create a linked list we will have to create
a variable that will be pointer to node and that will store the address of the first node in the
linked list, what we also call the head node. So I will create a pointer to node here,
struct node star. We can name this variable whatever; often for the sake of understanding
we name this variable head. Now I have declared this variable as a global variable. I have not
declared this variable inside any function, and I'll come back to why I'm doing so. Now I'll write
the main method. This is the entry point to my program. The first thing that I want to do is
say head is equal to null, which will mean that this pointer variable points nowhere, so right now
the list is empty. So far what we have done here in our code is that we have created a global
variable named head which is of type pointer to node, and the value in this pointer variable is null,
so the list is empty. Now what I want to do in my program is ask the user
to input some numbers and insert all these numbers into the linked list. So I'll print
something like, how many numbers? Let's say the user wants to input n numbers, so I'll collect this
number in this variable n, and then I'll define another variable i to run the loop. So I'm running
a loop here. If it was C++ (or C99 and later) I could declare this integer variable right here inside the loop.
Now I'll write a print statement like this, and I'll define another variable x, and each time
I'll take this variable x as input from the user. And now I will insert this particular
integer x into the linked list by making a call to the method insert, and then
each time we insert we will print the value of all the nodes in
the linked list by calling a function named print. There will be no argument to this function print.
Of course we need to implement these two functions, insert and print. Let me first write down the
definition of these two functions. So let us implement these two functions, insert and print.
Let us first implement the function insert that will insert a node at the beginning of the linked
list. Now in the insert function what we need to do is first create a node. In C we can
create a node using the malloc function; we have talked about this earlier. malloc returns a pointer to
the starting address of the memory block. We are type casting it because malloc returns a
void pointer and we need a pointer to node, a variable that is pointer to node, and only then, if we
dereference it using an asterisk sign, will we be able to access the fields of
the node. So the data part will be x. And we have an alternate syntax for this particular
syntax: we could simply write something like temp and this arrow and data, and it will mean the same thing,
and this is more common. With these two lines in the insert function all we are doing is
creating a node. Let's say we get this node and let's assume that the address that we get for this
node is 100. Now there is a variable temp where we are storing the address. We can do one thing:
whenever we create a node we can set data to whatever we want to set, and we can set the link field
initially to null, and if needed we can modify the link field. So I'll write one more statement,
temp arrow next is equal to null. Remember temp is a pointer variable here and we are dereferencing
the pointer variable to modify the value at this particular node. temp will also take some space in
the memory; that's why I have shown this rectangular block for both the pointer variables head and temp.
And the node has two parts, one for the pointer variable and one for the data. So this part, the link part,
is null. We can either write null here or we can write it like this; it's the same thing, logically it
means the same. Now if we want to insert this node in the beginning of the list there can be two
scenarios. One is when the list is empty, like in this case. Then the only thing that we need to do is
point head to this particular node instead of pointing to null, so I will write a
statement like head is equal to temp, and the value in head now will be address 100. And that's
what we mean when we say a pointer variable points to a particular node: we store the address of that
node. So this is our linked list after we insert the first node. Let us now see what we can do to
insert a node at the beginning if the list is not empty, like what we have right now. Once again we
can create a node and fill in the value x here that is passed as argument. Initially we may set the link
field as null, and let's say this node gets address 115 in the memory, and we have this variable temp
through which we are referencing this particular memory block. Now unlike the previous case, if we
just set head is equal to temp, this is not good enough, because we also need to build this link.
We need to set the next or the link of the newly created node to whatever the previous head was.
So what we can do is write something like, if head is not equal to null, that is, if the list is
not empty, first set temp arrow next equal head. So we first build this link; the address here will be
100. And then we say head equal temp, so we cut this link and point head to this newly created
node. And this is our modified linked list after insertion of this second node at the beginning of
the list. Now one final thing here. This particular line, the third line, temp arrow next equal null,
is getting used only when the list is empty. If you see, when the list is empty head is already
null, so we can avoid writing two statements. We can simply write this one statement, temp arrow next
equal head, and this will also cover the scenario when the list is empty. Now the only thing remaining
in this program to get this running is the implementation of this print function so let us
implement this print function now what i will do here is i'll create a local variable which is
pointed to node named temp and i need to write struct node here i keep missing this in c you
need to write it like this and i want to set this as address of the head node so this global
variable has the address of the head node now i want to traverse the linked list so i will write
a loop like this while temp is not equal to null i'll keep going to the next node using this statement
temp is equal to temp dot next and at each stage i'll print the value in that node as temp dot data
now i'll write two more print one outside this while loop and one outside after this while loop
to print an end of line now why did we use a temporary variable because we do not want to
modify head because we will lose the reference of the first node so first we collect the address
in head in another temporary variable and we are modifying the addresses in this temporary
variable using temp is equal to temp dot next to traverse the list now let us now run this program
and see what happens so this is asking how many numbers you want to insert in the list
let's say we want to insert five numbers initially the list is empty let's say the first number that
we want to insert is two at each stage we are printing the list so the list is now two the first
element and the last element is two we will insert another number the list is now five two five is
inserted at the beginning of the list again we inserted eight and eight is also inserted at
the beginning of the list okay let's insert number one the list is now one eight five two
and finally I inserted number 10 so the final list is 10 1 8 5 2 this seems to be working
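The insert-at-the-beginning and print functions described above can be sketched in C roughly as follows. This is a minimal sketch that assumes a global `head` as in the lesson; the exact output format and the handling of a failed `malloc` are assumptions, not something the lesson shows.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

struct node *head = NULL;   /* global: NULL means the list is empty */

/* Insert x at the beginning of the list. */
void insert(int x)
{
    struct node *temp = malloc(sizeof *temp);
    if (temp == NULL) {     /* assumption: stop if no memory is available */
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    temp->data = x;
    temp->next = head;      /* also correct when head is NULL (empty list) */
    head = temp;            /* the new node becomes the first node */
}

/* Print every value from the first node to the last. */
void print(void)
{
    struct node *temp = head;   /* walk with a copy so head is not lost */
    printf("list is:");
    while (temp != NULL) {
        printf(" %d", temp->data);
        temp = temp->next;
    }
    printf("\n");
}
```

Calling `insert` with 2, 5, 8, 1 and 10 in that order and then calling `print()` writes `list is: 10 1 8 5 2`, matching the run described above.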
now if we were writing this code in c++ we could have done a couple of things we could have written
a class and organized the code in an object oriented manner we could also have used new operator
in place of the malloc function and now coming back to the fact that we have declared this head
as global variable what if this was not a global variable this was declared inside this main function
as a local variable so I'll remove this global declaration now this head will not be accessible
in other functions so we need to pass address of the first node as argument to other functions
to both these functions print and insert so to this print method we will pass let's say we
name this argument as head now we can name this argument as head or a or temp or whatever
if we name this argument as head this head in print will be a local variable of print and will not be
this head in main these two heads will be different these two variables will be different when the
main function calls print passing its head then the value in this particular head in the main
function is copied to this another head in the print function so now in the print function we
may not use this temp variable what we can do is we can use this variable head itself to traverse the
list and this should be good we are not modifying this head here in the main similarly to the insert
function we will have to pass the address of the first node and this head again is just a copy
this is again a local variable so after we modify the linked list the head in main method should
also be modified there are two ways to do it one we can return the pointer to node from
this method so in the main method insert function will take another argument head and we will have
to collect the return into head again so that it is modified now this code will work fine
whoops i forgot to write a return here return head and we can run this program like before
we can give all the values and see that the list is building up correctly there was another way
of doing this instead of asking this insert function to return the address of head we could
have passed this particular variable head by reference so we could have passed insert ampersand
head head is already a pointer to node so in the insert function we will have to receive
pointer to pointer node star star and to avoid confusion let's name this variable something
else this time let's name this pointer to head so to get head we will have to write something like
we will have to dereference this particular variable and write asterisk pointer to head
everywhere and the return type will be void sometimes we want to name this variable as head
this local variable as head doesn't matter but we will have to take care of using it properly
now this code will also work as you can see here we can insert nodes and this seems to be going well
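The two ways of working with a local `head` described above can be sketched side by side. The names `insert_return`, `insert_by_ref` and `new_node` are hypothetical and chosen only to tell the two variants apart; in the lesson both variants are simply called insert.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Hypothetical helper: allocate a node holding x. */
static struct node *new_node(int x)
{
    struct node *temp = malloc(sizeof *temp);
    if (temp == NULL) {         /* assumption: stop if no memory is available */
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    temp->data = x;
    temp->next = NULL;
    return temp;
}

/* Way 1: head is a local copy, so the new head is handed back to the caller. */
struct node *insert_return(struct node *head, int x)
{
    struct node *temp = new_node(x);
    temp->next = head;
    return temp;                /* caller writes: head = insert_return(head, x); */
}

/* Way 2: receive the address of the caller's head and modify it in place. */
void insert_by_ref(struct node **pointer_to_head, int x)
{
    struct node *temp = new_node(x);
    temp->next = *pointer_to_head;   /* *pointer_to_head is the caller's head */
    *pointer_to_head = temp;
}

/* head here is a copy, so moving it does not disturb the caller's head. */
void print(struct node *head)
{
    while (head != NULL) {
        printf(" %d", head->data);
        head = head->next;
    }
    printf("\n");
}
```

With a local `struct node *head = NULL;` in main, the first variant is called as `head = insert_return(head, 2);` and the second as `insert_by_ref(&head, 2);`.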
if you do not understand this concept of scope you can refer to the description of this video for
additional resources so this was inserting a node at the beginning of the linked list thanks for watching

in our previous lesson we had written code to insert a node at the beginning of the linked list
now in this lesson we will write a program to insert a node at any given position in the linked list
so let me first explain the problem in a logical view let's say we have a linked list of integers
here there are three nodes in this linked list let us say they are at addresses 200 and 250
respectively in the memory and we have a variable head and that is pointer to node that stores the
address of the first node in the list now let us say we number these nodes we number these positions
on a one based index so this is the first node in the list and this is the second node and this
is the third node and we want to write a function insert that will take the data to be inserted in
the list and the position at which we want to insert this particular data so we will be inserting
a node at that particular position with this data there will be a couple of scenarios the list could
be empty so this variable head will be null or this argument being passed to the insert function
the position n could be an invalid position for example 5 is an invalid position here
for this linked list the maximum possible position at which we can insert a node in this list will
be 4 if we want to insert at position 1 we want to insert at the beginning and if we want to
insert at position 4 we want to insert at end so our insert function should gracefully handle
all these scenarios let us assume for the sake of simplicity for the sake of simplifying our
implementation that we always give a valid position we will always give a valid position
so that we do not have to handle the error condition in case of invalid position
the implementation logic for this function will be pretty straightforward
we will first create a node let's say in this example we want to insert a node with value
8 at third position in the list so i'll set the data here in the node the data part is 8
now to insert a node at the nth position we will first have to go to the n minus 1th node
in this case n is equal to 3 so we will go to the second node
now the first thing that we will have to do is we will have to set the link field of this newly
created node equal to the link field of this n minus 1th node so we will have to build this link
let's say the address that we get for this newly created node is 150 once we build this link we
can break this link and set the link of this n minus 1th node as address of this newly
created node we may have special cases in our
implementation like the list may be empty or maybe we may want to insert a node at the beginning
let's say we will take care of special cases if any in our actual implementation
so now let's move on to implement this particular function in our program
in my c program the first thing that i need to do is i want to define a node so node will be a
structure and we have seen this earlier so node has these two fields one data of type integer and
another next of type pointer to node now to create a linked list the first thing that i need to create
is a pointer to node that will always store the address of the first node or the head node in
the linked list so i will create struct node star let's name this variable head and once again i
have created this variable as a global variable to understand linked list implementation we need
to understand what goes where what variable sits in what section of the memory and what is the scope
of these variables what goes in the stack section of the memory and what goes in the heap section
of the memory so this time as we write this code we will see what goes where in the main method
first i'll set this head as null to say that initially the list is empty so let us now see what
has gone where so far in our program in what section of the memory and the memory that is
allocated to our program or application is typically divided into these four parts or these four sections
we have talked about this in our lesson on dynamic memory allocation there is a link to our lesson
on dynamic memory allocation in the description of this video i'll quickly talk about what these
sections are one section of the applications memory is used to store all the instructions that need
to be executed another section is allocated to store the global variables that live for the entire
lifetime of the program of the application now one section of the memory which is called stack
is used to store all the information about function call executions to store all the local
variables and these three sections that we talked about are fixed in size they do not grow while the
program is running the last section that we call heap or free store is not fixed and we can request
memory from the heap during runtime and that's what we do when we use malloc or new operator
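As a rough illustration of what lives where, the sketch below marks each variable with the section it would typically occupy. The function name `make_node` is hypothetical, and the exact memory layout depends on the compiler and platform.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Global variable: stored with the globals, alive for the whole program. */
struct node *head = NULL;

/* x and temp are local: they live in the stack frame of this call. */
struct node *make_node(int x)
{
    /* The block that malloc hands back comes from the heap. */
    struct node *temp = malloc(sizeof *temp);
    if (temp != NULL) {
        temp->data = x;
        temp->next = NULL;
    }
    /* temp itself is gone once we return; the heap block stays until free(). */
    return temp;
}
```

The heap block has no name of its own, so the returned address has to be kept in some pointer variable, otherwise the block can no longer be reached or freed.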
now i have drawn these three sections of the memory stack heap and the section to store the
global variables in our program we have declared a global variable named head which is pointer to
node so it will go and sit here and this variable is like anyone can access it initially value here
is null now in my program what i want to do is i first want to define two functions uh insert
and this function should take two arguments data and the position at which i want to insert a node
and insert that particular node at that position insert data at that position in the list
and another function print that will simply print all the numbers in the linked list
now in the main method i want to make a sequence of function calls first i want to insert number
two the list is empty right now so i can only insert at position one so after this insert the list
will be having this one number this particular number two and let's say again i want to insert
number three at position two so this will be our list after this insertion and i will make
two more insertions and finally i'll print the list so this is my main method i could have also
asked a user to input a number and position but let's say we go this way this time
now let us first implement insert i'll move this print above so the first thing that i want to do
in this method is i want to create a node so i will make a call to malloc in c++ we can simply
write a new node for this call to malloc and this looks a lot cleaner let's go c++ way this time
now what i can do is i can first set the data field and set the link initially
as null i have named this variable temp one because i want to use another temp variable
in this function i'll come to that in a while now we first need to handle one special case
when we want to insert at the head when we want to insert at the first position
so if n is equal to one we simply want to set the link field of the newly created node as whatever
the existing head is and then adjust this variable to point to the new head which will be this newly
created node and we will be done at this stage so we will not execute any further and return from
this function if you can see this will work even when the list is empty because the head will be
null in that case i'll show a simulation in the memory in a while so hold on till then
things will be pretty clear to you after that now for all other cases we will first need to go to
the n-1th node as we had discussed in our logic initially so what i'll do is i'll create another
pointer to node name this variable temp two and we will start at the head and then we will run a loop
and go to the n-1th node something like this we will run the loop n-2 times because right now we
are pointing to head which is the first node so if we do this temp two equal temp two dot next
n-2 times we will be pointing temp two to n-1th node and now the first thing that we need to do
is set the next or the link field of newly created node as the link field of this n-1th node and then
we can adjust the link of this n-1th node to point to our newly created node and now i'm writing this
print here i've written this print here we have used a temporary variable a temporary pointer to
node initially pointed it to head and we have traversed the whole list okay so let
us now run this program and see what happens we are getting this output which seems to be correct
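The insert function built up above can be sketched in C as follows. The lesson switches to the C++ `new` operator at this point; the sketch stays with `malloc` so that it remains plain C. Positions are 1-based and assumed to be valid, as in the lesson; the `assert` only documents the lower bound and is an addition of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

struct node *head = NULL;

/* Insert data at 1-based position n; n is assumed to be a valid position. */
void insert(int data, int n)
{
    assert(n >= 1);                 /* only the lower bound is checked here */

    struct node *temp1 = malloc(sizeof *temp1);
    if (temp1 == NULL) {            /* assumption: stop if no memory is available */
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    temp1->data = data;
    temp1->next = NULL;

    if (n == 1) {                   /* special case: new node becomes the head */
        temp1->next = head;         /* works for an empty list too */
        head = temp1;
        return;
    }

    struct node *temp2 = head;      /* start at the first node */
    for (int i = 0; i < n - 2; i++) /* n-2 steps reach the (n-1)th node */
        temp2 = temp2->next;

    temp1->next = temp2->next;      /* build the new link first */
    temp2->next = temp1;            /* then redirect the (n-1)th node */
}
```

The lesson names only the first two calls, `insert(2, 1)` and `insert(3, 2)`. Assuming the remaining two are `insert(4, 1)` and `insert(5, 2)`, the list ends up as 4 5 2 3.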
the list should be four five two three in this order now i have this code i'll run through this
code and show you what's happening in the memory when the program starts execution initially the
main method is invoked some part of the memory from the stack is allocated for execution of a
function all the local variables and the state of execution of this function is saved in this
particular section we also call this stack frame of a function here in this main method we have
not declared any local variable we just set head to null which we have already done here now the
next line is a call to function insert so the machine will set the execution of this particular
method main on hold and go on to execute this call to insert so insert comes into this stack
and insert has a couple of local variables it has two arguments data and this variable n
this stack frame will be a little larger because we will have a couple of local variables
and now we create this another local variable which is a pointer to node temp1 and we use the new operator
to create a memory block in the heap and this guy temp1 initially stores the address of this
memory block let's say this memory block is at address 150 so this guy stores the address 150
when we request some memory to store something on the heap using new or malloc we do not get a
variable name and the only way to access it is through a pointer variable so this pointer variable
is the remote control here kind of so here when we say temp1 dot data is equal to this much
through this pointer which is our remote we are going and writing this value to here
and then we are saying temp1 dot next equal null so null is nothing but address 0 so we are writing
address 0 here so we have created a node and in our first call n is equal to 1 so we will come
to this condition now we want to set temp1 dot next equal head temp1 dot next is
this section this second field and this is already equal to head head is null here and this is
already null null is nothing but 0 setting temp1 dot next equal head works even for the
empty case because head would be null and now we are saying head is equal to temp1
so head guy now points to this because it stores address 150 like temp1
and in this first call to insert after this we will return so the execution of insert will
finish and now the control returns to the main method we come to this line where we make another
call to insert with different arguments this time we pass number 3 to be inserted at position 2
now once again memory in the stack frame will be allocated
for this particular call to insert the stack frame allocation is corresponding to a particular
call so each time the function execution finishes all the local variables are gone from the memory
now once again in this call we create a new node we keep the address initially in this temporary
variable temp1 now let's say we get this node at address 100 this time now n is not equal to 1 we
will move on to create another temporary variable temp2 now we are not creating a new node and
storing the address in temp2 here we are saying temp2 is initially equal to head so we store the
address 150 so initially we make this guy point to the head node and now we want to run this loop
and want to keep going to the next node until we reach n minus 1th node in this case n is equal to
2 so this loop will not execute this statement even once n minus 1th node is the first node itself
now we execute these two lines the next of the newly created node will be set first so we will
build this link oops no temp2 dot next is 0 only so even after reset this will be 0
and now we are setting temp2 dot next as temp1 so we are building this link and now this call to
insert will finish so we go back to the main method so this is how things will happen for
other calls also so after everything we have inserted when we will reach this print statement
in the main function our list will be something like this in the memory this is a little messy
i've chosen these addresses as per my convenience for the sake of example and now print will execute
and once again i'm using a temp variable in print by now it should have been clear to you
why we use temp variable again and again and why this variable head that stores the address of
the first node is so important now what if this head was not global what if we would have declared
this head inside the main method we have talked about this in our previous lesson head will not
be accessible everywhere so in each call to these functions in each call to insert we will have to
return some value from the function to update this head or we will have to pass this head by
reference we have talked about this in our previous lesson so this is it for this lesson
in our next lesson we will see a program to delete a node at a particular position in the list
so thanks for watching

in our previous lesson we wrote a program to insert a node at nth position
or a given position in a list in a linked list now in this lesson we will write a program to delete
a node at any given position in a linked list so once again i have drawn a linked list here
we have four nodes in this list at addresses 100, 200, 150 and 250 respectively so this is
my example of a linked list of integers and let's say we number the positions on a one based index
so this is the first node in the list and this is the second node this is the third node and this
is the fourth node when we talk about deleting a node from the linked list we will have to do two
things first we will have to fix the links so that the node is no more a part of the list
let's say in this case we want to delete the node at third position so we will go to the second node
for nth node we will have to go to the n minus 1th node and we will have to set the link part of
the n minus 1th node as the link of the nth node which will be the n plus 1th node so we will cut
this link and now this node at address 150 is not part of the linked list because when we will
traverse the linked list we will go from address 100 to 200 and from 200 we will go to 250
this is one scenario for deletion in which we have a node before and a node after
there will be special cases like we may want to delete the node at the first position or the
head itself in that case we will have to point head to the second node we will have to build this
link now we will talk about all these special cases in our implementation let's first understand
the logic now fixing the links is not good enough because all that we do when we fix the links
is that we detach the node from the linked list so that it is no more accessible but it is still
occupying space in the memory as we know a node is allocated space from what we call the dynamic
memory or the heap section of the memory we have talked about this earlier in C or C plus plus
we have to explicitly free this memory when we are done using it because it is not automatically
deallocated and memory being a crucial resource we do not want to consume it unnecessarily when
we do not need it so the second thing that we will have to do is we will have to free the space
that's being taken by the node and that's when the node will actually be deleted from the memory
so let us now write code for this I'm writing my C program here the first thing that I have
done is I have defined a node which is a structure with two fields one to store data and another
to store address of the next node so the second field is a pointer to node now to create a linked
list we will have to first create a pointer to node a variable which is pointer to node and
that will store the address of the head node or the first node in the list and now I want to define
three functions first insert function that will take some value some data to be inserted into the
list and always insert this value at the end of the list then I want to define a print function
that will print all the elements in the list now we have defined this variable head as a global
variable so it will be accessible to all these functions and the third function that I want to
write is delete that will take the position n of the node to be deleted and delete the node at
that particular position we will come back to implementing these methods first I'll write the
main method so in the main method first what I'll do is I'll set head as null so at this stage
the list is empty and then I'll make a couple of calls to insert function to insert some integers
in the list so after this fourth insert the list will be two four six five because we are always
inserting at the end of the list this insert function will insert the node at the end of the list
now what I want to do in my main method is I want to ask a user for a position and I'll input this
position from the console and then I'll delete the node at this particular position and then
I'll print the whole linked list now let's also make a call to print after all the inserts
okay so this is what we want to do in our main method we want to insert four integers
in a linked list to create a list two four six five in this order and then I want to print the list
then I want to input a number from the console and delete the node at that particular position
now let us assume that we will always give a valid position and in my implementation also
I will not handle the error condition when position will not be valid
we have seen implementation of insert and print earlier so I will not go into their implementation
details what I'll do now is I'll implement delete function now in my delete function let's first
handle the case when there is a node before the node that we want to delete so we have an n minus
one-th node what I'll do is I'll first create a temporary variable that is pointer to node
and point this to head and using this temporary variable we will go to n minus one-th node to go
to the n minus one-th node we will have to run a loop n minus two times and we will have to
do something like this temp1 is equal to temp1.next now what I'll do is I'll create a variable to
point to the nth node name this temp2 and this will be equal to temp1.next and now I can fix the link
I can say that adjust the link section the link part of n minus one-th node
to point to n plus one-th node which will be temp2.next now our link is fixed
and this variable temp2 stores the nth node reference to the nth node so we can make a call to free
function now free function deallocates whatever memory is allocated through malloc if we were using
c++ and had used new operator we should have said delete temp2
okay now we should be good this much code will work for scenarios when we have an n minus one-th
node and even if there is no n plus one-th node if n plus one-th position is null this will work
for that scenario too I'll leave that as an exercise for you to validate now we have not handled
one special case when we want to delete the head node so if n is equal to one then what we want
to do is we just want to set head as temp1.next temp1 is right now equal to head and now head has
moved on to point to the second node and temp1 still points to the first node so links are fixed
and we can free the first node which is now detached from the linked list because head is now
pointing to the second node okay so this is our delete function I have missed one thing here
for n equal to one we should not execute this remaining section of the code so either we put an else
statement after this or what we can do is we can say return after we execute these statements
for this condition now this code should work if I've got everything right so let us now run this
and see what happens I have already written the insert and print functions I'll come back to this
main function this is my list 2 4 6 5 and I can enter any of the positions one two three or four
so let's first say we want to delete the head node and we are printing the list after deleting
a particular node so the list now is 4 6 5 this seems to be correct let us run this again and
this time I delete number five from position four the list is now 2 4 6 which is correct again
similarly if I enter position two the list is 2 6 5 which is correct so we seem to be good
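Put together, the delete function described above might look like the following sketch. It is named `delete_at` here (a hypothetical name) because `delete` is a keyword in C++. The append-style `insert` is included only so the example is self-contained, and the `assert` documents the lesson's assumption that the position is valid.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

struct node *head = NULL;

/* Append data at the end of the list (used here only to build a test list). */
void insert(int data)
{
    struct node *temp = malloc(sizeof *temp);
    if (temp == NULL) {             /* assumption: stop if no memory is available */
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    temp->data = data;
    temp->next = NULL;
    if (head == NULL) {
        head = temp;
        return;
    }
    struct node *last = head;
    while (last->next != NULL)
        last = last->next;
    last->next = temp;
}

/* Delete the node at 1-based position n; n is assumed to be valid. */
void delete_at(int n)
{
    assert(head != NULL && n >= 1); /* upper bound is not checked */

    struct node *temp1 = head;
    if (n == 1) {
        head = temp1->next;         /* head moves to the second node */
        free(temp1);                /* release the old first node */
        return;                     /* do not fall through to the general case */
    }
    for (int i = 0; i < n - 2; i++) /* n-2 steps reach the (n-1)th node */
        temp1 = temp1->next;

    struct node *temp2 = temp1->next;   /* the nth node */
    temp1->next = temp2->next;          /* skip over it: (n-1)th -> (n+1)th */
    free(temp2);                        /* release its memory */
}
```

Starting from the list 2 4 6 5, `delete_at(3)` leaves 2 4 5. The same code handles deleting the last node, because `temp2->next` is then NULL and simply becomes the new end of the list.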
I'll quickly walk you through this code in the logical view to make things further clear
let's say we first make a call to delete node from the first position that is we want to delete
the head node so in this code what we are doing is we are first creating a variable temp1 which is
pointer to node initially temp1 is equal to head so it stores the address 100 so it points to the
head node now n is equal to 1 so we come to this instruction head is equal to temp1 dot next
actually this is temp1 arrow next but while reading we read this as temp1 dot next this is
nothing but a syntactical sugar for this statement asterisk temp1 dot next
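The equivalence between the arrow form and the dereference form can be checked with a small sketch. The two function names are hypothetical and exist only to put the two spellings next to each other.

```c
struct node {
    int data;
    struct node *next;
};

/* Arrow form: follow the pointer and read the next field in one step. */
struct node *next_with_arrow(struct node *temp1)
{
    return temp1->next;
}

/* Long form: dereference first, then use the dot operator. */
struct node *next_with_star(struct node *temp1)
{
    return (*temp1).next;
}
```

The parentheses in `(*temp1).next` are required: the dot operator binds more tightly than the asterisk, so `*temp1.next` would be read as `*(temp1.next)`, which does not compile when `temp1` is a pointer.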
so we are de-referencing this pointer variable to go to this node and then accessing the next
field of this node now we are saying head is equal to temp1 dot next so head is now 200
so we are building this link and breaking this link and now in the next line we say free temp1
so we want to free the memory which is being pointed to by this variable temp1
temp1 still points to this node at address 100 so this node now will be cleared from the memory
and now we return so this function does not execute any further it finishes its execution
once the function execution finishes temp1 which was a local variable also gets cleared from the
memory head is a global variable so it does not get cleared this is how we know the linked list
this is the identity of the linked list this particular variable head let's run through this code
again and this time i want to delete the node at third position in the list i have drawn this
initial list so once again we create this variable temp1 we say that the address here is equal to
100 so it points to the head node or the first node and now n is not equal to 1 it is equal to
3 so we come to this particular loop n is equal to 3 so this loop will execute exactly once
this statement will execute exactly once so temp1 will now move to address 200 so temp1 is now pointing
to the second node this is what we wanted to do we wanted temp1 to point to n minus 1th node n
is 3 here now we create another variable another pointer to node temp2 and we set this guy as temp1
dot next temp1 dot next is 150 so we set this guy as 150 so this guy points to the nth node
or the third node now in the next line we are saying that temp1 dot next this value which is 150
right now is now temp2 dot next address of the n plus 1th node or fourth node so this guy will
now be 250 so we are building this link and we are breaking this link so we have fixed the links
and now finally we are saying that free the memory which is being pointed to by temp2
so now this third node the memory block will be deleted from the memory and once this function
execution finishes all the local variables temp1 and temp2 will be cleared and this is what the list
will be this node at address 250 will now be the third node so this was deleting a node at a
particular position in the linked list now we can also have a problem where we may want to delete
a node with a particular value now you can try implementing it in the coming lessons we will see
more problems on linked list so thanks for watching

in our lessons on linked list so far we have
implemented some of the basic scenarios like inserting a node in linked list and deleting a node
from linked list in this lesson we will write code to reverse a linked list this is one of the most
favorite interview questions and this is a really interesting problem so let me first define the
problem let's say we have been given a linked list of integers like this so this is our input
we have four nodes in this linked list at addresses 100, 200, 150 and 250 respectively
I always write these addresses in the logical view because it's really important that we visualize
how things are in the memory and what is what like this first node that we also call the head node
is being pointed by this particular variable named head so this variable is basically storing the
address of the head node now this variable is only a pointer this is not the head node itself
and we do not have any other identity of the linked list except the address of the head node
so given a linked list like this if we have to reverse it and by reversing we do not mean moving
around data like we cannot move five at address 100 two at address 250 and do something like this
we actually have to adjust the links so our output should be something like this the head
pointer should point to this node at address 250 and we should go like 250 to 150 150 to 200 200 to 100
and this node at address 100 should have address 0 or null in each of these nodes this first field
in red is the data part and the second field is the address part so this is what we will get when
we will reverse the list there are two approaches to solve this problem one is an iterative approach
where we will be using a loop we will traverse through the list and at each step we will reverse
one of the links another solution is using recursion in this lesson we will try to understand
the iterative solution so coming back to our input list the iterative solution is relatively
easier to understand what we can do is we can traverse the whole list and as we go to each node
we can adjust the link part of that node to make it point to the previous node instead of the next
node so we will start at the first node at each step we want to reverse the link so we want to
make the node point to the previous node instead of the next node for the first node there is no
previous node so let's say the previous node is null and now we want to cut this particular link
and we want to build this particular link so we will simply change the address field to 0
and we have reversed the link part of this particular node and now we will go to the next
node in the list we will come to this node of course the question would be how would we
go to the next node if we have broken this link here we will come back to that in our implementation
details let's say we are able to traverse the list and go to each of the nodes at each step
let's say we store all the relevant information to do that in some temporary variables now at this
node again we will reverse the link so the address part will be set as 100 here now we will go to the
next node at address 150 once again to reverse the link we will set the address as 200 here
so we will break this link and basically we are building this link and now we will go to address
250 the next node we will set the address 150 here so we will cut this link and build this link
and finally when we have reached the last node we will adjust the address in this
head variable to 250 so this particular variable this particular pointer
will point to this node at address 250 and our linked list is reversed now
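The link-by-link reversal walked through above can be sketched as a single loop. The temporary variables mentioned in the walkthrough are spelled out here as `prev`, `current` and `next`; these names are assumptions of this sketch, and `head` is taken to be a global pointer to the first node.

```c
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};

struct node *head = NULL;

/* Reverse the list in place by redirecting each node's link. */
void reverse(void)
{
    struct node *prev = NULL;       /* the first node has no previous node */
    struct node *current = head;

    while (current != NULL) {
        /* Remember the next node before the link to it is overwritten. */
        struct node *next = current->next;
        current->next = prev;       /* point this node back at the previous one */
        prev = current;             /* advance both pointers by one node */
        current = next;
    }
    head = prev;                    /* prev is now the old last node */
}
```

Only the links change. Every node keeps its data and its address, so for the example list the traversal order becomes 250, 150, 200, 100.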
so let us implement this particular logic in a real c program I will redraw the original input
list in my c code I will define node as a structure like this this is how we have defined a node
in all our previous lessons so there will be two fields one to store the data which will be of type
integer and another to store the address of the next node we will name this field next and it
will be of type pointer to node and let's say head is a global variable so head is a pointer to node
head is a variable which is a pointer to node and it is a global variable so it is accessible to all
the functions we do not need to pass it around to functions now all I want to do in my code is
I want to write a reverse function that will reverse the linked list which is pointed to
by this particular pointer head as we said we will traverse the whole list and at each step
we will modify the link field of the node to make it point to the previous node instead of the next
node so how do we traverse the list we would traverse the list in our c code something like this
we will first take a variable which will be pointer to node let's say we will name it temp
then first we will set temp to head by saying this we will make temp point to
the first node and then we will run a loop like this we will say that while temp is not
equal to null take temp to the next address with a statement like temp is equal to temp dot next
in our problem here we don't just have to traverse the list as we traverse the list we have to
reverse the link so we have to set the address field of a particular node as the address of the
previous node instead of the next node now in a linked list we would always know the address
of the next node but we would never know the address of the previous node so as we traverse the
list we will have to keep track of the previous node in another variable so what I will do here is
something like this I will also declare a variable named previous and initially set it to null
because for the first node or the head node the previous node is null and now in my loop we will
have to update both these variables the variable temp that will store the address of the current node
and the variable previous that will store the address of the previous node and now in my loop I can do
something like this at each step if temp is our current node as we are traversing the list
then we will say that temp dot next is equal to previous so we will set the link part of the
current node as the address of the previous node in our example here at the first step we will say
that temp dot next will be 0 null is nothing but address 0 so we will cut this link and we will
build this link now we should be able to move temp to 200 now and we should be able to move
previous to 100 now in the next step but there is a problem as soon as we adjusted the link of
this particular node at address 100 to make it point to null we lost the address of the next
node so how do we move temp to this particular node at address 200 we cannot set temp equal temp
dot next now if we set temp equal temp dot next now we will go to null so this is a problem
so at each step in our iteration before we set the link field of the current node to make it point
to the previous node we should store the address of the next node in
another temporary variable so what I'll do here is something like this first of all I want to name
this particular variable temp as current to mean that this is the current node at any stage in my
iteration so we initially set current to head and then we are running the loop as while current is
not equal to null and then I've also declared one more temporary pointer variable named next
what I'll do at each step each step in my iteration inside the while loop is that first I'll say
something like next is equal to current dot next so first I'll store the address of the next node
in this particular variable next so in our example here for the first node initially things will look
something like this now we can set the link part of the current node as address of the previous
node with a statement like this so when we will write the address 0 here initially we will break
this link and create this link we will not lose the information about the next node now we can
redefine our previous and current so we will first move previous to current and then we will move
current to next please note that this particular variable next is a local variable in the reverse
function and when we say something like current dot next we mean the link field in the node while
when we say when we simply say next we mean this particular local pointer variable so they're
different strictly speaking this is not current dot next this is current arrow next (current->next) which is an alternate
syntax for (*current).next so we use the asterisk operator to dereference that address
and then we access the next field for the sake of saying we say current dot next or temp dot next
so with these two lines in our loop we are resetting our previous and current pointers
this is how we are traversing the list if you see in the next iteration current is 200 it is not
equal to null null is 0 so we will go to this particular statement next is equal to current dot
next so next we'll now store the address 150 and now we will say current dot next is equal to previous
so we will cut this link previous is 100 right now so we will set 100 here so basically we will
build this particular link and then we will move we will first move previous to current
and then move current to next and we will go on like this
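The loop described above can be sketched in C roughly as follows. This is a minimal sketch, assuming the node structure and the global head pointer from the lesson; the variable names prev, current and next follow the narration, and the last line (repointing head once the loop ends) is the final step of the walkthrough.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node* next;
};

struct Node* head = NULL;  /* global pointer to the first node, as assumed in the lesson */

/* Reverses the list in place by flipping the link of every node. */
void Reverse(void) {
    struct Node* prev = NULL;     /* the first node has no previous node */
    struct Node* current = head;  /* start the traversal at the head node */
    struct Node* next;            /* local variable, not the next field of a node */

    while (current != NULL) {
        next = current->next;     /* save the next address before cutting the link */
        current->next = prev;     /* make the current node point to the previous node */
        prev = current;           /* first move prev to current */
        current = next;           /* then move current to next */
    }
    head = prev;                  /* prev now holds the address of the old last node */
}
```

Because the loop body never runs when head is NULL, the same code leaves an empty list untouched, and a one-node list ends up with head unchanged.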
so finally we will reach a stage like this when current will be equal to null we will come out of
the loop and when we will come out of the loop this particular variable previous this particular
pointer previous will store the address of the last node and there is one more thing remaining
here we need to adjust this particular variable head this link at this stage does not exist and
in my code I'll say head should now be equal to the address in variable previous so head is now
250 this is our new head and now our list is reversed there are a couple of things that I
want to point out here one thing is that we must see whether our implementation is working for
all test cases so we must also verify it for special or corner test cases in this case corner
test case will be when the list is empty in that case head will be null or when the list is having
only one node if you see this particular implementation will work for these two scenarios
give it some time and you should be able to figure it out let's now run this code with
complete implementation of all the functions to insert and print nodes in my code here i have
written reverse function to accept the address of the head node as argument and then return the
address of the head node after modification of the list after reversal of the list
and then i have written the main method in which i'm declaring head as a local variable
and then i'm using couple of insert functions i'm making couple of calls to insert function
insert function also takes two arguments the address of the head node and the data to be inserted
and it returns back the address of the head node it could either be modified or not modified
let's say we are inserting at the end of the list so initially our list will be two four six
eight and then we are making a call to the print function which i have written to
print the elements in the list and then i'm making a call to reverse and finally printing again
my logic of the reverse function remains the same except that i've changed the method signature
and in the end i'm returning head which will return the address of the head node
let's say we have written all the other functions insert and print correctly
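A possible shape for the complete program is sketched below. The lesson only says that insert appends at the end, that print prints the elements, and that reverse now takes and returns the head address, so the bodies of Insert and Print here are assumptions, not the code shown on screen.

```c
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node* next;
};

/* Appends a node at the end of the list and returns the (possibly new) head. */
struct Node* Insert(struct Node* head, int data) {
    struct Node* node = malloc(sizeof *node);
    if (node == NULL) return head;      /* allocation failed: leave the list unchanged */
    node->data = data;
    node->next = NULL;
    if (head == NULL) return node;      /* empty list: the new node becomes the head */
    struct Node* temp = head;
    while (temp->next != NULL) temp = temp->next;  /* walk to the last node */
    temp->next = node;
    return head;                        /* head is not modified in this case */
}

/* Prints the elements separated by spaces, then a newline. */
void Print(struct Node* head) {
    while (head != NULL) {
        printf("%d ", head->data);
        head = head->next;              /* head is a local copy, so this is safe */
    }
    printf("\n");
}

/* Same loop as before, but the head is passed in and the new head is returned. */
struct Node* Reverse(struct Node* head) {
    struct Node* prev = NULL;
    struct Node* current = head;
    struct Node* next;
    while (current != NULL) {
        next = current->next;
        current->next = prev;
        prev = current;
        current = next;
    }
    return prev;
}
```

In main, head would start as NULL, be built with four calls of the form head = Insert(head, 2) for the values 2, 4, 6 and 8, then be printed, reversed with head = Reverse(head), and printed again. The output the lesson reports for that run is 2 4 6 8 followed by 8 6 4 2.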
these are the two functions insert and print so let's now run this code and see what happens
before the list is reversed the output is two four six eight and after the list is reversed
the output is eight six four two let us try this for the case when we have only one element in the list
so i'll remove i'll comment out these three insert statements and this also seems to be working
so this was reversal of linked list through iteration in the next lesson we will write code to reverse
linked list using recursion so thanks for watching
in our series on linked list so far we have implemented
some of the basic operations like insertion deletion and traversal now in this lesson we will write
code to traverse and print the elements of a linked list using recursion
prerequisite for this lesson is that you should understand recursion as a programming concept
recursive traversal of linked list actually helps us solve a couple of interesting problems
but in this lesson we will keep it simple we will just traverse and print all the elements in linked
list using recursion and we will write one simple variation to print all the elements in reversed
order using recursion we will actually not reverse the list we will just print the elements in reversed
order so once again i have taken example of a linked list of integers here we have four nodes
each rectangle here is a node it has two fields one to store the data and another to store the
address of the next node let's say we have four nodes at addresses 100 200 150 and 250 respectively
and of course we will also have a variable that will store the address of the head node let's name
this variable head programmatically in our c or c++ program a node will be defined something
like this we will have a structure with two fields one to store the data and another to store the
address of the next node what we want to do in this particular lesson is that we want to write
two functions first we want to write a function named print that will take address of a node as
argument we will pass this function the address of the head node so let's name this argument head
and in this function we will use recursion to print the elements in the list so for this particular
example here if we want to print a space separated list of all the elements our output will be
something like this and we also want to write another function named reverse print here also
we will take the address of a node so we will pass this guy the address of the head node and in this
function we will use recursion to print the elements in the list in reversed order so if we have to
print a space separated list for this example and our output will be something like this so let's first
implement the print function in my c code here i'll declare print function like this it will
take as argument the address of a node so the argument is of type pointer to node
initially we will pass the address of the head node we can name this argument head or we can
name this argument p we can name it whatever but we must understand that this will be a local
variable and let's not bother about other infrastructure in the code like how we would create a linked
list and how we would insert a node in the linked list let's assume that they are in place so let's
keep the name of this particular argument p now recursion is a function calling itself so we have
been passed the address of a node initially the head node so what we can do in our code is first
we can print the value at that particular node with a printf statement like this and then we can
make another call to the print function and this time we will pass the address of the next node
with a statement like this this next field is also a pointer to node so this will pass the address
of the next node there is one more important thing in recursion
and that we should never forget and it's the exit condition from recursion we should not go on
making recursive calls infinitely so in this case if we go from the first node to the second node
and from the second node to the third node using recursion then finally at one stage p will be
equal to null in one of the calls at this stage we can avoid making a recursive call we will exit
we will show you a simulation of how things will happen in memory hold on for a while
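The recursive print function described so far might look like the sketch below. It assumes the same two-field node structure and the argument name p from the narration. Printing a newline at the exit condition matches what the later walkthrough says happens when p is null.

```c
#include <stdio.h>

struct Node {
    int data;
    struct Node* next;
};

/* Prints the list starting at p, one recursive call per node. */
void Print(struct Node* p) {
    if (p == NULL) {            /* exit condition: we have walked past the last node */
        printf("\n");
        return;
    }
    printf("%d ", p->data);     /* first print the value in this node */
    Print(p->next);             /* then recurse on the address of the next node */
}
```

Since p is a local variable of each call, passing p->next never changes the list itself or the head variable in the caller.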
so once we will reach the end of the list p will be equal to null and we will exit
the recursion at that stage now i'll write the main method i've already written the insert function
here so i'll declare a variable head as null in the main method so head will be a local variable
once again we could name this particular variable a or b or whatever just because this variable
points to the head node or the first node in the list we named this variable as head
and then we will insert some nodes in the linked list by making calls to the insert function
that takes the address of the head node as argument initially head is set as null to say that the
list is empty and there should be two arguments to the function insert the address of the
head node and the value that needs to be inserted and why is it that this particular function insert
is returning a pointer it's because this head in the main method is a local variable and if we pass
it to the function we just pass a copy of the address of the head node in this head which will
be a local variable of insert function so this guy returns back the address of the modified head
so we can update it in the main function this function inserts a node at the end of the list
so initially when head is null head will be modified in the insert function for other cases
it will not be modified if we are inserting at the end so we will make four such calls to the insert
function to create a linked list of four integers two four six five and now we will make a call to
print function and pass it the address of the head node let us now run this code and see what
happens as you can see we have got this output two four six five the print function here in our
code which is a recursive implementation to print the lists is working now i'll make one
slight change in the print function instead of printing the value in the node and then making
a recursive call i'll first make a recursive call and then when the recursive call finishes
i'll print the value in the node and i'll not modify anything else in the code the main method
will remain the same and if we run this code we can see that the elements in the
list are printed in reversed order so we just implemented the reverse print function that we
had talked about let us now analyze these two recursive implementations in a logical view
in our example here if we want to print this particular list we will do something like from
the main method we will make a call to the print function passing it the address of the head node
so initially this print function is being called with p equal hundred now in the execution of this
function we will come here if p is equal to null null is address 0 and our argument is hundred so
control will not go inside this if condition we will come here we will print p arrow data p arrow
data means that we will first dereference the address so we will go to the address hundred and
then we will look at the data field there so on the console we will print the data field of
data field at address hundred and now we will make a recursive call we will make a call to
print function passing it address p arrow next which is 200 and the execution of this particular
call will not finish it will finish only after print 200 finishes we will come back to it now
print 200 once again prints the data at address 200 and then makes a recursive call to print function
passing address 150 and we will go on like this in this call to print with address 250
we will first print the data and the address field the value of p dot next or rather p arrow next
is 0 which we can also call null so we will make a call like this now for this call
with argument null we have reached the exit condition recursion will not grow any further
so we will just print an end of line and return this particular structure that we have drawn here
is called recursion tree so print null function call will finish and control will return back
to print 250 there is no statement after this particular recursive call finishes
so we will simply exit this function call also and control will return back to print 150
and we will go on like this finally we will come back to the main method
if you want to see how the recursion will execute in the memory then i'll have to draw a diagram
like this applications memory the memory that is allocated for the execution of a program
has these two sections all the details of function call execution and the local variables
they are stored in the stack section of the memory and any memory that is allocated
using the malloc function or the new operator in c++ they go into the heap section the memory
for the nodes in a linked list is allocated from the heap so that's why these four nodes
in our example are sitting in the heap if you want to know in detail about stack and heap
check the description of this video for a lesson on dynamic memory allocation
when the program will start executing first the main function will be invoked
anytime a function is invoked some amount of memory from the stack is allocated for the execution
of that particular function this memory is called the stack frame of that function so let's say main
is executing we have already inserted some nodes in the linked list we have this variable head in
the main function all the local variables of a function sit in the stack frame of that function
so head will sit here now at this stage let's say main makes a call to print function
so main was executing and now it makes a call to print function execution of main will be paused
and we will go on to execute the print function the argument passed to the print function is
hundred which is stored in a local variable this argument p is a local variable in the print
function now print function again makes a recursive call now a stack frame is always allocated
corresponding to each call of a function recursive or not so a function calling itself is not
different from a function calling another function at any time whatever function call is at the top
of the stack is executing finally when we reach the exit condition of the recursion the stack
will be something like this and then first this call where p is zero will finish we will come back
to this particular call and then this will finish and we will go on like this so this is how
recursion works this is how things will happen in the memory okay so now i'll clear this diagram
of stack and heap in the right and i'll make some change in my print function
what i've done is i have renamed my function print as reverse print and in my function
i'm first making a recursive call and after coming back from that recursive call i'm writing
a print statement and from the main function i'll make a call to reverse print let's write rp as
shortcut for reverse print and initially i'll pass the address of the head node
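The change described here, making the recursive call first and printing afterwards, can be sketched as follows. The name ReversePrint and the argument p follow the narration; the node structure is the same two-field structure assumed throughout.

```c
#include <stdio.h>

struct Node {
    int data;
    struct Node* next;
};

/* Prints the elements in reversed order without modifying any link. */
void ReversePrint(struct Node* p) {
    if (p == NULL) return;      /* exit condition: nothing left to visit */
    ReversePrint(p->next);      /* first go all the way to the end of the list */
    printf("%d ", p->data);     /* print only after the recursive call returns */
}
```

For the example list in this lesson, whose nodes hold 2, 6, 5 and 4 in that order, the calls unwind from the last node back to the first, so the values come out as 4 5 6 2.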
so i'll make a call like this reverse print hundred the control will come inside this function
p is hundred it is not equal to null and i've also drawn the console like before
now this particular function call does not print first it first makes a recursive call
so this guy will go ahead and make a recursive call to the reverse print function passing
it address 200 nothing will be written on the console and once again this particular function
will make a recursive call like this and once again this particular function will go ahead
and make a recursive call like this and finally we will have a recursive call where the function
is passed address null at this stage we will come to the exit condition of the recursion
the recursion will not grow any further we will simply return the control will return to
this particular call reverse print 250 so we will come here now to this printf statement
the data field at address 250 is 4 so 4 will be printed on the console and now this particular
function call will finish and we will go to reverse print 150 and now this call will print 5 and
exit and we will go on like this finally we will return back to the main function with this output
on the console the elements of the list printed in reversed order so this was recursive traversal
of linked list to print its elements i must point out here that for normal traversal of the linked
list not for the reverse print for the normal print an iterative approach will be a lot more
efficient than the recursive approach because in an iterative approach we will just use
one temporary variable while in recursion we will use space in the stack section of the memory for
so many function calls so there is implicit use of memory there for reverse print operation
we will anyway have to store elements in some structure so if we use recursion it's still okay
in the coming lessons we will solve more problems more interesting problems on linked list so
thanks for watching
in our previous lesson we saw how we can traverse a linked list using recursion
we wrote code to print the elements of linked list in forward as well as reverse order using
recursion we did not actually reverse the list we just printed the elements in reverse order now
in this lesson we will reverse a linked list using recursion this is yet another famous programming
interview question so if we have an input list like this we have a linked list of integers here
we have four nodes in the linked list each rectangular block here with two partitions is a node the first
field stores the data and the second field stores
the address of the next node and of course we will have one variable to store the address
of the first node or head node we named that variable head we may name it anything i have
named it head so this is our input list and after reversal our output should be like this
this variable head should store the address of the last node in the original list the last node
in the original list was at address 250 and we will go from 250 to 150 from 150 to 200
200 to 100 and 100 to null null is nothing but address 0 we have already seen how we can reverse
a linked list using iterative method in one of our previous lessons let us now see how we can solve
this problem using recursion in our solution we must reverse the list by adjusting the links by
reversing the links not by moving around data or something so let us first understand the logic
that we can use in our recursive approach if you remember from our previous lesson where we had
used recursion to print the list backward print the elements in reverse order then recursion gives
us a way to traverse the list backward in our c or c++ program programmatically
node will be a structure like this so let's first look at the function from the previous
lesson the recursive function that was used to print the list backward to this function we pass
the address of a node initially we pass the address of the head node and we have this exit
condition if the address passed is null then we simply return else we make a recursive call
and pass the address of the next node so main method will typically call reverse print passing
it the address of the head node and this guy will first make a recursive call and then
when this recursive call finishes then only it will print so i'm writing rp as shortcut for
reverse print so the recursion will go on like this and when it reaches this particular call when
argument is null it will return so this call will finish and again the control will come
to this call with an address 250 as argument and now we are printing the value of the node at
address 250 which will be 4 and then this guy finishes and then we go ahead and print 5
and similarly we then go on to print 6 and 2 so recursion kind of gives us a way to first
traverse the list in the forward direction and then traverse the list in the backward direction
so let us now see how we can implement reverse function using recursion let's say for the sake
of simplicity of implementation that head is a global variable so it is accessible to all the
functions now we will implement a function named reverse that will take the address of a node as
argument initially we will pass address of the head node to this function now i want to do
something like this in my recursion i want to go on till the end i want to go on making a recursive
call till i reach the last node for the last node the link part will be null so this is my
exit condition from recursion this exit condition is what will stop us from going on infinitely
in a recursion and what i'm doing here is something very simple as soon as i'm reaching the last
node i'm modifying the head pointer to make it point to this guy so the recursion will work like this
from the main method we will call the reverse function passing it the address of the head node
address 100 we will come and check this condition if p.next is equal to null no it is equal to 200
for the node at address 100 so this recursion will go on till we reach this call call to reverse
passing it address 250 and now we will come down and now we have come to this exit condition and now
head will be set as p and the list will look like this and now reverse 250 the call to reverse 250
will finish and we will come back to reverse 150 there is no statement here after this recursive
call to reverse function if there were some statements here then they would have executed
now for reverse 150 after we would have come from reverse 250 and that's how we actually traverse
the list in reverse order if you see when reverse 250 has finished the list till 250 is already
reversed because head is pointing to this node and the link part of this node is set as null
so till 250 the list is already reversed now when we come to 150 we can make sure the list is reversed
till 150 when we finish the execution of reverse 150 to do that we can write statements like
this we will have to do two things we will have to make this guy point to this
guy so we will build this link and we would have to cut this link and make this guy point to null
and that's how node till address 150 will be reversed after we finish this call
so i've written these three lines in my function that will execute after the recursive call
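Put together, the recursive reverse function described above might look like this sketch. It assumes the global head variable and the names p and q from the narration. The check for p being NULL is an addition that is not part of the lesson's walkthrough; without it, calling the function on an empty list would dereference a null pointer.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node* next;
};

struct Node* head = NULL;  /* global, as assumed in the lesson for simplicity */

/* Reverses the list starting at p by adjusting links while the recursion unwinds. */
void Reverse(struct Node* p) {
    if (p == NULL) return;      /* added guard for an empty list (assumption) */
    if (p->next == NULL) {      /* exit condition: p is the last node */
        head = p;               /* the last node becomes the new head */
        return;
    }
    Reverse(p->next);           /* go forward until the last node is reached */
    struct Node* q = p->next;   /* q is the node that used to follow p */
    q->next = p;                /* build the backward link from q to p */
    p->next = NULL;             /* cut the forward link; p is now the tail so far */
}
```

The three lines after the recursive call run once per node on the way back, so when the call for the original head node finishes, that node is the tail and its link is NULL.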
so they will execute when the recursion is folding up and we are traversing the list in the backward
direction so when we are executing reverse 150 and we have come back to it after recursion we are
at this particular line so p would be 150 and q would be p dot next so q would be 250 so this
guy is p and this guy is q and we are saying that set q dot next equal to p so we will set this
particular field as 150 so we are building this link and now we are saying
that set p dot next equal null so we are cutting this link making p dot next null and now this call
to reverse 150 finishes and when this call has finished the list till 150 is reversed as you can
see head is 250 so from 250 we will go to 150 and from 150 we are going to null so till 150
we have a reversed list so this is how things will look like when the call to reverse 200 finishes
till 200 we have a reversed list and once again we come to execution of reverse 100
and this is how things will look like finally when reverse 100 will finish and we will return back
to the main function we had seen in the previous lesson that how things will happen in the memory
when recursion executes in recursion we save the state of execution of all the function calls
in stack section of the memory in this function all we are doing is basically we are storing the
addresses of node in a structure as we go forward in recursion and then we first work on the last
node to make it part of the reversed list and then we once again come back to the previous node and
we keep doing this watch the previous lesson for detailed explanation and simulation
of how things will happen in the memory for recursion there are a couple of more things here
one thing is that instead of writing these two lines i could write one line for these two lines
i could say something like p arrow next arrow next equal p and that would have meant the same
except that this statement is more obfuscated and there is one more thing we have assumed
that head is a global variable what if head is not a global variable then this reverse function
will have to return the address of the modified head i'll leave that as an exercise for you to do
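For readers who want to check their attempt at the exercise, one possible shape is sketched below. This is only one way to do it and is not taken from the lesson: the function returns the address of the new head instead of writing to a global, and it uses the single-statement form p->next->next = p mentioned above.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node* next;
};

/* Reverses the list starting at p and returns the address of the new head. */
struct Node* Reverse(struct Node* p) {
    if (p == NULL || p->next == NULL) {
        return p;                           /* empty list or last node: p is the head */
    }
    struct Node* newHead = Reverse(p->next); /* head of the reversed rest of the list */
    p->next->next = p;                      /* the node after p now points back to p */
    p->next = NULL;                         /* p becomes the tail of the reversed part */
    return newHead;                         /* pass the same head back up every call */
}
```

A caller with a local head variable would then write head = Reverse(head), in the same way the iterative version was called earlier.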
so this was reversing a linked list using recursion thanks for watching
hello everyone
in our lessons in this series so far we have discussed linked list quite a bit
we have seen how we can create a linked list and how we can perform various operations with linked
list linked lists as we know are collections of entities that we call nodes so far in all
our implementations we have created linked lists in which each node would contain two fields one to
store data and another to store address of the next node let's say we have a linked list of integers
here so i'll fill in some values in data field of each node let's assume that these nodes are at
addresses 200, 250 and 350 respectively i'll also fill in the address field in each node the address
field in first node will be the address of second node which is 250 the address field in second
node will be address of third node which is 350 and address part in third node will be zero or
null the identity of a linked list that we always keep with us is the address of head node or
reference to head node let's say we have a variable named head only to store the address of the head
node remember this variable named head is only a pointer to the head node ideally we should have
named this something like head pointer it's only pointing to the head node it's not
the head node itself head node is this guy the first node in the linked list okay so right now
in the linked list that we are showing here each node has only one link a link to the next node
in a real program node for the linked list that i'm showing here will be defined like this
this is how we have defined nodes so far in all our lessons we have two fields here one
of type integer to store data and another of type pointer to node struct node asterisk
i'm calling this field next when we say linked list by default we mean such a list that we can also
call singly linked list what we have here is a singly linked list what we want to talk about
in this lesson is idea of a doubly linked list the idea of a doubly linked list is really simple
in a doubly linked list each node would have two links one to the next node and another to
the previous node programmatically this is how we will define node for a doubly linked list in
c or c plus plus i have one more field here which once again is a pointer to node so i can store
the address of a node i can point to a node using this field and this field will be used to store
the address of the previous node in a logical representation i will draw my node like this now
i have one field to store data one to store address of previous node and one to store address of next
node let's say i want to create a doubly linked list of integers i have created three nodes here
let's say these nodes are at addresses 400 600 and 800 respectively i'll fill in some
data let's say the cell in the middle in each node is to store data the right most cell is
let's say to store the address of the next node so for first node this field will be 600
which means we have a link like this for second node this field will be 800 for third node this
field will be zero for first node there is no previous node so this leftmost cell
which is supposed to contain the address of the previous node will be zero or null
the previous node for second node will be 400 and the previous node for the third node is the node
at address 600 and of course we will have a variable to store the address of the head node
okay so what we have here is a doubly linked list of integers with three nodes okay so with this much
you already know doubly linked list if you have ever implemented a singly linked list
then it should not be very difficult implementing a doubly linked list one obvious question would be
why would we ever want to create a doubly linked list what are the advantages or use cases of a
doubly linked list first advantage is that now if we have a pointer to any node then we can do a
forward as well as reverse lookup with just one pointer we can look at the current node the next
node as well as the previous node i am showing a pointer named temp here if temp is a pointer
pointing to a node then temp dot next is a pointer pointing to the next node it's the address of the
next node and temp dot previous or rather temp arrow previous this is actually syntactic sugar
for (*temp).prev so this guy temp arrow prev is previous node or in pure words pointer to
previous node the value stored in temp for this example right now is 600 temp dot next is 800
and temp dot prev is 400 in a singly linked list there is no way you can look at the previous node
with just one pointer you will have to use an extra pointer to keep track of the previous node
in a lot of scenarios the ability to look at the previous node makes our life easier even
implementation of some of the operations like deletion becomes a lot easier in a singly linked
list to delete a node you would need two pointers one to the node to be deleted and one to the previous
node but in our doubly linked list we can do so using only one pointer the pointer to the node to
be deleted all in all this ability that we can do a reverse lookup in the linked list is really
useful we can flow through the linked list in both directions disadvantage of doubly linked list
is that we are having to use extra memory for pointer to previous node for a linked list of integers
let's say integer takes four bytes in a typical 32-bit architecture and a pointer variable also takes
four bytes then in a singly linked list each node
will be eight bytes four for data and four for a link to the next node in a doubly linked list
each node will be 12 bytes we will take four bytes for data and eight bytes for links for a linked list
of integers we will take twice as much memory for links as for data with a doubly linked list we also need to be more
careful while resetting links while inserting or deleting we need to reset a couple of more links
than a singly linked list and so we are more prone to errors we will implement doubly linked list in
a c program in next lesson we will write basic operations like traversal insertion and deletion
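The deletion advantage mentioned earlier in this lesson can be sketched like this. It is a rough sketch under assumptions: the function name DeleteNode is hypothetical, and the head pointer is passed in by address here rather than being a global variable. The point to notice is that only the pointer to the node being deleted is needed, because its neighbours are reachable through prev and next.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *prev;
    struct Node *next;
};

/* Hypothetical helper: unlinks `node` from the list whose head pointer is
   *head and frees it. No separate pointer to the previous node is needed. */
void DeleteNode(struct Node **head, struct Node *node) {
    if (head == NULL || node == NULL) return;
    if (node->prev != NULL)
        node->prev->next = node->next;  /* previous node now skips over node */
    else
        *head = node->next;             /* node was the head, so move head */
    if (node->next != NULL)
        node->next->prev = node->prev;  /* next node now points back past node */
    free(node);
}
```

Notice that two links have to be reset here instead of one, which is the extra care mentioned above.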
this is it for this lesson thanks for watching

in our previous lesson we saw what doubly linked
lists are now in this lesson we are going to implement doubly linked list in c we are going to
write simple operations like insertion traversal and deletion in a doubly linked list as we saw in
our previous lesson each node contains three fields i have drawn logical representation of
a doubly linked list here one to store data one to store address of next node and one to store
address of previous node for a linked list of integers node will be defined like this in a c
or c plus plus program in the logical representation i'll fill in some data in each node let's say
these nodes are at addresses 400 600 and 800 respectively i'll also fill in next and previous
fields and we must also have a pointer variable pointing to the head node quite often we
name this pointer variable head in my implementation i'm going to write these functions i'm going to
write a function to insert a node at beginning or head of linked list this function will take an
integer as argument i'll write another function to insert a node at tail of linked list i'll write
one function to print elements in linked list while traversing it from head to tail i'll write
another one to print the elements in reverse order while traversing the list from tail to head
reverse print function will validate whether reverse link for each node is created properly
or not let's now write these functions in a real c program in my c program here i have defined
node as a structure with three fields first field is of type integer to store data second field is
of type pointer to node to store reference of next node and the third field is a pointer to
node to store the reference of previous node i have defined a variable named head which once
again is a pointer to node and i have defined this variable in global scope head is a global
variable when we define a variable inside a function it's called a local variable the lifetime of a
local variable is lifetime of a function call it's created during a function call execution and
it's cleared from the memory when function call execution finishes but global variables
live in the memory for whole lifetime of an application they live till the time program is
executing global variables can be accessed everywhere in all functions local variables are not accessible
everywhere unless you access them through pointers in all our previous implementations we have mostly
declared head as global variable okay so let's now write the functions the first function that i want
to write is insert at head this function will take an integer as argument the first thing that we
want to do here is we want to create a node we can always declare a node like this just like
declaration of any other variable we can say struct node and then we can give an identifier
or name and now in this my node that i have created i can fill in all the fields
but the problem here is that when i'm creating a node like this i'm creating it as a local variable
and it will be cleared from memory when the function call finishes a local variable lives in what we
call the stack section of the application's memory and we cannot control its lifetime it's cleared from
memory when function call finishes we do not want this our requirement is that a node should stay
in memory until we explicitly remove it so that's why we create a node in dynamic memory or what
we call heap section of memory anything in heap is not cleared unless we explicitly free it to
create a node in heap we use malloc function in c or new operator in c++ all malloc function
does is it reserves some memory in heap and this memory can be used for writing anything
any variable any object access to this memory always happens through a pointer variable we have
talked about this concept quite a bit in our previous lessons but i keep on repeating because
this is really important concept so here with this statement i have created a node in dynamic
memory or heap that can be referenced through a variable which is pointer to node i have named
this variable temp now i can use this pointer variable to fill in values in various fields of
the node i'll have to dereference this pointer variable using asterisk operator and then i can
access various fields like data prev or next there is an alternate syntax for this instead of
(*temp).data we can simply write temp->data and similarly i can access other fields also
so to access prev field i can say temp->prev let's set this as null and let's also set the
next field as null if you want to understand or refresh the concept of stack and heap in memory
then you can check the description of this video for a link to our lesson on dynamic memory allocation
okay so in my function insert at head i have created a node in heap section of memory
and i'm referencing that node using this pointer variable named temp temp is not a very meaningful
name let's use a name like new node or new node pointer i would like to separate out this logic
of node creation these lines for node creation in a separate function i've written a function
here named get new node that will take an integer as argument create a node filling in data field as
x and setting both previous and next pointers as null this function will return a pointer to node
so i will return new node from here i'm writing a separate function because i can avoid duplicate
code by using a separate function for creation of node because i'm going to create a node
in function insert at head as well as in function insert at tail that i'll be writing
after some time now in insert at head function i can simply call this function get new node passing
it x this function is returning a pointer to newly created node that i'm going to receive in this
variable which once again is a pointer to node named temp we can name this variable also as new node
this new node in insert at head is different from this new node in get new node these are local
variables this new node is local to insert at head and this new node is local to get new node
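A minimal sketch of the node creation function described above could look like this. The spelling GetNewNode and the check for a failed malloc are assumptions added for the sketch.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *prev;
    struct Node *next;
};

/* Creates a node in heap memory, fills in x as data, sets both links
   to NULL and returns a pointer to the node (NULL if malloc fails). */
struct Node *GetNewNode(int x) {
    struct Node *newNode = malloc(sizeof(struct Node));
    if (newNode == NULL) return NULL;  /* allocation failed */
    newNode->data = x;                 /* same as (*newNode).data = x */
    newNode->prev = NULL;
    newNode->next = NULL;
    return newNode;
}
```

The pointer variable newNode is local and goes away when the function returns, but the node it points to stays in heap until it is freed.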
now there will be two cases in insertion at head list could be empty so head will be equal to null
in this case we can simply set head as the address of new node and return or exit
things will be clear if i'll show everything in logical view also right now my linked list is
empty here in this logical view that i'm showing let's say i have made a call to insert at head
passing it number two get new node function will give me a new node let's say a new node is created
at address 400 with this statement head equal new node we are setting the address stored in
new node variable in head null is nothing but address 0 as soon as this function insert at head
will finish this variable new node will be cleared from memory but the node itself will not be cleared
if we would have created node like this struct node new node and in this declaration new node
is not pointer to node it's node and we are not saying struct node asterisk
so if we would have created node like this the node also would have been cleared okay coming back
to the function here let's write rest of the logic to insert a node when list is not empty
this is what i'll do now i'm making a call to insert at head passing it number four
once new node is created i'll first set the previous field of existing head node
as the address of this new node so i'm building this link then i'll set the next field of new node
as the address of current head and now i can break this link and build this link
so i'll set head as address of new node this is how things will look like finally
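Putting both cases together, insertion at head can be sketched as below. This is a sketch assuming a global head pointer as in the lesson; the names InsertAtHead and GetNewNode are just spellings chosen here, and the NULL check after node creation is an added assumption.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *prev;
    struct Node *next;
};

struct Node *head = NULL;  /* global pointer to the head node, NULL for an empty list */

/* Creates a node in heap memory with both links set to NULL. */
struct Node *GetNewNode(int x) {
    struct Node *newNode = malloc(sizeof(struct Node));
    if (newNode == NULL) return NULL;
    newNode->data = x;
    newNode->prev = NULL;
    newNode->next = NULL;
    return newNode;
}

void InsertAtHead(int x) {
    struct Node *newNode = GetNewNode(x);
    if (newNode == NULL) return;  /* could not allocate, list unchanged */
    if (head == NULL) {           /* case 1: list is empty */
        head = newNode;
        return;
    }
    head->prev = newNode;         /* case 2: old head points back to the new node */
    newNode->next = head;         /* new node points forward to the old head */
    head = newNode;               /* new node becomes the head */
}
```

After InsertAtHead(2) followed by InsertAtHead(4), head points to the node holding 4, and that node's next field points to the node holding 2.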
let's also quickly see how things will actually move in various sections of the application's memory
the memory that is allocated to a program is typically divided into these four segments
we have seen this diagram quite a bit in our earlier lessons code or text segment stores
all the instructions to be executed there is a segment to store global variables there is a
section that we call stack that is used just like scratch pad or white board for function
call execution stack is where all the local variables go and not just local variables all
the information about function call execution heap is what we also call dynamic memory
i'm showing stack heap and global section separately here in our program we had declared head as a
global variable initially for an empty list we'll set head as null or zero now let's say we will do
that in main function now when a call to insert at head is made at this stage let's say i'm making
a call passing number two as argument let's say we are making a call to insert at head from main function
when program starts execution first main function is invoked whenever a function is invoked some
amount of memory from the stack is allocated for execution of that function that section is called
stack frame of that function and all the local variables of that function live inside its stack
frame when function call execution finishes the stack frame is reclaimed when main will make a
call to insert at head the execution of main will pause at the line where it's making a call
a stack frame will be allocated for execution of insert at head i'm writing shortcut i a h for
insert at head because i'm short of space here all the arguments of insert at head all the local
variables will live inside this stack frame we are creating a variable
named new node which is a pointer to node as local variable and we are making a call to get new node
function execution of insert at head will pause and we will go on to execute get new node we could
write get new node like this here i'm creating a node on stack x is a local variable in get new
node also then i'm creating a node filling in data as the value of x which is two i'm setting
previous and next fields as null or zero and then because i need to return a pointer to node i have
used ampersand operator here using ampersand operator gives us pointer to a variable let's say
this new node that we have in the stack frame of get new node has address 50 with this return when
get new node will finish the value in this new node of insert at head will be 50
please note that with this code this new node in get new node function is of type
struct node while this new node in insert at head is of type pointer to struct node so they are
different types we can return this address 50 that's fine but the stack frame for get new node will
be reclaimed once the function finishes so now even though you have the address 50
there is no node there we cannot control allocation and deallocation of memory on stack
it happens automatically that's why we use the memory on heap if i'm using this code for creation
of new node then what i'm doing is i'm declaring this variable new node not as struct node but as
struct node asterisk that is pointer to node i'm using malloc to create the actual node in heap
section let's say i'm getting address 400 for this node now for a section of memory in heap
for something in heap we cannot have a direct name the only way to access something in heap is
through a pointer if we will lose this pointer we will lose this node okay so now what we are doing
is using this pointer new node which is local to get new node function we are accessing this node
filling in data filling in address fields and now we are returning this address 400
now when get new node is finishing i'm collecting the returned address 400 in this variable in
this local variable new node we are returning back to insert at head function and at this line
head at this stage is null so now we are saying that set head equal to new node head is a
global variable it's not going to be cleared for whole lifetime of application and now we are
returning stack frame of insert at head will be cleared and this is what we finally have
when we will make another call to insert at head once again fresh stack frames will be allocated
in the execution of functions appropriate links will be created so our linked list will be modified
accordingly i hope all of this is making some sense with another call to insert at head when
everything will finish and control will return back to main we can have a picture like this
let's say i got a node at 600 right cell is storing the
address of next node and left cell is storing the address of previous node so this is what
we will have let's now go and write rest of the functions print function will be same as print
for singly linked list we will take a temporary pointer to node initially set it to head and then
we will use this statement temp = temp->next to go to the next node and we will keep on
printing in reverse print we will first go to the end node of the list using next pointer and then
we will traverse backward using this statement temp = temp->prev so we will use the
previous pointer and while traversing backward we will print the data okay let's now test all
these functions that we have written so far in the main function i'm setting head as null to say
that the list is empty initially and now i'm writing couple of insert statements i'm making a
couple of calls to insert at head function and after each call i'm printing the list both in
forward as well as reverse direction let's run this code and see the output this is what i'm getting
and i think this is as expected there is one more function insert at tail that i had said i'll write
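The two print functions described above can be sketched like this. It is a minimal sketch assuming a global head pointer; the output labels "Forward:" and "Reverse:" and the compact InsertAtHead included to make the block self-contained are choices made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *prev;
    struct Node *next;
};

struct Node *head = NULL;  /* global pointer to the head node */

/* Compact insertion at head, included only so the sketch is self-contained. */
void InsertAtHead(int x) {
    struct Node *newNode = malloc(sizeof(struct Node));
    if (newNode == NULL) return;
    newNode->data = x;
    newNode->prev = NULL;
    newNode->next = head;
    if (head != NULL) head->prev = newNode;
    head = newNode;
}

/* Prints the elements from head to tail using the next links. */
void Print(void) {
    struct Node *temp = head;
    printf("Forward: ");
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;
    }
    printf("\n");
}

/* Walks to the last node using next, then prints backward using prev. */
void ReversePrint(void) {
    struct Node *temp = head;
    if (temp == NULL) return;  /* empty list, nothing to print */
    while (temp->next != NULL) {
        temp = temp->next;
    }
    printf("Reverse: ");
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->prev;
    }
    printf("\n");
}
```

If the prev links were not set correctly during insertion, ReversePrint would stop early or print the wrong elements, which is why it works as a check on the reverse links.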
if you have understood things so far it should not be very difficult for you to write this function
insert at tail i'll leave this as an exercise for you i'll stop here now if you want to get this
source code check the description of this video for a link in coming lessons we are going to talk
about circular linked list and we will see some more interesting problems on linked list thanks
for watching

in this lesson we are going to introduce you to stack data structure
data structures as we know are ways to store and organize data in computers so far in the series
we have discussed some of the data structures we have talked about arrays and linked lists now in
this lesson we are going to talk about stacks and we are going to talk about stack as abstract data
type or ADT when we talk about a data structure as abstract data type we talk only about the
features or operations available with the data structure we do not go into implementation details
so basically we define the data structure only as a mathematical or logical model
we will go into implementation of stack in later lessons in this lesson we are going to talk only
about stack ADT so we are only going to have a look at the logical view of stack stack as
a data structure in computer science is not very different from stack as a way of organizing
objects in real world here are some examples of stack from real world first figure is of a stack
of dinner plates second figure is of a mathematical puzzle called tower of hanoi where we have three
rods or three pegs and multiple disks and the game is about moving a stack of disks
from one peg to another with this constraint that a disk cannot go on top of a smaller disk
third figure is of a pack of tennis balls stack basically is a collection with this property that
an item in the stack must be inserted or removed from the same end that we call the top of stack
in fact this is not just a property this is a constraint or restriction only the top of a stack
is accessible and any item has to be inserted or removed from the top a stack is also called last
in first out collection most recently added item in a stack has to go out first in the first
example you will always pick up a dinner plate from top of the stack and if you will have to put
a plate back into the stack you will always put it back on top of the stack you can argue that
I can slip out a plate from in between without actually removing the plates on the top so the
constraint that I should take out a plate always from the top is not strictly enforced for the
sake of argument this is fine you can say this in other two examples when we have disks in a peg
and tennis balls in this box that can open only from one side there is no way you can take out
an item from in between any insertion or removal has to happen from top you cannot slip out an
item from in between you can take out an item but for that you will have to remove all the items
on top of that item let's now formally define stack as an abstract data type a stack is a list
or collection with the restriction that insertion and deletion can be performed only from one end
that we call the top of stack let's now define the interface or operations available with
stack ADT there are two fundamental operations available with a stack an insertion is called
a push operation push operation can insert or push some item x onto the stack another operation
second operation is called pop pop is removing the most recent item from the stack most recent
element from the stack push and pop are the fundamental operations and there can be few more
typically there is one operation called top that simply returns the element at top of the stack
and there can be an operation to check whether a stack is empty or not so this operation will
return true if the stack is empty false otherwise so push is inserting an element on top of stack
and pop is removing an element from top of stack we can push or pop only one element at a time
all these operations that i have written here can be performed in constant time
or in other words the time complexity is big o of one remember an element that is pushed or
inserted last onto a stack is popped or removed first so stack is called last in first out structure
what goes in last comes out first last in first out in short is called LIFO logically a stack is
represented something like this as a three-sided figure as a container open from one side this is
representation of an empty stack let's name this stack s let's say this figure is representing
a stack of integers right now the stack is empty i will perform push and pop operations
to insert and remove integers from the stack i will first write down the operation here and then
show you what will happen in the logical representation let's first perform a push i want to push
number two onto the stack the stack is empty right now so we cannot pop anything after the push
stack will look something like this there is only one integer in the stack so of course
it's on top let's push another integer this time i want to push number 10
and now let's say we want to perform a pop the integer at top right now is 10 with a pop
it will be removed from the stack let's do a few more pushes
i just pushed 7 and 5 onto the stack at this stage if i call the top operation it will return
number five and is empty will return false at this stage a pop will remove five from the stack
as you can see the element the integer which is coming last is going out first
that's why we call stack last in first out data structure we can pop till the stack gets empty
one more pop and stack will be empty so this pretty much is stack data structure
now one obvious question can be what are the real scenarios where stack helps us let's list
down some of the applications of stack stack data structure is used for execution of function
calls in a program we have talked about this quite a bit in our lessons on dynamic memory
allocation and linked lists we can also say that stack is used for recursion because recursion is
also a chain of function calls it's just that all the calls are to the same function to know more
about this application you can check the description of this video for a link to mycodeschool's
lesson on dynamic memory allocation another application of stack is we can use it to implement
undo operation in an editor and we can perform undo operation in any text editor or image editor
right now i'm pressing ctrl z and as you can see some of the text that i have written
is getting cleared you can implement this using a stack stack is used in a number of important
algorithms like for example a compiler verifies whether parentheses in source code are balanced
or not using stack data structure corresponding to each opening curly brace or opening parenthesis
in source code there must be a closing curly brace or closing parenthesis at appropriate position and if parentheses in
source code are not put properly if they are not balanced compiler should throw error
and this check can be performed using a stack we will discuss some of these problems in detail
in coming lessons this much is good for an introduction in our next lesson we will discuss
implementation of stack this is it for this lesson thanks for watching

in our previous lesson we
introduced you to stack data structure we talked about stack as abstract data type or ADT as we
know when we define a data structure as abstract data type we define it as a mathematical or
logical model we define only the features or operations available with the data structure
and do not bother about implementation now in this lesson we will see how we can implement
stack data structure we will first discuss possible implementations of stack and then we'll go ahead
and write some code okay so let's get started as we had seen a stack is a list or collection with
this restriction with this constraint that insertion and deletion that we call push and pop operations
in a stack must be performed one element at a time and only from one end that we call the top
of stack so if you see if we can add only this one extra property only this one extra constraint
to any implementation of a list that insertion and deletion must be performed only from one end
then we can get a stack there are two popular ways of creating lists we have talked about them
a lot in our previous lessons we can use any of them to create a stack we can implement stacks
using (a) arrays and (b) linked lists both these implementations are pretty intuitive let's first
discuss array based implementation let's say i want to create a stack of integers so what i can
do is i can first create an array of integers i'm creating an array of 10 integers here i'm naming
this array a now i'm going to use this array to store a stack what i'm going to say is that at
any point some part of this array starting index 0 till an index marked as top will be my stack
we can create a variable named top to store the index of top of stack for an empty stack
top is set as minus 1 right now in this figure top is pointing to an imaginary minus 1 index in
the array and insertion or push operation will be something like this i will write a function
named push that will take an integer x as argument in push function we will first increment top
and then we can fill in integer x at top index here we are assuming that
a and top will be accessible to push function even when they are not passed as arguments
in c we can declare them as global variables or in an object oriented implementation
all these entities can be members of a class i'm only writing pseudo code to explain
the implementation logic okay so for this example array that i'm showing here right now
top is set as minus 1 so my stack is empty let's insert something onto the stack i will have to make
call to push function let's say i want to insert number 2 onto the stack in a call to push first
top will be incremented and then the integer passed as argument will be written at top index
so 2 will be written at index 0 let's push one more number let's say i want to push number 10
this time once again top will be incremented 10 will now go at index 1 with each push the stack
will expand towards higher indices in the array to pop an element from the stack i'm writing a
function here for pop operation all i need to do is decrement top by one with a call to pop let's
say i'm making a call to pop function here top will simply be decremented whatever cells are in yellow
in this figure are part of my stack we do not need to reset this value before popping if a cell is
not part of stack anymore we do not care what garbage lies there next time when we will push
we will modify it anyway so let's say after this pop operation i want to perform a push i want to
insert number 7 onto the stack so top once again will be incremented and value at index 2 will be
overwritten the new value will be 7 these two functions push and pop that i have written here
will take constant time we have simple operations in these two functions and execution time will
not depend upon size of stack while defining stack ADT we had said that all the operations
must take constant time or in other words the time complexity should be big o of 1
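The pseudo code for push and pop described above translates to C roughly as follows. This is a bare sketch that assumes A and top are global variables and, like the pseudo code, does not yet check for overflow or for popping from an empty stack.

```c
#define MAX_SIZE 10

int A[MAX_SIZE];  /* array that stores the stack */
int top = -1;     /* index of top of stack, -1 means the stack is empty */

/* Push: increment top, then write x at the new top index. */
void Push(int x) {
    top = top + 1;
    A[top] = x;
}

/* Pop: just decrement top. The old value stays in the array as garbage
   and is overwritten by a later push. */
void Pop(void) {
    top = top - 1;
}
```

Both functions do a fixed amount of work no matter how many elements are in the stack. With the calls Push(2), Push(10), Pop() and Push(7), the stack ends up holding 2 and 7 with 7 on top.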
in our implementation here both push and pop operations are big o of 1 one important thing here
we can push onto the stack only till array is not exhausted only till some space is left in the
array we can have a situation where stack would consume the whole array so top will be equal to
highest index in the array a further push will not be possible because it will result in an overflow
this is one limitation with array based implementation to avoid an overflow we can always
create a large enough array for that we will have to be reasonably sure that stack will not grow
beyond a certain limit in most practical cases large enough array works but irrespective of that
we must handle overflow in our implementation there are couple of things that we can do in case
of an overflow push function can check whether array is exhausted or not and it can throw an error
in case of an overflow so push operation will not succeed this will not be a really good behavior
we can do another thing we can use the concept of dynamic array we have talked about dynamic array
in initial lessons in the series what we can do is in case of an overflow we can create
a new larger array we can copy the content of stack from older filled up array into new array
if possible we can delete the smaller array the cost of copy will be big o of n or in simple words
time taken to copy elements from smaller array to larger array will be proportional to number of
elements in stack or the size of the smaller array because anyway stack will occupy the whole array
there must be some strategy to decide the size of larger array optimal strategy is that we should
create an array twice the size of smaller array there can be two scenarios in a push operation
in a normal push we will take constant time in case of an overflow we will first create
a larger array twice the size of smaller array copy all elements in time proportional to size
of the smaller array and then we will take constant time to insert the new element
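A push that grows the array on overflow could be sketched like this. The variable name capacity, the starting size of one element and the return value that reports success are all assumptions made for this sketch, not something shown in the lesson.

```c
#include <stdlib.h>
#include <string.h>

int *A = NULL;     /* array in heap that stores the stack */
int capacity = 0;  /* current size of the array (assumed name) */
int top = -1;      /* index of top of stack, -1 means empty */

/* Pushes x, doubling the array first if it is full.
   Returns 1 on success and 0 if memory could not be allocated. */
int Push(int x) {
    if (top == capacity - 1) {  /* array is exhausted */
        int newCapacity = (capacity == 0) ? 1 : 2 * capacity;
        int *bigger = malloc((size_t)newCapacity * sizeof *bigger);
        if (bigger == NULL) return 0;
        if (top >= 0) {
            /* copy the existing elements, time proportional to their count */
            memcpy(bigger, A, (size_t)(top + 1) * sizeof *A);
        }
        free(A);                /* delete the smaller array */
        A = bigger;
        capacity = newCapacity;
    }
    A[++top] = x;               /* the normal constant time part */
    return 1;
}
```

Most calls skip the if block entirely and take constant time; only the occasional call that hits a full array pays for the copy.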
the time complexity of push with this strategy will be big o of one in best case
and big o of n in worst case in case of an overflow time complexity will be big o of n
but we will still be big o of one in average case which is usually called amortized constant time if we will calculate the time taken for
n pushes then it will be proportional to n remember n is the number of elements in stack
big o of n is basically saying that time taken will be very close to some constant times n
in simple words time taken will be proportional to n if we are taking c into n time for n pushes
to find out average we will divide by n average time taken for each push will be a constant
hence big o of one in average case i will not go into all the mathematics of why it's big o of
n for n pushes to know about it you can check the description of this video for some resources
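For anyone who wants a rough idea of that mathematics, here is a short sketch of the usual argument. It assumes the array starts with size 1 and doubles on every overflow, and it counts one unit of time for writing an element and one unit for copying an element; T(n) is just a name used here for the total time of n pushes.

```latex
% Copies happen when the array is full at sizes 1, 2, 4, ..., 2^k, where 2^k < n.
\[
\text{copy cost} \;=\; \sum_{i=0}^{k} 2^{i} \;=\; 2^{k+1} - 1 \;<\; 2n
\]
% Each of the n pushes also writes one element, so the total is
\[
T(n) \;\le\; n + (2^{k+1} - 1) \;<\; n + 2n \;=\; 3n
\]
% Dividing by the number of pushes gives the average cost per push.
\[
\frac{T(n)}{n} \;<\; 3 \;=\; O(1)
\]
```

So the total for n pushes is proportional to n, and the average per push is a constant.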
okay so this pretty much is core of our implementation we have talked about two more operations in
definition of stack ADT top operation simply returns the element at top of stack so top function
will look something like this we will simply return the element at top index to verify whether
stack is empty or not this is another operation that we had defined we can simply check the value
of top if it is equal to minus one we can say the stack is empty we can return true else we can
return false sometimes pop and top operations are combined together in that case pop will not
just remove an element from top of stack it will also return that element language libraries in a
lot of programming languages give us implementation of stack signature of functions in these implementations
can vary slightly okay now i will quickly show you a basic implementation of stack in C
in my C code here i'm going to write a simple array based implementation to create a stack of
integers the first thing that i'm going to do is i'm going to create an array of integers
as global variable and the size of this array is max size where max size is defined by this macro
as 101 i will declare another global variable named top and set it as minus one initially
remember top equal minus one means an empty stack when a variable is not declared inside any function
it's a global variable it can be accessed anywhere so you do not have to pass it as argument to functions
and now i will write all the operations this is my push function i'm first incrementing top
and then setting the value at top as x x is the integer to be inserted passed as argument
instead of writing these two statements i can write one statement like this and i will be good
i'm using pre increment operator so increment will happen before assignment i also want to handle
overflow we will have an overflow when top index will be equal to max size minus one highest index
available in the array in case of an overflow i simply want to print an error message something
like this and return so in this implementation i'm not using a dynamic array in case of overflow
push will not succeed okay now this is my pop function i'm simply decrementing top here also we
must handle one error condition if stack is already empty we cannot pop so i'm writing these statements
here if top is equal to minus one we cannot pop i will print this error message that there is no
element to pop and simply return now let's write top operation top operation will simply
return the integer at top index so now my basic operations are all written here i have already
written push pop and top in main function i will make some calls to push and pop and i want to write
one more function named print and this is something that i'm going to write only to verify that push
and pop are happening properly i will simply print all the elements in the stack in my main
function after each push or pop operation i will make a call to print i'm writing multiple
function calls two function calls on same line here because i'm short of space remember print
function is not a typical operation available with stack i'm writing it only to test my implementation
so this pretty much is my code let's now run this program and see what happens
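The array-based stack described above can be sketched roughly as follows. This is a minimal sketch, not the lecturer's exact source: the identifiers `A`, `MAX_SIZE`, `Push`, `Pop`, `Top` and `Print` are assumed names, and the use of `assert` in `Top` is an added guard that the transcript does not mention.

```cpp
#include <cassert>
#include <cstdio>

#define MAX_SIZE 101   // capacity of the stack, as in the lesson

int A[MAX_SIZE];       // global array holding the stack elements
int top = -1;          // top == -1 means the stack is empty

// Push x onto the stack; on overflow print an error and do nothing.
void Push(int x) {
    if (top == MAX_SIZE - 1) {
        printf("Error: stack overflow\n");
        return;
    }
    A[++top] = x;      // pre-increment: top moves first, then x is stored
}

// Remove the top element; on an empty stack print an error and do nothing.
void Pop() {
    if (top == -1) {
        printf("Error: no element to pop\n");
        return;
    }
    top--;
}

// Return the element at the top index (assumed guard: stack must not be empty).
int Top() {
    assert(top != -1);
    return A[top];
}

// Test helper only, not a typical stack operation: print from bottom to top.
void Print() {
    printf("Stack: ");
    for (int i = 0; i <= top; i++) printf("%d ", A[i]);
    printf("\n");
}
```

Calling `Push(2)`, `Push(5)`, `Push(10)`, `Pop()` and `Push(12)` in that order, with a `Print()` after each call, mirrors the sequence used in the lesson.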
this is what i'm getting as output we are pushing three integers two five and ten and then we are
performing a pop so ten gets removed from the stack and then we are pushing 12 so this is a
basic implementation of stack in c this is not an ideal implementation an ideal implementation
should be something like we should have a data type called stack and we should be able to create
instances of it we can easily do it in an object oriented implementation we can do it in c also
using structures check the description of this video for link to source code of this implementation
as well as of an object oriented implementation in our next lesson we will discuss linked list
implementation of stack this is it for this lesson thanks for watching
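The lesson mentions that a better design would have a data type called stack that we can create instances of. One possible shape for that, assuming a fixed capacity and hypothetical member names (`data`, `IsEmpty`, and boolean return values are choices made here, not taken from the linked source code), might be:

```cpp
#include <cassert>

const int MAX_SIZE = 101;  // assumed fixed capacity, as in the array version

// Each Stack object carries its own array and its own top index,
// so several independent stacks can exist at the same time.
struct Stack {
    int data[MAX_SIZE];
    int top = -1;          // -1 means this stack is empty

    bool IsEmpty() const { return top == -1; }

    // Returns false on overflow instead of printing an error.
    bool Push(int x) {
        if (top == MAX_SIZE - 1) return false;
        data[++top] = x;
        return true;
    }

    // Returns false if there was nothing to pop.
    bool Pop() {
        if (IsEmpty()) return false;
        --top;
        return true;
    }

    // Caller must make sure the stack is not empty.
    int Top() const {
        assert(!IsEmpty());
        return data[top];
    }
};
```

Because `top` is a member rather than a global variable, two stacks declared as `Stack a, b;` do not interfere with each other.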
in our previous lesson we saw how we can implement stack using arrays now in this lesson we will
see how we can implement stack using linked list for this lesson i'm assuming that you already know
about both stack as well as linked list stack as we know from our discussion so far is called a
last in first out data structure whatever goes in last in a stack comes out first it's a list
with this restriction that insertion and deletion must be performed only from one end that we call
the top of stack an insertion in a stack is called push operation and deletion is called pop to
implement a stack all we need to do is enforce this behavior in any implementation of a list that
insertion and deletion must be performed only from one end and we can call that end top of stack
it's really easy to enforce this behavior in a linked list i have drawn a linked list of integers
here this is logical representation of a linked list a linked list is a collection of entities that
we call nodes each node contains two fields one to store data and another to store the address
of the next node let's assume that these nodes are at addresses 100 200 and 400 respectively
so i will fill up the address part as well the identity of a linked list is the address of the
first node that we also call the head node a variable stores the address of head node we often
name this variable as head unlike arrays linked lists are not a fixed size and elements in a
linked list are not stored in one contiguous block of memory we already know how to create a linked
list or insert and delete elements from a linked list from our previous lessons i'm just doing a
quick recap here to insert an element in a linked list we first create a new node which is basically
blocking some part of memory to store our data in this example here let's say for my new node i'm
getting address 350 we can set the data part of the node as whatever value i want to add
in the list and then i need to modify the address field of some of the existing nodes to link this
node in actual list now for a stack we want that insertion and deletion must always happen from the
same end we can use a linked list as a stack if we always insert and delete a node at same end we
have two options we can insert or delete from end of the list what we also call tail or
beginning of the list that we call head if you remember from our previous lessons inserting a
node at end of linked list is not a constant time operation the cost of both insertion and
deletion at end of linked list if we have to talk about the time complexity of it is big o of n
here in the definition of stack we are saying that push and pop operations should take constant
time or the time complexity should be big o of one but if we will insert and delete from end
time complexity will be big o of n to insert a new node in a linked list at the end we need to go
to the last node and set the address part of that node to make it point to the new node to traverse
a linked list and go to the last node we should start at the head or the first node from first node
we get the address of the second node so we go to the second node and from second node we get the
address of the third node it's like playing treasure hunt you go to the first guy ask the address of
the second guy and then you go to the second guy ask the address of the third guy and so on
now once i've reached this last node in my example here i can set its address part to make it point
to the newly created node all in all this operation will take time proportional to number of elements
in the linked list to delete a node from end once again we will have to traverse the whole list
we will have to go to the second last node break this link we will set the address field as zero
or null and then we can simply wipe off the last node removed from the list from computer's memory
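To make the cost of working at the tail concrete, here is a small sketch of both operations. The names `InsertAtTail` and `DeleteAtTail` are hypothetical, and the head pointer is passed by reference here only to keep the example self-contained. In both functions the `while` loop is the treasure hunt described above, so the work grows with the number of nodes.

```cpp
struct Node {
    int data;
    Node* next;
};

// Insert at the end: we must walk from head to the last node first, O(n).
void InsertAtTail(Node*& head, int x) {
    Node* node = new Node{x, nullptr};
    if (head == nullptr) {          // empty list: new node becomes the head
        head = node;
        return;
    }
    Node* last = head;
    while (last->next != nullptr)   // traverse to the last node
        last = last->next;
    last->next = node;              // link the last node to the new node
}

// Delete from the end: we must walk to the second last node first, O(n).
void DeleteAtTail(Node*& head) {
    if (head == nullptr) return;    // nothing to delete
    if (head->next == nullptr) {    // single node: list becomes empty
        delete head;
        head = nullptr;
        return;
    }
    Node* prev = head;
    while (prev->next->next != nullptr)  // stop at the second last node
        prev = prev->next;
    delete prev->next;              // free the last node
    prev->next = nullptr;           // break the link
}
```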
once again the cost of traversal will be big o of n so inserting and deleting at end or tail is
not an option for us because we will not be able to do push and pop in constant time if we choose
to insert and delete from end the cost of inserting or deleting from beginning however
is big o of one it will take constant time to insert a node at beginning or delete a node from
beginning to insert a node at beginning we must create a new node in this example here once again
i have created a new node let's say the address of the new node is 350 i will insert some data in the
first field of this node okay so to insert this node at beginning we just need to build two links
first we need to build this link so we will set the address here as whatever the address of the
current head is and then we can break this link and make this guy the new head by setting its address
here in this variable named head to delete a node in this example here we will have to first cut this
link and build this link which will mean resetting the address in this variable head and then we can
free the memory allocated to this particular guy this particular node deletion from beginning
once again is a constant time operation so this is the thing if we will insert at beginning and
delete from beginning then all our conditions are satisfied so linked list implementation of stack
is pretty straightforward all we need to do is insert a node at the beginning and delete a node
from beginning so head of the linked list is basically the top of stack i would rather name
this variable top here i'll quickly write a basic implementation in c i'm defining node as a
structure in c i want to create a stack of integers so first field in the node is an integer another
field is pointer to node that will store the address of the next node we have seen this definition of
node in all our previous lessons on linked list the next thing that i'm doing is i'm declaring
a variable named top which is pointer to node and initially i'm setting the address in it as null
i'm using variable name top instead of head here when top is null our stack is empty by initializing
top as null i'm saying that initially my stack is empty now let's write push and pop functions
this is my push function push is taking an integer x as argument that must be inserted
onto the stack the first thing that we are doing in push function is that we are creating a node
using malloc let's say in this example in this logical representation that i'm showing here
i'm performing a push operation so i'm making a call to push function passing it number two as
argument so a node is created in what we call the dynamic memory or heap let's
say the address of this node is hundred this variable is basically a pointer pointing to this node
temp is a pointer pointing to this node in the next line we are setting the data field in this node
we are dereferencing temp to do so then we are setting the link part of this newly created node
as existing top so we are building this link and then we are saying top equal temp so we are building
this link this is simple insertion at beginning of a linked list we have one complete video in this
series on how to insert a node at beginning of linked list let's do one more push let's say i want
to push number five onto the stack this time once again a node will be created we will set the data
and then we will first point this guy to the existing top and then make this pointer variable
point to this guy the new top let's say the address of this guy is 250 so the address in
this variable top will be set as 250 after this second push this is how my stack will look like
top here is a global variable so we do not need to pass it as argument to functions it is accessible
to all the functions in an object oriented implementation it can be a private field
and we can set it as null in the constructor okay let's now see how pop function
will look like this is my pop function let's say for this example i'm making a call to pop function
if the stack is already empty we can check whether stack is empty or not by checking whether top
is null or not if top is null stack is empty in this case we can throw some error and return
for this example here stack is not empty we have two integers in the stack what we are first doing
is we are creating a pointer to node temp and pointing it to the top node and now we are breaking
this link we are setting the address in top as address of the next node and now using this
pointer variable temp we are freeing the memory allocated to the node being removed from the list
once i exit the pop function this is my stack so this pretty much is the core of our implementation
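Putting the push and pop steps together, a minimal sketch of the linked list based stack could look like this. The field name `link`, the error messages, the out-of-memory check and the extra `IsEmpty` and `Top` helpers are assumptions added for illustration. The code keeps the C style of the lesson (`malloc` and `free`) but is written so that it also compiles as C++.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

struct Node {
    int data;
    Node* link;    // address of the next node
};

Node* top = NULL;  // top == NULL means the stack is empty

// Push: insert a node at the beginning of the list, O(1).
void Push(int x) {
    Node* temp = (Node*)malloc(sizeof(Node));
    if (temp == NULL) {              // assumed guard: heap is exhausted
        printf("Error: out of memory\n");
        return;
    }
    temp->data = x;
    temp->link = top;                // new node points to the existing top
    top = temp;                      // new node becomes the top
}

// Pop: delete the node at the beginning of the list, O(1).
void Pop() {
    if (top == NULL) {
        printf("Error: stack is empty\n");
        return;
    }
    Node* temp = top;
    top = top->link;                 // second node becomes the top
    free(temp);                      // release the removed node
}

bool IsEmpty() { return top == NULL; }

// Caller must make sure the stack is not empty.
int Top() {
    assert(top != NULL);
    return top->data;
}
```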
i would encourage you to write rest of the stuff yourself you can write code for operations like
top and is empty linked list implementation of stack has some advantages one of the advantages
is that unlike array based implementation we do not need to worry about overflow
unless we exhaust the memory of the machine itself some amount of extra memory is used in
each node to store reference or address but the fact that we use memory when needed
and release when not needed is something that makes push and pop operations more graceful
so this is linked list based implementation of stack in our coming lessons we will solve
some problems using stack this is it for this lesson thanks for watching
in our previous lesson we saw how we can implement a stack we saw two popular implementations of stack
one using arrays and another using linked list a warrior should not just possess a weapon
he must also know when and how to use it as programmers we must know
in what all scenarios we can use a particular data structure in this lesson i'm going to talk
about one simple use case of stack a stack can be used to reverse a list or collection
or simply to traverse a list or collection in reverse order i'm going to talk about two problems
reversal of string and reversal of linked list and i'm going to solve both these problems
using stack let's first discuss reversal of string i have a string in the form of a
character array here i have this string hello a string is a sequence of characters
this is a c-style string in c a string must be terminated with a null character so this last
character is a null character reversal means characters in the array should be rearranged
like what i'm showing here on the right null character is used only to mark the end of string
it is not part of string okay there are couple of efficient ways in which we can reverse a string
let's first discuss how we can solve this problem using a stack and then we will see how
efficient it is what we can do is we can create a stack of characters i'm showing logical representation
of a stack here this is a stack of characters and right now it's empty and now what we can do is
we can traverse the characters in the string from left to right and start pushing them onto the stack
so first h goes into the stack then the next character is e then l then we have another l
and then the last character is o once all the characters in the string have gone into the stack
we can once again start at the zeroth index now we need to write the topmost character
in the stack at this index we can get the topmost character by calling top operation
and now we can perform a pop and now we can go to the next index fill in whatever is at top of stack
and perform a pop again we can go on doing this until stack is empty so all the positions
in the character array will be overwritten so finally we have reversed our string here
in a stack whatever goes in last comes out first so if we will push a bunch of items onto a stack
and once all items are pushed if we will start popping we will get the items in reverse order
first item pushed onto the stack will come out last let's quickly write code for this logic
i'm going to write c++ here things will be pretty similar in other languages so it doesn't really
matter what i'm going to do in my code is i'm going to create a character array to store a string
and then i will ask user to input a string once i input the string i will make a call to a function
named reverse passing it the array and length of string that i will get by making a call to string
length function and finally i'm printing the reversed string now i need to write the reverse
function in reverse function i want to use a stack a stack of characters we have already seen
how we can implement stack in c++ we can create a class named stack that would have
an array of characters and an integer variable named top to mark the top of stack in array
and these variables can be private and we can work upon the stack using these public functions
in reverse function we can simply create an object of stack and use it this class can be an
array based implementation of stack or a linked list based implementation of stack it doesn't really
matter in c++ and many other languages language libraries also give us implementation of stack
in this program i'm not going to write my own stack i'm going to use stack from what we call
standard template library in c++ i will have to use this include statement hash include stack
and now i have a stack class available to me to create an object of this class i need to write
stack and within angle brackets data type for which we want a stack then after space name or
identifier with this one statement here i have created a stack of characters let's now write the
core logic this n in the signature of reverse function is number of characters in string
and as we know an array in c or c++ is always passed to a function as a pointer to its first element this c
followed by brackets is only an alternate syntax for asterisk c it's interpreted like this by the
compiler okay so now what i'm going to do is i'm going to run a loop starting 0 till n minus 1
so i will traverse the string from left to right and as i traverse the string i will push
the character onto stack by calling push function i will use a statement like this
once push is done i'll do another loop for pop i will run a loop with this variable i starting
at 0 going till n minus 1 and i'll first set c i as top of stack and then i will perform a pop
operation if you want to know more about functions available with stack in stl like their signatures
and how to use them you can check the description of this video for some resources this is all i need
to do in my reverse function let's run this code and see what happens i need to enter a string
let's enter hello this is what i get as output which seems to be correct let's run this again and
this time i want to enter my code school this looks all right too so we seem to be good so this
function is solving my problem of reversal let's now see how efficient it is let's analyze its
time complexity we know that all operations on stack take constant time so all these statements
inside the loops will take constant time the first loop is running n times and then the
second loop is also running n times first loop will execute in big o of n and the second loop
will also execute in big o of n the loops are not nested they are one after other so in such scenario
complexity of the whole function will also be big o of n time complexity is big o of n
but we are using some extra memory here for stack we are pushing all the characters in the string
onto the stack the extra space taken in stack will be proportional to number of characters
in the string will be proportional to n so we can say that space complexity of this function
is also big o of n in simple words extra space taken is directly proportional to n
there are efficient ways to reverse a string without using extra space the most efficient way
probably would be to use just two variables to mark the start and end index in the string
initially let's say i am using variables i and j initially i for this example is zero and j is
four while i is less than j we can swap the characters at these positions and once we have
swapped we can increment i and decrement j if i is less than j we can swap again
and once again increment i and decrement j now i is not less than j i is equal to j at this stage
we can stop swapping and we are done this algorithm has space complexity big o of one we are using
constant extra memory here time complexity of this approach once again is big o of n
we will do n by two swaps so time taken will be proportional to n definitely because of space
complexity this approach is better than our stack approach sometimes when we know that our
input will be very small and time and space is not much of concern we use a particular algorithm
because it is easy to implement and intuitive it's clearly not the case when we are using stack
to reverse a string but for this other problem reversal of linked list that we had said we will
discuss using a stack gives us a neat and intuitive solution i have drawn a linked list of integers
here as we know linked lists are collections of entities that we call nodes each node contains
two fields one to store data and other to store address of next node i have assumed that these
nodes in this example here are at addresses 100 150 250 and 300 respectively identity of a
linked list is address of the head node we typically store this address in a variable named head
in an array it takes constant time to access any element so whether it's the first element
or last element it takes constant time to access it it is so because array is stored as one
contiguous block of memory so if we know the starting address of the array let's say the starting
address of this array is 400 and size of each character in the array is one byte
so for this example each element is one byte then we can calculate the address of any element
so we know that a4 is at 400 plus 4 or 404 but in a linked list nodes are stored at
disjoint locations in memory to access any node we have to start at the head node
so we can't do something as simple as having two pointers at start and end and accessing the
elements we have already seen in the series two possible approaches that can be used to reverse
a linked list one was an iterative solution where we go on reversing links as we traverse the linked
list using some temporary variables another solution was using recursion the time complexity of
iterative solution is big o of n space complexity is big o of one in recursive solution we do not
create a stack explicitly but recursion uses the stack in computer's memory that is used to execute
function calls in such a case we say that we are using implicit stack stack is not being created
explicitly but still we are using an implicit stack i will come back to this and explain in detail
the time complexity of recursive solution once again is big o of n but this time the space complexity is
also big o of n because of the implicit stack now let's see how we can use an
explicit stack to solve this problem once again i have drawn logical representation
of stack here right now the stack is empty in a program this will be a stack of type pointer to
node what i'm going to do now is i'm going to traverse this linked list using a temporary
pointer to node the temporary variable will initially point to head when we will go to a
particular node we will push the address of that node onto the stack so first 100 will go to stack
and now we will move to the next node now 150 will go in stack and now we will go to 250
and then to the last node at 300 we are showing addresses here in the stack but basically the
objects that we are pushing are pointers to node or in other words references to nodes
if node is defined like this in c++ we will have to use these statements to traverse the linked
list and push all the references let's say head is a pointer to node which i'm assuming is a global
variable that will store the address of head node i'm using a temporary variable that is
pointer to node initially i'm storing the address of head node in this temporary variable and then
i'm running a loop and i'm traversing the linked list and as i'm traversing i'm pushing the reference
onto stack once all the references are pushed onto stack we can start popping them and as we will pop
them we will get references to nodes in reverse order it would be like going through the list in
reverse order while traversing the list in reverse order we can build reverse links
the first thing that i'll do is i'll take a temporary variable that will be pointer to node
and store the address at the top of stack which right now is 300 now i will set head
as this address so head now becomes 300 and then i will pop i'm running you through this example
here as i'm writing code head and temp right now are both 300 and now i will run a loop like this
like what i have written here while stack is not empty this function empty returns true if stack
is empty i'm using stack from standard template library in c++ so while stack is not empty i'm
going to say that set temp dot next as address at top of stack basically i'm using this pointer to
node temp to dereference and set this particular address field right now top is 250 so i'm building
this reverse link next statement is a pop and in the next statement i'm saying temp equal temp dot
next which means temp will now point to this node at 250 stack is not empty so loop will execute
again we are writing address here now then we should pop and then move to 150 using this
statement temp equal temp dot next now we're building this link popping and then
oops this should have been 150 and with the next temp equal temp dot next we're going here
even though we have built this link by setting this field here this node is still pointing to
this guy because the stack is empty now we will exit the loop after the loop after exit from the
loop i have written one more line temp dot next equal null so i'm setting the last link part of
last node in reversed list as null finally this is my reverse function i have assumed that head is
a global variable and it's a pointer to node if you want the complete source code you can check
the description of this video for a link using a stack in this case is making our life easier
reversing a linked list is still a complex problem try to just print the elements of linked list in
reverse order if you will use a stack it will be really easy i will stop here for this lesson
if you want to know what i meant by implicit stack you can once again check the
description of this video for some resources so this is it for this lesson thanks for watching
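The reversal walked through in this lesson can be sketched as below. As in the narration, `head` is assumed to be a global pointer to node and the stack comes from the standard library. The early return for an empty list is an added guard that the walkthrough does not cover.

```cpp
#include <stack>

struct Node {
    int data;
    Node* next;
};

Node* head = nullptr;   // assumed global, stores the address of the head node

// Reverse the linked list by pushing all node addresses onto a stack
// and then rebuilding the links in the order they are popped.
void Reverse() {
    if (head == nullptr) return;    // added guard: nothing to reverse

    std::stack<Node*> S;
    Node* temp = head;
    while (temp != nullptr) {       // push the reference to every node
        S.push(temp);
        temp = temp->next;
    }

    temp = S.top();                 // the last node becomes the new head
    head = temp;
    S.pop();

    while (!S.empty()) {
        temp->next = S.top();       // build the reverse link
        S.pop();
        temp = temp->next;          // move to the node we just linked
    }
    temp->next = nullptr;           // old head is now the last node
}
```

Printing the list in reverse order needs only the first loop followed by a second loop that prints `S.top()->data` and pops until the stack is empty.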
in our previous lesson we saw one simple application of stack we saw that a stack can be
used to reverse a list or collection or maybe to simply traverse a list or collection in reverse
order now in this lesson we will discuss another famous problem that can be solved
using stack and this is also a popular programming interview question and the problem is given an
expression in the form of a string comprising let's say constants variables operators
and parenthesis and when i say parenthesis i also want to include curly braces and brackets in
my definition of parenthesis so my expression or string can contain characters that can be
upper or lowercase letters symbols for operators and an opening or closing parenthesis or an opening
or closing curly brace or an opening or closing square bracket let's write down some expressions
here i'm going to write a simple expression we have one simple expression here with one pair
of opening and closing parenthesis here in this expression we have nested parenthesis
now given such expressions we want to write a program that would tell us whether parenthesis
in the expression are balanced or not and what do we really mean by balanced parenthesis what we
really mean by balanced parenthesis is that corresponding to each opening parenthesis or
opening curly brace or opening bracket we should have a closing counterpart in correct order
these two expressions here are balanced however this next expression is not balanced
a closing curly brace is missing here this next expression is also not balanced because we are
missing an opening square bracket here this next one is also not balanced because corresponding to
this opening curly brace we do not have a closing curly brace and corresponding to this closing
parenthesis we do not have an opening parenthesis if we are opening with a curly brace we should
also close with a curly brace these two won't count for each other checking for balanced parenthesis is
one of the tasks performed by a compiler when we write a program we often miss an opening or closing
curly brace or an opening or closing parenthesis compiler must check for this balancing and if symbols
are not balanced it should give you an error in this problem here what's inside a parenthesis
does not matter we do not want to check for correctness of anything that is
inside a parenthesis so in the string any character other than opening and closing parenthesis or
opening and closing curly brace or opening and closing square bracket can be ignored this problem
sometimes is better stated like this given a string comprising only of opening and closing
characters of parenthesis braces or brackets we want to check for balancing so only these characters
and their order is important while parsing a real expression we can simply ignore other
characters all we care about is these characters and their order okay so now how do we solve this
problem one straightforward thing that comes to mind is that because we should have a closing
counterpart for an opening parenthesis or opening curly brace or opening square bracket what we can
do is we can count the number of opening and closing symbols for each of these three types
and they should be equal so the number of opening parenthesis should be equal to number of closing
parenthesis and the number of opening curly braces should be equal to number of closing curly braces
and same should be true for square brackets as well but it will not be good enough this expression
here has one opening parenthesis and one closing parenthesis but it's not balanced this next one
is balanced but this one with same number of characters of each type as the second expression
is not balanced so this approach won't work apart from count being equal there are some other
properties that must be conserved every opening parenthesis must find a closing counterpart to
its right and every closing parenthesis must find an opening counterpart to its left which is not
true in the first expression and the other property that must be conserved is that a parenthesis can
close only when all the parenthesis opened after it are closed this parenthesis has been opened after
this square bracket so this square bracket cannot close unless this parenthesis has closed
anything that is opened last should be closed first well actually it should not
be last opened first closed in this example here this is getting opened last but this guy
that is open previous to this is closed first and it is fine the property that must be conserved
is that as we scan the expression from left to right any closer should be for the previous
unclosed parenthesis any closer should be for the last unclosed let's scan some expressions
from left to right and see how it's true let's scan this last one we will go from left to right
first character is an opening of square bracket second one is an opening parenthesis
let's mark opening of unclosed parenthesis in red okay now we have a closer here
the third character is a closer this should be the closer for the last unclosed so this
should be the closer for this one this guy this opening parenthesis last unclosed now is this
guy next character once again is an opening parenthesis now we have two unclosed parenthesis
at this stage and this one is the last unclosed the next one is a closer so it should be the closer
for the last unclosed now the last unclosed once again is the opening of square bracket
now when we have a closer it should be closer for this guy
we can use this approach to solve this problem what we can do is we can scan the expression
from left to right and as we scan at any stage we can keep track of all the unclosed parenthesis
basically what we can do is whenever we get an opening symbol an opening parenthesis an
opening curly brace or an opening square bracket we can add it to a list if we get a closing symbol
it should be the closer for the last element in the list in case of an inconsistency
like if the last opening symbol in the list is not of the same type as the closing symbol
or if there is no last opening symbol at all because the list is empty we can stop this whole
process and say that parenthesis are not balanced else we can remove the last opening symbol in the
list because we have got its counterpart and continue this whole process things will be
further clear if i will run through an example i will run through this last example once again
we are going to scan this expression from left to right and we will maintain a list to keep track
of all the open parenthesis that are not yet closed we will keep track of all the unclosed
parenthesis opened but not closed initially this list is empty the first character that we have got
is an opening of square bracket this will go into the list and we will move to the next character
the next character is an opening parenthesis so once again it should go to the list we should
always insert at end in the list the next character is a closing of parenthesis now we must look at
the last opening symbol in the list and if it is of the same type then we have got its counterpart
and we should remove this now we move on to the next character this is once again an opening
parenthesis it should go in the list at the end the next character is a closing of parenthesis
so we will look at the last element in the list it's an opening parenthesis so we can remove it
from the list and now we go to the last character which is a closing of square bracket once again
we need to look at the last element in the list we have one element only one element in the list
at this stage it's an opening of square bracket so once again we can remove it from the list
now we are done scanning the expression and the list is empty once again if everything is all right if
parenthesis are balanced we will always end with an empty list if in the end list is not empty
then some opening parenthesis has not found its closing counterpart and expression is not balanced
one thing worth noticing here is that we are always inserting or removing one element at a time from
the same end of the list in this whole process whatever is coming in last in the list is going
out first there is a special kind of list that enforces this behavior that element should be inserted
and removed from the same end and we call it a stack in a stack we can insert and remove an element
one at a time from the same end in constant time so what we can do is whenever we get an opening
symbol while scanning the expression we can push it onto the stack and when we get a closing symbol we can
check whether the opening symbol at the top of stack is of the same type as the closing symbol
if it's of the same type we can pop it if it's not of the same type we can simply say that parenthesis
are not balanced i will quickly write pseudo code for this logic i'm going to write a function
named check balanced parenthesis that will take an expression in the form of a string as argument
first of all i will store the number of characters in the string in a variable and then i will create
a stack of characters and now i will scan the expression from left to
right using a loop while scanning if the character is an opening symbol if it's an opening parenthesis
or opening curly brace or opening square bracket we can push that character onto the stack let's
say this function push will push a character onto s else if expression i or the character
at ith position while scanning is a closing symbol of any of the three types we can have two scenarios
if stack is empty or top of stack does not pair with the closing symbol if we have a closing of
parenthesis then the top of stack should be an opening of parenthesis it cannot be an opening
of curly brace in such a scenario we can conclude that the parenthesis are not balanced else we
can perform a pop finally once our scanning is over we can check whether stack is empty or not
if it's empty parenthesis are balanced if it's not they are not balanced so this is my pseudo code
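The pseudo code described above can be sketched in Python as follows. This is a minimal sketch, not the lecturer's own implementation: the function name follows the lesson, a plain Python list stands in for the stack, and ignoring characters that are not brackets is an assumption added here. The sample inputs are illustrative.

```python
def check_balanced_parentheses(expression: str) -> bool:
    """Return True if every (, { and [ is closed by the matching symbol in the right order."""
    pairs = {")": "(", "}": "{", "]": "["}  # closing symbol -> matching opening symbol
    openers = set(pairs.values())
    stack = []  # a Python list used as a stack: append is push, pop removes the top

    for ch in expression:  # scan from left to right
        if ch in openers:
            stack.append(ch)  # opening symbol: push
        elif ch in pairs:
            # closing symbol: the stack must be non-empty and its top must pair with ch
            if not stack or stack[-1] != pairs[ch]:
                return False
            stack.pop()
        # assumption: any other character is simply skipped

    return not stack  # balanced only if no opening symbol is left unclosed


print(check_balanced_parentheses("[()()]"))  # True
print(check_balanced_parentheses(")("))      # False
print(check_balanced_parentheses("[(])"))    # False
print(check_balanced_parentheses("{()()}"))  # True
```

Each character is pushed and popped at most once, so the scan takes time proportional to the length of the expression.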
let's run through couple of examples and see whether this works for all scenarios or test cases or
not let's first look at this expression the first thing that we are doing in our code is that we are
creating a stack of characters i have drawn logical representation of a stack here okay now let's scan
this string let's say we have a zero based index and the string is just a character array we are
starting the scan we are going inside the loop this is a closing of parenthesis so this if statement
will not hold true so we will go to the else condition and now we will go inside the else
to check for this condition whether stack is empty or not or whether the top of stack pairs
with this closing symbol or not the stack is empty if the stack is empty there is no opening
counterpart for this closing symbol so we will simply return false returning means exiting the
function so we are simply concluding here that parenthesis are not balanced and exiting
let's go through this one now first we have an opening square bracket so we will go to the first
if and push next one is an opening parenthesis once again it will be pushed next one is a closing
square bracket so the condition for this else if will be true we will go inside this else if
now this time the top of stack is an opening parenthesis it should have been an opening square
bracket and then only we would have a pair so this time also we will have to return false
and exit okay now let's go through this one first we will have a push the next one
will also be a push now next one is a closing parenthesis which pairs with the top of stack
which is opening of parenthesis so we will have a pop we will go to the next character and this
one once again is an opening parenthesis so there will be a push next one is a closing parenthesis
and the top is an opening parenthesis they pair so there will be a pop last character is a closing
curly brace so once again we will see whether top of stack is an opening curly brace or not do we
have a pair or not yes we have a pair so there will be a pop with this our scanning will finish
and finally stack should be empty it is empty so we have balanced parenthesis here
try implementing this pseudo code in a language of your choice and see whether it works for all
test cases or not if you want to look at my implementation you can check the description of
this video for a link in the coming lessons we will see some more problems on stack this is it for
this lesson thanks for watching hello everyone in this lesson we are going to talk about one
important and really interesting topic in computer science where we find application of stack
data structure and this topic is evaluation of arithmetic and logical expressions
so how do we write an expression I have written some simple arithmetic expressions here an expression
can have constants variables and symbols that can be operators or parenthesis and all these
components must be arranged according to a set of rules according to a grammar and we should be
able to parse and evaluate the expression according to this grammar all these expressions that I have
written here have a common structure we have an operator in between two operands
operand by definition is an object or value on which operation is performed in this expression
two plus three two and three are operands and plus is operator in the next expression
a and b are operands and minus is operator in the third expression this asterisk is for
multiplication operation so this is the operator the first operand p is a variable
and the second operand 2 is a constant this is the most common way of writing an expression
but this is not the only way this way of writing an expression in which we write
an operator in between operands is called in-fix notation an operand doesn't always have to be
a constant or variable operand can be an expression itself in this fourth expression that I have
written here one of the operands of multiplication operator is an expression itself another operand
is a constant we can have a further complex expression in this fifth expression that I have
written here both the operands of multiplication operator are expressions we have three operators
in this expression here for this first plus operator these variables p and q are operands
for the second plus operator we have r and s and for this multiplication operator the first
operand is this expression p plus q and the second operand is this expression r plus s
while evaluating expressions with multiple operators operations will have to be performed
in certain order like in this fourth example we will first have to perform the addition and then
only we can perform multiplication in this fifth expression first we will have to perform these
two additions and then we can perform the multiplication we will come back to evaluation but if you can
see in all these expressions operator is placed in between operands this is the syntax that we are
following one thing that I must point out here throughout this lesson we are going to talk only
about binary operators an operator that requires exactly two operands is called a binary operator
technically we can have an operator that may require just one operand or maybe more than two
operands but we are talking only about expressions with binary operators okay so let's now see what
all rules we need to apply to evaluate such expressions written in this syntax that we are
calling in-fix notation for an expression with just one operator there is no problem we can
simply apply that operator for an expression with multiple operators and no parenthesis like
this we need to decide an order in which operators should be applied in this expression if we will
perform the addition first then this expression will reduce to 10 into 2 and will finally evaluate
as 20 but if we will perform the multiplication first then this expression will reduce to 4 plus 12
and will finally evaluate to 16 so basically we can look at this expression in two ways
we can say that operands for addition operator are 4 and 6 and operands for multiplication are
this expression 4 plus 6 and this constant 2 or we can say that operands for multiplication are 6
and 2 and operands for addition operation are 4 and this expression 6 into 2 there is some ambiguity
here but if you remember your high school mathematics this problem is resolved by following operator
precedence rule in an algebraic expression this is the precedence that we follow first preference
is given to parenthesis or brackets next preference is given to exponents i'm using this symbol for
exponent operator so if i have to write 2 to the power 3 i'll be writing it something like this
in case of multiple exponentiation operator we apply the operators from right to left so
if i have something like this then first this rightmost exponentiation operator will be applied
so this will reduce to 512 if you will apply the left operator first then this will evaluate
to 64 after exponents next preference is given to multiplication and division and if it's between
multiplication and division operators then we should go from left to right after multiplication
and division we have addition and subtraction and here also we go from left to right if we have
an expression like this with just addition and subtraction operators then we will apply the
leftmost operator first because the precedence of these operators is same and this will evaluate to
3 if you will apply the plus operator first this will evaluate as 1 and that will be wrong
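As a quick check of these rules, Python's arithmetic operators follow the same precedence and associativity, with `**` as the exponent operator. The sketch below assumes the subtraction example on screen was `4 - 2 + 1`, which is consistent with the results 3 and 1 quoted above but is not confirmed by the transcript.

```python
# Precedence: multiplication is applied before addition.
print(4 + 6 * 2)      # 16
print((4 + 6) * 2)    # 20, parentheses force the addition first

# Exponentiation is right associative: 2 ** 3 ** 2 means 2 ** (3 ** 2).
print(2 ** 3 ** 2)    # 512
print((2 ** 3) ** 2)  # 64, the value obtained by applying the left operator first

# Addition and subtraction are left associative (assumed example expression).
print(4 - 2 + 1)      # 3, evaluated as (4 - 2) + 1
print(4 - (2 + 1))    # 1, the wrong result obtained by applying the plus first
```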
in this second expression 4 plus 6 into 2 that i have written here if we will apply operator
precedence then multiplication should be performed first if we want to perform the addition first
then we need to write this 4 plus 6 within parenthesis and now addition will be performed first because
precedence of parenthesis is greater i'll take example of another complex expression and try to
evaluate it just to make things further clear so i have an expression here in this expression
we have four operators one multiplication one division one subtraction and one addition
multiplication and division have higher precedence between these two multiplication and division
which have same precedence we will pick the left one first so we will first reduce this expression
like this and now we will perform the division and now we have only subtraction and addition
so we will go from left to right and this is what we will finally get this right to left and left
to right rule that i have written here for operators with equal precedence is better termed as operator
associativity if in case of multiple operators with equal precedence we go from left to right
then we say that the operators are left associative and if we go from right to left we say that the
operators are right associative while evaluating an expression in in-fix form we first need to
look at precedence and then to resolve conflict among operators with equal precedence we need to
see associativity all in all we need to do so many things just to parse and evaluate an
in-fix expression the use of parentheses becomes really important because that's how we can control
the order in which operations should be performed parentheses add explicit intent that operations
should be performed in this order and also improve readability of the expression i have modified
this third expression we have some parentheses here now and most often we write in-fix expressions
like this only using a lot of parentheses even though in-fix notation is the most
common way of writing expressions it's not very easy to parse and evaluate an in-fix expression
without ambiguity so mathematicians and logicians studied this problem and came up with two other
ways of writing expressions that are parenthesis free and can be parsed without ambiguity without
having to take care of any of these operator precedence or associativity rules and these two
ways are post-fix and prefix notations prefix notation was proposed earlier in year 1924 by
a Polish logician prefix notation is also known as Polish notation in prefix notation operator
is placed before operands this expression two plus three in in-fix will be written as plus two
three in prefix plus operator will be placed before the two operands two and three p minus
q will be written as minus p q once again just like in-fix notation operand in prefix notation
doesn't always have to be a constant or variable operand can be a complex prefix expression itself
this expression a plus b asterisk c in in-fix form will be written like this in prefix form
i'll come back to how we can convert in-fix expression to prefix first have a look at this
third expression in prefix form for this multiplication operator the two operands are
variables b and c these three elements are in prefix syntax first we have the operator and then
we have the two operands the operands for addition operator are variable a and this prefix expression
asterisk b c in in-fix expression we need to use parentheses because an operand can possibly be
associated with two operators like in this third expression in in-fix form b can be associated with
both plus and multiplication to resolve this conflict we need to use operator precedence
and associativity rules or use parenthesis to explicitly specify association but in prefix
form and also in post-fix form that we will discuss in some time an operand can be associated
with only one operator so we do not have this ambiguity while parsing and evaluating prefix
and post-fix expressions we do not need extra information we do not need all the operator
precedence and associativity rules i'll come back to how we can evaluate prefix notation
i'll first define post-fix notation post-fix notation is also known as reverse Polish notation
this syntax was proposed in 1950s by some computer scientists in post-fix notation operator is
placed after operands programmatically post-fix expression is easiest to parse and least costly
in terms of time and memory to evaluate and that's why this was actually invented prefix expression
can also be evaluated in similar time and memory but the algorithm to parse and evaluate post-fix
expression is really straightforward and intuitive and that's why it's preferred for
computation using machines i'm going to write post-fix for these expressions that i had written
earlier in other forms this first expression 2 plus 3 in post-fix will be 2 3 plus to separate
the operands we can use a space or some other delimiter like a comma that's how you would typically
store prefix or post-fix in a string when you'll have to write a program this second expression
in post-fix will be pq minus so as you can see in post-fix form we are placing the operator after
the operands this third expression in post-fix will be abc asterisk and then plus for this
multiplication operator operands are variables b and c and for this addition operands are variable
a and this post-fix expression bc asterisk we will see efficient algorithms to convert
in-fix to prefix or post-fix in later lessons for now let's not bother how we will do this
in a program let's quickly see how we can do this manually to convert an expression from
in-fix to any of these other two forms we need to go step by step just the way we would go in
evaluation i have picked this expression a plus b into c in in-fix form we should first convert
the part that should be evaluated first so we should go in order of precedence we can also first
put all the implicit parenthesis so here we will first convert this b into c so first we are doing
this conversion for multiplication operator and then we will do this conversion for addition operator
we will bring addition to the front so this is how the expression will transform we can use
parenthesis in intermediate steps and once we are done with all the steps we can erase the parenthesis
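The manual procedure above (make the implicit parentheses explicit, then move each operator in front of or behind its two operands) can be sketched with a small expression tree. The `BinOp` class and the two helper functions are hypothetical names introduced for illustration only, and this is not the efficient conversion algorithm promised for later lessons.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class BinOp:
    """A binary operation; each operand is a variable name or another BinOp."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[str, BinOp]


def to_prefix(e: Expr) -> str:
    """Write the operator before its two (already converted) operands."""
    if isinstance(e, str):
        return e
    return f"{e.op} {to_prefix(e.left)} {to_prefix(e.right)}"


def to_postfix(e: Expr) -> str:
    """Write the operator after its two (already converted) operands."""
    if isinstance(e, str):
        return e
    return f"{to_postfix(e.left)} {to_postfix(e.right)} {e.op}"


# a + b * c with the implicit parentheses made explicit: a + (b * c)
expr = BinOp("+", "a", BinOp("*", "b", "c"))
print(to_prefix(expr))   # + a * b c
print(to_postfix(expr))  # a b c * +
```

The tree plays the role of the fully parenthesized expression, so neither output needs any parentheses.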
let's now do the same thing for post-fix we will first do the conversion for
multiplication operator and then in next step we will do it for addition
and now we can get rid of all the parentheses parentheses surely add readability to any of
these expressions to any of these forms but if we are not bothered about human readability
then for a machine we are actually saving some memory that would be used to store parenthesis
information in-fix expression definitely is most human readable but prefix and post-fix are good
for machines so this is in-fix prefix and post-fix notation for you in next lesson we will discuss
evaluation of prefix and post-fix notations this is it for this lesson thanks for watching
in our previous lesson we saw what prefix and post-fix expressions are but we did not discuss how
we can evaluate these expressions in this lesson we will see how we can evaluate prefix and
post-fix expressions algorithms to evaluate prefix and post-fix expressions are similar but i'm going
to talk about post-fix evaluation first because it's easier to understand and implement and then
i'll talk about evaluation of prefix okay so let's get started i have written an expression in
in-fix form here and i first want to convert this to post-fix form as we know in in-fix form operator
is written in between operands and we want to convert to post-fix in which operator is written
after operands we have already seen how we can do this in our previous lesson we need to go step
by step just the way we would go in evaluation of in-fix we need to go in order of precedence
and in each step we need to identify operands of an operator and we need to bring the operator
after the operands what we can actually do is we can first resolve operator precedence and
put parenthesis at appropriate places in this expression we'll first do this multiplication
this first multiplication then we'll do this second multiplication then we will perform this
addition and finally the subtraction okay now we will go one operator at a time
operands for this multiplication operator are a and b so this a asterisk b will become
a b asterisk now next we need to look at this multiplication this will transform to c d
asterisk and now we can do the change for this addition the two operands are these two expressions
in post-fix so i'm placing the plus operator after these two expressions finally for this last
operator the operands are this complex expression and this variable e so this is how it will look
after the transformation finally when we are done with all the operators we can get rid of
all the parentheses they are not needed in post-fix expression this is how you can do the conversion
manually we will discuss efficient algorithms to convert in-fix to prefix or post-fix
programmatically in later lessons in this lesson
we are only going to look at algorithms to evaluate prefix and post-fix expressions
okay so we have this post-fix expression here and we want to evaluate this expression
let's say for these values of variables a b c d and e so we have this expression in terms of
values to evaluate i'll first quickly tell you how you can evaluate a post-fix expression manually
what you need to do is you need to scan the expression from left to right and find the
first occurrence of an operator like here multiplication is the first operator in post-fix expression
operands of an operator will always lie to its left for the first operator the preceding two
entities will always be operands you need to look for the first occurrence of this pattern
operand operand operator in the expression and now you can apply the operator on these two operands
and reduce the expression so this is what i'm getting after evaluating two three asterisk
now we need to repeat this process till we are done with all the operators once again we need to
scan the expression from left to right and look for the first operator if the expression is correct
it will be preceded by two values so basically we need to look for first occurrence of this pattern
operand operand operator so now we can reduce this we have six and then we have five into four
twenty we are using space as delimiter here there should be some space in between two operands
okay so this is what i have now once again i look for the first occurrence of operand operand
and operator we will go on like this till we are done with all the operators
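The manual reduction just described can be sketched as below. It assumes a well-formed expression already split into tokens, and the helper name `reduce_once` and the operator table `OPS` are hypothetical. Each step finds the first operator, applies it to the two values to its left, and replaces the three tokens with the result.

```python
import operator

# Hypothetical operator table for the four binary operators used in the lesson.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def reduce_once(tokens: list) -> list:
    """Replace the first 'operand operand operator' pattern with its value."""
    for i, tok in enumerate(tokens):
        if tok in OPS:
            # In a well-formed expression the first operator is preceded by two plain values.
            left, right = tokens[i - 2], tokens[i - 1]
            return tokens[:i - 2] + [OPS[tok](left, right)] + tokens[i + 1:]
    return tokens  # no operator left, nothing to reduce


tokens = [2, 3, "*", 5, 4, "*", "+", 9, "-"]
while any(tok in OPS for tok in tokens):
    tokens = reduce_once(tokens)
    print(tokens)
# [6, 5, 4, '*', '+', 9, '-']
# [6, 20, '+', 9, '-']
# [26, 9, '-']
# [17]
```

Because every step rescans from the left, this mirrors the pen-and-paper method rather than an efficient program.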
when i'm saying we need to look for first occurrence of this pattern operand operand and operator
what i mean by operand here is a value and not a complex expression itself the first operator
will always be preceded by two values and if you will give this some thought you will be able to
understand why if you can see in this expression we are applying the operators in the same order
in which we have them while parsing from left to right so first we are applying this left most
multiplication on two and three then we are applying the next multiplication on five and four then
we are performing the addition and then finally we are performing the subtraction and whenever
we are performing an operation we are picking the last two operands preceding the operator in the
expression so if we have to do this programmatically if we have to evaluate a post-fix expression given
to us in a string like this and let's say operands and operators are separated by space we can have
some other delimiter like comma also to separate operands and operator now what we can do is we
can parse the string from left to right in each step in this parsing in each step in this scanning
process we can get a token that will either be an operator or an operand what we can do is as we
parse from left to right we can keep track of all the operands seen so far and i'll come back to
how it will help us so i'm keeping all the operands seen so far in a list the first entity that
we have here is two which is an operand so it will go to the list next we have three which once
again is operand so it will go into the list next we have this multiplication operator
now this multiplication should be applied to last two operands preceding it last two operands to
the left of it because we already have the elements stored in this list all we need to do is we
need to pick the last two from this list and perform the operation it should be two into three
and with this multiplication we have reduced the expression this two three asterisk has now become
six it has become an operand that can be used by an operator later we are at this stage right now
that i'm showing in the right i'll continue the scanning next we have an operand we'll push this
number five onto the list next we have four which once again will come to the list and now we have
the multiplication operator and it should be applied to the last two operands in the reduced
expression and we should put the result back into the list this is the stage where we are right now
so this list actually is storing all the operands in the reduced expression
preceding the position at which we are during parsing now for this addition we should take out
the last two elements from the list and then we should put the result back next we have an operand
we are at this stage right now next we have an operator this subtraction we will perform this
subtraction and put the result back finally when i'm done scanning the whole expression i'll have
only one element left in the list and this will be my final answer this will be my final result
this is an efficient algorithm we are doing only one pass on the string representing the expression
and we have our result the list that we are using here if you could notice is being used in a special
way we are inserting operands one at a time from one side and then to perform an operation we are
taking out operand from the same side whatever is coming in last is getting out first
this whole thing that we are doing here with the list can be done efficiently with a stack
which is nothing but a special kind of list in which elements are inserted
and removed from the same side in which whatever gets in last comes out first it's called a last
in first out structure let's do this evaluation again i have drawn logical representation of stack
here and this time i'm going to use this stack i'll also write pseudo code for this algorithm i'm
going to write a function named evaluate postfix that will take a string as argument let's name this
string expression exp for expression in my function here i'll first create a stack now for the sake
of simplicity let's assume that each operand or operator in the expression will be of only one
character so to get a token an operand or operator we can simply run a loop from zero till
length of expression minus one so expression i will be my operand or operator if expression i
is operand i should push it onto the stack else if expression i is operator we should do
two pop operations in the stack store the value of the operands in some variable i'm using variables
named op1 and op2 let's say this pop function will remove an element from top of stack s
and also return this element once we have the two operands we can perform the operation i'm using
this variable to store the output let's say this function will perform the operation now the result
should be pushed back onto the stack if i have to run through this expression with whatever code i
have right now then first entity is two which is operand so it should be pushed onto the stack
next we have three once again this will go to the stack next we have this multiplication operator
so we will come to this else if part of the code i'll make first pop and i'll store three
in this variable op1 well actually this is the second operand so i should say this one is op2
and next one will be op1 once i have popped these two elements i can perform the operation
as you can see i'm doing the same stuff that i was doing with the list the only thing is that
i'm showing things vertically stack is being shown as a vertical list i'm inserting or taking
out from the top now i'll push the result back onto the stack now we will move to the next entity
which is operand it will go into the stack next four will also go into the stack and now we have
this multiplication so we will perform two pop operations after this operation is performed
result will be pushed back next we have addition so we will go on like this we have 26 pushed onto
the stack now now it's nine which will go in and finally we have this subtraction 26 minus nine
17 will be pushed onto the stack at this stage we will be done with the loop
we are done with all the tokens all the operands and operators the top of stack can be returned as
final result at this stage we will have only one element in the stack and this element will be
my final result you will have to take care of some parsing logic in actual implementation
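A minimal Python sketch of this stack-based evaluation follows. It is not the lecturer's linked implementation. It assumes tokens are separated by spaces, which lets `str.split` handle multi-digit operands, converts operands with `float`, and uses a hypothetical `OPS` table in place of the unnamed "perform the operation" function from the pseudo code.

```python
import operator

# Hypothetical operator table standing in for the lesson's "perform operation" function.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def evaluate_postfix(expression: str) -> float:
    """Evaluate a space-delimited post-fix expression in one left-to-right pass."""
    stack = []
    for token in expression.split():
        if token in OPS:
            op2 = stack.pop()  # the first pop gives the second operand
            op1 = stack.pop()  # the second pop gives the first operand
            stack.append(OPS[token](op1, op2))  # push the result back
        else:
            stack.append(float(token))  # operand: push its numeric value
    return stack.pop()  # a well-formed expression leaves exactly one value


print(evaluate_postfix("2 3 * 5 4 * + 9 -"))  # 17.0
print(evaluate_postfix("12 4 /"))             # 3.0
```

A malformed expression would raise an `IndexError` on an empty pop; the sketch leaves that unhandled.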
operand can be a number of multiple digits and then we will have delimiter like space or comma
so you'll have to take care of that parsing operand or operator will be some task if you want to see
my implementation you can check the description of this video for a link okay so this was post-fix
evaluation let's now quickly see how we can do prefix evaluation once again i've written this
expression in infix form and i'll first convert it to prefix we will go in order of precedence i first
put these parentheses this two asterisk three will become asterisk two three this five into four will
become asterisk five four and now we will pick this plus operator whose operands are these two
prefix expressions finally for the subtraction operator this is the first operand and this is
the second operand in the last step we can get rid of all the parenthesis so this is what i have
finally let's now see how we can evaluate a prefix expression like this we will do it just like
post-fix this time all we need to do is we need to scan from right so we will go from right to left
once again we will use a stack if it's an operand we can push it on to the stack so here for this
example nine will go on to the stack and now we will go to the next entity in the left it's four
once again we have an operand it will go on to the stack now we have five
five will also be pushed on to the stack and now we have this multiplication operator at this stage
we need to pop two elements from the stack this time the first element popped will be the first
operand in post-fix the first element popped was the second operand this time the second element
popped will be the second operand for this multiplication first operand is five and second
operand is four this order is really important for multiplication the order doesn't matter but for
say division or subtraction this will matter result 20 will be pushed on to the stack
and we will keep moving left now we have three and two both will go on to the stack
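The right-to-left scan and the operand order described above can be sketched in the same way. As before, this assumes space-delimited tokens and a hypothetical `OPS` table. The only differences from post-fix evaluation are the reversed scan and that the first element popped is now the first operand.

```python
import operator

# Hypothetical operator table for the four binary operators used in the lesson.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def evaluate_prefix(expression: str) -> float:
    """Evaluate a space-delimited prefix expression by scanning right to left."""
    stack = []
    for token in reversed(expression.split()):
        if token in OPS:
            op1 = stack.pop()  # the first pop gives the first operand
            op2 = stack.pop()  # the second pop gives the second operand
            stack.append(OPS[token](op1, op2))
        else:
            stack.append(float(token))
    return stack.pop()


print(evaluate_prefix("- + * 2 3 * 5 4 9"))  # 17.0
print(evaluate_prefix("- 9 2"))              # 7.0
```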
and now we have this multiplication operation three and two will be popped and their product
six will be pushed now we have this addition the two elements at top are 20 and six they will be
popped and their sum 26 will be pushed finally we have this subtraction 26 and nine will be popped
out and 17 will be pushed and finally this is my answer prefix evaluation can be performed
in couple of other ways also but this is easiest and most straightforward okay so this was prefix
and post-fix evaluation using stack in coming lessons we will see efficient algorithms to
convert in-fix to prefix or post-fix this is it for this lesson thanks for watching in our
previous lesson we saw how we can evaluate prefix and post-fix expressions now in this lesson we
will see an efficient algorithm to convert in-fix to post-fix we already know of one way of doing
this we have seen how we can do this manually to convert an in-fix expression to post-fix we apply
operator precedence and associativity rules let's do the conversion for this expression that I have
written here the precedence of multiplication operator is higher so we will first convert this
part b asterisk c b asterisk c will become b c asterisk the operator will come after the
operands now we can do the conversion for this addition for addition the operands are a and this
post-fix expression in the final step we can get rid of all the parentheses so finally this is
my post-fix expression we can use this logic in a program also but it will not be very efficient
and the implementation will also be somewhat complex i'm going to talk about one algorithm
which is really simple and efficient and in this algorithm we need to parse the in-fix expression
only once from left to right and we can create the post-fix expression if you can see in in-fix
to post-fix conversion the positions of operands and operators may change but the order in which
operands occur from left to right will not change the order of operators may change this is an
important observation in both in-fix and post-fix forms here the order of operands as we go from
left to right is first we have a then we have b and then we have c but the order of operators
is different in in-fix first we have plus and then we have multiplication in post-fix first we
have multiplication and then addition in post-fix form we will always have the operators in the
same order in which they should be executed i'm going to perform this conversion once again but
this time i'm going to use a different logic what i'll do is i'll parse the in-fix expression from
left to right so i'll go from left to right looking at each token that will either be an operand
or an operator in this expression we will start at a a is an operand if it's an operand we can
simply append it in the post-fix string or expression that we are trying to create
at least for a it should be very clear that there is nothing that can come before a
okay so the first rule is that if it's an operand we can simply put it in the post-fix expression
moving on next we have an operator we cannot put the operator in the post-fix expression
because we have not seen its right operand yet while parsing we have seen only its left operand
we can place it only after its right operand is also placed so what i'm going to do is i'm going
to keep this operator in a separate list or collection and place it later in the post-fix expression
when it can be placed and the structure that i'm going to use for storage is stack a stack is only
a special kind of list in which whatever comes in last goes out first insertion and deletion
happen from the same end i have pushed plus operator onto the stack here moving on next we have b
which is an operand as we had said operand can simply be appended there is nothing that can come
before this operand the operator in the stack is anyway waiting for the operand to come now at
this stage can we place the addition operator in the post-fix string well actually what's after b
also matters in this case we have this multiplication operator after b which has higher precedence
and so the actual operand for addition is this whole expression b*c we cannot perform
the addition until multiplication is finished so while parsing when i'm at b and i have not seen
what's ahead of b i cannot decide the fate of the operator in the stack so let's just move on now
we have this multiplication operator i want to make this expression further complex
to explain things better so i'm adding something at tail here in this expression
now i want to convert this expression to post-fix form i'm not having any parenthesis here we will
see how we can deal with parenthesis later let's look at an expression where parenthesis does not
override operator precedence okay so right now in this expression while parsing from left to right
we are at this multiplication operator the multiplication operator itself cannot go into
the post-fix expression because we have not seen its right operand yet and until its right
operand is placed in the post-fix expression we cannot place it the operator that we would be
looking at while parsing that operator itself cannot be placed right away but looking at that
operator we can decide whether something from the collection something from the stack
can be placed into the post-fix expression that we are constructing or not
any operator in the stack having higher precedence than the operator that we are looking at
can be popped and placed into the post-fix expression let's just follow this as rule for
now and i'll explain it later there is only one operator in the stack and it is not having
higher precedence than multiplication so we will not pop it and place it in the post-fix expression
multiplication itself will be pushed if an element in the stack has something on top of it that
something will always be of higher precedence so let's move on in this expression now now we are
at c which is an operand so it can simply go next we have an operator subtraction subtraction
itself cannot go but as we had said if there is anything on the stack having higher precedence
than the operator that we are looking at it should be popped out and should go
and the question is why we are putting these operators in the stack we are not placing them
in the post-fix expression because we are not sure whether we are done with their right
operand or not but after that operator as soon as i'm getting an operator of lower precedence
that marks the boundary of the right operand for this multiplication operator
c is my right operand it's this simple variable for addition b*c is my right operand because
subtraction has lower precedence anything on or after that cannot be part of my right operand
subtraction i should say has lower priority because of the associativity rule if you remember
the order of operation addition and subtraction have same precedence but the one that would occur
in left would be given preference so the idea is any time for an operator in the stack if i'm getting
an operator of lower priority we can pop it from the stack and place it in the expression
here we will first pop multiplication and place it and then we can pop addition
and now we will push subtraction onto the stack let's move on now d is an operand
so it will simply go next we have multiplication there is nothing in the stack having higher
precedence than multiplication so we will pop nothing multiplication will go onto the stack
next we have an operand it will simply go now there are two ways in which we can find
the end of right operand for an operator (a) if we get an operator of lesser precedence
(b) if we reach the end of the expression now that we have reached end of expression we can simply
pop and place these operators so first multiplication will go and then subtraction will go
let's quickly write pseudo code for whatever i have said so far and then you can sit with some
examples and analyze the logic i'm going to write a function named InfixToPostfix that will take
a string exp for expression as argument for the sake of simplicity let's assume that
each operand or operator will be of one character only in an actual implementation you can assume
them to be tokens of multiple characters so in my pseudo code here the first thing that i'll do is
i'll create a stack of characters named s now i'll run a loop starting 0 till length of expression
minus 1 so i'm looking at each character that can either be an operand or operator
if the character is an operand we can append it to the post fix string well actually i should have
declared and initialized a string before this loop this is the result string in which i'll be
appending else if exp[i] is an operator we need to look for operators in the stack
having higher precedence so i'll say while stack is not empty and the top of stack has
higher precedence and let's say this function HasHigherPrecedence will take two arguments
two operators so if the top of stack has higher precedence than the operator that we are looking
at we can append the top of stack to the result which is the variable that will store the post fix
string and then we can pop that operator i'm assuming that this s is an object of some class that has these functions
top and pop and empty to check whether it's empty or not finally once i'm done with the popping
outside this while loop i need to push the current operator
okay so this
is the end of my for loop at the end of it i may have some operators left in the stack i'll pop
these operators and append them to the post fix string i'll use this while loop i'll say that
while the stack is not empty append the operator at top and pop it and finally after this while loop
i can return the result string that will contain my post fix expression so this is my pseudo code for
whatever logic i've explained so far in my logic i've not taken care of parenthesis
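The pseudo code described above (no parentheses yet) can be sketched in Python roughly as follows. This is only a sketch under stated assumptions: the names `PRECEDENCE`, `has_higher_precedence` and `infix_to_postfix` are chosen here for illustration, every token is a single character, and operators of equal precedence are treated as left associative, so "higher precedence" in the loop condition also covers "equal precedence".

```python
# Assumed precedence table: a larger number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def has_higher_precedence(on_stack: str, current: str) -> bool:
    """True if the operator on the stack must be placed before `current`.

    Equal precedence counts too, because + - * / are left associative.
    """
    return PRECEDENCE[on_stack] >= PRECEDENCE[current]


def infix_to_postfix(exp: str) -> str:
    stack = []   # operators still waiting for their right operand
    result = []  # pieces of the postfix string being built
    for ch in exp:
        if ch == " ":
            continue  # ignore spaces between tokens
        if ch in PRECEDENCE:
            # The current operator marks the end of the right operand
            # of every stacked operator of higher (or equal) precedence.
            while stack and has_higher_precedence(stack[-1], ch):
                result.append(stack.pop())
            stack.append(ch)
        else:
            result.append(ch)  # operands go straight to the output
    # End of expression: pop and place whatever is left.
    while stack:
        result.append(stack.pop())
    return "".join(result)


print(infix_to_postfix("a+b*c"))      # abc*+
print(infix_to_postfix("a+b*c-d*e"))  # abc*+de*-
```

In the second call the subtraction pops multiplication first and then addition, and at the end of the expression the remaining multiplication and subtraction are popped in that order, which is the sequence walked through in the lesson.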
what if my infix expression would have parenthesis like this there will be slight change from what
we were doing previously with parenthesis any part of the expression within parenthesis
should be treated as independent complete expression in itself and no element outside
the parenthesis will influence its execution in this expression this part a plus b is one within
one parenthesis its execution will not be influenced by this multiplication or this subtraction
which is outside it similarly this whole thing is within the outer parenthesis so this multiplication
operator outside will not have any influence on execution of this part as a whole if parenthesis
are nested inner parenthesis is sorted out or resolved first and then only outer parenthesis
can be resolved with parenthesis we will have some extra rules we will still go from left to right
and we will still use stack and let's say i'm going to write the post fix part right here
as i created now while parsing a token can be an operand an operator or an opening or closing of
parenthesis we will have some extra rules i'll first tell them and then i'll explain if it's an
opening parenthesis we can push it onto the stack the first token here in this example is an opening
parenthesis so it will be pushed onto the stack and then we will move on we have an opening parenthesis
once again so once again we will push it now we have an operand there is no change in rule
for operand it will simply be appended to the post fix part next we have an operator
remember what we were doing for operator earlier we were looking at top of stack and
popping as long as we were getting operator of higher precedence earlier when we were not using
parenthesis we could go on popping and empty the stack but now we need to look at top of stack and
pop only till we get an opening parenthesis because if we are getting an opening parenthesis
then it's the boundary of the last open parenthesis and this operator does not have any influence after
that outside that so this plus operator does not have any influence outside this opening parenthesis
i'll explain the scenario with some more examples later let's first understand the rule so the rule
is if i'm seeing an operator i need to look at the top of stack if it's an operator of higher
precedence i can pop and then i should look at the next top if it's once again an operator
of higher precedence i should pop again but i should stop when i see an opening parenthesis
at this stage we have an opening parenthesis at top so we do not need to look below it
nothing will be popped anyway addition however will go onto the stack remember after the whole
popping game we push the operator itself next we have an operand it will go and we will move on
next we have a closing of parenthesis when i'm getting a closing of parenthesis i'm getting a
logical end of the last opened parenthesis for part of the expression within that parenthesis
it's coming to the end and remember what we were doing earlier when we were reaching the end of
infix expression we were popping all the operators out and placing them so this time also we need to
pop all the operators out but only those operators that are part of this parenthesis that we are
closing so we need to pop all the operators until we get an opening parenthesis i'm popping this plus
and appending it next we have an opening of parenthesis so i'll stop but as last step i will
pop this opening also because we are done for this parenthesis okay so the rule for closing
of parenthesis pop until you're getting an opening parenthesis and then finally pop that
particular opening parenthesis also let's move on now next we have an operator we need to look at
top of stack it's an opening of parenthesis this operator will simply be pushed next we have an
operand next we have an operator once again we will look at the top we have multiplication
which is higher precedence so this should be popped and appended we will look at the top again it's
an opening of parenthesis so we should stop looking now minus will be pushed now next we have an operand
next we have closing of parenthesis so we need to pop until we get an opening minus will be appended
finally the opening will also be popped next we have an operator and this will simply go
next we have an operand and now we have reached the end of expression so everything in the stack
will be popped and appended so this finally is my post fix expression i'll take one more example
and convert it to make things further clear i want to convert this expression i'll start at
the beginning first we have an operand then this multiplication operator which will simply go onto
the stack the stack right now is empty there is nothing on the top to compare it with next we
have an opening parenthesis which will simply go next we have an operand it will be appended
and now we move on to this addition operator if this opening parenthesis was not there
the top of stack would have been the multiplication operator which has higher precedence so it would
have been popped but now we will look at the top and it's an opening parenthesis so we cannot look
below and we will simply have to move on next we have c i missed pushing the addition operator
last time okay after c we have this closure so we need to pop until we get an opening
and then we need to pop one opening also finally we have reached the end of expression so everything
in the stack will be popped and appended so this finally is my post fix part post fix form
in my pseudo code that i had written earlier only the part within this for loop will change
to take care of parenthesis in case we have an operator we need to look at
top of the stack and pop but only till we are getting an opening parenthesis so i have put this
extra condition in the while loop this condition will make sure that we stop
once we get an opening parenthesis right now in the for loop we are dealing with
operands and operators we will have two more conditions if it's an opening of parenthesis
we should push else if it's a closing parenthesis we can go on popping and appending let's say this
function IsOpeningParenthesis will check whether a character is opening of parenthesis or not
in fact we should use this function here also when i'm checking whether current token is
opening or not because it could be an opening curly brace or opening bracket also
this function will then take care let's say this function will take care and similarly for
this last else if we should use this function IsClosingParenthesis okay things are consistent
now after this while loop in the last else if we should do one extra pop and this extra pop will
pop the opening parenthesis and now we are done with this else if and this is the closure of my for loop
rest of this stuff will remain same after the for loop we can pop the leftovers and append them to the
string and finally we can return so this is my final pseudo code you can check the description
of this video for a link to real implementation actual source code okay so i'll stop here
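The final pseudo code with the parenthesis rules can be sketched in Python along these lines. It is not the source code linked from the video; the names `OPENING`, `CLOSING` and `infix_to_postfix` are assumptions made for this sketch, tokens are single characters, and the two test expressions are examples picked here to have the same shape as the ones traced in the lesson.

```python
# Assumed precedence table: a larger number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
OPENING = "([{"
CLOSING = ")]}"


def infix_to_postfix(exp: str) -> str:
    stack = []   # operators and opening parentheses
    result = []  # pieces of the postfix string being built
    for ch in exp:
        if ch == " ":
            continue
        if ch in OPENING:
            stack.append(ch)
        elif ch in CLOSING:
            # Pop the operators that belong to the parenthesis being closed.
            while stack and stack[-1] not in OPENING:
                result.append(stack.pop())
            if not stack:
                raise ValueError("closing parenthesis without an opening one")
            stack.pop()  # the one extra pop: discard the opening parenthesis
        elif ch in PRECEDENCE:
            # Pop higher (or equal) precedence operators, but never look
            # below an opening parenthesis.
            while (stack and stack[-1] not in OPENING
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[ch]):
                result.append(stack.pop())
            stack.append(ch)
        else:
            result.append(ch)  # operand
    while stack:
        top = stack.pop()
        if top in OPENING:
            raise ValueError("opening parenthesis was never closed")
        result.append(top)
    return "".join(result)


print(infix_to_postfix("((a+b)*c-d)*e"))  # ab+c*d-e*
print(infix_to_postfix("a*(b+c)"))        # abc+*
```

This sketch treats every kind of opening bracket alike and does not check that a closing bracket matches the type of the one it closes; a stricter version could compare the pair before the extra pop.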
now this is it for this lesson thanks for watching
hello everyone we have been talking
about data structures for some time now as we know data structures are ways to store and organize
data in computers so far in this series we have discussed some of the data structures like
arrays linked lists and in last couple of lessons we have talked about stack in this lesson we are
going to introduce you to queues we are going to talk about queue ADT just the way we did it for stacks
first we are going to talk about queue as abstract data type or ADT as we know when we talk about
a data structure as abstract data type we define only the features or operations available with
the data structure and do not go into implementation details we will see possible implementations in
later lessons in this lesson we are only going to discuss logical view of queue data structure okay so
let's get started queue data structure is exactly what we mean when we say queue in real world a queue is a
structure in which whatever goes in first comes out first in short we call queue a FIFO structure
earlier we had seen stack which is called a last in first
out structure or in short LIFO a stack is a collection in which both insertion and removal
happen from the same end that we call the top of stack in queue however an insertion must happen from
one end that we call rear or tail of the queue and any removal must happen from the other end
that we can call front or head of the queue if i have to define queue formally as an abstract data type
then a queue is a list or collection with the restriction or constraint that insertion can be and must be
performed at one end that we call the rear of queue or the tail of queue and deletion can be performed
at other end that we can call the front of queue or head of queue let's now define the interface or
operations available with queue just like stack we have two fundamental operations here
an insertion is called enqueue operation some people also like to name this operation push
enqueue operation should insert an element at tail or rear end of queue deletion is called
dequeue operation in some implementations people call this operation pop also
push and pop are more famous in context of stack enqueue and dequeue are more famous in context of queues
while implementing you can choose any of these names in your interface
dequeue should remove an element from front or head of the queue and dequeue typically also returns this element
that it removes from the head the signatures of enqueue and dequeue for a queue of integers can be something like
this enqueue is returning void here while dequeue is returning an integer this integer should be
the removed element from the queue you can design dequeue also to return void typically a third operation
front or peek is kept just to look at the element at the head just like the top operation that we
had kept in stack this operation should just return the element at front and should not delete
anything okay we can have few more operations we can have one operation to check whether queue is
empty or not if queue has a limited size then we can have one operation to check whether queue is full or
not why i'm calling out these alternate names for operations is also because most of the time
we do not write our own implementation of a data structure we use inbuilt implementations
available with language libraries interface can be different in different language libraries
for example if you would use the inbuilt queue in C++ the function to insert is push
while in C# it's Enqueue so we should not get confused i'll just keep more famous names here
okay so these are the operations that i have defined with queue ADT enqueue dequeue front and IsEmpty
we can insert or remove one element at a time from the queue using enqueue and dequeue front is only to look at
the element at head IsEmpty is only to verify whether queue is empty or not all these operations
that i have written here must take constant time or in other words their time complexity should be
big o of one logically a queue can be shown as a figure or container open from two sides
so an element can be inserted or enqueued from one side and an element can be removed or
dequeued from other side if you remember stack we show a stack as a container open from one side
so an insertion or what we call push in context of stack and removal or pop both must happen from
the same side in queue insertion and removal should happen from different sides
let's say i want to create a queue of integers let's say initially we have an empty queue
i will first write down one of the operations and then show you the simulation in logical view
let's say i first want to enqueue number two this figure that i'm showing here
right now is an empty queue of integers and i'm saying that i'm performing an enqueue operation
here in a program i would be calling an enqueue function passing it number two as argument
after this enqueue we have one element in the queue we have one integer in the queue
because we have only one element in the queue right now front and rear of the queue are
the same let's enqueue one more integer now i want to insert number five
five will be inserted at rear or tail of the queue let's enqueue one more
and now i want to call dequeue operation so we will pick two from head of the queue and it will go
out if dequeue is supposed to return this removed integer then we will get integer two as return
enqueue and dequeue are the fundamental operations available with queue in our design we can have some
more for our convenience like we have front and IsEmpty here a call to front at this stage
will get us number five integer five as return no integer will be removed from the queue
calling IsEmpty at this stage can return us a boolean false or zero for false and one for true
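The simulation above can be replayed with a small Python sketch of the queue ADT. The class name `Queue` and its method names are assumptions made here; the sketch leans on `collections.deque` from the standard library, which appends at one end and removes from the other in constant time. The lesson does not say which value the third enqueue inserts, so 7 below is only a placeholder.

```python
from collections import deque


class Queue:
    """Queue of integers: insert at the rear, remove from the front."""

    def __init__(self) -> None:
        self._items = deque()

    def enqueue(self, x: int) -> None:
        self._items.append(x)          # rear / tail

    def dequeue(self) -> int:
        return self._items.popleft()   # front / head; IndexError if empty

    def front(self) -> int:
        return self._items[0]          # look only, remove nothing

    def is_empty(self) -> bool:
        return len(self._items) == 0


q = Queue()
q.enqueue(2)
q.enqueue(5)
q.enqueue(7)          # assumed value, not stated in the lesson
print(q.dequeue())    # 2
print(q.front())      # 5
print(q.is_empty())   # False
```

Here `dequeue` returns the removed element; as the lesson notes, it could just as well be designed to return nothing and leave reading to `front`.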
so this pretty much is how queue works now one obvious question can be what are the real scenarios
where we can use queue what are the use cases of queue data structure queue is most often used in
a scenario where there is a shared resource that's supposed to serve some requests but the resource
can handle only one request at a time it can serve only one request at a time in such a scenario
it makes most sense to queue up the requests the request that comes first gets served first
let's say we have a printer shared in a network any machine in the network can send a print request
to this printer printer can serve only one request at a time it can print only one document at a time
so if a request comes when it's busy it can't be like i'm busy request later that will be really
rude of the printer what really happens is that the program that manages the printer puts the
print request in a queue as long as there is something in the queue printer keeps picking up
a request from the front of the queue and serves it processor on your computer is also a shared
resource a lot of running programs or processes need time of the processor but the processor
can attend to only one process at a time processor is the guy who has to execute all the instructions
who has to perform all the arithmetic and logical operations so the processes are put in a queue
queues in general can be used to simulate wait in a number of scenarios we will discuss some
of these applications of queue in detail while solving some problems in later lessons this is
good for an introduction in next lesson we will see how we can implement queue this is it for this
lesson thanks for watching
in our previous lesson we introduced you to queue data structure we talked
about queue as abstract data type or ADT as we know when we talk about the data structure as abstract
data type we define it as a mathematical or logical model we define only the features or operations
available with the data structure and do not go into implementation details in this lesson we are
going to discuss possible implementations of queue i will do a quick recap of what we have
discussed so far a queue is a list or collection with this restriction with this constraint
that insertion can be performed at one end that we call rear of queue or tail of queue and deletion
can be performed at other end that we call the front of queue or the head of queue and insertion
in queue is called enqueue operation and deletion is called dequeue operation i have defined
queue ADT with these four operations that i have written here in an actual implementation all these
operations will be functions front operation should simply return the element at front of queue it
should not remove any element from the queue IsEmpty should simply check whether queue is empty
or not and all these operations must take constant time enqueue dequeue or looking at the element
at front the time taken for any of these operations must not depend upon a variable like number of
elements in queue or in other words time complexity of all these operations must be big o of one
okay so let's get started we are saying that a queue is a special kind of list in which elements
can be inserted or removed one at a time and insertion and removal happen at different ends
of the queue we can insert an element at one end and we can remove an element from the other end
just the way we did it for stack we can add these constraints or extra properties of queue
to some implementation of a list and create a queue there are two popular implementations of
queue we can have an array based implementation and we can have linked list based implementation
let's first discuss array based implementation let's say we want to create a queue of integers
what we can do is we can first create an array of integers i have created an array of 10 integers
here i have named this array a now what i'm going to do is i'm going to use this array
to store my queue what i'm going to say is that at any point some part of the array starting an
index marked as front till an index marked as rear will be my queue in this array i'm showing front
of the queue towards left and rear towards right in earlier examples i was showing front towards
right and rear towards left doesn't really matter any side can be front and any side can be rear
it's just that an element must always be added from rear side and must always be removed from front
so if at any stage a segment of the array from an index marked as front till an index marked as
rear is my queue and rest of the positions in the array are free space that can be used to
expand the queue to insert an element to enqueue we can increment rear so we will add a new cell
in the queue towards rear end and in this cell we can write the new value element to be inserted
can come to this position i'll fill in some values here at these positions so we have these
integers in the queue and let's say we want to insert number five to insert we will increment
rear of course there should be an available cell in the right an available empty cell in the right
and now we can write value five here after insertion new rear index is seven and the value at index
seven is five now dequeue means we must remove an element from front of the queue in this example
here a dequeue operation should remove number two from the queue to dequeue we can simply increment
front because at any point only the cells starting front till rear are part of my queue
by incrementing front i have discarded index two from the queue and we do not care what value lies
in a cell that is not part of the queue when we will include a cell in the queue we will overwrite
the value in that cell anyway so just incrementing front is good enough for dequeue operation
let's quickly write pseudocode for whatever we have discussed so far
in my code i will have two variables named front and rear and initially i'll set them both as minus
one let's say for an empty queue both front and rear will be minus one to check whether
queue is empty or not we can simply check the value of front and rear and if they're both minus one
we can say that queue is empty i just wrote IsEmpty function here minus one is not a valid index
for an empty queue there will be no front and rear in our implementation we are saying that we will
represent empty state of queue by setting both front and rear as minus one
now let's write the enqueue function enqueue will take an integer x as argument there will be a couple
of conditions in enqueue if rear is already equal to maximum index available in array a we cannot insert
or enqueue an element in such scenario we can return and exit i would rather use a function named
IsFull to determine whether queue is full or not if queue is already full we can't do much we should
simply exit else if queue is empty we can add a cell to the queue we can add cell at index zero in the
queue by setting both front and rear as zero and now we can set the value at index rear as x in all other cases we can first
increment rear and then we can fill in value x at index rear i can get this statement a[rear]
equal x outside these two conditional statements because it's common to them so this is my enqueue
function in the example array that i'm showing here let's enqueue some integers i'll make calls to
enqueue function and show you the simulation in the figure here let's say first i want to insert
number two in the queue i'm making a call to enqueue function passing number two as argument
the queue is empty so we will set both front and rear as zero now we will come to this statement
we will write value two at index zero so this is my queue after one enqueue operation front and
rear of the queue is same let's make another call to enqueue this time i want to insert number five
this time queue is not empty so rear will be incremented we have added a cell to the queue by incrementing
rear and now we will write the value five at the new rear index let's enqueue one more number i have
enqueued seven let's now write dequeue operation there will be couple of cases in dequeue if the queue is already empty
we cannot remove an element in this case we can simply print or throw an error and return
or exit there will be one more special case if the queue has only one element in this case front and
rear will not be minus one but they will both be equal because we are already checking for minus
one case in IsEmpty function in the previous if in this else if we can simply check whether
front is equal to rear or not if this is the case our dequeue will make the queue empty and to mark the queue
as empty we need to set both front and rear as minus one this is what we had said that we will
represent an empty queue by marking both front and rear as minus one in default or normal scenario
we will simply increment front we should really be careful about corner cases in any implementation
that's where most of the bugs come okay so this finally is my dequeue function
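The array based pseudo code so far (before the circular array idea) might look like this in Python. It is a sketch under assumptions: the class name `ArrayQueue` and the method names are made up here, a fixed-size Python list stands in for the array a, and the error cases raise exceptions where the lesson says we can simply return or print an error.

```python
class ArrayQueue:
    """Queue of integers kept in a[front..rear] of a fixed-size array."""

    def __init__(self, capacity: int = 10) -> None:
        self.a = [0] * capacity
        self.front = -1   # -1 / -1 represents the empty queue
        self.rear = -1

    def is_empty(self) -> bool:
        return self.front == -1 and self.rear == -1

    def is_full(self) -> bool:
        # Full as soon as rear reaches the last index, even if cells
        # to the left of front are free.
        return self.rear == len(self.a) - 1

    def enqueue(self, x: int) -> None:
        if self.is_full():
            raise OverflowError("queue is full")
        if self.is_empty():
            self.front = self.rear = 0
        else:
            self.rear += 1
        self.a[self.rear] = x   # common to both branches

    def dequeue(self) -> None:
        if self.is_empty():
            raise IndexError("queue is empty")
        if self.front == self.rear:
            self.front = self.rear = -1   # last element removed
        else:
            self.front += 1               # old cell is simply abandoned


q = ArrayQueue()
for value in (2, 5, 7):
    q.enqueue(value)
q.dequeue()
print(q.front, q.rear, q.a[q.front])   # 1 2 5
```

The value left behind at index 0 is not cleared, which matches the remark that cells outside the front to rear segment do not matter.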
in this example here at this stage let's say we want to perform a dequeue queue is not empty
and we do not have only one element in the queue so we will simply increment front before
incrementing we could set the value in this cell at index zero as something but the value in a cell
that is not part of queue anymore doesn't really matter at this stage it doesn't really matter
what we have at index zero or index three or any other index apart from the segment between
front and rear when we will add a cell in the queue we will overwrite the value in that cell
anyway let's now perform some more enqueues and dequeues i'm enqueuing three and then i'm enqueuing
one with each enqueue we are incrementing rear i just performed some more enqueues here
now let's perform a dequeue if i'll perform one more enqueue here rear will be equal to maximum index
available in the array let's enqueue one more now at this stage we cannot enqueue an element
anymore because we cannot increment rear enqueue operation will fail now there are two unused cells
right now but with whatever logic we have written we cannot use these two cells that are in the
left of front in fact this is a real problem as we will dequeue more and more all the cells left
of front index will never be used again they will simply be wasted can we do something to use these
cells well we can use the concept of a circular array circular array is an idea that we use in a
lot of scenarios the idea is very simple as we traverse an array we can imagine that there is no
end in the array from zero we can go to one from one we can go to two and finally when we will reach
the last index in the array like in this example when we are at index nine the next index for me
is index zero we can imagine this array something like this remember this is only a logical way of
looking at the array in circular interpretation of array if i'm pointing to a position and my
current position is i then the next position or next index will not simply be i plus one it will
be i plus one modulo the number of elements in array or the size of array let's say n is the number
of elements in array then the next position will be i plus one modulo n the modulo operation will
get us the remainder upon dividing by n for any i other than n minus one this modulo operation will
not have any effect but for i equal n minus one next position will be n modulo n which will be
equal to zero when you divide the number by itself the remainder is zero previous position in circular
interpretation of array will be i plus n minus one modulo n we could simply say i minus one modulo
n just to make sure this expression inside the parenthesis is always positive i'm adding n here
give this some thought you should be able to get why it should be i plus n minus one modulo n
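The two index formulas for the circular interpretation can be written as small helper functions. The names `next_index` and `prev_index` are assumptions made for this sketch. In Python `(i - 1) % n` is already non-negative, but in languages such as C, C++ or Java the remainder of a negative number can be negative, which is why adding n before taking the modulo is the safer form.

```python
def next_index(i: int, n: int) -> int:
    """Position after i in a circular array of n cells."""
    return (i + 1) % n


def prev_index(i: int, n: int) -> int:
    """Position before i; adding n keeps the dividend non-negative."""
    return (i + n - 1) % n


print(next_index(9, 10))   # 0, wraps around from the last index
print(next_index(3, 10))   # 4, modulo has no effect here
print(prev_index(0, 10))   # 9, wraps around from the first index
```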
now with this interpretation of array we can increment rear in an enqueue operation as long as there
is any unused cell in the array i'm going to modify functions in my pseudo code now IsEmpty
will remain the same we are still saying that for an empty queue front and rear will be minus one
let's scroll down and come to enqueue now in circular interpretation i will call my queue full when the
position next to rear in circular interpretation that we will calculate as rear plus one modulo
n will be equal to front so we will have a situation like this right now the next position to rear
in circular interpretation is front so there is no unused cell the complete array is exhausted
nothing will change in this condition if queue is empty we can simply set front and rear as zero
in the last else condition we will increment rear like this we will say rear is equal to
rear plus one modulo n where n is number of elements in the array with this much change my
enqueue function is good now let's make a call to enqueue and insert something in this array here i want
to insert number 15 we will come to this last else condition rear right now is nine so this
expression will be nine plus one modulo n n is 10 here the size of this array a is 10 here
this will evaluate to zero now my new rear is zero i will write number 15 here let's now see
what we need to do in dequeue function nothing will change in the first two conditions if queue is already
empty or if there is only one element in the queue we will handle these cases in same manner in the
final else when we are incrementing front we need to increment it in a circular manner so we will
say front equal front plus one modulo n where n is number of elements in the array total number
of elements in the array or size of array now let's perform a dequeue we will come to this condition
front right now is two so this will be two plus one modulo 10 which is three one more cell is available to us now
this much is the core of our implementation front operation will be really straightforward
we simply need to return the element at front index here also we first need to check whether
queue is empty or not we should return the element at front only when front is not equal to minus one all these
operations all these functions that i have written here will take constant time their time complexity
will be big o of one we are performing simple arithmetic and assignments in the functions
and not doing anything costly like running a loop so time taken will not depend upon
size of queue or some other variable i leave this here it should not be very difficult converting this
pseudo code to a running program in a language of your choice if you want to see my code you can
check the description of this video for a link thanks for watching
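As a rough illustration of the pseudocode described in this lesson, here is one way it might look in C. This is a sketch and not the code linked from the video: the capacity N of 10, the global array A, the bool return values and the out parameter of Front are all assumptions made for this example.

```c
#include <stdbool.h>

#define N 10                 /* capacity of the array, assumed for illustration */

static int A[N];
static int front = -1;       /* -1 for both indices marks an empty queue */
static int rear = -1;

bool IsEmpty(void) {
    return front == -1 && rear == -1;
}

/* Full when the cell after rear, in circular order, is front. */
bool IsFull(void) {
    return !IsEmpty() && (rear + 1) % N == front;
}

/* Returns false when the queue is full and nothing was inserted. */
bool Enqueue(int x) {
    if (IsFull()) return false;
    if (IsEmpty()) {
        front = rear = 0;
    } else {
        rear = (rear + 1) % N;   /* circular increment */
    }
    A[rear] = x;
    return true;
}

/* Returns false when the queue is empty and nothing was removed. */
bool Dequeue(void) {
    if (IsEmpty()) return false;
    if (front == rear) {
        front = rear = -1;       /* the only element was removed */
    } else {
        front = (front + 1) % N; /* circular increment */
    }
    return true;
}

/* Writes the element at front into *out; returns false on an empty queue. */
bool Front(int *out) {
    if (IsEmpty()) return false;
    *out = A[front];
    return true;
}
```

Every function here does a fixed amount of arithmetic and assignment with no loop, which is why each one runs in constant time.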
in our previous lesson we saw how we can implement queue using arrays now in this lesson we will see how
we can implement queue using linked list queue as we know from our previous discussions is a structure
in which whatever goes in first comes out first queue data structure is a list or collection with this
restriction that insertion can be performed at one end and deletion can be performed at other end
these are typical operations that we define with a queue an insertion is called enqueue operation
and a deletion is called dequeue operation front function should simply return the element
at front of list and is empty should check whether queue is empty or not and all these operations must
take constant time their time complexity should be big o of one when we were implementing queue with
arrays we used the idea of a circular array and in this case we have a limitation
the limitation is that array will always have a fixed size and once all the positions in the
array are taken once the array is exhausted we have two options we can either deny insertion
so we can say that the queue is full and we cannot insert anything now or what we can do is we can
create a new larger array and copy the elements from previous array to the new larger array
which will be a costly process we can avoid this problem if we will use linked list to implement
queue please note that this representation of circular array that i'm showing here
is only a logical way of looking at an array we can show this array like this also as i was
saying in an array implementation we will have this question what if array gets filled and we
need to take care of this we can either say queue is full or we can create a new larger array and
copy elements from previous filled array into this new larger array the time taken for this copy
operation will be proportional to number of elements in filled array or in other words we
can say that the time complexity of this copy operation will be big o of n there is another
problem with array implementation we can have a large enough array and queue may not be using most of
it like right now in this array 90 percent of the memory is unused memory is an important resource
and we should always avoid blocking memory unnecessarily it's not that some amount of unused memory
will be a real problem in a modern day machine it's just that while designing solutions and
algorithms we should analyze and understand these implications let's now see how good we
will be with a linked list implementation i have drawn a logical view of a linked list of integers
here coming back to basic definition of queue as we know a queue is a list or collection with this
constraint with this property that an element must always be inserted from one side of the
queue that we call the rear of queue and an element must always be removed from the other side that we
call the front of queue it's really easy to enforce this property in a linked list a linked list as we
know is a collection of entities that we call nodes and these nodes are stored at non-contiguous
locations in memory each node contains two fields one to store data and another to store
address of the next node or reference to the next node let's assume that nodes in this figure
are at addresses hundred two hundred and three hundred respectively i have also filled in the
address fields the identity of linked list that we always keep with us is address of the head node
we often name the pointer or reference variable that would store this address head okay so now
we are saying that we want to use linked list to implement queue these are the typical operations that
we define with a queue we can use a linked list like a queue we can pick one side for insertion or enqueue
operation so a node in the linked list must always be inserted from this side the other side will
then be used for dequeue so if we are picking head side for enqueue operation a dequeue must always happen from
tail if we are picking tail for enqueue operation then dequeue must always happen from head
whatever side we are picking for whatever operation we need to be taking care of one requirement and
the requirement is that these operations must take constant time or in other words their time
complexity must be big o of one as we know from our previous lessons the cost of insertion or removal
from head side is big o of one but the cost of insertion or removal from tail side is big o of
n so here's the deal in a normal implementation of linked list if we will insert at one side and
remove from other side then one of these operations enqueue or dequeue depending on how we are picking the sides
will cost us big o of n but the requirement that we have is that both these operations must take
constant time so we definitely need to do something to make sure that both enqueue and dequeue operations
take constant time let's call this side front and this side rear so i want to enqueue a node from this
side and i want to dequeue from this side we are good for dequeue operation because removal from front will
take constant time but insertion or enqueue operation will be big o of n let's first see why insertion
at tail will be costly and then maybe we can try to do something to insert at rear end what we
will have to do is first we will have to create a node we have a new node here let's say i've got
this node at address 350 and the integer that i want to enqueue is 7 the address part of this node can
be set as null now what we need to do is we need to build this link we need to set the address part
of the last node as address of this newly created node and to do so we first need to have a pointer
pointing to this last node storing the address of this last node in a linked list the only identity
that we always keep with us is address of the head node to get a pointer to any other node
we need to start at head so we will first create a pointer temp and we will initially set it to
head and now in one step we can move this pointer variable to the next of whatever node it is
pointing to we use a statement like temp equal temp dot next to move to the next node
so from first node we will go to the second node and then from second we will go to the third node
in this example third node is the rear node and now using this pointer temp we can write the address
part of this node and build this link this whole traversal that we are having to get a pointer from
head to tail is what's taking all the time what we can do is we can avoid this whole traversal
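To make the cost of that traversal concrete, here is a minimal sketch in C of inserting at the tail when only the head address is kept. The function name InsertAtTail and its convention of returning the head are assumptions made for this example.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Appends x at the tail of the list and returns the head.
 * The while loop walks every node, so one call costs O(n).
 * If allocation fails the list is returned unchanged. */
struct Node *InsertAtTail(struct Node *head, int x) {
    struct Node *node = malloc(sizeof *node);
    if (node == NULL) return head;
    node->data = x;
    node->next = NULL;

    if (head == NULL) return node;   /* empty list: new node is the head */

    struct Node *temp = head;
    while (temp->next != NULL) {
        temp = temp->next;           /* move to the next node */
    }
    temp->next = node;               /* build the link from the old last node */
    return head;
}
```

The loop is the part that a separately stored rear pointer removes.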
we can have a pointer variable just like head that should always store the address of
rear node i can call this variable tail or rear let's call this rear and let's call this variable
that is storing the address of head node front in any insertion or removal we will have to update
both front and rear now but now when we will enqueue let's say i have got a node at address 450
and i want to insert this node at rear end now using the rear pointer we can update the address
field here so we are building this link and now we can update rear we will only have to modify
some address fields and time taken for enqueue operation will not depend upon number of nodes
in the linked list so now with this design both enqueue and dequeue operations will be constant
time operations the time complexity for both will be big o of one let's quickly see how real code
in c will look like for this design i have declared node as a structure with two fields
one to store data and another to store address of next node and now instead of declaring a pointer
to node named head i'm declaring two pointers to node one
named front and another named rear and initially i'm setting them both as null
let's say i'm defining these two variables in global scope so they will be accessible to all
functions my enqueue function will take an integer as argument in this function i'll first create a
node i'll use malloc in c or new operator in c plus plus to create a node in what we call dynamic
memory i'm pointing to the newly created node using this variable which is a pointer to node
named temp now we can have two cases in insertion or enqueue operation if there is no element in
the queue if the queue is empty in this case both front and rear will be null we will simply set
both front and rear as address of this new node being pointed to by temp and we will return or
exit else because we already have a pointer to rear node we will first set the address part of
current rear as the address of this newly created node and then we will modify the address in rear
variable to make it point to this newly created node while writing all of this i'm assuming that
you already know how to implement a linked list if you want to refresh your concepts you can check
earlier lessons in this series or you can check the description of this video for a link to lesson
on linked list implementation in c or c plus plus this code will be further clear if i'll
show things moving in a simulation let's say initially we have an empty queue so both front and
rear will be null null is only a macro for address zero at this stage let's say we are making a call
to enqueue function passing it number two now let's go through the enqueue function and see what will happen
first we will create a node data part of this node will be set as two and address part initially
will be set as null let's say we got this node at address hundred so a variable
named temp is storing this address this variable is pointing to this node right now front and
rear are both null so we will go inside this if condition and simply set both front and rear
as hundred when the function will finish execution temp which is a local variable will be cleared
from memory after setting both front and rear as address of this newly created node we are returning
so this is how the queue will look like after first enqueue let's say we are making another call to
enqueue function at this stage passing number four as argument once again a new node will be created
let's say i got the new node at address 200 this time the queue is not empty so in this function
we will first go to this statement rear dot next equal temp so we will set the next part of this
node at address hundred as the address of the newly created node which is 200 so we will build this
link and now we will store the address of the new rear node in this variable named rear so this is
how my queue will look like after this second enqueue let's do one more enqueue let's enqueue number six
let's say we got a new node this time at address 300 so this is how our queue will look like okay
let's now write the dequeue function in dequeue function i'll first create a temporary pointer
to node in which i'll store the address of the current head or current front let's say for this
example at this stage i'm making a call to dequeue function we will have couple of cases in dequeue
also the queue could be empty so in this case we can print an error message and return in case of
empty queue front and rear will both be equal to null we can check any one of these and we will be
good in the case when front and rear will be equal we will simply set both front and rear as null
in all other cases we can simply make front point to the next node so we will simply do a front equal
front arrow next but why have we used this temporary pointer to node why have i declared
this temporary pointer to node in this code well simply incrementing front will not be good enough
in this example when i'm calling dequeue i'm first creating temp let's walk through whatever
code i've written so far so in the first line i'm creating temp and then because q is not empty
and there is more than one element in the queue i'm setting front as address of the next node
so my queue is good now all the links are appropriately modified but this node which was
front previously is still in the memory anything in dynamic memory has to be explicitly freed
to free this node we will use free function and to this free function we should be passing
address of the node and that's why we had created temp with this free the node will be wiped off
from memory so these are enqueue and dequeue operations for you and if you can see there are simple
statements in these functions there are no loops so these functions will take constant time
the time complexity will be big o of one in the beginning of this lesson we had also discussed
some limitations with array implementation like what if array gets filled and that of unused memory
we do not have these limitations in a linked list implementation we are using some extra memory
to store address of next node but apart from that there is no other major disadvantage
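Putting the walkthrough together, a linked list queue along these lines might look like the sketch below. It follows the steps described in this lesson, but it is not the source code linked from the video. The bool return from Enqueue, the malloc failure check, and the IsEmpty and Front functions that the lesson leaves to the viewer are assumptions added for this example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Two global pointers to node instead of a single head pointer. */
static struct Node *front = NULL;
static struct Node *rear = NULL;

/* Inserts x at the rear in constant time; false if allocation fails. */
bool Enqueue(int x) {
    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL) return false;
    temp->data = x;
    temp->next = NULL;
    if (front == NULL && rear == NULL) {   /* empty queue */
        front = rear = temp;
        return true;
    }
    rear->next = temp;                     /* link the old rear to the new node */
    rear = temp;
    return true;
}

/* Removes the node at front in constant time and frees its memory. */
void Dequeue(void) {
    struct Node *temp = front;             /* kept so the node can be freed */
    if (front == NULL) {
        printf("Queue is empty\n");
        return;
    }
    if (front == rear) {
        front = rear = NULL;               /* the only node is being removed */
    } else {
        front = front->next;
    }
    free(temp);
}

bool IsEmpty(void) {
    return front == NULL;
}

/* Writes the element at front into *out; false on an empty queue. */
bool Front(int *out) {
    if (front == NULL) return false;
    *out = front->data;
    return true;
}
```

Calling Enqueue with 2, 4 and 6 and then Dequeue once leaves 4 at the front, matching the simulation in the lesson.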
i'll stop here now you can write rest of the functions like front function to look at the
element at front or is empty function to check whether queue is empty or not yourself if you want
to get my source code then you can check the description of this video for a link so thanks for
watching hello everyone in this lesson we'll introduce you to an interesting data structure
that has got its application in a wide number of scenarios in computer science
and this data structure is tree so far in this series we have talked about what we can call
linear data structures array linked list stack and queue all of these are linear data structures
all of these are basically collections of different kinds in which data is arranged
in a sequential manner in all these structures that i'm showing here we have a logical start
and a logical end and then an element in any of these collections can have a next element and
a previous element so all in all we have linear or sequential arrangement now as we understand
these data structures are ways to store and organize data in computers for different kinds
of data we use different kinds of data structure our choice of data structure depends upon a number
of factors first of all it's about what needs to be stored a certain data structure can be best fit
for a particular kind of data then we may care for the cost of operations quite often we want to
minimize the cost of most frequently performed operations for example let's say we have a simple
list and we are searching for an element in the list most of the time then we may want to store
the list or collection as an array in sorted order so we can perform something like binary
search really fast another factor can be memory consumption sometimes we may want to minimize
the memory usage and finally we may also choose a data structure for ease of implementation
although this may not be the best strategy tree is one data structure that's quite often used to
represent hierarchical data for example let's say we want to show employees in an organization
and their positions in organizational hierarchy then we can show it something like this
let's say this is organizational hierarchy of some company in this company John is CEO
and John has two direct reports Steve and Rama then Steve has three direct reports Steve is
manager of Lee Bob and Ella they may be having some designation Rama also has two direct reports
then Bob has two direct reports and then Tom has one direct report this particular logical
structure that I've drawn here is a tree well you have to look at the structure upside down
and then it will resemble a real tree the root here is at top and we are branching out in
downward direction logical representation of tree data structure is always like this
root at top and branching out in downward direction okay so tree is an efficient way of storing
and organizing data that is naturally hierarchical but this is not the only application of tree
in computer science we will talk about other applications and some of the implementation
details like how we can create such a logical structure in computer's memory later first I
want to define tree as a logical model tree data structure can be defined as a collection of entities
called nodes linked together to simulate a hierarchy tree is a non-linear data structure it's a
hierarchical structure the topmost node in the tree is called root of the tree each node will
contain some data and this can be data of any type in the tree that I'm showing on the right here
data is name of employee and designation so we can have an object with two string fields one to store
name and another to store designation okay so each node will contain some data and may contain link
or reference to some other nodes that can be called its children now I'm introducing you to some
vocabulary that we use for tree data structure what I'm going to do here is I'm going to number
these nodes in the left tree so I can refer to these nodes using these numbers I'm numbering
these nodes only for my convenience it's not to show any order okay coming back as I had said each
node will have some data we can fill in some data in these circles it can be data of any type it can
be an integer or a character or a string or we can simply assume that there is some data
filled inside these nodes and we are not showing it okay as we were discussing a node may have
link or reference to some other nodes that will be called its children each arrow in this structure
here is a link okay now as you can see the root node which is numbered 1 by me and once again this
number is not indicative of any order I could have called the root node node number 10 also so root
node has links to these two nodes number two and three so two and three will be called children of
one and node one will be called parent of nodes two and three I'll write down all these terms that
I am talking about we mentioned root children and parent in this tree one is
parent of two and three two is child of one and now four five and six are children of two
so node two is child of node one but parent of nodes four five and six children of same parent
are called siblings I'm showing siblings in same color here two and three are siblings then four
five and six are siblings then seven and eight are siblings and finally nine and ten are siblings
I hope you are clear with these terms now the topmost node in the tree is called root root would be the
only node without a parent and then if a node has a direct link to some other node then we have a
parent child relationship between the nodes any node in the tree that does not have a child is
called leaf node all these nodes marked in black here are leaves so leaf is one more term
all other nodes with at least one child can be called internal nodes
and we can have some more relationships like parent of parent can be called
grandparent so one is grandparent of four and four is grandchild of one in general if we can
go from node a to b walking through the links and remember these links are not bidirectional
we have a link from one to two so we can go from one to two but we cannot go from two to one
when we are walking the tree we can walk in only one direction okay so if we can go from node
a to node b then a can be called ancestor of b and b can be called descendant of a
let's pick up this node number 10 one two and five are all ancestors of 10 and 10 is a descendant
of all of these nodes we can walk from any of these nodes to 10 okay let me now ask you some
questions to make sure you understand things what are the common ancestors of four and nine
ancestors of four are one and two and ancestors of nine are one two and five so common ancestors
will be one and two okay next question are six and seven siblings siblings must have same parent
six and seven do not have same parent they have same grandparent one is grandparent of both
nodes not having same parent but having same grandparent can be called cousins so six and seven
are cousins and these relationships are really interesting we can also say that node number three
is uncle of node number six because it's sibling of two which is
parent of six so we have quite some terms in vocabulary of tree
okay now i'll talk about some properties of tree tree can be called a recursive data structure
we can define tree recursively as a structure that consists of a distinguished node called root
and some subtrees and the arrangement is such that root of the tree contains links
to roots of all the subtrees t1 t2 and t3 in this figure are subtrees in the tree that i have drawn
on the left here we have two subtrees for root node i'm showing the root node in red the left subtree
in brown and the right subtree in yellow we can further split the left subtree and look at it like
node number two is root of this subtree and this particular tree with node number two as root has
three subtrees i'm showing the three subtrees in three different colors recursion basically is
reducing something in a self-similar manner this recursive property of tree will be used
everywhere in all implementations and uses of tree the next property that i want to talk about
is in a tree with n nodes there will be exactly n minus one links or edges each arrow in this figure
can be called a link or an edge all nodes except the root node will have exactly one incoming edge
if you can see i'll pick this node number two there is only one incoming link this is incoming
link and these three are outgoing links there will be one link for each parent child relationship
so in a valid tree if there are n nodes there will be exactly n minus one edges one incoming edge
for each node except the root okay now i want to talk about these two properties called depth
and height depth of some node x in a tree can be defined as length of the path from root to node
x each edge in the path will contribute one unit to the length so we can also say number of edges
in path from root to x the depth of root node will be zero let's pick some other node
for this node number five we have two edges in the path from root so the depth of this
node is two in this tree here depth of nodes two and three is one depth of nodes four five six
seven and eight is two and the depth of nodes nine ten and eleven is three okay now height of
a node in a tree can be defined as number of edges in longest path from that node to a leaf node
so height of some node x will be equal to number of edges in longest path from x to a leaf in this
figure for node three the longest path from this node to any leaf has two edges so height of node three is
two node eight is also a leaf node i'll mark all the leaf nodes here a leaf node is a node with zero
children the longest path from node three to any of the leaf nodes has two edges so the height of node three is
two height of leaf nodes will be zero so what will be the height of root node in this tree
we can reach all the leaves from root node number of edges in longest path is three so height of the
root node here is three we also define height of a tree height of tree is defined as height of root
node height of this tree that i'm showing here is three height and depth are different properties
and height and depth of a node may or may not be same we often confuse between the two
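The two definitions can be contrasted in a short C sketch. It assumes the binary node layout with left and right pointers that comes up later in this lesson, and the function names Height and Depth are chosen for this example. Returning minus one for an empty subtree is a convention assumed here so that a leaf comes out with height zero.

```c
#include <stddef.h>

struct Node {
    int data;
    struct Node *left;
    struct Node *right;
};

/* Height of x: number of edges in the longest path from x down to a leaf.
 * An empty subtree is given height -1 so that a leaf gets height 0. */
int Height(const struct Node *x) {
    if (x == NULL) return -1;
    int l = Height(x->left);
    int r = Height(x->right);
    return (l > r ? l : r) + 1;
}

/* Depth of target: number of edges in the path from root down to target.
 * Returns -1 if target cannot be reached from root. */
int Depth(const struct Node *root, const struct Node *target) {
    if (root == NULL) return -1;
    if (root == target) return 0;
    int d = Depth(root->left, target);
    if (d < 0) d = Depth(root->right, target);
    return d < 0 ? -1 : d + 1;
}
```

Height is measured downward from a node toward the leaves, while depth is measured from the root to the node, so the same node can have different values for the two.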
based on properties trees are classified into various categories there are different kinds
of trees that are used in different scenarios simplest and most common kind of tree is a tree
with this property that any node can have at most two children in this figure node two has three
children i'm getting rid of some nodes and now this is a binary tree binary tree is most famous
and throughout this series we will mostly be talking about binary trees the most common way of
implementing tree is dynamically created nodes linked using pointers or references just the way
we do for linked list we can look at the tree like this in this structure that i have drawn on the right
here node has three fields one of the fields is to store data let's say middle cell is to store
data the left cell is to store the address of the left child and the right cell is to store address
of right child because this is a binary tree we cannot have more than two children we can call
one of the children left child and another right child programmatically in c or c++ we can define
node as a structure like this we have three fields here one to store data let's say data type is
integer i have filled in some data in these nodes so in each node we have three fields we have an
integer variable to store the data and then we have two pointers to node one to store the address of
the left child that will be the root of the left subtree and another to store the address of the
right child we have kept only two pointers because we can have at most two children in binary
tree this particular definition of node can be used only for a binary tree for generic trees
that can have any number of children we use some other structure and i'll talk about it
in later lessons in fact we will discuss implementation in detail in later lessons this is just to give
you a brief idea of how things will be like in implementation okay so this is cool we understand
what a tree data structure is but in the beginning we had said that storing naturally hierarchical
data is not the only application of tree so let's quickly have a look at some of the applications
of tree in computer science first application of course is storing naturally hierarchical data
for example the file system on your disk drive the file and folder hierarchy is naturally hierarchical
data it's stored in the form of tree next application is organizing data organizing collections
for quick search insertion and deletion for example binary search tree that we'll be discussing a lot
in next couple of lessons can give us order of log n time for searching an element in it
a special kind of tree called trie is used to store dictionary it's really fast and efficient
and is used for dynamic spell checking tree data structure is also used in network routing
algorithms and this list goes on we'll talk about different kinds of trees and their applications
in later lessons i'll stop here now this is good for an introduction in next couple of lessons
we'll talk about binary search tree and its implementation this is it for this lesson thanks
for watching in our previous lesson we introduced you to tree data structure we discussed tree as
a logical model and talked briefly about some of the applications of tree now in this lesson
we will talk a little bit more about binary trees as we had seen in our previous lesson binary tree
is a tree with this property that each node in the tree can have at most two children we will first
talk about some general properties of binary tree and then we can discuss some special kind of
binary trees like binary search tree which is a really efficient structure for storing ordered data
in a binary tree as we were saying each node can have at most two children in this tree that
I've drawn here nodes have either zero or two children we could have a node with just one child
I have added one more node here and now we have a node with just one child because each node
in a binary tree can have at most two children we call one of the children left child and another
right child for the root node this particular node is left child and this one is right child
a node may have both left and right child and these four nodes have both left and right child
or a node can have either of left and right child this one has got a left child but has not got
right child I'll add one more node here now this node has a right child but does not have a left
child in a program we would set the reference or pointer to left child as null so we can say
that for this node left child is null and similarly for this node we can say that the right child is
null for all the other nodes that do not have children that are leaf nodes a node with zero
child is called leaf node for all these nodes we can say that both left and right child are null
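As a small sketch of the node described here, the structure below has one data field and two pointers that are null when the corresponding child is missing. The helper names NewNode and IsLeaf are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *left;    /* NULL when there is no left child */
    struct Node *right;   /* NULL when there is no right child */
};

/* Creates a node in dynamic memory with both children set to NULL.
 * Returns NULL if allocation fails. */
struct Node *NewNode(int data) {
    struct Node *node = malloc(sizeof *node);
    if (node != NULL) {
        node->data = data;
        node->left = NULL;
        node->right = NULL;
    }
    return node;
}

/* A leaf is a node for which both left and right child are NULL. */
bool IsLeaf(const struct Node *node) {
    return node != NULL && node->left == NULL && node->right == NULL;
}
```

A freshly created node is a leaf, and it stops being one as soon as either pointer is set to another node.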
based on properties we classify binary trees into different types I'll draw some more binary trees
here if a tree has just one node then also it's a binary tree this structure is also a binary tree
this is also a binary tree remember the only condition is that a node cannot have more than
two children a binary tree is called strict binary tree or proper binary tree if each node can have
either two or zero children this tree that I'm showing here is not a strict binary tree because
we have two nodes that have one child I'll get rid of two nodes and now this is a strict binary tree
we call a binary tree complete binary tree if all levels except possibly the last level are
completely filled and all nodes are as far left as possible all levels except possibly the last
level will anyway be filled so the nodes at the last level if it's not filled completely must be
as far left as possible right now this tree is not a complete binary tree nodes at same depth
can be called nodes at same level root node in a tree has depth zero depth of a node is defined as
length of path from root to that node in this figure let's say nodes at depth zero are nodes at level
zero I can simply say L0 for level zero now these two nodes are at level one these four nodes are
at level two and finally these two nodes are at level three the maximum depth of any node in the
tree is three maximum depth of a tree is also equal to height of the tree if we will go numbering
all the levels in the tree like L0 L1 L2 and so on then the maximum number of nodes that we can
have at some level i will be equal to two to the power i at level zero we can have one node
two to the power zero is one then at level one we can have at max two nodes at level two we can
have two to the power two nodes at max which is four so in general at any level i we can have
at max two to the power i nodes you should be able to see this very clearly because each node can
have two children so if we have x nodes at a level then each of these x nodes can have two children
so at next level we can have at most two x children here in this binary tree we have four nodes at
level two which is the maximum for level two now each of these nodes can possibly have two children
i'm just drawing the arrows here so at level three we can have max two times four that is eight
nodes now for a complete binary tree all the levels have to be completely filled we can give
exception to the last level or the deepest level it doesn't have to be full but the nodes have to be
as far left as possible this particular tree that i'm showing here is not a complete binary tree
because we have two vacant node positions on the left here i'll do slight change in this structure
now this is a complete binary tree we can have more nodes at level three but there should not be a
vacant position in left i have added one more node here and this still is a complete binary tree
if all the levels are completely filled such a binary tree can also be called perfect binary tree
in a perfect binary tree all levels will be completely filled if h is the height of a perfect
binary tree remember height of a binary tree is length of longest path between root to any of the
leaf nodes or i should say number of edges in longest path from root to any of the leaf nodes
height of a binary tree will also be equal to max depth here for this binary tree height or max depth
is three maximum number of nodes in a tree with height h will be equal to we'll have two to the
power zero nodes at level zero two to the power one nodes at level one and we'll go on summing for
height h we'll go till two to the power h at the deepest level we will have two to the power h nodes
now this will be equal to two to the power h plus one minus one h plus one is number of levels here
we can also say two to the power number of levels minus one in this tree number of levels is four
we have l zero till l three so number of nodes maximum number of nodes will be two to the power
four minus one which is 15 so a perfect binary tree will have maximum number of nodes possible
for a height because all levels will be completely filled well i should say maximum number of nodes
in a binary tree with height h okay i can ask you this also what will be height of a perfect
binary tree with n nodes let's say n is number of nodes in a perfect binary tree to find out the
height we'll have to solve this equation n equal two to the power h plus one minus one because if height
is h number of nodes will be two to the power h plus one minus one we can solve this equation
and the result will be this remember n is number of nodes here i leave the maths for you to understand
height will be equal to log n plus one to the base two minus one in this perfect binary tree
that i'm showing here number of nodes is 15 so n is 15 n plus one will be 16 so the h will be
log 16 to the base two minus one log 16 to the base two will be four so the final value will be
four minus one equal three in general for a complete binary tree we can also calculate height as
floor of log n to the base two so we need to take integral part of log n to the base two
perfect binary tree is also a complete binary tree here n is 15 log of 15 to base two is
3.906891 if we'll take the integral part then this will be three i'll not go into proof of how
height of complete binary tree will be log n to the base two we'll try to see that later
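
The two formulas above can be checked with a small sketch. The function names `maxNodes` and `completeTreeHeight` are illustrative and do not come from the lesson; height is counted in edges, as defined above, and the logarithm is computed by repeated halving so that no floating-point rounding is involved.

```cpp
#include <cassert>
#include <cstdint>

// Maximum number of nodes in a binary tree of height h:
// 2^0 + 2^1 + ... + 2^h = 2^(h+1) - 1.
std::uint64_t maxNodes(int h) {
    assert(h >= 0 && h < 63);  // keep the shift inside 64 bits
    return (std::uint64_t{1} << (h + 1)) - 1;
}

// Height of a complete binary tree with n nodes: floor(log2(n)).
// Each halving of n corresponds to moving one level up the tree.
int completeTreeHeight(std::uint64_t n) {
    assert(n >= 1);
    int h = 0;
    while (n > 1) {
        n /= 2;
        ++h;
    }
    return h;
}
```

For the perfect tree in the figure, `maxNodes(3)` gives 15 and `completeTreeHeight(15)` gives 3. For a perfect tree the other formula, log of n plus one to the base two minus one, is `completeTreeHeight(n + 1) - 1`, which is also 3 for n equal to 15.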
all this maths will be really helpful when we will analyze cost of various operations on binary
tree cost of a lot of operations on tree in terms of time depends upon the height of tree for example
in binary search tree which is a special kind of binary tree the cost of searching inserting or
removing an element in terms of time is proportional to the height of tree so in such case we would
want the height of the tree to be less height of a tree will be less if the tree will be dense if
the tree will be close to a perfect binary tree or a complete binary tree minimum height of a tree
with n nodes can be log n to the base two when the tree will be a complete binary tree if we will
have an arrangement like this then the tree will have maximum height with n nodes minimum height
possible is floor of or integral part of log n to the base two and maximum height possible
with n nodes is n minus one when we will have a sparse tree like this which is as good as a linked
list now think about this if i'm saying that time taken for an operation
is proportional to height of the tree or in other words i can say that if time complexity
of an operation is big o of h where h is height of the binary tree then for a complete or perfect
binary tree my time complexity will be big o of log n to the base two and in worst case for this
sparse tree my time complexity will be big o of n order of log n is almost best running time possible
for n as high as two to the power hundred log n to the base two is just hundred with order of
n running time if n will be two to the power hundred we won't be able to finish our computation
in years even with most powerful machines ever made so here's the thing quite often we want to
keep the height of a binary tree minimum possible or most commonly we say that we try to keep a
binary tree balanced we call a binary tree balanced binary tree if for each node the difference
between height of left and right sub tree is not more than some number k mostly k would be one
so we can say that for each node difference between height of left and right sub tree
should not be more than one there is something that i want to talk about height of a tree we had
defined height earlier as number of edges in longest path from root to a leaf height of a tree
with just one node where the node itself will be a leaf node will be zero we can define an empty tree
as a tree with no node and we can say that height of an empty tree is minus one so height of tree
with just one node is zero and height of an empty tree is minus one quite often people calculate
height as number of nodes in longest path from root to a leaf in this figure i have drawn one
of the longest paths from root to a leaf we have three edges in this path so the height is three
if we will count number of nodes in the path height will be four this looks very intuitive and i have
seen this definition of height at a lot of places if we will count the nodes height of tree with just
one node will be equal to one and then we can say height of an empty tree will be zero but this is
not the definition we follow here and we are not going to use this assumption we are going to say
that height of an empty tree is minus one and height of tree with one node is zero the difference
between heights of left and right sub trees of a node can be calculated as absolute value of height
of left sub tree minus height of right sub tree and in this calculation height of a sub tree can be
minus one also for this leaf node here in this figure both left and right sub trees are empty
so both h left or height of left sub tree and h right or height of right sub tree will be minus
one but the difference overall will be zero for all nodes in a perfect tree difference will be zero
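
A minimal sketch of these definitions, assuming a pointer-based node with `data`, `left` and `right` fields as described in the lesson. The function names `height`, `diff` and `isBalanced` are illustrative. The height of an empty subtree is taken as minus one, so a leaf gets height zero and diff zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Number of edges on the longest path from root to a leaf.
// An empty tree has height -1, a single node has height 0.
int height(const Node* root) {
    if (root == nullptr) return -1;
    return 1 + std::max(height(root->left), height(root->right));
}

// |height(left subtree) - height(right subtree)| for one node.
int diff(const Node* node) {
    assert(node != nullptr);
    return std::abs(height(node->left) - height(node->right));
}

// Balanced if diff is at most k for every node (k is usually 1).
// This simple version recomputes heights and is not tuned for speed.
bool isBalanced(const Node* root, int k = 1) {
    if (root == nullptr) return true;
    return diff(root) <= k
        && isBalanced(root->left, k)
        && isBalanced(root->right, k);
}
```

For a node whose left subtree has height one and whose right subtree is empty, `diff` returns two, so `isBalanced` reports that tree as not balanced.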
i have got rid of some nodes in this tree and now by the side of each node i have written
the value of diff this is still a balanced binary tree because the maximum diff for any
node is one let's get rid of some more nodes in this tree and now this is not balanced because
one of the nodes has diff two for this particular node height of left sub tree is one and height of
right sub tree is minus one because right sub tree is empty so the absolute value of difference is
two we try to keep a tree balanced to make sure it's dense and its height is minimized if height
is minimized cost of various operations that depend upon height are minimized okay the next thing that
i want to talk about very briefly is how we can store binary trees in memory one of the ways that
we had seen in our previous lesson which is most commonly used is dynamically created nodes
linked to each other using pointers or references for a binary tree of integers in c or c plus plus
we can define a node like this data type here is integer so we have a field to store data
and we have two pointer variables one to store address of left child and another to store address
of right child this of course is the most common way nodes dynamically created at random locations in
memory linked together through pointers but in some special cases we use arrays also arrays are
typically used for complete binary trees i have drawn a perfect binary tree here let's say this is
a tree of integers what we can do is we can number these nodes from zero starting at root and going
level by level from left to right so we'll go like zero one two three four five and six now i can
create an array of seven integers and these numbers can be used as indices for these nodes
so at index zero i'll fill two at index one i'll fill four at index two we'll have
one and i'll go on like this we have filled in all the data in the array but how will we store
the information about the links how will we know that the left child of root has value four and the
right child of root has value one well in case of complete binary tree if we will number the nodes
like this then for a node at index i the index of left child will be two i plus one and the index
of right child will be two i plus two and remember this is true only for a complete binary tree
for zero left child is two i plus one for i equals zero will be one and two i plus two will be two
now for one left child is at index three right child is at index four for i equal two two i plus
one will be five and two i plus two will be six we will discuss our implementation in detail when
we will talk about a special kind of binary tree called heap arrays are used to implement heaps
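
The index arithmetic described above can be sketched as follows. The helper names are illustrative, and a `std::vector<int>` stands in for the array of seven integers; only the first three values (two, four, one) come from the lesson, the remaining ones would be whatever the figure holds.

```cpp
#include <cstddef>
#include <vector>

// For a complete binary tree stored level by level in an array,
// the children of the node at index i sit at fixed positions.
std::size_t leftChild(std::size_t i)  { return 2 * i + 1; }
std::size_t rightChild(std::size_t i) { return 2 * i + 2; }

// A child exists only if its index falls inside the array.
bool hasNode(const std::vector<int>& tree, std::size_t i) {
    return i < tree.size();
}
```

With the root at index zero, `tree[leftChild(0)]` is the value four and `tree[rightChild(0)]` is the value one. For a seven-node tree, `hasNode(tree, leftChild(3))` is false because index seven is past the end, which is how a leaf is recognised without storing any links.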
i'll stop here now in our next lesson we will talk about binary search tree which is also a special
kind of binary tree that gives us a really efficient storing structure in which we can
search something quickly as well as update it quickly this is it for this lesson thanks for watching
in our previous lesson we talked about binary trees in general now in this lesson we're going
to talk about binary search tree a special kind of binary tree which is an efficient structure
to organize data for quick search as well as quick update but before i start talking about
binary search trees i want you to think of a problem what data structure will you use
to store a modifiable collection so let's say you have a collection and it can be a collection
of any data type records in the collection can be of any type now you want to store this collection
in computer's memory in some structure and then you want to be able to quickly search for a
record in the collection and you also want to be able to modify the collection you want to be able
to insert an element in the collection or remove an element from the collection so what data structure
will you use well you can use an array or a linked list these are two well-known data structures
in which we can store a collection now what will be the running time of these operations
search insertion or removal if we will use an array or a linked list let's first talk about
arrays and for sake of simplicity let's say we want to store integers to store a modifiable list
or collection of integers we can create a large enough array and we can store the records in some
part of the array we can keep the end of the list marked in this array that i'm showing here
we have integers from 0 till 3 we have records from 0 till 3 and rest of the array is available
space now to search some x in the collection we will have to scan the array from index 0 till
end and in worst case we may have to look at all the elements in the list if n is the number of
elements in the list time taken will be proportional to n or in other words we can say that time
complexity of this operation will be big o of n okay now what will be the cost of insertion
let's say we want to insert number five in this list so if there is some available space
all these cells in yellow are available we can add one more cell by incrementing this marker
end and we can fill in the integer to be added the time taken for this operation will be constant
running time will not depend upon number of elements in the collection so we can say that
time complexity will be big o of 1 okay now what about removal let's say we want to remove one
from the collection what we'll have to do is we'll have to shift all records to the right of one
by one position to the left and then we can decrement end the cost of removal in worst case
once again will be big o of n in worst case we will have to shift n minus one elements
here the cost of insertion will be big o of 1 if the array will have some available space
so the array has to be large enough if the array gets filled what we can do is we can create a new
larger array typically we create an array twice the size of the filled up array so we can create a
new larger array and then we can copy the content of the filled up array into this new larger array
the copy operation will cost us big o of n we have discussed this idea of dynamic array quite
a bit in our previous lessons so insertion will be big o of 1 if array is not filled up and it will
be big o of n if array is filled up for now let's just assume that the array will always be large
enough let's now discuss the cost of these operations if we will use a linked list
i have drawn a linked list of integers here data type can be
anything the cost of search operation once again will be big o of n where n is number of
records in the collection or number of nodes in the linked list to search in worst case we will
have to traverse the whole list we will have to look at all the nodes the cost of insertion in a
linked list is big o of 1 at head and it's big o of n at tail we can choose to insert at head
to keep the cost low so running time of insertion we can say is big o of 1 or in other words we will
take constant time removal once again will be big o of n we will first have to traverse the linked
list and search the record and in worst case we may have to look at all the nodes
okay so this is the cost of operations if we are going to use array or linked list
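
The array version of these three operations can be sketched like this. The type `ArrayList` and the function names are hypothetical; the `end` field plays the role of the end marker, and the array is assumed to be large enough, as in the discussion above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A fixed-capacity array with a marker for the end of the list.
struct ArrayList {
    std::vector<int> cells;  // all cells, used and available
    std::size_t end = 0;     // number of cells currently in use
    explicit ArrayList(std::size_t capacity) : cells(capacity) {}
};

// O(n): scan from index 0 till end. Returns the index, or -1 if absent.
long linearSearch(const ArrayList& list, int x) {
    for (std::size_t i = 0; i < list.end; ++i) {
        if (list.cells[i] == x) return static_cast<long>(i);
    }
    return -1;
}

// O(1): fill the next available cell and move the marker.
void insertAtEnd(ArrayList& list, int x) {
    assert(list.end < list.cells.size());  // assumes available space
    list.cells[list.end++] = x;
}

// O(n): find the record, then shift everything to its right one cell left.
bool removeValue(ArrayList& list, int x) {
    long pos = linearSearch(list, x);
    if (pos < 0) return false;
    for (std::size_t i = static_cast<std::size_t>(pos); i + 1 < list.end; ++i) {
        list.cells[i] = list.cells[i + 1];
    }
    --list.end;
    return true;
}
```

A linked list gives the same picture: insertion at head is constant time, while search and removal both walk the nodes one by one.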
insertion definitely is fast but how good is big o of n for an operation like search
what do you think if we are searching for a record x then in the worst case we will have to compare
this record x with all the n records in the collection let's say our machine can perform
a million comparisons in one second so we can say that machine can perform 10 to the power
6 comparisons in one second so cost of one comparison will be 10 to the power minus 6 second
machines in today's world deal with really large data it's very much possible for real world data
to have 100 million or billion records a lot of countries in this world have population more than
100 million two countries have more than a billion people living in them if we will have data about
all the people living in a country then it can easily be 100 million records okay so if we are
saying that the cost of one comparison is 10 to the power minus 6 second if n will be 100 million
time taken will be 100 seconds 100 seconds for a search is not reasonable and search may be a
frequently performed operation can we do something better can we do better than big o of n well in
an array we can perform binary search if it's sorted and the running time of binary search
is big o of log n which is the best running time to have i have drawn this array of integers here
records in the array are sorted here the data type is integer for some other data type for some
complex data type we should be able to sort the collection based on some property or some key
of the records we should be able to compare the keys of records and the comparison logic will be
different for different data types for a collection of strings for example we may want to have the
records sorted in dictionary or lexicographical order so we will compare and see which string will come
first in dictionary order now this is the requirement that we have for binary search
the data structure should be an array and the records must be sorted okay so the cost of search
operation can be minimized if we will use a sorted array but in insertion or removal we will have to
make sure that the array is sorted afterwards in this array if i want to insert number five at this
stage i can't simply put five at index six what i'll have to do is i'll first have to find the
position at which i can insert five in the sorted list we can find the position in order of log n
time using binary search we can perform a binary search to find the first integer greater than five
in the list so we can find the position quickly in this case it's index two but then we will have to
shift all the records starting this position one position to the right and now i can insert five
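
A sketch of this insertion using the standard library. `std::upper_bound` performs the binary search for the first element greater than the new value, and `std::vector::insert` performs the shifting. The function name `insertSorted` is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Inserts x into an already sorted vector and returns the index used.
// Finding the position is O(log n); the shift done by insert is O(n).
std::size_t insertSorted(std::vector<int>& sorted, int x) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    // First element strictly greater than x.
    auto pos = std::upper_bound(sorted.begin(), sorted.end(), x);
    std::size_t index = static_cast<std::size_t>(pos - sorted.begin());
    sorted.insert(pos, x);  // shifts everything from pos one place right
    return index;
}
```

With hypothetical contents 1, 3, 6, 8, 10, 12, inserting five lands at index two, matching the walk-through above, and the four records after it are each moved one position to the right.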
so even though we can find the position at which a record should be inserted
quickly in big o of log n this shifting in worst case will cost us big o of n so the running
time overall for an insertion will be big o of n and similarly the cost of removal will also be
big o of n we will have to shift some records okay so when we are using sorted array cost of search
operation is minimized in binary search for n records we will have at max log n to the base
two comparisons so if we can perform a million comparisons in a second
then for n equal two to the power 31 which is greater than two billion we are going to take
only 31 microseconds log of two to the power 31 to base two will be 31 okay we are fine with
search now we will be good for any practical value of n but what about insertion and removal
they are still big o of n can we do something better here well if we will use this data structure
called binary search tree i'm writing it in short bst for binary search tree then the cost of all
these three operations can be big o of log n in average case the cost of all the operations will
be big o of n in worst case but we can avoid the worst case by making sure that the tree is always
balanced we have talked about balanced binary tree in our previous lesson binary search tree is
only a special kind of binary tree to make sure that the cost of these operations is always big
o of log n we should keep the binary search tree balanced we'll talk about this in detail later
let's first see what a binary search tree is and how cost of these operations is minimized when
we use a binary search tree binary search tree is a binary tree in which for each node value of all
the nodes in left subtree is lesser and value of all the nodes in right subtree is greater
i have drawn binary tree as a recursive structure here as we know in a binary tree each node can
have at most two children we can call one of the children left child if we will look at the tree
as a recursive structure left child will be the root of left subtree and similarly right child
will be the root of right subtree now for a binary tree to be called binary search tree
value of all the nodes in left subtree must be lesser or we can say lesser or equal to handle
duplicates and the value of all the nodes in right subtree must be greater and this must be true for
all the nodes so in this recursive structure here both left and right subtrees must also be
binary search trees i'll draw a binary search tree of integers now i have drawn a binary search
tree of integers here let's see whether this property that for each node value of all the
nodes in left subtree must be lesser or equal and value of all the nodes in right subtree must be
greater is true or not let's first look at the root node nodes in the left subtree have values
10 8 and 12 so they're all lesser than 15 and in right subtree we have 17 20 and 25 they're all
greater than 15 so we are good for the root node now let's look at this node with value 10
in left we have 8 which is lesser in right we have 12 which is greater so we are good we are
good for this node too having value 20 and we don't need to bother about leaf nodes because they do
not have children so this is a binary search tree now what if i change this value 12
to 16 now is this still a binary search tree well for node with value 10 we are good
the node with value 16 is in its right so not a problem but for the root node we have a node in
left subtree with higher value now so this tree is not a binary search tree i'll revert back and
make the value 12 again now as we were saying we can search insert or delete in a binary search
tree in big o of log n time in average case how is it really possible let's first talk about search
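
The check done by hand above, that every node in a left subtree is lesser or equal and every node in a right subtree is greater, can be sketched as a recursive function that narrows an allowed range on the way down. The names `isBst` and `isBstInRange` are illustrative, and the node layout is the one assumed throughout these lessons.

```cpp
#include <climits>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Every value in this subtree must satisfy low < value <= high.
bool isBstInRange(const Node* root, long long low, long long high) {
    if (root == nullptr) return true;  // an empty tree is a valid BST
    if (root->data <= low || root->data > high) return false;
    // Left subtree: lesser or equal to this node.
    // Right subtree: strictly greater than this node.
    return isBstInRange(root->left, low, root->data)
        && isBstInRange(root->right, root->data, high);
}

bool isBst(const Node* root) {
    // Start with a range wide enough to admit any int.
    return isBstInRange(root, static_cast<long long>(INT_MIN) - 1, INT_MAX);
}
```

For the tree with 15 at the root, 10 and 20 below it, and 8, 12, 17, 25 as leaves, `isBst` returns true. Changing 12 to 16 makes it return false, because 16 sits in the left subtree of 15 even though it is fine with respect to its parent 10.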
if these integers that i have here in the tree were in a sorted array we could have performed
binary search and what do we do in binary search let's say we want to search number 10 in this array
what we do in binary search is we first define the complete list as our search space
the number can exist only within the search space i'll mark search space using these two pointers
start and end now we compare the number to be searched or the element to be searched with
mid element of the search space or the median and if the record being searched if the element
being searched is lesser we go searching in the left half else we go searching in the right half
in case of equality we have found the element in this case 10 is lesser than 15
so we will go searching towards left our search space is reduced now to half once again we will
compare to the mid element and bingo this time we have got a match in binary search we start with
n elements in search space and then if mid element is not the element that we are looking for we
reduce the search space to n by 2 and we go on reducing the search space to half till we either
find the record that we are looking for or we get to only one element in search space
and be done in this whole reduction if we will go from n to n by 2 to n by 4 to n by 8 and so on
we will have log n to the base two steps if we are taking k steps then n upon 2 to the power k
will be equal to 1 which will imply 2 to the power k will be equal to n and k will be equal to log
n to the base 2 so this is why running time of binary search is big o of log n now if we'll use
this binary search tree to store the integers search operation will be very similar let's say
we want to search for number 12 what we'll do is we'll start at root and then we will compare the
value to be searched the integer to be searched with value of root if it's equal we are done with
the search if it's lesser we know that we need to go to the left subtree because in a binary search
tree all the elements in left subtree are lesser and all the elements in right subtree are greater
now we'll go and look at the left child of node with value 15 we know that number 12 that we are
looking for can exist in this subtree only and anything apart from this subtree is discarded so
we have reduced the search space to only these three nodes having value 10, 8 and 12 now once
again we'll compare 12 with 10 we are not equal 12 is greater so we know that we need to go
looking in right subtree of this node with value 10 so now our search space is reduced
to just one node once again we will compare the value here at this node and we have a match
so searching an element in binary search tree is basically this traversal in which
at each step we will go either towards left or right and hence at each step we will discard
one of the subtrees if the tree is balanced we call a tree balanced if for all nodes
the difference between the heights of left and right subtrees is not greater than one so if
the tree is balanced we will start with a search space of n nodes and when we will discard one of
the subtrees we will discard n by two nodes so our search space will be reduced to n by two
and then in next step we will reduce the search space to n by four we will go on reducing like this
till we find the element or till our search space is reduced to only one node when we will be done
so the search here is also a binary search and that's why the name binary search tree
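
To make the similarity concrete, here is a sketch of both searches side by side. The names `binarySearch` and `bstSearch` are illustrative. The array version moves the start and end markers; the tree version follows the left or right link.

```cpp
#include <vector>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Binary search in a sorted array. Returns the index of x, or -1.
int binarySearch(const std::vector<int>& sorted, int x) {
    int start = 0;
    int end = static_cast<int>(sorted.size()) - 1;
    while (start <= end) {
        int mid = start + (end - start) / 2;
        if (sorted[mid] == x) return mid;
        if (x < sorted[mid]) end = mid - 1;  // keep the left half
        else start = mid + 1;                // keep the right half
    }
    return -1;
}

// Search in a binary search tree. Each step discards one subtree.
bool bstSearch(const Node* root, int x) {
    while (root != nullptr) {
        if (x == root->data) return true;
        root = (x < root->data) ? root->left : root->right;
    }
    return false;
}
```

Searching for 12 in the tree visits 15, then 10, then 12, which is the same sequence of comparisons traced above. The number of steps is bounded by the height of the tree, so it is only logarithmic when the tree is balanced.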
this tree that i'm showing here is balanced in fact this is a perfect binary tree but with
same records we can have an unbalanced tree like this this tree has got the same integer values
as we had in the previous structure and this is also a binary search tree but this is unbalanced
this is as good as a linked list in this tree there is no right subtree for any of the nodes
search space will be reduced by only one at each step from n nodes in search space we will go to n
minus one nodes and then to n minus two nodes all the way till one so there will be n steps in binary search
tree in average case cost of search insertion or deletion is big o of log n and in worst case
this is the worst case arrangement that i'm showing you running time will be big o of n
we always try to avoid the worst case by trying to keep the binary search tree balanced
with same records in the tree there can be multiple possible arrangements for these integers in this
tree another arrangement is this for all the nodes we have nothing to discard in left subtree
in a search this is another arrangement this is still balanced because for all the nodes the
difference between the heights of left and right subtrees is not greater than one but this is the
best arrangement when we have a perfect binary tree at each step we will have exactly n by two
nodes to discard okay now to insert some record in binary search tree we will first have to find
the position at which we can insert and we can find the position in big o of log n time let's say
we want to insert 19 in this tree what we will do is we will start at the root if the value to be
inserted is lesser or equal and there is no left child insert as left child else go left if the value is
greater and there is no right child insert as right child else go right in this case 19 is greater so
we will go right now we are at 20 19 is lesser and left subtree is not empty we have a left child
so we will go left now we are at 17 19 is greater than 17 so it should go in right of 17 there is no
right child of 17 so we will create a node with value 19 and link it to this node with value 17
as right child because we are using pointers or references here just like linked list no shifting
is needed like an array creating a link will take constant time so overall insertion will also
cost us like search to delete also we will first have to search the node search once again will be
big o of log n and deleting the node will only mean adjusting some links so removal also is going to
be like search big o of log n in average case binary search tree gets unbalanced during insertion
and deletion so often during insertion and deletion we restore the balancing there are ways to do it
and we will talk about all of this in detail in later lessons in next lesson we will discuss
implementation of binary search tree in detail this is it for this lesson thanks for watching
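
Before the detailed implementation, the insertion walk described in this lesson can be sketched iteratively. This is only one possible way to write it, not necessarily the version developed in the next lesson; the names `insert` and `destroy` are assumptions, and no rebalancing is attempted.

```cpp
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Inserts value and returns the (possibly new) root of the tree.
Node* insert(Node* root, int value) {
    Node* fresh = new Node{value, nullptr, nullptr};
    if (root == nullptr) return fresh;  // empty tree: new node is the root
    Node* current = root;
    while (true) {
        if (value <= current->data) {
            // Lesser or equal: insert as left child, or keep going left.
            if (current->left == nullptr) { current->left = fresh; break; }
            current = current->left;
        } else {
            // Greater: insert as right child, or keep going right.
            if (current->right == nullptr) { current->right = fresh; break; }
            current = current->right;
        }
    }
    return root;
}

// Frees every node of the tree.
void destroy(Node* root) {
    if (root == nullptr) return;
    destroy(root->left);
    destroy(root->right);
    delete root;
}
```

Inserting 19 into the example tree follows 15, then 20, then 17, and links the new node as the right child of 17. Only one link is written, so the cost is dominated by the walk down the tree.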
in our previous lesson we saw what binary search trees are now in this lesson we are going to
implement binary search tree we will be writing some code for binary search tree prerequisites
for this lesson is that you must understand the concepts of pointers and dynamic memory allocation
in c or c++ if you have already followed this series and seen our lessons on linked list
then implementation of binary search tree or binary tree in general is not going to be very
different we will have nodes and links here as well okay so let's get started binary search tree or
bst as we know is a binary tree in which for each node value of all the nodes in left subtree is
lesser or equal and value of all the nodes in right subtree is greater we can draw bst
as a recursive structure like this value of all the nodes in left subtree must be lesser or equal
and value of all the nodes in right subtree must be greater and this must be true for all nodes
and not just a root node so in this recursive definition here both left and right subtrees must
also be binary search trees i have drawn a binary search tree of integers here now the question is
how can we create this non-linear logical structure in computer's memory i had talked about this
briefly when we had discussed binary trees the most popular way is dynamically created nodes
linked to each other using pointers or references just the way we do it for linked lists because in
a binary search tree or in a binary tree in general each node can have at most two children we can
define node as an object with three fields something like what i'm showing here we can have a field to
store data another to store address or reference to left child and another to store address or
reference to right child if there is no left or right child for a node reference can be set as null
in c or c++ we can define node like this there is a field to store data here the data type is
integer but it can be anything there is one field which is pointer to node node asterisk means
pointer to node this one is to store the address of left child and we have another one to store
the address of right child this definition of node is very similar to definition of node
for doubly linked list remember in doubly linked list also each node had two links
one to previous node and another to next node but doubly linked list was a linear arrangement
this definition of node is for a binary tree we could also name this something like bst node
but node is also fine let's go with node now in our implementation just like linked list
all the nodes will be created in dynamic memory or heap section of applications memory
using malloc function in c or new operator in c++ we can use malloc in c++ as well
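
The node definition being described can be sketched in C++ as follows. The field names `data`, `left` and `right` follow the narration; the helper `isLeaf` is added here only for illustration. In C the same struct works, but the pointer fields would be declared as `struct Node*`.

```cpp
// A node of a binary tree of integers.
struct Node {
    int data;     // the value stored in this node
    Node* left;   // address of left child, nullptr if there is none
    Node* right;  // address of right child, nullptr if there is none
};

// A node with no children is a leaf.
bool isLeaf(const Node& node) {
    return node.left == nullptr && node.right == nullptr;
}
```

The layout is the same as a doubly linked list node, with two links per node. What differs is how the links are interpreted: left child and right child instead of previous and next.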
now as we know any object created in dynamic memory or heap section of applications memory
cannot have a name or identifier it has to be accessed through a pointer malloc or new operator
returns a pointer to the object created in heap if you want to revise some of these concepts of
dynamic memory allocation you can check the description of this video for
link to a lesson it's really important that you understand this concept of stack and heap in
applications memory really well now for a linked list if you remember the information that we always
keep with us is address of the head node if we know the head node we can access all other nodes
using links in case of trees the information that we always keep with us is address of the
root node if we know the root node we can access all other nodes in the tree using links to create
a tree we first need to declare a pointer to bst node i'll rather call node bst node here bst
for binary search tree so to create a tree we first need to declare a pointer to bst node that will
always store the address of root node i have declared a pointer to node here named root ptr
for pointer in c you can't just write bst node asterisk root ptr you will have to write
struct space bst node asterisk you will have to write struct here as well i'm gonna write c++
here but anyway right now i'm trying to explain the logic we will not bother about minute details
of implementation in this logical structure of tree that i'm showing here each node as you can see
has three fields three cells leftmost cell is to store the address of left child and rightmost
cell is to store address of right child let's say root node is at address 200 in memory and i'll
assume some random addresses for all other nodes as well now i can fill in these left and right
cells for each node with addresses of left and right children in our definition data is first field
but in this logical structure i'm showing data in the middle okay so for each node i have filled in
addresses for both left and right child address is zero or null if there is no child now as we were
saying identity of the tree is address of the root node we need to have a pointer to node in
which we can store the address of the root node we must have a variable of type pointer to node
to store the address of root node all these rectangles with three cells are nodes they are created using
malloc or new operator and live in heap section of applications memory we cannot have name or
identifier for them they are always accessed through pointers this root ptr root pointer has
to be a local or global variable we will discuss this in a little more detail in some time
quite often we like to name this root pointer just root we can do so but we must not get confused
this is a pointer to the root and not the root itself to create a bst as i was saying we first need to
declare this pointer initially we can set this pointer as null to say that the tree is empty
a tree with no node can be called empty tree and for empty tree root pointer should be set as null
we can do this declaration and setting the root as null in main function in our program
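The node layout and the empty-tree convention described above can be sketched in C++ roughly like this. The identifier spellings (`BstNode`, `left`, `right`) and the two small helpers `MakeEmptyTree` and `IsEmpty` are assumptions made for illustration; the lesson itself only declares a pointer in main and sets it to null.

```cpp
// Node of a binary search tree: data plus the addresses of the two children.
struct BstNode {
    int data;
    BstNode* left;   // address of left child, nullptr if there is none
    BstNode* right;  // address of right child, nullptr if there is none
};

// Hypothetical helper: an empty tree is a root pointer holding no address.
BstNode* MakeEmptyTree() {
    return nullptr;
}

// Hypothetical helper: the tree is empty exactly when the root pointer is null.
bool IsEmpty(const BstNode* root) {
    return root == nullptr;
}
```

In main this would amount to a single line such as `BstNode* rootPtr = MakeEmptyTree();`, which is the same as writing `BstNode* rootPtr = nullptr;` directly.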
actually let's just write this code in a real compiler i'm writing c++ here as you can see in
the main function i have declared this pointer to node which will always store the address of root
node of my tree and i'm initially setting this as null to say that the tree is empty with this
much of code we have created an empty tree but what's the point of having an empty tree we should
have some data in it so what i want to do now is i want to write a function to be able to insert
a node in the tree i will write a function named insert that will take address of the root node
and data to be inserted as argument and this function will insert a node with this data
at appropriate position in the tree in the main function i'll make calls to this insert function
passing it address of root and the data to be inserted let's say first i want to insert
number 15 and then i want to insert number 10 and then number 20 we can insert more but let's first
write the logic for insert function before i write the logic for insert function i want to
write a function to create a new node in dynamic memory or heap this function get new node
should take an integer the data to be inserted as argument create a node in heap using
new or malloc and return back the address of this new node i'm creating the new node here
using this new operator the operator will return me the address of the newly created node
which i'm collecting in this variable of type pointer to bst node in c instead of new operator
we will have to use malloc we can use malloc in c++ as well c++ is largely a super set of c
malloc will also do here now anything in dynamic memory or heap is always accessed
through pointer now using this pointer new node we can access the fields of the newly created node
i'll have to dereference this pointer using asterisk operator so i'm writing asterisk new node
and now i can access the fields we have three fields in node data and two pointers to node left
and right i've set the data here instead of writing asterisk new node dot data we have
this alternate syntax that we could use you could simply write new node
arrow data and this will mean the same we have used this syntax quite a lot in our lessons on linked
list now for the new node we can set the left and right child as null and finally we can return the
address of the new node okay coming back to the insert function we can have a couple of cases
in insertion first of all tree may be empty for this first insertion when we are inserting this
number 15 tree will be empty if tree is empty we can simply create a new node and set it as root
with this statement root equal get new node i'm setting root as address of the new node
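A minimal sketch of the node-creation helper described above might look like the following. The name `GetNewNode` follows the spoken "get new node"; the exact casing and the use of `nullptr` rather than `NULL` are assumptions.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Creates a node in dynamic memory (heap) and returns its address.
BstNode* GetNewNode(int data) {
    BstNode* newNode = new BstNode();  // node lives in the heap, reached only via pointer
    (*newNode).data = data;            // explicit dereference with asterisk ...
    newNode->left = nullptr;           // ... and the equivalent arrow syntax
    newNode->right = nullptr;
    return newNode;
}
```

Both `(*newNode).data` and `newNode->data` refer to the same field; the arrow form is only shorter.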
but there is something not all right here this root is local variable of insert function
and its scope is only within this function we want this root the root in main to be modified
this guy is a local variable of main function there are two ways of doing this
we can either return the address of the new root so return type of insert function will be
pointer to bst node and not void and here in the main function we will have to write statement
like root equal insert and the arguments so we will have to collect the return and update our root
in main function another way is that we can pass the address of this root of main to the insert
function this root is already a pointer to node so its address can be collected in a pointer to
pointer so in insert function the first argument will be a pointer to pointer and here
we can pass the address we'll say ampersand root to pass the address we can name this argument
root or we can name this argument root ptr we can name this whatever now what we need to do is we
need to dereference this using asterisk operator to access the value in root of main and we can
also set the value in root of main so here with this statement we are setting the value
and the return type now can be void this pointer to pointer thing gets a little tricky
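For comparison, here is a rough sketch of the pointer-to-pointer variant just described. The name `InsertByAddress` is hypothetical, and the two non-empty branches (go left if lesser or equal, otherwise go right) are filled in only so that the sketch is complete.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

BstNode* GetNewNode(int data) {
    return new BstNode{data, nullptr, nullptr};
}

// rootPtr holds the ADDRESS of the caller's root pointer, so writing
// through *rootPtr changes the caller's variable directly.
void InsertByAddress(BstNode** rootPtr, int data) {
    if (*rootPtr == nullptr) {
        *rootPtr = GetNewNode(data);               // sets root of main (or a child link)
    } else if (data <= (*rootPtr)->data) {
        InsertByAddress(&(*rootPtr)->left, data);  // pass address of the left link
    } else {
        InsertByAddress(&(*rootPtr)->right, data); // pass address of the right link
    }
}
```

A call from main would then read `InsertByAddress(&root, 15);` and nothing needs to be collected from the return.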
i'll go with the former approach actually there is another way instead of declaring
root as a local variable in main function we can declare root as global variable
global variable as you know has to be declared outside all the functions if root would be global
variable it would be accessible to all the functions and we will not have to pass the address stored
in it as argument anyway coming back to the logic for insertion as we were saying if the tree is
empty we can simply create a new node and we can simply set it as root at this stage we wanted to
insert 15 if we will make call to the insert function address of root is 0 or null null is
only a macro for 0 and the second argument is the number to be inserted in this call to insert function
we will make call to get new node let's say we got this new node at address 200 get new node
function will return us address 200 which we can set as root here but this root is a local variable
we will return this address 200 back to the main function and in the main function we are actually
doing this root equal insert so in the main function we are building this link
okay our next call in the main function was to insert number 10 at this stage root is 200
the address in root is 200 and the value to be inserted is 10 now the tree is not empty so what
do we do if the tree is not empty we can basically have two cases if the data to be inserted is lesser
or equal we need to insert it in the left subtree of root and if the data to be inserted is greater
we need to insert it in right subtree of the root so we can reduce this problem in a self-similar
manner in a recursive manner recursion is one thing that we are going to use almost all the time while
working with trees in this function i'll say that if the data to be inserted is less than or equal to
the data in root then make a recursive call to insert data in left subtree the root of the left
subtree will be the left child so in this recursive call we are passing address of left child and
data as argument and after the data is inserted in left subtree the root of the left subtree can
change insert function will return the address of the new root of the left subtree and we need to
set it as left child of the current node in this example tree here right now both left and right
subtree are empty we are trying to insert number 10 so we have made call to this function insert
from main function we have called insert passing it address 200 and value or data 10 now 10 is
lesser than 15 so control will come to this line and a call will be made to insert data in left
subtree now left subtree is empty so address of root for left subtree is 0 data passed data to be
inserted passed as argument is 10 now this first insert call will wait for this insert below to finish
and return for this last insert call root is null let's say we got this node at address 150
now this insert call will return back 150 and execution of first insert call will resume at this
line and now this particular address will be set as 150 so we will build this link
and now this insert call can finish it can return back the current root actually this return root
should be there for all cases so i'm taking it out and i have it after all these conditions
of course we will have one more else here if the data is greater we need to go insert in right subtree
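Putting the three cases together, the recursive insert could be sketched as below. This is one reading of the logic narrated above, with identifier casing assumed; `GetNewNode` is repeated only so that the block stands on its own.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

BstNode* GetNewNode(int data) {
    return new BstNode{data, nullptr, nullptr};
}

// Inserts data and returns the (possibly new) root of this subtree.
BstNode* Insert(BstNode* root, int data) {
    if (root == nullptr) {
        root = GetNewNode(data);                  // empty tree or empty subtree
    } else if (data <= root->data) {
        root->left = Insert(root->left, data);    // lesser or equal goes left
    } else {
        root->right = Insert(root->right, data);  // greater goes right
    }
    return root;  // returned in all cases so the caller can rebuild its link
}
```

In main the return value has to be collected each time, for example `root = Insert(root, 15);`.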
the third call in main function was to insert number 20 now this time we will go to this
else statement this statement in else let's say we got this new node at address 300 so this
guy will return 300 for this node at 200 right child will be set as 300 and now this call
to insert can finish the return will be 200 okay at this stage what if a call is made to insert number
25 we are at root right now the node with address 200 25 is greater so we need to go and insert
in right subtree right subtree is not empty this time so once again for this call also we will come
to this else last else because 25 is greater than 20 now in this call we will go to the first if
a node will be created let's say we got this node in heap at address 500 this particular call
insert with arguments 0 and 25 will return 500 and finish now for the node at 300 right child will be set as 500 so this
link will get built now this guy will return 300 the root for this subtree has not changed and this
first call to insert will also wrap up it will return 200 so we are looking good for all cases
this insert function will work for all cases we could write this insert function without using
recursion i encourage you to do so you will have to use some temporary pointer to node and loops
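One possible non-recursive version, using a temporary pointer and a loop as suggested, is sketched here. The name `InsertIterative` and the exact loop shape are assumptions; this is not the lesson's own code.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

BstNode* GetNewNode(int data) {
    return new BstNode{data, nullptr, nullptr};
}

// Walks down from the root with a temporary pointer until an empty
// left or right slot is found, then links the new node there.
BstNode* InsertIterative(BstNode* root, int data) {
    BstNode* newNode = GetNewNode(data);
    if (root == nullptr) {
        return newNode;  // the new node becomes the root of an empty tree
    }
    BstNode* current = root;
    while (true) {
        if (data <= current->data) {
            if (current->left == nullptr) { current->left = newNode; break; }
            current = current->left;
        } else {
            if (current->right == nullptr) { current->right = newNode; break; }
            current = current->right;
        }
    }
    return root;  // root itself never changes once the tree is non-empty
}
```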
recursion is very intuitive here and recursion is intuitive in pretty much everything that we do
with trees so it's really important that we understand recursion really well okay i'll write one more
function now to search some data in bst in the main function here i have made some more calls to
insert now i want to write a function named search that should take as argument address of the root
node and the data to be searched and this function should return me true if data is there in the tree
false otherwise once again we will have couple of cases if the root is null then we can return
false if the data in root is equal to the data that we are looking for then we can return true
else we can have two cases either we need to go and search in the left subtree or we need to go in
the right subtree so once again i'm using recursion here i am making recursive call to search function
in these two cases if you have understood the previous recursion then this is very similar
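The search logic just described can be sketched as follows. Taking the root as a pointer to const is an assumption made here because search does not modify the tree.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Returns true if data is present in the tree rooted at root.
bool Search(const BstNode* root, int data) {
    if (root == nullptr) {
        return false;                     // empty tree or fell off a leaf
    } else if (root->data == data) {
        return true;                      // found it at this node
    } else if (data <= root->data) {
        return Search(root->left, data);  // can only be in the left subtree
    } else {
        return Search(root->right, data); // can only be in the right subtree
    }
}
```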
let's test this code now what i've done here is i've asked the user to enter a number to be searched
and then i'm making call to the search function and if this function is returning me true
i'm printing found else i'm printing not found let's run this code and see what happens i have
moved multiple insert statements in one line because i'm short of space here let's say we want
to search for number eight eight is found and now let's say we want to search for 22 22 is not found
so we are looking good i'll stop here now you can check the description of this video for
link to all the source code we will do a lot more with trees in coming lessons in our next lesson
we will go a little deeper and try to see how things move in various sections of applications
memory how things move in stack and heap sections of memory when we execute these functions it will
give you a lot of clarity this is it for this lesson thanks for watching

in our previous lesson we wrote
some code for binary search tree we wrote functions to insert and search data in bst now in this lesson
we will go a little deeper and try to understand how things move in various sections of applications
memory when these functions get executed and this will give you a lot of clarity this will give you
some general insight into how memory is managed for execution of a program and how recursion
which is so frequently used in case of trees works the concepts that i'm going to talk about in this
lesson have been discussed earlier in some of our previous lessons but it will be good to go through
these concepts again when we are implementing trees so here is the code that we have written
we have this function get new node to create a new node in dynamic memory and then we have this
function insert to insert a new node in the tree and then we have this function to search some
data in the tree and finally this is the main function you can check the description of this
video for link to this source code now in main function here we have this pointer to bst node
named root to store the address of root node of my tree and i'm initially setting it as null
to create an empty tree and then i'm making some calls to insert function to insert some data in
the tree and finally i'm asking user to input a number and i'm making call to search function
to find this number in the tree if the search function is returning me true i'm printing found
else i'm printing not found let's see what will happen in memory when this program will execute
the memory that is allocated to a program or application for its execution in a typical
architecture can be divided into these four segments there is one segment called text segment
to store all the instructions in the program the instructions would be compiled instructions
in machine language there is another segment to store all the global variables a variable that is
declared outside all the functions is called global variable it is accessible to all the functions
the next segment stack is basically scratch space for function call execution
all the local variables the variables that are declared within functions live in stack and finally
the fourth section heap which we also call the free store is the dynamic memory that can grow
or shrink as per our need the size of all other segments is fixed the size of the other segments
is decided before the program starts running but heap can grow during runtime and we cannot control allocation or
deallocation of memory in any other segment during runtime but we can control allocation
and deallocation in heap we have discussed all of this in detail in our lesson on dynamic memory
allocation you can check the description for a link now what i'm going to do here is i'm going
to draw stack and heap sections as these two rectangular containers i'm kind of zooming into
these two sections now i'll show you how things will move in these two sections of applications
memory when this program will execute when this program will start execution first the main function
will be called now whenever a function is called some amount of memory from the stack is allocated
for its execution the allocated memory is called stack frame of the function call all the local
variables and the state of execution of the function call would be stored in the stack frame
of the function call in the main function we have this local variable root which is pointer to bst
node so i'm showing root here in this stack frame we will execute the instructions sequentially
in the first line in main function we have declared root and we are initializing it and setting it as
null null is only a macro for address 0 so here in this figure i'm setting address in root as 0
now in the next line we are making a call to insert function so what will happen is execution
of main will pause at this stage and a new stack frame will be allocated for execution of insert
main will wait for this insert above to finish and return once this insert call finishes main will
resume at line 2 we have these two local variables root and data in insert function in which we are
collecting the arguments now for this call to insert function we will go inside the first if
condition here because root is null at this line we will make call to get new node function
so once again execution of this insert call will pause and a new stack frame will be allocated
for execution of get new node function we have two local variables in get new node data in which
we are collecting the argument and this pointer to bst node named new node now in this function we are
using new operator to create a bst node in heap now let's say we got a new node at address 200
new operator will return us this address 200 so this address will be set here
in new node so we have this link here and now using this pointer new node we are setting value
in these three fields of node let's say the first field is to store data so we are setting value 15
here and let's say this second cell is to store address of left child this is being set as null
and the address of right child is also being set as null and now get new node will return
the address of new node and finish its execution whenever a function call finishes the stack frame
allocated to it is reclaimed call to insert function will resume at this line and the return
of get new node address 200 will be set in this root which is local variable for insert call
and now insert function this particular call to insert function will return the address of root
the address stored in this variable root which is 200 now and finish and now main will resume
at this line and root of main will be set as 200 the return of this insert call insert root 15
will be set here now in the execution of main control will go to the next line
and we have this call to insert function to insert number 10 once again execution of main will be
paused and a stack frame will be allocated for execution of insert now this time for insert call
root is not null so we will not go inside the first if we will access the data field of this node
at address 200 using this pointer named root in insert function and we will compare it with
this value 10 10 is lesser than 15 so we will go to this line and now we are making a recursive call
here recursion is a function calling itself and a function calling itself is not any different from
a function a calling another function b so what will happen here is that execution of this particular
insert call will be paused and a new stack frame will be allocated for execution of this another
insert call to which the arguments passed are address 0 in this local variable root left child
of node at address 200 is null so we are passing 0 in root and in data we are passing 10 now for
this particular insert call control will go inside first if and we will make a call to get new node
function at this line so execution of this insert will pause and we will go to get new node function
here we are creating a new node in heap let's say we got this new node at address 150 now get new
node will return 150 and finish execution of this call to insert will resume at this line return of
get new node will be set here and now this call to insert will return address 150 and finish insert
below will resume at this line and now in this insert call left child of this node at address 200
will be set as return of the previous insert call which is 150 so now these two nodes are linked
and finally this insert call will finish control will return back to main at this line root will
be rewritten as 200 but earlier also it was 200 it's not changing next in the main function we
have a call to insert number 20 i'm not going to show the simulation for this one once again the
allocated memory in stack will grow and shrink and finally when the control will return back to
main function after this insert call is over we will have a node in heap with value 20 set as right
child of this node at 200 let's say we got this new node with value 20 at address 300 so as you can
see the address of right child in node at address 200 is set as 300 now next one is to insert number
25 this one is interesting let's see what will happen for this one main will be paused and we will
go to this call to insert in the root which is local to this call address passed is 200 and we
have passed number 25 in data now here 25 is greater than the value in this node at address 200 so
we will go inside this last else condition we need to insert in the right sub tree so another call
to insert will be made we will pass address 300 as root and data passed will be 25 only now for this
call root is 300 and once again the value in the node at 300 which is 20 is lesser than 25
25 is greater than 20 so once again we will come to this last else and make a recursive call to insert
in the right subtree the right subtree is empty this time so for this insert call at top the address
in root here will be 0 so for this call we will go to the first if and make a call to get new node
let's say get new node returns us a node at address 100 i'm short of space so i'm not showing everything
in get new node's stack frame here we will return back to this insert call at top and now this root
is set as 100 address of the newly created node and now this call to insert will finish we will
come back to this insert below and this insert will resume at this line inside the last else
and the right child of node at address 300 will be set as 100 and now this insert will return back
address 300 whatever is set in its root and this insert below will resume at this line
inside the last else right child of node at address 200 will be set as 300 it was 300
previously also so even after overwriting it will not change and this insert will finish now finally
main will resume at this line root of main will be set as return of this insert call it will only
be overwritten with same value it's really important that this root in main and all the links in nodes
are properly updated quite often because of bugs in our code we lose some links or some unwanted
links are created now as you can see we are creating all the nodes in heap here heap gives us this
flexibility that we can decide the creation of node during runtime and we can control the lifetime
of anything in heap any memory claimed in heap has to be explicitly deallocated using free in
c or delete operator in c++ else the memory in heap remains allocated till the program is running
the memory in stack as you can see gets deallocated when function call finishes
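The lesson does not show deallocation code, so the following is only a sketch of how the heap nodes could be released with `delete`. The function name `DeleteTree` and its node-count return value are hypothetical, added so that the effect can be checked.

```cpp
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Frees every node of the tree and returns how many nodes were freed.
// Children are released before their parent so no link is read after delete.
int DeleteTree(BstNode* root) {
    if (root == nullptr) {
        return 0;  // nothing allocated for an empty tree
    }
    int freed = DeleteTree(root->left) + DeleteTree(root->right);
    delete root;   // heap memory must be given back explicitly
    return freed + 1;
}
```

After such a call the root pointer in main still holds the old address, so it would make sense to set it back to null.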
the rest of the function calls here in main function will execute in similar manner
i'll leave it for you to see and think about right now we have this tree in the heap logically
memory itself is a linear structure and this is how tree which is a non-linear structure
which is logically a non-linear structure will fit in it the way i'm showing the nodes
at random locations linked to each other in this heap i hope this explanation gave you some clarity
in coming lessons we will solve some problems on tree this is it for this lesson thanks for
watching

in our previous lessons we wrote some basic code for binary search tree but to solidify
our concepts we need to write some more code so i've picked this simple problem for you
given a binary search tree we want to find minimum and maximum element in it
let's see how we can solve this problem i have drawn logical representation of a binary search tree
of integers here as we know in a binary search tree for all nodes value of nodes in left sub tree
is lesser and value of nodes in right sub tree is greater this is how we can define node for a
binary search tree in c or c++ we can have a structure with three fields one to store data
another to store address of left child and another to store address of right child as we had seen
earlier in bst implementation identity of the tree that we always keep with us that we pass to
functions is address of the root node so what i want to do here is i first want to write
a function named find min that should take address of the root node as argument and return me the
minimum element in the tree and just like find min we can write another function named find max
that can return us the maximum element in bst let's first see how we can find the minimum element
there are two possible approaches here we can write an iterative solution in which we can use
a simple loop to find the minimum element or we can use recursion let's first see the iterative
solution if we have a pointer to the root node and we want to find the minimum element in bst
then from root we need to go left as long as it's possible to go using the left links
because in a bst for all nodes nodes in left have lesser value and nodes in right have greater value
so we need to go left as long as it's possible we can start with a temporary pointer to root
node we can name this pointer temp or we can name this pointer current to say that we are currently
pointing to this node in my function here i have declared this pointer to bst node named current and
initially i'm setting the address of root in it and with this pointer we can go to the left child
with a statement like current equal current arrow left we first need to check if there is a left child
and then we need to move the pointer we can use a while loop like this if the left child of current
node is not null we can move this pointer current to the left child with this statement current equal
current arrow left here in this example currently we are pointing to this node with value 15
it has a left child so we can move to this node with value 10 once again this node too has a left
child so we can go left again now this node with value 8 does not have a left child so we cannot go
towards left any further we will come out of the while loop and at this point the node that we are
pointing to has minimum value so we can return the data in that node there is one case that we are
missing in this function if the tree is empty we can throw some error or we can return some
value indicative of empty tree if i know that the tree would have only positive values i can
return something like minus one so here in my function i have added this condition if root
is equal to null that is if the tree is empty print this error and return minus one one more thing
we do not need to use this extra pointer to bst node named current root here is a local variable
and we can use this root itself so we can write our code like this while left of root is not
equal to null we can go left with this statement root equal root arrow left and finally we can return
root arrow data which is only an alternate syntax for asterisk root dot data modifying
this local root is not going to modify my root in main function or whatever function i'm calling
this find min function from so this is our iterative solution to find minimum element in bst
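The iterative solution described above could be written roughly as follows. Returning -1 for an empty tree assumes the tree holds only positive values, as stated; writing the message to `std::cerr` and the message text itself are choices made for this sketch.

```cpp
#include <iostream>

struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Returns the minimum element, or -1 if the tree is empty.
int FindMin(const BstNode* root) {
    if (root == nullptr) {
        std::cerr << "Error: tree is empty\n";
        return -1;
    }
    // root is a local copy of the caller's pointer, so moving it
    // here does not change the root held by the caller.
    while (root->left != nullptr) {
        root = root->left;
    }
    return root->data;
}
```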
the logic for finding maximum is similar the only difference will be that instead of going left
we will have to go right all the time i leave it for you to implement let's now see how we can
find minimum element using recursion if we want to reduce this problem in a recursive manner in a
self-similar manner then what we can say is if the left subtree is not empty then we can reduce the
problem to finding minimum in left subtree if left subtree is empty we already know the minimum
because we cannot have a minimum in right subtree here is the recursion that we can write root being
null is a corner case if root is null that is if the tree is empty we can throw error else if left
child of root is null we can return the data in root else if left child is not null or in other
words if the left subtree is not empty we can reduce the problem to searching minimum in the left
subtree so we are making this recursive call to find min passing it address of the left child
passing it address of the root of left subtree left child would be the root of left subtree
this second else if is our base condition to exit from recursion if you had understood the
recursion that we had written earlier to insert a node in bst then this recursion should not
be very difficult for you to understand so here is our recursive solution to find minimum in a
bst to find maximum element all we need to do is we need to go searching in right subtree
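A sketch of the recursive version, together with the mirrored function for the maximum, is given below. The names `FindMinRecursive` and `FindMaxRecursive` are assumptions chosen to keep them apart from the iterative version, and -1 again stands in for "tree is empty".

```cpp
#include <iostream>

struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Minimum: keep reducing the problem to the left subtree.
int FindMinRecursive(const BstNode* root) {
    if (root == nullptr) {
        std::cerr << "Error: tree is empty\n";  // corner case
        return -1;
    } else if (root->left == nullptr) {
        return root->data;                      // base condition: cannot go left
    }
    return FindMinRecursive(root->left);
}

// Maximum: same idea, going right instead of left.
int FindMaxRecursive(const BstNode* root) {
    if (root == nullptr) {
        std::cerr << "Error: tree is empty\n";
        return -1;
    } else if (root->right == nullptr) {
        return root->data;                      // base condition: cannot go right
    }
    return FindMaxRecursive(root->right);
}
```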
okay i'll stop here now in coming lessons we will solve some more interesting problems on bst
thanks for watching

in this lesson we're going to write code to find height or what we can also call
maximum depth of a binary tree we have already discussed depth and height in our first introductory
lesson on trees but i'll do a quick recap here first of all i've drawn a binary tree here
i've not filled in any data in the nodes data can be anything binary tree as we know is a tree
in which each node can have at most two children so a node can have zero one or two children i'll
just number these nodes so i can refer to them i'll say this root node is number one and i'll go
level by level from left to right counting two three four and so on now height of a tree
is defined as number of edges in longest path from root to a leaf node
in this example tree four six seven eight and nine are leaf nodes a leaf node is a node with
zero children number of edges in longest path from root to a leaf node is three for both eight
and nine number of edges in path from root is three so height of the tree is three actually we can
define height of a node in the tree as number of edges in longest path from that node to a leaf
node so height of a tree basically is height of the root node in this example tree height of node
three is one height of node two is two and height of node one is three and because this is the root
node this is also the height of the tree height of a leaf node would be zero so if a tree has only
one node then the root node itself would be a leaf node and so height of the tree would be zero
so this is definition of height of a tree we often also talk about
depth and we often confuse between depth and height but these two are different properties
depth of a node is defined as number of edges in path from root to that node
basically depth is distance from root and height is distance from the farthest accessible leaf node
for node two in this example tree depth is one and height is two for node number nine which is
a leaf node depth is three and height is zero for root node depth is zero and height is three
height of a tree would be equal to maximum depth of any node in the tree so height
and max depth these two terms are used interchangeably okay let's now see how we can calculate height
or max depth of a binary tree i'm going to write a function named find height that will take
reference or address of the root node as argument and return me the height of the binary tree
now the logic to calculate height can be something like this for any node if we can somehow calculate
the height of its left subtree and also the height of its right subtree then the height of that node
would be greater of the heights of left and right subtrees plus one for the root node in this tree
height of the left subtree is two and height of the right subtree is one so height of the root
node would be greater of these two values plus one plus one for the edge connecting the root node
to the subtree so height of the root node which would also be the height of the tree is three here
in our code we can calculate height of left and right subtrees using recursion what i'll do here
in find height function is i'll first make a recursive call to find height of the left subtree
we can say to find height of left subtree or to find height of left child both will mean the same
i'm collecting the return of this recursive call in a variable named left height
and now i'll make another recursive call to calculate height of right subtree or right child
now height of the tree or height of whatever node for which we have made this function call
would be greater of these two values left height and right height plus one now there is only one
more thing missing in this recursion we need to write the base or exit condition we cannot go into
recursion infinitely what we can do is we can go on till we make a recursive call with root equal
null and if root is null that is if the tree or subtree is empty we can return something what
should we return here give this some thought if i have made a call to find height of let's say
this leaf node this node with number seven then for this guy both left and right children are null
in call for this node number seven we will make two recursive calls passing null in both the calls
so what should we return should we return zero if these two calls will return zero then height of
seven will be one because in the return statement here we're saying max of left and right height
plus one but as we had discussed earlier height of a leaf node should be zero so if we are returning
zero for root equal null it's not all right what we can do is we can return minus one
when we are returning minus one then this edge to null that does not exist but still was getting
counted will be balanced with this minus one i hope this is making sense and going by convention
also height of an empty tree is set to be minus one so this is pseudocode for my function to find
height of a binary tree some people define height as number of nodes in longest path from root to a
leaf node we are counting edges here and this is the right definition if you want to count
number of nodes then for a leaf node height would be one and for empty tree height would be zero
so all you need to do is return zero here and this is the code if you want to count
number of nodes but i think the right definition is number of edges so i'll return minus one here
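The find height function described above can be sketched in C++ as follows. This is a minimal sketch, not the exact code shown in the video: the `Node` field names and the name `FindHeight` are assumptions, and `std::max` stands in for the "max" function mentioned in the lesson.

```cpp
#include <algorithm>  // std::max

// Binary tree node; the field names are assumed for illustration.
struct Node {
    char data;
    Node* left;
    Node* right;
};

// Height counted in edges on the longest path from this node to a leaf.
// An empty tree has height -1, so a leaf comes out as max(-1, -1) + 1 = 0.
int FindHeight(const Node* root) {
    if (root == nullptr) return -1;             // base case: empty tree or subtree
    int leftHeight = FindHeight(root->left);    // height of left subtree
    int rightHeight = FindHeight(root->right);  // height of right subtree
    return std::max(leftHeight, rightHeight) + 1;
}
```

If you prefer the definition that counts nodes instead of edges, the only change is to return 0 in the base case.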
time complexity of this function is big o of n where n is number of nodes in the tree we will
make one recursive call corresponding to each node in the tree so we are kind of visiting each
node in the tree once and so running time will be proportional to number of nodes i'll skip detail
analysis of running time in this lesson this is what my find height function will look like in
c or c plus plus max here is a function that will return greater of two values passed to it as
arguments so this is it for this lesson thanks for watching

in this lesson we are going to talk
about binary tree traversal when we are working with trees we may often want to visit all the
nodes in the tree now tree is not a linear data structure like array or linked list
in a linear data structure there would be a logical start and a logical end so we can start
with a pointer at one of the ends and keep moving it towards the other end for a linear
data structure like linked list for each node or element we would have only one next element
but tree is not a linear data structure i have drawn a binary tree here data type is
character this time i fill in these characters in the nodes now for a tree at any time if
we are pointing to a particular node then we can have more than one possible directions we can have
more than one possible next nodes in this binary tree for example if we will start with a pointer
at root node then we have two possible directions from f we can either go left to d or we can go
right to j and of course if we will go in one direction then we will somehow have to come back
and go into the other direction later so tree traversal is not so straightforward
and what we are going to discuss in this lesson is algorithms for tree traversal tree traversal
can formally be defined as the process of visiting each node in the tree exactly once in some order
and by visiting a node we mean reading or processing data in the node for us in this lesson
visit will mean printing the data in the node based on the order in which nodes are visited
tree traversal algorithms can broadly be classified into two categories we can either go breadth first
or we can go depth first breadth first traversal and depth first traversal are general techniques
to traverse or search a graph graph is a data structure and we have not talked about graph
so far in this series we will discuss graph in later lessons for now just know that tree is only
a special kind of graph and in this lesson we are going to discuss breadth first and depth first
traversal in context of trees in a tree in breadth first approach we would visit all the nodes at
same depth or level before visiting the nodes at next level in this binary tree that i'm showing
here this node with value f which is the root node is at level 0 i'm writing l0 here for level 0
depth of a node is defined as number of edges in path from root to that node root node would have
depth 0 these two nodes d and j are at depth 1 so we can say that these nodes are at level 1
now these four nodes are at level 2 these three nodes are at level 3 and finally this node with
value h is at level 4 so what we can do in breadth first approach is that we can start at level 0
we would have only one node at level 0 the root node so we can visit the root node i'll write the
value in the node as i'm visiting it now level 0 is done now i can go to level 1 and visit the
nodes from left to right so after f we would visit d and then we would visit j and now we are done
with level 1 so we can go to level 2 now we will go like b then e then g and then k and now we can
go to level 3 a c and i and finally i can go to level 4 and visit h this kind of breadth first traversal in case
of trees is called level order traversal and we will discuss how we can do this programmatically
in some time but this is the order in which we would visit the nodes we would go level by level
from left to right in breadth first approach for any node we visit all its children before visiting
any of its grandchildren in this tree first we are visiting f and then we are visiting d
and then we are not going to any child of d like b or e along the depth next we are going to j
but in depth first approach if we would go to a child we would complete the whole subtree of the
child before going to the next child in this example tree here from f the root node if we are going
left to d then we should visit all the nodes in this left subtree that is we should finish this
left subtree in its complete depth or in other words we should finish all the grandchildren of f
along this path before going to right child of f j and once again when we will go to j we will
visit all the grandchildren along this path so basically we will visit the complete right subtree
in depth first approach the relative order of visiting the left subtree the right subtree
and the root node can be different for example we can first visit the right subtree and then the
root and then the left subtree or we can do something like we can first visit the root and then the
left subtree and then the right subtree so the relative order can be different but the core idea
in depth first strategy is that visiting a child is visiting the complete subtree in that path
and remember visiting a node is reading processing or printing the data in that node
based on the relative order of left subtree right subtree and the root there are three popular
depth first strategies one way is that we can first visit the root node then the left subtree
and then the right subtree left and right subtrees will be visited recursively in
same manner such a traversal is called pre-order traversal another way is that we can first
visit the left subtree then the root and then the right subtree such a traversal is called
in order traversal and if root is visited after left and right subtrees then such a traversal
is called post order traversal in total there are six possible permutations for left right and root
but conventionally a left subtree is always visited before the right subtree so these are
the three strategies that we use only the position of root is changing here if it's before left and
right then it's pre-order if it's in between it's in order and if it's after left and right subtrees
then it's post order there is an easy way to remember these three depth first algorithms
if we can denote visiting a node or reading the data in that node with letter d going to the left
subtree as l and going to the right subtree as r so if we can say d for data l for left and r for
right then in pre-order for each node we will go d l r first we will read the data in that node
then we will go left and once the left subtree is done we will go right in in order traversal
first we will finish the left subtree then we will read the data in current node
and then we will go right in post order for each node first we will go left once left subtree is
done we will go right and then we will read the data in current node so pre-order is data left
right in order is left data right and post-order is left right and then data
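The D L R mnemonic can be made concrete with a small sketch in which the only thing that changes between the three strategies is where the data is read. This is an illustration under assumptions, not the implementation from the lesson: the `Order` enum and the `Traverse` function are hypothetical names, and the output is collected in a string instead of being printed.

```cpp
#include <string>

// Binary tree node; the field names are assumed for illustration.
struct Node {
    char data;
    Node* left;
    Node* right;
};

// Hypothetical selector for the position of "D" relative to "L" and "R".
enum class Order { Pre, In, Post };

// Appends the data of every node to `out` in the requested depth first order.
void Traverse(const Node* root, Order order, std::string& out) {
    if (root == nullptr) return;                  // empty subtree: nothing to visit
    if (order == Order::Pre) out += root->data;   // D L R
    Traverse(root->left, order, out);             // L
    if (order == Order::In) out += root->data;    // L D R
    Traverse(root->right, order, out);            // R
    if (order == Order::Post) out += root->data;  // L R D
}
```

For the example tree in this lesson (root F, children D and J, then B E G K, then A C I, then H as left child of I), this gives F D B A C E J G I H K for pre-order, A B C D E F G H I J K for in-order and A C B E D H I G K J F for post-order.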
pre-order in order and post-order are really easy and intuitive to implement
using recursion but we will discuss implementation later let's now see what will be the pre-order
in order and post-order traversal for this tree that i've drawn here let's first see what will be
the pre-order traversal for this binary tree we need to start at root node and for each node
we first need to read the data or in other words visit that node in fact instead of d l r we could
have said v l r here v for visit we can use any of these notations v for visit or d for data
i will go with d l r here so let's start at the root for each node we first need to read the data
i'm writing f here the data that i just read and now i need to go left and finish the complete
left subtree and once all the nodes in the left subtree are visited then only i can go to the right
subtree the problem here is actually getting reduced in a self-similar or recursive manner
now we need to focus on this left subtree now we are at d root of this left subtree of f
once again for this node we will first read the data and now we can go left we will go towards e
only when these three nodes a b and c will be done now we are focusing on this subtree
comprising of these three nodes now we are at b we can read the data and now we can go left
to a there is nothing in left of a so we can say that for a left subtree is done
and there is nothing in right as well so we can say right is also done now for b left subtree is done
so we can go right to c and left and right of c are null and now for d left subtree is done so we
can go right once again for e left and right are null and now at this stage for f complete left
subtree is visited so we can go right to j we read the data and now we need to go left of j to g there is nothing in left of
g so we can go right to i and now we can go left of i to h for h there is nothing in left and right now
at this stage left subtree of i is done and right subtree is null and now we can go back to j
the left subtree for j is done so we can go to its right subtree finally we have k here and we are
done with all the nodes this is how we can perform a preorder traversal manually actual implementation
would be a simple recursion and we will discuss it later let's now see what will be the in order
traversal for this binary tree in in order traversal we will first finish visiting the left subtree
then visit the current node and then go right once again we will start at the root and we will
first go left now we will first finish this subtree once again for d we will first go left to b
and from b we will go to a now for a there is nothing in left so we can say that for this guy
left subtree is done so we can read the data and now we can go to its right but there is nothing
in right as well so this guy is done now for b left subtree is done so we can read the data
and now for b we can go right for c once again there is nothing in left so we can read the data
and there is nothing in right as well now left of d is completely done so we can visit it read
the data here now we can go to its right to e for e once again left and right are null at this
stage left subtree of f is done so we can read the data and now we can go to right of f if we
will go on like this this finally will be my in order traversal this tree that i'm showing here
is actually a binary search tree for each node the value of nodes in left is lesser
and the value of nodes in right is greater so if we are printing in this order left subtree
and then the current node and then the right subtree then we would get a sorted list in order
traversal of a binary search tree would give you a sorted list okay now you should be able to figure
out the post order traversal this is what we will get for post order traversal i leave it for you
to see whether this is correct or not i'll stop here now in next lesson we will discuss
implementation of these tree traversal algorithms thanks for watching

in this lesson we are going to
write code for level order traversal of a binary tree as we have discussed in our previous lesson
in level order traversal we visit all nodes at a particular depth or level in the tree before
visiting the nodes at next deeper level for this binary tree that i'm showing here if i have to
traverse the tree and print the data in nodes in level order then this is how we will go
we will start at level 0 and print f and now we are done with level 0 so we can go to level 1
and we can visit the nodes at level 1 from left to right from f we will go to d and from d we
will go to j now level 1 is done so we can go to level 2 so we will go like b e g and then k
and now we can go to next level a c i and finally we will be done at h this is the order in which
we should visit the nodes but the question is how can we visit the nodes in this order in a program
unlike linked list we can't just have one pointer and keep moving it if i'll start with a pointer
at root let's say i have a pointer named current to point to the current node that i'm visiting
then it's possible for me to go from f to d using this pointer because there is a link so i can go
left to d but from d i cannot go to j because there is no link from d to j the only way we can
go to j is from f and once we have moved the pointer to d we can't even go back to f because
there is no backward link from d to f so what can we do to traverse the nodes in level order clearly
we can't go with just one pointer what we can do is as we visit a node we can keep reference or
address of all its children in a queue so we can visit them later a node in the queue can be called
discovered node whose address is known to us but we have not visited it yet initially we can start
with the address of root node in the queue to mean that initially this is the only discovered node
let's say for this example tree address of the root node is 400 and i'll assume some random
addresses for other nodes as well i will mark a discovered node in yellow okay now initially i'm
enqueuing the root node and by storing a node in the queue i'll mean storing the address of the
node in the queue so initially we are starting with one discovered node now as long as the queue
has some discovered node at least one discovered node that is as long as the queue is not empty
we can take out a node from the front visit it and then enqueue its children visiting a node for
us is printing the value in that node so i'll write f here and now i'll enqueue the children
of this root node first i'll enqueue the left child and then the right child i'll mark visited
node in another color okay now we have one visited node and two discovered node
and now once again we can take out the node at front of the queue visit it and enqueue its children
by using a queue we are doing two things here first of all as we are moving from a node
we are not losing reference to its children because we are storing the references and then
because queue is a first in first out structure so a node that is discovered first that is inserted
first will be visited first so we will get this order that we are desiring give this some thought
and it's not very difficult to see okay so now we can dequeue and visit this node at address 200
and once again before i move on from this node i need to enqueue its children
so now at this stage we have two visited nodes three discovered nodes
and six undiscovered nodes and now we can take out the next node from front of queue
we'll visit it and enqueue its children if we will go on like this all the nodes will be visited
in the order that we are desiring at this stage we can dequeue node at 120 visit it
and enqueue its children so we will go on like this until all the nodes are visited and the queue
is empty after b we will have e here nothing will go into the queue this time next we will have g here
and the address of i will go into the queue now k will be visited now at this stage we have
reference to three nodes in the queue now we will visit this node at 320 with value a
then we have c and now we will print i and the node with value h the node with data h will go into
the queue finally we will visit this node and now we are done with all the nodes the queue is empty
once the queue is empty we are done with our traversal so this is the algorithm for level
order traversal of a binary tree as you saw in this approach at any time we are keeping a
bunch of addresses in the memory in the queue instead of using just one pointer to move around
so of course we are using a lot of extra memory and i'll talk about the space complexity of
this algorithm in some time i hope you got the core logic right let's now write code for this
algorithm i'm going to write c plus plus here this is how i'm defining node for my binary tree
i have a structure here with three fields one to store data and the data type is character this
time because in the example tree that we were showing earlier data type was character and we
have two more fields that are pointers to node one to store the address of left child and another
to store the address of right child now what i want to do here is i want to write a function
named level order that should take address of the root node as argument and print the data in the
nodes in level order now to test this function i'll have to write a lot of code to create and insert
nodes in a binary tree i'll have to write some more functions i'll skip writing all that code
you can pick the code for creation and insertion from previous lessons all i'll write is this function
level order now in this function here i'll first take care of one corner case if the tree is empty
that is if root is null we can simply return else if the tree is not empty we need to create
a queue i'm not going to write my own implementation of queue here in c plus plus we can use the queue
in standard template library and to use it first we'll have to write a statement like hash include
queue here and now i can create a queue of any type in this function i'll create a queue of pointer
to node with a statement like this now as we had discussed earlier initially we start with one
discovered node in the queue the only node known to us initially is the root node with this statement
queue dot push root i have inserted the address of root node in the queue and now i'll run this
while loop for which the condition is that the queue should not be empty and what i really mean
here is that while there is at least one discovered node we should go inside the loop and inside the
loop we should take out a node from the front this function front returns the element at front of the
queue and because the data type is pointer to node i'm collecting the return of this function in
this pointer to node named current now i can visit this node being pointed by current and by
visiting if we mean reading the data in that node i'll simply print the data and now we want to push
the addresses of children of this node into the queue so i'll say that if the left child is not null
insert it into the queue and similarly if right child is not null push it into the queue or
rather push its address into the queue and i'll write one more statement to remove the element
from front of the queue with call to front the element is not removed from the queue with this
call pop we are removing the element okay so this is implementation of level order traversal in
c++ you can check the description of this video for a link to source code and there you can also
find all the extra code to test this function let's now talk about time and space complexity
of level order traversal if there are n nodes in the tree and in this algorithm visit to a node
is reading the data in that node and inserting its children in the queue then a visit to a node
will take constant time and each node will be visited exactly once so time taken will be
proportional to the number of nodes or in other words we can say that the time complexity is big
o of n for all cases irrespective of the shape of the tree time complexity of level order traversal
will be big o of n now let's talk about space complexity space complexity as we know is the
measure of rate of growth of extra memory used with input size we are not using constant amount
of extra memory in this algorithm we have this queue that will grow and shrink while executing
this algorithm assuming that the queue is dynamic maximum amount of extra memory used will depend
upon maximum number of elements in the queue at any time we can have couple of cases in some cases
extra memory used will be lesser and in some cases extra memory used will be greater
for a tree like this where each node has only one child we will have maximum one element in
the queue at any time during each visit one node will be taken out from the queue and one node
will be inserted so the amount of extra memory taken will not depend upon the number of nodes
space complexity will be big o of one but for a tree like this the amount of extra memory
used will depend upon the number of nodes in the tree this is a perfect binary tree all the levels
are full if you can see as the algorithm will execute at some point for each level all the
nodes in that level will be in the queue in a perfect binary tree we will have about n by 2 nodes at
the deepest level so maximum number of nodes in the queue is going to be at least n by 2
so basically extra memory used is proportional to n the number of nodes so space complexity
will be big o of n for this case i'm not going to prove it but for average case space complexity
will be big o of n so for both worst and average cases we will be big o of n in terms of space
complexity and when we are saying best average and worst cases here it's only going by space
complexity time complexity will be big o of n for all cases so this is time and space complexity
analysis of level order traversal i'll stop here now in next lesson we will discuss depth first
traversal algorithms pre-order in order and post-order this is it for this lesson thanks for watching
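The two space complexity cases for level order traversal can be checked with a small experiment. The sketch below is an illustration only: `MaxQueueSize` is a hypothetical helper that runs the same queue based loop but, instead of printing, records the largest number of addresses held in the queue at any time.

```cpp
#include <algorithm>  // std::max
#include <cstddef>    // std::size_t
#include <queue>

// Binary tree node; the field names are assumed for illustration.
struct Node {
    char data;
    Node* left;
    Node* right;
};

// Returns the peak number of nodes waiting in the queue during a
// level order traversal of the tree rooted at `root`.
std::size_t MaxQueueSize(const Node* root) {
    if (root == nullptr) return 0;
    std::queue<const Node*> discovered;
    discovered.push(root);
    std::size_t peak = 1;  // the root alone is in the queue at the start
    while (!discovered.empty()) {
        const Node* current = discovered.front();
        discovered.pop();
        if (current->left != nullptr) discovered.push(current->left);
        if (current->right != nullptr) discovered.push(current->right);
        peak = std::max<std::size_t>(peak, discovered.size());
    }
    return peak;
}
```

For a tree where every node has only one child the peak stays at 1 no matter how many nodes there are, while for a perfect binary tree with 7 nodes the peak is 4, which is the number of nodes at the deepest level.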
in our previous lesson we talked about level order traversal of binary tree which is basically
breadth first traversal now in this lesson we are going to discuss these three depth first
algorithms pre-order in order and post-order i have drawn a binary tree here data type filled
in the nodes is character now as we had discussed in earlier lessons in depth first traversal of
binary tree if we go in one direction then we visit all the nodes in that direction or in other
words we visit the complete subtree in that direction and then only we go in other direction
in this example tree that i've drawn here if i'm at root and i'm going left then i'll visit
all the nodes in this left subtree and then only i can go right and once again when i'll go right
i'll visit all the nodes in this right subtree if you can see in this approach we are reducing
the problem in a self-similar or recursive manner we can say that in total visiting all
the nodes in the tree is visiting the root node visiting the left subtree and visiting the right
subtree remember by visiting a node we mean reading or processing the data in that node
and by visiting a subtree we mean visiting all the nodes in the subtree in depth first strategy
relative order of visiting the left subtree right subtree and the root can be different
for example we can first visit the right subtree then the root and then the left subtree
or we can first visit the root and then the left subtree and then the right subtree
conventionally left subtree is always visited before right subtree with this constraint we will
have three permutations we can first visit the root and then the left subtree and then the
right subtree and such a traversal will be called pre-order traversal or we can first visit the
left subtree then the root and then the right subtree and such a traversal will be called
in order traversal and we can also go left right and then root and such a traversal will be called
post-order traversal left and right subtrees will be visited recursively in same manner as the
original tree so in pre-order once again for the subtrees we will go root left and then right
in in order we'll keep going left root and then right the actual implementation of these
algorithms is really easy and intuitive let's first see code for pre-order traversal
I have first written the algorithm in words here in pre-order traversal we first need to visit
the root then the left subtree and then the right subtree now I want to write a function that should
take pointer or reference to root node as argument and print data in all the nodes in pre-order
let's say visiting a node for us is printing the data in that node in c or c++ my method signature
will look something like this this function will take address of the root node as argument
argument type is pointer to node I'll define node as a structure with three fields like this
data type in this definition is character and there are two fields to store the addresses of
left and right children now in pre-order function I'll first visit or print the data in root node
and now I'll make a recursive call to visit the left subtree I have made a recursive call here
and to this call I'm passing address of the left child of my current root because left child will be
the root of left subtree and I'll have another call like this to visit the right subtree there is one
more thing that we need to add in this function and we will be done we cannot go into recursion
infinitely we need to have a base condition where we should exit if a tree or subtree is empty or
in other words for any call if root is null we can return or exit now with this much of code I'm
done with my pre-order function this will work fine in c or c++ actually in c make sure you write
struct space node instead of writing just node rest of the things are fine it will be good to
visualize this recursion so let's now quickly see how this pre-order function will work if this
example tree that I'm showing in right here is passed to it I'll redraw this tree and show it
like this here I'm depicting node as a structure with three fields let's say the leftmost cell here
is to store the address of left child the cell in middle is to store the data and the rightmost
cell is to store the address of right child now let's assume some addresses for these nodes
let's say the root node is at address 200 and I'll assume some random addresses for other nodes as
well and now I can fill in left and right fields for each node and as we know the identity of tree
that we always keep with us is reference or address of the root node this is what we pass to all the
functions in our implementation we often use a variable of type pointer to node named root to
store the address of root node we can name this variable anything we can name this variable root
or we can name this variable root ptr but this is just a pointer this particular block that I'm
showing here is for pointer to node and all these rectangles with three cells are nodes
this is how things are organized in memory now for this tree let's say we are making a call
to this pre-order function I'll make a call to pre-order passing it address 200 for this call
root is not null so we will not return at first line in this function we will go ahead and print
the data in this node at address 200 I'll write output for all print statements here and now this
function will make a recursive call execution of this particular function call will pause it will
resume only after this recursive call pre-order 150 finishes this second call is to visit this
left subtree this call pre-order 150 is to visit this left subtree address of the left child of
node at 200 is 150 once again for this call root is not null so we will go ahead and print the data
in node at 150 which is d and now once again there will be a recursive call with this call pre-order 400
we are saying that we're going to visit this subtree once again we will print the data and
make another recursive call now we have made a call to visit this particular subtree with just
one node for this call we will print the data and now for node at 250 address of left child is zero
or null we will make a call pre-order zero but for this call we will simply return
because the address in this variable root will be null we have hit the base condition for our
recursion call to pre-order zero we'll finish and pre-order 250 will resume now in this particular
function call we'll make another call for right subtree for node at 250 even the right child is
null we will have another recursive call passing address zero but this once again will simply return
and now call to pre-order 250 will finish and call to pre-order 400 will resume
now in call to pre-order 400 we will make another recursive call to pre-order 180
with this call pre-order 180 we are visiting this particular subtree with just one node
for this call first we will print the data and then we will make a recursive call to pre-order zero
now pre-order zero will simply return and then we will have another call to pre-order zero
for right child of 180 the recursion will go on like this there is one thing that i want to talk
about here that's happening in this whole process even though we are not using any extra memory
explicitly in our function because of the recursion we are growing the function call stack
we have discussed memory management a number of times in our earlier lessons
you can check description of this video for a link to one of those lessons as we know for
each function call we allocate some amount of memory in what we call stack section of applications
memory and this allocated memory is reclaimed when the function call finishes at this stage
of execution of my recursion for this example my call stack will look something like this i'm writing
p as shortcut for pre-order because i'm short of space here let's say we made a call to pre-order
passing it address 200 from main function main function will be at bottom of stack at any time
only the call at top of stack will be executing and all other calls will be paused call stack keeps
growing and shrinking during execution of a program because memory is allocated for a new
function call and it's reclaimed when a function call finishes so even though we are not using any
extra memory explicitly here we are using memory implicitly in the call stack so space complexity
which is measure of rate of growth of extra memory used with input will depend upon the maximum
amount of extra memory used in the call stack i'll talk about space complexity once more later
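The implicit memory use of the recursion can be observed with an instrumented version of the pre-order function. This is a sketch under assumptions, not the function from the lesson: the `depth` and `maxDepth` parameters are illustrative additions that count how many calls for real nodes are on the call stack at the same time, and the data is appended to a string instead of being printed.

```cpp
#include <algorithm>  // std::max
#include <string>

// Binary tree node; the field names are assumed for illustration.
struct Node {
    char data;
    Node* left;
    Node* right;
};

// Pre-order traversal (root, left subtree, right subtree).
// `depth` is the number of active calls for non-null nodes including this
// one; `maxDepth` records the largest value of `depth` seen so far.
void Preorder(const Node* root, std::string& out, int depth, int& maxDepth) {
    if (root == nullptr) return;                  // base case: empty subtree
    maxDepth = std::max(maxDepth, depth);         // track growth of the call stack
    out += root->data;                            // visit the root first
    Preorder(root->left, out, depth + 1, maxDepth);   // then the left subtree
    Preorder(root->right, out, depth + 1, maxDepth);  // then the right subtree
}
```

Called with `depth` equal to 1 and `maxDepth` equal to 0 on the example tree, `maxDepth` ends up as 5, one more than the height of the tree counted in edges (the path F, J, G, I, H). The calls made with a null address add at most one more frame on top of that before they return.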
for now let's come back to this recursion that i was executing call to this pre-order 0 will
finish and pre-order 180 will resume memory allocated for execution of pre-order 0 will be reclaimed
now for pre-order 180 both recursive calls have finished so this guy will also finish
even for pre-order 400 both calls have finished so pre-order 150 will resume now this guy will
make a recursive call to pre-order function passing it address 450 address of its right
child memory in the stack will be allocated for execution of pre-order 450 now in this
call we will first print the data and then we will make two recursive calls to pre-order passing
address 0 each time because for this node at 450 both children are null both calls will simply
return and then pre-order 450 will finish and now pre-order 150 will also be done if you can see
the call stack will grow only till we reach a leaf node a node with no children and then it will
start shrinking again maximum growth of call stack due to this recursion will depend upon
maximum depth or height of the tree we can say that extra space used will be proportional to
height of the tree or in other words space complexity of this algorithm is big o of h
where h is height of the tree okay coming back to the recursion we are done with pre-order 150
so pre-order 200 will resume and now we will make a call to visit this particular sub tree
in this call we will print j and then we will make a call passing address 60 so now we are
visiting this particular sub tree here we will first print g and then this guy will make a call
to pre-order 0 which will simply return and then there will be another call to pre-order 500
here we will print i and then we will make two recursive calls passing address 0 every time because
node at 500 is a leaf node with no children after this guy finishes pre-order 60 will resume
now this guy will also finish and pre-order 350 will resume and now we will have a call to
pre-order 700 which once again is a leaf node so k which is data in this node will be printed
and then we will make two calls passing address 0 which will simply return now at this stage
all these calls can finish we are done visiting all the nodes finally we will return back to the
caller of pre-order 200 which probably would be the main function so this is pre-order traversal
for you i hope you got how this recursion works code for in-order and post-order will be very
similar in in-order traversal my base case will be the same so i'll say if root is null then return
or exit if root is not null i first need to visit the left sub tree i am visiting the left
sub tree with this recursive call then i need to visit the root so now i'm writing the printf
statement to print the data and now i can visit the right sub tree so this second recursive call
and this is my in-order function in-order traversal of this example tree that i have drawn here
will be this this particular binary tree is actually also a binary search tree and in-order
traversal of a binary search tree would give us elements in the tree in sorted order
okay let's now write code for post-order for this function once again the base case will
be the same so i'll say if root is null return or exit if root is not null i first need to visit
the left sub tree so i have made this recursive call then the right sub tree so i'll have this
another recursive call and now i can visit the root node post-order traversal for this example tree
will be this so this is pre-order in-order and post-order for you you can check the description
of this video for a link to all the source code let's now quickly talk about time and space
complexity of these algorithms time complexity of all these three algorithms is big o of n
if you could see then there was one function call corresponding to each node where we were
actually visiting that node where we were actually printing the data in that node so running time
should actually be proportional to number of nodes there is a better formal and mathematical
way of proving that time complexity of these algorithms is big o of n you can check the description
of this video for link to that space complexity as we had discussed earlier will be big o of h
where h is height of the tree height of a tree in worst case will be n minus 1 so in worst case
space complexity of these algorithms can be big o of n in best or average case height of a tree
will be big o of log n to the base two so we can say that in best or average case
space complexity will be big o of log n i'll stop here now in coming lessons we will solve
some problems on binary tree thanks for watching
in this lesson we are going to solve a simple
problem on binary tree which is also a famous programming interview question and the problem is
given a binary tree we need to check if the binary tree is a binary search tree or not
as we know a binary tree is a tree in which each node can have at most two children
all these trees that i have drawn here are binary trees but not all of them are binary search trees
binary search tree as we know is a binary tree in which for each node value of all the nodes in
left subtree is lesser and if we want to allow duplicates we can say lesser or equal and value
of all the nodes in right subtree is greater we can define binary search tree as a recursive
structure like this elements in left subtree must be lesser or equal and elements in right
subtree must be greater and this should be true for all nodes and not just the root node
so left and right subtrees should themselves also be binary search trees
of these binary trees that i'm showing here a and c are binary search trees but b and d
are not in b for the root node with value 10 we have 11 in its left subtree which is
greater than 10 and in a binary search tree for any node all values in its left subtree must be lesser
in d we are good for the root node the value in root node is 5 and we have 1 in left subtree
which is lesser and we have 8 9 and 12 in right subtree which are greater so we are good for the
root node but for this node with value 8 we have 9 in its left so this tree is not a binary search
tree so how should we go about solving this problem basically i want to write a function
that should take pointer or reference to root node of a binary tree as argument and the function
should return true if the binary tree is a bst and false otherwise this is what my method signature will
look like in c++ in older c we do not have a boolean type so return type here can be int
we can return 1 for true and 0 for false i'll also write the definition of node here
for a binary tree node would be a structure with 3 fields one to store data and two to store
addresses of left and right children in my definition of node here data type is integer
and we have two pointers to node to store addresses of left and right children okay coming back to
the problem there are multiple approaches and we are going to talk about all of them the first
approach that i'm going to talk about is easy to think of but it's not so efficient
but let's discuss it anyway we are saying that for a binary tree to be called binary search tree
it should have a recursive structure like this for the root node all the elements in left
sub tree must be lesser or equal and all the elements in right sub tree must be greater
and left and right sub trees should themselves also be binary search trees
so let's just check for all of this i'm going to write a function named is sub tree lesser
that will take address of root node of a binary tree or sub tree and an integer value as arguments
and this function will return true if all the elements in the sub tree are lesser than this value
and similarly i'll write another function named is sub tree greater that will return true if all
the elements in a sub tree are greater than a given value i have just declared these functions
i'll write body of these functions later let's come back to this function is binary search tree
in this function i'm going to say that if all elements in left sub tree are lesser
and i'll verify this by making a call to is sub tree lesser function passing it address of
left child of my current root left child would be the root of left sub tree and the data in root
this function call will return true if all the elements in left sub tree would be lesser than
the data in root now the next thing that i want to check for is if elements in right sub tree
are greater than the data in root or not these two conditions are not sufficient
we also need to check if left and right sub trees are binary search trees or not
so i'll add two more conditions here i have made a recursive call to is binary search tree
function passing it address of left child and i have made another call passing address of right
child and if all these four function calls is sub tree lesser is sub tree greater and is binary
search tree for left and right sub trees return true if all these four checks pass then our tree is
a binary search tree we can return true else we need to return false there is only one thing
that we are missing in this function now we are missing the base case if root is null that is
if the tree or sub tree is empty we can return true this is the base case for our recursion
where we should stop with this much of code is binary search tree function is complete
but let's also write is sub tree lesser and is sub tree greater functions because
they're also part of our logic this function has to be a generic function that should check
if all the elements in a given tree are lesser than a given value or not we will have to traverse
the complete tree or sub tree and see value in all the nodes and compare these values against
this given integer i'll first handle the base case in this function if the tree is empty we can
return true else we need to check if the data in root is less than or equal to the given value
and we also need to recursively check if left and right sub trees of the current root have lesser
value or not so i'm adding two more conditions here i'm making two recursive calls one for the
left sub tree and another for the right sub tree if all these three conditions are true
then we are good else we can return false is sub tree greater function will be very similar
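The first approach described above could be sketched in C++ along these lines. It is a sketch under assumptions rather than the lesson's exact code: the `Node` layout and the CamelCase function names are illustrative, and duplicates are allowed on the left only, following the "lesser or equal" wording used for the left subtree.

```cpp
// Hypothetical node layout: integer data plus two child pointers.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// True if every value in the (sub)tree is less than or equal to value.
bool IsSubtreeLesser(const Node* root, int value) {
    if (root == nullptr) return true;  // an empty subtree has nothing to violate
    return root->data <= value
        && IsSubtreeLesser(root->left, value)
        && IsSubtreeLesser(root->right, value);
}

// True if every value in the (sub)tree is strictly greater than value.
bool IsSubtreeGreater(const Node* root, int value) {
    if (root == nullptr) return true;
    return root->data > value
        && IsSubtreeGreater(root->left, value)
        && IsSubtreeGreater(root->right, value);
}

// All four checks must pass: both subtrees are bounded by the root's data
// and both subtrees are themselves binary search trees.
bool IsBinarySearchTree(const Node* root) {
    if (root == nullptr) return true;  // base case: empty tree
    return IsSubtreeLesser(root->left, root->data)
        && IsSubtreeGreater(root->right, root->data)
        && IsBinarySearchTree(root->left)
        && IsBinarySearchTree(root->right);
}
```

Because `&&` short-circuits, the function stops at the first failed check, but on a valid tree every subtree is still scanned once for each of its ancestors.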
instead of writing these two functions is sub tree lesser and is sub tree
greater we could also do something like this we could find the maximum in left sub tree and compare
it with the data in root if maximum of a sub tree is lesser then all the elements are lesser and
similarly if the minimum of a sub tree is greater all the elements are greater for the right sub tree
we could find the minimum so instead of writing these two functions is sub tree lesser and is
sub tree greater we could write something like find max and find min and this would also work
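The find max / find min variant mentioned here might look like the sketch below. The helper names and the `Node` layout are assumptions for illustration. Note that the helpers have to look at every node of the subtree, because the tree is not yet known to be a binary search tree, so this variant is no cheaper than the previous one.

```cpp
#include <algorithm>

// Hypothetical node layout: integer data plus two child pointers.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Largest value in a non-empty tree (root must not be null).
int FindMax(const Node* root) {
    int best = root->data;
    if (root->left != nullptr) best = std::max(best, FindMax(root->left));
    if (root->right != nullptr) best = std::max(best, FindMax(root->right));
    return best;
}

// Smallest value in a non-empty tree (root must not be null).
int FindMin(const Node* root) {
    int best = root->data;
    if (root->left != nullptr) best = std::min(best, FindMin(root->left));
    if (root->right != nullptr) best = std::min(best, FindMin(root->right));
    return best;
}

bool IsBinarySearchTree(const Node* root) {
    if (root == nullptr) return true;
    // If the maximum on the left is not greater than the root, nothing on the left is.
    if (root->left != nullptr && FindMax(root->left) > root->data) return false;
    // If the minimum on the right is greater than the root, everything on the right is.
    if (root->right != nullptr && FindMin(root->right) <= root->data) return false;
    return IsBinarySearchTree(root->left) && IsBinarySearchTree(root->right);
}
```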
so this is our solution using one of the approaches let's quickly run this code on an
example binary tree and see how it will execute i have drawn a very simple binary tree here
which actually is a binary search tree let's assume some addresses for these nodes in the tree
let's say the root node is at address 200 and i'll assume some random addresses for other nodes as well
to check if this binary tree is a binary search tree or not we will make a call to
is binary search tree function i'm writing IBST here as shorthand for is binary search tree
because i'm short of space here so i'll make a call to this function maybe from the main function
passing address 200 address of the root node for this function call the address collected
in this local variable root will be 200 root is not null null is only a macro for
address 0 for this call root is not null so we will not return true at this line we will go to the
next if now here we will make a call to is subtree lesser function arguments passed will be address
of left child which is 150 and seven the data in node at 200 execution of the calling function
will pause and will resume only after the called function returns now in this call to is subtree
lesser root is not null so we will not return true at first line we will go to the next if
now here the first condition is if data in root and the root this time is 150
because this call is for this left subtree and for this left subtree address of root is 150
data in root is 4 which is lesser than 7 so the first condition is true and we can go to the
second condition which is a recursive call this call will pause and we will go to the next call
here once again the data in node at 180 which is 1 is lesser than 7 so first condition is true and we will
make a recursive call left subtree for node at 180 is null there is no left child so we will return
at first line root is null this time this particular call will simply return true now in this previous
call when root is 180 second condition for if is also true so we will make another call for right
subtree once again address passed will be 0 and we will simply return true and now for this call
is subtree lesser 180 comma 7 all three conditions are true so this guy can also return true and now this
call ISL 150 comma 7 will resume now this guy will make a recursive call for the right subtree and this guy
after everything will also return true now for this call because all three conditions in the if
statement are true this guy will also return true and now is binary search tree function will resume
for this call we have evaluated the first condition we have got true now this guy will make another
call to is subtree greater passing address of right child and value 7 this guy after everything
will return true and now we will have two recursive calls to check if left and right subtrees are
binary search trees or not we will first have a call for the left subtree the execution will go
on like this but i want you to see something in each call to is binary search tree function we are
comparing the data in root with all the elements in left subtree and then all the elements in right
subtree this example tree could be really large then in that case in the first call to is binary
search tree for this complete tree we would recursively traverse this whole left subtree
to see whether all the values in this subtree are less than seven or not and then we will traverse
all nodes in this right subtree to see if values are greater than seven or not and then in next
call to is binary search tree when we would be validating whether this particular subtree
is a BST or not we would recursively traverse this subtree to see if values are lesser than four or not
and this subtree to see if values are greater than four or not so all in all during this whole
process there will be a lot of traversal data in nodes will be read and compared multiple times
if you can see all nodes in this particular subtree will be traversed once in call to
is binary search tree for 200 when we will compare value in these nodes with seven and then these
nodes will once again be traversed in call to is binary search tree for 150 when they will be
compared with four they will be traversed in call to is subtree lesser all in all these two functions
is subtree lesser and is subtree greater are very expensive for each node we are looking at
all nodes in its subtrees there is an efficient solution in which we do not need to compare
data in a node with data in all nodes in its subtrees and let's see what the solution is
what we can do is we can define a permissible range for each node and data in that node must
be in that range we can start at the root node with range minus infinity to infinity because
for the root node there is no upper and lower limit and now as we are traversing we can set a
range for other nodes when we are going left we need to reset the upper bound so for this node at
150 data has to be between minus infinity and seven data in left child cannot be greater than
data in root if we are going right we need to set the lower bound for this node at 300 range
would be seven to infinity seven is not included in the range data has to be strictly greater than
seven for this node at 180 the range will be minus infinity to four for this node with value six
lower bound will be four and upper bound would be seven now my code will go like this my function
is binary search tree will take two more arguments an integer to mark the lower bound or min value
and another integer to mark the upper bound or max value and now instead of checking whether
all the elements in left subtree are lesser than the data in root and all the elements in
right subtree are greater than the data in root or not we will simply check whether data in root
is in this range or not so i'll get rid of these two function calls is subtree lesser and is
subtree greater which are really expensive and i'll add these two conditions data in root must be
greater than min value and data in root must be less than max value these two checks will take
constant time is subtree lesser and is subtree greater functions were not taking constant time
running time for them was proportional to number of nodes in the subtree okay now these two recursive
calls should also have two more arguments for the left child lower bound will not change
upper bound will be the data in current node and for the right child upper bound will not change
and lower bound will be the data in current node this recursion looks good to me we already have
the base case written the only thing is that the caller of is binary search tree function
may only want to pass the address of root node so what we can do is instead of naming this function
is binary search tree we can name this function as a utility function like is BST util and we can
have another function named is binary search tree in which we can take only the address of root
node and this function can call is BST util function passing address of root minimum possible
value in integer variable for minus infinity and maximum possible value in integer variable
for plus infinity INT_MIN and INT_MAX here are macros for minimum and maximum possible values in
int so this is our solution using second approach which is quite efficient in this recursion we
will go to each node once and at each node we will take constant time to see whether data in
that node is in a defined range or not time complexity would be big o of n where n is number
of nodes in the binary tree for the previous algorithm time complexity was big o of n square
one more thing in this code i have not handled the case that binary search tree can have duplicates
i'm saying that elements in left subtree must be strictly lesser and elements in right subtree
must be strictly greater i'll leave it for you to see how you will allow duplicates
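The range-based approach could be sketched in C++ as below. The names `IsBstUtil`, `minValue` and `maxValue` and the `Node` layout are assumptions for illustration. Like the lesson's version, the sketch uses strict comparisons on both sides, so duplicates are rejected. One consequence of using `INT_MIN` and `INT_MAX` as stand-ins for infinity is that a node whose data equals either of them is also rejected.

```cpp
#include <climits>

// Hypothetical node layout: integer data plus two child pointers.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// True if every node in the (sub)tree lies strictly between minValue and maxValue
// and the ordering property holds below it.
bool IsBstUtil(const Node* root, int minValue, int maxValue) {
    if (root == nullptr) return true;  // base case: empty tree or subtree
    return root->data > minValue
        && root->data < maxValue
        // going left: lower bound unchanged, upper bound becomes the current data
        && IsBstUtil(root->left, minValue, root->data)
        // going right: upper bound unchanged, lower bound becomes the current data
        && IsBstUtil(root->right, root->data, maxValue);
}

// Wrapper so that callers only pass the address of the root node.
bool IsBinarySearchTree(const Node* root) {
    return IsBstUtil(root, INT_MIN, INT_MAX);
}
```

Each node is visited once and does a constant amount of work, which is where the big o of n running time comes from.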
there is another solution to this problem you can perform in order traversal of binary tree
and if the tree is binary search tree you would read the data in sorted order
in order traversal of a binary search tree gives a sorted list you can do some hack
while performing in order traversal and check if you are getting the elements in sorted order or
not during the whole traversal you only need to keep track of previously read node and at any time
data in a node that you are reading must be greater than data in previously read node try
implementing this solution it will be interesting okay i'll stop here now in coming lessons we will
discuss some more problems on binary tree thanks for watching
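The in-order idea left as an exercise at the end of this lesson could be sketched as follows. This is one possible way to do it, not the lesson's solution: the helper name `IsBstInorder`, the `prev` reference parameter and the `Node` layout are assumptions. The comparison is strict, so duplicates are rejected.

```cpp
// Hypothetical node layout: integer data plus two child pointers.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// In-order walk that remembers the previously read node in prev.
// prev is null until the first (leftmost) node has been read.
bool IsBstInorder(const Node* root, const Node*& prev) {
    if (root == nullptr) return true;
    if (!IsBstInorder(root->left, prev)) return false;
    // Data being read must be greater than data in the previously read node.
    if (prev != nullptr && root->data <= prev->data) return false;
    prev = root;
    return IsBstInorder(root->right, prev);
}

bool IsBinarySearchTree(const Node* root) {
    const Node* prev = nullptr;
    return IsBstInorder(root, prev);
}
```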
in this lesson we are going to write code to delete a node from binary search tree
in most data structures deletion is tricky in case of binary search trees too it's not so
straightforward so let's first see what all complications we may have while trying to delete
a node from binary search tree i have drawn a binary search tree of integers here as we know
in a binary search tree for each node value of all nodes in its left subtree is lesser
and value of all nodes in its right subtree is greater for example in this tree if i'll pick
this node with value five then we have three and one in its left subtree which are lesser
and we have seven and nine in its right subtree which are greater and you can pick any other
node in the tree and this property will be true else the tree is not a bst now when we need to
delete a node this property must be conserved let's try to delete some nodes from this example
tree and see if we can rearrange things and conserve this property of binary search tree or not
what if i want to delete this node with value 19 to delete a node from tree we need to do two
things we need to remove the reference of the node from its parent so the node is detached
from the tree here we will cut this link we will set right child of this node with value 17 as null
and the second thing that we need to do is reclaim the memory allocated to the node being
deleted that is wipe off the node object from memory this particular node with value 19 that
we are trying to delete here is a leaf node it has no children and even if we take this guy out
by simply cutting this link that is removing its reference from its parent and then wiping it off
from memory there is no problem property of binary search tree that for each node value of nodes
in left should be lesser and value of nodes in right should be greater is conserved so deleting a
leaf node a node with no children is really easy in this tree these four nodes with values
one nine thirteen and nineteen are leaf nodes to delete any of these we just need to cut the link
and wipe off the node that is clear it from memory but what if we want to delete a non-leaf node
what if in this example we want to delete this node with value 15 i can't just cut this link
because if i'll cut this link we will detach not just the node with value 15 but this complete
sub tree we have two more nodes in this sub tree we could have had a lot more we need to make sure
that all other nodes except the node with value 15 that's being deleted remain in the tree so what
do we do now this particular node that we are trying to delete here has two children or two
subtrees i'll come back to case of node with two children later because this is not so easy to crack
what i want to discuss first is the case when the node being deleted would have only one child
if the node being deleted would have only one child like in this example this node with
value seven this guy has only one child this guy has a right child but does not have a left child
for such a node what we can do is we can link its parent to this only child so the child and
everything below the child we could have some more nodes below nine as well will remain attached
to the tree and only the node being deleted will be detached now we are not losing any other node
than the node with value seven this is my tree after the deletion is this still a binary search
tree yes it is only the right subtree of node with value five has changed earlier we had seven and
nine in right subtree of five and now we have nine which is fine what if we were having some more
nodes below nine here in this tree i can have a node in the left of nine and the value in this
node has to be lesser than twelve greater than five greater than seven and lesser than nine
we are left with only one choice we can only have eight here in right we can have something
lesser than twelve and greater than five seven and nine all in all between nine and twelve
okay so if the original tree was this much after deletion this is how my tree will look like
okay so are we good now is the tree still a bst well yes it is when we are setting this node
with value nine as right child of the node with value five we are basically setting this particular
subtree as right subtree of the node with value five now this subtree is already in right of five
so value of all nodes in this subtree is already greater than five and the subtree itself of course
is a binary search tree any subtree in a binary search tree will also be a binary search tree
so even after deletion even after the rearrangement property of the tree that for each node nodes in
left should be lesser and nodes in right should be greater in value is conserved so this is what
we need to do to delete a node with just one child or a node with just one subtree connect
its parent to its only child and then wipe it off from memory there are only two nodes in this tree
that have only one child let's try to delete this other one with value three all we need to do here
is set one as left child of five once again if there were some more nodes below one then also
there would be no issue okay so now we are good for two cases we're good for leaf nodes and we are good
for nodes with just one child and now we should think about the third case what if a node has two
children what should we do in this case let's come back to this node with value 15 that we were
trying to delete earlier with two children we can't do something like connect parent to one
of the children while trying to delete 15 if we will connect 12 to 13 if we will make 13 the right
child of 12 then we will include 13 and anything below 13 that is we will include the left subtree
of 15 but we will lose the right subtree of 15 that is 17 and anything below 17 similarly if
we will make 17 the right child then we will lose the left subtree of 15 that is 13 and anything below
13 actually this case is tricky and before I talk about a possible solution I want to insert some
more nodes here I want to have some more nodes in subtrees of 13 and 17 the reason I'm inserting
some more nodes here is because I want to discuss a generic case and that's why I want these two
subtrees to have more than one node okay coming back when I'm trying to delete this node my intent
basically is to remove this value 15 from the tree my delete function will have signature something
like this it will take pointer or reference to the root node and value to be deleted as argument
so here I'm deleting this particular node because I want to remove 15 from the tree
what I'm going to do now is something with which I can reduce case three to either case one or case
two I'll wipe off 15 from this node and I'll fill in some other value in this node of course I can't
fill in any random value what I'll do is I'll look for the minimum in right subtree of this node
and I'll fill in that value here minimum in right subtree of this node is 17 so I have filled 17 here
we now have two nodes with value 17 but notice that this node has only one child we can delete
this node because we know how to delete a node with one child and once this node is deleted my
tree will be good the final arrangement will be a valid arrangement for my BST but why minimum in
right subtree why not value in any other leaf node or any other node with one child well we
also need to conserve this property that for each node nodes in left should have lesser value
and nodes in right should have greater value for this node if I'm bringing in the minimum from
its right subtree then because I'm bringing in something from its right subtree it will be
greater than the previous value 17 is greater than 15 so all the elements in left of course will be
lesser and because it's the minimum in right subtree all the elements in right of this guy
would either be greater or equal we'll have a duplicate that will be equal once the duplicate is
removed everything else will be fine in a tree or sub tree if a node has minimum value it won't
have a left child because if there is a left child there is something lesser and this is another
property that we are exploiting give this some thought in a tree or sub tree node with minimum
value will not have a left child there may or may not be a right child if we would have a right
child like here we have a right child so here we are reducing case three to case two if there was
no child we would have reduced case three to case one okay so let's get rid of the duplicate
I'll build a link like this and after deletion this is what my tree will look like so this is
what we need to do in case three we need to find the minimum in right subtree of the targeted node
then copy or fill in this value and finally we need to delete the duplicate or the node with
minimum value from right subtree there was another possible approach here and I must talk about it
instead of going for minimum in right we could also go for maximum in left subtree maximum in
left subtree would of course be greater than or equal to all the values in left maximum in left
subtree of node with value 15 is 14 I'm copying 14 here now all the nodes in left are lesser than
or equal to 14 and because we are picking something from left subtree it will still be lesser than
the value being deleted 14 is less than 15 so all the nodes in this right subtree will still be
greater and if we are picking maximum in a tree or subtree then that node will not have a right
child because if we have something in right we have something greater so the value can't be maximum
the node may have a left child in this case node with value 14 doesn't have a left child
so we are basically reducing case three to case one I'll simply get rid of this node
so we are looking good even after deletion in case three we can apply any of these methods
and this is all in the logic part let's now write code for this logic I'll write c++ and we will use
recursion if you're not very comfortable applying recursion on trees then make sure you watch earlier
lessons in this series you can find link to them in description of this video
in my code here I have defined node as a structure with three fields we have one field to store data
and we have two fields that are pointers to node to store addresses of left and right children
and I want to write a function named delete that should take pointer to root node and the data
to be deleted as argument and this function should return pointer to root node because the root may
change after deletion what we're passing to delete function is only a local copy of a root's address
if the address is changing we need to return it back to delete a given value or data we first need
to find it in the tree and once we find the node containing that data we can try to delete it
remember the only identity of tree that we pass to functions is the address of the root node and to
perform any action on the tree we need to start at root so let's first search for the node with
this data first I'll cover a corner case if root is null that is if the tree is empty
we can simply return I can say return root or return null here they will mean the same
because root is null else if the data that we are looking for is less than the data in root
then it's in the left subtree the problem can be reduced to deleting the data from left subtree
we need to go and find the data in left subtree so we can make a recursive call to delete function
passing address of the left child and the data to be deleted now the root of the left subtree
that is the left child of this current node may change after deletion but the good thing is
delete function will return address of the modified root of the left subtree so we can set the return
as left child of the current node now if data that we are trying to delete is greater than
the data in root we need to go and delete the data from right subtree and if the data is neither
greater nor lesser that is if it's equal then we can try deleting the node containing that data
now let's handle the three cases one by one if there is no child we can simply delete that node
what I'll do here is I'll first wipe off the node from memory and this is how I'll do it
what we have in root right now is address of the node to be deleted I'm using delete operator here
and that's used to deallocate memory of an object in heap in c you would use free function
now root is a dangling pointer because the object in heap is deleted but root still has its address
so we can set root as null and now we can return root reference of this node in its parent will
not be fixed here once this recursive call finishes then in one of
these two statements in one of these two else ifs the link will be corrected I hope this is making
sense okay now let's handle other cases if only the left child is null then what I want to do is
I first want to store the address of current node that I'm trying to delete in a temporary
pointer to node and now I want to move the root this pointer named root to the right child
so the right child becomes the root of this sub tree and now we can delete the node
that is being pointed to by temp we will use delete operator in c we would be using free
function and now we can return root similarly if the right child is null I'll first store the
address of current root in a temporary pointer to node then I'll make the left child new root of
the sub tree so we'll move to the left child and then I'll delete the previous root whose address
I have in temp and finally I'll return root actually we need to return root in all cases so I'll remove
this return root statement from all these if and else if blocks and write one return root after everything
let's talk about the third case now in case of two children what we need to do is we need to
search for minimum element in right sub tree of the node that we are trying to delete
let's say this function find min will give me address of the node with minimum value in our tree
or sub tree so I'm calling this function find min and I'm collecting the return in a pointer to
node named temp now I should set the data in current node that I'm trying to delete as
this minimum value and now the problem is getting reduced to deleting this minimum value from the
right sub tree of current node with this much code I think I'm done with delete function
this looks good to me let's quickly run this code on an example tree and see if this works or not
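The delete function described above can be sketched roughly as follows. This is a minimal sketch, not the lesson's exact source: the names Node, NewNode and Insert are assumptions added so the example is self-contained, and FindMin is the helper the lesson mentions.

```cpp
#include <cassert>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Assumed helper: allocate a leaf node on the heap.
Node* NewNode(int data) {
    return new Node{data, nullptr, nullptr};
}

// Assumed helper: plain BST insert, only here so a tree can be built.
Node* Insert(Node* root, int data) {
    if (root == nullptr) return NewNode(data);
    if (data <= root->data) root->left = Insert(root->left, data);
    else root->right = Insert(root->right, data);
    return root;
}

// The leftmost node of a non-empty subtree holds its minimum value.
Node* FindMin(Node* root) {
    assert(root != nullptr);
    while (root->left != nullptr) root = root->left;
    return root;
}

// Deletes one node holding `data` and returns the root of the updated subtree.
// The caller stores the returned pointer, which is how the parent's link is fixed.
Node* Delete(Node* root, int data) {
    if (root == nullptr) return root;
    if (data < root->data) {
        root->left = Delete(root->left, data);
    } else if (data > root->data) {
        root->right = Delete(root->right, data);
    } else {
        if (root->left == nullptr && root->right == nullptr) {
            // Case 1: no child.
            delete root;
            root = nullptr;
        } else if (root->left == nullptr) {
            // Case 2: only a right child.
            Node* temp = root;
            root = root->right;
            delete temp;
        } else if (root->right == nullptr) {
            // Case 2: only a left child.
            Node* temp = root;
            root = root->left;
            delete temp;
        } else {
            // Case 3: two children. Copy the minimum of the right subtree,
            // then delete that minimum from the right subtree.
            Node* temp = FindMin(root->right);
            root->data = temp->data;
            root->right = Delete(root->right, temp->data);
        }
    }
    return root;
}
```

A caller would write root = Delete(root, 15) so that the returned address replaces the old root pointer, just as each recursive call does for its child link.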
I have drawn a binary search tree here let's say these values outside these nodes are addresses of
the nodes now I want to delete number 15 from this tree so I'll make a call to delete function
passing address of the root which is 200 and 15 the value to be deleted in delete function for
this particular call control will come to this line a recursive call will be made execution of
this call delete 200 comma 15 will pause and it will resume only after this function below delete
350 comma 15 returns now for this call below we will go inside the third else in case 3
here we will find the node with minimum value in right sub tree the value is 17 and the
address is 400 first we will set the data in node at 350 as 17 and now we are making a recursive
call to delete 17 from right sub tree of 350 we have only one node in right sub tree of 350
here we have case 1 in this call we will simply delete the node at 400 and return null remember
root will be returned in all calls in the end now delete 350 comma 15 will resume and in this
resumed call we will set the address of right child of node at 350 as null as you can see the link
in parent is being corrected when the recursion is unfolding and the function call corresponding to
the parent is resuming and now this guy can return and now in this call we will resume at this line
so right child of node at 200 will be set as 350 it's already 350 but it will be written again
and now this call can also finish so I hope you got some sense of how this recursion is working
you can find link to all the source code and code to test the delete function
in description of this video this is it for this lesson thanks for watching
in this lesson we are going to solve one other interesting problem on binary search tree
and the problem is given a node in a binary search tree we need to find its in-order
successor that is the node that would come immediately after the given node in in-order
traversal of the binary search tree as we know in in-order traversal of a binary tree we first
visit the left subtree then the root and then the right subtree left and right subtrees are
visited recursively in same manner so for each node we first visit its left subtree
then the node itself and then its right subtree we have already discussed in-order traversal
in detail in a previous lesson in the series you can check the description of this video for
a link to it in-order traversal implementation will basically be a recursive function something like what i'm
showing on the right here there are two recursive calls in this function one to visit the left subtree
and another to visit the right subtree time complexity of in-order traversal is big o of n
where n is number of nodes in the tree we visit each node exactly once so time taken is proportional
to number of nodes in the tree i have drawn a binary search tree of integers here binary search
tree as we know is a binary tree in which for each node value of nodes in left is lesser and value
of nodes in right is greater let's quickly see what will be the in-order traversal for this
binary search tree we'll start at root of the tree now for any node we first need to visit all
nodes in its left and then only we can visit that node so we will have to go left basically we will
make a recursive call to go to left child of this node for this guy once again we have something
in left so we will make another recursive call and go to its left child now we are at this node
with value eight and we will have to go left one more time and now for this node with value six
which is a leaf node we have nothing in left so we can simply say that its left subtree is done
and hence we can visit this guy visiting for me is reading the data in that node
i'll write the data here and now for this node there's nothing in right as well so
we can simply say that its right is also done and now we're completely done for this guy
so recursive call corresponding to this node will finish and we will go back to call corresponding
to its parent if we will come back to a node from its left child then it will be unvisited
because we can't visit a node until its left is done so when we are coming back to eight eight
is unvisited so we can simply visit this node that is read the data in this node
when i'll visit a node i'll paint it in yellow and now there's nothing in right of this node so
we can simply say that right is done now we are done with this node so call corresponding to this
node will finish and we will go back to its parent once again we're coming back to the parent from
left so the parent that is this node with value 10 is unvisited if we would come back to a node from
right then it would already be visited so i'm visiting 10 and now we can go to right of 10
so far we have visited three nodes we first visited node with value six and then we visited
node with value eight so eight is successor of six and then 10 is successor of eight
now let's see what will be the successor of 10 for nodes with values six and eight there was
nothing in right so we were unwinding and going to the parent but for a node if there would be
something in right that is if there would be a right sub tree then its successor would definitely
be in its right sub tree because after visiting that node we will go right now at this stage we are
at this node with value 12 for this guy we will first go left and now we are at node with value 11
which is a leaf node there's nothing in left so we can simply say that left is done and we can
print the data that is visit this node so in order successor of 10 is 11 now for node with value 11
there's nothing in right so we will go back to its parent and now we can visit this guy so after 11
we have 12 there's nothing in right of 12 so call for this guy will finish and we will go to its
parent now we're coming back to 10 again but this time from right so this guy is already visited
so we need not do anything we can simply go to its parent and now we are at this node with value 15
we are coming from left this guy is unvisited so we can visit it and now we can go to its right
we will go on like this successor of 15 would be 16 and after 16 we will print 17
then after 17 we will print 20 then 25 and the last element would be 27
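The traversal simulated above can be sketched as a short recursive function. This is only an illustration: Insert and BuildExampleTree are assumed helpers, and the shape of the right subtree of 15 is a guess, since the transcript only gives the order in which those values are printed.

```cpp
#include <cassert>
#include <vector>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Assumed helper: plain BST insert used to build the example tree.
Node* Insert(Node* root, int data) {
    if (root == nullptr) return new Node{data, nullptr, nullptr};
    if (data <= root->data) root->left = Insert(root->left, data);
    else root->right = Insert(root->right, data);
    return root;
}

// Left subtree first, then the node itself, then the right subtree.
// "Visiting" a node here means appending its data to `out`.
void Inorder(const Node* root, std::vector<int>& out) {
    if (root == nullptr) return;
    Inorder(root->left, out);
    out.push_back(root->data);
    Inorder(root->right, out);
}

// Builds a tree with the values from the lesson. The left part (15, 10, 8, 6, 12, 11)
// follows the walkthrough; the arrangement of 20, 17, 25, 16, 27 is assumed.
Node* BuildExampleTree() {
    Node* root = nullptr;
    int values[] = {15, 10, 20, 8, 12, 17, 25, 6, 11, 16, 27};
    for (int v : values) root = Insert(root, v);
    return root;
}
```

Calling Inorder on this tree fills the vector with 6 8 10 11 12 15 16 17 20 25 27, the same order as in the simulation.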
so this is in order traversal of this binary search tree notice that we have printed the
integers in sorted order when we perform in order traversal on a binary search tree then elements
are visited in sorted order now the problem that we want to solve is given a value in the tree
we want to find its in order successor in a binary search tree it would be the next higher value in
the tree but what's the big deal here can't we just perform in order traversal and while performing
the traversal figure out the successor well we can do so but it will be expensive running time of
in order is big o of n and we may want to do better finding next and previous element in some data
could be a frequently performed operation and good thing about binary search tree is that
frequently performed operations like insertion, deletion and search happen in big o of h where
h is height of the tree so it would be good if we are able to find successor and predecessor
in big o of h we always try to keep a tree balanced to minimize its height height of a balanced
binary tree is of the order of log n to the base 2 and big o of log n running time for any operation is almost
the best running time that we can have so can we find in order successor in big o of h i have
redrawn the example tree here let's see what we can do in various cases what node would we
visit after this node with value 10 can we deduce this logically well if you remember the simulation
of in order traversal that we had done earlier then if we have already visited this node
then we are done with its left subtree and we have read the data in this node and we need to go
right now in the right subtree we will have to go left as long as it's possible to go
and if we can't go left anymore like here there is nothing in left of this node with value 11
then this is the node that i'm visiting next so for a node if there is a right subtree
then in order successor would be the leftmost node in its right subtree in a bst it would be
the node with minimum value in its right subtree i would say this is case one in this case all we
need to do is we need to go as left as possible in right subtree in a bst it will also mean finding
the minimum in right subtree leftmost node will also be the minimum in the subtree
now this is one case our node here had a right subtree what would be the in order successor if there
would be no right subtree what node would we visit after this node with value 8 this guy does not
have a right subtree if we have already visited this guy then we have visited its left and this
node itself and there is nothing in right so we can say that right is also visited but we have not
found the successor yet now where do we go from here well if you remember the simulation that we
had done earlier we need to go to the parent of this node and if we are going to the parent from
left which is the case here then the parent would be unvisited for this node with value 10 we just
finished its left subtree and we are coming back so now we can visit this node so this is my successor
let's now pick another node with no right subtree what would be in order successor of this node
with value 12 what node would we visit next now here once again we do not have a right subtree
for this node so we must go back to its parent and see if it's unvisited but if we are going
to the parent from right if the node that we just visited is a right child which is the case here
then the parent would already be visited because we are coming back after visiting its right subtree
this node must have been visited before going right so what should we do now the recursion will
roll back further and we need to go to parent of 10 and now we are going to 15 from left so this
guy is unvisited so we can visit this node and this is my successor if the node does not have a
right subtree we need to go to the nearest ancestor for which given node would be in its left subtree
here for 12 we first went to 10 but 12 is in right subtree of 10 so we went to the next ancestor
15 and 12 is in left of 15 so this is the nearest ancestor for which 12 is in left
and hence this is my in order successor this algorithm works fine but there is an issue
how do we go from a node to its parent well we can design our tree such that node can have
reference to its parent so far in most lessons we have defined node as a structure with three
fields something like this this is how we would define node in c or c plus plus we have one field
to store data and we have two pointers to node to store reference or addresses of left and right
children often it makes a lot of sense to have one more field to store the address of parent we
can design a tree like this and then we will not have problem walking the tree up using parent link
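If each node does store its parent, the two cases can be sketched as below. This is a hedged illustration, not code from the lesson: the type PNode and the functions Insert, Find and SuccessorWithParent are hypothetical names, and the tree is assumed to hold distinct values.

```cpp
#include <cassert>

// A node with one extra field for the address of its parent.
struct PNode {
    int data;
    PNode* left;
    PNode* right;
    PNode* parent;
};

// Assumed helper: iterative BST insert that also sets the parent link.
PNode* Insert(PNode* root, int data) {
    PNode* node = new PNode{data, nullptr, nullptr, nullptr};
    if (root == nullptr) return node;
    PNode* cur = root;
    while (true) {
        PNode*& next = (data <= cur->data) ? cur->left : cur->right;
        if (next == nullptr) {
            next = node;
            node->parent = cur;
            return root;
        }
        cur = next;
    }
}

// Assumed helper: standard BST search.
PNode* Find(PNode* root, int data) {
    while (root != nullptr && root->data != data) {
        root = (data < root->data) ? root->left : root->right;
    }
    return root;
}

PNode* SuccessorWithParent(PNode* node) {
    if (node == nullptr) return nullptr;
    // Case 1: there is a right subtree, so take its leftmost node.
    if (node->right != nullptr) {
        PNode* temp = node->right;
        while (temp->left != nullptr) temp = temp->left;
        return temp;
    }
    // Case 2: climb while we arrive at the parent from its right child.
    PNode* child = node;
    PNode* ancestor = node->parent;
    while (ancestor != nullptr && child == ancestor->right) {
        child = ancestor;
        ancestor = ancestor->parent;
    }
    return ancestor;  // nullptr when the node holds the maximum value
}
```

The climb stops at the first ancestor reached from its left child, which is the nearest ancestor that has the given node in its left subtree.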
we can easily go to the ancestors but what if there is no link to parent in this case what we
can do is we can start at root and walk the tree from root to the given node in a bst this is really
easy for 12 we will start at root 12 is lesser than value in root so we need to go left and now
we are at 10 now 12 is greater than 10 so we need to go right and now we are at 12 if we will walk
the tree from root to the given node we will go through all the ancestors of the given node
in order successor would be the deepest node or the deepest ancestor in this path
for which given node would be in left subtree 12 has only two ancestors we have 10 but 12 is in
right of 10 and then we have 15 and 12 is in left of 15 so 15 is my successor now let's use this
technique to find successor of 6 we will first walk down from root to this node 6 is in left for
all the ancestors but the deepest ancestor for which 6 is in left is this node with value 8 so this is
my successor remember we need to look at ancestors only if there is no right subtree
for 6 there is no right subtree okay so the algorithm looks good let's now write code for this
in my c++ code here i'm going to write a function named get successor that will take address of
root node and address of another node for which we need to find the successor and this function
will return address of the successor node we could design this function differently instead of taking
pointer to the node for which we want to find the successor as argument we could just take
the data as argument and for this data for this element we can find the successor node and return
its address and that's why the return type here is struct node asterisk because we will be
returning address in a pointer or what we can also do is we can return the element itself the
successor element itself we can implement with any of these signatures let's implement this one
we will pass the data in current node and we will return back the address of the successor
now the first thing that we need to do is we need to search the node with this data
i'm going to make call to a function named find that will take address of the root node and the data
and will return me pointer to the node with this data if this function returns me null that is if
the data is not found in the tree we can simply return null else we have the address of the current
node in this pointer to node that we have named current now in a bst this search operation will
cost us big o of h where h is height of the tree search in a bst is not very expensive
we could have avoided this search if we would have passed address of the current node instead of
passing the data as this second argument but let's go with this let's now find the successor of this
node if this node has a right sub tree that is if the right sub tree is not null we need to go to the
leftmost node in the right sub tree i have declared a temporary pointer to node here and initially
i've set it to current dot right and with this while loop i'll go to the leftmost node
while there is something in the left keep going and finally when i'll come out of this loop
i'll have address of leftmost node in the right sub tree and i can return this address
this particular node will also be the node with minimum value in right sub tree i'll move this
code in another function i have written this function named find min that will return node with
minimum value in a tree or sub tree in get successor function i'll simply say return find min and i'll
pass the address of right child of current node so basically i'm passing the right sub tree here
okay now let's talk about case two if there is no right sub tree what we need to do is we need to
walk the tree from root till current node and we need to find the deepest ancestor for which
current node will be in its left subtree what i'm going to do here is i'm going to declare
a pointer to node named successor and initially i'll set it as null and i'll have another
pointer to node named ancestor and initially i'll set this as root and with this while loop we will
walk the tree till we have not reached the current node to walk the tree we will use the property of
binary search tree that for each node value of nodes in left is lesser and value of nodes in right
is greater if data in current node is less than the data in ancestor then first of all this ancestor
may be my in order successor because the current node is in its left so what we can do is we can
set this guy as successor and we can go left while traversing if we will find a deeper node
with this property that current node will be in its left sub tree then successor will be updated
else if the current node lies in right we simply need to move right when we'll come out of this
while loop successor will either be null or it will be the address of some node not all nodes in the
tree will have a successor node with maximum value will not have a successor after coming out of this
while loop we can return the successor so this is my get successor function and i think this
should work you can find link to complete source code in description of this video overall time
complexity of this function will be big o of h and this is what we wanted we wanted to find
successor in big o of h here we are already performing the search in big o of h finding
minimum will also take big o of h and walking the tree from root to a node in bst will also take
big o of h so overall this is big o of h if you have understood this code this logic then it should
be very easy for you to write a function to find in order predecessor i encourage you to write it
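The get successor function walked through above can be sketched as follows. It is a sketch under a few assumptions: the spelling GetSuccessor, the Insert helper and the exact body of Find are guesses rather than the lesson's actual source, and the tree is assumed to hold distinct values.

```cpp
#include <cassert>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// Assumed helper: plain BST insert used to build a test tree.
Node* Insert(Node* root, int data) {
    if (root == nullptr) return new Node{data, nullptr, nullptr};
    if (data <= root->data) root->left = Insert(root->left, data);
    else root->right = Insert(root->right, data);
    return root;
}

// Standard BST search, O(h).
Node* Find(Node* root, int data) {
    while (root != nullptr && root->data != data) {
        root = (data < root->data) ? root->left : root->right;
    }
    return root;
}

// Leftmost node of a non-empty subtree, O(h).
Node* FindMin(Node* root) {
    assert(root != nullptr);
    while (root->left != nullptr) root = root->left;
    return root;
}

// Returns the in-order successor of the node holding `data`, or nullptr
// if the data is absent or is the maximum value in the tree.
Node* GetSuccessor(Node* root, int data) {
    Node* current = Find(root, data);
    if (current == nullptr) return nullptr;
    // Case 1: the node has a right subtree.
    if (current->right != nullptr) return FindMin(current->right);
    // Case 2: no right subtree. Walk from the root and remember the deepest
    // ancestor that has the current node in its left subtree.
    Node* successor = nullptr;
    Node* ancestor = root;
    while (ancestor != current) {
        if (current->data < ancestor->data) {
            successor = ancestor;
            ancestor = ancestor->left;
        } else {
            ancestor = ancestor->right;
        }
    }
    return successor;
}
```

Each of the three steps (search, finding the minimum, walking from the root) follows a single root-to-leaf path, so the whole function stays within O(h).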
i'll stop here now in coming lessons we will solve some more interesting problems on binary
trees and binary search trees thanks for watching
hello everyone so far in this series on data
structures we have talked about some of the linear data structures like array linked list stack
and queue in all these structures data is arranged in a linear or sequential manner so we can call
them linear data structures and we have also talked about tree which is a non-linear data structure
tree is a hierarchical structure now as we understand data structures are ways to store
and organize data and for different kinds of data we use different kinds of data structures
in this lesson we are going to introduce you to another non-linear data structure
that has got its application in a wide number of scenarios in computer science it is used to model
and represent a variety of systems and this data structure is graph when we study data structures
we often first study them as mathematical or logical models here also we will first study graph
as a mathematical or logical model and we will go into implementation details later okay so let's
get started a graph just like a tree is a collection of objects or entities that we call
nodes or vertices connected to each other through a set of edges but in a tree connections are bound
to be in a certain way in a tree there are rules dictating the connection among the nodes
in a tree with n nodes we must have exactly n minus one edges one edge for each parent child
relationship as we know an edge in a tree is for a parent child relationship and all nodes in a tree
except the root node would have a parent would have exactly one parent and that's why if there
are n nodes there must be exactly n minus one edges in a tree all nodes must be reachable from
the root and there must be exactly one possible path from root to a node now in a graph there are
no rules dictating the connection among the nodes a graph contains a set of nodes and a set of edges
and edges can be connecting nodes in any possible way tree is only a special kind of graph now
graph as a concept has been studied extensively in mathematics if you have taken a course on discrete
mathematics then you must be knowing about graphs already in computer science we basically study
and implement the same concept of graph from mathematics the study of graphs is often referred
to as graph theory in pure mathematical terms we can define graph something like this a graph G
is an ordered pair of a set v of vertices and a set e of edges now i'm using some mathematical
jargon here an ordered pair is just a pair of mathematical objects in which the order of objects
in the pair matters this is how we write and represent an ordered pair objects separated by
comma put within parenthesis now because the order here matters we can say that v is the first object
in the pair and e is the second object an ordered pair (a, b) is not equal to (b, a) unless a and b are
equal in our definition of graph here first object in the pair must always be a set of vertices
and the second object must be a set of edges that's why we are calling the pair an ordered pair
we also have concept of unordered pair an unordered pair is simply a set of two elements order is
not important here we write an unordered pair using curly brackets or braces because the order
is not important here unordered pair {a, b} is equal to {b, a} it doesn't matter which object is first and
which object is second okay coming back so a graph is an ordered pair of a set of vertices
and a set of edges and G = (V, E) is a formal mathematical notation that we use to define a
graph now i have a graph drawn here on the right this graph has eight vertices and ten edges what
i want to do is i want to give some names to these vertices because each node in a graph must have
some identification it can be a name or it can be an index i'm naming these vertices as v1 v2 v3 v4
v5 and so on and this naming is not indicative of any order there is no first second and third
node here i could give any name to any node so my set of vertices here is this we have eight
elements in the set v1 v2 v3 v4 v5 v6 v7 and v8 so this is my set of vertices for this
graph now what's my set of edges to answer this we first need to know how to represent an edge
an edge is uniquely identified by its two end points so we can just write the names of the two
end points of an edge as a pair and it can be a representation for the edge but edges can
be of two types we can have a directed edge in which connection is one way or we can have an
undirected edge in which connection is two way in this example graph that i'm showing here edges
are undirected but if you remember the tree that i had shown earlier then we had directed edges in
that tree with this directed edge that i'm showing you here we are saying that there is a link or
path from vertex u to v but we cannot assume a path from v to u this connection is one way for a
directed edge one of the end points would be the origin and the other end point would be the
destination and we draw the edge with an arrowhead pointing towards the destination
for our edge here origin is u and destination is v a directed edge can be represented as an
ordered pair first element in the pair can be the origin and second element can be the destination
so with this directed edge represented as ordered pair (u, v) we have a path from u to v
if we want a path from v to u we need to draw another directed edge here with v as origin
and u as destination and this edge can be represented as ordered pair (v, u) the upper one here is (u, v) and
the lower one is (v, u) and they are not same now if the edge is undirected the connection is two way
an undirected edge can be represented as an unordered pair here because the edge is bidirectional
origin and destination are not fixed we only need to know what two end points are being connected
by the edge so now that we know how to represent edges we can write the set of edges for this
example graph here we have an undirected edge between v1 and v2 then we have one between v1
and v3 then we have v1 v4 this is really simple i'll just go ahead and write all of them
so this is my set of edges typically in a graph all edges would either be directed
or undirected it's possible for a graph to have both directed and undirected edges
but we are not going to study such graphs we are only going to study graphs in which
all edges would either be directed or undirected a graph with all directed edges is called a
directed graph or digraph and a graph with all undirected edges is called an undirected graph
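As a rough illustration of the definition, a graph can be held as a list of vertices plus a list of edges. This is a sketch only: the struct Graph, its field names and the function HasEdge are assumptions made for this example, and the lesson discusses real implementations later.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// G = (V, E): a collection of vertices and a collection of edges.
// An edge is a pair of indices into `vertices`.
struct Graph {
    std::vector<std::string> vertices;       // V
    std::vector<std::pair<int, int>> edges;  // E
    bool directed;  // true: pairs are ordered (origin, destination)
                    // false: pairs are treated as unordered
};

// Returns true if there is a connection from vertex u to vertex v.
bool HasEdge(const Graph& g, int u, int v) {
    const int n = static_cast<int>(g.vertices.size());
    assert(u >= 0 && u < n && v >= 0 && v < n);
    for (const auto& e : g.edges) {
        if (e.first == u && e.second == v) return true;
        // In an undirected graph {u, v} and {v, u} are the same edge.
        if (!g.directed && e.first == v && e.second == u) return true;
    }
    return false;
}
```

For example, with vertices v1 to v4 and the three edges named above (v1 with v2, v3 and v4), HasEdge from v2 to v1 is true when the graph is undirected and false when the same pairs are read as directed edges.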
there is no special name for an undirected graph usually if the graph is directed
we explicitly say that it's a directed graph or digraph so these are two types of graph directed
graph or digraph in which edges are unidirectional or ordered pairs and undirected
graph in which edges are bi-directional or unordered pairs now many real world systems and problems
can be modeled using a graph graphs can be used to represent any collection of objects
having some kind of pairwise relationship let's have a look at some of the interesting examples
a social network like facebook can be represented as an undirected graph
a user would be a node in the graph and if two users are friends there would be an edge connecting
them a real social network would have millions and billions of nodes i can show only few in my
diagram here because i'm short of space now social network is an undirected graph because
friendship is a mutual relationship if i'm your friend you are my friend too so connections have
to be two-way now once a system is modeled as a graph a lot of problems can easily be solved
by applying standard algorithms in graph theory like here in this social network let's say we want
to do something like suggest friends to a user let's say we want to suggest some connections to
rama one possible approach to do so can be suggesting friends of friends who are not connected
already rama has three friends ela bob and kt and friends of these three that are not connected to
rama already can be suggested there is no friend of ela which is not connected to rama already
bob however has three friends storm sam and lee that are not friends with rama so they can
be suggested and kt has two friends lee and swati that are not connected to rama we have
counted lee already so in all we can suggest these four users to rama now even though we
described this problem in context of a social network this is a standard graph problem the problem
here in pure graph terms is finding all nodes having length of shortest path from a given node
equal to two standard algorithms can be applied to solve this problem we'll talk about concepts
like path in a graph in some time for now just know that the problem that we just described in
context of a social network is a standard graph problem okay so a social network like facebook
is an undirected graph now let's have a look at another example
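The friend suggestion idea above, finding all nodes whose shortest path from a given node has length two, can be sketched with a breadth-first search. This is one possible approach, not necessarily the one the later lessons use; FriendGraph, AddFriendship and SuggestFriends are hypothetical names, and only the friendships stated in the lesson are used.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Undirected graph: each user maps to the set of that user's friends.
using FriendGraph = std::map<std::string, std::set<std::string>>;

// Friendship is mutual, so the edge is stored in both directions.
void AddFriendship(FriendGraph& g, const std::string& a, const std::string& b) {
    g[a].insert(b);
    g[b].insert(a);
}

// Breadth-first search from `user`, collecting every node whose shortest
// path from `user` has exactly two edges (friends of friends).
std::vector<std::string> SuggestFriends(const FriendGraph& g, const std::string& user) {
    std::map<std::string, int> dist;
    std::queue<std::string> pending;
    std::vector<std::string> suggestions;
    dist[user] = 0;
    pending.push(user);
    while (!pending.empty()) {
        std::string cur = pending.front();
        pending.pop();
        if (dist[cur] == 2) {
            suggestions.push_back(cur);
            continue;  // no need to look further than distance two
        }
        auto it = g.find(cur);
        if (it == g.end()) continue;
        for (const std::string& next : it->second) {
            if (dist.count(next) == 0) {
                dist[next] = dist[cur] + 1;
                pending.push(next);
            }
        }
    }
    return suggestions;
}
```

With rama connected to ela, bob and kt, bob connected to storm, sam and lee, and kt connected to lee and swati, the function returns the four users storm, sam, lee and swati, each counted once.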
interlinked web pages on the internet or the world wide web can be represented as a directed
graph a web page that would have a unique address or url would be a node in the graph and we can
have a directed edge if a page contains link to another page now once again there are billions of
pages on the web but i can show only few here the edges in this graph are directed because
the relationship is not mutual this time if page a has a link to page b then it's not necessary that
page b will also have a link to page a let's say one of the pages on mycodeschool.com has a
tutorial on graph and on this page i have put a link to wikipedia article on graph let's assume
that in this example graph that i'm showing you here page p is my mycodeschool tutorial on graph
with this address or url mycodeschool.com/videos/graph and let's say page q is the wikipedia
article on graph with this url wikipedia.org/wiki/graph now on my page that is page p i have put a link to
wikipedia page on graph if you are on page p you can click on this link and go to page q
but wikipedia has not reciprocated my favor by putting a link back to my page
so if you are on page q you cannot click on a link and come to page p connection here is one way
and that's why we have drawn a directed edge here okay now once again if we are able to represent
web as a directed graph we can apply standard graph theory algorithms to solve problems and
perform tasks one of the tasks that search engines like google perform very regularly
is web crawling search engines use a program called web crawler that systematically
browses the world wide web to collect and store data about web pages search engines can then
use this data to provide quick and accurate results against search queries now even though
in this context we are using a nice and heavy term like web crawling web crawling is basically
graph traversal or in simpler words act of visiting all nodes in a graph and no prizes for
guessing that there are standard algorithms for graph traversal we'll be studying graph traversal
algorithms in later lessons okay now the next thing that i want to talk about is concept of a
weighted graph sometimes in a graph all connections cannot be treated as equal some connections can
be preferable to others like for example we can represent intercity road network that is the
network of highways and freeways between cities as an undirected graph i'm assuming that all highways
would be bidirectional intra city road network that is road network within a city would definitely
have one way roads and so intra city road network must be represented as a directed graph but
intercity road network in my opinion can be represented as an undirected graph now clearly
we cannot treat all connections as equal here roads would be of different lengths and to perform a
lot of tasks to solve a lot of problems we need to take length of roads into account in such cases
we associate some weight or cost with every edge we label the edges with their weights in this case
weight can be length of the roads so what i'll do here is i'll just label these edges with some
values for their lengths and let's say these values are in kilometers and now edges in this
graph are weighted and this graph can be called a weighted graph let's say in this graph we want
to pick the best route from city A to city D have a look at these four possible routes i'm showing
them in different colors now if i would treat all edges as equal then i would say that the green
route through B and C and the red route through E and F are equally good both these paths have
three edges and this yellow route through E is the best because we have only two edges in this path
but with different weights assigned to the connections i need to add up weights of edges in a path
to calculate total cost when i'm taking weight into account shortest route is through B and C
connections have different weights and this is really important here in this graph
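To make the comparison of routes concrete, here is a small sketch that adds up edge weights along a path. The names WeightedGraph, AddRoad and PathCost are hypothetical, and any lengths fed to it in the usage note are made up for illustration, because the values from the lesson's diagram are not in the transcript.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Weighted undirected graph: each (from, to) pair maps to a length in km.
using WeightedGraph = std::map<std::pair<char, char>, int>;

// Roads between cities are assumed bidirectional, so store both directions.
void AddRoad(WeightedGraph& g, char a, char b, int km) {
    g[std::make_pair(a, b)] = km;
    g[std::make_pair(b, a)] = km;
}

// Total cost of a route is the sum of the weights of its edges.
// Returns -1 if two consecutive cities on the route are not connected.
int PathCost(const WeightedGraph& g, const std::vector<char>& path) {
    int total = 0;
    for (std::size_t i = 0; i + 1 < path.size(); ++i) {
        auto it = g.find(std::make_pair(path[i], path[i + 1]));
        if (it == g.end()) return -1;
        total += it->second;
    }
    return total;
}
```

With hypothetical lengths A-B 3, B-C 2, C-D 4, A-E 7, E-D 8, E-F 5 and F-D 6, the route through B and C costs 9, the route through E costs 15 and the route through E and F costs 18, so the route with the fewest edges is not the shortest one.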
actually we can look at all the graphs as weighted graphs an unweighted graph can basically be
seen as a weighted graph in which weight of all the edges is same and typically we assume the weight
as one okay so we have represented intercity road network as a weighted undirected graph
social network was an unweighted undirected graph and worldwide web was an unweighted directed
graph and this one is a weighted undirected graph now this was intercity road network i think
intra city road network that is road network within a city can be modeled as a weighted directed
graph because in a city there would be some one ways intersections in intra city road network
would be nodes and road segments would be our edges and by the way we can also draw an undirected
graph as directed it's just that for each undirected edge we'll have two directed edges
we may not be able to redraw a directed graph as undirected but we can always redraw an undirected
graph as directed okay i'll stop here now this much is good for an introductory lesson
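Redrawing an undirected graph as a directed one can be sketched as a simple edge-list conversion. Edge and ToDirected are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

// Each undirected edge {u, v} becomes two directed edges (u, v) and (v, u).
// An edge whose two end points are the same needs only one directed edge.
std::vector<Edge> ToDirected(const std::vector<Edge>& undirected) {
    std::vector<Edge> directed;
    directed.reserve(undirected.size() * 2);
    for (const Edge& e : undirected) {
        directed.push_back(e);
        if (e.first != e.second) {
            directed.push_back(Edge(e.second, e.first));
        }
    }
    return directed;
}
```

The reverse conversion is not always possible, because a lone directed edge (u, v) has no matching (v, u) to merge with.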
in next lesson we'll talk about some more properties of graph this is it for this lesson
thanks for watching
in our previous lesson we introduced you to graphs we defined graph as a
mathematical or logical model and talked about some of the properties and applications of graph
now in this lesson we will discuss some more properties of graph but first i want to do a
quick recap of what we have discussed in our previous lesson a graph can be defined as an
ordered pair of a set of vertices and a set of edges we use this formal mathematical notation
G = (V, E) to define a graph here v is set of vertices and e is set of edges ordered pair is
just a pair of mathematical objects in which order of objects in the pair matters it matters
which element is first and which element is second in the pair now as we know to denote number of
elements in a set that we also call cardinality of a set we use the same notation that we use
for modulus or absolute value so |V| and |E| is how we can denote number of vertices and number of edges
in a graph number of vertices would be number of elements in set v and number of edges would be
number of elements in set e moving forward this is how i'm going to denote number of vertices
and number of edges in all my explanations now as we had discussed earlier edges in a graph
can either be directed that is one way connections or undirected that is two way connections
a graph with only directed edges is called a directed graph or digraph and a graph with only
undirected edges is called an undirected graph now sometimes all connections in a graph cannot
be treated as equal so we label edges with some weight or cost like what i'm showing here
and a graph in which some value is associated to connections as cost or weight is called a weighted
graph a graph is unweighted if there is no cost distinction among edges okay now we can also have
some special kind of edges in a graph these edges complicate algorithms and make working
with graphs difficult but i'm going to talk about them anyway an edge is called a self loop or
self edge if it involves only one vertex if both end points of an edge are same then it's called
a self loop we can have a self loop in both directed and undirected graphs but the question is
why would we ever have a self loop in a graph well sometimes if edges are depicting some
relationship or connection that's possible with the same node as origin as well as destination
then we can have a self loop for example as we had discussed in our previous lesson
interlinked web pages on the internet or the worldwide web can be presented as a directed graph
a page with a unique URL can be a node in the graph and we can have a directed edge if a page
contains link to another page now we can have a self loop in this graph because it's very much
possible for a web page to have a link to itself have a look at this web page mycodeschool.com/videos
in the header we have links for workouts page problems page and videos page
right now i'm already on videos page but i can still click on videos link and all that will
happen with the click is a refresh because i'm already on videos page my origin and
destination are same here so if i'm representing worldwide web as a directed graph the way we just
discussed then we have a self loop here now the next special type of edge that i want to talk about
is multi edge an edge is called a multi edge if it occurs more than once in a graph once again
we can have a multi edge in both directed and undirected graphs first multi edge that i'm showing you
here is undirected and the second one is directed now once again the question why should we ever
have a multi edge well let's say we are representing flight network between cities as a graph a city
would be a node and we can have an edge if there is a direct flight connection between any two cities
but then there can be multiple flights between a pair of cities these flights would have different
names and may have different costs if i want to keep the information about all the flights
in my graph i can draw multi edges i can draw one directed edge for each flight and then i can
label an edge with its cost or any other property i just labeled edges here with some random flight
numbers now as we were saying earlier self loops and multi edges often complicate working with
graphs their presence means we need to take extra care while solving problems if a graph contains
no self loop or multi edge it's called a simple graph in our lessons we will mostly be dealing
with simple graphs now i want you to answer a very simple question given number of vertices
in a simple graph that is a graph with no self loop or multi edge what would be maximum possible
number of edges well let's see let's say we want to draw a directed graph with four vertices
i have drawn four vertices here i'll name these vertices v1 v2 v3 and v4 so this is my set of vertices
number of elements in set v is four now it's perfectly fine if i choose not to draw any edge here
this will still be a graph set of edges can be empty nodes can be totally disconnected
so minimum possible number of edges in a graph is zero now if this is a directed graph
what do you think can be maximum number of edges here well each node can have directed edges to
all other nodes in this figure here each node can have directed edges to three other nodes
we have four nodes in total so maximum possible number of edges here is four into three that is
twelve i have shown edges originating from a vertex in same color here this is the maximum
that we can draw if there is no self loop or multi edge in general if there are n vertices
then maximum number of edges in a directed graph would be n(n-1) so in a simple
directed graph number of edges would be in this range zero to n(n-1) now what do you
think would be the maximum for an undirected graph in an undirected graph we can have only one
bi-directional edge between a pair of nodes we can't have two edges in different directions
so here the maximum would be half of the maximum for directed so if the graph is simple and undirected
number of edges would be in the range zero to n(n-1)/2 remember this is true only
if there is no self loop or multi edge now if you can see number of edges in a graph can be
really really large compared to number of vertices for example if number of vertices in a directed
graph is equal to ten maximum number of edges would be ninety if number of vertices is hundred
maximum number of edges would be ninety nine hundred maximum number of edges would be close
to square of number of vertices a graph is called dense if number of edges in the graph is close to
maximum possible number of edges that is if the number of edges is of the order of square of
number of vertices and a graph is called sparse if the number of edges is really less typically
close to number of vertices and not more than that there is no defined boundary for what can
be called dense and what can be called sparse it all depends on context but this is an important
classification while working with graphs a lot of decisions are made based on whether the graph is
dense or sparse for example we typically choose a different kind of storage structure in computer's
memory for a dense graph we typically store a dense graph in something called adjacency matrix
and for a sparse graph we typically use something called adjacency list
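
The edge-count limits for a simple graph can be checked with a few lines of Python. This is only a sketch: `max_edges` and `density` are names invented here, and the ratio returned by `density` is just one convenient way to express how close a graph is to its maximum. The lesson itself draws no fixed boundary between dense and sparse.

```python
def max_edges(n, directed=True):
    """Maximum number of edges in a simple graph with n vertices."""
    full = n * (n - 1)                 # every vertex to every other vertex
    return full if directed else full // 2


def density(num_vertices, num_edges, directed=True):
    """Fraction of the maximum possible edges that are present (0.0 to 1.0)."""
    limit = max_edges(num_vertices, directed)
    return num_edges / limit if limit else 0.0


four_directed = max_edges(4)                        # the four-vertex example
four_undirected = max_edges(4, directed=False)
```

A value of `density` near 1 corresponds to what the lesson calls dense, and a value near 0 to sparse.
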
i'll be talking about adjacency matrix and adjacency list in next lesson okay now the next concept
that i want to talk about is concept of path in a graph a path in a graph is a sequence of vertices
where each adjacent pair in the sequence is connected by an edge i'm highlighting a path here in this
example graph the sequence of vertices a b f h is a path in this graph now we have an undirected
graph here edges are bidirectional in a directed graph all edges must also be aligned in one direction
the direction of the path a path is called simple path if no vertices are repeated and if vertices
are not repeated then edges will also not be repeated so in a simple path both vertices and edges are
not repeated this path a b f h that i have highlighted here is a simple path but we could also have a
path like this here start vertex is a and end vertex is d in this path one edge and two vertices
are repeated in graph theory there is some inconsistency in use of this term path most of
the time when we say path we mean a simple path and if repetition is possible we use this term
walk so a path is basically a walk in which no vertices or edges are repeated a walk is called
a trail if vertices can be repeated but edges cannot be repeated i'm highlighting a trail here
in this example graph okay now i want to say this once again walk and path are often used as synonyms
but most often when we say path we mean simple path a path in which vertices and edges are not
repeated between two different vertices if there is a walk in which vertices or edges are repeated
like this walk that i'm showing you here in this example graph then there must also be a path
or simple path that is a walk in which vertices or edges would not be repeated in this walk that
i'm showing you here we are starting at a and we are ending our walk at c there is a simple path
from a to c with just one edge all we need to do is we need to avoid going to b e h d and then
coming back again to a so this is why we mostly talk about simple path between two vertices
because if any other walk is possible simple path is also possible and it makes most sense to look
for a simple path so this is what i'm going to do throughout our lessons i'm going to say path
and by path i'll mean simple path and if it's not a simple path i'll say it explicitly
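
The distinction between walk, trail and path can be expressed as a small classifier. This is a sketch under stated assumptions: the function name `classify_walk` is made up, the graph is assumed to be simple, and the edge set below is assumed to be the eight-vertex example graph used elsewhere in these lessons (the figure for this part is not available in the transcript).

```python
# Assumed undirected example graph on vertices A..H.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "E"), ("B", "F"),
         ("C", "G"), ("D", "H"), ("E", "H"), ("F", "H"), ("G", "H")]


def classify_walk(vertices, edges, directed=False):
    """Classify a vertex sequence as 'path', 'trail', 'walk', or None if invalid."""
    edge_set = set(edges)
    if not directed:
        edge_set |= {(v, u) for u, v in edges}   # undirected: both orientations
    steps = list(zip(vertices, vertices[1:]))
    if any(step not in edge_set for step in steps):
        return None                              # some consecutive pair has no edge
    if len(set(vertices)) == len(vertices):
        return "path"                            # no vertex repeated (simple path)
    used = steps if directed else [frozenset(s) for s in steps]
    if len(set(used)) == len(used):
        return "trail"                           # vertices repeat, edges do not
    return "walk"                                # an edge is repeated


simple = classify_walk(list("ABFH"), EDGES)
detour = classify_walk(list("ABEHDAC"), EDGES)
```

With this edge set, the sequence A B F H is a simple path, while A B E H D A C revisits A without reusing an edge.
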
a graph is called strongly connected if in the graph there is a path from any vertex to any
other vertex if it's an undirected graph we simply call it connected and if it's a directed graph
we call it strongly connected in leftmost and rightmost graphs that i'm showing you here
we have a path from any vertex to any other vertex but in this graph in the middle we do not have a
path from any vertex to any other vertex we cannot go from vertex c to a we can go from a to c but
we cannot go from c to a so this is not a strongly connected graph remember if it's an undirected
graph we simply say connected and if it's a directed graph we say strongly connected
if a directed graph is not strongly connected but can be turned into connected graph
by treating all edges as undirected then such a directed graph is called weakly connected
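
These three definitions can be tested mechanically. The sketch below uses breadth-first search to find reachable vertices, which is a technique these lessons have not introduced at this point, so treat it as a swapped-in helper. The names `reachable` and `connectivity` and the three-vertex sample graphs are assumptions for illustration.

```python
from collections import deque


def reachable(start, adjacency):
    """Return the set of vertices reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def connectivity(vertices, directed_edges):
    """Return 'strongly connected', 'weakly connected', or 'not connected'."""
    forward = {v: [] for v in vertices}   # edges as given
    both = {v: [] for v in vertices}      # edges with direction ignored
    for u, v in directed_edges:
        forward[u].append(v)
        both[u].append(v)
        both[v].append(u)
    everyone = set(vertices)
    if all(reachable(v, forward) == everyone for v in vertices):
        return "strongly connected"
    if reachable(vertices[0], both) == everyone:
        return "weakly connected"
    return "not connected"


ring = connectivity(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")])
chain = connectivity(["A", "B", "C"], [("A", "B"), ("B", "C")])
```

Running one search per vertex is simple but not the fastest way to test strong connectivity; it is used here only because it mirrors the definition directly.
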
if we just ignore the directions of the edges here this is connected but i would recommend that
you just remember connected and strongly connected this leftmost undirected graph is connected
i removed one of the edges and now this is not connected now we have two disjoint connected
components here but the graph overall is not connected connectedness of a graph is a really
important property if you remember intra city road network road network within a city that would
have a lot of one ways can be represented as a directed graph now an intra city road network
should always be strongly connected we should be able to reach any street from any street
any intersection to any intersection okay now that we understand concept of a path
next i want to talk about cycle in a graph a walk is called a closed walk if it starts and ends
at same vertex like what i'm showing here and there is one more condition the length of the walk
must be greater than zero length of a walk or path is number of edges in the path
like for this closed walk that i'm showing you here length is five because we have five edges in
this walk so a closed walk is walk that starts and ends at same vertex and the length of which
is greater than zero now some may call closed walk a cycle but generally we use the term cycle for
a simple cycle a simple cycle is a closed walk in which other than start and end vertices
no other vertex or edge is repeated right now what i'm showing you here in this example graph
is a simple cycle or we can just say cycle a graph with no cycle is called an acyclic graph
a tree if drawn with undirected edges would be an example of an undirected acyclic graph
here in this tree we can have a closed walk but we cannot have a simple cycle in this closed walk
that i'm showing you here an edge is repeated there would be no simple cycle in a tree
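
The difference between a closed walk and a simple cycle can be sketched as two checks on a vertex sequence. The function names are invented for this example, and the code assumes an undirected simple graph and that the sequence is already a valid walk in it.

```python
def is_closed_walk(vertices):
    """A closed walk starts and ends at the same vertex and has length > 0."""
    return len(vertices) > 1 and vertices[0] == vertices[-1]


def is_simple_cycle(vertices):
    """Closed walk with no repeated vertex or edge, apart from start == end."""
    if not is_closed_walk(vertices):
        return False
    inner = vertices[:-1]                     # drop the repeated end vertex
    if len(set(inner)) != len(inner):
        return False                          # some other vertex is repeated
    steps = [frozenset(s) for s in zip(vertices, vertices[1:])]
    return len(set(steps)) == len(steps)      # no undirected edge used twice


back_and_forth = ["A", "B", "A"]   # the kind of closed walk possible in a tree
triangle = ["A", "B", "C", "A"]
```

Going from A to B and straight back is a closed walk, but it reuses the edge between A and B, so it is not a simple cycle.
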
and apart from tree we can have other kind of undirected acyclic graphs also
a tree also has to be connected now we can also have a directed acyclic graph
as you can see here also we do not have any cycle you cannot have a path of length
greater than zero starting and ending at the same vertex a directed acyclic graph is often called
a dag cycles in a graph cause a lot of issues in designing algorithms for problems like
finding shortest route from one vertex to another and we will talk about cycles a lot when we will
study some of these advanced algorithms in coming lessons for this lesson i'll stop here now
in our next lesson we will discuss ways of creating and storing graph in computer's memory
this is it for this lesson thanks for watching

hello everyone in our previous lessons we introduced
you to graphs and we also looked at and talked about some of the properties of graph but so far
we have not discussed how we can implement graph how we can create a logical structure like graph
in computer's memory so let us try to discuss this a graph as we know contains a set of vertices
and a set of edges and this is how we define graph in pure mathematical terms a graph g
is defined as an ordered pair of a set v of vertices and a set e of edges now to create
and store a graph in computer's memory the simplest thing that we probably can do is that
we can create two lists one to store all the vertices and another to store all the edges
for a list we can use an array of appropriate size or we can use an implementation of a dynamic
list in fact we can use a dynamic list available to us in language libraries something like vector
in C++ or ArrayList in Java now a vertex is identified by its name so the first list the list
of vertices would simply be a list of names or strings i just filled in names of all the vertices
for this example graph here now what should we fill in this edge list here an edge is identified
by its two endpoints so what we can do is we can create an edge as an object with two fields
we can define edge as a structure or class with two fields one to store the start vertex
and another to store the end vertex edge list would basically be an array or list of this type
struct edge in these two definitions of edge that i have written here in the first one i have used
character pointers because in c we typically use character pointers to store or refer to strings
we could use character array also in c++ or Java where we can create classes we have string
available to us as a data type so we can use that also so we can use any of these for the fields
we can use character pointer or character array or string data type if it's available depends on
how you want to design your implementation now let's fill this edge list here for this example graph
each row now here has two boxes let's say the first one is to store the start vertex and the
second one is to store the end vertex the graph that we have here is an undirected graph so any
vertex can be called start vertex and any vertex can be called end vertex
order of the vertices is not important here we have nine edges here one between a and b
another between a and c another between a and d and then we have b-e and b-f instead of having b-f
as an entry we could also have f-b but we just need one of them and then we have c-g d-h
e-h and f-h actually there's one more we also have g-h we have 10 edges in total here and not nine
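
The vertex list and edge list just filled in can be written out as follows. The lesson describes the edge as a C struct or a C++/Java class; the Python dataclass here is a stand-in chosen for this sketch, and the field names `start_vertex` and `end_vertex` follow the spoken description rather than any code shown in the video.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    start_vertex: str   # name of one endpoint
    end_vertex: str     # name of the other endpoint


vertex_list = ["A", "B", "C", "D", "E", "F", "G", "H"]

# Undirected graph: each edge is stored once, in either order.
edge_list = [
    Edge("A", "B"), Edge("A", "C"), Edge("A", "D"),
    Edge("B", "E"), Edge("B", "F"), Edge("C", "G"),
    Edge("D", "H"), Edge("E", "H"), Edge("F", "H"),
    Edge("G", "H"),
]
```
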
now once again because this is an undirected graph if we are saying that there is an edge from f
to h we are also saying that there is an edge from h to f there is no need to have another
entry as h-f we will unnecessarily be using extra memory if this was a directed graph f-h and h-f
would have meant two different connections which is the start vertex and which is the end vertex
would have mattered maybe in case of undirected graphs we should name the fields as first vertex
and second vertex and in case of directed graphs we should name the fields as start vertex and
end vertex now our graph here could also be a weighted graph we could have some cost or weight
associated with the edges as we know in an unweighted graph cost of all the connections is equal
but in a weighted graph different connections would have different weight or different cost
now in this example graph here i have associated some weights to these edges now how do you think
we should store this data the weight of edges well if the graph is weighted we can have one more field
in the edge object to store the weight now an entry in my edge list has three fields
one to store the start vertex one to store the end vertex and one more to store the weight
so this is one possible way of storing a graph we can simply create two lists one to store the
vertices and another to store the edges but this is not very efficient for any possible way of
storing and organizing data we must also see its cost and when we say cost we mean two things
time cost of various operations and the memory usage typically we measure the rate of growth of
time taken with size of input or data what we also call time complexity
and we measure the rate of growth of memory consumed with size of input or data what we also
call space complexity time and space complexities are most commonly expressed in terms of what we
call big o notation for this lesson i'm assuming that you already know about time and space complexity
analysis and big o notation if you want to revise some of these concepts then you can check the
description of this video for link to some lessons we always want to minimize the time cost of most
frequently performed operations and we always want to make sure that we do not consume
unreasonably high memory okay so let's now analyze this particular structure that we are
trying to use to store our graph let's first discuss the memory usage for the first list
the vertex list least number of rows needed or consumed would be equal to number of vertices
now each row here in this vertex list is a name or string and string can be of any length
right now all strings have just one character because i simply named the nodes a b c and so on
but we could have names with multiple characters and because strings can be of different lengths
all rows may not be consuming the same amount of memory like here here i'm showing an
inter city road network as a weighted graph cities are my nodes and road distances are my weights
now for this graph as you can see names are of different lengths so all rows in vertex list or
all rows in edge list would not cost us the same more characters will cost us more bytes
but we can safely assume that the names will not be too long we can safely assume that in almost
all practical scenarios average length of strings will be a really small value if we
assume it to be always lesser than some constant then the total space consumed in this vertex list
will be proportional to the number of rows consumed that is the number of vertices or in other words
we can say that space complexity here is big o of number of vertices this is how we write number
of vertices with two vertical bars |V| what we basically mean here is number of elements in set V
now for the edge list once again we are storing strings in first two fields of the edge object
so once again each row here will not consume same amount of memory but if we are just storing
the reference or pointer to a string like here in the first row instead of having values filled
in these two fields we could have references or pointers to the names in the vertex list
if we will design things like this each row will consume same memory this in fact is better
because references in most cases would cost us a lot lesser than a copy of the name and as
reference we can have the actual address of the string and that's what we are doing when we are
saying that start vertex and end vertex can be character pointers or maybe a better design would be
simply having the index of the name or string in vertex list let's say a is at index 0 in the
vertex list and b is at index 1 and c is at index 2 and i'll go on like this
now for start vertex and end vertex we can have two integer fields
as you can see in both my definitions of edge start vertex and end vertex are of type int now
and in each row of edge list first and second field are filled with integer values
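
A sketch of this index-based design, again using a Python dataclass in place of the struct described in the lesson. The `weight` field and its default of 1 are assumptions added for illustration (the lesson notes that an unweighted graph can be seen as one where every edge has the same weight), and `edge_name` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    start_vertex: int   # index into vertex_list
    end_vertex: int     # index into vertex_list
    weight: int = 1     # same weight everywhere means effectively unweighted


vertex_list = ["A", "B", "C", "D", "E", "F", "G", "H"]

# Same ten edges as before, now stored as fixed-size integer pairs.
edge_list = [
    Edge(0, 1), Edge(0, 2), Edge(0, 3), Edge(1, 4), Edge(1, 5),
    Edge(2, 6), Edge(3, 7), Edge(4, 7), Edge(5, 7), Edge(6, 7),
]


def edge_name(edge):
    """Recover the endpoint names of an edge from its stored indices."""
    return vertex_list[edge.start_vertex] + vertex_list[edge.end_vertex]
```

Because every row now holds the same number of integers, the rows no longer vary in size with the length of the vertex names.
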
i have filled in appropriate values of indices this definitely is a better design and if you can
see now each row in edge list would cost us the same amount of memory so overall space consumed
in edge list would be proportional to number of edges or in other words space complexity here is
big o of number of edges okay so this is analysis of our memory usage overall space complexity of
this design would be big o of number of vertices plus number of edges is this memory usage
unreasonably high well we cannot do a lot better than this if we want to store a graph
in computer's memory so we are all right in terms of memory usage now let's discuss time cost of
operations what do you think can be most frequently performed operations while working with graph
one of the most frequently performed operations while working with graph would be finding all
nodes adjacent to a given node that is finding all nodes directly connected to a given node
what do you think would be time cost of finding all nodes directly connected to a given node
well we will have to scan the whole edge list we will have to perform a linear search we will
have to go through all the entries in the list and see if the start or end node in the entry is
our given node for a directed graph we would see if the start node in the entry is our given node
or not and for an undirected graph we would see both the start as well as the end node running time
would be proportional to number of edges or in other words time complexity of this operation would
be big o of number of edges okay now another frequently performed operation can be
finding if two given nodes are connected or not in this case also we will have to perform
a linear search on the edge list in worst case we will have to look at all the entries in the edge
list so worst case running time would be proportional to number of edges so for this operation too
time complexity is big o of number of edges now let's try to see how good or bad
this running time big o of number of edges is if you remember this discussion from our previous
lesson in a simple graph in a graph with no self loop or multi-edge if number of vertices
that is the number of elements in set v is equal to n then maximum number of edges
would be n(n-1) if the graph is directed each node will be connected to every
other node and of course minimum number of edges can be zero we can have a graph with no edge
maximum number of edges would be n(n-1)/2 if the graph is undirected
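
The two edge-list operations analysed above, finding adjacent nodes and testing whether two nodes are directly connected, both come down to a linear scan. A minimal sketch, assuming edges are stored as pairs of vertex indices for the eight-vertex example; the function names are invented here.

```python
# The ten undirected edges of the example graph, as (index, index) pairs.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5),
         (2, 6), (3, 7), (4, 7), (5, 7), (6, 7)]


def adjacent_nodes(node, edges, directed=False):
    """Scan every edge once: time proportional to the number of edges."""
    result = []
    for start, end in edges:
        if start == node:
            result.append(end)
        elif not directed and end == node:
            result.append(start)      # undirected: the end field counts too
    return result


def are_connected(u, v, edges, directed=False):
    """Linear search for an edge between u and v: O(|E|) in the worst case."""
    for start, end in edges:
        if (start, end) == (u, v):
            return True
        if not directed and (start, end) == (v, u):
            return True
    return False


neighbours_of_h = adjacent_nodes(7, EDGES)
```

Neither function can stop early in the worst case, which is why the cost grows with the size of the edge list rather than with the number of vertices.
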
but all in all if you can see number of edges can go almost up to square of number of vertices
number of edges can be of the order of square of number of vertices let's denote number of vertices
here as small v so number of edges can be of the order of v square in a graph typically any
operation running in order of number of edges would be considered very costly we try to keep
things in order of number of vertices when we are comparing the two running times this is very obvious
big o of v is a lot better than big o of v square all in all this vertex list and edge
list kind of representation is not very efficient in terms of time cost of operations we should
think of some other efficient design we should think of something better we will talk about
another possible way of storing and representing graph in next lesson this is it for this lesson
thanks for watching

so in our previous lesson we discussed one possible way of
storing and representing a graph in which we used two lists one to store the vertices
and another to store the edges a record in vertex list here is name of a node and a
record in edge list is an object containing references to the two end points of an edge
and also the weight of that edge because this example graph that i'm showing you here is
a weighted graph we called this kind of representation edge list representation
but we realized that this kind of storage is not very efficient in terms of time cost of most
frequently performed operations like finding nodes adjacent to a given node
or finding if two nodes are connected or not to perform any of these operations we need to scan
the whole edge list we need to perform a linear search on the edge list so the time complexity is
big o of number of edges and we know that number of edges in a graph can be really really large
in worst case it can be close to square of number of vertices in a graph anything running in order
of number of edges is considered very costly we often want to keep the cost in order of number
of vertices so we should think of some other efficient design we should think of something
better than this one more possible design is that we can store the edges in a two-dimensional
array or matrix we can have a two-dimensional matrix or array of size v cross v where v is
number of vertices as you can see i have drawn an 8 cross 8 array here because number of vertices
in my example graph here is 8 let's name this array a now if we want to store a graph that is
unweighted let's just remove the weights from this example graph here and now our graph is
unweighted and if we have a value or index between zero and v minus one for each vertex
which we have here if we are storing the vertices in a vertex list then we have an index between
zero and v minus one for each vertex we can say that a is 0th node b is 1st node c is 2nd node and
so on we are picking up indices from the vertex list okay so if the graph is unweighted and each
vertex has an index between zero and v minus one then in this matrix or 2D array we can set
ith row and jth column that is A[i][j] as one or boolean value true if there is an edge from i to j
zero or false otherwise if i have to fill this matrix for this example graph here then i'll go
vertex by vertex vertex zero is connected to vertex one two and three vertex one is connected
to zero four and five this is an undirected graph so if we have an edge from zero to one
we also have an edge from one to zero so 1st row and 0th column should also be set as
one now let's go to node two it's connected to zero and six three is connected to zero and seven
four is connected to one and seven five once again is connected to one and seven
six is connected to two and seven and seven is connected to three four five and six
all the remaining positions in this array should be set as zero
notice that this matrix is symmetric for an undirected graph this matrix would be symmetric
because A[i][j] would be equal to A[j][i] we would have two positions filled for each edge
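
Filling the matrix for the example graph can be sketched like this. The function name `build_adjacency_matrix` is an assumption, and a list of Python lists stands in for the two-dimensional array.

```python
def build_adjacency_matrix(num_vertices, edges, directed=False):
    """Return a num_vertices x num_vertices matrix of 0/1 entries."""
    matrix = [[0] * num_vertices for _ in range(num_vertices)]
    for i, j in edges:
        matrix[i][j] = 1
        if not directed:
            matrix[j][i] = 1      # undirected: fill the mirrored cell too
    return matrix


# The ten undirected edges of the example graph, by vertex index.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5),
         (2, 6), (3, 7), (4, 7), (5, 7), (6, 7)]

A = build_adjacency_matrix(8, EDGES)
```

For the undirected case each edge sets two cells, so the ten edges produce twenty ones and the matrix equals its own transpose.
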
in fact to see all the edges in the graph we need to go through only one of these two halves
now this would not be true for a directed graph only one position will be filled for each edge
and we will have to go through the entire matrix to see all the edges okay now this kind of representation
of a graph in which edges or connections are stored in a matrix or 2D array is called
adjacency matrix representation this particular matrix that i have drawn here is an adjacency matrix
now with this kind of storage or representation what do you think would be the time cost of finding
all nodes adjacent to a given node let's say given this vertex list and adjacency matrix
we want to find all nodes adjacent to node named f if we are given name of a node then we first need
to know its index and to know the index we will have to scan the vertex list there is no other way
once we figure out the index like for f index is 5 then we can go to the row with that index in
the adjacency matrix and we can scan this complete row to find all the adjacent nodes scanning the
vertex list to figure out the index in worst case will cost us time proportional to the number of
vertices because in worst case we may have to scan the whole list and scanning a row in the
adjacency matrix would once again cost us time proportional to number of vertices because in a row
we would have exactly v columns where v is number of vertices so overall time cost of this operation
is big o of v now most of the time while performing operations we must pass indices to avoid scanning
the vertex list all the time if we know an index we can figure out the name in constant time
because in an array we can access element at any index in constant time but if we know a name and
want to figure out the index then it will cost us big o of v we will have to scan the vertex list
we will have to perform a linear search on it okay moving on now what would be the time cost of
finding if two nodes are connected or not now once again the two nodes can be given to us as indices
or names if the nodes would be passed as indices then we simply need to look at value in a particular
row and particular column we simply need to look at A[i][j] for some values of i and j and this will
cost us constant time you can look at value in any cell in a two-dimensional array in constant
time so if indices are given time complexity of this operation would be big o of 1 which simply
means that we will take constant time but if names are given then we also need to do the scanning
to figure out the indices which will cost us big o of v overall time complexity would be big o of v
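
The costs just described can be seen in a short sketch over the example matrix. The function names are invented for illustration; `list.index` performs the linear scan of the vertex list that the lesson describes.

```python
VERTICES = ["A", "B", "C", "D", "E", "F", "G", "H"]

# Adjacency matrix of the unweighted, undirected example graph.
A = [
    [0, 1, 1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]


def adjacent_by_index(i):
    """Scan row i: time proportional to the number of vertices."""
    return [j for j, cell in enumerate(A[i]) if cell]


def connected_by_index(i, j):
    """Single cell lookup: constant time."""
    return A[i][j] == 1


def adjacent_by_name(name):
    """Linear scan of the vertex list for the index, then a row scan."""
    i = VERTICES.index(name)
    return [VERTICES[j] for j in adjacent_by_index(i)]


neighbours_of_f = adjacent_by_name("F")
```
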
the constant time access would not mean anything the scanning of vertex list all the time to figure
out the indices can be avoided we can use some extra memory to create a hash table with names
and indices as key value pairs and then the time cost of finding index from name would also be big
o of 1 that is constant hash table is a data structure and i have not talked about it in
any of my lessons so far if you do not know about hash table just search online for a basic idea of
it okay so as you can see with adjacency matrix representation our time cost of some of the most
frequently performed operations is in order of number of vertices and not in order of number of
edges which can be as high as square of number of vertices okay now if we want to store a weighted
graph in adjacency matrix representation then A[i][j] in the matrix can be set as weight of an edge
for non-existent edges we can have a default value like a really large or maximum possible
integer value that is never expected to be an edge weight i have just filled in infinity here
to mean that we can choose the default as infinity minus infinity or any other value that would never
ever be a valid edge weight okay now for further discussion i'll come back to an unweighted graph
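
Two ideas from this part of the lesson, a hash table from names to indices and a default value for non-existent edges, can be combined in one sketch. A Python `dict` plays the role of the hash table and `math.inf` the role of infinity. The four vertex names and the weights 5, 7 and 2 are made-up sample data, not values from the video.

```python
import math

VERTICES = ["A", "B", "C", "D"]

# Hash table: vertex name -> index, so name lookups no longer scan the list.
INDEX_OF = {name: i for i, name in enumerate(VERTICES)}


def build_weighted_matrix(weighted_edges):
    """Undirected weighted adjacency matrix; math.inf marks a missing edge."""
    n = len(VERTICES)
    matrix = [[math.inf] * n for _ in range(n)]
    for u, v, weight in weighted_edges:
        i, j = INDEX_OF[u], INDEX_OF[v]
        matrix[i][j] = weight
        matrix[j][i] = weight
    return matrix


# Hypothetical weights, for illustration only.
W = build_weighted_matrix([("A", "B", 5), ("A", "C", 7), ("C", "D", 2)])


def edge_weight(u, v):
    """Average constant-time lookup by name: two dict lookups and one cell read."""
    return W[INDEX_OF[u]][INDEX_OF[v]]
```

Any sentinel works as the default so long as it can never be a valid edge weight in the problem at hand.
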
adjacency matrix looks really good so should we not use it always well with this design we have
improved on time but we have gone really high on memory usage instead of using memory units
exactly equal to number of edges what we were doing with an edge list kind of storage
here we are using exactly v square units of memory we are using big o of v square space
we are not just storing the information that these two nodes are connected we are also storing
the negation of it that is these two nodes are not connected which probably is redundant information
if a graph is dense if the number of edges is really close to v square then this is good but if
the graph is sparse that is if number of edges is lot lesser than v square then we are wasting
a lot of memory in storing these zeros like for this example graph that i have drawn here in the edge
list we were consuming 10 units of memory we had 10 rows consumed in the edge list but here we are
consuming 64 units most graphs with a really large number of vertices would not be very dense
would not have number of edges anywhere close to v square like for example let's say we are
modeling a social network like facebook as a graph such that a user in the network is a node
and there is an undirected edge if two users are friends facebook has a billion users
but i'm showing only a few in my example graph here because i'm short of space
Let's just assume that we have a billion users in our network, so the number of vertices in our graph
is 10 to the power 9, which is a billion. Now, do you think the number of connections in our social
network can ever be close to the square of the number of users? That would mean everyone in the
network is a friend of everyone else. A user of our social network will not be a friend to all other
billion users. We can safely assume that a user, on average, would not have more than a thousand friends.
With this assumption, we would have 10 to the power 12 edges in our graph.
Actually, this is an undirected graph, so we should divide by two here so that we do not count
an edge twice. So if the average number of friends is a thousand, then the total number of connections
in my graph is 5 into 10 to the power 11. Now this is a lot less than the square of the number of
vertices. So basically, if we used an adjacency matrix for this kind of graph, we would waste a
whole lot of space. And moreover, even if we are not looking in relative terms, 10 to the power 18
units of memory is a lot even in an absolute sense. 10 to the power 18 bytes would be about a thousand
petabytes. Now this really is a lot of space. This much data would never ever fit on one physical disk.
5 into 10 to the power 11 bytes, on the other hand, is just 0.5 terabytes.
A typical personal computer these days would have this much storage. So as you can see, for
something like a large social graph, adjacency matrix representation is not very efficient.
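The arithmetic behind this comparison can be written out as follows. This is only a restatement of the numbers above, assuming one byte per matrix cell or per stored edge and decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes).

```latex
\begin{align*}
|V| &= 10^{9} \\
|V|^{2} &= 10^{18} \quad \text{(cells in the adjacency matrix)} \\
|E| &= \frac{10^{9} \times 10^{3}}{2} = 5 \times 10^{11} \quad \text{(undirected edges, 1000 friends per user)} \\
10^{18}\ \text{bytes} &= 10^{3}\ \text{PB}, \qquad 5 \times 10^{11}\ \text{bytes} = 0.5\ \text{TB} \\
\frac{|V|^{2}}{|E|} &= 2 \times 10^{6}
\end{align*}
```

Under these assumptions the matrix needs about two million times more cells than there are edges to store.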
An adjacency matrix is good when a graph is dense, that is, when the number of edges is close to
the square of the number of vertices, or sometimes when the total number of possible connections,
that is V squared, is so small that the wasted space would not even matter. But most real-world graphs
would be sparse, and an adjacency matrix would not be a good fit. Let's think about another example.
Let's think about the World Wide Web as a directed graph. If you think of web pages as nodes in a
graph and hyperlinks as directed edges, then a web page would not have links to all other web pages.
And once again, the number of web pages would be in the order of millions. A web page would have links
to only a few other web pages, so the graph would be sparse. Most real-world graphs would be sparse.
And an adjacency matrix, even though it's giving us good running time for the most frequently
performed operations, would not be a good fit because it's not very efficient in terms of space.
So what should we do? Well, there's another representation that gives us similar or maybe
even better running time than an adjacency matrix and does not consume so much space.
It's called adjacency list representation, and we will talk about it in our next lesson.
This is it for this lesson. Thanks for watching.

So in our previous lesson, we talked about the adjacency matrix as a way to store and represent a
graph. And as we discussed and analyzed this data structure, we saw that it's very efficient in
terms of time cost of operations. With this data structure, it costs O(1), that is constant time, to
find if two nodes are connected or not, and it costs O(V), where V is the number of vertices, to find
all nodes adjacent to a given node. But we also saw that an adjacency matrix is not very efficient
when it comes to space consumption. We consume space in the order of the square of the number of
vertices. In adjacency matrix representation, as you know, we store edges in a two-dimensional array
or matrix of size V cross V, where V is the number of vertices. In my example graph here we have
eight vertices, that's why I have an eight cross eight matrix here. We are consuming eight squared,
that is 64 units of space here. Now what's basically happening is that for each vertex, for each
node, we have a row in this matrix where we are storing information about all its connections. This
is the row for the zeroth node, that is A. This is the row for the one-th node (index 1), that is B.
This is for C, and we can go on like this. So each node has got a row,
and a row is basically a one-dimensional array of size equal to the number of vertices, that is V.
And what exactly are we storing in a row? Let's just look at this first row, in which we are storing
the connections of node A. This two-dimensional matrix or array that we have here is basically an
array of one-dimensional arrays, so each row has to be a one-dimensional array. So how are we storing
the connections of node A in these eight cells, in this one-dimensional array of size eight?
A zero in the zeroth position means that there is no edge starting at A and ending at
the zeroth node, which again is A. An edge starting and ending at itself is called a self-loop,
and there is no self-loop on A. A one in the one-th position here means that there is an edge
from A to the one-th node, that is B. The way we are storing information here is that the
index or position in this one-dimensional array is being used to represent the end point of an edge.
For this complete row, for this complete one-dimensional array,
the start is always the same: it's always the zeroth node, that is A. In general, in the
adjacency matrix the row index represents the start point and the column index represents the end point.
Now here, when we are looking only at the first row, the start is always A and the indices 0, 1, 2
and so on are representing the end points, and the value at a particular index or position tells us
whether we have an edge ending at that node or not. One here means that the edge exists; 0 would have
meant that the edge does not exist. Now when we are storing information like this, if you can see,
we are not just storing that B, C and D are connected to A, we are also storing the not of it.
We are also storing the information that A, E, F, G and H are not connected to A.
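To make the row layout concrete, here is a small C++ sketch of the row for node A as described above: index is the end point, value is 1 or 0. The names `rowA`, `edgeFromA` and `zerosInRowA` are chosen only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

const std::size_t V = 8;  // vertices A..H stored at indices 0..7

// Row 0 of the matrix: connections of node A.
// Index = end point of the edge, value = 1 if the edge exists, 0 otherwise.
const std::array<int, V> rowA = {0, 1, 1, 1, 0, 0, 0, 0};

// O(1) lookup: is there an edge from A to the node at index `end`?
bool edgeFromA(std::size_t end) {
    assert(end < V);
    return rowA[end] == 1;
}

// Count the cells that only say "not connected".
std::size_t zerosInRowA() {
    std::size_t zeros = 0;
    for (int cell : rowA) {
        if (cell == 0) ++zeros;
    }
    return zeros;
}
```

Here five of the eight cells hold a zero, which is the "not of it" information the row carries in addition to the three actual connections.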
If we are storing what all nodes are connected, then from that we can also deduce what all nodes
are not connected. These zeros, in my opinion, are redundant information causing extra consumption
of memory. Most real-world graphs are sparse, that is, the number of connections is really small
compared to the total number of possible connections. So most often there would be too many zeros and
very few ones. Think about it. Let's say we are trying to store connections in a social network like
Facebook in an adjacency matrix, which would be the most impractical thing to do, in my opinion.
But anyway, for the sake of argument, let's say we are trying to do it. Just to store the connections
of one user, I would have a row or one-dimensional array of size 10 to the power 9. On average,
in a social network you would not have more than a thousand friends. If I have a thousand friends, then
in the row used to store my connections I would only have a thousand ones, and the rest, that is 10 to
the power 9 minus a thousand, would be zeros. And I'm not trying to force you to agree, but just like
me, if you also think that these zeros are storing redundant information and are extra consumption
of memory, then even if we are storing these ones and zeros in just one byte each as boolean values,
these many zeros here are almost one gigabyte of memory. The ones are just one kilobyte. So given
this problem, let's try to do something different here. Let's just try to keep the information that
these nodes are connected and get rid of the information that these nodes are not connected,
because it can be inferred, it can be deduced. And there are a couple of ways in which we can do this.
Here, to store the connections of A, instead of using an array such that the index represents the end
point of an edge and the value at that particular index represents whether we have an edge ending there
or not, we can simply keep a list of all the nodes to which we are connected. This is the list or set
of nodes to which A is connected. We can represent this list either using the indices or using
the actual names of the nodes. Let's just use indices, because names can be long and may consume
more memory. You can always look at the vertex list and find out the name in constant time. Now, in our
machine we can store this set of nodes, which basically is a set of integers, in something as
simple as an array. And this array, as you can see, is a different arrangement from our previous array.
In our earlier arrangement, the index was representing the index of a node in the graph and the value
was representing whether there was a connection to that node or not. Here the index does not represent
anything, and the values are the actual indices of the nodes to which we are connected.
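The change of arrangement described above can be sketched as a small conversion routine. This is an illustrative C++ helper (the name `rowToNeighborList` is an assumption of this example), turning a 0/1 matrix row into a list whose values are neighbor indices.

```cpp
#include <cassert>
#include <vector>

// Convert one matrix row (index = end point, value = 0 or 1) into a list of
// neighbor indices (position means nothing, value = end point).
std::vector<int> rowToNeighborList(const std::vector<int>& row) {
    std::vector<int> neighbors;
    for (int end = 0; end < static_cast<int>(row.size()); ++end) {
        assert(row[end] == 0 || row[end] == 1);  // unweighted matrix row expected
        if (row[end] == 1) neighbors.push_back(end);
    }
    return neighbors;
}
```

For the row of node A, `{0, 1, 1, 1, 0, 0, 0, 0}`, this keeps only the three indices 1, 2 and 3 and drops the five zeros.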
Now, instead of using an array here to store this set of integers, we can also use a linked list.
And why just an array or linked list? I would argue that we can also use a tree here. In fact, a
binary search tree is a good way to store a set of values. There are ways to keep a binary search tree
balanced, and if you always keep a binary search tree balanced, you can perform search, insertion
and deletion, all three operations, in order of log of the number of nodes. We will discuss the cost
of operations for any of these possible ways in some time. Right now, all I want to say is that there
are a bunch of ways in which we can store the connections of a node. For our example graph that we
started with, instead of an adjacency matrix we can try to do something like this. We are still
storing the same information. We are still saying that the 0th node is connected to nodes 1, 2 and 3,
node 1 is connected to nodes 0, 4 and 5, node 2 is connected to nodes 0 and 6, and so on,
but we are consuming a lot less memory here. Programmatically, this adjacency matrix here
is just a two-dimensional array of size 8 cross 8, so we are consuming 64 units of space in total.
But this structure on the right does not have all the rows of the same size. How do you think we can
create such a structure programmatically? Well, it depends. In C or C++, if you understand pointers,
then we can create an array of pointers of size 8, and each pointer can point to a one-dimensional
array of a different size. The 0th pointer can point to an array of size 3, because the 0th node has
three connections and we need an array of size 3. Pointer 1 can point to an array of size 3,
because node 1 also has three connections. Node 2, however, has only two connections,
so pointer 2 should point to an array of size 2, and we can go on like this. The 7th node has
four connections, so the 7th pointer should point to an array of size 4.
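One possible way to build such an array of pointers in C++ is sketched below. The row sizes for nodes 0, 1, 2 and 7 are the ones stated above; the sizes for nodes 3 to 6 are assumed for illustration, and all function names are hypothetical.

```cpp
#include <cassert>

const int V = 8;  // number of vertices in the example graph

// Neighbor counts per node. Counts for nodes 0, 1, 2 and 7 are from the lesson;
// the counts for nodes 3 to 6 are assumed here for illustration.
const int rowSize[V] = {3, 3, 2, 2, 2, 2, 2, 4};

// Make rows[i] point to its own array of exactly rowSize[i] integers.
void allocateRows(int* rows[V]) {
    for (int i = 0; i < V; ++i) {
        rows[i] = new int[rowSize[i]]();  // zero-initialized, sized per node
    }
}

// Write a neighbor index into a given position of a node's row.
void setNeighbor(int* rows[V], int node, int position, int neighbor) {
    assert(node >= 0 && node < V);
    assert(position >= 0 && position < rowSize[node]);
    rows[node][position] = neighbor;
}

// Total cells used across all rows (compare with V * V = 64 for the matrix).
int totalCells() {
    int total = 0;
    for (int i = 0; i < V; ++i) total += rowSize[i];
    return total;
}

// Release every row allocated by allocateRows.
void freeRows(int* rows[V]) {
    for (int i = 0; i < V; ++i) {
        delete[] rows[i];
        rows[i] = nullptr;
    }
}
```

With these sizes the rows use 20 cells in total instead of 64. In everyday C++ a `std::vector<std::vector<int>>` would give the same shape without manual memory management; raw pointers are used here only to mirror the description.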
If you do not understand any of this pointer thing that I'm doing right now,
you can refer to the mycodeschool lesson titled Pointers and Arrays, the link to which you can find
in the description of this video. But think about it: the basic idea is that each row can be a
one-dimensional array of a different size, and you can implement this with whatever tools you have in
your favorite programming language. Now let's quickly see what the pros and cons of this structure on
the right are in comparison to the matrix on the left. We are definitely consuming less memory with
the structure on the right. With an adjacency matrix, our space consumption is proportional to the
square of the number of vertices, while with the second structure, space consumption is proportional
to the number of edges.
And we know that most real-world graphs are sparse, that is, the number of edges is really small in
comparison to the square of the number of vertices. The square of the number of vertices is basically
the total number of possible edges, and for us to reach this number, every node should be connected to
every other node. In most graphs, a node is connected to a few other nodes and not all other nodes.
In this second structure, we are avoiding this typical problem of too much space consumption in
an adjacency matrix by only keeping the ones and getting rid of the redundant zeros.
Here, for an undirected graph like this one, we would consume exactly two into number of edges
units of memory, and for a directed graph we would consume exactly E, that is number of edges, units
of memory. But all in all, space consumption will be proportional to the number of edges,
or in other words, space complexity would be O(E). So the second structure is definitely
better in terms of space consumption. But let's now also try to compare these two structures
for time cost of operations. What do you think would be the time cost of finding if two nodes are
connected or not? We know that it's constant time, or O(1), for an adjacency matrix, because
if we know the start and end point, we know the cell in which to look for 0 or 1. But in the second
structure we cannot do this. We will have to scan through a row. So if I ask you something like,
can you tell me if there is a connection from node 0 to 7, then you will have to scan this
zeroth row. You will have to perform a linear search on this zeroth row to find 7. Right now
all the rows in this structure are sorted. You can argue that I can keep all the rows sorted and then
I can perform a binary search, which would be a lot less costly. That's fine, but if you just
perform a linear search, then in the worst case we can have exactly V, that is number of vertices,
cells in a row. So if we perform a linear search, in the worst case we will take
time proportional to the number of vertices. And of course, the time cost would be
O(log V) if we performed a binary search. Logarithmic run times are really good,
but to get this here we always need to keep our rows sorted. Keeping an array always sorted
is costly in other ways, and I'll come back to it later. For now, let's just say that this would cost
us O(V). Now what do you think would be the time cost of finding all nodes adjacent
to a given node, that is, finding all neighbors of a node? Well, even in the case of an adjacency
matrix we have to scan a complete row, so it would be O(V) for the matrix as well as this second
structure here, because here also, in the worst case, we can have V cells in a row, equivalent to
having all ones in a row in an adjacency matrix. When we try to see the time cost of an operation,
we mostly analyze the worst case. So for this operation, we are O(V) for both. So this is
the picture that we are getting: looks like we are saving some space with this second structure
but we are not saving much on time. Well, I would still argue that it's not true. When we analyze
time complexity, we mostly analyze it for the worst case. But what if we already know that we are not
going to hit the worst case? If we can go back to our previous assumption that we are dealing with
a sparse graph, that we are dealing with a graph in which a node would be connected to
a few other nodes and not all other nodes, then the second structure will definitely save us time.
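The two lookup strategies discussed above, a linear scan of a row and a binary search on a sorted row, can be sketched in C++ as follows. The type alias `AdjacencyRows` and both function names are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// rows[i] holds the indices of the nodes connected to node i.
using AdjacencyRows = std::vector<std::vector<int>>;

// Linear scan of one row: time proportional to the number of neighbors of
// `from`, which is at most O(V) in the worst case.
bool isConnectedLinear(const AdjacencyRows& rows, int from, int to) {
    assert(from >= 0 && from < static_cast<int>(rows.size()));
    for (int neighbor : rows[from]) {
        if (neighbor == to) return true;
    }
    return false;
}

// Binary search: O(log V), but only valid if the row is kept sorted.
bool isConnectedSorted(const AdjacencyRows& rows, int from, int to) {
    assert(from >= 0 && from < static_cast<int>(rows.size()));
    assert(std::is_sorted(rows[from].begin(), rows[from].end()));
    return std::binary_search(rows[from].begin(), rows[from].end(), to);
}
```

In a sparse graph the rows are short, so even the linear version touches far fewer cells than a scan of a full matrix row.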
Things would look better once again if we analyze them in the context of a social network.
I'll set some assumptions. Let's say we have a billion users in our social network
and the maximum number of friends that anybody has is 10,000. And let's also assume the computational
power of our machine. Let's say our machine or system can scan or read 10 to the power 6
cells in a second. This is a reasonable assumption because machines often execute a couple of million
instructions per second. Now what would be the actual cost of finding all nodes adjacent to a
given node in an adjacency matrix? Well, we will have to scan a complete row in the matrix. That would
be 10 to the power 9 cells, because in a matrix we would always have cells equal to the number of
vertices in a row. And if we divide this by a million, we would get the time in seconds.
To scan a row of 10 to the power 9 cells we would take 1000 seconds, which is about 16.67 minutes.
This is unreasonably high. But with the second structure, the maximum number of cells in a row
would be 10,000, because the number of cells would exactly be equal to the number of connections,
and this is the maximum number of friends or connections a person in the network has.
So here we would take 10 to the power 4 upon 10 to the power 6, that is 10 to the power minus
two seconds, which is equal to 10 milliseconds. 10 milliseconds is not unreasonable. Now let's try
to deduce the cost for the second operation, finding if two nodes are connected or not. In the case of
an adjacency matrix, we would know exactly what cell to read. We would know the memory location of that
specific cell, and reading that one cell would cost us one upon 10 to the power 6 seconds,
which is one microsecond. In the second structure, we would not know the exact cell.
We will have to scan a row, so once again the maximum time taken would be 10 milliseconds,
just like finding adjacent nodes. So now, given this analysis, if you had to design a social
network, what structure would you choose? No-brainer, isn't it? A machine cannot make a user
wait for 16 minutes. Would you ever use such a system? Milliseconds is fine, but minutes, it's just
too much. So now we know that for most real-world graphs this second structure is better because
it saves us space as well as time. Remember, I'm saying most and not all, because for this logic to
be true, for my reasoning to be valid, the graph has to be sparse. The number of edges has to be
significantly less than the square of the number of vertices. So now, having analyzed space
consumption and time cost of at least the two most frequently performed operations, it looks like this
second structure would be better for most graphs. Well, there can be a bunch of operations in a graph
and we should account for all kinds of operations. So before making up my mind, I would analyze the
cost of a few more operations.
What if, after storing this example graph in the computer's memory in any of these structures,
we decide to add a new edge? Let's say we got a new connection in the graph from A to G.
Then how do you think we can store this new information, this new edge, in both these structures?
The idea here is to assess, once the structures are created in the computer's memory,
how we would do if the graph changes, how we would do if a node or edge is inserted or deleted.
If a new edge is inserted, in the case of an adjacency matrix we just need to go to a specific cell
and flip the zero at that cell to one. In this case, we would go to the zeroth row
and sixth column and overwrite it with the value one. And if it was a deletion, then we would go to a
specific cell and make the one zero. Now how about this second structure? How would you do it here?
We need to add a six in the first row.
And if you have followed this series on data structures, then you know that it's not possible
to dynamically increase the size of an existing array. This would not be so straightforward.
We will have to create a new array of size four for the zeroth row. Then we will have to copy
the content of the old array, write the new value, and then wipe off the old one from memory.
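The contrast between the two insertions can be sketched in C++ as below. Both function names are hypothetical, and the row is assumed to have been allocated with `new[]`.

```cpp
#include <algorithm>
#include <cassert>

// Adjacency matrix: inserting or deleting an edge is a single cell write.
// (For an undirected graph the mirrored cell [to][from] would be set as well.)
void setEdge(int matrix[8][8], int from, int to, int value) {
    assert(value == 0 || value == 1);
    matrix[from][to] = value;
}

// Fixed-size array row: to add one neighbor we allocate a bigger array,
// copy the old content, write the new value, and free the old array.
// Returns the new array and updates `size` to the new length.
int* appendNeighbor(int* row, int& size, int neighbor) {
    int* bigger = new int[size + 1];
    std::copy(row, row + size, bigger);  // O(size) copy of the old data
    bigger[size] = neighbor;
    delete[] row;                        // wipe off the old array
    ++size;
    return bigger;
}
```

The matrix update costs one write, while the array version costs an allocation plus a copy of the whole row every time a neighbor is added.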
It's tricky implementing a dynamic or changing list using arrays. This creation of a new array and
copying of old data is costly, and this is the precise reason why we often use another data
structure to store dynamic or changing lists, and this other data structure is the linked list.
So why not use a linked list? Why can't each row be a linked list, something like this?
Logically we still have a list here, but concrete implementation wise we are no longer using an
array that we need to change dynamically. We are using a linked list. It's a lot easier to do
insertions and deletions in a linked list. Now programmatically, to create this kind of structure
in the computer's memory, we need to create a linked list for each node to store its neighbors.
So what we can do is create an array of pointers, just like what we had done when we were
using arrays. The only difference would be that this time each of these pointers would point to
the head of a linked list, which would be a node. I have defined the node of a linked list here.
A node of a linked list would have two fields, one to store data and another to store the address
of the next node. A[0] would be a pointer to the head or first node of the linked list for A,
A[1] would be a pointer to the head of the linked list for B, and we will go on like A[2] for C, A[3]
for D, and so on. Actually, I have drawn the linked lists here on the left, but I have not drawn the
array of pointers.
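A node with a data field and a next pointer, plus an array of head pointers, might look like the following in C++. This is only a sketch: the on-screen code from the video is not part of this transcript, so the helper names and the choice to insert at the head of a list are assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>

// Node of a linked list: one field for data, one for the address of the next node.
struct Node {
    int data;    // index of a neighboring vertex
    Node* next;  // address of the next node, or nullptr for the last one
};

const int V = 8;  // number of vertices in the example graph

// Insert a neighbor at the head of the list for `from`: O(1), no copying.
void addNeighbor(Node* A[], int from, int to) {
    assert(from >= 0 && from < V);
    A[from] = new Node{to, A[from]};
}

// Walk the list for `from` to check whether `to` is one of its neighbors.
bool isConnected(Node* const A[], int from, int to) {
    assert(from >= 0 && from < V);
    for (const Node* cur = A[from]; cur != nullptr; cur = cur->next) {
        if (cur->data == to) return true;
    }
    return false;
}

// Free every node of every list and reset the head pointers.
void clear(Node* A[]) {
    for (int i = 0; i < V; ++i) {
        while (A[i] != nullptr) {
            Node* next = A[i]->next;
            delete A[i];
            A[i] = next;
        }
    }
}
```

Adding the new edge from A to G is then a single `addNeighbor(A, 0, 6)` call, with no new array and no copying. Note that inserting at the head stores neighbors in reverse order of insertion.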
Let's say this is my array of pointers. Now A[0] here, this one, is a pointer to node, and it points
to the head of the linked list containing the neighbors of A. Let's assume that the head of the linked
list for A has address 400, so in A[0] we would have 400. It's really important to understand what is
what here in this structure. This one, A[0], is a pointer to node, and all a pointer does is store an
address or reference. This one is a node, and it has two fields, one to store data and another, a
pointer to node, to store the address of the next node. Let's assume that the address of the next node
in this first linked list is 450, then we should have 450 here. And if the next one is at, let's say,
500, then we should have 500 in the address part of the second node. The address in the last one would
be 0 or null. Now this kind of structure, in which we store information about the neighbors of a node
in a linked list, is what we typically call an adjacency list. What I have here is an adjacency list
for an undirected, unweighted graph. To store a weighted graph in an adjacency list, I would have
one more field in node to store weight. I have written some random weights next to the edges
in this graph, and to store this extra information I have added one extra field in node,
both in the logical structure and the code. All right, now finally, with this particular
structure that we are calling an adjacency list, we should be fine with space consumption.
Space consumed will be proportional to the number of edges and not to the square of the number of
vertices.
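For the weighted variant mentioned above, a node with one extra field could be sketched like this in C++, together with a count of list nodes to illustrate that space grows with the number of edges. The struct and function names are assumptions of this example, and a `std::vector` of head pointers stands in for the array of pointers.

```cpp
#include <cassert>
#include <vector>

// Linked-list node for a weighted graph: one extra field for the edge weight.
struct WeightedNode {
    int data;            // index of the neighboring vertex
    int weight;          // weight of the edge to that neighbor
    WeightedNode* next;  // next neighbor, or nullptr for the last one
};

// Add an undirected weighted edge: one node in each end point's list.
void addUndirectedEdge(std::vector<WeightedNode*>& heads, int u, int v, int w) {
    const int n = static_cast<int>(heads.size());
    assert(u >= 0 && u < n && v >= 0 && v < n);
    heads[u] = new WeightedNode{v, w, heads[u]};
    heads[v] = new WeightedNode{u, w, heads[v]};
}

// Count all list nodes: 2 * E for an undirected graph.
int countNodes(const std::vector<WeightedNode*>& heads) {
    int count = 0;
    for (const WeightedNode* head : heads) {
        for (const WeightedNode* cur = head; cur != nullptr; cur = cur->next) ++count;
    }
    return count;
}

// Free every node and reset all head pointers.
void freeAll(std::vector<WeightedNode*>& heads) {
    for (WeightedNode*& head : heads) {
        while (head != nullptr) {
            WeightedNode* next = head->next;
            delete head;
            head = next;
        }
    }
}
```

Two undirected edges produce four list nodes regardless of how many vertices the graph has, which is the sense in which space is proportional to the number of edges.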
Most graphs are sparse, and the number of edges in most cases is significantly less than the
square of the number of vertices. Ideally, for space complexity I should say O(E + V), big O of number
of edges plus number of vertices, because storing vertices will also consume some memory.
But if we can assume that the number of vertices will be significantly less in comparison to the
number of edges, then we can simply say O(E). But it's always good if we do
the counting right. Now for time cost of operations, the argument that we were earlier making using a
sparse graph like a social network is still true. An adjacency list would overall be better than an
adjacency matrix. Finally, let's come back to the question: how flexible are we with this structure
if we need to add a new connection or delete an existing connection? And is there any way we can
improve upon it? Well, I'll leave this for you to think about, but I'll give you a hint. What if,
instead of using a linked list to store information about all the neighbors, we use a binary search
tree? Do you think we would do better for some of these operations? I think we would do better,
because the time cost for searching, inserting and deleting a neighbor would reduce. With this
thought I'll sign off. This is it for this lesson. Thanks for watching.
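One way to explore the hint is to keep each node's neighbors in a `std::set`, which the C++ standard requires to offer logarithmic insert, erase and lookup and which is typically implemented as a balanced binary search tree. The sketch below is one possible reading of the hint, not the lesson's own solution; `SetGraph` and its members are names invented for this example, and the graph is assumed to be undirected.

```cpp
#include <cassert>
#include <set>
#include <vector>

// One ordered set of neighbor indices per vertex.
struct SetGraph {
    std::vector<std::set<int>> neighbors;

    explicit SetGraph(int vertexCount) : neighbors(vertexCount) {}

    // True if `v` is a valid vertex index for this graph.
    bool valid(int v) const {
        return v >= 0 && v < static_cast<int>(neighbors.size());
    }

    // Insert an undirected edge: O(log d) per end point, d = number of neighbors.
    void addEdge(int u, int v) {
        assert(valid(u) && valid(v));
        neighbors[u].insert(v);
        neighbors[v].insert(u);
    }

    // Delete an undirected edge: O(log d) per end point.
    void removeEdge(int u, int v) {
        assert(valid(u) && valid(v));
        neighbors[u].erase(v);
        neighbors[v].erase(u);
    }

    // Search for a neighbor: O(log d) instead of a linear scan of a list.
    bool isConnected(int u, int v) const {
        assert(valid(u) && valid(v));
        return neighbors[u].count(v) > 0;
    }
};
```

The trade-off is extra memory per stored neighbor for the tree links, in exchange for faster search, insertion and deletion on nodes with many neighbors.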