0:02 This complete system design tutorial covers scalability, reliability, data handling, and high-level architecture, with clear explanations, real-world examples, and practical strategies. It will teach you the core concepts you need to know for a system design interview: a complete crash course on the concepts you need to ace your job interview. The system design interview doesn't have much to do with coding; people don't want to see you write actual code, but rather how you glue an entire system together, and that is exactly what we're going to cover in this tutorial.
0:41 Before designing large-scale distributed systems, it's important to understand the high-level architecture of an individual computer, so let's see how different parts of the computer work together to execute our code. Computers function through a layered system, each layer optimized for varying tasks. At the core, computers understand only binary: zeros and ones.
1:05 These are represented as bits. One bit is the smallest data unit in computing; it can be either zero or one. One byte consists of eight bits, and it's used to represent a single character, like "A", or a number, like "1". Expanding from here, we have kilobytes, megabytes, gigabytes, and terabytes. To store this data, we have the computer's disk storage, which holds the primary data; it can be either HDD or SSD. Disk storage is non-volatile: it maintains data without power, meaning that if you turn off or restart the computer, the data will still be there. It contains the OS, applications, and all user files.
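The units above scale by factors of a thousand; a quick sketch of how they relate (decimal units, as drive vendors use; operating systems often report 1,024-based units instead):

```python
# How the storage units relate: each step up is a factor of 1,000
# (decimal units, as drive vendors use).
BITS_PER_BYTE = 8
KB = 1_000          # bytes in a kilobyte
MB = 1_000 * KB
GB = 1_000 * MB
TB = 1_000 * GB

# A 2 TB disk holds two trillion bytes:
print(2 * TB)         # 2000000000000
print((2 * TB) / GB)  # 2000.0 gigabytes
```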
1:46 In terms of size, disks typically range from hundreds of gigabytes to multiple terabytes. While SSDs are more expensive, they offer significantly faster data retrieval than HDDs: an SSD may have a read speed of 500 to 3,500 MB per second, while an HDD might offer 80 to 160 MB per second.
2:10 The next immediate access point after disk is RAM, or random access memory. RAM serves as the primary active data holder: it holds the data structures, variables, and application data that are currently in use or being processed. When a program runs, its variables, intermediate computations, runtime stack, and more are stored in RAM, because it allows for quick read and write access. RAM is volatile memory, which means it requires power to retain its contents; after you restart the computer, the data may not be persisted. In terms of size, RAM ranges from a few gigabytes in consumer devices to hundreds of gigabytes in high-end servers, and its read/write speed often surpasses 5,000 MB per second, faster than even the fastest SSD. But sometimes even this speed isn't enough, which brings us to the cache. The cache is smaller than RAM;
3:10 typically it's measured in megabytes, but access times for cache memory are even faster than RAM, just a few nanoseconds for the L1 cache. The CPU first checks the L1 cache for data; if it's not found, it checks the L2 and L3 caches, and finally it checks the RAM. The purpose of a cache is to reduce the average time to access data, which is why we store frequently used data here to optimize CPU performance.
3:40 And what about the CPU? The CPU is the brain of the computer: it fetches, decodes, and executes instructions. When you run your code, it's the CPU that processes the operations defined in that program. But before it can run our code, which is written in high-level languages like Java, C++, or Python, the code first needs to be compiled into machine code. A compiler performs this translation, and once the code is compiled into machine code, the CPU can execute it, reading and writing from RAM, disk, and cache.
4:12 Finally, we have the motherboard (or main board), which is the component that connects everything, providing the pathways that allow data to flow between these components.
4:23 Now let's have a look at the very high-level architecture of a production-ready app.
4:30 Our first key area is the CI/CD pipeline: continuous integration and continuous deployment. This ensures that our code goes from the repository, through a series of tests and pipeline checks, and onto the production server without any manual intervention. It's configured with platforms like Jenkins or GitHub Actions for automating our deployment processes.
4:50 Once our app is in production, it has to handle lots of user requests. This is managed by our load balancers and reverse proxies, like NGINX, which ensure that user requests are evenly distributed across multiple servers, maintaining a smooth user experience even during traffic spikes.
5:09 Our server is also going to need to store data. For that we have an external storage server that is not running on the same production server; instead, it's connected over a network. Our servers might also be communicating with other servers, and we can have many such services, not just one.
5:29 To ensure everything runs smoothly, we have logging and monitoring systems keeping a keen eye on every micro-interaction, storing logs and analyzing data. It's standard practice to store logs on external services, often outside of our primary production server.
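As a sketch of what application-side logging can look like, here is a minimal setup with Python's standard `logging` module; the service name and messages are hypothetical, and a real deployment would ship these records to an external log service rather than the console:

```python
import logging

# Minimal logging setup: in production, the handler would forward
# records to an external log service instead of printing them.
logger = logging.getLogger("checkout-service")  # hypothetical service name
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(name)s %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.info("order created")    # an ordinary event, kept for analysis
logger.error("payment failed")  # the kind of event that should trigger an alert
```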
5:44 For the back end, tools like PM2 can be used for logging and monitoring; on the front end, platforms like Sentry can be used to capture and report errors in real time.
5:53 And when things don't go as planned, meaning our logging systems detect failing requests or anomalies, first our alerting service is triggered; after that, push notifications are sent to keep users informed, from a generic "something went wrong" to a specific "payment failed". A modern practice is to integrate these alerts directly into platforms we commonly use, like Slack: imagine a dedicated Slack channel where alerts pop up the moment an issue arises. This allows developers to jump into action almost instantly, addressing the root cause before it escalates.
6:26 After that, developers have to debug the issue. First and foremost, the issue needs to be identified; those logs we spoke about earlier are our first port of call, and developers go through them searching for patterns or anomalies that could point to the source of the problem. Then it needs to be replicated in a safe environment: the golden rule is to never debug directly in the production environment. Instead, developers recreate the issue in a staging or test environment, which ensures users don't get affected by the debugging process. Then developers use tools to peer into the running application and start debugging. Once the bug is fixed, a hotfix is rolled out: a quick, temporary fix designed to get things running again, like a patch applied before a more permanent solution can be implemented.
7:13 In this section, let's understand the pillars of system design and what it really takes to create a robust and resilient application. Before we jump into the technicalities, let's talk about what actually makes a good design. When we talk about good design in system architecture, we are really focusing on a few key principles: scalability, which is how our system grows with its user base; maintainability, which is ensuring future developers can understand and improve our system; and efficiency, which is making the best use of our resources. But good design also means planning for failure: building a system that not only performs well when everything is running smoothly, but also maintains its composure when things go wrong.
7:58 At the heart of system design are three key elements: moving data, storing data, and transforming data. Moving data is about ensuring that data can flow seamlessly from one part of our system to another, whether it's user requests hitting our servers or data transfers between databases; we need to optimize for speed and security. Storing data isn't just about choosing between SQL or NoSQL databases; it's about understanding access patterns, indexing strategies, and backup solutions. We need to ensure that our data is not only stored securely, but is also readily available when needed. And transforming data is about taking raw data and turning it into meaningful information, whether that's aggregating log files for analysis or converting user input into a different format.
8:47 Now let's take a moment to understand a crucial concept in system design: the CAP theorem, also known as Brewer's theorem, named after computer scientist Eric Brewer. This theorem is a set of principles that guide us in making informed trade-offs between three key components of a distributed system: consistency, availability, and partition tolerance.
9:11 Consistency ensures that all nodes in the distributed system have the same data at the same time: if you make a change on one node, that change should also be reflected across all nodes. Think of it like updating a Google Doc: if one person makes an edit, everyone else sees that edit immediately. Availability means that the system is always operational and responsive to requests, regardless of what might be happening behind the scenes, like a reliable online store: no matter when you visit, it's always open and ready to take your order. And partition tolerance refers to the system's ability to continue functioning even when a network partition occurs, meaning that if there is a disruption in communication between nodes, the system still works. It's like a group chat where, even if one person loses connection, the rest of the group can continue chatting.
10:02 According to the CAP theorem, a distributed system can only achieve two out of these three properties at the same time. If you prioritize consistency and partition tolerance, you might have to compromise on availability, and vice versa. For example, a banking system needs to be consistent and partition-tolerant to ensure financial accuracy, even if it means some transactions take longer to process, temporarily compromising availability.
10:29 So every design decision comes with trade-offs: a system optimized for read operations might perform poorly on write operations, or in order to gain performance we might have to accept a bit of complexity. It's not about finding the perfect solution; it's about finding the best solution for our specific use case, and that means making informed decisions about where we can afford to compromise.
10:53 One important measurement of a system is availability: the measure of a system's operational performance and reliability. When we talk about availability, we are essentially asking: is our system up and running when our users need it? This is often measured as a percentage, aiming for that golden "five nines" availability. Let's say we are running a critical service with 99.9% availability; that allows for around 8.76 hours of downtime per year. But if we add two more nines, we are talking about just around 5 minutes of downtime per year, and that's a massive difference, especially for services where every second counts.
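Those downtime figures follow directly from the percentages; a quick sketch of the arithmetic:

```python
# Downtime allowed per year for a given availability percentage.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability_pct: float) -> float:
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

print(downtime_minutes_per_year(99.9) / 60)   # ~8.76 hours ("three nines")
print(downtime_minutes_per_year(99.999))      # ~5.26 minutes ("five nines")
```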
11:36 We often measure availability in terms of uptime and downtime, and here is where service level objectives and service level agreements come into place. SLOs are like setting goals for our system's performance and availability; for example, we might set an SLO stating that our web service should respond to requests within 300 milliseconds, 99.9% of the time. SLAs, on the other hand, are like formal contracts with our users or customers: they define the minimum level of service we are committing to provide. So if our SLA guarantees 99.99% availability and we drop below that, we might have to provide refunds or other compensation to our customers.
12:18 Building resilience into our system means expecting the unexpected. This could mean implementing redundant systems, ensuring there is always a backup ready to take over in case of failure, or it could mean designing our system to degrade gracefully, so that even if certain features are unavailable, the core functionality remains intact. To measure this aspect, we use reliability, fault tolerance, and redundancy. Reliability means ensuring that our system works correctly and consistently. Fault tolerance is about preparing for when things go wrong: how does our system handle unexpected failures or attacks? And redundancy is about having backups, ensuring that if one part of our system fails, there is another ready to take its place.
13:05 We also need to measure the speed of our system, and for that we have throughput and latency. Throughput measures how much
13:12 data our system can handle over a certain period of time. We have server throughput, which is measured in requests per second (RPS); this metric indicates how many client requests a server can handle in a given time frame, and a higher RPS value typically indicates better performance and the ability to handle more concurrent users. We have database throughput, which is measured in queries per second (QPS); this quantifies the number of queries a database can process in a second, and as with server throughput, a higher QPS value usually signifies better performance. And we also have data throughput, which is measured in bytes per second; this reflects the amount of data transferred over a network, or processed by a system, in a given period of time.
13:59 On the other hand, latency measures how long it takes to handle a single request: the time it takes for a request to get a response. Optimizing for one can often lead to sacrifices in the other; for example, batching operations can increase throughput but might also increase latency.
14:17 Designing a system poorly can lead to a lot of issues down the line, from performance bottlenecks to security vulnerabilities, and unlike code, which can be refactored easily, redesigning a system can be a monumental task. That's why it's crucial to invest time and resources into getting the design right from the start, laying a solid foundation that can support the weight of future features and user growth.
14:42 Now, let's talk about networking basics.
14:44 When we talk about networking basics, we are essentially discussing how computers communicate with each other. At the heart of this communication is the IP address, a unique identifier for each device on a network. IPv4 addresses are 32-bit, which allows for approximately 4 billion unique addresses; however, with the increasing number of devices, we are moving to IPv6, which uses 128-bit addresses, significantly increasing the number of available unique addresses.
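The size of each address space follows directly from the bit widths:

```python
# Size of the IPv4 vs IPv6 address spaces.
ipv4 = 2 ** 32    # 32-bit addresses
ipv6 = 2 ** 128   # 128-bit addresses: each extra bit doubles the space

print(ipv4)  # 4294967296, roughly 4.3 billion
print(ipv6)  # about 3.4e38, effectively inexhaustible
```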
15:14 When two computers communicate over a network, they send and receive packets of data. Each packet contains an IP header, which carries essential information like the sender's and receiver's IP addresses, ensuring that the data reaches the correct destination. This process is governed by the Internet Protocol, a set of rules that defines how data is sent and received.
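To make the idea concrete, here is a toy sketch of packing sender and receiver addresses into a header (simplified; a real IPv4 header also carries version, TTL, checksum, and more):

```python
import socket
import struct

# Simplified packet header: just source and destination IPv4 addresses,
# 4 bytes each. This is an illustration, not the real IPv4 header layout.
def pack_header(src: str, dst: str) -> bytes:
    return socket.inet_aton(src) + socket.inet_aton(dst)

def unpack_header(header: bytes) -> tuple:
    src, dst = struct.unpack("4s4s", header)
    return socket.inet_ntoa(src), socket.inet_ntoa(dst)

hdr = pack_header("203.0.113.5", "198.51.100.7")
print(unpack_header(hdr))  # ('203.0.113.5', '198.51.100.7')
```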
15:38 Besides the IP layer, we also have the application layer, where data specific to the application protocol is stored. The data in these packets is formatted according to a specific application protocol, like HTTP for web browsing, so that it is interpreted correctly by the receiving device.
15:57 Once we understand the basics of IP addressing and data packets, we can dive into the transport layer, where TCP and UDP come into play. TCP operates at the transport layer and ensures reliable communication; it's like a delivery guy who makes sure that your package not only arrives, but also checks that nothing is missing. Each data packet also includes a TCP header, carrying essential information like port numbers and the control flags necessary for managing the connection and data flow.
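To see why the information in those headers matters, here is a toy sketch of how a receiver can use sequence numbers to reassemble packets that arrive out of order (simplified; real TCP sequence numbers count bytes, not packets):

```python
# Packets may arrive out of order; the receiver reassembles the stream
# by sorting on the sequence number carried in each header.
packets = [
    (2, b"lo "), (1, b"hel"), (3, b"world"),  # (sequence number, payload)
]

def reassemble(packets):
    ordered = sorted(packets, key=lambda p: p[0])
    return b"".join(payload for _, payload in ordered)

print(reassemble(packets))  # b'hello world'
```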
16:29 TCP is known for its reliability: it ensures the complete and correct delivery of data packets. It accomplishes this through features like sequence numbers, which keep track of the order of packets, and a process known as the three-way handshake, which establishes a stable connection between two devices. In contrast, UDP is faster but less reliable than TCP: it doesn't establish a connection before sending data, and it doesn't guarantee the delivery or order of the packets. This makes UDP preferable for time-sensitive communications like video calls or live streaming, where speed is crucial and some data loss is acceptable.
17:08 To tie all these concepts together, let's talk about DNS, the domain name system. DNS acts like the internet's phone book, translating human-friendly domain names into IP addresses.
17:21 When you enter a URL in your browser, the browser sends a DNS query to find the corresponding IP address, allowing it to establish a connection to the server and retrieve the web page. The functioning of DNS is overseen by ICANN, which coordinates the global IP address space and the domain name system, and domain name registrars like Namecheap or GoDaddy are accredited by ICANN to sell domain names to the public. DNS uses different types of records, like A records, which map a domain to its corresponding IPv4 address, ensuring that your request reaches the correct server, or AAAA records, which map a domain name to an IPv6 address.
18:03 Finally, let's talk about the networking infrastructure that supports all this communication.
18:10 Devices on a network have either public or private IP addresses: public IP addresses are unique across the internet, while private IP addresses are unique within a local network. An IP address can be static, permanently assigned to a device, or dynamic, changing over time.
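Python's standard `ipaddress` module can classify addresses, which makes the public/private distinction easy to check:

```python
import ipaddress

# The stdlib can tell private (local-network) addresses from public ones.
print(ipaddress.ip_address("192.168.1.10").is_private)  # True: a typical home LAN address
print(ipaddress.ip_address("8.8.8.8").is_private)       # False: a public internet address
```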
18:28 Dynamic IP addresses are commonly used for residential internet connections. Devices connected in a local area network can communicate with each other directly, and to protect these networks we use firewalls, which monitor and control incoming and outgoing network traffic. Within a device, specific processes or services are identified by ports, which, when combined with an IP address, create a unique identifier for a network service.
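For illustration, a tiny sketch that pairs an IP with a well-known port number to name one service on one machine (the port table here is just a hand-picked sample):

```python
# An (IP, port) pair pins down one specific service on one machine.
WELL_KNOWN = {"http": 80, "https": 443, "ssh": 22}  # a small sample

def service_address(ip: str, protocol: str) -> str:
    return f"{ip}:{WELL_KNOWN[protocol]}"

print(service_address("203.0.113.5", "https"))  # 203.0.113.5:443
```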
18:58 Some ports are reserved for specific protocols, like 80 for HTTP or 22 for SSH.
19:05 Now let's cover the essential application layer protocols. The most common of these is HTTP, which stands for HyperText Transfer Protocol and is built on TCP/IP. It's a request-response protocol, but imagine it as a conversation with no memory: each interaction is separate, with no recollection of the past. This means the server doesn't have to store any context between requests; instead, each request contains all the necessary information. Notice how the headers include details like the URL and method, while the body carries the substance of the request or response. Each response also includes a status code, which provides feedback about the result of a client's request on a server. For instance, the 200 series are success codes, indicating that the request was successfully received and processed. The 300 series are redirection codes, signifying that further action needs to be taken by the user agent in order to fulfill the request. The 400 series are client error codes, used when the request contains bad syntax or cannot be fulfilled. And the 500 series are server error codes, indicating that something went wrong on the server.
20:21 We also have a method on each request; the most common methods are GET, POST, PUT, PATCH, and DELETE. GET is used for fetching data, POST is usually for creating data on the server, PUT and PATCH are for updating a record, and DELETE removes a record from the database.
20:39 HTTP is a one-way, request-response connection, but for real-time updates we use WebSockets, which provide a two-way communication channel over a single long-lived connection, allowing servers to push real-time updates to clients. This is very important for applications requiring constant data updates without the overhead of repeated HTTP request-response cycles; it is commonly used for chat applications, live sports updates, or stock market feeds, where the action never stops and neither does the conversation.
21:11 protocols SMTP is the standard for email
21:14 transmission over the Internet it is the
21:16 protocol for sending email messages
21:19 between servers most email clients use
21:22 SMTP for sending emails and either IMAP
21:25 or pop free for retrieving them imup is
21:27 used to retrieve emails from a server
21:29 allowing a client to access and
21:31 manipulate messages this is ideal for
21:33 users who need to access their emails
21:35 from multiple
21:37 devices pop free is used for downloading
21:40 emails from a server to a local client
21:42 typically used when emails are managed
21:45 from a single device moving on to file
21:47 transfer and management protocols the
21:49 traditional protocol for transferring
21:53 files over the Internet is FTP which is
21:55 often used in Website Maintenance and
21:58 large data transfers it is used for the
22:00 trans of files between a client and
22:02 server useful for uploading files to
22:05 server or backing up files and we also
22:08 have SSH or secure shell which is for
22:10 operating Network Services securely on
22:13 an unsecured Network it's commonly used
22:15 for logging into a remote machine and
22:19 executing commands or transferring files
22:21 There are also real-time communication protocols, like WebRTC, which enables browser-to-browser applications for voice calling, video chat, and file sharing without internal or external plugins; this is essential for applications like video conferencing and live streaming. Another one is MQTT, a lightweight messaging protocol ideal for devices with limited processing power and for scenarios requiring low bandwidth, such as IoT devices. And AMQP is a protocol for message-oriented middleware, providing robustness and security for enterprise-level message communication; for example, it is used in tools like RabbitMQ.
23:03 Let's also talk about RPC, which is a protocol that allows a program on one computer to execute code on a server or another computer. It's a method used to invoke a function as if it were a local call, when in reality the function is executed on a remote machine, so it abstracts away the details of the network communication, allowing the developer to interact with remote functions seamlessly, as if they were local to the application. Many application layer protocols use RPC mechanisms to perform their operations; for example, in web services, HTTP requests can result in RPC calls being made on the back end to process data or perform actions on behalf of the client, and SMTP servers might use RPC calls internally to process email messages or interact with databases. Of course, there are numerous other application layer protocols, but the ones covered here are among the most commonly used and essential for web development.
24:02 In this section, let's go through API design, starting from the basics and advancing towards the best practices that define exceptional APIs.
24:11 Let's consider an API for an e-commerce platform like Shopify, which, if you're not familiar with it, is a well-known e-commerce platform that allows businesses to set up online stores. In API design we are concerned with defining the inputs, like the product details for a new product provided by a seller, and the outputs, like the information returned when someone queries a product. So the focus is mainly on defining how the CRUD operations are exposed to the user interface. CRUD stands for create, read, update, and delete, the basic operations of any data-driven application. For example, to add a new product we send a POST request to /api/products, with the product details in the request body. To retrieve products, we send a GET request to /api/products. For updating, we use PUT or PATCH requests to /api/products/:id, and removing is similar: a DELETE request to /api/products/:id. Similarly, we might also have another GET request to /api/products/:id, which fetches a single product.
25:23 Another part is to decide on the communication protocol that will be used, like HTTP, WebSockets, or other protocols, and the data transport mechanism, which can be JSON, XML, or protocol buffers. This is usually the case for RESTful APIs, but we also have the GraphQL and gRPC paradigms.
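The CRUD endpoints described above can be sketched as plain functions over an in-memory store; this is a toy stand-in for a real database and web framework, with limit/offset pagination included on the read path:

```python
# In-memory sketch of the handlers behind the /api/products routes;
# a dict stands in for the real database.
products = {}
next_id = 1

def create_product(data: dict) -> dict:            # POST /api/products
    global next_id
    product = {"id": next_id, **data}
    products[next_id] = product
    next_id += 1
    return product

def list_products(limit: int = 10, offset: int = 0) -> list:
    # GET /api/products?limit=&offset= : pagination via limit and offset
    return list(products.values())[offset:offset + limit]

def update_product(pid: int, data: dict) -> dict:  # PUT/PATCH /api/products/:id
    products[pid].update(data)
    return products[pid]

def delete_product(pid: int) -> None:              # DELETE /api/products/:id
    del products[pid]

p = create_product({"name": "mug", "price": 9.99})
update_product(p["id"], {"price": 7.99})
print(list_products())  # [{'id': 1, 'name': 'mug', 'price': 7.99}]
```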
25:44 So APIs come in different paradigms, each with its own set of protocols and standards. The most common one is REST, which stands for Representational State Transfer. It is stateless, which means that each request from a client to a server must contain all the information needed to understand and complete the request. It uses the standard HTTP methods (GET, POST, PUT, and DELETE), and it's easily consumable by different clients, browsers or mobile apps. The downside of RESTful APIs is that they can lead to over-fetching or under-fetching of data, because more endpoints may be required to access specific data; usually, RESTful APIs use JSON for data exchange.
26:27 GraphQL APIs, on the other hand, allow clients to request exactly what they need, avoiding over-fetching and under-fetching of data. They have strongly typed queries, but complex queries can impact server performance. All requests are sent as POST requests, and a GraphQL API typically responds with an HTTP 200 status code even in case of errors, with the error details in the response body.
26:52 gRPC stands for Google Remote Procedure Call. It is built on HTTP/2, which provides advanced features like multiplexing and server push, and it uses protocol buffers, a way of serializing structured data; because of that, it's efficient in terms of bandwidth and resources, making it especially suitable for microservices. The downside is that it's less human-readable compared to JSON, and it requires HTTP/2 support to operate.
27:23 In an e-commerce setting, you might have relationships like user-to-orders or orders-to-products, and you need to design endpoints that reflect these relationships; for example, to fetch the orders for a specific user, you would query GET /users/:id/orders. Common query parameters also include limit and offset for pagination, or a start and end date for filtering products within a certain date range. This allows the client to retrieve specific sets of data without overwhelming the system.
27:57 A well-designed GET request should be idempotent, meaning that calling it multiple times doesn't change the result, and it should always return the same result. GET requests should never mutate data; they are meant only for retrieval. If you need to update or create data, you should use a PUT or POST request.
28:14 When modifying endpoints, it's important to maintain backward compatibility: we need to ensure that changes don't break existing clients. A common practice is to introduce new versions, like /v2/products, so that the version 1 API can still serve old clients while the version 2 API serves current clients. That's the case for RESTful APIs; for GraphQL APIs, adding new fields (like v2 fields) without removing the old ones helps evolve the API without breaking existing clients.
28:49 Another best practice is to set rate limits. This can protect the API from DoS attacks: it is used to control the number of requests a user can make in a certain time frame, and it prevents a single user from sending too many requests to your API. A common practice is also to set CORS (cross-origin resource sharing) settings; with CORS settings you can control which domains can access your API, preventing unwanted cross-site interactions.
29:20 Now imagine a company is hosting a website on a server in Google Cloud's data centers in Finland. It may take around 100 milliseconds to load for users in Europe, but 3 to 5 seconds for users in Mexico.
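Distance alone accounts for a big share of that gap; a rough sketch, assuming signals in fiber travel at about two-thirds the speed of light, with the distances picked as illustrative round numbers (real latency is higher still, because of routing detours, queuing, and server time):

```python
# Lower bound on round-trip time from distance alone, assuming light in
# fiber travels at roughly 200,000 km/s (about 2/3 of c).
FIBER_KM_PER_S = 200_000

def min_rtt_ms(distance_km: float) -> float:
    return 2 * distance_km / FIBER_KM_PER_S * 1000

print(min_rtt_ms(2000))    # Finland to central Europe, ~2,000 km: 20 ms floor
print(min_rtt_ms(10000))   # Finland to Mexico, ~10,000 km: 100 ms floor
```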
29:35 fortunately there are strategies to
29:37 minimize this request latency for users
29:39 who are far away these strategies are
29:41 called caching and content delivery
29:43 networks which are two important
29:45 Concepts in modern web development and
29:48 system design caching is a technique
29:50 used to improve the performance and
29:52 efficiency of a system it involves
29:54 storing a copy of certain data in a
29:56 temporary storage so that future
29:58 requests for that data can be served
30:01 faster there are four common places
30:03 where cash can be stored the first one
30:05 is browser caching where we store
30:07 website resources on a user's local
30:10 computer so when a user revisits a site
30:12 the browser can load the site from the
30:14 local cache rather than fetching
30:16 everything from the server again users
30:19 can disable caching by adjusting the
30:21 browser settings in most browsers
30:23 developers can disable caching from the
30:25 developer tools for instance in Chrome
30:27 we have the disable cache option in the
30:30 DevTools Network tab the cache
30:32 is stored in a directory on the client's
30:35 hard drive managed by the browser and
30:38 browser caches store HTML CSS and JS
30:40 bundle files on the user's local machine
30:43 typically in a dedicated cache directory
30:46 managed by the browser we use the Cache-
30:48 Control header to tell the browser how long
30:50 this content should be cached for
30:53 example here Cache-Control is set to
30:56 7,200 seconds which is equivalent to 2
30:59 hours when the requested data is found in
31:01 the cache we call that a cache hit and on
31:03 the other hand we have a cache miss which
31:05 happens when the requested data is not
31:07 in the cache necessitating a fetch from
31:10 the original source and the cache hit ratio
31:12 is the percentage of requests that are
31:14 served from the cache compared to all
31:16 requests and a higher ratio indicates
31:18 a more effective cache you can check if
31:20 the cache was hit or missed from the
31:23 X-Cache header for example in this case it
31:26 says Miss so the cache was missed and in
31:27 case the cache is found we will have Hit
31:30 here we also have server caching
31:32 which involves storing frequently
31:34 accessed data on the server site
31:36 reducing the need to perform expensive
31:39 operations like database queries server-
31:41 side caches are stored on a server or on
31:44 a separate cache server either in memory
31:47 like Redis or on disk typically the
31:49 server checks the cache for the data
31:51 before querying the database if the data
31:53 is in the cache it is returned directly
31:56 otherwise the server queries the
31:58 database and if the data is not in the
32:00 cache the server retrieves it from the
32:03 database returns it to the user and then
32:05 stores it in the cache for future
32:07 requests this is the case of the write-
32:09 around cache where data is written
32:11 directly to permanent storage
32:14 bypassing the cache it is used when write
32:16 performance is less critical we also
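The server-side read flow described above (check the cache, fall back to the database, then populate the cache for future requests) is commonly called cache-aside; here is a minimal sketch where plain dicts stand in for the real cache and database:

```python
# Cache-aside sketch: check the cache first, fall back to the
# "database" (a dict standing in for real storage), then populate
# the cache so the next request is served faster.
DATABASE = {"user:1": {"name": "Ada"}}
CACHE = {}

def get(key):
    if key in CACHE:
        return CACHE[key], "hit"        # cache hit: no database query
    value = DATABASE.get(key)           # cache miss: query the database
    if value is not None:
        CACHE[key] = value              # store for future requests
    return value, "miss"
```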
32:18 have the write-through cache where data is
32:21 simultaneously written to the cache and the
32:23 permanent storage it ensures data
32:25 consistency but can be slower than write-
32:28 around cache and we also have the write-back
32:30 cache where data is first written to the
32:32 cache and then to permanent storage at a
32:34 later time this improves write
32:36 performance but you have a risk of
32:39 losing that data in case of a server
32:41 crash but what happens if the cache is
32:43 full and we need to free up some space
32:46 to use our cache again for that we have
32:48 eviction policies which are rules that
32:50 determine which items to remove from the
32:53 cache when it's full common policies are
32:56 to remove the least recently used items or
32:58 first in first out where we remove the
33:00 ones that were added first or removing
33:03 the least frequently used ones database
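The least-recently-used policy mentioned above can be sketched with Python's `OrderedDict`; the class name and capacity are illustrative:

```python
from collections import OrderedDict

# LRU eviction sketch: every access moves a key to the end, and when
# the cache exceeds capacity the item at the front (the least recently
# used one) is evicted.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used
```

FIFO and LFU follow the same shape but pick the victim by insertion order or by access count instead.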
33:05 caching is another crucial aspect and it
33:07 refers to the practice of caching
33:09 database query results to improve the
33:11 performance of database driven
33:14 applications it is often done either
33:16 within the database system itself or via
33:19 an external caching layer like Redis or
33:21 Memcached when a query is made we first
33:24 check the cache to see if the result of
33:26 that query has been stored if it is we
33:28 return the cached result avoiding the need
33:30 to execute the query against the
33:33 database but if the data is not found in
33:35 the cache the query is executed against
33:37 the database and the result is stored in
33:40 the cache for future requests this is
33:42 beneficial for read heavy applications
33:45 where some queries are executed
33:47 frequently and we use the same eviction
33:49 policies as we have for server-side
33:52 caching another type of caching is CDNs
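The database caching flow above can be sketched with SQLite and a dict keyed by the SQL text; a real setup would use an external layer like Redis and would also need cache invalidation, which this sketch ignores:

```python
import sqlite3

# Database-caching sketch: query results are memoized in a dict keyed
# by the SQL string, so repeated reads of the same query skip the
# database entirely.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT)")
conn.execute("INSERT INTO products VALUES (1, 'Widget')")

query_cache = {}

def cached_query(sql):
    if sql in query_cache:
        return query_cache[sql]          # served from the cache
    rows = conn.execute(sql).fetchall()  # executed against the database
    query_cache[sql] = rows              # stored for future requests
    return rows
```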
33:54 which are a network of servers
33:56 distributed geographically they are
33:58 generally used to serve static content
34:01 such as JavaScript HTML CSS or image and
34:04 video files they cache the content from
34:06 the original server and deliver it to
34:09 users from the nearest CDN server when a
34:11 user requests a file like an image or a
34:14 website the request is redirected to the
34:17 nearest CDN server if the CDN server has
34:19 the cached content it delivers it to the
34:22 user if not it fetches the content from
34:24 the origin server caches it and then
34:27 forwards it to the user this is the pull-
34:29 based type of CDN where the CDN
34:31 automatically pulls the content from the
34:33 origin server when it's first requested
34:36 by a user it's ideal for websites with a
34:38 lot of static content that is updated
34:41 regularly it requires less active
34:43 management because the CDN automatically
34:46 keeps the content up to date another
34:48 type is push-based CDNs this is where you
34:51 upload the content to the origin server
34:52 and then it distributes these files to
34:55 the CDN this is useful when you have
34:57 large files that are infrequently
34:58 updated but need to be quickly
35:01 distributed when updated it requires
35:03 more active management of what content
35:06 is stored on the CDN we again use the
35:08 Cache-Control header to tell the browser
35:10 for how long it should cache the content
35:13 from the CDN CDNs are usually used for
35:16 delivering static assets like images CSS
35:18 files JavaScript bundles or video
35:21 content and it can be useful if you need
35:22 to ensure High availability and
35:25 performance for users it can also reduce
35:28 the load on the origin server but there
35:29 are some instances where we still need
35:32 to hit our origin server for example
35:34 when serving Dynamic content that
35:37 changes frequently or handling tasks
35:39 that require real-time processing and in
35:41 cases where the application requires
35:44 complex server side logic that cannot be
35:46 done in the CDN some of the benefits
35:50 that we get from CDN are reduced latency
35:52 by serving content from locations closer
35:55 to the user CDN significantly reduce
35:58 latency it also adds high availability
36:01 and scalability CDNs can handle high
36:03 traffic loads and are resilient against
36:06 Hardware failures it also adds improved
36:09 security because many CDNs offer security
36:11 features like DDoS protection and traffic
36:14 encryption and the benefits of caching
36:16 are also reduced latency because we have
36:18 fast data retrieval since the data is
36:20 fetched from the nearby cache rather
36:23 than a remote server it lowers the
36:25 server load by reducing the number of
36:27 requests to the primary data source
36:30 decreasing server load and overall
36:32 faster load times lead to a better user
36:35 experience now let's talk about proxy
36:37 servers which act as an intermediary
36:39 between a client requesting a resource
36:42 and the server providing that resource
36:44 it can serve various purposes like
36:46 caching resources for faster access
36:49 anonymizing requests and load balancing
36:51 among multiple servers essentially it
36:53 receives requests from clients forwards
36:56 them to the relevant servers and then
37:00 returns the server's response back to the
37:00 client there are several types of proxy
37:03 servers each serving different purposes
37:05 here are some of the main types the
37:08 first one is forward proxy which sits in
37:10 front of clients and is used to send
37:12 requests to other servers on the
37:15 Internet it's often used within the
37:17 internal networks to control internet
37:20 access next one is reverse proxy which
37:22 sits in front of one or more web servers
37:25 intercepting requests from the internet
37:27 it is used for load balancing web
37:30 acceleration and as a security layer
37:33 another type is open proxy which allows
37:35 any user to connect and utilize the
37:38 proxy server often used to anonymize web
37:41 browsing and bypass content restrictions
37:43 we also have transparent proxy types
37:45 which passes along requests and
37:47 resources without modifying them but
37:49 it's visible to the client and it's
37:51 often used for caching and content
37:54 filtering next type is anonymous proxy
37:57 which is identifiable as a proxy server
37:59 but does not make the original IP
38:02 address available this type is used for
38:04 anonymous browsing we also have
38:06 distorting proxies which provide an
38:09 incorrect origin IP to the destination
38:11 server this is similar to an anonymous
38:14 proxy but with purposeful IP
38:16 misinformation and next popular type is
38:19 high anonymity proxy or Elite proxy
38:22 which makes detecting the proxy use very
38:24 difficult these proxies do not send the
38:27 X-Forwarded-For or other identifying
38:30 headers and they ensure maximum anonymity
38:32 the most commonly used proxy servers are
38:35 forward and reverse proxies a forward
38:37 proxy acts as a middle layer between the
38:40 client and the server it sits between
38:42 the client which can be a computer on an
38:44 internal Network and the external
38:47 servers which can be websites on the
38:49 internet when the client makes a request
38:52 it is first sent to the forward proxy
38:54 the proxy then evaluates the request and
38:57 decides based on its configuration and
38:59 rules whether to allow the request
39:02 modify it or to block it one of the
39:04 primary functions of a forward proxy is
39:07 to hide the client's IP address when it
39:09 forwards the request to the Target
39:12 server it appears as if the request is
39:14 coming from the proxy server itself
39:17 let's look at some example use cases of
39:20 forward proxies one popular example is
39:23 Instagram proxies these are a specific
39:25 type of forward proxy used to manage
39:27 multiple Instagram accounts without
39:30 triggering bans or restrictions and
39:32 marketers and social media managers use
39:34 Instagram proxies to appear as if they
39:37 are located in different areas or as
39:39 different users which allows them to
39:42 manage multiple accounts automate tasks
39:44 or gather data without being flagged for
39:47 suspicious activity next example is
39:49 internet use control and monitoring
39:52 proxies some organizations use forward
39:55 proxies to Monitor and control employee
39:57 internet usage they can block access to
40:00 non-work-related sites and protect against
40:03 web based threats they can also scan for
40:06 viruses and malware in incoming content
40:08 next common use case is caching
40:10 frequently accessed content forward
40:13 proxies can also cache popular websites
40:15 or content reducing bandwidth usage and
40:18 speeding up access for users within the
40:21 network this is especially beneficial in
40:23 networks where bandwidth is costly or
40:26 limited and it can also be used for
40:28 anonymizing web access people who are
40:30 concerned about privacy can use forward
40:33 proxies to hide their IP address and
40:35 other identifying information from
40:38 websites they visit making it
40:40 difficult to track their web browsing
40:42 activities on the other hand the reverse
40:45 proxy is a type of proxy server that
40:47 sits in front of one or more web servers
40:49 intercepting requests from clients
40:52 before they reach the servers while a
40:54 forward proxy hides the client's
40:56 identity a reverse proxy essentially
40:58 hides the server's identity or the
41:01 existence of multiple servers behind it
41:03 the client interacts only with the
41:05 reverse proxy and may not know about the
41:08 servers behind it it also distributes
41:11 client requests across multiple servers
41:13 balancing load and ensuring no single
41:16 server becomes overwhelmed reverse proxy
41:19 can also compress inbound and outbound
41:21 data cache files and manage SSL
41:23 encryption thereby speeding up load
41:26 times and reducing server load some
41:28 common use cases of reverse proxies
41:31 are load balancers these distribute
41:33 incoming Network traffic across multiple
41:36 servers ensuring no single server gets
41:38 too much load and by Distributing
41:41 traffic we prevent any single server
41:43 from becoming a bottleneck and it's
41:45 maintaining optimal service speed and
41:48 reliability CDNs are also a type of
41:50 reverse proxies they are a network of
41:53 servers that deliver cached static content
41:55 from websites to users based on the
41:58 geographical location of the user they
42:00 act as Reverse proxies by retrieving
42:02 content from the origin server and
42:04 caching it so that it's closer to the
42:07 user for faster delivery another example
42:10 is web application firewalls which are
42:13 positioned in front of web applications
42:15 they inspect incoming traffic to block
42:17 hacking attempts and filter out unwanted
42:20 traffic firewalls also protect the
42:22 application from common web exploits and
42:25 another example is SSL offloading or
42:28 acceleration some reverse proxies handle
42:31 the encryption and decryption of SSL TLS
42:33 traffic offloading that task from web
42:36 servers to optimize their performance
42:38 load balancers are perhaps the most
42:41 popular use case of proxy servers they
42:43 distribute incoming traffic across
42:45 multiple servers to make sure that no
42:48 server bears too much load by spreading
42:49 the requests effectively they increase
42:52 the capacity and reliability of
42:54 applications here are some common
42:56 strategies and algorithms used in load
42:57 balancing
42:59 first one is round robin which is the
43:02 simplest form of load balancing where
43:04 each server in the pool gets a request
43:06 in sequential rotating order when the
43:08 last server is reached it Loops back to
43:11 the first one this type works well for
43:14 servers with similar specifications and
43:16 when the load is uniformly
43:18 distributable next one is the least
43:20 connections algorithm which directs
43:22 traffic to the server with the fewest
43:25 active connections it's ideal for longer
43:27 tasks or when the server load is not
43:30 evenly distributed next we have the
43:32 least response time algorithm which
43:34 chooses the server with the lowest
43:36 response time and fewest active
43:38 connections this is effective when the
43:40 goal is to provide the fastest response
43:44 to requests next algorithm is IP hashing
43:46 which determines which server receives
43:48 the request based on the hash of the
43:51 client's IP address this ensures a
43:53 client consistently connects to the same
43:55 server and it's useful for session
43:57 persistence in applications where it's
43:59 important that the client consistently
44:02 connects to the same server variants
44:04 of these methods can also be weighted which
44:07 brings us to the weighted algorithms for
44:09 example in weighted round robin or
44:11 weighted least connections servers are
44:13 assigned weights typically based on
44:16 their capacity or performance metrics
44:18 and the servers which are more capable
44:20 handle the most requests this is
44:22 effective when the servers in the pool
44:24 have different capabilities like
44:27 different CPUs or amounts of RAM we also
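The round robin and least connections strategies described above can be sketched in a few lines; the server names and connection counts are invented for illustration:

```python
import itertools

# Sketch of two load-balancing strategies over a pool of servers.
SERVERS = ["server-a", "server-b", "server-c"]

# Round robin: hand out servers in a repeating sequential order.
_rotation = itertools.cycle(SERVERS)
def round_robin():
    return next(_rotation)

# Least connections: pick the server with the fewest active connections.
active_connections = {"server-a": 5, "server-b": 2, "server-c": 9}
def least_connections():
    return min(active_connections, key=active_connections.get)
```

A weighted variant would repeat each server in the rotation in proportion to its assigned weight, or divide the connection count by the weight before comparing.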
44:30 have geographical algorithms which
44:32 direct requests to the server
44:34 geographically closest to the user or
44:37 based on specific Regional requirements
44:39 this is useful for Global Services where
44:42 latency reduction is priority and the
44:44 next common algorithm is consistent
44:47 hashing which uses a hash function to
44:49 distribute data across various nodes
44:52 imagine a hash space that forms a circle
44:54 where the end wraps around to the
44:56 beginning often referred to as a hash
44:59 ring and both the nodes and the data
45:01 like keys or stored values are hashed
45:04 onto this ring this ensures that a key
45:06 consistently maps to the same server even
45:09 when nodes join or leave an essential feature
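A minimal consistent-hash ring can be sketched as follows; real implementations add virtual nodes per server for a more even spread, which this sketch omits:

```python
import bisect
import hashlib

# Consistent-hash ring sketch: servers and keys are hashed onto the
# same circular space, and each key goes to the first server at or
# after its position on the ring, wrapping around at the end.
def _hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        self.ring = sorted((_hash(n), n) for n in nodes)

    def node_for(self, key):
        h = _hash(key)
        positions = [p for p, _ in self.ring]
        i = bisect.bisect(positions, h) % len(self.ring)  # wrap around
        return self.ring[i][1]
```

The payoff over plain modulo hashing is that adding or removing one node only remaps the keys in that node's arc of the ring, not almost every key.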
45:11 of load balancers is continuous Health
45:14 checking of servers to ensure traffic is
45:16 only directed to servers that are online
45:19 and responsive if a server fails the
45:22 load balancer will stop sending traffic
45:25 to it until it is back online and load
45:27 balancers can be in different forms
45:30 including Hardware appliances software
45:33 Solutions and cloud-based Services some
45:35 of the popular Hardware load balancers
45:38 are F5 BIG-IP which is a widely used
45:40 Hardware load balancer known for its
45:42 high performance and extensive feature
45:45 set it offers local traffic management
45:48 Global server load balancing and
45:51 application security another example is
45:54 Citrix ADC formerly known as NetScaler
45:55 which provides load balancing content
45:58 switching and application acceleration some
46:01 popular software load balancers are
46:03 HAProxy which is a popular open-source
46:06 software load balancer and proxy server
46:10 for TCP and HTTP based applications and
46:12 of course NGINX which is often used as a
46:15 web server but it also functions as a
46:18 load balancer and reverse proxy for HTTP
46:20 and other network protocols and some
46:23 popular cloud-based load balancers are
46:27 AWS's Elastic Load Balancing Microsoft
46:30 Azure Load Balancer or Google Cloud's
46:32 load balancer there are even some
46:35 virtual load balancers like VMware's NSX
46:37 Advanced Load Balancer which offers a
46:39 software-defined application delivery
46:41 controller that can be deployed on
46:44 premises or in the cloud now let's see
46:47 what happens when a load balancer goes
46:49 down when the load balancer goes down it
46:52 can impact the whole availability and
46:54 performance of the application or
46:57 Services it manages it's basically a
46:59 single point of failure and in case it
47:01 goes down all of the servers become
47:04 unavailable for the clients to avoid or
47:06 minimize the impact of a load balancer
47:09 failure we have several strategies which
47:10 can be employed first one is
47:13 implementing a redundant load balancing
47:15 by using more than one load balancer
47:18 often in pairs which is a common
47:20 approach if one of them fails the other
47:22 one takes over which is a method known
47:23 as
47:25 failover next strategy is to
47:27 continuously monitor and do health
47:30 checks of load balancer itself this can
47:32 ensure that any issues are detected
47:35 early and can be addressed before
47:37 causing significant disruption we can
47:39 also implement auto-scaling and
47:42 self-healing systems some modern
47:43 infrastructures are designed to
47:45 automatically detect the failure of a load
47:47 balancer and replace it with a new
47:51 instance without manual intervention and
47:53 in some configurations DNS failover
47:56 can reroute traffic away from an IP
47:58 address that is no longer accepting
48:01 connections like a failed load balancer
48:03 to a preconfigured standby IP which is
48:06 our new load balancer system design
48:08 interviews are incomplete without a deep
48:10 dive into databases in the next few
48:12 minutes I'll take you through the
48:14 database Essentials you need to
48:16 understand to ace that interview we'll
48:18 explore the role of databases in system
48:20 design sharding and replication
48:24 techniques and the key ACID properties
48:25 we'll also discuss different types of
48:28 databases vertical and horizontal
48:30 scaling options and database performance
48:32 techniques we have different types of
48:35 databases each designed for specific
48:38 tasks and challenges let's explore them
48:41 first type is relational databases think
48:43 of a relational database like a well
48:45 organized filing cabinet where all the
48:47 files are neatly sorted into different
48:50 drawers and folders some popular
48:53 examples of SQL databases are PostgreSQL
48:57 MySQL and SQLite all of the SQL
49:01 databases use tables for data storage
49:04 and they use SQL as a query language
49:06 they are great for transactions complex
49:09 queries and integrity relational
49:11 databases are also ACID compliant
49:14 meaning they maintain the ACID properties
49:17 A stands for atomicity which means that
49:20 transactions are all or nothing C stands
49:23 for consistency which means that after a
49:25 transaction your database should be in a
49:28 consistent state I is isolation which
49:30 means that transactions should be
49:33 independent and D is for durability
49:34 which means that once a transaction is
49:37 committed the data is there to stay we
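The atomicity property can be demonstrated with SQLite, which ships with Python; the table, names, and amounts below are invented for illustration:

```python
import sqlite3

# Atomicity sketch: both steps of a transfer commit together or not at
# all. Using the connection as a context manager opens a transaction
# that commits on success and rolls back on any exception.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(frm, to, amount):
    try:
        with conn:
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, frm))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, to))
            row = conn.execute("SELECT balance FROM accounts WHERE name = ?",
                               (frm,)).fetchone()
            if row[0] < 0:
                raise ValueError("insufficient funds")  # aborts the whole transaction
    except ValueError:
        pass  # rollback already happened; both balances are unchanged

def balance(name):
    return conn.execute("SELECT balance FROM accounts WHERE name = ?",
                        (name,)).fetchone()[0]
```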
49:40 also have NoSQL databases which relax the
49:43 consistency property of ACID
49:45 imagine a NoSQL database as a
49:47 brainstorming board with sticky notes
49:50 you can add or remove notes in any shape
49:52 of form it's flexible some popular
49:55 examples are MongoDB Cassandra and
49:56 Redis there are different
49:59 types of NoSQL databases such as key-
50:02 value stores like Redis document-based
50:05 databases like MongoDB or graph-based
50:09 databases like Neo4j NoSQL databases
50:11 are schemaless meaning they don't have
50:13 foreign keys between tables which link
50:16 the data together they are good for
50:19 unstructured data ideal for scalability
50:22 quick iteration and simple queries there
50:25 are also in-memory databases this is like
50:27 having a whiteboard for quick
50:30 calculations and temporary sketches it's
50:32 fast because everything is in memory
50:35 some examples are Redis and Memcached they
50:37 have lightning fast data retrieval and
50:39 are used primarily for caching and
50:42 session storage now let's see how we can
50:44 scale databases the first option is
50:47 vertical scaling or scale up in vertical
50:49 scaling you improve the performance of
50:51 your database by enhancing the
50:54 capabilities of the individual server
50:56 where the database is running this could involve
50:59 increasing CPU power adding more RAM
51:01 adding faster or more disk storage or
51:03 upgrading the network but there is a
51:05 maximum limit to the resources you can
51:08 add to a single machine and because of
51:10 that it's very limited the next option
51:13 is horizontal scaling or scale out which
51:15 involves adding more machines to the
51:17 existing pool of resources rather than
51:20 upgrading the single unit databases that
51:22 support horizontal scaling distribute
51:25 data across a cluster of machines this
51:27 could involve database sharding or data
51:30 replication the first option is database
51:32 sharding which is distributing different
51:34 portions or shards of the data set across
51:37 multiple servers this means you split
51:39 the data into smaller chunks and
51:41 distribute it across multiple servers
51:44 some of the sharding strategies include
51:46 range based sharding where you
51:48 distribute data based on the range of a
51:51 given key directory based sharding which
51:53 is utilizing a lookup service to direct
51:56 traffic to the correct database we also
51:58 have geographical sharding which is
52:00 splitting databases based on
52:01 geographical locations
52:03 and the next horizontal
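The range-based sharding strategy above can be sketched as a simple routing function; the ID ranges and shard names are invented for illustration:

```python
# Range-based sharding sketch: each user ID is routed to a shard based
# on which range the ID falls into.
SHARD_RANGES = [
    (0, 1_000_000, "shard-1"),
    (1_000_000, 2_000_000, "shard-2"),
    (2_000_000, 3_000_000, "shard-3"),
]

def shard_for(user_id):
    for low, high, shard in SHARD_RANGES:
        if low <= user_id < high:
            return shard  # queries for this user go to this server
    raise ValueError("no shard covers this id")
```

Directory-based sharding replaces the hard-coded ranges with a lookup service, and geographical sharding keys on the user's region instead of an ID range.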
52:06 scaling option is data replication this
52:08 is keeping copies of data on multiple
52:11 servers for high availability we have
52:14 master-slave replication which is where
52:16 you have one master database and several
52:19 read-only slave databases or you can
52:22 have master-master replication which is
52:24 multiple databases that can both read
52:27 and write scaling your database is
52:29 one thing but you also want to access it
52:31 faster so let's talk about different
52:33 performance techniques that can help to
52:36 access your data faster the most obvious
52:39 one is caching caching isn't just for
52:41 web servers database caching can be done
52:44 through in-memory databases like Redis
52:46 you can use it to cache frequent queries
52:48 and boost your performance the next
52:50 technique is indexing indexes are
52:52 another way to boost the performance of
52:55 your database creating an index on
52:56 frequently accessed columns will
52:59 significantly speed up retrieval times
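The effect of an index can be observed with SQLite's EXPLAIN QUERY PLAN, which reports whether a query scans the whole table or searches an index; the table and index names below are illustrative:

```python
import sqlite3

# Indexing sketch: before CREATE INDEX the lookup scans the whole
# table, afterwards the query plan shows an index search instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}@example.com") for i in range(1000)])

def plan():
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM users WHERE email = ?",
        ("user500@example.com",)).fetchall()
    return rows[0][-1]  # human-readable plan description

before = plan()  # full table scan
conn.execute("CREATE INDEX idx_users_email ON users(email)")
after = plan()   # lookup via the index
```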
53:01 and the next technique is query
53:03 optimization you can also consider
53:05 optimizing queries for fast data access
53:07 this includes minimizing joins and
53:10 using tools like a SQL query analyzer or
53:13 EXPLAIN plans to understand your query's
53:15 performance in all cases you should
53:17 remember the CAP theorem which states
53:19 that you can only have two of these
53:21 three consistency availability and
53:24 partition tolerance when designing a
53:26 system you should prioritize two of
53:28 these based on the requirements that you
53:30 are given in the interview if you
53:32 enjoyed this crash course then consider
53:34 watching my other videos about system
53:36 Design Concepts and interviews see you