This tutorial provides a comprehensive crash course on system design, covering fundamental concepts from computer architecture and networking to advanced topics like databases, APIs, caching, and load balancing, all aimed at preparing individuals for system design interviews.
This complete system design tutorial covers scalability, reliability, data handling, and high-level architecture, with clear explanations, real-world examples, and practical strategies. It will teach you the core concepts you need to know for a system design interview. This is a complete crash course on the system design interview concepts you need to ace your job interview. The system design interview doesn't have much to do with coding; interviewers don't want to see you write actual code, but rather how you glue an entire system together, and that is exactly what we're going to cover in this tutorial. We'll go through all of the concepts you need to know to ace your job interview.
Before designing large-scale distributed systems, it's important to understand the high-level architecture of an individual computer. Let's see how the different parts of a computer work together to execute our code. Computers function through a layered system, each layer optimized for different tasks. At the core, computers understand only binary: zeros and ones, represented as bits. One bit is the smallest unit of data in computing; it can be either 0 or 1. One byte consists of eight bits and is used to represent a single character, like "a", or a number, like 1. Expanding from there, we have kilobytes, megabytes, gigabytes, and terabytes.

To store this data we have disk storage, which holds the primary data. It can be either HDD or SSD. Disk storage is non-volatile: it maintains data without power, so if you turn off or restart the computer, the data will still be there. It contains the OS, applications, and all user files. In terms of size, disks typically range from hundreds of gigabytes to multiple terabytes. SSDs are more expensive, but they offer significantly faster data retrieval than HDDs: an SSD may have read speeds of 500 to 3,500 MB/s, while an HDD might offer 80 to 160 MB/s.

The next immediate access point after disk is RAM, or random access memory. RAM serves as the primary active data holder: it holds the data structures, variables, and application data that are currently in use or being processed. When a program runs, its variables, intermediate computations, runtime stack, and more are stored in RAM, because RAM allows quick read and write access. It is volatile memory, which means it requires power to retain its contents; after you restart the computer, the data may not be persisted. In terms of size, RAM ranges from a few gigabytes in consumer devices to hundreds of gigabytes in high-end servers. Its read/write speed often surpasses 5,000 MB/s, faster than even the fastest SSDs.

But sometimes even this speed isn't enough, which brings us to the cache. The cache is smaller than RAM, typically measured in megabytes, but access times for cache memory are even faster, just a few nanoseconds for the L1 cache. The CPU first checks the L1 cache for data; if it's not found, it checks the L2 and L3 caches, and finally RAM. The purpose of a cache is to reduce the average time to access data, which is why we store frequently used data there to optimize CPU performance.

And what about the CPU? The CPU is the brain of the computer: it fetches, decodes, and executes instructions. When you run your code, it's the CPU that processes the operations defined in the program. But before it can run our code, which is written in high-level languages like Java, C++, or Python, the code first needs to be compiled into machine code. A compiler performs this translation, and once the code is compiled into machine code, the CPU can execute it, reading from and writing to RAM, disk, and cache.

Finally, we have the motherboard (or mainboard), which is the component that connects everything. It provides the pathways that allow data to flow between these components.
Now let's have a look at the very high-level architecture of a production-ready app. Our first key area is the CI/CD pipeline: continuous integration and continuous deployment. This ensures that our code goes from the repository, through a series of tests and pipeline checks, and onto the production server without any manual intervention. It's configured with platforms like Jenkins or GitHub Actions to automate our deployment processes.

Once our app is in production, it has to handle lots of user requests. This is managed by load balancers and reverse proxies like Nginx. They ensure that user requests are evenly distributed across multiple servers, maintaining a smooth user experience even during traffic spikes. Our server also needs to store data, so we have an external storage server that is not running on the same production server; instead, it's connected over a network. Our servers might also be communicating with other servers, and we can have many such services, not just one.

To ensure everything runs smoothly, we have logging and monitoring systems keeping a keen eye on every micro-interaction, storing logs and analyzing data. It's standard practice to store logs on external services, often outside the primary production server. For the backend, tools like PM2 can be used for logging and monitoring; on the frontend, platforms like Sentry can capture and report errors in real time.

When things don't go as planned, meaning our logging systems detect failing requests or anomalies, they first notify our alerting service. After that, push notifications are sent to keep users informed, from a generic "something went wrong" to a specific "payment failed". A modern practice is to integrate these alerts directly into platforms we commonly use, like Slack: imagine a dedicated Slack channel where alerts pop up the moment an issue arises, allowing developers to jump into action almost instantly and address the root cause before it escalates.

After that, developers have to debug the issue. First and foremost, the issue needs to be identified; those logs we spoke about earlier are our first port of call, and developers go through them searching for patterns or anomalies that could point to the source of the problem. Next, it needs to be replicated in a safe environment; the golden rule is to never debug directly in the production environment. Instead, developers recreate the issue in a staging or test environment, which ensures users don't get affected by the debugging process. Then developers use tools to peer into the running application and start debugging. Once the bug is found, a hotfix is rolled out: a quick, temporary fix designed to get things running again, like a patch, before a more permanent solution can be implemented.
In this section, let's understand the pillars of system design and what it really takes to create a robust and resilient application. Before we jump into the technicalities, let's talk about what actually makes a good design. When we talk about good design in system architecture, we are really focusing on a few key principles: scalability, which is how our system grows with its user base; maintainability, which is ensuring future developers can understand and improve our system; and efficiency, which is making the best use of our resources. But good design also means planning for failure: building a system that not only performs well when everything is running smoothly, but also maintains its composure when things go wrong.

At the heart of system design are three key elements: moving data, storing data, and transforming data. Moving data is about ensuring that data can flow seamlessly from one part of our system to another, whether it's user requests hitting our servers or data transfers between databases, and we need to optimize for speed and security. Storing data isn't just about choosing between SQL and NoSQL databases; it's about understanding access patterns, indexing strategies, and backup solutions. We need to ensure that our data is not only stored securely but is also readily available when needed. And transforming data is about taking raw data and turning it into meaningful information, whether that's aggregating log files for analysis or converting user input into a different format.
Now let's take a moment to understand a crucial concept in system design: the CAP theorem, also known as Brewer's theorem, named after computer scientist Eric Brewer. This theorem is a set of principles that guide us in making informed trade-offs between three key properties of a distributed system: consistency, availability, and partition tolerance.

Consistency ensures that all nodes in the distributed system have the same data at the same time: if you make a change on one node, that change should be reflected across all nodes. Think of it like updating a Google Doc: if one person makes an edit, everyone else sees that edit immediately. Availability means that the system is always operational and responsive to requests, regardless of what might be happening behind the scenes, like a reliable online store that is always open and ready to take your order no matter when you visit. And partition tolerance refers to the system's ability to continue functioning even when a network partition occurs, meaning that if there is a disruption in communication between nodes, the system still works. It's like a group chat where, even if one person loses connection, the rest of the group can continue chatting.

According to the CAP theorem, a distributed system can only achieve two out of these three properties at the same time. If you prioritize consistency and partition tolerance, you might have to compromise on availability, and vice versa. For example, a banking system needs to be consistent and partition tolerant to ensure financial accuracy, even if that means some transactions take longer to process, temporarily compromising availability. So every design decision comes with trade-offs: a system optimized for read operations might perform poorly on write operations, or to gain performance we might have to accept a bit more complexity. It's not about finding the perfect solution; it's about finding the best solution for our specific use case, and that means making informed decisions about where we can afford to compromise.
One important measurement of a system is availability: a measure of the system's operational performance and reliability. When we talk about availability, we are essentially asking: is our system up and running when our users need it? This is often measured as a percentage, aiming for that golden "five nines" availability. Say we're running a critical service with 99.9% availability: that allows for around 8.76 hours of downtime per year. But if we add two more nines (99.999%), we're talking about just around 5 minutes of downtime per year, and that's a massive difference, especially for services where every second counts.
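To make those numbers concrete, here is a minimal sketch (plain Python, nothing beyond the standard library) that converts an availability percentage into the downtime it allows per year:

```python
# Allowed downtime per year for a given availability percentage.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def downtime_per_year(availability_pct: float) -> str:
    downtime_hours = HOURS_PER_YEAR * (1 - availability_pct / 100)
    if downtime_hours >= 1:
        return f"{downtime_hours:.2f} hours"
    return f"{downtime_hours * 60:.1f} minutes"

for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% -> {downtime_per_year(pct)} of downtime per year")

# 99.9%   -> ~8.76 hours of downtime per year
# 99.999% -> ~5.3 minutes of downtime per year
```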
We often measure availability in terms of uptime and downtime, and here is where service level objectives and service level agreements come into place. SLOs are like goals we set for our system's performance and availability; for example, we might set an SLO stating that our web service should respond to requests within 300 milliseconds 99.9% of the time. SLAs, on the other hand, are like formal contracts with our users or customers: they define the minimum level of service we are committing to provide. So if our SLA guarantees 99.99% availability and we drop below that, we might have to provide refunds or other compensation to our customers.
Building resilience into our system means expecting the unexpected. This could mean implementing redundant systems, ensuring there is always a backup ready to take over in case of failure, or it could mean designing our system to degrade gracefully, so that even if certain features are unavailable, the core functionality remains intact. To measure this aspect we use reliability, fault tolerance, and redundancy. Reliability means ensuring that our system works correctly and consistently. Fault tolerance is about preparing for when things go wrong: how does our system handle unexpected failures or attacks? And redundancy is about having backups, ensuring that if one part of our system fails, there is another ready to take its place.
We also need to measure the speed of our system, and for that we have throughput and latency. Throughput measures how much data our system can handle over a certain period of time. Server throughput is measured in requests per second (RPS); this metric indicates how many client requests a server can handle in a given time frame, and a higher RPS value typically means better performance and the ability to handle more concurrent users. Database throughput is measured in queries per second (QPS), which quantifies the number of queries a database can process in a second; like server throughput, a higher QPS value usually signifies better performance. We also have data throughput, measured in bytes per second, which reflects the amount of data transferred over a network or processed by a system in a given period of time. Latency, on the other hand, measures how long it takes to handle a single request: the time it takes for a request to get a response. Optimizing for one can often lead to sacrifices in the other; for example, batching operations can increase throughput but might also increase latency. Designing a system poorly can lead to a lot of issues down the line, from performance bottlenecks to security vulnerabilities, and unlike code, which can be refactored easily, redesigning a system can be a monumental task. That's why it's crucial to invest time and resources into getting the design right from the start, laying a solid foundation that can support the weight of future features and user growth.
Now let's talk about networking basics. When we talk about networking basics, we are essentially discussing how computers communicate with each other. At the heart of this communication is the IP address, a unique identifier for each device on a network. IPv4 addresses are 32-bit, which allows for approximately 4 billion unique addresses; however, with the increasing number of devices, we are moving to IPv6, which uses 128-bit addresses, significantly increasing the number of available unique addresses. When two computers communicate over a network, they send and receive packets of data, and each packet contains an IP header with essential information like the sender's and receiver's IP addresses, ensuring that the data reaches the correct destination. This process is governed by the Internet Protocol, a set of rules that defines how data is sent and received. Besides the IP layer, we also have the application layer, where data specific to the application protocol is stored. The data in these packets is formatted according to a specific application protocol, like HTTP for web browsing, so that it is interpreted correctly by the receiving device.

Once we understand the basics of IP addressing and data packets, we can dive into the transport layer, where TCP and UDP come into play. TCP operates at the transport layer and ensures reliable communication. It's like a delivery guy who not only makes sure your package arrives, but also checks that nothing is missing. Each data packet also includes a TCP header, carrying essential information like port numbers and control flags necessary for managing the connection and data flow. TCP is known for its reliability: it ensures the complete and correct delivery of data packets. It accomplishes this through features like sequence numbers, which keep track of the order of packets, and a process known as the three-way handshake, which establishes a stable connection between two devices. In contrast, UDP is faster but less reliable than TCP: it doesn't establish a connection before sending data and doesn't guarantee the delivery or order of the packets. This makes UDP preferable for time-sensitive communications like video calls or live streaming, where speed is crucial and some data loss is acceptable.
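As a rough illustration, here is a minimal sketch using Python's standard socket module (the host and ports are just examples) of how opening a TCP connection differs from firing off a UDP datagram:

```python
import socket

# TCP: connect() triggers the three-way handshake; delivery and ordering are guaranteed.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as tcp:
    tcp.connect(("example.com", 80))   # handshake happens here
    tcp.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
    reply = tcp.recv(4096)             # bytes arrive complete and in order

# UDP: no connection, no handshake, no delivery guarantee; just send the datagram.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
    udp.sendto(b"ping", ("example.com", 9999))  # may be lost or reordered in transit
```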
To tie all these concepts together, let's talk about DNS, the Domain Name System. DNS acts like the internet's phone book, translating human-friendly domain names into IP addresses. When you enter a URL in your browser, the browser sends a DNS query to find the corresponding IP address, allowing it to establish a connection to the server and retrieve the web page. The functioning of DNS is overseen by ICANN, which coordinates the global IP address space and the domain name system, and domain name registrars like Namecheap or GoDaddy are accredited by ICANN to sell domain names to the public. DNS uses different types of records, like A records, which map a domain to its corresponding IPv4 address, ensuring that your request reaches the correct server, or AAAA records, which map a domain name to an IPv6 address.
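To see A and AAAA lookups in action, here is a small sketch using Python's standard library resolver (the domain is just an example):

```python
import socket

# Ask the system resolver for both IPv4 (A) and IPv6 (AAAA) addresses of a domain.
for family, label in ((socket.AF_INET, "A / IPv4"), (socket.AF_INET6, "AAAA / IPv6")):
    try:
        infos = socket.getaddrinfo("example.com", 443, family=family)
        addresses = sorted({info[4][0] for info in infos})
        print(label, addresses)
    except socket.gaierror:
        print(label, "no record found")
```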
Finally, let's talk about the networking infrastructure that supports all this communication. Devices on a network have either public or private IP addresses: public IP addresses are unique across the internet, while private IP addresses are unique within a local network. An IP address can be static, permanently assigned to a device, or dynamic, changing over time; dynamic IP addresses are commonly used for residential internet connections. Devices connected in a local area network can communicate with each other directly, and to protect these networks we use firewalls, which monitor and control incoming and outgoing network traffic. Within a device, specific processes or services are identified by ports, which, when combined with an IP address, create a unique identifier for a network service. Some ports are reserved for specific protocols, like 80 for HTTP or 22 for SSH.
Now let's cover the essential application layer protocols. The most common of these is HTTP, which stands for Hypertext Transfer Protocol and is built on TCP/IP. It's a request-response protocol, but imagine it as a conversation with no memory: each interaction is separate, with no recollection of the past. This means the server doesn't have to store any context between requests; instead, each request contains all the necessary information. Notice how the headers include details like the URL and method, while the body carries the substance of the request or response. Each response also includes a status code, which provides feedback about the result of a client's request to a server. For instance, the 200 series are success codes: these indicate that the request was successfully received and processed. The 300 series are redirection codes: these signify that further action needs to be taken by the user agent in order to fulfill the request. The 400 series are client error codes, used when the request contains bad syntax or cannot be fulfilled. And the 500 series are server error codes, indicating that something went wrong on the server. Each request also has a method; the most common methods are GET, POST, PUT, PATCH, and DELETE. GET is used for fetching data, POST is usually for creating data on the server, PUT and PATCH are for updating a record, and DELETE is for removing a record from the database.
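As a quick sketch of methods and status codes in practice (using the third-party requests library and a made-up endpoint):

```python
import requests

BASE = "https://api.example.com"  # hypothetical API

# GET: fetch data; should never mutate anything on the server.
resp = requests.get(f"{BASE}/products", params={"limit": 10})
print(resp.status_code)            # e.g. 200 on success, 404 if not found

# POST: create a new record; the body carries the substance of the request.
resp = requests.post(f"{BASE}/products", json={"name": "Mug", "price": 9.99})
if resp.status_code >= 500:
    print("server error:", resp.status_code)   # 5xx: something went wrong server-side
elif resp.status_code >= 400:
    print("client error:", resp.status_code)   # 4xx: bad request from the client
```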
HTTP is a one-way, request-response connection, but for real-time updates we use WebSockets, which provide a two-way communication channel over a single long-lived connection, allowing servers to push real-time updates to clients. This is very important for applications requiring constant data updates without the overhead of repeated HTTP request-response cycles. It is commonly used for chat applications, live sports updates, or stock market feeds, where the action never stops and neither does the conversation.
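A minimal sketch of that long-lived, two-way channel, using the third-party websockets package and a made-up server URL:

```python
import asyncio
import websockets  # third-party package: pip install websockets

async def listen_for_updates():
    # One long-lived connection; the server can push messages whenever it likes.
    async with websockets.connect("wss://example.com/live-scores") as ws:
        await ws.send("subscribe:match-42")      # client -> server
        while True:
            update = await ws.recv()             # server -> client, no polling needed
            print("update:", update)

asyncio.run(listen_for_updates())
```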
Among email-related protocols, SMTP is the standard for email transmission over the internet: it is the protocol for sending email messages between servers. Most email clients use SMTP for sending emails and either IMAP or POP3 for retrieving them. IMAP is used to retrieve emails from a server, allowing a client to access and manipulate messages; this is ideal for users who need to access their emails from multiple devices. POP3 is used for downloading emails from a server to a local client, typically when emails are managed from a single device.

Moving on to file transfer and management protocols, the traditional protocol for transferring files over the internet is FTP, which is often used in website maintenance and large data transfers. It handles the transfer of files between a client and a server, useful for uploading files to a server or backing up files. We also have SSH, or Secure Shell, which is for operating network services securely over an unsecured network; it's commonly used for logging into a remote machine and executing commands or transferring files.
There are also real-time communication protocols like WebRTC, which enables browser-to-browser applications for voice calling, video chat, and file sharing without internal or external plugins; this is essential for applications like video conferencing and live streaming. Another one is MQTT, a lightweight messaging protocol ideal for devices with limited processing power and scenarios requiring low bandwidth, such as IoT devices. And AMQP is a protocol for message-oriented middleware, providing robustness and security for enterprise-level message communication; it is used, for example, in tools like RabbitMQ.

Let's also talk about RPC, a protocol that allows a program on one computer to execute code on a server or another computer. It's a method used to invoke a function as if it were a local call, when in reality the function is executed on a remote machine. It abstracts away the details of the network communication, allowing the developer to interact with remote functions seamlessly, as if they were local to the application. Many application layer protocols use RPC mechanisms to perform their operations: for example, in web services, HTTP requests can result in RPC calls being made on the backend to process data or perform actions on behalf of the client, and SMTP servers might use RPC calls internally to process email messages or interact with databases. Of course there are numerous other application layer protocols, but the ones covered here are among the most commonly used and essential for web development.
In this section, let's go through API design, starting from the basics and advancing towards the best practices that define exceptional APIs. Let's consider an API for an e-commerce platform like Shopify, which, if you're not familiar with it, is a well-known e-commerce platform that allows businesses to set up online stores. In API design we are concerned with defining the inputs, like the product details a seller provides for a new product, and the outputs, like the information returned when someone queries a product. So the focus is mainly on defining how the CRUD operations are exposed to the user interface. CRUD stands for create, read, update, and delete, which are the basic operations of any data-driven application. For example, to add a new product we send a POST request to /api/products, with the product details in the request body. To retrieve products we send a GET request to /api/products. For updating, we use PUT or PATCH requests to /api/products/:id, and removing is similar: a DELETE request to /api/products/:id. Similarly, we might also have another GET request to /api/products/:id, which fetches a single product.
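Here is a minimal sketch of those CRUD endpoints (using Flask purely as an illustration; the route names mirror the ones above, and the in-memory dict stands in for a real database):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
products = {}   # in-memory stand-in for a real database
next_id = 1

@app.post("/api/products")              # Create
def create_product():
    global next_id
    products[next_id] = request.get_json()
    next_id += 1
    return jsonify(id=next_id - 1), 201

@app.get("/api/products")               # Read (list)
def list_products():
    return jsonify(products)

@app.get("/api/products/<int:pid>")     # Read (single)
def get_product(pid):
    return jsonify(products[pid]) if pid in products else ("Not found", 404)

@app.put("/api/products/<int:pid>")     # Update
def update_product(pid):
    products[pid] = request.get_json()
    return jsonify(products[pid])

@app.delete("/api/products/<int:pid>")  # Delete
def delete_product(pid):
    products.pop(pid, None)
    return "", 204
```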
Another part of API design is deciding on the communication protocol that will be used, like HTTP, WebSockets, or other protocols, and the data transport mechanism, which can be JSON, XML, or Protocol Buffers. This is usually the case for RESTful APIs, but we also have the GraphQL and gRPC paradigms.
APIs come in different paradigms, each with its own set of protocols and standards. The most common one is REST, which stands for Representational State Transfer. It is stateless, which means that each request from a client to a server must contain all the information needed to understand and complete the request. It uses standard HTTP methods (GET, POST, PUT, and DELETE) and is easily consumable by different clients, browsers or mobile apps. The downside of RESTful APIs is that they can lead to over-fetching or under-fetching of data, because more endpoints may be required to access specific data; RESTful APIs usually use JSON for data exchange. GraphQL APIs, on the other hand, allow clients to request exactly what they need, avoiding over-fetching and under-fetching. They have strongly typed queries, but complex queries can impact server performance. All requests are sent as POST requests, and a GraphQL API typically responds with an HTTP 200 status code even in the case of errors, with error details in the response body. gRPC stands for Google Remote Procedure Call and is built on HTTP/2, which provides advanced features like multiplexing and server push. It uses Protocol Buffers, a way of serializing structured data, and because of that it is efficient in terms of bandwidth and resources, making it especially suitable for microservices. The downside is that it's less human-readable compared to JSON, and it requires HTTP/2 support to operate.
In an e-commerce setting, you might have relationships like user-to-orders or orders-to-products, and you need to design endpoints that reflect these relationships. For example, to fetch the orders for a specific user, you would query GET /users/:userId/orders. Common query parameters also include limit and offset for pagination, or start and end dates for filtering products within a certain date range; this allows the client to retrieve specific sets of data without overwhelming the system. A well-designed GET request should be idempotent, meaning calling it multiple times doesn't change the result, and it should always return the same result. GET requests should never mutate data; they are meant only for retrieval. If you need to update or create data, you should use a PUT or POST request.
backward compatibility this means that
we need to ensure that changes don't
break existing clients a common practice
is to introduce new versions like
version two products so that the version
one API can still serve the old clients
and version 2 API should serve the
current clients this is in case of
restful apis in the case of graph Co
apis adding new Fields like V2 Fields
without removing old one helps in
evolving the API without breaking
existing clients another best practice
is to set rate limitations this can
prevent the API from Theos attacks it is
used to control the number of requests a
user can make in certain time frame and
it prevents a single user from sending
too many requests to your single API a
common practice is to also set course
settings which stands for cross origin
resource sharing with course settings
you can control which domains can access
to your API preventing unwanted
cross-site interactions now imagine a
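A rough sketch of one common rate-limiting approach, a fixed-window counter kept in memory (real systems usually keep these counters in a shared store like Redis; the limit numbers here are arbitrary):

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60    # length of each window
MAX_REQUESTS = 100     # allowed requests per client per window

_counters = defaultdict(lambda: [0, 0.0])   # client_id -> [count, window_start]

def allow_request(client_id: str) -> bool:
    count, window_start = _counters[client_id]
    now = time.time()
    if now - window_start >= WINDOW_SECONDS:   # new window: reset the counter
        _counters[client_id] = [1, now]
        return True
    if count < MAX_REQUESTS:                   # still under the limit
        _counters[client_id][0] += 1
        return True
    return False                               # over the limit: respond with HTTP 429
```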
Now imagine a company hosting a website on a server in a Google Cloud data center in Finland. It may take around 100 milliseconds to load for users in Europe, but 3 to 5 seconds to load for users in Mexico. Fortunately, there are strategies to minimize this request latency for users who are far away: caching and content delivery networks, two important concepts in modern web development and system design.

Caching is a technique used to improve the performance and efficiency of a system. It involves storing a copy of certain data in temporary storage so that future requests for that data can be served faster. There are four common places where a cache can be stored. The first one is browser caching, where we store website resources on a user's local computer, so when a user revisits a site, the browser can load it from the local cache rather than fetching everything from the server again. Users can disable caching by adjusting their browser settings, and developers can disable the cache from the developer tools; for instance, Chrome has a "Disable cache" option in the DevTools Network tab. The cache is stored in a directory on the client's hard drive, managed by the browser, and browser caches typically hold HTML, CSS, and JS bundle files in that dedicated cache directory. We use the Cache-Control header to tell the browser how long content should be cached; for example, here the Cache-Control max-age is set to 7,200 seconds, which is equivalent to two hours.
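As a small sketch of setting that header from the server side (again using Flask just for illustration; the file name is made up):

```python
from flask import Flask, send_file

app = Flask(__name__)

@app.get("/static/logo.png")
def logo():
    response = send_file("logo.png")
    # Tell the browser (and any CDN in front of us) to cache this for 2 hours.
    response.headers["Cache-Control"] = "public, max-age=7200"
    return response
```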
When the requested data is found in the cache, we call that a cache hit; a cache miss, on the other hand, happens when the requested data is not in the cache, necessitating a fetch from the original source. The cache hit ratio is the percentage of requests served from the cache compared to all requests, and a higher ratio indicates a more effective cache. You can check whether the cache was hit or missed from the X-Cache header; in this example it says MISS, so the cache was missed, and if the cache had been hit, we would see HIT here.
We also have server-side caching, which involves storing frequently accessed data on the server side, reducing the need to perform expensive operations like database queries. Server-side caches are stored on the server itself or on a separate cache server, either in memory (like Redis) or on disk. Typically, the server checks the cache for the data before querying the database: if the data is in the cache, it is returned directly; otherwise, the server queries the database, returns the result to the user, and then stores it in the cache for future requests. For writes, there are several strategies. In a write-around cache, data is written directly to permanent storage, bypassing the cache; it is used when write performance is less critical. In a write-through cache, data is simultaneously written to the cache and to permanent storage; it ensures data consistency but can be slower than write-around. And in a write-back cache, data is first written to the cache and then to permanent storage at a later time; this improves write performance, but you risk losing that data if the server crashes before it is persisted.
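A minimal sketch of that check-the-cache-first read path, using the third-party redis client and a placeholder query_database function standing in for the real database call:

```python
import json
import redis  # third-party client: pip install redis

cache = redis.Redis(host="localhost", port=6379)

def query_database(product_id: int) -> dict:
    # Placeholder for the real (expensive) database query.
    return {"id": product_id, "name": "example product"}

def get_product(product_id: int) -> dict:
    key = f"product:{product_id}"
    cached = cache.get(key)                      # 1. check the cache first
    if cached is not None:
        return json.loads(cached)                # cache hit: skip the database
    product = query_database(product_id)         # 2. cache miss: query the database
    cache.set(key, json.dumps(product), ex=300)  # 3. store for future requests (5 min TTL)
    return product
```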
But what happens if the cache is full and we need to free up some space to use our cache again? For that we have eviction policies: rules that determine which items to remove from the cache when it's full. Common policies include removing the least recently used items (LRU), first in, first out (FIFO), where we remove the items that were added first, or removing the least frequently used items (LFU).
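Here is a tiny sketch of an LRU eviction policy built on Python's OrderedDict (the capacity of 3 is arbitrary):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int = 3):
        self.capacity = capacity
        self.items = OrderedDict()   # remembers insertion/usage order

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used item
```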
Database caching is another crucial aspect: it refers to the practice of caching database query results to improve the performance of database-driven applications. It is often done either within the database system itself or via an external caching layer like Redis or Memcached. When a query is made, we first check the cache to see if the result of that query has been stored; if it has, we return the cached result, avoiding the need to execute the query against the database. If the data is not found in the cache, the query is executed against the database and the result is stored in the cache for future requests. This is beneficial for read-heavy applications where some queries are executed frequently, and we use the same eviction policies as for server-side caching.
Another type of caching is the CDN: a network of servers distributed geographically. CDNs are generally used to serve static content such as JavaScript, HTML, CSS, image, and video files. They cache the content from the origin server and deliver it to users from the nearest CDN server. When a user requests a file, like an image or a website, the request is redirected to the nearest CDN server; if that server has the cached content, it delivers it to the user, and if not, it fetches the content from the origin server, caches it, and then forwards it to the user. This is the pull-based type of CDN, where the CDN automatically pulls content from the origin server when it is first requested by a user. It's ideal for websites with a lot of static content that is updated regularly, and it requires less active management because the CDN automatically keeps the content up to date. The other type is the push-based CDN, where you upload content to the origin server and it then distributes those files to the CDN. This is useful when you have large files that are infrequently updated but need to be quickly distributed when they do change; it requires more active management of what content is stored on the CDN. We again use the Cache-Control header to tell the browser how long it should cache content served from the CDN.

CDNs are usually used for delivering static assets like images, CSS files, JavaScript bundles, or video content, and they are useful when you need to ensure high availability and performance for users; they can also reduce the load on the origin server. But there are some instances where we still need to hit the origin server: for example, when serving dynamic content that changes frequently, when handling tasks that require real-time processing, or when the application requires complex server-side logic that cannot run on the CDN. The benefits we get from a CDN are reduced latency, since serving content from locations closer to the user significantly reduces latency; high availability and scalability, since CDNs can handle high traffic loads and are resilient against hardware failures; and improved security, because many CDNs offer features like DDoS protection and traffic encryption. The benefits of caching in general are also reduced latency, because data is fetched from a nearby cache rather than a remote server; lower server load, by reducing the number of requests to the primary data source; and overall faster load times, which lead to a better user experience.
Now let's talk about proxy servers, which act as an intermediary between a client requesting a resource and the server providing that resource. A proxy can serve various purposes, like caching resources for faster access, anonymizing requests, and load balancing among multiple servers. Essentially, it receives requests from clients, forwards them to the relevant servers, and then returns the servers' responses back to the clients. There are several types of proxy servers, each serving different purposes; here are some of the main types. The first one is the forward proxy, which sits in front of clients and is used to send requests to other servers on the internet; it's often used within internal networks to control internet access. The next one is the reverse proxy, which sits in front of one or more web servers, intercepting requests from the internet; it is used for load balancing, web acceleration, and as a security layer. Another type is the open proxy, which allows any user to connect to and use the proxy server, often to anonymize web browsing and bypass content restrictions. We also have the transparent proxy, which passes along requests and responses without modifying them but is visible to the client; it's often used for caching and content filtering. The next type is the anonymous proxy, which is identifiable as a proxy server but does not reveal the original IP address; this type is used for anonymous browsing. We also have distorting proxies, which provide an incorrect original IP to the destination server; this is similar to an anonymous proxy but with purposeful IP misinformation. And the next popular type is the high-anonymity (or elite) proxy, which makes detecting proxy use very difficult: these proxies do not send X-Forwarded-For or other identifying headers, ensuring maximum anonymity.
The most commonly used proxy servers are forward and reverse proxies. A forward proxy acts as a middle layer between the client and the server: it sits between the client, which might be a computer on an internal network, and the external servers, which might be websites on the internet. When the client makes a request, it is first sent to the forward proxy; the proxy then evaluates the request and decides, based on its configuration and rules, whether to allow the request, modify it, or block it. One of the primary functions of a forward proxy is to hide the client's IP address: when it forwards the request to the target server, it appears as if the request is coming from the proxy server itself.

Let's look at some example use cases of forward proxies. One popular example is Instagram proxies: a specific type of forward proxy used to manage multiple Instagram accounts without triggering bans or restrictions. Marketers and social media managers use Instagram proxies to appear as if they are located in different areas or are different users, which allows them to manage multiple accounts, automate tasks, or gather data without being flagged for suspicious activity. The next example is internet use control and monitoring: some organizations use forward proxies to monitor and control employee internet usage; they can block access to non-work-related sites and protect against web-based threats, and they can also scan incoming content for viruses and malware. The next common use case is caching frequently accessed content: forward proxies can cache popular websites or content, reducing bandwidth usage and speeding up access for users within the network, which is especially beneficial in networks where bandwidth is costly or limited. Forward proxies can also be used for anonymizing web access: people who are concerned about privacy can use them to hide their IP address and other identifying information from the websites they visit, making it difficult to track their web browsing activity.
On the other hand, a reverse proxy is a type of proxy server that sits in front of one or more web servers, intercepting requests from clients before they reach the servers. While a forward proxy hides the client's identity, a reverse proxy essentially hides the servers' identity, or the existence of multiple servers behind it: the client interacts only with the reverse proxy and may not know about the servers behind it. It also distributes client requests across multiple servers, balancing load and ensuring no single server becomes overwhelmed. A reverse proxy can also compress inbound and outbound data, cache files, and manage SSL encryption, thereby speeding up load times and reducing server load. Some common use cases of reverse proxies are load balancers, which distribute incoming network traffic across multiple servers, ensuring no single server gets too much load; by distributing traffic we prevent any single server from becoming a bottleneck and maintain optimal service speed and reliability. CDNs are also a type of reverse proxy: they are a network of servers that deliver cached static content from websites to users based on the geographical location of the user, acting as reverse proxies by retrieving content from the origin server and caching it closer to the user for faster delivery. Another example is web application firewalls, which are positioned in front of web applications and inspect incoming traffic to block hacking attempts and filter out unwanted traffic, protecting the application from common web exploits. And another example is SSL offloading or acceleration: some reverse proxies handle the encryption and decryption of SSL/TLS traffic, offloading that task from the web servers to optimize their performance.
Load balancers are perhaps the most popular use case for proxy servers. They distribute incoming traffic across multiple servers to make sure that no server bears too much load; by spreading the requests effectively, they increase the capacity and reliability of applications. Here are some common strategies and algorithms used in load balancing. The first one is round robin, the simplest form of load balancing, where each server in the pool gets a request in sequential, rotating order; when the last server is reached, it loops back to the first one. This works well for servers with similar specifications and when the load is uniformly distributable. The next one is the least connections algorithm, which directs traffic to the server with the fewest active connections; it's ideal for longer-lived tasks or when the server load is not evenly distributed. Next we have the least response time algorithm, which chooses the server with the lowest response time and fewest active connections; this is effective when the goal is to provide the fastest response to requests. The next algorithm is IP hashing, which determines which server receives the request based on a hash of the client's IP address. This ensures a client consistently connects to the same server, which is useful for session persistence in applications where it's important that the client always reaches the same server.
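A toy sketch of the first two strategies (the server names and connection counts are made up):

```python
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]

# Round robin: hand out servers in a fixed, rotating order.
rr = cycle(servers)
for _ in range(5):
    print("round robin picks:", next(rr))

# Least connections: pick the server with the fewest active connections.
active_connections = {"app-1": 12, "app-2": 3, "app-3": 7}

def least_connections() -> str:
    return min(active_connections, key=active_connections.get)

print("least connections picks:", least_connections())  # app-2
```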
Variants of these methods can also be weighted, which brings us to the weighted algorithms. For example, in weighted round robin or weighted least connections, servers are assigned weights, typically based on their capacity or performance metrics, and the more capable servers handle more of the requests. This is effective when the servers in the pool have different capabilities, like different CPUs or different amounts of RAM. We also have geographical algorithms, which direct requests to the server geographically closest to the user, or based on specific regional requirements; this is useful for global services where latency reduction is a priority. And the next common algorithm is consistent hashing, which uses a hash function to distribute data across various nodes. Imagine a hash space that forms a circle, where the end wraps around to the beginning, often referred to as a hash ring; both the nodes and the data (keys or stored values) are hashed onto this ring. This makes sure that the client consistently connects to the same server every time.
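Here is a compact sketch of such a hash ring (MD5 is used only to spread keys evenly; a production ring would also add virtual nodes, which are omitted here):

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        # Place every node at a point on the ring, keyed by its hash.
        self.ring = sorted((_hash(node), node) for node in nodes)
        self.points = [point for point, _ in self.ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise from the key's hash to the first node.
        index = bisect.bisect(self.points, _hash(key)) % len(self.ring)
        return self.ring[index][1]

ring = HashRing(["server-a", "server-b", "server-c"])
print(ring.node_for("user:42"))   # the same key always maps to the same server
```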
An essential feature of load balancers is continuous health checking of servers, to ensure traffic is only directed to servers that are online and responsive. If a server fails, the load balancer will stop sending traffic to it until it is back online. Load balancers come in different forms, including hardware appliances, software solutions, and cloud-based services. Some popular hardware load balancers are F5 BIG-IP, a widely used hardware load balancer known for its high performance and extensive feature set, offering local traffic management, global server load balancing, and application security; another example is Citrix ADC (formerly known as NetScaler), which provides load balancing, content switching, and application acceleration. Some popular software load balancers are HAProxy, a popular open-source software load balancer and proxy server for TCP and HTTP-based applications, and of course Nginx, which is often used as a web server but also functions as a load balancer and reverse proxy for HTTP and other network protocols. Some popular cloud-based load balancers are AWS Elastic Load Balancing, Microsoft Azure Load Balancer, and Google Cloud Load Balancing. There are even virtual load balancers, like VMware's Advanced Load Balancer, which offers a software-defined application delivery controller that can be deployed on-premises or in the cloud.
Now let's see what happens when a load balancer goes down. When the load balancer goes down, it can impact the availability and performance of the whole application or the services it manages: it's basically a single point of failure, and if it goes down, all of the servers behind it become unavailable to clients. To avoid or minimize the impact of a load balancer failure, several strategies can be employed. The first one is redundant load balancing: using more than one load balancer, often in pairs, which is a common approach; if one of them fails, the other one takes over, a method known as failover. The next strategy is to continuously monitor and health-check the load balancer itself, which ensures that any issues are detected early and can be addressed before causing significant disruption. We can also implement auto-scaling and self-healing systems: some modern infrastructures are designed to automatically detect the failure of a load balancer and replace it with a new instance without manual intervention. And in some configurations, DNS failover can reroute traffic away from an IP address that is no longer accepting connections, like a failed load balancer, to a preconfigured standby IP, which is our new load balancer.
System design interviews are incomplete without a deep dive into databases. In the next few minutes I'll take you through the database essentials you need to understand to ace that interview. We'll explore the role of databases in system design, sharding and replication techniques, and the key ACID properties; we'll also discuss different types of databases, vertical and horizontal scaling options, and database performance techniques.

We have different types of databases, each designed for specific tasks and challenges; let's explore them. The first type is relational databases. Think of a relational database like a well-organized filing cabinet, where all the files are neatly sorted into different drawers and folders. Some popular examples of SQL databases are PostgreSQL, MySQL, and SQLite. All SQL databases use tables for data storage, and they use SQL as the query language. They are great for transactions, complex queries, and integrity. Relational databases are also ACID compliant, meaning they maintain the ACID properties: A stands for atomicity, which means that transactions are all or nothing; C stands for consistency, which means that after a transaction your database should be in a consistent state; I is isolation, which means that transactions should be independent; and D is for durability, which means that once a transaction is committed, the data is there to stay.
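A tiny sketch of atomicity using Python's built-in sqlite3 (the accounts table and amounts are made up): either both updates commit, or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 50 WHERE name = 'bob'")
        # If anything above raised, neither UPDATE would be persisted (atomicity).
except sqlite3.Error:
    print("transaction rolled back")

print(dict(conn.execute("SELECT name, balance FROM accounts")))  # {'alice': 50, 'bob': 50}
```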
We also have NoSQL databases, which relax the strict consistency guarantees of ACID. Imagine a NoSQL database as a brainstorming board with sticky notes: you can add or remove notes in any shape or form, and it's flexible. Some popular examples are MongoDB, Cassandra, and Redis. There are different types of NoSQL databases, such as key-value stores like Redis, document-based databases like MongoDB, or graph-based databases like Neo4j. NoSQL databases are schemaless, meaning they don't have foreign keys between tables linking the data together; they are good for unstructured data, and ideal for scalability, quick iteration, and simple queries. There are also in-memory databases: this is like having a whiteboard for quick calculations and temporary sketches, and it's fast because everything is in memory. Some examples are Redis and Memcached; they offer lightning-fast data retrieval and are used primarily for caching and session storage.
Now let's see how we can scale databases. The first option is vertical scaling, or scaling up. In vertical scaling, you improve the performance of your database by enhancing the capabilities of the individual server the database is running on: increasing CPU power, adding more RAM, adding faster or more disk storage, or upgrading the network. But there is a maximum limit to the resources you can add to a single machine, so vertical scaling is inherently limited. The next option is horizontal scaling, or scaling out, which involves adding more machines to the existing pool of resources rather than upgrading a single unit. Databases that support horizontal scaling distribute data across a cluster of machines, and this can involve database sharding or data replication. The first option is database sharding: distributing different portions (shards) of the dataset across multiple servers. This means you split the data into smaller chunks and distribute it across multiple servers. Some of the sharding strategies include range-based sharding, where you distribute data based on the range of a given key; directory-based sharding, which uses a lookup service to direct traffic to the correct database; and geographical sharding, which splits databases based on geographical locations.
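A minimal sketch of routing a record to a shard, with two toy strategies: range-based on user ID (as described above) and hash-based, another common variant not covered in the tutorial; the shard names and ID ranges are made up.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]

def range_based_shard(user_id: int) -> str:
    # Range-based: fixed ID ranges map to fixed shards.
    if user_id < 1_000_000:
        return "shard-0"
    if user_id < 2_000_000:
        return "shard-1"
    return "shard-2"

def hash_based_shard(key: str) -> str:
    # Hash-based: spread keys evenly across shards.
    digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return SHARDS[digest % len(SHARDS)]

print(range_based_shard(1_500_000))      # shard-1
print(hash_based_shard("user:1500000"))  # one of the three shards, chosen by hash
```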
The next horizontal scaling option is data replication: keeping copies of data on multiple servers for high availability. We have master-slave replication, where you have one master database and several read-only slave databases, or master-master replication, where multiple databases can each handle both reads and writes.
Scaling your database is one thing, but you also want to access it faster, so let's talk about different performance techniques that can help you access your data faster. The most obvious one is caching: caching isn't just for web servers, and database caching can be done through in-memory stores like Redis, which you can use to cache frequent queries and boost your performance. The next technique is indexing. Indexes are another way to boost the performance of your database: creating an index on a frequently accessed column will significantly speed up retrieval times.
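A small sketch with Python's built-in sqlite3, creating an index on a frequently queried column and asking the query planner how it will run the query (the table and column names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")

# Without an index, filtering by user_id scans the whole table.
plan = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 42").fetchall()
print(plan)  # ... SCAN orders

# Index the frequently accessed column; the planner can now search it directly.
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
plan = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 42").fetchall()
print(plan)  # ... SEARCH orders USING INDEX idx_orders_user_id
```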
And the next technique is query optimization: you can also consider optimizing queries for fast data access. This includes minimizing joins and using tools like a SQL query analyzer or EXPLAIN plans to understand your query's performance.
In all cases, you should remember the CAP theorem, which states that you can only have two of these three: consistency, availability, and partition tolerance. When designing a system, you should prioritize two of these based on the requirements you are given in the interview. If you enjoyed this crash course, then consider watching my other videos about system design concepts and interviews. See you!