Podcast
Questions and Answers
What is a key reason a cache server is not ideal for persisting data?
What is a key reason a cache server is not ideal for persisting data?
- Cache servers are only designed for small amounts of data.
- Cache servers are more expensive to maintain.
- Cache servers use more power than databases.
- Cached data is stored in volatile memory. (correct)
What happens to cached data when it expires, according to an expiration policy?
What happens to cached data when it expires, according to an expiration policy?
- It is compressed to save space.
- It is removed from the cache. (correct)
- It is automatically backed up to a persistent database.
- It is moved to a slower tier of cache storage.
What does consistency in caching refer to?
What does consistency in caching refer to?
- Having a uniform expiration policy across all cached data.
- Keeping the data store and the cache in sync. (correct)
- Using the same type of memory for all cache servers.
- Ensuring cached data is always encrypted.
What is a single point of failure (SPOF)?
What is a single point of failure (SPOF)?
What is a recommended strategy to mitigate the risk of a single cache server being a SPOF?
What is a recommended strategy to mitigate the risk of a single cache server being a SPOF?
What is cache eviction?
What is cache eviction?
What type of content is typically delivered by a Content Delivery Network (CDN)?
What type of content is typically delivered by a Content Delivery Network (CDN)?
What does overprovisioning memory for a cache achieve?
What does overprovisioning memory for a cache achieve?
What is the primary function of a CDN?
What is the primary function of a CDN?
What happens if a CDN server doesn't have a requested file in its cache?
What happens if a CDN server doesn't have a requested file in its cache?
What does TTL stand for in the context of CDNs?
What does TTL stand for in the context of CDNs?
What is a key consideration when using a CDN regarding infrequently used assets?
What is a key consideration when using a CDN regarding infrequently used assets?
What should a website do if a CDN experiences a temporary outage?
What should a website do if a CDN experiences a temporary outage?
What is one way to immediately remove a file from a CDN before its TTL expires?
What is one way to immediately remove a file from a CDN before its TTL expires?
What is the ultimate goal when answering system design questions?
What is the ultimate goal when answering system design questions?
What is object versioning in the context of CDNs?
What is object versioning in the context of CDNs?
What should you understand well to shape the direction of the discussion?
What should you understand well to shape the direction of the discussion?
Why is setting an appropriate cache expiry time important for time-sensitive content?
Why is setting an appropriate cache expiry time important for time-sensitive content?
What is vital to the success of a system design interview?
What is vital to the success of a system design interview?
What is the initial setup described in the chapter 'Scale From Zero to Millions of Users'?
What is the initial setup described in the chapter 'Scale From Zero to Millions of Users'?
During the initial design blueprint phase, what should you do with your interviewer?
During the initial design blueprint phase, what should you do with your interviewer?
What is the purpose of drawing box diagrams during a system design interview?
What is the purpose of drawing box diagrams during a system design interview?
In the single server setup, what components typically reside on the same server?
In the single server setup, what components typically reside on the same server?
What is the primary reason for performing back-of-the-envelope calculations during a system design interview?
What is the primary reason for performing back-of-the-envelope calculations during a system design interview?
What is the first step in the request flow when a user accesses a website?
What is the first step in the request flow when a user accesses a website?
When is it appropriate to include API endpoints and database schema in a system design discussion?
When is it appropriate to include API endpoints and database schema in a system design discussion?
What is the purpose of DNS in the request flow?
What is the purpose of DNS in the request flow?
Before diving into the details, what should you do regarding back-of-the-envelope calculations?
Before diving into the details, what should you do regarding back-of-the-envelope calculations?
In the initial request flow (before scaling), who typically provides the Domain Name System (DNS) service?
In the initial request flow (before scaling), who typically provides the Domain Name System (DNS) service?
What benefit is gained by going through a few concrete use cases?
What benefit is gained by going through a few concrete use cases?
What should you and the interviewer have already agreed on before the design deep dive?
What should you and the interviewer have already agreed on before the design deep dive?
During a senior candidate interview, what might the discussion focus on?
During a senior candidate interview, what might the discussion focus on?
What is a disadvantage of the sliding window counter algorithm related to memory usage?
What is a disadvantage of the sliding window counter algorithm related to memory usage?
The sliding window counter algorithm is a hybrid approach that combines which of the following?
The sliding window counter algorithm is a hybrid approach that combines which of the following?
In the sliding window counter algorithm, the number of requests in the rolling window is calculated using which formula?
In the sliding window counter algorithm, the number of requests in the rolling window is calculated using which formula?
What is a primary advantage of the sliding window counter algorithm?
What is a primary advantage of the sliding window counter algorithm?
For what type of look back window does the sliding window counter algorithm work?
For what type of look back window does the sliding window counter algorithm work?
Why is a database generally not a good choice for storing rate limiting counters?
Why is a database generally not a good choice for storing rate limiting counters?
Which technology is a popular option to implement rate limiting?
Which technology is a popular option to implement rate limiting?
Which commands does Redis offer that makes it suitable for rate limiting?
Which commands does Redis offer that makes it suitable for rate limiting?
Which CAP characteristic is sacrificed by CP systems?
Which CAP characteristic is sacrificed by CP systems?
Why are CA systems impractical in real-world distributed applications?
Why are CA systems impractical in real-world distributed applications?
In a distributed system with three replicas (n1, n2, n3), what happens when n3 goes down during a network partition if the system chooses consistency (CP)?
In a distributed system with three replicas (n1, n2, n3), what happens when n3 goes down during a network partition if the system chooses consistency (CP)?
In a distributed system, what is the primary concern for a bank system when choosing between consistency and availability?
In a distributed system, what is the primary concern for a bank system when choosing between consistency and availability?
In an AP system, what is the typical behavior when a network partition occurs and a client attempts a write operation?
In an AP system, what is the typical behavior when a network partition occurs and a client attempts a write operation?
What is the main trade-off between CP and AP systems?
What is the main trade-off between CP and AP systems?
What is a key consideration when choosing the appropriate CAP guarantees for a distributed key-value store?
What is a key consideration when choosing the appropriate CAP guarantees for a distributed key-value store?
Flashcards
Cache Server Data Persistence
Cache Server Data Persistence
Data in cache servers is lost upon restart because it's stored in volatile memory.
Expiration Policy
Expiration Policy
A strategy to remove data from the cache after a set time period.
Cache Consistency
Cache Consistency
Keeping the data store and cache synchronized.
Single Point of Failure (SPOF)
Single Point of Failure (SPOF)
Signup and view all the flashcards
Overprovisioning
Overprovisioning
Signup and view all the flashcards
Cache Eviction
Cache Eviction
Signup and view all the flashcards
Least-Recently-Used (LRU)
Least-Recently-Used (LRU)
Signup and view all the flashcards
Content Delivery Network (CDN)
Content Delivery Network (CDN)
Signup and view all the flashcards
System Design Interview Success
System Design Interview Success
Signup and view all the flashcards
Scaling Systems
Scaling Systems
Signup and view all the flashcards
Single Server Setup
Single Server Setup
Signup and view all the flashcards
Domain Name System (DNS)
Domain Name System (DNS)
Signup and view all the flashcards
Internet Protocol (IP) Address
Internet Protocol (IP) Address
Signup and view all the flashcards
System Design Questions
System Design Questions
Signup and view all the flashcards
System Design Focus
System Design Focus
Signup and view all the flashcards
System Design Scenarios
System Design Scenarios
Signup and view all the flashcards
CDN Content Delivery
CDN Content Delivery
Signup and view all the flashcards
TTL (Time-to-Live)
TTL (Time-to-Live)
Signup and view all the flashcards
CDN Workflow
CDN Workflow
Signup and view all the flashcards
CDN Cost considerations
CDN Cost considerations
Signup and view all the flashcards
Setting Cache Expiry
Setting Cache Expiry
Signup and view all the flashcards
CDN Fallback
CDN Fallback
Signup and view all the flashcards
Invalidating Files in CDN
Invalidating Files in CDN
Signup and view all the flashcards
Design Blueprint
Design Blueprint
Signup and view all the flashcards
Back-of-the-Envelope Calculations
Back-of-the-Envelope Calculations
Signup and view all the flashcards
Use Cases
Use Cases
Signup and view all the flashcards
Design Deep Dive
Design Deep Dive
Signup and view all the flashcards
Feed Publishing
Feed Publishing
Signup and view all the flashcards
News Feed Building
News Feed Building
Signup and view all the flashcards
Thinking Out Loud
Thinking Out Loud
Signup and view all the flashcards
Agreed Objectives
Agreed Objectives
Signup and view all the flashcards
Sliding Window Counter Algorithm
Sliding Window Counter Algorithm
Signup and view all the flashcards
Formula for Requests in Rolling Window
Formula for Requests in Rolling Window
Signup and view all the flashcards
Advantage of Sliding Window Counter Algo
Advantage of Sliding Window Counter Algo
Signup and view all the flashcards
Limitation of Sliding Window Counter Algo
Limitation of Sliding Window Counter Algo
Signup and view all the flashcards
Basic Idea of Rate Limiting Algorithms
Basic Idea of Rate Limiting Algorithms
Signup and view all the flashcards
In-memory cache
In-memory cache
Signup and view all the flashcards
Redis
Redis
Signup and view all the flashcards
Need a Counter
Need a Counter
Signup and view all the flashcards
CP System
CP System
Signup and view all the flashcards
AP System
AP System
Signup and view all the flashcards
CA System
CA System
Signup and view all the flashcards
Network Partition
Network Partition
Signup and view all the flashcards
Data Consistency
Data Consistency
Signup and view all the flashcards
System Availability
System Availability
Signup and view all the flashcards
Availability over Consistency
Availability over Consistency
Signup and view all the flashcards
Consistency over Availability
Consistency over Availability
Signup and view all the flashcards
Study Notes
- Title: System Design Interview - An Insider's Guide, Second Edition by Alex Xu
- The book covers topics such as scaling to millions of users, back-of-the-envelope estimation, system design framework, rate limiters, consistent hashing, key-value stores, unique ID generators, URL shorteners, web crawlers, notification systems, news feeds, chat systems, search autocomplete, YouTube, and Google Drive.
- Editor: Paul Solomon
- The book provides a strategy for approaching system design questions and knowledge for building scalable systems, with a step-by-step framework and examples.
General Interview Information
- System design interviews are difficult because they require designing architecture for software systems and have no certain pattern.
- Companies use these interviews to assess communication and problem-solving skills.
- An interviewee is evaluated on their ability to analyze a vague problem and solve it step by step, explain ideas, and optimize the system.
- System design questions are open-ended, and the goal is to achieve system design goals through architecture.
Chapter 1: Scaling from Zero to Millions of Users
- Designing systems to support millions of users is a challenging & continuous process.
- Aims to use gradually scaled up systems to serve millions of users.
- The goal is to master techniques helpful for answering system design interview questions.
Single Server Setup
- Complex systems often begin simple, running everything on a single server (web app, database, cache, etc.).
- Users access websites via domain names, resolved by the Domain Name System (DNS).
- The server returns HTML pages or JSON responses for rendering.
- Web applications use server-side languages (Java, Python) for business logic and client-side languages (HTML, JavaScript) for presentation.
- Mobile applications use HTTP for communication with the web server, and JSON is a common API response format.
Database Considerations
- With user base growth, separating web/mobile traffic (web tier) and database (data tier) onto multiple servers enables independent scaling.
- Options between relational databases (MySQL, PostgreSQL) and non-relational databases (NoSQL such as CouchDB, Cassandra) exist.
- Relational databases organize data in tables and support join operations using SQL.
- NoSQL databases are categorized into key-value, graph, column, and document stores.
- NoSQL databases lack join operation support but can offer low latency, handle unstructured data, and massive data storage.
- Non-relational databases might be the right choice where super-low latency is required, data is unstructured, only serialization and deserialization are needed or there's an unstructured large volume of data.
Scaling Methods
Vertical Scaling
- Improving a server's power (CPU, RAM) is a simple option, but has limits.
- The option lacks failover and redundancy, resulting in complete website/app downtime if the server fails.
Horizontal Scaling
- Adding more servers is preferable for large applications to avoid the limitations of vertical scaling.
Load Balancers
- A load balancer evenly distributes traffic among web servers to address access issues and prevent server overload.
- Load balancers improve web tier availability and solve failover issues.
- If a server goes offline, traffic is automatically rerouted.
- Web servers are made unreachable by clients for better security through private IPs for server communications through the load balancer.
- The load balancer is connected to the public IP.
- If traffic grows the addition of web servers is simplified for web traffic gracefully.
Database Replication
- Database replication provides redundancy. It is achieved through the master/slave configuration.
- Master DB is a "source of truth" writes operations only, and a Slave DB maintains a copy of the data (read operations only).
- Modifying commands are sent directly to the master database.
- A read-heavy application warrants more slave databases than master databases.
- Database replication offers better performance, reliability, and high availability.
- Advantage of processing more queries in parallel.
- Advantage of replicability of data across different geographical locations for preservation.
- Website can remain in operation even when offline through data replicability of data stored in another database server.
- If a slave database goes offline, read operations are directed to alternative slave databases or a master database temporarily.
- If a master database goes offline, a slave database is promoted to a master.
- Multi-masters and circular replications are more complicated replication methods.
Caching and Content Delivery Networks (CDN)
- Cache temporarily stores results of expensive responses or frequently accessed data in memory for quicker retrieval.
- A cache tier can improve system performance, reduce database workloads, and allow independent scaling
- Cache stores available response and sends to client otherwise queries the database and saves to the cache.
- Consider cache when data is read frequently and volatile memory saves to cache server restarts with all memory lost.
- Important data saved in persistent data stores.
- An implemented expiration Policy should be removed from the cache to avoid outdating by the server and database by sync transaction.
- A single cache server represents a potential single point of failure (SPOF), so multiple cache servers across data centers are recommended.
- Overprovision the required memory by percentages and used the LRU to prevent adding items to fill cache.
- Content Delivery Network (CDN) is a network of geographically dispersed servers delivering static content (images, videos, CSS, JS files).
- CDN improves load times by delivering content from a server closest to the user. When a use visits a website CDN stores the content close
- A user tries to get image by using URL; if not available in cache the CDN server the CDN server requests file from web server.
- A CDN is run by 3rd party providers charged for data transfers for significant benefit and to remove them
- Setting it important for time-sensitive content; setting a cache expiry time.
- Cache export time is also important and shouldn't be short because reloading causes repeat.
Improving Web and Data Tiers
- Stateless web tier supports horizontal scaling.
- Session data is moved out of the web tier with a good practice by accessing state data from databases.
- Stateful server remembers data request while a stateless server keeps no data.
- Stateless systems is simply, more robust and Scalable.
- Shared data allows HPPT from users sent to any web server to a fetch state data store from stored data to kept out on web server because of an stateless system.
- NoSQL data store chosen is easy to scale.
- Auto scaling mean adding or removing web servers manually traffic load and the web tiers are remove out of services.
- To rapidly attract your website and significant the number of users is crucial to provide better user experience across geographical areas supporting multiple centers.
- In any large scale, system outages redirect all traffic to have the data centre directs redirect to the near data control depending on where the user use the data source.
- In favor of multi date enter setup you directly effective tools in direct traffic or the current data standard.
- For more info the tool in the correct data centre depending on where User is dedicated in multi data center.
- Users from regions could use different locally because or caches or the data center with that common strategy is replicant has.
- So the data centers the next list implements asynchronous multi data centers.
- Test applications with multi data centre.
- Improve the data system.
Message Queues and Automation
- A message queue provides asynchronous communication, serving as a buffer and distributing requests.
- Message queue is referred architecture reliable application so, you can post to the queue when consumer is unavailable with post traffic production.
- With consumers, the message to processing, photo processing, and customization (cropping sharpening, blur, etc). can be completed efficiently in a queue.
- Workers are added to reduce processing time or queue will become empty all the time,the number of workers can be reduced.
- Monitoring error logs identify errors and problem in which help to identify errors.
- Logging help to identify a large business when working with a small website that runs on a few servers the productivity through integrated code check.
Database Scaling
- Two approaches for database scaling exists: vertical scaling (adding more power) and horizontal scaling (adding more servers)
- Vertical scaling means adding power (CPU RAM) to the existing machine
- Horizontal scaling also known as sharding that practice adding more servers by comparing vertical scaling
- Sharding separates large databases into smaller parts called shards each shares the same schema as though the actual data
- To achieve a system based user ID access (for example) through functions to find the corresponding shard can be applied.
- The factor to consider is the choice when implementing shorting strategy is the choice of key and retrieve more modify data efficiently through directory.
Sharding
- Is a complex system which can easily introduced challenges.
- Resharding Data is needed when the key sharding
- Shards due to rapid occur, certain chart experience, chart execution or uneven data distribution, requiring and data transfer
- Community project for massive access
- Access is overloaded for same support for social applications we make to allocate each celebrity requirements
- Once a database has charted across multiple operations it's heart perform chart operations through all those data to a no secret data store.
- For more information visit (Number)
Scaling to Millions of Users
- Scaling to millions of users is an iterative process with fine-tuning and optimisations.
- Maintain a stateless web tier, build redundancy, use caching, support multiple data centers, host static assets on content delivery networks (CDN).
- Scale data tier through sharding, split tiers into individual services, and monitor the system with automated tools.
Chapter 2: Back-of-the-Envelope Estimation
- Estimations using thought and common numbers to check design requirements.
- Basics require understanding numbers well, for example powers of two, latency numbers for programming, and estimate availability numbers in your server stack.
- Calculations use powers of 2.
- ASCII character uses 8 bit memory.
- Memory is fast and use compression; data centers send data slowly.
- High availability is continuous/long operational service measured as a percentage where most services are measured between 99%-100%. Service level Agreement measures for the agreement between up-time, better if measured as a percentage.
- When estimating to calculate the back of envelope make use of rounding, write your assumption, label your units, and consider QPS along with storage caches.
Chapter 3: Framework For System Design Interviews
- System design interviews involve collaboration to solve an open-ended and ambiguity problem.
- Focus on showcasing design skillset instead of final design which is why those that collaborate succeed.
- Engineers that over engineer run the risk of designing a pure and high pricing system which should be considered red flags in design.
- This section contains useful tips to design effective problems through effective steps.
- Step 1 is to understand the problem and design scope by clarifying requirements with all your assumptions and gathering all the needed information.
- Asking what features are needed?, how fast they will scale?, and the company stack etc. help greatly improve the analysis.
- The next step is to propose high level design, collaborate with the interviewer in the process.
- Work step by step ,come up with a plan, feedback loop and then think aloud while communication is key.
- Work to achieve all the desired outcomes with a great blueprint that is high level is obtained.
- Time management essential as it is easy to get carried away with minute details that do not demonstrate your abilities you must be armed.
- Try not to get into unnecessary details.
- Wrap up by asking follow-up questions, identifying the system bottlenecks, and discussing potential improvements such as Server Failure. -Time analysis should allocate 3-10 mins to understand problem, 10-15mins to propose high-level design, 10 – 25 mins to design deep dives and only 3-5 to wrap.
Do's and Don't
- Always ask for clarification to understand the requirements of the problem and correct your assumptions,
- There is no a right answer a solution for the same problem can be approached from several routes with the right communication. Don'ts
- Not being prepared for typical interview questions (like those discussed previously), being unprepared or failing to make high-level design, or not asking questions with the interviewer before design.
Chapter 4: Design A Rate Limiter
- It controls rate of traffic sent by service or client through HTTP limits which is allowed through a time window which excessed calls blocked.
API Rate Limiter Benefits
- Prevent Resource starvation caused by DoS attacks
- Almost publish API enforce with Twitter which can provide limit of 300 tweets in 3 hours.
- Reduce cost through paid API that help save on external API.
- Essential to reduces cost.
- Prevent servers from overloaded.
Important
- Important to understand problem, scope and implementation on type of rate and clarity. Types
- Server/Client side.
- Properties that throttle requests (IP, User ID)
- Scale of the system
- Environmental distribution
- Needs for users of protocol.
Requirements
- Limit excessive requests
- Low latency for HTTP.
- Use of little memory
- Distributed rate limit.
- Exception handling with clear messaging.
- High fault tolerance.
Algorithm For Rate Limiting
- Implementations are generally done at all, with this chapter focus on server and gateway limiting through algorithms like:
- Token Bucket
- Leaking Bucket
- Fixed window counter
- Sliding window logs.
- Sliding window counter
- Token Bucket is widely used over internet companies using defined buckets with preset rate and time period
- A request consumes a toke and checks for if enough tokens are present.
- Algorithm depends on bucket size.
- And its easy to implements in efficient memory with quick burst limit.
Leaking Bucket Algorithm: Requests
- The algo is to similar to token but it is implemented FIFO - First in first out with an queue.
- Check system queues its full rate is adapted
- Size means equal for the Queue.
- Two parameters - A Bucket size and Outflow Rate.
Fixed window counter: The algorithm includes:
- Time Windows and assigns counter
- Every request increments by 1.
Sliding window log algorithm
- As its known algorithms had may cause to go through the window.
- This algorithm logs all timestamps.
- Algorithm keeps tracks all request timestamps and data in redis.
- Remove outdate timestamps that is defined as order of star of current window. The log has the same/lower and accepted otherwise rejected.
Sliding window counter algorithm
- Combines fixed windows and sliding.
High-level architecture.
-Simple as you need and counter store based user and P address
- In memory cache for fast time is a good practice
- The user gets the limit
- The client middleware counter will correspond with bucket and limited will reach requests are sent to API.
Details.
- Limit rate on what’s written for configurations and data integrity with HTTP and its limitations.
- In an events will 1 or 0 for the headers clients are returned for their API for the Rate
Distributing in the environment.
- This often requires some amount synchronizing because they state list clients will direct traffic and cannot work.
- Redis stores can implement a flexible solution traffic the sand server.
Performance optimization.
- Optimisation in both with areas to import and increase speed and traffic in locations
- Using synchronized event in the consistent model
- We will know to achieve the current state with rate limit is effective and data to check.
Step 4 Wrap Up.
Algorithms of rate limiting and pros and cons with additional point.
- Heart and soft traffic can exceed.
- Avoid rate limits
- Add time re-try.
Chapter 5: Design Consistent Hashing
-
Key idea is in horizontal scaling, which efficiently redistributes data/requests evently acorss servers by using a hashing technique.
-
Rehashing
-
Problems will arise when a server is added or removed.
Consistent Hashing Technique
- Wikipedia: resizing a hash table.
- Space hashing
- SHA - 1: encrypts 0 to 1 +2^16-1 and mapping to the hash ring: https://en.wikipedia.org/wiki/Consistent_hashing.
- Hashing Keys use to map servers base don hash function using the same hash functions and data/servers are re mapped as a ring which minimizes need to remap after edits
Basic Approach Issues
-
Hard to maintain consistent size of partition to added server & hard to uniform non-key distribution and using nodes/replicas solves these problems.
-
Virtual Nodes
-
Each the server with 3 and on the ring to find which server key's location server.
-
As the number of virtual increased in distribution with standard derivation (smaller the more virtual which data is spread).
-
Find a set of keys.
-
Is the added after the moves the affect range.
-
To removed small traction, require with chart that they key the server with more
The benefit includes;
- Redistributes Keys when servers are added or removed
- Easy to scale horizontally
- Migrate hotspotting
Studying That Suits You
Use AI to generate personalized quizzes and flashcards to suit your learning preferences.
Related Documents
Description
Explore caching limitations for persistent data and the impact of expiration policies. Understand consistency in caching, single points of failure, and mitigation strategies using CDNs. Learn about cache eviction and optimizing Content Delivery Networks for performance and resilience.