System Design - High Level PDF

Comprehensive System Design Roadmap For Beginners

Created By Shalini Goyal

This roadmap serves as a high-level syllabus for mastering system design. It covers the essential topics in distributed systems, networking, storage, and scalability, with enough context to help you build scalable, fault-tolerant systems. Each section builds on the previous one so that understanding deepens gradually.

1. Introduction to System Design

The system design phase of software development takes high-level requirements, both functional and non-functional, and turns them into a blueprint that guides construction of the system. The process involves strategic decisions about how the components of the system will interact, which technologies will be used, and how the system will meet its performance, scalability, and reliability goals.

● What is System Design?
  System design is the process of defining the architecture, modules, interfaces, components, and data flow needed to meet specific business or technical goals. It helps create systems that are scalable, reliable, secure, and easy to maintain.
● Why is it important?
  Large-scale systems need deliberate design to handle increased load, avoid downtime, and maintain fast response times.
● How to start learning?
  Start by breaking the problem down into components. Then identify communication paths, select storage mechanisms, and think about scalability and failure handling.
● Horizontal vs. Vertical Scaling
    ○ Horizontal Scaling: Adding more servers to distribute load (e.g., load balancers distributing traffic across multiple servers).
    ○ Vertical Scaling: Adding more resources (CPU, memory) to a single server (e.g., upgrading a database server).
● High-level Design vs. Low-level Design
    ○ High-level Design: Focuses on the overall architecture and how components interact (e.g., database, services, APIs).
    ○ Low-level Design: Delves into specific modules, class diagrams, and algorithms used within components.
● Monolith vs. Microservices
    ○ Monolith: One large application managing all logic (easier to start with but harder to scale).
    ○ Microservices: Separate smaller services, each managing a specific task (increases flexibility but introduces operational complexity).
● Logging and Metrics Calculation
    ○ Use logging to track important events, and metrics to monitor system health and performance (e.g., error rates, latency).
● Software Decoupling and Extensibility
    ○ Decouple components so they can be updated and scaled independently. Use APIs and queues to avoid tight coupling between modules.
● Primary-Secondary Architecture
    ○ A common database architecture in which the primary handles writes, while one or more secondaries replicate its data and serve reads to improve performance.

2. Key Characteristics of Distributed Systems

Distributed systems allow multiple servers to work together as one logical unit. These are the core traits that define them:

● Resource Sharing
  Components share hardware or software resources such as memory, CPU, or files.
● Openness
  Use of open standards (e.g., HTTP, REST) ensures interoperability with other systems.
● Concurrency
  Multiple requests are handled simultaneously. Systems need synchronization techniques (e.g., locking mechanisms) to avoid conflicts.
● Scalability
  The ability to grow with increasing demand. This includes scaling storage, compute, and network resources.
● Fault Tolerance
  Systems need to detect failures and continue to operate (e.g., by using replicas or redundant services).
● Transparency
  Users should not notice the distributed nature of a system (e.g., location transparency means users do not need to know where data is stored).

3. Networking and Protocols

Understanding networking is essential for communication between distributed components.

● IP, TCP, UDP
    ○ IP: Routes packets across networks.
    ○ TCP: Provides reliable, ordered delivery.
    ○ UDP: Faster, but does not guarantee delivery (useful for streaming).
● HTTP/HTTPS
    ○ Used for communication between clients and servers on the web. HTTPS adds security through encryption.
● DNS
    ○ Resolves domain names (e.g., google.com) into IP addresses.
● REST vs RPC APIs
    ○ REST: Stateless, resource-oriented APIs.
    ○ RPC: Remote Procedure Calls, which are function-oriented and useful in microservices.

4. Storage Systems

Data is the backbone of every system. Understanding storage solutions helps you build resilient systems.

● Shapes of Data
    ○ Structured (SQL), semi-structured, and unstructured (NoSQL) data.
● Sorts of Availability
    ○ Techniques like replication and failover ensure high availability.
● Consistency and Scalability
    ○ CAP theorem: A distributed system cannot guarantee Consistency, Availability, and Partition tolerance all at once. Because network partitions cannot be ruled out, the practical choice during a partition is between consistency and availability.
● SQL vs NoSQL Databases
    ○ SQL: Structured queries over relational tables (e.g., MySQL).
    ○ NoSQL: Flexible schemas (e.g., MongoDB) for handling large datasets.
● Memory Estimation
    ○ Estimate the memory needed for operations to avoid bottlenecks.
● Sharding and Partitioning
    ○ Divide data across servers to spread load (e.g., dividing user data by region).
● Database Replication
    ○ Replicate data across nodes for redundancy.
● Map-Reduce
    ○ A framework for processing large data sets across multiple nodes (used in Hadoop).
● ACID Properties
    ○ Ensure reliable transactions: Atomicity, Consistency, Isolation, Durability.

5. Metrics and Performance Monitoring

Performance metrics help you evaluate the system's responsiveness.

● Latency
    ○ Time taken for a request to travel from source to destination.
● Throughput
    ○ Number of requests processed in a given time frame.
● Availability
    ○ Uptime of the system, improved by avoiding Single Points of Failure (SPOF).

6. Proxies and Load Balancing

Proxies and load balancers distribute traffic efficiently across multiple servers.

● Forward and Reverse Proxies
    ○ Forward proxies act on behalf of clients, while reverse proxies act on behalf of servers (e.g., NGINX, HAProxy).
● Load Balancing Strategies
    ○ Round Robin: Hand requests to each server in turn.
    ○ Weighted Round Robin: Send more requests to more powerful servers.
    ○ IP Hashing: Route requests based on a hash of the client's IP.
    ○ Layer 4 vs Layer 7 Load Balancing: Distribute based on transport-layer information (IP address and port) or application-layer information (e.g., HTTP path or headers).

7. Caching Mechanisms

Caching helps reduce latency and offload work from databases.

● Cache-aside
    ○ The application loads data into the cache when it is needed.
● Write-through and Write-behind
    ○ Write-through updates the cache and the database together on every write. Write-behind updates the cache first and flushes to the database asynchronously, trading some consistency for lower write latency.
● Application, Database, and In-Memory Caching
    ○ Use Redis or Memcached to cache frequently accessed data.

8. Consistency and CAP Theorem

Consistency is crucial for maintaining the accuracy of data across nodes.

● Consistent Hashing
    ○ Distributes keys across servers by hashing both onto the same ring, so that adding or removing a server moves only a small share of the keys.
● CAP Theorem
    ○ Describes the trade-offs between Consistency, Availability, and Partition Tolerance.

9. Content Delivery Networks (CDN)

CDNs speed up content delivery by caching content near the user.

● Push CDN
    ○ Content is preloaded onto CDN nodes.
● Pull CDN
    ○ Content is fetched from the origin only when it is first requested, then cached.

10. Logging, Monitoring, and Alerting

Observability helps keep the system running smoothly and lets issues be resolved quickly.

● Log Collection, Transport, and Storage
    ○ Use tools like the ELK stack for managing logs.
● Analysis and Alerting
    ○ Identify patterns and set alerts for abnormal activity.

11. Rate Limiting

Rate limiting controls traffic to prevent abuse.

● Token Bucket and Leaky Bucket Algorithms
    ○ Manage request flow and prevent overload.

12. Polling and Streaming

Use the appropriate method for fetching updates.

● Polling
    ○ Regularly checking for new data (inefficient but simple).
● Streaming
    ○ A continuous flow of data in real time (e.g., Kafka for event streaming).

System Design Exercises and Learning Plan

Purpose of the Exercises

The following exercises challenge learners to build large-scale systems. They offer practice with architectural decision-making, system scalability, fault tolerance, and network management. The challenges simulate real-world problems and encourage learners to apply distributed design principles. Work through them yourself first rather than immediately searching for solutions.

Try to create your designs on paper, then look up published solutions and compare the differences.

Exercise Plan Overview

Each design exercise is divided into four key components:

1. Problem Definition: What is the system supposed to achieve?
2. Core Functional Requirements: Features necessary to make the system useful.
3. Non-Functional Requirements: Performance, scalability, availability, etc.
4. High-Level Solution Design: Architectural decisions, components, and trade-offs.

1. How to Design a URL Shortening Service like bit.ly?

Key Topics:
● Database design (NoSQL or relational)
● Hashing algorithms for generating short URLs
● Handling high read-write traffic
● Caching (in-memory cache like Redis)

Focus Areas:
● Core requirements: Generate, store, and retrieve short URLs.
● Non-functional needs: High availability, low latency.
● Scalability: Handle millions of URL requests per day.

2. How to Design a Website like Pastebin?

Key Topics:
● Text storage and retrieval
● Security (private/public pastes)
● Data expiration (auto-deletion)
● Database schema design

Focus Areas:
● Data structure: Use relational databases to store user text snippets.
● Expiration policies: Ensure old pastes are removed automatically.
● Authentication: Add user management with login features for private pastes.

3. How to Design a Chat Application like WhatsApp or Telegram?

Key Topics:
● Real-time messaging protocols
● Consistent and eventual message delivery
● Offline message storage
● Encryption (end-to-end security)

Focus Areas:
● Message queues: Use Kafka or RabbitMQ for message handling.
● Database: Use NoSQL for scalability (MongoDB or Cassandra).
● Real-time updates: Implement WebSockets for live chat.

4. How Would You Create Your Own Instagram?

Key Topics:
● Media storage (images/videos)
● Content feed algorithms
● User profile and authentication
● Caching and CDN for media delivery

Focus Areas:
● User feed: Design algorithms to personalize the feed.
● Storage: Use AWS S3 or similar services for media storage.
● Performance: Employ CDNs for fast image loading.

5. How to Create Your Own Twitter?

Key Topics:
● Handling real-time tweets and replies
● Follower-following relationship model
● Timeline generation
● Rate limiting for APIs

Focus Areas:
● Timeline: Use caching for quick timeline retrieval.
● Search: Implement text search with Elasticsearch.
● Follower model: Use graph-based data structures.

6. How to Design a File Sharing System like Google Drive or Dropbox?

Key Topics:
● File storage and synchronization
● User access control and permissions
● File versioning
● Distributed storage

Focus Areas:
● Storage: Use cloud services with replication for redundancy.
● Sync logic: Handle conflicts between versions.
● Access control: Ensure secure sharing of files.

7. How to Design a Global Video Streaming Service like Netflix?

Key Topics:
● Content delivery using CDNs
● Video encoding and storage
● Subscription management
● Load balancing for large-scale traffic

Focus Areas:
● Load management: Use load balancers to distribute traffic.
● Video storage: Store multiple resolutions of each video so playback can adapt to the viewer's bandwidth.
● CDN: Use edge servers for content delivery.

8. How to Design an ATM System?

Key Topics:
● Transactional operations (withdrawal, deposit)
● Network security
● Data synchronization between banks
● Fault-tolerant design

Focus Areas:
● Availability: Handle offline transactions.
● Consistency: Ensure account balance consistency.
● Security: Encrypt transactions.

9. How to Design a Web Crawler like Google?

Key Topics:
● Crawling strategies (BFS vs. DFS)
● URL prioritization
● Handling duplicate content
● Scalable storage of crawled data

Focus Areas:
● Efficiency: Avoid duplicate crawling.
● Scalability: Use distributed crawling with multiple bots.
● Storage: Store metadata with MongoDB or Elasticsearch.

10. How to Design an API Rate Limiter?

Key Topics:
● Token bucket and leaky bucket algorithms
● Monitoring traffic per IP
● Throttling strategies

Focus Areas:
● API limits: Enforce request limits to prevent abuse.
● Storage: Use Redis to manage tokens.
● Monitoring: Provide analytics for rate-limited endpoints.

11. How to Design a News Feed like Facebook?

Key Topics:
● Feed ranking algorithms
● Real-time updates
● Social graph management

Focus Areas:
● Ranking: Use algorithms to rank posts.
● Caching: Cache user feeds to minimize recomputation.
● Real-time: Use WebSockets for live updates.

12. How to Create a Deployment System?

Key Topics:
● Continuous Integration/Continuous Deployment (CI/CD)
● Rollbacks and versioning
● Deployment automation tools

Focus Areas:
● Automation: Use Jenkins or GitHub Actions.
● Version control: Integrate with Git repositories.
● Rollback: Ensure fast recovery from failed deployments.

13. How to Design a Multiplayer Card Game?

Key Topics:
● Game state synchronization
● Real-time communication
● Handling latency issues

Focus Areas:
● Game logic: Sync state across players.
● Communication: Use WebSockets for real-time gameplay.
● Scalability: Ensure servers can handle concurrent players.

14. How to Design a Ride-Hailing App like Uber?

Key Topics:
● Real-time GPS tracking
● Driver-rider matching algorithms
● Payment processing
● Load balancing for surge requests

Focus Areas:
● Location tracking: Use GPS APIs for tracking.
● Matching system: Optimize rider-driver matching.
● Payment: Secure payment gateway integration.

15. Design an Application like Google Docs

Key Topics:
● Collaborative editing
● Conflict resolution in real-time edits
● Document storage and permissions

Focus Areas:
● Collaboration: Implement operational transforms (OT) for edits.
● Storage: Use cloud storage with versioning.
● Permissions: Manage shared access.

Learning Path and Plan of Execution

1. Week 1-2:
    ○ Start with simpler systems like the URL shortener and Pastebin.
    ○ Focus on database design and caching mechanisms.
2. Week 3-4:
    ○ Move to chat applications and social media apps like Instagram.
    ○ Dive deep into real-time communication and feed generation.
3. Week 5-6:
    ○ Work on file sharing systems and video streaming platforms.
    ○ Study CDNs and load balancing techniques.
4. Week 7-8:
    ○ Design complex systems like Uber and Google Docs.
    ○ Learn synchronization algorithms and handle payment integration.
5. Week 9-10:
    ○ Build deployment systems and multiplayer games.
    ○ Practice CI/CD workflows and real-time state management.
6. Final Week:
    ○ Take on large-scale challenges like the news feed and web crawlers.
    ○ Present the designs and discuss trade-offs in group reviews.

How to Solve System Design Problems: URL Shortener Example

Step 1: Requirement Clarification

Before diving into design, it is essential to understand what the system needs to do and which trade-offs we are willing to make.

● Questions to Ask:
    1. What features must the system support?
        ■ Example: Users can input a long URL and receive a shortened version.
        ■ Should we allow URL expiry? (Optional feature)
    2. Should we focus only on the backend, only on the frontend, or on both?
        ■ Backend: Core URL shortening logic.
        ■ Frontend: Simple UI to input and display URLs.
    3. What scale of users are we designing for?
        ■ Estimate based on millions of users (high traffic).
    4. Are there any extended requirements?
        ■ Example: API access for third-party services.
    5. Should the system prioritize availability or consistency?
        ■ Availability is often prioritized in web services with high traffic.

Step 2: Capacity Estimations

This step helps us decide which resources the system will need and which architecture can handle the load.

● Questions to Answer:
    1. What scale is expected from the system?
        ■ Example: 500 million URLs per year.
    2. Define the read/write ratio.
        ■ Example: 80% reads (accessing URLs) and 20% writes (creating new URLs).
    3. Traffic Estimation:
        ■ Assume 10 million requests per day (about 116 requests per second on average).
        ■ Peak traffic: 500 requests per second.
    4. Storage Estimation:
        ■ Assume 1 KB per URL entry (including metadata).
        ■ For 5 years: 500 million × 5 = 2.5 billion URLs ≈ 2.5 TB of storage.
    5. Bandwidth Estimation:
        ■ Example: 50 GB/day for incoming/outgoing data requests.

Step 3: System API (System Interface) Design

This step defines how external users or systems will interact with your service.

● API Design Approach:
  Use RESTful APIs, since they are stateless and easy to integrate with web services.
● Endpoints:
    ○ POST /shorten: Accepts a long URL and returns a short version.
    ○ GET /{shortURL}: Redirects to the original URL.
    ○ GET /info/{shortURL}: Returns metadata (e.g., number of hits).
    ○ Optional: API for bulk URL shortening for enterprise use.

Step 4: Database Schema (Data Modeling)

Choosing the right database and schema is crucial for meeting the system's performance and storage requirements.

● Database Choice:
  Use NoSQL databases (like Redis or DynamoDB) for fast reads and writes. The access pattern is a simple key lookup, which key-value stores serve well, and they scale horizontally.
● Schema:
    ○ shortURL (Primary Key): Unique short identifier.
    ○ longURL: Original long URL.
    ○ createdAt: Timestamp of creation.
    ○ expiryDate: Optional field for URL expiration.
● Archival Storage:
  If the stored URLs and metadata grow large, older records can be moved to cloud object storage (like AWS S3) for archival purposes.

Step 5: High-Level Design

At this stage, create a block diagram showing the core components.

1. Components:
    ○ Frontend: Web interface to enter URLs and retrieve short ones.
    ○ Backend Services:
        ■ URL Shortening Service: Generates short URLs.
        ■ Redirection Service: Maps short URLs to original URLs.
        ■ Analytics Service: Tracks URL clicks and metadata.
    ○ Database: Stores the mapping between long and short URLs.
    ○ Cache Layer: Stores frequently accessed URLs (e.g., Redis).
2. Diagram:
  Include the following:
    ○ User → Web Frontend → API Layer → URL Shortener → Database.
    ○ The cache sits between the API and the database for performance.

Step 6: Detailed Design

Delve into the technical details such as caching strategies, partitioning, and handling high traffic.

● Partitioning Strategy:
  Use sharding on URL keys to split the database load across multiple servers.
● Handling Hot URLs:
  Cache popular URLs in Redis to avoid frequent database lookups.
● Load Balancing:
  Use a round-robin load balancer to distribute traffic across multiple servers.
● Caching Strategy:
    ○ Store short-to-long URL mappings in Redis.
    ○ Implement the cache-aside pattern: Check the cache first; if the entry is not found, query the database and update the cache.

Step 7: Handling Bottlenecks and Single Points of Failure

Design for reliability by eliminating SPOFs and introducing redundancy.

● Questions to Ask:
    1. Are there any single points of failure?
        ■ Example: If the only database instance crashes, all requests will fail.
    2. How do we remove these SPOFs?
        ■ Use database replication to ensure availability.
    3. Do we have enough data replicas?
        ■ Maintain multiple replicas across regions to prevent data loss.
    4. How are we monitoring performance?
        ■ Use monitoring tools like Prometheus to set up alerts for high latencies or errors.

Step 8: Performance Monitoring and Scaling

Monitoring helps identify bottlenecks early and allows proactive scaling.

● Metrics to Track:
    ○ Latency: How quickly the service returns a shortened URL or redirects a short one.
    ○ QPS (Queries per Second): How many requests the service handles each second.
    ○ Cache Hit Rate: Percentage of requests served from cache.
● Scaling Approach:
    ○ Horizontal Scaling: Add more servers as the user base grows.
    ○ Auto-scaling Policies: Use cloud-based auto-scaling to handle traffic spikes.

Final Solution Summary:

The URL Shortener system will have:

1. Frontend: Simple interface to input URLs.
2. Backend Services: Stateless API services for URL shortening and redirection.
3. Database: NoSQL for fast access with sharding.
4. Cache: Redis for hot URLs and quick lookups.
5. Monitoring Tools: Prometheus for metrics and alerts, with Grafana for dashboards.
6. Load Balancer: Distributes requests across servers.

Conclusion

By following the structured approach outlined above, we designed a highly scalable, available, and reliable URL shortening service. This problem-solving framework can be applied to other system design challenges, ensuring that you address both functional and non-functional requirements systematically.
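The capacity figures in Step 2 are easy to sanity-check with a few lines of arithmetic. The sketch below is only a back-of-the-envelope helper: the function name `estimate_capacity` and its parameters are hypothetical, 1 KB is taken as 1,000 bytes, and the inputs are the example values from Step 2 (500 million URLs per year, 1 KB per entry, 5 years, 10 million requests per day, 80% reads).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_capacity(urls_per_year, bytes_per_entry, years,
                      requests_per_day, read_fraction):
    """Return rough storage and traffic totals for a URL shortener.

    All names here are illustrative; the numbers passed in below are the
    example values from Step 2, not measurements.
    """
    total_urls = urls_per_year * years
    storage_bytes = total_urls * bytes_per_entry
    avg_rps = requests_per_day / SECONDS_PER_DAY
    return {
        "total_urls": total_urls,
        "storage_tb": storage_bytes / 1000**4,  # decimal terabytes
        "avg_rps": avg_rps,
        "avg_read_rps": avg_rps * read_fraction,
        "avg_write_rps": avg_rps * (1 - read_fraction),
    }


estimate = estimate_capacity(
    urls_per_year=500_000_000,
    bytes_per_entry=1_000,      # assumption: 1 KB = 1,000 bytes
    years=5,
    requests_per_day=10_000_000,
    read_fraction=0.8,
)

for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

With these inputs, five years of URLs comes to 2.5 billion entries and about 2.5 TB, and the average request rate is roughly 116 per second, so the assumed peak of 500 requests per second is a bit over four times the average.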
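The roadmap lists hashing algorithms as the way to generate short URLs but does not pick one. A common alternative, shown here only as an illustration and not as the approach the roadmap prescribes, is to give each new URL a unique numeric ID and encode that ID in base 62. The function names and the alphabet ordering below are assumptions.

```python
import string

# 62 symbols: 0-9, a-z, A-Z. The ordering is an arbitrary choice for this sketch.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(number: int) -> str:
    """Encode a non-negative integer ID as a base-62 short code."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Decode a base-62 short code back to its integer ID."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number


print(encode_base62(2_500_000_000))
```

Six base-62 characters cover 62^6 (about 56.8 billion) distinct IDs, which comfortably exceeds the 2.5 billion URLs that 500 million per year would produce over five years. The trade-off is that sequential IDs make codes guessable, which is one reason a design might prefer a hash or random code with collision checks.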
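The cache-aside pattern from Step 6 (check the cache first; on a miss, query the database and update the cache) can be sketched in a few lines. To keep the example self-contained, plain dictionaries stand in for Redis and for the database; the class name `UrlStore`, the method `resolve`, and the `db_reads` counter are hypothetical.

```python
class UrlStore:
    """Cache-aside lookup for short-to-long URL mappings.

    `database` is any dict-like mapping; here a plain dict stands in for
    the real database, and `self.cache` stands in for Redis.
    """

    def __init__(self, database):
        self.database = database
        self.cache = {}
        self.db_reads = 0  # counts database lookups, for illustration only

    def resolve(self, short_code):
        # 1. Check the cache first.
        if short_code in self.cache:
            return self.cache[short_code]
        # 2. On a miss, fall back to the database.
        self.db_reads += 1
        long_url = self.database.get(short_code)
        # 3. Populate the cache so the next lookup is served from memory.
        if long_url is not None:
            self.cache[short_code] = long_url
        return long_url


store = UrlStore({"abc123": "https://example.com/some/long/path"})
first = store.resolve("abc123")   # miss: reads the database, fills the cache
second = store.resolve("abc123")  # hit: served from the cache
print(first == second, store.db_reads)
```

A real deployment would also set a time-to-live on cached entries and decide what to do about unknown codes; this sketch does not cache misses, so repeated lookups of a nonexistent code always reach the database.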
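Section 8 of the roadmap names consistent hashing, and Step 6 shards the database on URL keys. The sketch below shows one way a hash ring can assign keys to servers; it is a minimal illustration under assumptions, not a production design. The class name `HashRing`, the use of SHA-256, the 100 virtual nodes per server, and the server names are all choices made for this example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node is placed at many positions to even out the load.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # The first ring position clockwise from the key's hash owns the key.
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]


ring = HashRing(["server-a", "server-b", "server-c"])
keys = [f"short-{i}" for i in range(1000)]
before = {key: ring.get_node(key) for key in keys}

ring.add_node("server-d")
after = {key: ring.get_node(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a server")
```

The property worth noticing is that every key that changes owner moves to the newly added server; keys never shuffle between the existing servers, which is what keeps resharding cheap compared with `hash(key) % server_count`.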
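The token bucket algorithm mentioned under Rate Limiting and in the API rate limiter exercise can be sketched as follows. This is a single-process illustration under assumptions: the class name `TokenBucket`, its parameters, and the injectable `clock` argument are hypothetical, and a shared limiter would keep the token count in a store such as Redis, as the exercise suggests.

```python
import time


class TokenBucket:
    """Token bucket: tokens refill at a steady rate up to a fixed capacity.

    Each request spends tokens; a request is rejected when the bucket
    does not hold enough. `clock` is injectable so the behaviour can be
    tested without sleeping.
    """

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock
        self.tokens = float(capacity)   # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Allow bursts of up to 5 requests, refilling 1 token per second.
limiter = TokenBucket(capacity=5, refill_rate=1.0)
results = [limiter.allow() for _ in range(7)]
print(results)
```

The capacity sets how large a burst is tolerated and the refill rate sets the sustained request rate. A leaky bucket differs in that it drains queued requests at a constant rate, smoothing bursts instead of permitting them.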
