Mastering System Design [01]: Foundations of System Design

Bhavyansh @ DiversePixel
8 min read · Sep 5, 2024

This article will cover essential topics like client-server architecture, databases, APIs, security protocols, rate limiting, idempotency, and video streaming, laying a solid foundation for designing robust and scalable systems.

1. Client-Server Architecture

What is Client-Server Architecture? Client-server architecture is a model where client devices request services and resources from a centralized server. The server then processes these requests and returns the appropriate responses.

How It Works:

  1. Clients: Devices or applications that initiate requests. Examples include web browsers, mobile apps, and desktop applications.
  2. Server: A central system that processes client requests, providing services or data. Servers can host websites, store databases, or perform complex computations.
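
The request/response cycle above can be sketched with Python's standard library. This is a minimal illustration, not a production server: the handler name `HelloHandler`, the helper `fetch_from_local_server`, and the response text are all made up for the example, and the server only listens on the local machine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: turns each GET request into a small text response."""

    def do_GET(self):
        body = f"You requested {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging for the demo


def fetch_from_local_server(path: str) -> str:
    """Start a throwaway server, act as a client once, then shut down."""
    # Port 0 asks the OS for any free port; 127.0.0.1 keeps it local.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        # Client side: initiate the request and read the response.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        text = conn.getresponse().read().decode("utf-8")
        conn.close()
        return text
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_from_local_server("/inbox")` plays both roles in one process: the handler is the server, and the `HTTPConnection` is the client.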

Real-World Examples:

  • Web Browsing: When you access a website, your browser (client) requests HTML, CSS, and JavaScript files from a web server.
  • Email Services: Clients like Outlook or Gmail request data from email servers to show your inbox and send emails.

Purpose Solved: This architecture allows centralized management, resource sharing, and easy maintenance, making it ideal for various applications, from small websites to large-scale enterprise solutions.

2. Database Fundamentals

Relational Databases (RDBMS): Relational databases use tables to store data. They support SQL for querying and offer robust support for ACID (Atomicity, Consistency, Isolation, Durability) properties to ensure reliable transactions.

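SQL is used instead of a general-purpose scripting language because the database engine runs the query next to the data, using indexes and reading only the rows it needs. A script would typically have to load the data into memory first, and loading terabytes of data that way is impractical.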

Key Components:

  • Tables: Store data in rows and columns.
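  • Indexes: Speed up searches by allowing quick lookup of rows based on specific columns. An index is an auxiliary data structure (commonly a B-tree) that keeps the indexed column values in sorted order along with pointers to the matching rows, so the database can find rows without scanning the whole table.
    - They require extra storage and slow down writes, since every insert or update must also update the index.
    - You shouldn’t index every column; index the ones you filter, join, or sort on most often.
    - They come in several types, such as B-tree, hash, bitmap, and reverse key indexes.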
  • ACID Properties: Ensure reliable transactions even in failure scenarios.
    - Atomicity: The entire transaction must finish, or the database is reverted to its original state.
    - Consistency: Changes are made to tables only in predefined ways, so data integrity constraints always hold.
    - Isolation: Multiple users can read and write concurrently without their transactions interfering with each other.
    - Durability: Once a transaction is committed, its changes persist in non-volatile storage, even after a crash.
    Most major relational databases are ACID compliant.
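
To make indexes and atomicity concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `accounts` table, the `idx_accounts_owner` index, and the helper names are assumptions made up for this example. The transfer deliberately credits the receiver first, so a failed debit shows the earlier update being rolled back.

```python
import sqlite3


def make_bank() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " owner TEXT NOT NULL,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    # Index the column we expect to search by.
    conn.execute("CREATE INDEX idx_accounts_owner ON accounts(owner)")
    conn.executemany(
        "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
        [("alice", 100), ("bob", 50)],
    )
    conn.commit()
    return conn


def uses_index(conn: sqlite3.Connection) -> bool:
    """Ask SQLite how it would run the lookup and check for our index."""
    plan = conn.execute(
        "EXPLAIN QUERY PLAN SELECT balance FROM accounts WHERE owner = ?",
        ("alice",),
    ).fetchall()
    return any("idx_accounts_owner" in row[-1] for row in plan)


def transfer(conn: sqlite3.Connection, sender: str, receiver: str, amount: int) -> bool:
    """Move money atomically: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
                (amount, receiver),
            )
            # Violates the CHECK constraint if the sender cannot afford it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT owner, balance FROM accounts"))
```

When a transfer exceeds the sender's balance, the constraint (consistency) rejects the debit and the rollback (atomicity) undoes the credit that had already been applied.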

Real-World Examples:

  • MySQL: A popular open-source relational database used by companies like Facebook and Twitter.
  • PostgreSQL: Known for its advanced features and reliability, used by platforms like Instagram.

Key-Value Stores (e.g., Redis): Key-value stores are NoSQL databases optimized for simplicity and speed. They store data as a collection of key-value pairs, making them ideal for caching, session storage, and real-time analytics.

How Redis Works: Redis stores data in-memory for fast access, with options to persist data to disk. It’s highly performant for read-heavy operations and supports data structures like strings, hashes, lists, and sets.
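
The core idea of an in-memory key-value store with expiring keys can be sketched in a few lines of plain Python. This is not Redis and does not use a Redis client; the `TTLCache` class and its method names are hypothetical, chosen only to show how a session or cache entry can be stored by key and dropped once its time-to-live passes.

```python
import time


class TTLCache:
    """Toy in-memory key-value store where entries can expire."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data = {}      # key -> (value, expires_at or None)

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict the expired entry on read
            return default
        return value
```

A session store might call `cache.set("session:42", user_id, ttl_seconds=1800)` on login and treat a `None` from `get` as "logged out".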

Purpose Solved: Databases provide structured storage, quick retrieval, and data integrity, critical for applications requiring reliable data management and real-time access.

3. APIs

What are APIs? APIs (Application Programming Interfaces) allow different software systems to communicate with each other. They define a set of rules and protocols for building and interacting with software applications.

Types of APIs:

  • RESTful APIs: Follow REST principles and are stateless, allowing scalability and simplicity.
  • GraphQL APIs: Offer flexibility in data querying, allowing clients to request specific data fields.

Real-World Examples:

  • GitHub API: Allows developers to interact programmatically with GitHub repositories, automate tasks, and integrate with other tools.
  • Twitter API: Enables developers to access and interact with Twitter data, post tweets, and more.

Webhook Example: GitHub Webhooks: GitHub webhooks allow external applications to receive real-time notifications of events in a GitHub repository, such as commits, pull requests, or issues.

Purpose Solved: APIs and webhooks enable interoperability between different software systems, facilitating automation and integration.

4. HTTPS and SSL/TLS

Why HTTPS Matters: HTTPS (Hypertext Transfer Protocol Secure) is the secure version of HTTP, encrypting data between the client and server to prevent eavesdropping and man-in-the-middle attacks. It ensures that communication over the web remains private and protected.

How HTTPS and SSL/TLS Work:

  • HTTP Overview: HTTP provides an abstraction over the TCP protocol, simplifying the process of web communication for developers. However, HTTP is inherently insecure, which is why HTTPS, an extension of HTTP, is necessary.

Encryption: In HTTPS, data is encrypted by the sender and decrypted by the receiver to maintain confidentiality and integrity. Encryption types include:

  • Symmetric Encryption: The same key is used for both encryption and decryption (e.g., AES).
  • Asymmetric Encryption: Involves a public key for encryption and a private key (kept secret) for decryption (e.g., RSA or elliptic-curve algorithms).

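TLS Handshake Process (Successor of SSL): HTTPS runs on top of TLS (Transport Layer Security), the successor to SSL (Secure Sockets Layer). The TLS handshake establishes a secure session. The steps below describe the classic RSA key exchange used in TLS 1.2 and earlier; TLS 1.3 replaces it with an ephemeral Diffie-Hellman exchange, but the goal of agreeing on shared session keys is the same: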

  1. Client Hello (cHello) -> Server Hello (sHello): The client and server exchange random byte strings (the client random and the server random). The server also sends its SSL certificate, containing its public key.
  2. Pre-Master Secret: The client generates a pre-master secret, encrypts it using the server’s public key (from the SSL certificate), and sends it to the server.
  3. Session Key Generation: The server decrypts the pre-master secret with its private key. Both sides then combine the pre-master secret with the two random values to derive the same symmetric session keys. These keys are valid only for the current session and keep the rest of the communication efficient to encrypt.
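
The key point of the last step is that both sides can compute the same session key without ever sending it. The sketch below shows that idea only. It is a simplification: real TLS uses its own key derivation function and transmits the pre-master secret encrypted under the server's public key, whereas here `derive_session_key` is a made-up helper that uses HMAC-SHA256 as a stand-in and the public-key step is omitted.

```python
import hashlib
import hmac
import secrets


def derive_session_key(pre_master_secret: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """Toy derivation: mix the shared secret with both random values."""
    message = b"session key" + client_random + server_random
    return hmac.new(pre_master_secret, message, hashlib.sha256).digest()


# Step 1: both sides exchange random byte strings in the clear.
client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)

# Step 2: the client picks a pre-master secret. In a real handshake it would
# travel encrypted under the server's public key; that part is skipped here.
pre_master_secret = secrets.token_bytes(48)

# Step 3: each side derives the session key independently from the same inputs.
client_key = derive_session_key(pre_master_secret, client_random, server_random)
server_key = derive_session_key(pre_master_secret, client_random, server_random)
```

Because the random values change on every handshake, the derived key changes too, which is why session keys are only valid for one session.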

SSL Certificate: Issued by a Certification Authority (CA), an SSL certificate contains the server’s public key and is signed by the CA. The client uses the CA’s public key to verify the certificate’s authenticity before proceeding.
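
In practice, certificate verification is usually handled by a TLS library rather than written by hand. As a rough illustration, Python's standard `ssl` module builds a client context that requires a valid certificate chain and a matching hostname by default. The helper `open_verified_connection` is a hypothetical name for this example and needs network access to run.

```python
import socket
import ssl

# Default client context: loads the system's trusted CA certificates.
context = ssl.create_default_context()

# Both checks are on by default; the handshake fails if either one fails.
certificate_required = context.verify_mode == ssl.CERT_REQUIRED
hostname_checked = context.check_hostname


def open_verified_connection(host: str, port: int = 443) -> str:
    """Connect, run the TLS handshake, and return the negotiated TLS version."""
    with socket.create_connection((host, port), timeout=5) as sock:
        # server_hostname is what the certificate is checked against.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

If the server presents a certificate that is expired, self-signed, or issued for a different hostname, `wrap_socket` raises an error instead of returning a connection.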

Real-World Examples:

  • Online Banking: HTTPS ensures that sensitive financial information, like passwords and account details, remains confidential and tamper-proof.
  • E-commerce Websites: Protects customer data, such as credit card information, during online transactions.

Purpose Solved: HTTPS and SSL/TLS provide essential security by encrypting data during transmission and ensuring both integrity and confidentiality. They establish trust between the client and server, allowing safe online communication.

5. Rate Limiting

What is Rate Limiting? Rate limiting controls the number of requests a client can make to a server within a specific timeframe. It prevents abuse, ensures fair usage, and protects server resources.

How It Works:

  • Fixed Window: Limits the number of requests in a fixed time period (e.g., 100 requests per minute).
  • Sliding Window: Counts requests within a rolling timeframe (e.g., the last 60 seconds), which avoids the burst of traffic that a fixed window can let through around a window boundary.
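
The two strategies can be compared with a small sketch. The class names and the injectable `clock` parameter are assumptions for this example, and each limiter tracks a single client; a real service would keep one counter per user or IP, often in a shared store.

```python
import time
from collections import deque


class FixedWindowLimiter:
    """Allow up to `limit` requests in each fixed window of time."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.window_index = None
        self.count = 0

    def allow(self) -> bool:
        index = int(self.clock() // self.window)
        if index != self.window_index:
            # A new window has started, so the counter resets.
            self.window_index = index
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLimiter:
    """Allow up to `limit` requests in any rolling window of time."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.timestamps = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop requests that have aged out of the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

With a limit of 2 per 60 seconds, the fixed window accepts two requests at second 59 and two more at second 60, while the sliding window rejects the later ones because all four fall within the same rolling minute.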

Real-World Examples:

  • API Services: Twitter limits API requests to prevent abuse and ensure fair usage.
  • Web Scraping: Prevents bots from overwhelming websites with too many requests.

Purpose Solved: Rate limiting ensures service stability and prevents abuse, enhancing the security and reliability of applications.

6. Idempotency and Intelligent Retry

What is Idempotency? Idempotency ensures that performing the same operation multiple times has the same effect as performing it once. This is crucial for ensuring data consistency, especially in distributed systems.

Why It’s Important:

  • Ensures Data Integrity: Prevents unintended consequences from duplicate requests (e.g., multiple payments).
  • Improves Reliability: Supports robust retry mechanisms without adverse effects.

How to Implement:

  • Unique Request IDs: Track each request to ensure it’s only processed once.
  • Idempotent Endpoints: Design endpoints to handle repeated requests safely.
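
A common way to combine both ideas is to have the client send a unique request ID and have the server remember the result for each ID. The sketch below assumes a made-up `PaymentService` that keeps everything in memory; a real system would store request IDs in a durable database and guard against two copies of the same request arriving at the same moment.

```python
class PaymentService:
    """Toy payment endpoint that is safe to call repeatedly with the same ID."""

    def __init__(self):
        self._results = {}  # request_id -> response already returned
        self.charges = []   # side effects that actually happened

    def charge(self, request_id: str, account: str, amount_cents: int) -> dict:
        # Seen this request before? Return the stored response, do nothing else.
        if request_id in self._results:
            return self._results[request_id]

        self.charges.append((account, amount_cents))
        result = {
            "status": "charged",
            "account": account,
            "amount_cents": amount_cents,
            "charge_number": len(self.charges),
        }
        self._results[request_id] = result
        return result
```

If the client times out and resends `charge("req-1", "alice", 500)`, it gets the original response back and Alice is charged only once.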

Real-World Examples:

  • Payment Systems: Prevent double charges if a payment request is inadvertently repeated.
  • Order Processing: Ensure that the same order isn’t placed multiple times.

Purpose Solved: Idempotency and intelligent retry mechanisms maintain data consistency and reliability, especially in scenarios with intermittent connectivity or failures.
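
One common reading of "intelligent retry" is exponential backoff with jitter: wait longer after each failure and randomize the wait so many clients do not retry in lockstep. The following is a sketch under that assumption. `TransientError` and `retry_with_backoff` are hypothetical names, and the `sleep` and `rng` parameters exist only so the behavior can be tested without waiting.

```python
import random
import time


class TransientError(Exception):
    """Stands in for failures worth retrying, such as timeouts."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep, rng=random.random):
    """Call `operation` until it succeeds or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # The delay ceiling doubles on each attempt, up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter: sleep for a random fraction of the ceiling.
            sleep(ceiling * rng())
```

Retrying like this is only safe when the operation is idempotent, which is why the two ideas are usually discussed together.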

7. Video Streaming

How Video Streaming Works: Video streaming allows continuous transmission of video files from a server to a client, enabling real-time playback without downloading the entire file.

Key Components:

  • Content Delivery Networks (CDNs): Distribute video content across multiple servers to reduce latency and improve availability.
  • Adaptive Bitrate Streaming: Adjusts video quality based on the user’s internet speed, ensuring smooth playback.
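
The decision at the heart of adaptive bitrate streaming can be sketched as "pick the best quality the connection can sustain". The bitrate ladder, the `safety_factor`, and the function name below are illustrative assumptions, not values used by any particular player.

```python
# Hypothetical bitrate ladder: rendition name -> required bandwidth in kbps.
RENDITIONS_KBPS = {"240p": 400, "480p": 1000, "720p": 2500, "1080p": 5000}


def pick_rendition(measured_kbps, renditions=RENDITIONS_KBPS, safety_factor=0.8):
    """Choose the highest rendition that fits within the measured bandwidth."""
    # Leave some headroom so small dips in speed do not cause buffering.
    budget = measured_kbps * safety_factor
    affordable = [(kbps, name) for name, kbps in renditions.items() if kbps <= budget]
    if affordable:
        return max(affordable)[1]
    # Nothing fits: fall back to the lowest quality rather than stalling.
    return min(renditions, key=renditions.get)
```

A player would re-run this choice every few seconds as it measures download speed, switching renditions between video segments.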

Real-World Examples:

  • Netflix: Uses adaptive streaming to provide high-quality video based on the user’s internet speed and device capabilities.
  • YouTube: Relies on CDNs to deliver videos quickly and efficiently to users worldwide.

Purpose Solved: Video streaming technology ensures efficient delivery of video content, optimizing for performance and user experience.

8. Webhooks

What are Webhooks? Webhooks are user-defined HTTP callbacks that allow one system to send real-time data to another system whenever a specific event occurs. Unlike traditional APIs, where clients periodically check for updates (polling), webhooks push data to the client as soon as an event happens, enabling more efficient and immediate data exchange.

How Webhooks Work:

  1. Event Occurrence: An event (e.g., a new commit in a repository) triggers the webhook.
  2. Webhook Payload: The server sends an HTTP POST request to a pre-configured URL (the client endpoint) with a payload containing event details.
  3. Handling the Request: The receiving server processes the webhook request and performs an action, such as updating a database or sending a notification.
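
On the receiving side, the third step often boils down to parsing the payload and dispatching on the event type. The sketch below leaves out the HTTP layer and assumes a made-up `handle_webhook` function that returns a status code and a short message; the event names and payload fields are examples only.

```python
import json


def handle_webhook(event_type: str, raw_body: bytes, handlers: dict):
    """Parse a webhook payload and run the handler registered for the event."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return 400, "invalid JSON"

    handler = handlers.get(event_type)
    if handler is None:
        # Acknowledge events we do not care about so the sender stops retrying.
        return 200, "ignored"

    handler(payload)
    return 200, "processed"
```

A CI system might register `{"push": start_build}` so that every push event triggers a build while other events are acknowledged and ignored.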

Real-World Example: GitHub Webhooks

  • Purpose: GitHub webhooks allow external applications to receive notifications of events like commits, pull requests, or issues. For example, a Continuous Integration (CI) system might use webhooks to start testing automatically whenever new code is pushed to a repository.
  • How It Works: Developers set up webhooks in their GitHub repository settings, specifying a URL and the types of events to be notified about. When these events occur, GitHub sends a POST request with details, and the endpoint processes the information accordingly.

Purpose Solved: Webhooks enable real-time communication between different systems, reducing the need for constant polling and improving the efficiency and responsiveness of applications.

9. Practical Example: Designing a Secure RESTful API with Rate-Limiting, Webhook Integration, and Video Streaming Support

Scenario: You are tasked with designing a secure RESTful API for a video-sharing platform that supports rate-limiting to prevent abuse, integrates with webhooks to notify clients of events (such as video uploads or comments), and provides efficient video streaming.

Design Considerations:

  1. Security and HTTPS:

  • Ensure all API endpoints are served over HTTPS to protect data in transit with SSL/TLS encryption.
  • Use token-based authentication (e.g., OAuth 2.0) to secure API endpoints, ensuring that only authorized users can access or modify resources.

  2. Rate Limiting:

  • Implement a rate-limiting mechanism to prevent abuse by limiting the number of requests a user or IP can make within a given timeframe.
  • Choose an appropriate rate-limiting strategy (fixed window, sliding window, or token bucket) based on the API’s use case and expected traffic patterns.
  • Use headers to communicate rate-limit status to clients (e.g., X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset).

  3. Webhook Integration:

  • Allow clients to register webhook URLs to receive real-time notifications when specific events occur, such as video uploads, likes, or new comments.
  • Implement webhook signing (e.g., using HMAC) to verify the authenticity of webhook payloads and prevent malicious actors from spoofing requests.
  • Include retry mechanisms with exponential backoff in case webhook deliveries fail due to temporary network issues or endpoint unavailability.

  4. Video Streaming Support:

  • Store videos in a blob storage system (e.g., AWS S3 or Google Cloud Storage) to handle large files efficiently and scale storage capacity as needed.
  • Use a content delivery network (CDN) to cache and deliver video content closer to end users, reducing latency and improving streaming quality.
  • Implement adaptive bitrate streaming (e.g., using HLS or MPEG-DASH) to provide a smooth viewing experience regardless of the user’s network conditions.
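
Of the considerations above, webhook signing is the easiest to get subtly wrong, so here is a minimal sketch of it. It assumes the platform and each client share a secret, and that the signature is sent as a `sha256=<hex digest>` string in a request header; the function names and that header format are assumptions for this example.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Sender side: compute the signature to attach to the webhook request."""
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Receiver side: recompute the signature and compare in constant time."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature_header.encode("utf-8")
    )
```

The receiver must verify against the raw request body exactly as received, since re-serializing the JSON can change the bytes and break the signature.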

Architecture:

  1. API Gateway:

  • Acts as the single entry point for all client requests, handling security (HTTPS, authentication), rate limiting, and request routing to backend services.

  2. Webhook Service:

  • Responsible for managing webhook subscriptions, triggering webhooks upon specific events, and ensuring reliable delivery with retries and verification.

  3. Video Storage and CDN Integration:

  • Manages video uploads, stores video files in a scalable blob storage system, and integrates with a CDN for efficient content delivery.

  4. Rate Limiting Middleware:

  • Sits in the API Gateway to monitor and control the rate of incoming requests, enforcing rate limits based on predefined policies.
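
As one possible shape for the rate limiting middleware, the sketch below implements a token bucket for a single client and produces the rate-limit headers mentioned in the design considerations. The `TokenBucket` class is hypothetical, and the meaning of `X-RateLimit-Reset` varies between APIs; here it is assumed to be the number of seconds until the bucket is full again.

```python
import math
import time


class TokenBucket:
    """Each request spends one token; tokens refill at a steady rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated_at = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now

    def check(self):
        """Return (allowed, headers) for one incoming request."""
        self._refill()
        allowed = self.tokens >= 1
        if allowed:
            self.tokens -= 1
        missing = self.capacity - self.tokens
        headers = {
            "X-RateLimit-Limit": str(self.capacity),
            "X-RateLimit-Remaining": str(int(self.tokens)),
            # Assumed meaning: seconds until the bucket is full again.
            "X-RateLimit-Reset": str(math.ceil(missing / self.refill_per_second)),
        }
        return allowed, headers
```

A gateway would keep one bucket per user or IP, attach the headers to every response, and answer with HTTP 429 (Too Many Requests) when `allowed` is false.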

Outcome: By combining secure API design principles with rate limiting, webhook integration, and optimized video streaming, the platform provides a robust and user-friendly experience, handling high traffic and ensuring real-time interactions.

Conclusion

This foundational article lays the groundwork for understanding key concepts in system design. By mastering these topics, you’ll be equipped with the essential knowledge needed to build secure, efficient, and scalable systems.

Stay tuned for the next article in this series, where we’ll dive deeper into data management, scaling, and distributed systems!
