- Key Characteristics and Fundamentals of Distributed Systems
- Monolithic VS Microservice (Service Discovery, Resiliency)
- Vertical vs horizontal scaling
- Load Balancing / Application Delivery Controller (ADC)
- Consistent Hashing
- Throughput, Latency
- CAP theorem
- ACID vs BASE
- Redundancy and Replication
- Partitioning/Sharding
- Optimistic vs pessimistic locking
- Strong vs eventual consistency
- SQL vs NoSQL
- Types of NoSQL (Key value, Wide column, Document-based, Graph-based)
- Caching
- Data center/racks/hosts
- CPU/memory/Hard drives/Network bandwidth
- Random vs sequential read/writes to disk
- DNS lookup
- HTTP, HTTPS, HTTP/2
- HTTP
- HTTPS
- HTTP & SSL/TLS
- Public key infrastructure and certificate authority (CA)
- Symmetric vs asymmetric encryption
- WebSockets
- Long-Polling vs WebSockets vs Server-Sent Events
- TCP/IP model
- IPv4 vs IPv6
- TCP vs UDP
- Consistent Hashing
- CDNs & Edges
- Data Partitioning
- Indexes
- Master-Slave, Master-Master
- Active-Passive, Active-Active
- Leader election
- Design patterns and Object-oriented design
- Virtual machines and containers
- Pub-sub architecture
- REST, GraphQL
- MapReduce
- Bloom filters and Count-Min sketch
- Paxos
- Multithreading, locks, synchronization, CAS (compare-and-set)
- Proxies
- Authentication
- JWT
- OAuth 2.0
- File / Media Upload
- S3, Multiple Quality Files
- WIP...
- Databases Comparison
- Cassandra
- MongoDB/Couchbase
- RabbitMQ / Kafka / Pub-Sub comparison
- RabbitMQ
- Google PubSub
- MySQL / PostgreSQL
- Scalability in Postgres
- Redis / Memcached
- InfluxDB [Suitable for TimeSeries, IoT data]
- Zookeeper
- NGINX
- HAProxy
- Solr, Elasticsearch
- Amazon, EC2, S3
- Docker, Kubernetes
- Hadoop/Spark and HDFS
- Eureka, Hystrix
- Heroku / Azure DevOps
- Jenkins CI/CD
- TinyURL
- Instagram | Photo hosting platform
- Timeline | Newsfeed | Twitter
- Dropbox | Google Drive
- WhatsApp | Facebook Messenger
- MakeMyTrip | BookMyShow
- Amazon | Flipkart
- YouTube | Netflix
- Uber | IRCTC
- Swiggy | Zomato
- Yelp | Nearby
- Twitter Search
- Google Search
- SplitWise
- Zerodha
- API Rate Limiter
- Web Crawler
- Rate limiting system
- Distributed cache
- Typeahead Suggestion | Auto-complete system
- Recommendation System
- Design a tagging system like tags used in LinkedIn
Low Level Design Problems (Machine Coding Round) Reference
- Elevator system
- Snake and Ladder game
- Tic Tac Toe
- ATM machine - https://medium.com/swlh/atm-an-object-oriented-design-e3a2435a0830
- Traffic Control System
- Vehicle Parking System
- Online Coding Platform problem-statement
- File Sharing System
- Object Oriented Design Preparations [https://www.oodesign.com/]
- SOLID Principles
- Design Patterns [https://refactoring.guru/design-patterns]
- More Problems List
- More Good Resources:
Engineering Blogs Ref
Airbnb AirPair Artsy Asana Bandcamp BenefitFocus Bitly Bittorrent Cerner Chartbeat Cloudera Cloudflare Docker Dropbox Ebay Etsy Eventbrite Facebook Flickr Fiftythree Flipboard Foursquare Github Gnip GoSquared Grouper Groupon Harry's Heroku Honeybadger Indeed Instagram Intent Linkedin Livechat Medallia Monetate Netflix Oyster Paypal Pinterest Prezi Quora Rightscale Salesforce Shopify Simple Slideshare Songkick Soundcloud Spotify Square Strava Tumblr Twitter Twilio Thumbtack Wayfair Wealthfront Webengage Yahoo Yammer Yelp Zenpayroll Zillow
- How to Ace a Systems Design Interview - https://www.palantir.com/2011/10/how-to-ace-a-systems-design-interview/
- HighScalability Blog - http://highscalability.com/
- Distributed Systems - http://book.mixu.net/distsys/single-page.html
- Distributed Deep Dive - https://ably.com/blog/introducing-distributed-deep-dive-interview-series-by-ably-realtime
- Architecture for microservice by Microsoft - https://docs.microsoft.com/en-us/dotnet/architecture/microservices/
1. If we are dealing with a read-heavy system, it's good to consider using a cache.
2. If we need low latency in the system, it's good to consider using a cache and a CDN.
3. If we are dealing with a write-heavy system, it's good to use a message queue for asynchronous processing, or append-only logs.
4. If the system needs to be ACID compliant, we should go for an RDBMS (SQL database).
5. If data is unstructured and doesn't require ACID properties, we should go for a NoSQL database.
6. If the system has complex data in the form of videos, images, files, etc., we should go for blob/object storage.
7. If the system requires complex or heavy pre-computation, like a news feed, we should use a message queue and a cache.
8. If the system requires searching data in high volume, we should consider using a search index, tries, or a search engine like Elasticsearch.
9. If the system needs to scale a SQL database, we should consider using database sharding and partitioning.
10. If the system requires high availability, performance, and throughput, we should consider using a load balancer.
11. If the system requires faster data delivery globally, reliability, high availability, and performance, we should consider using a CDN.
12. If the system has data with nodes, edges, and relationships, like friend lists and road connections, we should consider using a graph database.
13. If the system needs scaling of various components like servers, databases, etc., we should consider using horizontal scaling.
14. If the system requires high-performing database queries, we should use database indexes.
15. If the system requires bulk job processing, we should consider using batch processing and message queues.
16. If the system needs to reduce server load and mitigate DoS attacks, we should use a rate limiter.
17. If the system has microservices, we should consider using an API gateway (authentication, SSL termination, routing, etc.).
18. If the system has a single point of failure, we should implement redundancy in that component.
19. If the system needs to be fault-tolerant and durable, we should implement data replication (creating multiple copies of data on different servers).
20. If the system needs fast, bi-directional user-to-user communication, we should use WebSockets.
21. If the system needs the ability to detect failures in a distributed system, we should implement a heartbeat.
22. If the system needs to ensure data integrity, we should use a checksum algorithm.
23. If the system needs to scale servers efficiently as nodes are added or removed, with no hotspots, we should implement consistent hashing.
24. If the system needs to transfer data between various servers in a decentralized way, we should go for a gossip protocol.
25. If the system needs to deal with location data, like maps or nearby resources, we should consider using a quadtree, geohash, etc.
26. Avoid using specific technology names such as Kafka, S3, or EC2. Try to use more generic names like message queue, object storage, etc.
27. If high availability is required in the system, it's better to mention that the system cannot also guarantee strong consistency during a network partition (CAP theorem). Eventual consistency is possible.
28. If asked how a domain name typed in the browser resolves to an IP address, try to sketch or mention DNS (Domain Name System).
29. If asked how to limit the huge amount of data returned for a network request, like YouTube search or trending videos, one way is to implement pagination, which limits the response data.
30. If asked which policy you would use to evict entries from a cache, the most commonly asked cache eviction policy is LRU (Least Recently Used). Prepare its data structure and implementation.
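
Since tip 30 suggests preparing the LRU data structure and implementation, here is a minimal sketch in Python. It assumes a single-threaded caller and leans on the standard library's `OrderedDict` instead of a hand-written hash map plus doubly linked list; the class name `LRUCache` and its `keys` helper are illustrative choices, not part of any particular library.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Insertion order doubles as recency order: oldest first, newest last.
        self._items: OrderedDict = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)  # refresh recency on update
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry

    def keys(self) -> list:
        """Keys ordered from least to most recently used."""
        return list(self._items)
```

Both `get` and `put` run in O(1) on average. In an interview it is worth being ready to explain the equivalent hash map plus doubly linked list design, which is roughly what `OrderedDict` provides here.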
(1) Features
(2) API
(3) Availability
(4) Latency
(5) Scalability
(6) Durability
(7) Class Diagram
(8) Security and Privacy
(9) Cost-effective
(1) Use cases
(2) Scenarios that will not be covered
(3) Who will use
(4) How many will use
(5) Usage patterns
(1) Throughput (QPS for read and write queries)
(2) Latency expected from the system (for read and write queries)
(3) Read/Write ratio
(4) Traffic estimates
- Write (QPS, Volume of data)
- Read (QPS, Volume of data)
(5) Storage estimates
(6) Memory estimates
- If we are using a cache, what is the kind of data we want to store in cache
- How much RAM and how many machines do we need to achieve this?
- Amount of data you want to store in disk/ssd
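
The traffic, storage, and memory estimates above usually come down to a few lines of arithmetic. The sketch below shows one way to do it; every input (daily writes, read/write ratio, object size, retention period) is a hypothetical number chosen for illustration, and sizing the cache at a fixed fraction of daily read volume is a common rule of thumb, not a requirement.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


@dataclass
class Workload:
    daily_writes: int        # new objects written per day (assumed)
    read_write_ratio: int    # reads per write (assumed)
    avg_object_bytes: int    # average size of one object (assumed)
    retention_days: int      # how long data is kept (assumed)
    cache_fraction: float    # share of daily read volume kept in RAM (rule of thumb)


def estimate(w: Workload) -> dict:
    """Back-of-envelope QPS, storage, and cache memory estimates."""
    write_qps = w.daily_writes / SECONDS_PER_DAY
    read_qps = write_qps * w.read_write_ratio
    daily_write_bytes = w.daily_writes * w.avg_object_bytes
    daily_read_bytes = daily_write_bytes * w.read_write_ratio
    return {
        "write_qps": write_qps,
        "read_qps": read_qps,
        "storage_bytes": daily_write_bytes * w.retention_days,
        "cache_bytes": daily_read_bytes * w.cache_fraction,
    }


if __name__ == "__main__":
    result = estimate(Workload(8_640_000, 10, 1_000, 365, 0.2))
    for name, value in result.items():
        print(f"{name}: {value:,.0f}")
```

Dividing `cache_bytes` by the usable RAM per machine gives a first guess at how many cache hosts are needed, before accounting for replication and headroom.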
(1) Latency and Throughput requirements
(2) Consistency vs Availability [Weak/strong/eventual => consistency | Failover/replication => availability]
(1) APIs for Read/Write scenarios for crucial components
(2) Database schema
(3) Basic algorithm
(4) High level design for Read/Write scenario
(1) Scaling the algorithm
(2) Scaling individual components:
-> Availability, Consistency and Scale story for each component
-> Consistency and availability patterns
#### Think about the following components, how they would fit in and how it would help
a) DNS
b) CDN [Push vs Pull]
c) Load Balancers [Active-Passive, Active-Active, Layer 4, Layer 7]
d) Reverse Proxy
e) Application layer scaling [Microservices, Service Discovery]
f) DB [RDBMS, NoSQL]
> RDBMS
>> Master-slave, Master-master, Federation, Sharding, Denormalization, SQL Tuning
> NoSQL
>> Key-Value, Wide-Column, Graph, Document
Fast-lookups:
-------------
>>> RAM [Bounded size] => Redis, Memcached
>>> AP [Unbounded size] => Cassandra, RIAK, Voldemort
>>> CP [Unbounded size] => HBase, MongoDB, Couchbase, DynamoDB
g) Caches
> Client caching, CDN caching, Webserver caching, Database caching, Application caching, Cache @Query level, Cache @Object level
> Update strategies:
>> Cache aside
>> Write through
>> Write behind
>> Refresh ahead
h) Asynchronism
> Message queues
> Task queues
> Back pressure
i) Communication
> TCP
> UDP
> REST
> RPC
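
The cache-aside and write-through entries listed under g) Caches can be sketched as follows. This is a toy illustration only: plain dictionaries stand in for the database and the cache, and the class names and the `db_reads` counter are made up for the example.

```python
class CacheAsideStore:
    """Cache-aside: the application checks the cache first and fills it on a miss."""

    def __init__(self, db: dict) -> None:
        self.db = db          # stand-in for the backing database
        self.cache = {}       # stand-in for Redis, Memcached, etc.
        self.db_reads = 0     # counts how often reads fall through to the database

    def read(self, key):
        if key in self.cache:
            return self.cache[key]      # cache hit
        self.db_reads += 1              # cache miss: go to the database
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value     # populate the cache for later reads
        return value

    def write(self, key, value) -> None:
        self.db[key] = value
        self.cache.pop(key, None)       # invalidate so the next read reloads it


class WriteThroughStore(CacheAsideStore):
    """Write-through: every write updates the cache and the database together."""

    def write(self, key, value) -> None:
        self.db[key] = value
        self.cache[key] = value         # cache stays warm and consistent with the db
```

Cache-aside pays one extra database read after each write, while write-through keeps reads fast at the cost of caching data that may never be read.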
(1) Throughput of each layer
(2) Latency caused between each layer
(3) Overall latency justification
- 25 Interview Questions
- High-Scalability
- Hired In Tech
- workat.tech
- System Design
- SYSTEM DESIGN PREPARATION
- The System Design Primer
- Gaurav Sen Playlist
- Narendra L - Tech Dummies
- low-level-design-primer
- System Design Interesting Reads
- Real Time Analytics on Big Data Architecture
- how-i-finally-got-some-awesome-offers
- Pragmatic Programming Techniques
- Multithreading
- Helpful list of LeetCode Posts on System Design
- [Booking.com interview exp] https://leetcode.com/discuss/interview-experience/1184565/Booking-or-Amsterdam-or-Senior-Java-Developer-or-Apr-2021-Offer