A collection of useful architecture stories showing how organizations have implemented various technology stacks.
- History
- Miscellaneous
- Companies
- AIRBNB
- AMAZON, AWS
- BBC
- CHIME
- CLOUDFLARE
- COINBASE
- DISCORD
- DISNEY
- EXPEDIA
- FIGMA
- FLIPKART
- GITHUB
- GITLAB
- IMDB
- KHAN ACADEMY
- LIBERTY MUTUAL
- LYFT
- MICROSOFT
- NETFLIX
- PALANTIR
- PAYPAL
- SLACK
- SALESFORCE
- SENDOSO
- SEQUOIA CAPITAL
- SHOPIFY
- SIEMENS
- SPOTIFY
- SNAP
- SOUNDCLOUD
- TINDER
- TWILIO
- UBER
- ZILLOW
- The History of Git: The Road to Domination in Software Version Control
- Windows vs MacOS vs Linux – Operating System Handbook
- Airbnb at Scale
- Dynamic Kubernetes Cluster Scaling at Airbnb
- Airbnb’s Microservices Architecture Journey To Quality Engineering
- Rebuilding Payment Orchestration at Airbnb
- Airbnb Optimizes Usage and Costs by Using Savings Plans and Actionable Cost Data on AWS
- Intelligent Automation Platform: Empowering Conversational AI and Beyond at Airbnb
- Airbnb Streamlines the Development Process with a Unified Architecture for Collaborative Hosting
- How Airbnb Supports Co-Hosting
- Measuring Web Performance at Airbnb
- Building Large-Scale iOS Apps at Airbnb
- Creating Airbnb’s Page Performance Score
- The Human Side of Airbnb’s Microservice Architecture
- How Airbnb Democratizes Data Science With Data University
- How Airbnb is Boosting Data Literacy With ‘Data U Intensive’ Training
- Airbnb Meet Ottr: A Serverless Public Key Infrastructure Framework
- Airbnb Migrating Kafka transparently between Zookeeper clusters
- How Airbnb Tech Fosters a Culture of Learning
- How Airbnb Enables Consistent Data Consumption at Scale
- Hotel Booking System | Airbnb System Design | Most frequently asked question in technical interviews
- What Technology Stack Does Airbnb Use?
- ZippyDB: the Architecture of Facebook’s Strongly Consistent Key-Value Store
- How we built a general purpose key value store for Facebook with ZippyDB
- How Facebook Accelerates SQL at Extreme Scale
- Facebook, Building a more accurate time service at Facebook scale
- Rebuilding our tech stack for the new facebook.com
- How GitHub Uses Machine Learning to Extend Vulnerability Code Scanning
- How GitHub Does DevOps for its iOS and Android Apps
- How GitHub Partitioned Its Relational Database to Improve Reliability at Scale
- GitHub’s Journey from Monolith to Microservices
- Design Docs at Google
- Google Details Its Zero-Trust Architecture. Can Enterprises Use It?
- Dependency Inversion Principle: How Google Developers write code
- Google SRE: Site Reliability Engineering at a Global Scale
- Google The Standard of Code Review
- Google Engineering Practices Documentation Ref1
- Google Engineering Practices Documentation Ref2
- Google Software Principles
- Google Software Engineering Culture
- Software Engineering at Google: Practices, Tools, Values, and Culture
- Software Engineering at Google 31 Jan 2017 Fergus Henderson PDF
- Software Engineering At Google 100% Complete Book Notes
- Service Overload Detection and Remediation at LinkedIn
- LinkedIn, Uber, PayPal, eBay, Groupon, Walmart, Node.JS
- Akhilesh Gupta on the Architecture of LinkedIn’s Real-Time Messaging Platform
- Trino: Open Source Infrastructure Upgrading at Lyft
- Building Lyft’s In-App Messaging Platform
- How LyftLearn Democratizes Distributed Compute through Kubernetes Spark and Fugue
- Building Lyft’s Incentive Platform
- Parameter Exploration at Lyft
- Scaling productivity on microservices at Lyft (Part 1)
- Transforming modern engineering at Microsoft
- What are the Microsoft SDL practices?
- DevOps Lessons Learned at Microsoft Engineering
- TOOLS
- Netflix Drive: Building a Cloud Native Filesystem for Media Assets
- Netflix Edge Authentication and Token-Agnostic Identity Propagation Zuul
- Netflix Embraces GraphQL Microservices for Rapid Application Development
- Netflix Studio Engineering Overview
- A Design Analysis of Cloud-based Microservices Architecture at Netflix
- Netflix Studio Search: Using Elasticsearch and Apache Flink to Index Federated GraphQL Data
- A Survey of Causal Inference Applications at Netflix
- Byte Down: Making Netflix’s Data Infrastructure Cost-Effective
- The Four Innovation Phases of Netflix’s Trillions Scale Real-time Data Infrastructure
- Rapid Event Notification System at Netflix
- Modernizing the Netflix TV UI Deployment Process
- Netflix: A Culture of Learning
- Fixing Performance Regressions Before they Happen
- Scaling Video Quality Measurements at Netflix with Cosmos
- Netflix: Lessons in Experimentation
- Auto-Diagnosis and Remediation in Netflix Data Platform
- Netflix Snaring the Bad Folks
- Netflix System Architecture
- Netflix Video Quality at Scale with Cosmos Microservices
- Netflix Builds a Reliable, Scalable Platform with Event Sourcing, MQTT and Alpakka-Kafka
- Practical API Design Using gRPC at Netflix
- Netflix Data pipeline asset management with Dataflow
- The Show Must Go On: Securing Netflix Studios At Scale
- Netflix Building confidence in a decision
- Netflix Cloud Packaging in the Terabyte Era
- Practical API Design at Netflix, Part 1: Using Protobuf FieldMask
- Netflix How We Build Micro Frontends With Lattice
- Practical API Design at Netflix, Part 2: Protobuf FieldMask for Mutation Operations
- Decision Making at Netflix
- A/B Testing and Beyond: Improving the Netflix Streaming Experience with Experimentation and Data Science
- It’s All A/Bout Testing: The Netflix Experimentation Platform
- Caching for a Global Netflix
- STORY A Sticky Situation: How Netflix Gains Confidence in Changes
- How Netflix is able to enrich VPC Flow Logs at Hyper Scale to provide Network Insight
- Product Design at Palantir: Q&A with the team
- How Palantir Foundry Powers bp’s Digital Transformation in Reliability
- From Human-Defined to Software-Defined Data Integration — Part III, Data Pipelines
- Product Reliability at Palantir: Life on the PRX Team
- How Palantir Foundry Helps Customers Build and Deploy AI-Powered Decision-Making Applications
- Scaling Kubernetes to Over 4k Nodes and 200k Pods
- PayPal Adopts GraphQL: Gains Increased Developer Productivity
- GraphQL at PayPal: An Adoption Story
- Estimating Potential Audience Size of an Ad at Pinterest
- How Pinterest Tuned Memcached for Big Performance Gains
- Improving Distributed Caching Performance and Efficiency at Pinterest
- How Pinterest Supercharged its Growth Team With Experiment Idea Review
- Pinterest Using Kafka to Throttle QPS on MySQL Shards in Bulk Write APIs
- Optimizing Pinterest’s Data Ingestion Stack: Findings and Learnings
- How Pinterest built its Trust & Safety team
- Introducing PinFlex: Pinterest’s model for the Future of Work
- Large Scale Hadoop Upgrade At Pinterest
- 99% to 99.9% SLO: High Performance Kubernetes Control Plane at Pinterest
- Spinner: The Mass Migration to Pinterest’s New Workflow Platform
- Spinner: Pinterest’s Workflow Platform
- 3 Innovations While Unifying Pinterest’s Key-Value Storage
- Pre-Submit UI Tests at Pinterest
- Pinterest Druid Holiday Load Testing
- How Pinterest powers a healthy comment ecosystem with machine learning
- Campaign Budgets at Pinterest
- MemQ: An efficient, scalable cloud native PubSub system
- SearchSage: Learning Search Query Representations at Pinterest
- How Pinterest Scaled up Its Ad-Serving Architecture
- Efficient Resource Management at Pinterest’s Batch Processing Platform
- Pinterest Ensuring High Availability of Ads Realtime Streaming Services
- Pinterest Home Feed Unified Lightweight Scoring: A Two-tower Approach
- Pinterest’s Analytics as a Platform on Druid Part 1 of 3
- Remote Development at Slack
- Deploys at Slack
- A Terrible, Horrible, No-Good, Very Bad Day at Slack HAPROXY
- Integrating Continuous Load Testing into Slack Pipeline
- Email Classification at Slack: Designing an Eventually Consistent Custom Classifier
- API Design Principles and Process at Slack
- Building an SLO-Driven Culture at Salesforce
- The Unified Infrastructure Platform Behind Salesforce Hyperforce
- 10 Principles for Architecture at Salesforce
- Shopify Invests in Research for Ruby at Scale
- Under Deconstruction: The State of Shopify’s Monolith
- Upgrading MySQL at Shopify
- Improving Speed and Stability of Software Delivery Simultaneously at Siemens Healthineers
- Visual Analytics at Spotify
- Spotify Agile Team Organisation: Squads, Chapters, Tribes and Guilds
- Spotify System Architecture
- Spotify Leveraging Mobile Infrastructure with Data-Driven Decisions
- Spotify’s Event Delivery – Life in the Cloud
- The End of the Public API Strangler
- Microservices Architecture used by SoundCloud
- Domain-Driven Design with Value-Added Services and Domain Gateways at SoundCloud
- The “Backends for Frontends” Pattern at SoundCloud
- Building Resiliency into the Twitter Ad Pacing Service
- Twitter’s Tough Architectural Decision
- An Overview of Twitter's Security Key Implementation
- Twitter System Architecture
- Design Twitter — Microservices Architecture of Twitter Service
- One Stone, Three Birds: Finer-Grained Encryption @ Apache Parquet
- How We Saved 70K Cores Across 30 Mission-Critical Services (Large-Scale, Semi-Automated Go GC Tuning @Uber)
- Real-Time Exactly-Once Event Processing at Uber with Apache Flink, Kafka, and Pinot
- Uber System Architecture
- How Uber handles millions of ride/food requests efficiently part 2
- Data Collection, Standardization and Usage at Scale in the Uber Rider App
- How Uber handles millions of ride/food requests efficiently part 1
- Cost-Efficient Open Source Big Data Platform at Uber
- Uber’s Journey Toward Better Data Culture From First Principles
- Uber Re-Architected Its Foundational Fulfilment Service