A list of papers, conferences, books, MOOCs, Q&A, and other resources for distributed systems.
Issues suggesting more materials are welcome.
Theory related to distributed systems, mainly basic concepts, consensus algorithms, formal methods, etc.
-
basic concepts/introductions
- Distributed Systems for Fun and Profit (strongly recommend)
- Notes on distributed systems for young bloods
- A Note on Distributed Systems
- Time, clocks, and the ordering of events in a distributed system
- Fundamentals of distributed computing: A practical tour of vector clock systems
- HLC: Hybrid Logical Clocks
- Virtual Time and Global States of Distributed Systems
- Timestamps in Message-Passing Systems That Preserve the Partial Ordering
- Distributed snapshots: determining global states of distributed systems
- Development of the domain name system
- Rediscovering-Distributed-System
-
consistency/fault-tolerance/replication
-
consistency
- The part-time parliament
- The Byzantine Generals Problem
- Paxos Made Simple
- Viewstamped replication: A new primary copy method to support highly-available distributed systems
- The Chubby lock service for loosely-coupled distributed systems
- Paxos Made Live: An Engineering Perspective
- Paxos for System Builders
- ZooKeeper: Wait-free Coordination for Internet-scale Systems
- Zab: High-performance broadcast for primary-backup systems
- ZooKeeper's atomic broadcast protocol: Theory and practice
- In Search of an Understandable Consensus Algorithm
- PAXOS Made Transparent
- Revisiting the PAXOS algorithm
- The Paxos Family of Consensus Protocols
- Multi-Paxos: An Implementation and Evaluation
- Consensus on transaction commit
- Consistency in Distributed Storage Systems: An Overview of Models, Metrics and Measurement Approaches
- BASE: An ACID Alternative
- Eventually Consistent
- Consensus in the Cloud: Paxos Systems Demystified
-
fault-tolerance/replication
- Impossibility of Distributed Consensus With One Faulty Process
- Implementing fault-tolerant services using the state machine approach: a tutorial
- Remus: High Availability via Asynchronous Virtual Machine Replication
- Perspectives on the CAP Theorem
- Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services
- CAP Twelve Years Later
-
leader election
-
-
formal methods
-
others
-
fs
-
database
-
cluster management
-
computing
- Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks
- MapReduce: Simplified Data Processing on Large Clusters
- Pregel: a system for large-scale graph processing
- Dremel: Interactive Analysis of Web-Scale Datasets
- Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing
- Storm@twitter
- GraphX: Graph Processing in a Distributed Dataflow Framework
- Introducing Apache Giraph for Large Scale Graph Processing
- Large-Scale Distributed Graph Computing Systems: An Experimental Evaluation
- Models for Parallel Computing: Review and Perspectives
- Actors: A Model of Concurrent Computation in Distributed Systems
- Communicating sequential processes
- Parallel Algorithms Lecture Notes
- DTHREADS: Efficient and Deterministic Multithreading
- Kendo: efficient deterministic multithreading in software
- Replication: Theory and Practice
- Distributed Systems: Concepts and Design
- Distributed Systems: Principles and Paradigms
- Distributed Systems: An Algorithmic Approach
- Distributed Algorithms: An Intuitive Approach
- Distributed Computing: Principles, Algorithms, and Systems
- MIT 6.824: Distributed Systems
- CMU 15-440: Distributed Systems Syllabus
- MIT 6.852/18.437 Distributed Algorithms
- MIT 6.S897: Large-Scale Systems
- CS 525 Spring 2015 Advanced Distributed Systems
- CS–745/845: Formal Specification and Verification of Systems
- UNDERSTANDING PAXOS
- The Log: What every software engineer should know about real-time data's unifying abstraction
- Consensus Protocols: Two-Phase Commit
- Consensus Protocols: Three-phase Commit
- Three-Phase Commit Protocol
- Consensus Protocols: A Paxos Implementation
- Consensus Protocols: Paxos
- FLP and CAP are not the same
- Consistency and availability in Amazon's Dynamo
- Distributed systems theory for the distributed systems engineer
- PAXOS/MULTI-PAXOS ALGORITHM
- EVENTUAL CONSISTENCY
- The Essential Leslie Lamport
- The Essential Nancy Lynch
- The Essential Barbara Liskov
- Viewstamped Replication Revisited