License: Apache License 2.0

paper_reading

Reading notes on classic papers. The articles are also published on Zhihu and on the blog. PRs are welcome. The paper list follows.

1. Distributed Computing

  1. Discretized Streams: Fault-Tolerant Streaming Computation at Scale: notes, original paper, Zhihu
  2. Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark: notes, original paper, Zhihu
  3. The dataflow model: a practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing: notes, original paper, Zhihu
  4. Distributed Snapshots: Determining Global States of a Distributed System: notes, original paper
  5. MapReduce: Simplified Data Processing on Large Clusters
  6. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
  7. S4: Distributed Stream Computing Platform

2. Distributed Coordination

  1. The Chubby lock service for loosely-coupled distributed systems
  2. In Search of an Understandable Consensus Algorithm (Extended Version)

3. Distributed Storage

  1. The Google File System
  2. Bigtable: A Distributed Storage System for Structured Data
  3. Dynamo: Amazon’s Highly Available Key-value Store
  4. Finding a Needle in Haystack: Facebook's Photo Storage
  5. Spanner: Google's Globally-Distributed Database
  6. F1 - The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business

4. Scheduling Systems

  1. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types
  2. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
  3. Omega: flexible, scalable schedulers for large compute clusters
  4. Large-scale cluster management at Google with Borg
  5. Borg: the Next Generation

5. OLAP

  1. F1 Query: Declarative Querying at Scale
  2. Apache Druid

6. AutoScale

  1. Autopilot: workload autoscaling at Google