Computer-Science-Papers

Storage systems

Analytics

Cluster manager and scheduling

Stream processing

Pub/sub

Graph processing in a distributed setting.

Consensus and replicated state machines.

Peer-to-peer systems and information dissemination.

Additional articles (may be repeated; to be categorized later).

Short Name Title Link Extra links
1 Apache Kafka Kafka: A Distributed Messaging System for Log Processing (https://notes.stephenholiday.com/Kafka.pdf)
2 Apache Cassandra Cassandra - A Decentralized Structured Storage System (https://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf)
3 Apache Flink Apache Flink: Stream and Batch Processing in a Single Engine (https://asterios.katsifodimos.com/assets/publications/flink-deb.pdf)
4 Apache Spark Spark: Cluster Computing with Working Sets (https://www.usenix.org/legacy/event/hotcloud10/tech/full_papers/Zaharia.pdf)
5 Apache Zookeeper ZooKeeper: Wait-free coordination for Internet-scale systems (https://www.usenix.org/legacy/event/atc10/tech/full_papers/Hunt.pdf)
6 BigTable Bigtable: A Distributed Storage System for Structured Data (https://research.google.com/archive/bigtable-osdi06.pdf)
8 Apache Impala Apache Impala: A Modern, Open-Source SQL Engine for Hadoop (https://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
9 Apache Druid Druid: A Real-time Analytical Data Store (http://static.druid.io/docs/druid.pdf)
10 Timer Wheel Hashed and Hierarchical Timing Wheels (http://www.cs.columbia.edu/~nahum/w6998/papers/sosp87-timing-wheels.pdf)
11 MillWheel MillWheel: Fault-Tolerant Stream Processing at Internet Scale (https://research.google.com/pubs/archive/41378.pdf)
12 Dynamo Dynamo: Amazon’s Highly Available Key-value Store (https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf)
13 Google File System The Google File System (https://research.google.com/archive/gfs-sosp2003.pdf)
14 MapReduce MapReduce: Simplified Data Processing on Large Clusters (https://research.google.com/archive/mapreduce-osdi04.pdf)
15 Spanner Spanner: Google’s Globally-Distributed Database (https://research.google.com/archive/spanner-osdi2012.pdf)
16 Zab Zab: High-performance broadcast for primary-backup systems (http://www.cs.cornell.edu/courses/cs6452/2012sp/papers/zab-ieee.pdf)
17 Paxos Paxos Made Simple (https://lamport.azurewebsites.net/pubs/paxos-simple.pdf)
18 Chubby The Chubby lock service for loosely-coupled distributed systems (https://research.google.com/archive/chubby-osdi06.pdf)
19 Dremel Dremel: Interactive Analysis of Web-Scale Datasets (https://research.google/pubs/pub36632/)
20 Megastore Megastore: Providing Scalable, Highly Available Storage for Interactive Services (https://research.google/pubs/pub36971.pdf)
21 Raft In Search of an Understandable Consensus Algorithm (Extended Version) (https://raft.github.io/raft.pdf)
22 Flexible Paxos Flexible Paxos: Quorum Intersection Revisited (https://arxiv.org/abs/1608.06696)
23 Thrift Thrift: Scalable Cross-Language Services Implementation (https://thrift.apache.org/static/files/thrift-20070401.pdf)
24 Maglev Maglev: A Fast and Reliable Software Network Load Balancer (https://research.google.com/pubs/archive/44824.pdf)
25 LSM The Log-Structured Merge-Tree (LSM-Tree) (https://www.cs.umb.edu/~poneil/lsmtree.pdf)
26 Chord Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (https://pdos.csail.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf)
27 Kademlia Kademlia: A Peer-to-peer Information System Based on the XOR Metric (https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf)
28 Mesa Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing (https://research.google/pubs/pub42851/)
29 SCRIBE SCRIBE: A large-scale and decentralized application-level multicast infrastructure https://rowstron.azurewebsites.net/PAST/jsac.pdf
30 PAST Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility https://people.mpi-sws.org/~druschel/publications/PAST-hotos.pdf
31 Pastry Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems https://www.cs.cornell.edu/people/egs/615/pastry.pdf
32 Linearizability Linearizability: A Correctness Condition for Concurrent Objects http://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf
33 Time and Clocks Time, Clocks, and the Ordering of Events in a Distributed System http://lamport.azurewebsites.net/pubs/time-clocks.pdf
34 CRDTs CRDTs: Consistency without concurrency control http://hal.archives-ouvertes.fr/docs/00/39/79/81/PDF/RR-6956.pdf
35 Photon Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams https://research.google/pubs/pub41318/
36 TAO TAO: Facebook’s Distributed Data Store for the Social Graph https://www.usenix.org/system/files/conference/atc13/atc13-bronson.pdf
37 Pregel Pregel: A System for Large-Scale Graph Processing https://15799.courses.cs.cmu.edu/fall2013/static/papers/p135-malewicz.pdf
38 Dapper Dapper, a Large-Scale Distributed Systems Tracing Infrastructure https://research.google/pubs/pub36356.pdf
39 Raft Refloated Raft Refloated: Do We Have Consensus? https://www.cl.cam.ac.uk/~ms705/pub/papers/2015-osr-raft.pdf
40 Percolator Large-scale Incremental Processing Using Distributed Transactions and Notifications https://research.google/pubs/pub36726.pdf
41 Monarch Monarch: Google’s Planet-Scale In-Memory Time Series Database https://research.google/pubs/pub50652/
42 Borg Large-scale cluster management at Google with Borg https://research.google/pubs/pub43438.pdf
43 Borg - Next Borg: the Next Generation https://research.google/pubs/pub49065.pdf
44 Amazon Aurora Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases https://web.stanford.edu/class/cs245/readings/aurora.pdf
45 Gorilla Gorilla: A Fast, Scalable, In-Memory Time Series Database http://www.vldb.org/pvldb/vol8/p1816-teller.pdf
46 HDFS The Hadoop Distributed File System https://storageconference.us/2010/Papers/MSST/Shvachko.pdf
47 Autopilot Autopilot: workload autoscaling at Google https://dl.acm.org/doi/10.1145/3342195.3387524
48 Consistent hashing Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web https://dl.acm.org/doi/pdf/10.1145/258533.258660
49 SEDA SEDA: An Architecture for Well-Conditioned, Scalable Internet Services http://www.sosp.org/2001/papers/welsh.pdf
50 Bitcask Bitcask: A Log-Structured Hash Table for Fast Key/Value Data https://riak.com/assets/bitcask-intro.pdf
51 DynamoDB Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service https://www.usenix.org/system/files/atc22-elhemali.pdf
52 Isolation levels A critique of ANSI SQL isolation levels https://dl.acm.org/doi/pdf/10.1145/223784.223785
54 Deletable Bloom Filter The deletable bloom filter https://arxiv.org/pdf/1005.0352
55 Hash Coding Space/Time Trade-offs in Hash Coding with Allowable Errors https://dl.acm.org/doi/pdf/10.1145/362686.362692
56 Expedite Byzantine Shifting Gears: Changing Algorithms on the Fly to Expedite Byzantine Agreement https://www.sciencedirect.com/science/article/pii/089054019290035E
57 Scalability cost Scalability! But at what COST? https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-mcsherry.pdf
58 Foundation DB FoundationDB: A Distributed Unbundled Transactional Key Value Store https://www.foundationdb.org/files/fdb-paper.pdf
59 Monolith Monolith: Real Time Recommendation System With Collisionless Embedding Table https://arxiv.org/pdf/2209.07663
60 Memcache at Facebook Scaling Memcache at Facebook https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf
61 MilliSampler A microscopic view of bursts, buffer contention, and loss in data centers https://dl.acm.org/doi/pdf/10.1145/3517745.3561430 https://engineering.fb.com/2023/04/17/networking-traffic/millisampler-network-traffic-analysis/
62 FlexiRaft FlexiRaft: Flexible Quorums with Raft https://www.cidrdb.org/cidr2023/papers/p83-yadav.pdf
63 Minesweeper Scalable Statistical Root Cause Analysis on App Telemetry https://arxiv.org/abs/2010.09974
64 Shard Manager Shard Manager: A Generic Shard Management Framework for Geo-distributed Applications
65 FlumeJava FlumeJava: Easy, Efficient Data-Parallel Pipelines https://research.google/pubs/pub35650.pdf
66 Heron Twitter Heron: Stream Processing at Scale https://dl.acm.org/doi/pdf/10.1145/2723372.2742788
67 Dataflow The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing https://research.google/pubs/pub43864.pdf
68 Flink State Management in Apache Flink http://www.vldb.org/pvldb/vol10/p1718-carbone.pdf
69 Dgraph Dgraph: Synchronously Replicated, Transactional and Distributed Graph Database