Pinned Repositories
ARES
ASAP
ASAP: Prioritizing Attention via Time Series Smoothing
ColBERT
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
dawn-bench-entries
DAWNBench: An End-to-End Deep Learning Benchmark and Competition
FAST
End-to-end earthquake detection pipeline via efficient time series similarity search
FrugalGPT
FrugalGPT: better quality and lower cost for LLM applications
index-baselines
Simple baselines for "Learned Indexes"
macrobase
MacroBase: A Search Engine for Fast Data
noscope
Accelerating network inference over video
sparser
Sparser: Raw Filtering for Faster Analytics over Raw Data
Future Data Systems' Repositories
stanford-futuredata/ColBERT
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
stanford-futuredata/macrobase
MacroBase: A Search Engine for Fast Data
stanford-futuredata/ARES
stanford-futuredata/FAST
End-to-end earthquake detection pipeline via efficient time series similarity search
stanford-futuredata/FrugalGPT
FrugalGPT: better quality and lower cost for LLM applications
stanford-futuredata/gavel
Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020
stanford-futuredata/stk
stanford-futuredata/sinkhorn-label-allocation
Sinkhorn Label Allocation is a label assignment method for semi-supervised self-training algorithms. The SLA algorithm is described in full in this ICML 2021 paper: https://arxiv.org/abs/2102.08622.
stanford-futuredata/Willump
Willump Is a Low-Latency Useful Machine learning Platform.
stanford-futuredata/Baleen
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)
stanford-futuredata/Uniserve
A runtime implementation of data-parallel actors.
stanford-futuredata/Megatron-LM
Ongoing research training transformer models at scale
stanford-futuredata/blazeit
It's BlazeIt because it's blazing fast
stanford-futuredata/POP
Code for "Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP", which appeared at SOSP 2021
stanford-futuredata/omg
stanford-futuredata/loa
Public code for LOA
stanford-futuredata/tasti
Semantic Indexes for Machine Learning-based Queries over Unstructured Data (SIGMOD 2022)
stanford-futuredata/cs245-as1
Student files for CS245 Programming Assignment 1: In-memory data layout
stanford-futuredata/cs245-as2-public
stanford-futuredata/InQuest
Accelerating Aggregation Queries on Unstructured Streams of Data
stanford-futuredata/SparseJointShift
Model Performance Estimation and Explanation When Labels and a Few Features Shift
stanford-futuredata/sketchstore
Algorithms for compressing and merging large collections of sketches
stanford-futuredata/smol
stanford-futuredata/supg
stanford-futuredata/parallel-lb-simulator
stanford-futuredata/abae
Accelerating Approximate Aggregation Queries with Expensive Predicates (VLDB'21)
stanford-futuredata/ezmode
An iterative algorithm for selecting rare events in large, unlabeled datasets
stanford-futuredata/pop-ncflow
Code for POP (SOSP 2021) and NCFlow (NSDI 2021)
stanford-futuredata/redisgeo-bench
Simple benchmark for Redis geosets for top-k queries.
stanford-futuredata/teavar