/Database-Optimization

:books: A collection of work related to Database Optimization.

Related work on Database Optimization

Inspired by GNNPapers.

Content

Conferences & Workshops

| Abbreviation | Full Name | Two editions back | Previous edition | Latest |
| --- | --- | --- | --- | --- |
| SIGMOD | International Conference on Management of Data | 2019 | 2020 | 2021 |
| VLDB | International Conference on Very Large Data Bases | 2019 | 2020 | 2021 |
| ICDE | International Conference on Data Engineering | 2019 | 2020 | 2021 |
| CIDR | The Conference on Innovative Data Systems Research | 2017 | 2019 | 2020 |
| EDBT/ICDT | International Conference on Extending Database Technology | 2018 | 2019 | 2020 |
| DEEM | Workshop on Data Management for End-To-End Machine Learning | 2018 | 2019 | 2020 |
| aiDM | International Workshop on Exploiting Artificial Intelligence Techniques for Data Management | 2018 | 2019 | 2020 |

Note: After opening a conference's resource page, search for a keyword that matches the category you want (such as "optimization") to see the accepted papers in that research category.

Courses

Advanced Database Systems, CMU 15-721, Spring 2020

Datasets

TPC: The TPC Benchmark™ H (TPC-H) is a decision support benchmark.

IMDB: Subsets of IMDb data are available to customers for personal and non-commercial use.

JOB: Join Order Benchmark (JOB).

Tools

TPC-H Query Plan Visualization

SQL AST Explorer

Papers

Survey papers

  1. Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches. Foundations and Trends in Databases 2012. book

    Graham Cormode, Minos Garofalakis, Peter J. Haas and Chris Jermaine.

  2. How good are query optimizers, really?. VLDB 2015. [paper,github]

    Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter Boncz, Alfons Kemper, and Thomas Neumann.

  3. Database Meets Deep Learning: Challenges and Opportunities. SIGMOD 2016. paper

    Wei Wang, Meihui Zhang, Gang Chen, H. V. Jagadish, Beng Chin Ooi, and Kian-Lee Tan.

  4. Query optimization through the looking glass, and what we found running the Join Order Benchmark. The VLDB Journal — The International Journal on Very Large Data Bases 2018. paper

    Viktor Leis, Bernhard Radke, Andrey Gubichev, Atanas Mirchev, Peter Boncz, Alfons Kemper, and Thomas Neumann.

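  5. A Survey of Machine Learning-Based Database Techniques (基于机器学习的数据库技术综述). Chinese Journal of Computers (计算机学报), 2020. paper

    Guoliang Li, Xuanhe Zhou, Ji Sun, Xiang Yu, Haitao Yuan, Jiabin Liu, and Yue Han.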

Cardinality Estimation

histograms

  1. Equi-depth multidimensional histograms. SIGMOD 1988. paper

    M. Muralikrishna and David J. DeWitt.

  2. Selectivity Estimation Without the Attribute Value Independence Assumption. VLDB 1997. paper

    Viswanath Poosala and Yannis E. Ioannidis.

  3. The history of histograms (abridged). VLDB 2003. paper

    Yannis Ioannidis.

sketch

  1. A linear-time probabilistic counting algorithm for database applications. ACM Transactions on Database Systems, June 1990. paper

    Kyu-Young Whang, Brad T. Vander-Zanden, and Howard M. Taylor.

  2. Loglog Counting of Large Cardinalities. ESA 2003. paper

    Marianne Durand and Philippe Flajolet.

  3. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, Volume 55, Issue 1, 2005. paper

    Graham Cormode and S. Muthukrishnan.

  4. HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm. Analysis of Algorithms 2007. paper

    Philippe Flajolet, Éric Fusy, Olivier Gandouet, and Frédéric Meunier.

wavelet

  1. Approximate computation of multidimensional aggregates of sparse data using wavelets. SIGMOD 1999. paper

    Jeffrey Scott Vitter and Min Wang.

  2. Approximate query processing using wavelets. The VLDB Journal: The International Journal on Very Large Data Bases, September 2001. paper

    Kaushik Chakrabarti, Minos Garofalakis, Rajeev Rastogi, and Kyuseok Shim.

sampling

  1. Cardinality Estimation Done Right: Index-Based Join Sampling. CIDR 2017. paper

    Viktor Leis, Bernhard Radke, Andrey Gubichev, Alfons Kemper, and Thomas Neumann.

deep learning

  1. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. CIDR 2019. [paper,github]

    Andreas Kipf, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter Boncz, and Alfons Kemper.

  2. Estimating Cardinalities with Deep Sketches. SIGMOD 2019. paper

    Andreas Kipf, Dimitri Vorona, Jonas Müller, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter Boncz, Thomas Neumann, and Alfons Kemper.

  3. An end-to-end learning-based cost estimator. VLDB 2019. paper

    Ji Sun and Guoliang Li.

  4. Monotonic Cardinality Estimation of Similarity Selection: A Deep Learning Approach. SIGMOD 2020. paper

    Yaoshu Wang, Chuan Xiao, Jianbin Qin, Xin Cao, Yifang Sun, Wei Wang, and Makoto Onizuka.

  5. Fauce: Fast and Accurate Deep Ensembles with Uncertainty for Cardinality Estimation. VLDB 2021. paper

    Jie Liu, Wenqian Dong, Dong Li, Qingqing Zhou.

  6. FLAT: Fast, Lightweight and Accurate Method for Cardinality Estimation. VLDB 2021. paper

    Rong Zhu, Ziniu Wu, Yuxing Han, Kai Zeng (Alibaba Group), Andreas Pfadler, Zhengping Qian, Jingren Zhou, and Bin Cui.

Selectivity Estimation

  1. Selectivity Estimation Without the Attribute Value Independence Assumption. VLDB 1997. pdf

    Viswanath Poosala and Yannis E. Ioannidis.

  2. Selectivity estimation using probabilistic models. VLDB 2001. paper

    Lise Getoor, Benjamin Taskar, and Daphne Koller.

  3. Learning State Representations for Query Optimization with Deep Reinforcement Learning. DEEM 2018. paper

    Jennifer Ortiz, Magdalena Balazinska, Johannes Gehrke, and S. Sathiya Keerthi.

  4. Deep unsupervised cardinality estimation. VLDB 2019. [paper,github]

    Zongheng Yang, Eric Liang, Amog Kamsetty, Chenggang Wu, Yan Duan, Xi Chen, Pieter Abbeel, Joseph M. Hellerstein, Sanjay Krishnan, and Ion Stoica.

  5. NeuroCard: one cardinality estimator for all tables. VLDB 2020. [paper,github]

    Zongheng Yang, Amog Kamsetty, Sifei Luan, Eric Liang, Yan Duan, Xi Chen, and Ion Stoica.

  6. Deep Learning Models for Selectivity Estimation of Multi-Attribute Queries. SIGMOD 2020. paper

    Shohedul Hasan, Saravanan Thirumuruganathan, Jees Augustine, Nick Koudas, and Gautam Das.

Cost Estimation

  1. An end-to-end learning-based cost estimator. VLDB 2019. [paper,code]

    Ji Sun and Guoliang Li.

Join Order Selection

  1. Learning to Optimize Join Queries With Deep Reinforcement Learning. arxiv 2018. paper

    Sanjay Krishnan, Zongheng Yang, Ken Goldberg, Joseph Hellerstein, Ion Stoica.

  2. Deep Reinforcement Learning for Join Order Enumeration. aiDM 2018. paper

    Ryan Marcus and Olga Papaemmanouil.

  3. Neo: A Learned Query Optimizer. VLDB 2019. paper

    Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, and Nesime Tatbul.

  4. Plan-structured deep neural network models for query performance prediction. VLDB 2019. paper

    Ryan Marcus and Olga Papaemmanouil.

  5. Research challenges in deep reinforcement learning-based join query optimization. aiDM 2020. paper

    Runsheng Benson Guo and Khuzaima Daudjee.

  6. Reinforcement Learning with Tree-LSTM for Join Order Selection. ICDE 2020. [paper,code]

    Xiang Yu, Guoliang Li, Chengliang Chai, and Nan Tang.

Query Performance Prediction

  1. Query Performance Prediction for Concurrent Queries using Graph Embedding. VLDB 2020. paper

    Xuanhe Zhou, Ji Sun, Guoliang Li, Jianhua Feng.

Automatic Configuration Tuning

statistical approach

  1. Self-tuning performance of database systems based on fuzzy rules. FSKD'14: 11th International Conference on Fuzzy Systems and Knowledge Discovery, 2014. paper

    Zhijie Wei, Zuohua Ding, and Jueliang Hu.

heuristic search

  1. BestConfig: tapping the performance potential of systems via automatic configuration tuning. SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing, 2017. paper

    Yuqing Zhu, Jianxun Liu, Mengying Guo, Yungang Bao, Wenlong Ma, Zhuoyue Liu, Kunpeng Song, and Yingchun Yang.

machine learning

  1. Automatic Database Management System Tuning Through Large-scale Machine Learning. SIGMOD 2017. paper

    Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, and Bohan Zhang.

deep learning

  1. An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning. SIGMOD 2019. paper

    Ji Zhang, Yu Liu, Ke Zhou, Guoliang Li, Zhili Xiao, Bin Cheng, Jiashu Xing, Yangtao Wang, Tianheng Cheng, Li Liu, Minwei Ran, and Zekang Li.

Index tuning

  1. An Adaptive Approach for Index Tuning with Learning Classifier Systems on Hybrid Storage Environments. HAIS 2018: Hybrid Artificial Intelligent Systems, 2018. paper

    Júlio Cesar Nievola, Deborah Carvalho Ribeiro.

End-To-End

  1. SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning. SIGMOD 2019. paper

    Immanuel Trummer, Junxiong Wang, Deepak Maram, Samuel Moseley, Saehan Jo, and Joseph Antonakakis.

  2. Towards a Hands-Free Query Optimizer through Deep Learning. CIDR 2019. paper

    Ryan Marcus and Olga Papaemmanouil.

Application

  1. Bao: Making Learned Query Optimization Practical. SIGMOD 2021. paper

    Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, and Tim Kraska.

  2. Are We Ready For Learned Cardinality Estimation?. VLDB 2021. paper

    Xiaoying Wang, Changbo Qu, Weiyuan Wu, Jiannan Wang, Qingqing Zhou.

  3. Make Your Database System Dream of Electric Sheep: Towards Self-Driving Operation. VLDB 2021. [paper,code]

    Andrew Pavlo, Matthew Butrovich, Lin Ma, Prashanth Menon, Wan Shen Lim, Dana Van Aken, and William Zhang.