A personal ML research paper library

ML Papers, Blogs, and Videos

To-Do

Papers / Blogs

  • Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding [Paper]
  • Rethinking Tabular Data Understanding with Large Language Models [Paper]
  • Corrective Retrieval Augmented Generation [Paper]
  • Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs [Paper]
  • Self-Rewarding Language Models [Paper]
  • War and Peace (WarAgent): LLM-based Multi-Agent Simulation of World Wars [Paper]
  • Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [Paper]
  • Mathematical Discoveries from Program Search with LLMs [Paper]
  • Evaluating Large Language Models: A Comprehensive Survey [Paper]
  • Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks [Paper]
  • HyperFast: Instant Classification for Tabular Data [Paper]
  • ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent [Paper]
  • Exploiting Novel GPT-4 APIs [Paper]
  • Do Androids Know They're Only Dreaming of Electric Sheep? [Paper]
  • Retrieval-Augmented Generation for Large Language Models: A Survey [Paper]
  • Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? [Paper]
  • Direct Preference Optimization: Your Language Model is Secretly a Reward Model [Paper]
  • Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM [Paper]
  • Large Language Models on Graphs: A Comprehensive Survey [Paper]
  • Relational Deep Learning: Graph Representation Learning on Relational Databases [Paper]
  • PDFTriage: Question Answering over Long, Structured Documents [Paper]
  • From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting [Paper]
  • LLMs Are Zero-Shot Time Series Forecasters [Paper]
  • Table-GPT: Table-tuned GPT for Diverse Table Tasks [Paper]
  • MRKL Systems: Modular Reasoning, Knowledge and Language [Paper]

Done

Large Language Models

  • Mixtral of Experts [Paper]
  • Can Generalist Foundation Models Outcompete Special-Purpose Tuning? [Paper]
  • Let's Verify Step By Step [Paper]
  • Navigating the Jagged Technological Frontier [Paper]
  • AI Canon [Blogpost]
  • Mixture of Experts Explained [Blogpost]
  • Challenges and Applications of Large Language Models [Paper]
  • Open Problems and Fundamental Limitations of RLHF [Paper]
  • Why AI Will Save The World [Blogpost]
  • Google "We Have No Moat, And Neither Does OpenAI" [Blogpost]
  • Attention Is All You Need [Paper]
  • Sparks of Artificial General Intelligence: Early Experiments with GPT-4 [Paper]
  • The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) [Paper]
  • The Annotated Transformer [Blogpost]
  • The Illustrated Transformer [Blogpost]
  • The Illustrated GPT-2: Visualize Transformer Language Models [Blogpost]
  • How GPT3 Works: Visualizations and Animations [Blogpost]
  • Five Years of GPT Progress [Blogpost]
  • Understanding Large Language Models [Blogpost]
  • RLHF: Reinforcement Learning from Human Feedback [Blogpost]

LLM Applications

  • DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines [Paper]
  • RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval [Paper]
  • Building LLM Applications for Production [Blogpost]
  • Best Practices for LLM Evaluation of RAG Applications [Blogpost]
  • Emerging Architectures for LLM Applications [Blogpost]
  • Patterns for Building LLM-based Systems & Products [Blogpost]
  • RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture [Paper]

LLM Prompting

  • OpenAI Prompt Engineering Guide [Blogpost]
  • Graph of Thoughts: Solving Elaborate Problems with LLMs [Paper]
  • Chain-of-Verification Reduces Hallucination in LLMs [Paper]
  • ReAct: Synergizing Reasoning and Acting in Language Models [Paper]
  • Chain-of-Thought Prompting Elicits Reasoning in LLMs [Paper]
  • Tree of Thoughts: Deliberate Problem Solving with LLMs [Paper]
  • LLM-Rec: Personalized Recommendation via Prompting LLMs [Paper]

LLM / ML Courses

  • ML Engineering for Production Specialization [Course]
  • DeepLearning.AI LLM Short Courses [Course]
  • LLMs: Application Through Production [Course]
  • LLMs: Foundation Models from the Ground Up [Course]
  • Generative AI with Large Language Models [Course]
  • Building LLM-Powered Apps [Course]
  • The Full Stack LLM Bootcamp [Course]
  • Neural Networks From Zero to Hero [Course]

Machine Learning General

  • Lakehouse: A New Generation of Open Platforms [Paper]
  • Machine Learning and Causality: The Impact of Financial Crises on Growth [Paper]
  • A Comparative Study of Hyper-Parameter Optimization Tools [Paper]
  • WeightedSHAP: analyzing and improving Shapley based feature attributions [Paper]
  • Machine Learning Operations (MLOps): Overview, Definition, and Architecture [Paper]
  • Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift [Paper]
  • Deep Residual Learning for Image Recognition [Paper]
  • A Unified Approach to Interpreting Model Predictions [Paper]
  • Focal Loss for Dense Object Detection [Paper]
  • Cyclical Learning Rates for Training Neural Networks [Paper]
  • Entity Embeddings of Categorical Variables [Paper]
  • Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in ML [Paper]
  • An Overview of Gradient Descent Optimization Algorithms [Blogpost]
  • An Updated Overview of Gradient Descent Optimization Algorithms [Blogpost]
  • A Recipe for Training Neural Networks [Blogpost]
  • Deep Neural Nets: 33 Years Ago and 33 Years From Now [Blogpost]
  • DAC: Deep Autoencoder-Based Clustering [Paper]

Business Problem Domains

  • XGBSE: Improving XGBoost for Survival Analysis [Blogpost]
  • Survival Regression with Accelerated Failure Time Model in XGBoost [Paper]
  • Random Survival Forests [Paper]
  • Understanding Survival Analysis: Kaplan-Meier Estimate [Paper]
  • Predicting Customer Lifetime Values: E-Commerce Use Case [Paper]
  • A Deep Probabilistic Model for Customer Lifetime Value Prediction [Paper]
  • Churn Prediction with Sequential Data and Deep Neural Networks [Paper]
  • Predicting Customer Churn: Extreme Gradient Boosting with Temporal Data [Paper]
  • Behavioral Modeling for Churn Prediction [Paper]
  • Deep & Cross Network for Ad Click Predictions [Paper]
  • Abuse and Fraud Detection in Streaming Services Using Heuristic-Aware Machine Learning [Paper]

Graph Neural Networks

  • Temporal Graph Networks for Deep Learning on Dynamic Graphs [Paper]
  • A Review on Graph Neural Network Methods in Financial Applications [Paper]
  • A Survey on Graph Representation Learning Methods [Paper]

Tabular Data

  • On Embeddings for Numerical Features in Tabular Deep Learning [Paper]
  • Why Do Tree-Based Models Still Outperform Deep Learning on Tabular Data? [Paper]
  • An Embedding Learning Framework for Numerical Features in CTR Prediction [Paper]
  • DCN V2: Improved Deep & Cross Network and Practical Lessons [Paper]
  • Revisiting Deep Learning Models for Tabular Data [Paper]
  • Tabular Data: Deep Learning is Not All You Need [Paper]
  • Deep Neural Networks and Tabular Data: A Survey [Paper]
  • XGBoost: A Scalable Tree Boosting System [Paper]

Time Series

  • Transformers in Time Series: A Survey [Paper]
  • Forecasting with Trees [Paper]
  • Deep Learning for Time Series Forecasting: Tutorial and Literature Survey [Paper]
  • DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks [Paper]
  • NeuralProphet: Explainable Forecasting at Scale [Paper]
  • Prophet: Forecasting at Scale [Paper]
  • AR-Net: A simple Auto-Regressive Neural Network for time-series [Paper]
  • Conditional Time Series Forecasting with Convolutional Neural Networks [Paper]
  • WaveNet: A Generative Model for Raw Audio [Paper]
  • An Experimental Review on Deep Learning Architectures for Time Series Forecasting [Paper]
  • Do We Really Need Deep Learning Models for Time Series Forecasting? [Paper]
  • Machine Learning vs Statistical Methods for Time Series Forecasting: Size Matters [Paper]