A curated list of papers that may be of interest to Data Scientists and Machine Learning students and professionals:
MAR-2024
FEB-2024
- Large Language Models: A Survey (2024)
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context (2024)
JAN-2024
- Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM (2024)
- Leveraging Large Language Models for NLG Evaluation: A Survey (2024)
- Foundations of Vector Retrieval (2024)
DEC-2023
- An In-depth Look at Gemini's Language Abilities (2023)
- Gemini: A Family of Highly Capable Multimodal Models (2023)
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces (2023)
- Data Management For Large Language Models: A Survey (2023)
- The Efficiency Spectrum of Large Language Models: An Algorithmic Survey (2023)
NOV-2023
- Have we built machines that think like people? (2023)
- ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up? (2023)
OCT-2023
SEP-2023
- Applications of Deep Neural Networks with Keras (2022)
- The Modern Mathematics of Deep Learning (2023)
- Multimodal Deep Learning (2023)
- To SMOTE, or not to SMOTE? (2023)
- Instruction Tuning for Large Language Models: A Survey (2023)
- Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities (2023)
AUG-2023
- A Survey on Multimodal Large Language Models (2023)
- Efficient Guided Generation for Large Language Models (2023)
- From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models (2023)
- Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment (2023)
JUL-2023
- Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better (2021)
- Aligning Large Language Models with Human: A Survey (2023)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook (2023)
- Challenges and Applications of Large Language Models (2023)
- How is ChatGPT's behavior changing over time? (2023)
- A Survey on Evaluation of Large Language Models (2023)
JUN-2023
- FinGPT: Open-Source Financial Large Language Models (2023)
- How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources (2023)
- A Simple and Effective Pruning Approach for Large Language Models (2023)
- Reasoning with Language Model Prompting: A Survey (2023)
- All arXiv Publications Pre-Processed for NLP, Including Structured Full-Text and Citation Network (2023)
- An Overview of Catastrophic AI Risks (2023)
- A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks (2023)
- The Impact of Positional Encoding on Length Generalization in Transformers (2023)
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM (2023)
- Everyone wants to do the model work, not the data work: Data Cascades in High-Stakes AI (2021)
- Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning (2019)
- Evaluating the Quality of Machine Learning Explanations (2020)
- Evaluation of statistical and machine learning models for time series prediction (2019)
MAY-2023
- A PhD Student's Perspective on Research in NLP in the Era of Very Large Language Models (2023)
- The False Promise of Imitating Proprietary LLMs (2023)
- Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training (2023)
- Trustworthy AI: From Principles to Practices (2023)
- Machine Learning Testing: Survey, Landscapes and Horizons (2019)
- BIML Interactive Machine Learning Risk Framework (2023)
- A Survey on the Explainability of Supervised Machine Learning (2020)
- Cramming: Training a Language Model on a Single GPU in One Day (2023)
- Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks (2023)
- Excuse me, do you have a moment to talk about version control? (2017)
- Good Enough Practices in Scientific Computing (2016)
- RWKV: Reinventing RNNs for the Transformer Era (2023)
- Scaling Speech Technology to 1,000+ Languages (2023)
- Any-to-Any Generation via Composable Diffusion (2023)