what-I-read-2024

Papers I read in 2024

February

Main Interest: Optimal Serving, Model Quantization

Efficient Memory Management for Large Language Model Serving with PagedAttention (ArXiv 2023)

Tidy Data (Journal of Statistical Software 2014)

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards (ArXiv 2024)

A Comprehensive Survey of Compression Algorithms for Language Models (ArXiv 2024)

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models (ICLR 2024)

March

Main Interest: Multilinguality, LM Evaluation

No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization (ArXiv 2024)

Do Llamas Work in English? On the Latent Language of Multilingual Transformers (ArXiv 2024)

Is Cosine-Similarity of Embeddings Really About Similarity? (ArXiv 2024)

SurfingAttack: Interactive Hidden Attack on Voice Assistants Using Ultrasonic Guided Waves (Network and Distributed System Security Symposium 2020)

The Generative AI Paradox: “What It Can Create, It May Not Understand” (ArXiv 2024)

April

Main Interest: Multilinguality, LM Evaluation

Evalverse: Unified and Accessible Library for Large Language Model Evaluation (ArXiv 2024)

HyperCLOVA X Technical Report (ArXiv 2024)

The Generative AI Paradox: “What It Can Create, It May Not Understand” (ArXiv 2024)

Batch Calibration: Rethinking Calibration for In-Context Learning and Prompt Engineering (ICLR 2024)

Generative AI for Synthetic Data Generation: Methods, Challenges and the Future (ArXiv 2024)