NLP-Paper-News

A list of NLP papers and news items I've checked. Some entries include a short description (abstract).

📜: Paper link 🧑🏻‍💻: Developer blog & GitHub link 🗞️: News


2024

☃ February

1st ~ 3rd week
  • 📜 [Cohere] Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
    • The outcome of a multilingual-model research project involving over 3,000 researchers from 119 countries. The dataset is open-sourced as well (513M instruction fine-tuning examples).
  • 📜 OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
  • 🧑🏻‍💻 [OpenAI] Memory and new controls for ChatGPT
    • ChatGPT can use your past chat history as memory in the current conversation for personalization. A feature still being tested with a subset of users.
  • 🧑🏻‍💻 [NVIDIA] Say What? Chat With RTX Brings Custom Chatbot to NVIDIA RTX AI PCs
  • 🗞️ Nvidia briefly beats Amazon and nears Alphabet's market cap amid AI hype
  • 🧑🏻‍💻 [DeepLearning.AI] Serverless LLM apps with Amazon Bedrock
  • 📜 On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks
  • 📜 [Google DeepMind] Transformers Can Achieve Length Generalization But Not Robustly
    • Transformers can extrapolate to longer inputs to a limited extent (about 2.5x), but not in a setting that generalizes robustly.
  • 📜 [Google DeepMind] Chain-of-Thought Reasoning Without Prompting
    • Exactly what the title says: CoT reasoning can be elicited without any prompting, by adjusting the decoding process.
  • 🧑🏻‍💻 [Google] Our next-generation model: Gemini 1.5
    • Gemini 1.5 arrives, claiming to accept inputs of up to 1M tokens. Reportedly ready for deployment but not yet rolled out.
  • 🧑🏻‍💻 [OpenAI] Sora: Creating video from text
    • OpenAI's first text-to-video model. Its jaw-dropping quality is stirring up discussion across many communities.
  • 📜 [Apple] Guiding Instruction-based Image Editing via Multimodal Large Language Models
    • Image editing driven by text alone, no expert knowledge required, with remarkably good results. An ICLR '24 Spotlight paper.
  • 📜 Using Counterfactual Tasks to Evaluate the Generality of Analogical Reasoning in Large Language Models
  • 🗞️ Slack AI is here, letting you catch up on lengthy threads and unread messages
    • Summarizes unread threads. Available only in the UK & US for now.
  • 📜 [Google DeepMind & Research] A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
    • Stores episodes as gist memories so that ReadAgent can quickly fetch task-relevant information; inspired by how humans read long texts.
  • 📜 DoRA: Weight-Decomposed Low-Rank Adaptation
    • Closes the gap between LoRA and full fine-tuning by decomposing the pre-trained weight into magnitude and direction (see the sketch after this week's list).
  • 📜 Can We Verify Step by Step for Incorrect Answer Detection?
    • Computes a process discernibility score (PDS) for each CoT step, providing an answer-checking baseline.
  • 🧑🏻‍💻 minbpe
    • BPE code Karpathy released as he left OpenAI; lets you build your own tokenizer (a toy version follows after this week's list).
  • 🧑🏻‍💻 [Meta] V-JEPA
    • A non-generative model self-supervised with very little labeled data; proposes the new Joint Embedding Predictive Architecture concept.
4th week
5th week
  • 📜 [UC Berkeley] LoRA+: Efficient Low Rank Adaptation of Large Models
    • Points out that vanilla LoRA is suboptimal and presents an adaptation technique that improves performance by 1~2% while training up to 2x faster.
    • The problem is that LoRA updates the adapter matrices A and B with the same fixed learning rate → LoRA+ is an algorithm that improves both performance and training speed by giving the two matrices different learning rates (a sketch follows at the end of this week's list).
  • 📜 OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
    • A benchmark of Olympiad-level science problems: 8,952 math and physics questions with expert-level step-by-step reasoning annotations.
  • 📜 Large Language Models for Data Annotation: A Survey
    • A survey of training techniques and methodologies that use LLMs for data annotation.
  • 📜 Purifying Large Language Models by Ensembling a Small Language Model
    • Proposes ensembling with a small language model (SLM) to handle issues such as sensitive information in the training data and data poisoning.
  • 📜 Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation
    • Applies dropout and quantization to overcome the limitation of contrastive decoding, which requires both an expert and an amateur model.
  • 📜 tinyBenchmarks: evaluating LLMs with fewer examples
    • Existing benchmark datasets contain far too many cases; curates a small set of examples that supports evaluation of the same quality.
  • 🧑🏻‍💻 [Google DeepMind] 🧞 Genie: Generative Interactive Environments
    • Building games from a single image prompt..
  • 🧑🏻‍💻 [Mistral AI] Le Chat Mistral
    • Mistral's chatbot service.
  • 🧑🏻‍💻 [Mistral AI] Au Large
    • Mistral's new flagship model. Performance close behind GPT-4, available via API (La Plateforme, Azure, self-deployment).
  • 📜 [Microsoft Research] 🐳 Orca-Math: Unlocking the potential of SLMs in Grade School Math
    • Orca-Math, a 7B model trained on top of Mistral-7B, using 200K high-quality synthetic problems and a training scheme that incorporates feedback. Outperforms Llama-2-70B, ChatGPT-3.5, and others.
  • 🧑🏻‍💻 [Argilla] OpenHermesPreferences - a dataset of 1M AI preferences for RLAIF and DPO
    • A dataset of 1M AI preferences obtained from Mixtral-8x7B-Instruct-v0.1, Nous-Hermes-2-Yi-34B, PairRM, and others. Usable for DPO or RLAIF.
  • 📜 LLMs with Chain-of-Thought Are Non-Causal Reasoners
    • An analysis of cases where the CoT is correct but the final answer is wrong, and of the opposite cases.
  • 📜 Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models
    • Improves problem solving on complex reasoning tasks by decomposing and elaborating the problem context (Problem Elaboration Prompting, PEP).
  • 🗞️ Apple cancels work on electric car, shifts team to generative AI
    • Apple reportedly stops building an electric car to focus on generative AI development.
  • 📜 Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models
    • LLMs perform worse on subjective tasks than on objective ones. Instead of rationale-style approaches like CoT, introduces dialogue simulation to address this.
  • 🧑🏻‍💻 [DeepLearning.AI] Prompt Engineering with Llama 2
    • A course on prompt engineering, such as few-shot prompting, using Meta's Llama 2.

🌱 March

1st ~ 2nd week
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป OpenAI APIโ€™s change on log probabilities from 5 to 20 return
  • ๐Ÿ—ž๏ธ Robotics startup Figure raises $675 mln from Microsoft, Nvidia, OpenAI
    • IT ๊ณต๋ฃก ๊ธฐ์—…๋“ค์ด ๋กœ๋ด‡ ๋ถ„์•ผ์—๋„ ์ ๊ทน์ ์œผ๋กœ ํˆฌ์žํ•˜๊ณ  ์žˆ๋‹ค๋Š” ์†Œ์‹
  • ๐Ÿ“œ [IIT] How to think step-by-step: A mechanistic understanding of chain-of-thought reasoning
    • CoT์— ๋Œ€ํ•ด layer๋ณ„๋กœ ๋ถ„์„. token representation์„ ํ™•์ธํ•œ ๊ฒฐ๊ณผ ์ค‘๊ฐ„ ์ด์ „์˜ layer์—์„œ๋Š” ์‚ฌ์ „ ํ•™์Šต๋ฐ์ดํ„ฐ์— ๋Œ€ํ•ด ํŽธํ–ฅ๋˜์–ด ์žˆ์œผ๋‚˜ ์ค‘๊ฐ„ ์ดํ›„๋ถ€ํ„ฐ๋Š” ๊ธ‰๊ฒฉํžˆ in-context์— ์ง‘์ค‘
  • ๐Ÿ“œ [Rice University] Learning to Compress Prompt in Natural Language Formats
    • API์— ๋Œ€ํ•ด์„œ๋Š” soft prompt compression์„ ์ ์šฉํ•  ์ˆ˜ ์—†๊ธฐ ๋•Œ๋ฌธ์— ์ž์—ฐ์–ด ํ˜•ํƒœ๋กœ compressionํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ œ์‹œ. ์—ฌ๊ธฐ์— ์‚ฌ์šฉ๋˜๋Š” ๊ฒƒ์ด Natrual Language Prompt Encapsulation (Nano-Capsulator) framework.
  • ๐Ÿ“œ [Microsoft] ResLoRA: Identity Residual Mapping in Low-Rank Adaption
    • original model์˜ long calculation path๋ฅผ ๋™์ผํ•˜๊ฒŒ ๊ฑฐ์ณ์•ผ ํ•˜๋Š” LoRA์˜ ํ•œ๊ณ„๋ฅผ ๋ณด์™„ํ•˜๊ธฐ ์œ„ํ•ด ํ•™์Šต ๋™์•ˆ์— residual path๋ฅผ ๋”ํ•˜๊ณ , ์ถ”๋ก  ๋™์•ˆ์—๋Š” ์ด๋Ÿฌํ•œ extra path๋ฅผ ์ œ๊ฑฐํ•˜๊ธฐ ์œ„ํ•œ merging approach๋ฅผ ์‚ฌ์šฉ โ†’ LoRA์™€ ๋Œ€๋น„ ํ•™์Šต ๋ฐ ์ถ”๋ก  cost๋Š” ๋” ๋‚ฎ์œผ๋ฉด์„œ๋„ performance๋Š” ๋” ์ข‹์Œ
  • ๐Ÿ“œ Datasets for Large Language Models: A Comprehensive Survey
    • 8๊ฐœ ์–ธ์–ด, 32๊ฐœ ๋„๋ฉ”์ธ, 444๊ฐœ ๋ฐ์ดํ„ฐ์…‹์— ๋Œ€ํ•œ ์„œ๋ฒ ์ด ๋…ผ๋ฌธ. ์ด 774.5TB์— ๋‹ฌํ•˜๋Š” ์‚ฌ์ „ํ•™์Šต corpora๋ฅผ ๋ถ„๋ฅ˜
  • ๐Ÿ“œ [Apple] LUCID: LLM-Generated Utterances for Complex and Interesting Dialogues
    • 4,277๊ฐœ์— ๋‹ฌํ•˜๋Š” multi-domain, multi-intent conversation๋ฅผ ์ƒ์„ฑํ•˜๊ธฐ ์œ„ํ•ด LUCID๋ฅผ ์‚ฌ์šฉ (LLM-generated Utterances for Complex and Interesting Dialogues)
  • ๐Ÿ“œ An Empirical Categorization of Prompting Techniques for Large Language Models: A Practitioner's Guide
    • 7๊ฐœ์˜ ์นดํ…Œ๊ณ ๋ฆฌ๋กœ ๊ตฌ๋ถ„ํ•˜์—ฌ academicํ•˜๋ฉด์„œ๋„ pragmaticํ•œ ๋‚ด์šฉ์˜ prompting ํ…Œํฌ๋‹‰์„ ์ •๋ฆฌํ•œ ์„œ๋ฒ ์ด ํŽ˜์ดํผ
  • ๐Ÿ“œ [Meta] Learning and Leveraging World Models in Visual Representation Learning
    • Joint-Embedding Predictive Architecture (JEPA)์— conditioning, prediction difficulty, capacity ๊ฐœ๋…์„ ๋”ํ•œ Image Word Models๋ฅผ ์ œ์‹œ. ์–€ ๋ฅด์ฟค์ด ์—ฐ๊ตฌ์— ์ฐธ์—ฌ
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [Anthropic] Introducing the next generation of Claude
    • Haiku, Sonnet, Opus๋กœ ๊ตฌ์„ฑ๋œ Claude 3 family๋ฅผ ๊ณต๊ฐœ. 159๊ฐœ ๊ตญ๊ฐ€์—์„œ API ์ด์šฉ ๊ฐ€๋Šฅ. (์ž์‹ ๋“ค์˜ ์ฃผ์žฅ์œผ๋กœ๋Š”) ์—ฌ๋Ÿฌ ๋ฒค์น˜๋งˆํฌ์—์„œ GPT-4๋ฅผ ๋Šฅ๊ฐ€ํ•˜๋Š” ์„ฑ๋Šฅ. Vision ๊ด€๋ จ ๋Šฅ๋ ฅ๋„ ๋›ฐ์–ด๋‚œ ํŽธ. ๋ถˆํ•„์š”ํ•œ ๊ฑฐ์ ˆ ๋ฉ”์„ธ์ง€ ๋ฐ˜ํ™˜์œจ๋„ ํฌ๊ฒŒ ๋–จ์–ด์ง (์ด์ „ ๋ฒ„์ „์—์„œ์˜ ์ด์Šˆ). 200K์˜ window size๋กœ ์ถœ์‹œ๋˜์—ˆ์œผ๋‚˜ ํŠน์ • ๊ณ ๊ฐ๋“ค์— ํ•œํ•ด 1M ํ† ํฐ๋„ ์ฒ˜๋ฆฌ ๊ฐ€๋Šฅํ•˜๊ฒŒ๋” ํ•  ์ˆ˜ ์žˆ์Œ์„ ์–ธ๊ธ‰.
  • ๐Ÿ“œ Distilling Text Style Transfer With Self-Explanation From LLMs
    • test style transfer ๋ถ„์•ผ์—์„œ ๋ถ€์กฑํ•œ parallel ๋ฐ์ดํ„ฐ์…‹์„ ๊ตฌ์ถ•. ์—ฌ๊ธฐ์— LLM distillation์„ ํ™œ์šฉ
  • ๐Ÿ“œ [Stanford, Georgia Tech, Microsoft, Google DeepMind] Design2Code: How Far Are We From Automating Front-End Engineering?
    • ์‹ค์ œ 484๊ฐœ์˜ ์›นํŽ˜์ด์ง€๋ฅผ ํ…Œ์Šคํฌ ์ผ€์ด์Šค๋กœ ๋‘๊ณ  Design2Code task๋ฅผ ํ‰๊ฐ€ํ•˜๋Š” ๋ฒค์น˜๋งˆํฌ๋ฅผ ๊ตฌ์ถ•. Gemini Pro Vision์— ๋ฒ„๊ธˆ๊ฐ€๋Š” Design2Code-18B ๋ชจ๋ธ์„ fine-tuning
  • ๐Ÿ“œ PHAnToM: Personality Has An Effect on Theory-of-Mind Reasoning in Large Language Models
    • Theory of Mind (ToM) Reasoning์„ ์ด๋Œ์–ด๋‚ด๊ธฐ ์œ„ํ•ด ํ•„์š”ํ•œ personality๊ฐ€ ์–ด๋–ค ๊ฒƒ์ธ์ง€์— ๋Œ€ํ•œ ์—ฐ๊ตฌ. ํŠน์ • personality๊ฐ€ ToM ๊ด€๋ จ ํƒœ์Šคํฌ์˜ ์„ฑ๋Šฅ์„ ๋†’์ด๋Š” ๋ฐ ๋„์›€์ด ๋˜๋Š” ๊ฒƒ์„ ํ™•์ธ.
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป 2024 ์˜คํ”ˆ์†Œ์Šค ์ปจํŠธ๋ฆฌ๋ทฐ์…˜ ์•„์นด๋ฐ๋ฏธ [์ฒดํ—˜ํ˜•] ๋ฉ˜ํ‹ฐ ๋ชจ์ง‘
    • โ€˜Git ํ™œ์šฉ ๋ฐ Gemma๋ฅผ ์ด์šฉํ•œ LLM ์•ฑ ๊ฐœ๋ฐœโ€™
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป Elon Musk and OpenAIโ€™s fiery battle
    • OpenAIโ€™s blog posting about Elon Muskโ€™s accusation
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป Claude 3โ€™s system prompt (X link)
  • ๐Ÿ“œ Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem
    • ๊ธฐ์กด Math Word Problem ๋ฐ์ดํ„ฐ์…‹์„ ๊ธฐ๋ฐ˜์œผ๋กœ unanswerable problems๋ฅผ ํฌํ•จํ•˜๋Š” ์ƒˆ๋กœ์šด ๋ฒค์น˜๋งˆํฌ๋ฅผ ๊ตฌ์ถ•. ๋Œ€๋‹ต ๊ฐ€๋Šฅํ•œ ๋ฌธ์ œ์™€ ๊ทธ๋ ‡์ง€ ์•Š์€ ๋ฌธ์ œ ๊ฐ 2,600๊ฐœ์”ฉ ๊ตฌ์„ฑ. InstructGPT, Claude, LLaMA ์‹œ๋ฆฌ์ฆˆ๋กœ ๊ฒ€์ฆ.
  • ๐Ÿ“œ ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
    • LLM์˜ ํŠน์ • layer๋“ค์ด ๋†’์€ ์œ ์‚ฌ๋„๋ฅผ ๊ฐ€์ง„๋‹ค๋Š” ๊ฒƒ์€ ๋ถˆํ•„์š”ํ•œ layer๊ฐ€ ํฌํ•จ๋˜์–ด ์žˆ๋‹ค๋Š” ๋œป โ†’ Block Influence (BI)๋ผ๋Š” metric์„ ์ •์˜ํ•˜์—ฌ ๊ฐ layer์˜ ์ค‘์š”๋„๋ฅผ ์ธก์ • โ†’ pruning์—์„œ SoTA๋ฅผ ๋‹ฌ์„ฑํ•œ ShortGPT๋ฅผ ๊ฐœ๋ฐœ
  • ๐Ÿ“œ GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
    • full parameter learning์„ ์‚ฌ์šฉํ•˜์ง€๋งŒ LoRA๋ณด๋‹ค๋„ memory-efficientํ•œ ํ•™์Šต ์ „๋žต์ธ Graident Low-Rank Projection (GaLore)๋ฅผ ์ œ์‹œ. 7B ๋ชจ๋ธ์„ 24GB ๋ฉ”๋ชจ๋ฆฌ GPU ํ•œ ๋Œ€๋กœ ๋ณ‘๋ ฌ ์ฒ˜๋ฆฌ ์—†์ด pre-training ๊ฐ€๋Šฅํ•˜๋„๋ก ๋งŒ๋“œ๋Š” ํ…Œํฌ๋‹‰.
  • ๐Ÿ“œ SaulLM-7B: A pioneering Large Language Model for Law
    • Mistral 7B ๋ชจ๋ธ์„ ๋ฒ ์ด์Šค๋กœ ๋ฒ•๋ฅ  ๋ฐ์ดํ„ฐ๋กœ continual pre-training & instruction fine-tuningํ•œ ๋ชจ๋ธ SaulLM-7B ๋ชจ๋ธ์„ ๊ณต๊ฐœ. 30B ํ† ํฐ์˜ ๋ฒ•๋ฅ  ๋ฐ์ดํ„ฐ๋กœ ํ•™์Šตํ–ˆ๋‹ค๊ณ  ํ•จ.
  • ๐Ÿ—ž๏ธ Salesforce announces new AI tools for doctors
    • ์„ธ์ผ์ฆˆํฌ์Šค์—์„œ ์˜๋ฃŒ ๋ถ„์•ผ์˜ ํ–‰์ •์  ์—…๋ฌด ๋ถ€๋‹ด์„ ์™„ํ™”ํ•ด์ค„ ์ˆ˜ ์žˆ๋Š” Einstein Copilot์„ ์ถœ์‹œ
  • ๐Ÿ“œ Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
    • LLM ์„ฑ๋Šฅ ํ‰๊ฐ€ ๊ฒฐ๊ณผ๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ๋ฆฌ๋”๋ณด๋“œ๋กœ ๋„๋ฆฌ ์‚ฌ์šฉ๋˜๋Š” ์ฑ—๋ด‡ ์•„๋ ˆ๋‚˜์— ๋Œ€ํ•œ ์„ค๋ช…์ด ๋‹ด๊ธด ๋…ผ๋ฌธ. ์‚ฌ์šฉ๋œ ๋ฉ”ํŠธ๋ฆญ์ด๋‚˜ ์ง€๊ธˆ๊นŒ์ง€์˜ ํ‰๊ฐ€ ๊ฒฐ๊ณผ์— ๋Œ€ํ•œ ๋ถ„์„์„ ํฌํ•จํ•˜๊ณ  ์žˆ์Œ
  • ๐Ÿ“œ Yi: Open Foundation Models by 01.AI
    • 01.AI์—์„œ ์ถœ์‹œํ•œ LLM, Yi. 6B, 34B ์‚ฌ์ด์ฆˆ์˜ ์‚ฌ์ „ํ•™์Šต ๋ชจ๋ธ์ด๋ฉฐ 200K์˜ context length, depth-upscaled model, vision-language model ์ด๋ผ๋Š” ํŠน์ง•์„ ์ง€๋‹˜
  • ๐Ÿ“œ [Meta] Teaching Large Language Models to Reason with Reinforcement Learning
    • feedback์œผ๋กœ๋ถ€ํ„ฐ ๋ฐฐ์šฐ๋Š” ์—ฌ๋Ÿฌ ์•Œ๊ณ ๋ฆฌ์ฆ˜ (Expert Iteration, Proximal Policy Optimization, Return-Conditioned RL)์— ๋Œ€ํ•œ ๋น„๊ต ์—ฐ๊ตฌ
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป ๐Ÿฆ WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป mamba_peft.py on HuggingFace
    • mamba๋ฅผ ์ด์ œ transformers์—์„œ ์ด์šฉํ•  ์ˆ˜ ์žˆ์Œ. ์œ„ ๋งํฌ๋Š” PEFT example ์ฝ”๋“œ.
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป Foundation Model Development Cheatsheet
    • ๊ฐ์ข… ๋ชจ๋ธ ๋ฐ ๋ฐ์ดํ„ฐ์…‹์„ ์นดํ…Œ๊ณ ๋ฆฌ์™€ ๋ชจ๋‹ฌ๋ฆฌํ‹ฐ๋กœ ๊ตฌ๋ถ„ํ•˜์—ฌ ํ•œ ๋ฒˆ์— ํ™•์ธํ•  ์ˆ˜ ์žˆ๋Š” ์‚ฌ์ดํŠธ
  • ๐Ÿ“œ Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation
    • 1.65M ๊ฐœ์˜ examples๋กœ ํ•™์Šต๋œ ์˜คํ”ˆ์†Œ์Šค ๋ชจ๋ธ for conditional task generation. unannotated text๋ฅผ instruction tuning์„ ์œ„ํ•œ task-specific training datasets์œผ๋กœ ๋ณ€ํ™˜
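
A rough sketch of the GaLore idea referenced above, with simplifying assumptions: a plain SGD update in the projected space instead of Adam, and the projector recomputed every step rather than every T steps:

```python
import torch

@torch.no_grad()
def galore_sgd_step(weight: torch.Tensor, grad: torch.Tensor,
                    rank: int = 4, lr: float = 1e-3) -> None:
    """Project the gradient to rank-r, update there, project back."""
    U, _, _ = torch.linalg.svd(grad, full_matrices=False)
    P = U[:, :rank]                    # (m, r) projection basis
    low_rank_grad = P.T @ grad         # (r, n): optimizer state stays this small
    update = P @ low_rank_grad         # back to the full (m, n) shape
    weight.sub_(lr * update)           # in-place SGD step
```
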
3rd week
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [Gen AI Korea 2024] ์ƒ์„ฑํ˜• AI ๋ ˆ๋“œํŒ€ ์ฑŒ๋ฆฐ์ง€
    • 4์›” 11์ผ (๋ชฉ) ~ 4์›” 12์ผ (๊ธˆ), ์ฝ”์—‘์Šค์—์„œ ์ง„ํ–‰๋˜๋Š” ์ฑŒ๋ฆฐ์ง€ ๋ฐ ์ปจํผ๋Ÿฐ์Šค. Cohere ๋Œ€ํ‘œ, Kakao ์ด์‚ฌ, ๋„ค์ด๋ฒ„ AI ์ˆ˜์žฅ ๋“ฑ ์œ ๋ช… ์ธ์‚ฌ๋“ค์ด ์ฐธ์—ฌ
  • ๐Ÿ“œ [Anthropic] The Claude 3 Model Family: Opus, Sonnet, Haiku
    • Anthropic์—์„œ ์ตœ๊ทผ ์ถœ์‹œํ•œ Claude 3 ๋ชจ๋ธ ํŒจ๋ฐ€๋ฆฌ์— ๋Œ€ํ•œ model card. ์ฃผ๋กœ ๋ฒค์น˜๋งˆํฌ ์„ฑ๋Šฅ ํ‰๊ฐ€ ๊ฒฐ๊ณผ๊ฐ€ ์ œ์‹œ๋˜์–ด ์žˆ๋Š” ๋“ฏํ•จ
  • ๐Ÿ“œ [Microsoft] Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
    • OpenAI์—์„œ ์ถœ์‹œํ•œ text-to-video ์ƒ์„ฑ AI ๋ชจ๋ธ, Sora์— ๋Œ€ํ•œ comprehensive review paper
  • ๐Ÿ“œ [Google Research] Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation
    • ๊ธฐ์กด์—๋Š” ์ „์ฒด output์— ๋Œ€ํ•œ single reward๋ฅผ ๋ฐ˜ํ™˜ํ–ˆ๊ธฐ ๋•Œ๋ฌธ์— reward signal ์ž์ฒด๊ฐ€ spareํ•˜๋‹ค๋Š” ๋ฌธ์ œ๊ฐ€ ์žˆ์—ˆ์Œ โ†’ LLM์˜ ๋น„ํŒ(critique) ๋Šฅ๋ ฅ์„ ํ™œ์šฉํ•˜์—ฌ RL ํ•™์Šต ๊ณผ์ •์—์„œ ์‚ฌ์šฉ๋  ์ˆ˜ ์žˆ๋Š” intermediate-step rewards๋ฅผ ์ƒ์„ฑ
  • ๐Ÿ“œ Birbal: An efficient 7B instruct-model fine-tuned with curated datasets
    • NeurIPS workshop์œผ๋กœ ์ง„ํ–‰๋œ LLM Efficiency Challenge. RTX 4090 ๋˜๋Š” A00 with 40GB ํ•œ ๋Œ€๋กœ 24์‹œ๊ฐ„ ๋‚ด์— ํ•™์Šตํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•จ. ๋ณธ ๋ชจ๋ธ์€ Mistral-7B๋ฅผ ๋ฒ ์ด์Šค๋กœ ์‚ผ๊ณ  ์žˆ์œผ๋ฉฐ RTX 4090์œผ๋กœ 16์‹œ๊ฐ„ ๋™์•ˆ ํ•™์Šตํ•จ. ์ด๋Š” ๋‹ค์–‘ํ•œ ํƒœ์Šคํฌ๋ฅผ ์•„์šฐ๋ฅด๋Š” ๊ณ ํ’ˆ์งˆ instruction dataset์—์„œ ๊ธฐ์ธํ•จ
  • ๐Ÿ“œ [Google DeepMind] Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
    • context์˜ ๊ธธ์ด๊ฐ€ ๊ธด ์ƒํ™ฉ์—์„œ, Gemini 1.5 ๋ชจ๋ธ ํŒจ๋ฐ€๋ฆฌ๊ฐ€ ์–ด๋–ค ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ฃผ๋Š”์ง€ ๋น„๊ต ๋ถ„์„ํ•œ ๊ตฌ๊ธ€์˜ technical report. MMLU์—์„œ ์‚ฌ๋žŒ์˜ ์ตœ๊ณ  ์ ์ˆ˜๋ฅผ ๋„˜์€ ์ตœ์ดˆ์˜ ๋ชจ๋ธ์ด๋ผ๊ณ  ์ฃผ์žฅํ•˜์ง€๋งŒ ๋Œ€์ค‘์˜ ํ‰๊ฐ€๋Š” ์ƒ์ดํ•จ.
  • ๐Ÿ“œ MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining
    • task-specific Chain-of-Thought-based insturction generation mechanism
  • ๐Ÿ“œ Harnessing Multi-Role Capabilities of Large Language Models for Open-Domain Question Answering
    • ODQA ํƒœ์Šคํฌ์—์„œ โ€˜retrieve-then-readโ€™์™€ โ€˜generate-then-readโ€™ ํŒจ๋Ÿฌ๋‹ค์ž„์„ ํ•ฉ์นœ ๋ฐฉ์‹. query expansion, document selection, answer generation์˜ ์„ธ ๊ฐ€์ง€ ์Šคํ…์œผ๋กœ ๊ตฌ์„ฑ๋จ.
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [Cohere] Command-R: Retrieval Augmented Generation at Production Scale
    • long context๋ฅผ ํ™œ์šฉํ•˜๋Š” RAG๋‚˜ ์™ธ๋ถ€ API, ๋˜๋Š” tool ์‚ฌ์šฉ์— ์ ํ•ฉํ•œ ์ƒ์„ฑํ˜• ๋ชจ๋ธ Command-R์„ ๊ณต๊ฐœ. Embed & Rerank ๋ชจ๋ธ๊ณผ ํ•จ๊ป˜ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋„๋ก ์„ค๊ณ„๋จ. Cohere API๋ฅผ ํ†ตํ•ด ์ด์šฉ ๊ฐ€๋Šฅ.
  • ๐Ÿ“œ [MIT] RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback
    • query์™€ ๋ฌด๊ด€ํ•œ ๋ฌธ์„œ๊ฐ€ retrieve ๋˜๋Š” ๊ฒƒ์„ ๋ฐฉ์ง€ํ•˜๊ธฐ ์œ„ํ•ด Iterative Self-Feedback ๋ฐฉ์‹์„ ์ œ์•ˆ
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [OpenAI] transfromer-debugger (TBD)
    • Small Language Models์˜ ํŠน์ • ํ–‰๋™์„ ์กฐ์‚ฌํ•˜๊ธฐ ์œ„ํ•œ ๋ชฉ์ ์œผ๋กœ ์ œ์ž‘๋œ ๋””๋ฒ„๊น… ํˆด (๊นƒํ—ˆ๋ธŒ ๋ ˆํฌ ๋งํฌ)
  • ๐Ÿ“œ [Google DeepMind, OpenAI] Stealing Part of a Production Language Model
    • proprietary ๋ชจ๋ธ์˜ embedding projector layer๋ฅผ hacking์œผ๋กœ ์–ป์„ ์ˆ˜ ์žˆ๋‹ค๋Š” ํ™”์ œ์˜ ๋…ผ๋ฌธ
  • ๐Ÿ“œ [Meta] Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
    • seed ๋ชจ๋ธ๋กœ๋ถ€ํ„ฐ ๊ฐ ๋ฐ์ดํ„ฐ์— ๋”ฐ๋ผ ๋‹ค๋ฅธ expert LLM์„ ํ•™์Šต์‹œํ‚ค๊ณ , router๋ฅผ ํ†ตํ•ด ์ถ”๊ฐ€์ ์ธ FeedForward layer๋ฅผ ํ•™์Šต์‹œํ‚ค๋Š” ๋ฐฉ์‹์ธ Branch-Train-Mix๋ฅผ ์ œ์•ˆ. MoE finetuning์ด ํ•„์š”ํ•˜์ง€ ์•Š์€ Branch-Train-Merge ๋ฐฉ์‹์—๋„ ์ ์šฉ ๊ฐ€๋Šฅ.
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [DeepLearning.AI] Knowledge Graph for RAG
    • Neo4j์™€์˜ collaboration. RAG ๋‚ด์—์„œ knowledge graph๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ๋ฐฐ์šฐ๋Š” ๊ณผ์ • (graph store)
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [Google DeepMind] A generalist AI agent for 3D virtual environments
    • ๋‹ค์–‘ํ•œ video-game ํ™˜๊ฒฝ์—์„œ natural language instruction์„ ๋”ฐ๋ฅผ ์ˆ˜ ์žˆ๋Š” Multiworld Agent๋ฅผ ๊ฐœ๋ฐœ
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [Microsoft Research] Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
    • ์—ฌ๋Ÿฌ ์„ ํƒ์ง€ ์ค‘์—์„œ ํ•˜๋‚˜๋ฅผ ๊ณ ๋ฅด๋Š” Multiple Choice Question Answering (MCQA) ๋Œ€์‹  24๊ฐœ์˜ ๋ชจ๋ธ์ด ์ฐธ์—ฌํ•˜๋Š” RWQ-Elo ranking system์„ ์ œ์•ˆ
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [OpenAI] Figure Status Update - OpenAI Speech-to-Speech Reasoning
    • OpenAI์—์„œ Figure๋ผ๋Š” ๋กœ๋ด‡ ํšŒ์‚ฌ์™€ ์ œํ’ˆ์„ ๊ฒฐํ•ฉํ•˜์—ฌ ์ธ์ง€ ๋ฐ ์ถ”๋ก  ๋Šฅ๋ ฅ์ด ์•„์ฃผ ๋›ฐ์–ด๋‚œ ๋กœ๋ด‡์„ ๊ฐœ๋ฐœ
  • ๐Ÿ“œ [Tancent] Large Language Models are Contrastive Reasoners
    • โ€œLetโ€™s give a correct and a wrong answerโ€, prompt๋ฅผ ์•ž์— ๋ถ™์—ฌ์คŒ. ์ด๋กœ์จ LLM์ด ํ›Œ๋ฅญํ•œ contrastive reasoner๋ผ๋Š” ๊ฒƒ์„ ์ž…์ฆํ•œ ์—ฐ๊ตฌ.
  • ๐Ÿ“œ Logits of API-Protected LLMs Leak Proprietary Information
    • proprietary ๋ชจ๋ธ๋“ค์˜ hidden size, full-vocabulary output ๋“ฑ์— ๊ด€ํ•œ ์ •๋ณด๋ฅผ ์ ์€ API ๋น„์šฉ์œผ๋กœ hackingํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ๋…ผ๋ฌธ. gpt-3.5-turbo์˜ ๊ฒฝ์šฐ $1000 ์ดํ•˜๊ฐ€ ํ•„์š”ํ•˜๋‹ค๊ณ  ์ฃผ์žฅ.
  • ๐Ÿ“œ [Apple] MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
    • Multimodal Large Language Models์— ๊ด€ํ•œ ์‚ฌ์ „ํ•™์Šต์šฉ ๋ฐ์ดํ„ฐ ์„ ์ •, ํ•™์Šต ๊ธฐ๋ฒ•, ์ด๋ฏธ์ง€ ์ธ์ฝ”๋” ๋“ฑ์— ๋Œ€ํ•œ ์—ฐ๊ตฌ. dense ๋ชจ๋ธ๊ณผ mixture-of-experts (MoE) ๋ฐฉ์‹์„ ๊ฒฐํ•ฉํ•œ MM1 ๋ชจ๋ธ ํŒจ๋ฐ€๋ฆฌ๋ฅผ ๊ฐœ๋ฐœ
  • ๐Ÿ—ž๏ธ Ex-Activision CEO Bobby Kotick pitched buying TikTok to potential partners, including Sam Altman: report
    • ๋ฏธ๊ตญ์—์„œ๋Š” ํ‹ฑํ†ก์„ ๊ทœ์ œํ•˜๋Š” ์™€์ค‘์— Activision์˜ ์ „ CEO๊ฐ€ ํ‹ฑํ†ก์„ ์ธ์ˆ˜ํ•˜๊ณ  OpenAI์™€ ํ˜‘๋ ฅํ•  ๊ณ„ํš์„ ๊ฐ–๊ณ  ์žˆ์Œ์— ๊ด€ํ•œ ๋ณด๋„
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [xAI] Open Release of Grok-1
    • ์ผ๋ก  ๋จธ์Šคํฌ์˜ AI ํšŒ์‚ฌ xAI์—์„œ LLM Grok-1 (314B)์„ ์˜คํ”ˆ ์†Œ์Šค๋กœ ๊ณต๊ฐœ. ์•ฝ์†์„ ์ง€ํ‚ค๋Š” ์ƒ๋‚จ์ž.. OpenAI์™€์˜ ๊ด€๊ณ„์— ๊ธฐ์ธํ•œ ํ˜„์ƒ๊ฐ™๊ธฐ๋„ ํ•˜๊ณ .. (๊นƒํ—ˆ๋ธŒ ๋งํฌ)
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [Cohere] C4AI Command-R (HuggingFace)
    • Cohere์—์„œ ๊ณต๊ฐœํ•œ RAG์— ํŠนํ™”๋œ LLM. ์ง€๋‚œ ๋ฒˆ API๋กœ ๊ณต๊ฐœํ•œ ์ดํ›„ ๋ชจ๋ธ๋„ ํ—ˆ๊น…ํŽ˜์ด์Šค์— ๊ณต๊ฐœ.
  • ๐Ÿ“œ [Stanford University] Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
    • ์–ธ์–ด ๋ชจ๋ธ์ด reasoning์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๊ณผ์ •์—์„œ, ๋งค ์Šคํ…๋งˆ๋‹ค โ€˜thoughtโ€™๋ฅผ ๋ณ‘๋ ฌ์ ์œผ๋กœ ์ƒ์„ฑํ•˜์—ฌ ๋” ์ข‹์€ ์ถ”๋ก ์ด ๊ฐ€๋Šฅํ•˜๋„๋ก ์œ ๋„ํ•˜๋Š” ๋ฐฉ๋ฒ•๋ก ์„ ์ œ์•ˆ
  • ๐Ÿ“œ [Peking University] RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
    • CoT ๋ฌธ์žฅ์˜ ๊ฐ ์š”์†Œ์™€ ๊ด€๋ จ๋œ content๋ฅผ ์ฐพ์•„์„œ ์ด๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ํ•„์š”ํ•œ ๊ฒฝ์šฐ revise. revised ๋ฌธ์žฅ๋“ค๋กœ CoT๋ฅผ ์žฌ๊ตฌ์„ฑ
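
The two logit-extraction papers above ("Stealing Part of a Production Language Model" and "Logits of API-Protected LLMs Leak Proprietary Information") rest on the same observation, which a few lines of synthetic linear algebra can reproduce: full-vocabulary logits lie in a subspace whose dimension equals the hidden size, so the numerical rank of stacked logit vectors reveals it:

```python
import numpy as np

V, h, n = 1000, 64, 200                  # toy vocab size, hidden size, #queries
W = np.random.randn(V, h)                # unknown output-projection matrix
H = np.random.randn(h, n)                # hidden states for n different prompts
logits = W @ H                           # (V, n): what the API would expose
s = np.linalg.svd(logits, compute_uv=False)
print("estimated hidden size:", int((s > 1e-8 * s[0]).sum()))  # prints 64
```
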
4th week
  • ๐Ÿ—ž๏ธ [Nvidia] Nvidia reveals Blackwell B200 GPU, the โ€˜worldโ€™s most powerful chipโ€™ for AI
    • H100์˜ ๋’ค๋ฅผ ์žˆ๋Š” ํ”Œ๋ž˜๊ทธ์‹ญ GPU, B200 ๊ณต๊ฐœ
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป Open-Sora
    • OpenAI์˜ Sora์— ์˜๊ฐ์„ ๋ฐ›์•„ ๋งŒ๋“  ๊ณ ํ’ˆ์งˆ video ์ƒ์„ฑ ๋ชจ๋ธ. ์˜คํ”ˆ์†Œ์Šค๋กœ ๊ณต๊ฐœ.
  • ๐Ÿ“œ [CMU-LTI] Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases
    • upstream datasets processing๊ณผ downstrea performance evaluation์„ ํ†ตํ•ฉํ•œ ์‹œ์Šคํ…œ์„ ๊ตฌ์ถ•. ๋ฐ์ดํ„ฐ ํฌ๋กค๋ง๋ถ€ํ„ฐ QA ์‹œ์Šคํ…œ ์ „๋ฐ˜์— ๋Œ€ํ•œ ๋‚ด์šฉ์„ ๋‹ค๋ฃจ๊ณ  ์žˆ์Œ
  • ๐Ÿ“œ [UC Berkeley] RAFT: Adapting Language Model to Domain Specific RAG
    • Test ๋‹จ๊ณ„์—์„œ ๋ชจ๋ธ์ด ์™ธ๋ถ€ ๋ฌธ์„œ๋ฅผ ํ™œ์šฉํ•˜๋Š” ๋ฐฉ์‹์— ๋Œ€ํ•ด ํ•™์Šตํ•˜๋„๋ก ํ•จ. ์ด๋•Œ golden only ๋ฐฉ์‹์ด ์•„๋‹Œ sampled negative documents๋„ ํ™œ์šฉ.
  • ๐Ÿ“œ [Google Research] PERL: Parameter Efficient Reinforcement Learning from Human Feedback
    • RLHF์— LoRA๋ฅผ ํ™œ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•๋ก ์„ ์ œ์•ˆ. ์ •ํ™•ํžˆ๋Š” reward model ํ•™์Šต์— LoRA๊ฐ€ ํ™œ์šฉ๋จ
  • ๐Ÿ“œ [EACL 2024] Aligning Large and Small Language Models via Chain-of-Thought Reasoning
    • SLM์ด ํŠน์ • ์–‘์‹์„ ์ž˜ ๋”ฐ๋ฅผ ์ˆ˜ ์žˆ๋„๋ก Instruction-tuning-CoT Method๋ฅผ ์ œ์•ˆ
  • ๐Ÿ“œ RankPrompt: Step-by-Step Comparisons Make Language Models Better Reasoners
    • LLM์ด reasoning ๊ณผ์ • ์ค‘์— ๋งŒ๋“œ๋Š” ์‹ค์ˆ˜๋ฅผ ์ค„์ด๊ธฐ ์œ„ํ•œ ๋ฐฉ์‹์œผ๋กœ LLM์ด ์Šค์Šค๋กœ ์ž์‹ ์˜ response์— ๋Œ€ํ•ด ranking ํ•˜๋Š” ๋ฐฉ์‹์„ ์ œ์•ˆ. ์ถ”๊ฐ€์ ์ธ ๋ฆฌ์†Œ์Šค ์‚ฌ์šฉ์ด ๋ฐœ์ƒํ•˜์ง€ ์•Š๋Š”๋‹ค๋Š” ์ ์ด ํŠน์ง•.
  • ๐Ÿ“œ [KAIST] SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs
    • ODQA ํƒœ์Šคํฌ์—์„œ retrieved passage๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ โ€˜๋‹ต๋ณ€ ํ›„๋ณด ์ƒ์„ฑ - ์กฐ๊ฑด๋ถ€ ์š”์•ฝ - ๊ฒ€์ฆโ€™ ๊ณผ์ฆ์„ ๊ฑฐ์ณ ๋ฒค์น˜๋งˆํฌ ์„ฑ๋Šฅ์„ ํฌ๊ฒŒ ๋Œ์–ด์˜ฌ๋ฆฐ LK Lab์˜ ์—ฐ๊ตฌ
  • ๐Ÿ“œ [Microsoft Corporation] LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
    • LLM์œผ๋กœ๋ถ€ํ„ฐ data distillation๋ฅผ ํ†ตํ•ด ์••์ถ•๋œ ํ…์ŠคํŠธ๋ฅผ ํš๋“ํ•˜๊ณ  ์ด์— ๋Œ€ํ•ด annotation์„ ์ˆ˜ํ–‰ํ•œ ๋’ค ํ•„ํ„ฐ๋ง์„ ๊ฑฐ์ณ ๋‚˜์˜จ ๊ฒฐ๊ณผ๋ฅผ ์••์ถ•ํ•˜์—ฌ ๋ชจ๋ธ์— ํ”„๋กฌํ”„ํŠธ๋ฅผ ์ „๋‹ฌ
  • ๐Ÿง‘๐Ÿปโ€๐Ÿ’ป [Google DeepMind] TacticAI: an AI assistant for football tactics
    • ๋ฆฌ๋ฒ„ํ’€์˜ ๋ฐ์ดํ„ฐ๋ฅผ ํ™œ์šฉํ•ด์„œ ์ฝ”๋„ˆํ‚ฅ ๊ฒฐ๊ณผ๋ฅผ ์˜ˆ์ธกํ•˜๋Š” AI ๋ชจ๋ธ์„ ๊ฐœ๋ฐœ. ์ด์ „์—๋„ ๋ฆฌ๋ฒ„ํ’€ ๋ฐ์ดํ„ฐ๋ฅผ ํ™œ์šฉํ•œ ๊ฒฐ๊ณผ๊ฐ€ ์žˆ์—ˆ๋Š”๋ฐ ํ›„์†์ž‘์œผ๋กœ ๋‚˜์˜จ ๋“ฏํ•จ.
  • ๐Ÿ“œ [Google DeepMind] Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models (ICLRโ€™ 2024)
    • LLM์ด ์ฃผ์–ด์ง„ ๋ฌธ์ œ๋กœ๋ถ€ํ„ฐ high-level concept๊ณผ ์›์น™๋“ค์„ ์ถ”์ถœํ•ด๋‚ด๊ณ  ์ด๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ reasoning ํ•˜๋Š” Step-Back Prompting์„ ์ œ์•ˆ. ๊ฐ„๋‹จํžˆ ๋งํ•˜์ž๋ฉด Abstraction โ†’ Reasoning ๊ณผ์ •์„ ๊ฑฐ์นจ.
  • ๐Ÿ“œ [AI2] RewardBench: Evaluating Reward Models for Language Modeling
    • RLHF์— ๊ฐ€์žฅ ์ค‘์š”ํ•œ ์š”์†Œ ์ค‘ ํ•˜๋‚˜์ธ Reward Model์ด reward๋ฅผ ์ œ๋Œ€๋กœ ๋ฐ˜ํ™˜ํ•˜๊ณ  ์žˆ๋Š”์ง€ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋Š” ๋ฒค์น˜๋งˆํฌ๋ฅผ ๊ฐœ๋ฐœํ•˜์—ฌ ๊ณต๊ฐœ. prompt-win-lose trios ๋ฐ์ดํ„ฐ์…‹.
  • ๐Ÿ“œ LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
    • ๋‹ค์–‘ํ•œ Efficient fine-tuning ๊ธฐ๋ฒ•๋“ค์„ ๋‚ด์žฅ web UI LlamaBoard๋ฅผ ํ†ตํ•ด ์ฝ”๋”ฉํ•  ํ•„์š” ์—†์ด ๊ฐ„๋‹จํ•˜๊ณ  ํŽธ๋ฆฌํ•˜๊ฒŒ ์ ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ์†Œ๊ฐœ
  • ๐Ÿ“œ MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
    • ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ ๋ชจ๋ธ์ด ๊ทธ๋ฆผ์„ ์ •ํ™•ํžˆ ์ดํ•ดํ•˜๊ณ  ๋ฌธ์ œ๋ฅผ ํ‘ธ๋Š”์ง€ ํ™•์ธํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ๋žŒ์ด ์ง์ ‘ annotationํ•œ ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ 15K ๊ฐœ๋ฅผ ํฌํ•จํ•˜๋Š” MathVerse ๋ฒค์น˜๋งˆํฌ๋ฅผ ๊ณต๊ฐœ
  • ๐Ÿ“œ [KAIST] Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity
    • classifier (์‚ฌ์ด์ฆˆ๊ฐ€ ์ž‘์€ LM)์„ ํ†ตํ•ด query๋ฅผ straightforward/simple/complex query๋กœ ๊ตฌ๋ถ„ํ•˜๊ณ  ๊ฐ๊ฐ ๋‹ค๋ฅธ ๋ฐฉ์‹์œผ๋กœ retrieval์„ ์ˆ˜ํ–‰
  • ๐Ÿ“œ [Sakana AI] Evolutionary Optimization of Model Merging Recipes
    • ๋ชจ๋ธ merge์™€ ๊ด€๋ จํ•˜์—ฌ ์„ ํƒ๋œ ๋ชจ๋ธ๋“ค์˜ layer๋ฅผ ์ž๋™์ ์œผ๋กœ ๋ณ‘ํ•ฉํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ œ์‹œํ•จ.
5th week
  • 📜 Instructing Large Language Models to Identify and Ignore Irrelevant Conditions
    • A study of CoT prompting, which is widely used for Math Word Problems (MWP). Proposes a method called I3C that instructs the LLM to ignore irrelevant conditions. Makes me wonder whether this could be applied to RAG as well.
  • 📜 [Microsoft Research, CMU] Can large language models explore in-context?
    • Runs experiments with various prompt designs on GPT-3.5, GPT-4, and Llama 2, concluding that today's language models cannot behave robustly without substantial interventions (e.g., fine-tuning).
  • 🧑🏻‍💻 [Lightning AI] lightning-thunder
    • Releases a compiler that speeds up PyTorch LLM training by around 40%; usable in both single-accelerator and multi-GPU environments.
  • 📜 [Johns Hopkins, Yale, AI2] FOLLOWIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
    • Even when LLMs are used for information retrieval (IR), they have so far simply taken the query as input → proposes FollowIR, an instruction-following retrieval model.
  • 📜 [UC Berkeley] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
    • Train a baseline student LLM on the seed dataset → evaluate it and collect the cases it gets wrong → a teacher LLM generates synthetic data from these and adds it to the training data (a sketch of the loop follows below).
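
The LLM2LLM entry above is itself a training loop; a high-level sketch, with every callable a hypothetical placeholder:

```python
def llm2llm(seed_data: list, train, is_correct, teacher_generate,
            rounds: int = 3):
    """Iteratively grow the training set from the student's mistakes."""
    data = list(seed_data)
    for _ in range(rounds):
        student = train(data)                    # fine-tune the student LLM
        wrong = [ex for ex in seed_data          # collect seed examples it fails
                 if not is_correct(student, ex)]
        data += teacher_generate(wrong)          # teacher synthesizes new data
    return train(data)                           # final student
```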