DAVIAN-Lab/Paper-study

DAVIAN Lab. Seminar (2024)

Review

One-hour in-depth review per paper

Date	Topic	Presenter	Video
01.25 2024	Diffusion Model Alignment Using Direct Preference Optimization	형준하	Video
01.18 2024	ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings	백유진	Video
01.11 2024	DayDreamer: World Models for Physical Robot Learning	이병근	Video
01.04 2024	Computer Vision in The Wild	송준하	Video
12.21 2023	Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting	최민석	Slide
12.07 2023	DALL-E 3: Improving Image Generation with Better Captions	황성원	Video
~ 2023	Link

Sprint

Five-minute quick review per paper

Date	Topic	Presenter	Video
01.25 2024	Bad Students Make Great Teachers; Rethinking FID: Towards a Better Evaluation Metric for Image Generation; InstantID: Zero-shot Identity-Preserving Generation in Seconds; How Are AI Cover Songs Made?	박민호 조영우	Video
01.18 2024	Tokenizer is Key to Visual Generation; Divide and not forget: Ensemble of selectively trained experts in Continual Learning; FITS: Modeling Time Series with 10k Parameters; ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs; Pixart-alpha and Pixart-delta; Generative Models: What do they know? Do they know things?; Instruct-Imagen: Image Generation with Multi-modal Instruction; MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation; Boundary Attention: Learning to Find Faint Boundaries at Any Resolution; TrustLLM: Trustworthiness in Large Language Models; Tuning Language Models by Proxy; Improving Text Embeddings with Large Language Models	조호준 윤주열 박준우 최승환	Video
01.11 2024	Are Emergent Abilities of Large Language Models a Mirage?; Scaling Data-Constrained Language Models; Direct Preference Optimization: Your Language Model is Secretly a Reward Model; Mixtral of Experts; SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling; LLaMA Pro: Progressive LLaMA with Block Expansion; Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets; FreeU: Free Lunch in Diffusion U-Net; Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation; TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models; ITI-GEN: Inclusive Text-to-Image Generation; Fair Text-to-Image Diffusion via Fair Mapping	양소영 정하원 정채연 김정호	Video
01.04 2024	Siamese Masked Autoencoders; Learning to Reason and Memorize with Self-Notes; Video Prediction Models as Rewards for Reinforcement Learning; Pixel Aligned Language Models; Gradient-based Parameter Selection for Efficient Fine-Tuning; SegGPT: Segmenting Everything In Context; Gemini vs GPT-4V: A Preliminary Comparison and Combination; Large Language Model Bias Index; GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models; I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Model; DreamTuner: Single Image is Enough for Subject-Driven Generation; StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation	이승일 이상현 황동윤 정소현	Video
12.21 2023	ERM++: An Improved Baseline for Domain Generalization; DATACOMP: In search of the next generation of multimodal datasets; AI2. Does progress on ImageNet transfer to real-world datasets?; Aligning Large Language Models through Synthetic Feedback; Self-Evaluation Improves Selective Generation in Large Language Models; Large Language Models as Optimizers; EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything; SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing; Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision	이도현 조영우 임혜수 최새미	Video
12.14 2023	Analyzing and Improving the Training Dynamics of Diffusion Models; VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence; Cache Me if You Can: Accelerating Diffusion Models through Block Caching; DreaMoving: A Human Video Generation Framework based on Diffusion Models; Vision Transformers Need Registers; DeepCache: Accelerating Diffusion Models for Free; Kandinsky 3.0 Technical Report; FreeInit: Bridging Initialization Gap in Video Diffusion Models; Alpha-CLIP: A CLIP Model Focusing on Wherever You Want; The mechanistic basis of data dependence and abrupt learning in an in-context classification task; Meta Continual Learning Revisited: Implicitly Enhancing Online Hessian Approximation via Variance Reduction; LRM: Large Reconstruction Model for Single Image to 3D	최승환 박민호 박준우 김태성	Video
12.07 2023	Towards Accurate Differential Diagnosis with Large Language Models; Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models; Communicative Agents for Software Development; IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models; LCM-LoRA: A Universal Stable-Diffusion Acceleration Module; Adversarial Diffusion Distillation; Training Chain-of-Thought via Latent-Variable Inference; The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning; GAIA: A Benchmark for General AI Assistants; FaceStudio: Put Your Face Everywhere in Seconds; ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation; Describing Differences in Image Sets with Natural Language	조호준 윤주열 김진희	Video
~ 2023	Link