DAVIAN Lab. Seminar (2024)

Review

  • One-hour in-depth review per paper
Date Topic Presenter Video
01.25 2024 Diffusion Model Alignment Using Direct Preference Optimization 형준하 Video
01.18 2024 ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings 백유진 Video
01.11 2024 DayDreamer: World Models for Physical Robot Learning 이병근 Video
01.04 2024 Computer Vision in The Wild 송준하 Video
12.21 2023 Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting 최민석 Slide
12.07 2023 DALL-E 3: Improving Image Generation with Better Captions 황성원 Video
~ 2023 Link

Sprint

  • Five-minute quick review per paper
Date Topic Presenter Video
01.25 2024 Bad Students Make Great Teachers
Rethinking FID: Towards a Better Evaluation Metric for Image Generation
InstantID: Zero-shot Identity-Preserving Generation in Seconds
How Are AI Cover Songs Made?
박민호
조영우
Video
01.18 2024 Tokenizer is Key to Visual Generation
Divide and not forget: Ensemble of selectively trained experts in Continual Learning
FITS: Modeling Time Series with 10k Parameters
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
Pixart-alpha and Pixart-delta
Generative Models: What do they know? Do they know things?
Instruct-Imagen: Image Generation with Multi-modal Instruction
MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation
Boundary Attention: Learning to Find Faint Boundaries at Any Resolution
TrustLLM: Trustworthiness in Large Language Models
Tuning Language Models by Proxy
Improving Text Embeddings with Large Language Models
조호준
윤주열
박준우
최승환
Video
01.11 2024 Are Emergent Abilities of Large Language Models a Mirage?
Scaling Data-Constrained Language Models
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Mixtral of Experts
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
LLaMA Pro: Progressive LLaMA with Block Expansion
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
FreeU: Free Lunch in Diffusion U-Net
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models
ITI-GEN: Inclusive Text-to-Image Generation
Fair Text-to-Image Diffusion via Fair Mapping
양소영
정하원
정채연
김정호
Video
01.04 2024 Siamese Masked Autoencoders
Learning to Reason and Memorize with Self-Notes
Video Prediction Models as Rewards for Reinforcement Learning
Pixel Aligned Language Models
Gradient-based Parameter Selection for Efficient Fine-Tuning
SegGPT: Segmenting Everything In Context
Gemini vs GPT-4V: A Preliminary Comparison and Combination
Large Language Model Bias Index
GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models
I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Model
DreamTuner: Single Image is Enough for Subject-Driven Generation
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
이승일
이상현
황동윤
정소현
Video
12.21 2023 ERM++: An Improved Baseline for Domain Generalization
DATACOMP: In search of the next generation of multimodal datasets
AI2. Does progress on ImageNet transfer to real-world datasets?
Aligning Large Language Models through Synthetic Feedback
Self-Evaluation Improves Selective Generation in Large Language Models
Large Language Models as Optimizers
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
이도현
조영우
임혜수
최새미
Video
12.14 2023 Analyzing and Improving the Training Dynamics of Diffusion Models
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
DreaMoving: A Human Video Generation Framework based on Diffusion Models
Vision Transformers Need Registers
DeepCache: Accelerating Diffusion Models for Free
Kandinsky 3.0 Technical Report
FreeInit: Bridging Initialization Gap in Video Diffusion Models
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
The mechanistic basis of data dependence and abrupt learning in an in-context classification task
Meta Continual Learning Revisited: Implicitly Enhancing Online Hessian Approximation via Variance Reduction
LRM: Large Reconstruction Model for Single Image to 3D
최승환
박민호
박준우
김태성
Video
12.07 2023 Towards Accurate Differential Diagnosis with Large Language Models
Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models
Communicative Agents for Software Development
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
Adversarial Diffusion Distillation
Training Chain-of-Thought via Latent-Variable Inference
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
GAIA: A Benchmark for General AI Assistants
FaceStudio: Put Your Face Everywhere in Seconds
ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation
Describing Differences in Image Sets with Natural Language
조호준
윤주열
김진희
Video
~ 2023 Link