Welcome to the Huggingface Reading Group! The goal of this group is to hold a weekly presentation on a research paper or a group of related papers. This repository compiles the write-ups and recordings of all past presentations.
This group was started by Huggingface community member James Kelly on 09/26/2023. In the beginning, we "presented" by summarizing papers in Discord threads, but on 1/12/2024 we started doing presentations in Discord calls, thanks to Phil Butler. The presentations are generally targeted at a general audience and focus on generative models, but no research paper is off limits.
Presenter: James Kelly
Paper: Ambiguity-Aware In-Context Learning with Large Language Models
Presenter: James Kelly
Paper: Controlling Neural Networks with Rule Representations (NeurIPS, 2021)
Presenter: Isamu Isozaki
Paper: InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation
Presenter: Isamu Isozaki
Papers: Text Embeddings Reveal (Almost) As Much As Text + NEFTune: Noisy Embeddings Improve Instruction Finetuning
Presenter: Vsevolod I. Avrutskiy. Author of the paper
Paper: Training Image Derivatives: Increased Accuracy and Universal Robustness
Presenter: Isamu Isozaki
Paper: Zephyr: Direct Distillation of LM Alignment
6: Literature Review on RAG (Retrieval Augmented Generation) for Custom Domains (Presented on 11/29/2023)
Presenter: Isamu Isozaki
Papers: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks + Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering + RA-DIT: Retrieval-Augmented Dual Instruction Tuning
7: Understanding MagVIT2: Language Model Beats Diffusion: Tokenizer is key to visual generation (Presented on 12/13/2023)
Presenter: Isamu Isozaki
Paper: Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
8: Understanding Common Diffusion Noise Schedules and Sample Steps are Flawed (Presented on 12/21/2023)
Presenter: Isamu Isozaki
Paper: Common Diffusion Noise Schedules and Sample Steps are Flawed
9: The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey (Presented on 1/5/2024)
Presenter: Dhruv Dhamani. Author of the paper
Paper: The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey
10: Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation (Presented on 1/12/2024)
Presenter: Phil Butler
Paper: Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Unfortunately, there is no recording, but a coauthor attended.
Presenter: Isamu Isozaki
Papers: On the acceptability of arguments and its fundamental role in non-monotonic reasoning, logic programming, and n-person games + An Answer Set Programming Approach to Argumentative Reasoning in the ASPIC+ Framework + HYPO’s legacy: introduction to the virtual special issue + Induction of Defeasible Logic Theories in the Legal Domain + Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset + Large Language Models in Law: A Survey + The Smart Court - A New Pathway to Justice in China?
12: A forthcoming decoder-only foundation model for time-series forecasting & further research (Presented on 2/9/2024)
Presenter: Tonic
Paper: A decoder-only foundation model for time-series forecasting
Presenter: Eric Auld
Paper: Mamba: Linear-Time Sequence Modeling with Selective State Spaces
14: Neural Circuit Diagrams: Robust Diagrams for the Communication, Implementation, and Analysis of Deep Learning Architectures
Presenter: Vincent Abbott. Author of the paper
Presenter: Prateek Yadav. Author of TIES-Merging and ComPEFT
Papers: TIES-Merging: Resolving Interference When Merging Models + Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch + ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization + Learning to Route Among Specialized Experts for Zero-Shot Generalization
Presenter: Shashank Shekhar
Papers: Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context + Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference + Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity + Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Presenter: Harvie Zhang. Author of the paper
Paper: HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction
Presenter: Dan Ofer. Author of the papers
Papers: ProteinBERT: A universal deep-learning model of protein sequence and function + Detecting anomalous proteins using deep representations + Protein Language Models Expose Viral Mimicry and Immune Escape
I was absent from this meeting, so if anyone knows the details, please let me know or open a PR to fill in this part!
Paper: Just Say the Name: Online Continual Learning with Category Names Only via Data Generation
Presenter: Isamu Isozaki
Papers: Graph Machine Learning in the Era of Large Language Models (LLMs) + Large Language Models on Graphs: A Comprehensive Survey + House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation
Presenter: Isamu Isozaki
Papers: GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence + Creating Suspenseful Stories: Iterative Planning with Large Language Models + Improving Pacing in Long-Form Story Planning + Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives + Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers + DOC: Improving Long Story Coherence With Detailed Outline Control + End-to-end Story Plot Generator + Weaver: Foundation Models for Creative Writing
Presenter: starrynightdev
Papers: Accurate structure prediction of biomolecular interactions with AlphaFold 3 + Highly accurate protein structure prediction with AlphaFold
Write-ups: Huggingface blog + GitHub blog
Presenter: PS_Venom
Papers: Hamiltonian Neural Networks + Lagrangian Neural Networks
Presenter: Isamu Isozaki
Papers: Natural Language Reasoning, A Survey + Emergent Abilities of Large Language Models + Chain-of-Thought Prompting Elicits Reasoning in Large Language Models + Finetuned Language Models Are Zero-Shot Learners + Show Your Work: Scratchpads for Intermediate Computation with Language Models + Language Models (Mostly) Know What They Know + Tree of Thoughts: Deliberate Problem Solving with Large Language Models + Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models + Boosting Logical Reasoning in Large Language Models through a New Framework: The Graph of Thought + Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters + Large Language Models Can Be Easily Distracted by Irrelevant Context + Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting + Large Language Models Cannot Self-Correct Reasoning Yet + The Impact of Reasoning Step Length on Large Language Models + Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning + Efficient Tool Use with Chain-of-Abstraction Reasoning + Self-playing Adversarial Language Game Enhances LLM Reasoning
Presenter: Franz Louis Cesista. Author of the paper
Paper: Multimodal Structured Generation: CVPR's 2nd MMFM Challenge Technical Report
Presenter: Rishit Dagli. First author of the paper
Paper: SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
Presenter: Isamu Isozaki, Manil Shrestha
Papers: PentestGPT: An LLM-empowered Automatic Penetration Testing Tool + LLM Agents can Autonomously Hack Websites + LLM Agents can Autonomously Exploit One-day Vulnerabilities + Teams of LLM Agents can Exploit Zero-Day Vulnerabilities + LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks + AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks