Pinned Repositories
ALCE
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
DensePhrases
[ACL 2021] Learning Dense Representations of Phrases at Scale; [EMNLP 2021] Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.org/abs/2012.12624
LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
LM-BFF
[ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723
MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
PURE
[NAACL 2021] A Frustratingly Easy Approach for Entity and Relation Extraction https://arxiv.org/abs/2010.12812
SimCSE
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
SWE-agent
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.47% of bugs in the SWE-bench evaluation set and takes just 1.5 minutes to run.
SWE-bench
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Princeton Natural Language Processing's Repositories
princeton-nlp/SWE-agent
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.47% of bugs in the SWE-bench evaluation set and takes just 1.5 minutes to run.
princeton-nlp/tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
princeton-nlp/SimCSE
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
princeton-nlp/SWE-bench
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
princeton-nlp/MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
princeton-nlp/LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
princeton-nlp/ALCE
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
princeton-nlp/SimPO
SimPO: Simple Preference Optimization with a Reference-Free Reward
princeton-nlp/LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
princeton-nlp/AutoCompressors
[EMNLP 2023] Adapting Language Models to Compress Long Contexts
princeton-nlp/WebShop
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
princeton-nlp/intercode
[NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898
princeton-nlp/TransformerPrograms
[NeurIPS 2023] Learning Transformer Programs
princeton-nlp/CEPE
[ACL 2024] Long-Context Language Modeling with Parallel Encodings
princeton-nlp/QuRating
Selecting High-Quality Data for Training Language Models
princeton-nlp/LLMBar
[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following
princeton-nlp/USACO
Can Language Models Solve Olympiad Programming?
princeton-nlp/NLProofS
[EMNLP 2022] Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443
princeton-nlp/MQuAKE
[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
princeton-nlp/LM-Kernel-FT
A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643
princeton-nlp/c-sts
[EMNLP 2023] C-STS: Conditional Semantic Textual Similarity
princeton-nlp/Collie
[ICLR 2024] COLLIE: Systematic Construction of Constrained Text Generation Tasks
princeton-nlp/MABEL
[EMNLP 2022] MABEL: Attenuating Gender Bias using Textual Entailment Data https://arxiv.org/abs/2210.14975
princeton-nlp/LM-Science-Tutor
princeton-nlp/PTP
Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073
princeton-nlp/corpus-poisoning
[EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages https://arxiv.org/abs/2310.19156
princeton-nlp/lwm
We develop world models that can be adapted with natural language. Integrating these models into artificial agents allows humans to effectively control the agents through verbal communication.
princeton-nlp/Heuristic-Core
Code accompanying the paper "The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models" https://arxiv.org/abs/2403.03942
princeton-nlp/il-scaling-in-games
Official code repo of "Scaling Laws for Imitation Learning in NetHack"
princeton-nlp/MoQA