manestay's Stars
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
microsoft/WSL
Issues found on WSL
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
deepspeedai/DeepSpeedExamples
Example models using DeepSpeed
ssut/py-googletrans
(unofficial) Googletrans: Free and Unlimited Google translate API for Python. Translates totally free of charge.
datadreamer-dev/DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
zhudotexe/kani
kani (カニ) is a highly hackable microframework for chat-based language models with tool use/function calling. (NLP-OSS @ EMNLP 2023)
brucewlee/lftk
[BEA @ ACL 2023] General-purpose tool for linguistic feature extraction; tested on readability assessment, essay scoring, fake news detection, hate speech detection, etc.
AkariAsai/XORQA
This is the official repository for the NAACL 2021 paper "XOR QA: Cross-lingual Open-Retrieval Question Answering".
zhudotexe/redel
ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo)
zhudotexe/fanoutqa
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024)
BunsenFeng/PoliLean
Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models". ACL 2023. Best Paper Award.
blender-nlp/SmartBook
amazon-science/street-reasoning
STREET: a multi-task and multi-step reasoning dataset
project-miracl/nomiracl
NoMIRACL: A multilingual hallucination evaluation dataset to evaluate LLM robustness in RAG against first-stage retrieval errors on 18 languages.
WadeYin9712/GeoMLAMA
tareknaous/camel
CAMeL Dataset
edchengg/easyproject
ACL 2023 (Findings): End-to-end Cross-lingual Label Projection
shadowkiller33/Language_attack
A repository for LLM jailbreaking
fladhak/creative-summ-data
ubisoft/ubisoft-laforge-BinaryAlignWordAlignementasBinarySequenceLabeling
Repository for BinaryAlign: Word Alignment as Binary Sequence Labeling
amazon-science/xstreet
manestay/borderlines
Repository for the NAACL 2024 paper "This Land is {Your, My} Land: Evaluating Geopolitical Biases in Language Models"
afedotowaa/authorship_attribution
ahwang16/grounded-intuition-gpt-vision
Resources for Grounded Intuition of GPT-Vision's Abilities with Scientific Images
artemisp/balance-my-slurm
A user-friendly load-balancing script that enables users to optimize models for brief periods, ensuring equitable resource allocation without requiring administrative intervention
GateNLP/wpextract
Create datasets from WordPress sites for research or archiving
Info-Sync/InfoSync
Implementation of the semi-structured inference model in our ACL 2023 paper: INFOSYNC: Information Synchronization across Multilingual Semi-structured Tables.
manestay/paxqa
Code and Data for "PAXQA: Generating Cross-lingual Question Answering Examples at Training Scale" (EMNLP 2023)