XuexII's Stars

facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language: Python 30.4k 426 4.2k 6.4k
mem0ai/mem0
The Memory layer for your AI apps
Language: Python 22.6k 127 6712.1k
google-research/text-to-text-transfer-transformer
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Language: Python 6.2k 108 405756
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Language: Python 5.9k 58 600475
andrewyng/translation-agent
Language: Python 4.7k 52 15544
amazon-science/mm-cot
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
Language: Python 3.8k 56 54312
microsoft/LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
Language: Python 3.7k 54 121277
bigscience-workshop/promptsource
Toolkit for creating, sharing and using natural language prompts.
Language: Python 2.7k 33 162352
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Language: Python 2k 46 129143
facebookresearch/DPR
Dense Passage Retriever is a set of tools and models for open-domain Q&A tasks.
Language: Python 1.7k 23 210302
amazon-science/auto-cot
Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)
Language: Jupyter Notebook 1.5k 15 6139
gururise/AlpacaDataCleaned
Alpaca dataset from Stanford, cleaned and curated
Language: Python 1.5k 27 25151
AlibabaResearch/AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Language: C++ 1.5k 36 176173
reata/sqllineage
SQL Lineage Analysis Tool powered by Python
Language: Python 1.3k 22 309240
allenai/open-instruct
Language: Python 1.3k 16 115171
mlfoundations/dclm
DataComp for Language Models
Language: HTML 1.1k 38 61104
google-research/deduplicate-text-datasets
Language: Rust 1.1k 13 41111
facebookresearch/cc_net
Tools to download and cleanup Common Crawl data
Language: Python 969 23 44142
sooftware/conformer
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
Language: Python 955 9 37175
conversationai/perspectiveapi
Perspective is an API that uses machine learning models to score the perceived impact a comment might have on a conversation. See https://developers.perspectiveapi.com for more information.
888 50 0115
princeton-nlp/SimPO
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Language: Python 696 8 7045
arielnlee/Platypus
Code for fine-tuning Platypus family LLMs using LoRA
Language: Python 622 6 2460
xfactlab/orpo
Official repository for ORPO
Language: Python 419 6 2738
princeton-nlp/LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
Language: Jupyter Notebook 365 5 3430
itsnamgyu/reasoning-teacher
Official code for "Large Language Models Are Reasoning Teachers", ACL 2023
Language: Jupyter Notebook 305 5 1719
Aiden0526/SymbCoT
Code and data for the ACL 2024 paper "Faithful Logical Reasoning via Symbolic Chain-of-Thought".
Language: Python 154 2 415
microsoft/simulated-trial-and-error
Language: Python 116 3 612
gpt4life/alpagasus
Unofficial implementation of AlpaGasus
Language: Python 84 3 76
wangpf3/consistent-CoT-distillation
Language: Python 35 1 34
Glareone/GenAI-System-2-Attention-S2A-by-Meta
datasets from the paper "Towards Understanding Sycophancy in Language Models"
Language: Jupyter Notebook 1 0 0