Xuanfang1121's Stars
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
SophonPlus/ChineseNlpCorpus
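Collecting, organizing, and publishing Chinese natural language processing corpora/datasets, to advance Chinese NLP together with like-minded contributors.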
microsoft/TaskWeaver
A code-first agent framework for seamlessly planning and executing data analytics tasks.
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
facebookresearch/large_concept_model
Large Concept Models: Language modeling in a sentence representation space
juand-r/entity-recognition-datasets
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
MeetKai/functionary
Chat language model that can use tools and interpret the results
hymie122/RAG-Survey
Collecting awesome papers on RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
Tencent/Tencent-Hunyuan-Large
sihyun-yu/REPA
Official PyTorch Implementation of Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
shibing624/ChatPDF
RAG for Local LLM, chat with PDF/doc/txt files, ChatPDF. A pure native RAG implementation built on a local LLM, an embedding model, and a reranker model, with no third-party agent library required.
epfLLM/Megatron-LLM
Distributed trainer for LLMs
CLUEbenchmark/FewCLUE
FewCLUE: a few-shot learning evaluation benchmark, Chinese version
zhilizju/Awesome-instruction-tuning
A curated list of awesome instruction tuning datasets, models, papers and repositories.
IAAR-Shanghai/Awesome-Attention-Heads
An awesome repository and a comprehensive survey on the interpretability of LLM attention heads.
alibaba/ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
rail-berkeley/crossformer
OpenNLPLab/lightning-attention
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
alipay/financial_evaluation_dataset
deepseek-ai/ESFT
Expert Specialized Fine-Tuning
wangyuxinwhy/generate
A Python Package to Access World-Class Generative Models
goombalab/hydra
Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"
LeapLabTHU/Deep-Incubation
Code release for Deep Incubation (https://arxiv.org/abs/2212.04129)
test-time-training/ttt-lm-kernels
Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Cranial-XIX/longhorn
Official PyTorch Implementation of the Longhorn Deep State Space Model
DACUS1995/pytorch-mmap-dataset
A custom pytorch Dataset extension that provides a faster iteration and better RAM usage
junkangwu/beta-DPO
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
RenzeLou/Muffin
MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following