xizexi's Stars
boyu-ai/Hands-on-RL
https://hrl.boyuai.com/
for-ai/iterative-data-selection
kyutai-labs/moshi
unit-mesh/build-your-ai-coding-assistant
《AI 研发提效:构建 AI 辅助编码助手》 —— 介绍如何 DIY 一个端到端(从 IDE 插件、模型选型、数据集构建到模型微调)的 AI 辅助编程工具,类似于 GitHub Copilot、JetBrains AI Assistant、AutoDev 等。
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
pytorch-labs/gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
thuiar/TEXTOIR
TEXTOIR is the first opensource toolkit for text open intent recognition. (ACL 2021)
FlagOpen/Infinity-Instruct
gpt-omni/mini-omni
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
OpenSenseNova/piccolo-embedding
code for piccolo embedding model from SenseTime
sail-sg/regmix
🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
SqueezeAILab/LLM2LLM
[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
openai/openai-python
The official Python library for the OpenAI API
nlpxucan/WizardLM
LLMs build upon Evol Insturct: WizardLM, WizardCoder, WizardMath
hkust-nlp/deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
princeton-nlp/LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
tianyi-lab/Superfiltering
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
tianyi-lab/Cherry_LLM
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
PKU-Baichuan-MLSystemLab/PAS
wdndev/llm_interview_note
主要记录大语言大模型(LLMs) 算法(应用)工程师相关的知识及面试题
banksy23/XCoder
adbar/trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
CASIA-LM/MoDS
UKPLab/sentence-transformers
State-of-the-Art Text Embeddings
modelscope/data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷为大模型提供更高质量、更丰富、更易”消化“的数据!
ksm26/Function-Calling-and-Data-Extraction-with-LLMs
Master the techniques of function-calling and structured data extraction with LLMs. Learn to enhance LLM capabilities, integrate web services, and build practical applications for real-world data usability.
poloclub/transformer-explainer
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
shizhediao/Post-Training-Data-Flywheel
We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs.
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.