koalazf99

AI Research @GAIR-NLP | Ex @microsoft, @xlang-ai

Shanghai Jiao Tong University, Shanghai

koalazf99's Stars

karpathy/LLM101n
LLM101n: Let's build a Storyteller
30.7k 2.5k 0 1.7k
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
Language: Python 19.7k 134 1.2k 1.4k
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python 14.8k 123 1.2k 1.4k
adbar/trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Language: Python 3.8k 30 391267
CosmosShadow/gptpdf
Using GPT to parse PDF
Language: Python 3.1k 12 40229
mistralai/mistral-finetune
Language: Python 2.8k 40 39239
microsoft/Phi-3CookBook
This is a Phi-3 book for getting started with Phi-3. Phi-3, a family of open sourced AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks.
Language: Jupyter Notebook 2.6k 17 84285
deepseek-ai/DeepSeek-Coder-V2
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
2.4k 24 51127
jackmpcollins/magentic
Seamlessly integrate LLMs as Python functions
Language: Python 2.1k 11 80107
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Language: Python 1.8k 21 70118
OpenLLMAI/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
Language: Python 1.8k 21 179168
mlfoundations/dclm
DataComp for Language Models
Language: HTML 1.2k 37 68108
GAIR-NLP/anole
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Language: Python 698 11 4536
magpie-align/magpie
Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!
Language: Python 538 5 3157
google/aqt
Language: Python 272 6 3227
bigcode-project/bigcodebench
BigCodeBench: Benchmarking Code Generation Towards AGI
Language: Python 256 6 4730
leanprover/vscode-lean4
Visual Studio Code extension for the Lean 4 proof assistant
Language: TypeScript 173 11 20852
zhaoyu-li/DL4TP
[COLM 2024] A Survey on Deep Learning for Theorem Proving
151 7 0 10
keirp/OpenWebMath
Language: XSLT 131 3 4 8
xlang-ai/Spider2-V
[NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
Language: Jupyter Notebook 113 4 1 7
sail-sg/regmix
🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
Language: Jupyter Notebook 95 5 10 5
GAIR-NLP/OlympicArena
This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI"
Language: JavaScript 89 4 6 4
epfml/schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
Language: Python 62 3 0 2
ChenWu98/agent-attack
[Arxiv 2024] Adversarial attacks on multimodal agents
Language: Python 48 2 0 5
LLM360/k2-train
Language: Python 39 5 2 6
GAIR-NLP/MoPS
[ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation"
Language: Jupyter Notebook 32 2 0 1
koalazf99/Awesome-DataCentric-LLM
Trending projects & awesome papers about data-centric llm studies.
32 3 0 2
crux-eval/eval-arena
Language: Python 22 2 0 3
zhxieml/remiss-jailbreak
Language: Python 21 1 0 0
young-geng/tpu_pod_commander
TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods.
Language: Python 17 1 0 0
