tongyao-zhu's Stars
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and Q&A, plus a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama for WhatsApp & Messenger.
RUCAIBox/LLMSurvey
The official GitHub page for the survey paper "A Survey of Large Language Models".
arcee-ai/mergekit
Tools for merging pretrained large language models.
open-compass/opencompass
OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama 3, Mistral, InternLM2, GPT-4, LLaMA 2, Qwen, GLM, Claude, etc.) on over 100 datasets.
microsoft/LMOps
General technology for enabling AI capabilities with LLMs and MLLMs
lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in PyTorch
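The core operation a vector-quantization layer performs can be sketched in a few lines (a generic, from-scratch illustration, not this repo's API): each input vector is replaced by its nearest codebook entry, and the entry's index is what gets stored or transmitted.

```python
# Minimal vector-quantization sketch: nearest codebook entry by L2 distance.
# `quantize` and the toy codebook below are illustrative, not from the repo.
def quantize(vec, codebook):
    """Return (index, code) of the codebook vector closest to `vec`."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda i: sqdist(vec, codebook[i]))
    return idx, codebook[idx]

codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]]
idx, code = quantize([0.9, 1.2], codebook)  # idx == 1, code == [1.0, 1.0]
```

Real VQ layers (as in this repo) do the same lookup batched on GPU and additionally pass gradients through with a straight-through estimator so the codebook can be trained.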
RUC-NLPIR/FlashRAG
⚡FlashRAG: A Python Toolkit for Efficient RAG Research
Xnhyacinth/Awesome-LLM-Long-Context-Modeling
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
THUDM/LongBench
[ACL 2024] LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
datamllab/LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
texttron/tevatron
Tevatron - A flexible toolkit for neural retrieval research and development.
catid/self-discover
Implementation of Google's SELF-DISCOVER
neulab/knn-transformers
PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an implementation of kNN-LM and kNN-MT
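The kNN-LM idea underlying this repo can be sketched as follows (a generic illustration in plain Python, not the repo's API): the LM's next-token distribution is interpolated with a distribution built from nearest neighbors retrieved from a datastore, where closer neighbors get exponentially more weight.

```python
import math

def knnlm_interpolate(p_lm, neighbors, lam=0.25, temperature=1.0):
    """Mix an LM distribution with a kNN distribution (illustrative sketch).

    p_lm: dict token -> probability from the base language model.
    neighbors: list of (token, distance) pairs retrieved from the datastore.
    p_knn(token) is proportional to sum of exp(-distance / temperature)
    over the retrieved neighbors carrying that token.
    """
    weights = {}
    for tok, dist in neighbors:
        weights[tok] = weights.get(tok, 0.0) + math.exp(-dist / temperature)
    total = sum(weights.values())
    p_knn = {tok: w / total for tok, w in weights.items()}
    # Final distribution: lambda * p_knn + (1 - lambda) * p_lm.
    return {tok: lam * p_knn.get(tok, 0.0) + (1 - lam) * p
            for tok, p in p_lm.items()}

p_lm = {"cat": 0.6, "dog": 0.4}
neighbors = [("dog", 1.0), ("dog", 2.0), ("cat", 3.0)]
mixed = knnlm_interpolate(p_lm, neighbors)
```

In the toy run above, "dog" dominates the retrieved neighbors, so its probability rises above the base LM's 0.4; the interpolation weight `lam` and `temperature` are the usual tunable hyperparameters (values here are arbitrary).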
booydar/babilong
BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.
chentong0/factoid-wiki
Dense X Retrieval: What Retrieval Granularity Should We Use?
RulinShao/retrieval-scaling
Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".
sail-sg/sailor-llm
[EMNLP-2024] ⚓️ Sailor: Open Language Models for South-East Asia
jxmorris12/bm25_pt
Minimal PyTorch implementation of BM25 (with sparse tensors)
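For reference, the standard BM25 scoring formula that such implementations compute can itself be sketched in plain Python (a generic illustration; `k1` and `b` are the conventional BM25 parameter names, and the function is not this repo's API):

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query with Okapi BM25.

    corpus: list of tokenized documents, used for IDF statistics and
    the average document length. Illustrative sketch, not the repo's API.
    """
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)         # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)  # smoothed idf
        tf = doc.count(term)                             # term frequency in doc
        score += idf * tf * (k1 + 1) / (
            tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["the", "cat", "sat"], ["the", "dog", "ran"], ["a", "fish"]]
```

A document containing a query term scores positively while one without it scores zero; a PyTorch version like this repo's vectorizes the same formula over a sparse term-frequency matrix.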
nkandpa2/long_tail_knowledge
Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"
joeljang/temporalwiki
[EMNLP 2022] TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
frankxu2004/knnlm-why
Repo for the ICML 2023 paper "Why Do Nearest Neighbor Language Models Work?"
October2001/ProLong
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
bbuing9/ICLR24_SuRe
Official Code for the paper "SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs" (ICLR 2024)
yuzhaouoe/pretraining-data-packing
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
yasumasaonoe/entity_knowledge_propagation
YisongMiao/DiSQ-Score
The dataset and official implementation for "Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models' Understanding of Discourse Relations" @ ACL 2024
trestad/mitigating-reversal-curse
Code for the paper "Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse"
amy-hyunji/lora-for-retrieval
xiangyue9607/C-MORE
Code for the ACL2022 paper "C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References"
Fantabulous-J/Self-Training-DPR