liziniu
Ph.D. student at The Chinese University of Hong Kong, Shenzhen.
The Chinese University of Hong Kong, Shenzhen · Shenzhen
liziniu's Stars
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
h2oai/h2ogpt
Private chat with a local GPT over documents, images, video, and more. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
LargeWorldModel/LWM
Large World Model -- modeling text and video with millions of tokens of context
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Future-House/paper-qa
High-accuracy RAG for answering questions from scientific documents, with citations
pytorch/torchtune
PyTorch native post-training library
TransformerLensOrg/TransformerLens
A library for mechanistic interpretability of GPT-style language models
evalplus/evalplus
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
vectara/hallucination-leaderboard
Leaderboard comparing how often LLMs hallucinate when summarizing short documents
microsoft/ToRA
ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].
lucidrains/muse-maskgit-pytorch
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in PyTorch
mlfoundations/MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
QwenLM/Qwen2.5-Math
A series of math-specific large language models built on the Qwen2 series.
haoliuhl/ringattention
Large Context Attention
hsiehjackson/RULER
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
zhuzilin/ring-flash-attention
Ring attention implementation with flash attention
meta-math/MetaMath
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
facebookresearch/searchformer
Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".
tech-srl/RASP
An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers"
alibaba/ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
mega002/lm-debugger
The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models.
openpsi-project/ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
google-deepmind/loft
LOFT: A 1 Million+ Token Long-Context Benchmark
princeton-nlp/ProLong
Homepage for ProLong (Princeton long-context language models) and the paper "How to Train Long-Context Language Models (Effectively)"
rachtibat/LRP-eXplains-Transformers
Layer-Wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024]
OpenBMB/OlympiadBench
[ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems.
ucl-dark/llm_debate
Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"
ZitongYang/Synthetic_Continued_Pretraining
Code implementation of synthetic continued pretraining
RulinShao/RAG-evaluation-harnesses
An evaluation suite for Retrieval-Augmented Generation (RAG).
natanaelwf/LLMTest_FindTheOrigin
Testing reasoning degradation in LLMs with variable context windows and information organization.