liziniu
Ph.D. student at The Chinese University of Hong Kong, Shenzhen.
The Chinese University of Hong Kong, Shenzhen · Shenzhen
liziniu's Stars
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
h2oai/h2ogpt
Private chat with a local GPT over documents, images, video, and more. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
LargeWorldModel/LWM
Large World Model -- modeling text and video with millions of tokens of context
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Future-House/paper-qa
High-accuracy RAG for answering questions from scientific documents, with citations
pytorch/torchtune
PyTorch native post-training library
TransformerLensOrg/TransformerLens
A library for mechanistic interpretability of GPT-style language models
evalplus/evalplus
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
vectara/hallucination-leaderboard
Leaderboard comparing how often LLMs hallucinate when summarizing short documents
microsoft/ToRA
ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].
lucidrains/muse-maskgit-pytorch
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in PyTorch
mlfoundations/MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
QwenLM/Qwen2.5-Math
A series of math-specific large language models built on the Qwen2 series.
haoliuhl/ringattention
Large Context Attention
hsiehjackson/RULER
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
zhuzilin/ring-flash-attention
Ring attention implementation with flash attention
meta-math/MetaMath
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
facebookresearch/searchformer
Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".
tech-srl/RASP
An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers"
alibaba/ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
mega002/lm-debugger
The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models.
openpsi-project/ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
google-deepmind/loft
LOFT: A 1 Million+ Token Long-Context Benchmark
princeton-nlp/ProLong
Homepage for ProLong (Princeton long-context language models) and the paper "How to Train Long-Context Language Models (Effectively)"
rachtibat/LRP-eXplains-Transformers
Layer-Wise Relevance Propagation for Large Language Models and Vision Transformers [ICML 2024]
OpenBMB/OlympiadBench
[ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems.
ucl-dark/llm_debate
Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"
ZitongYang/Synthetic_Continued_Pretraining
Code implementation of synthetic continued pretraining
RulinShao/RAG-evaluation-harnesses
An evaluation suite for Retrieval-Augmented Generation (RAG).
natanaelwf/LLMTest_FindTheOrigin
Testing reasoning degradation in LLMs with variable context windows and information organization.