Pinned Repositories
AI-Scientist
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
awesome-os-paper
A list of awesome traditional (core) OS papers from before the LLM era
cs703-sqlizer
Replicate the SQLizer work
cs739-errceph
cs739-osdvisual
Potential Ceph widget to show the hierarchical structure of OSDs in a cluster
DivorceGRE
How I divorced GRE in two weeks (with a decent score)
FinalProjectShowcases
Maybe some cool things happening?
FlexFlashAttention3
FlexAttention w/ FlashAttention3 Support
DistServe
Disaggregated serving system for Large Language Models (LLMs).
aici
AICI: Prompts as (Wasm) Programs
GindaChen's Repositories
GindaChen/FlexFlashAttention3
FlexAttention w/ FlashAttention3 Support
GindaChen/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
GindaChen/aici
AICI: prompts as programs
GindaChen/applegpuinfo
Print all known information about the GPU on Apple-designed chips
GindaChen/cognify
The Multi-Faceted Optimizer for GenAI Workflows
GindaChen/ell
A language model programming framework.
GindaChen/flashinfer
FlashInfer: Kernel Library for LLM Serving
GindaChen/GindaChen
Short Profile about @GindaChen
GindaChen/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
GindaChen/jetson-containers
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
GindaChen/LiveBench
LiveBench: A Challenging, Contamination-Free LLM Benchmark
GindaChen/llama-stack
Model components of the Llama Stack APIs
GindaChen/LLM-Workshop
LLM Workshop by Sourab Mangrulkar
GindaChen/LLMCompass
GindaChen/LoongServe
GindaChen/marc
Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"
GindaChen/Marco-o1
An Open Large Reasoning Model for Real-World Solutions
GindaChen/Open-O1
GindaChen/PASTA
PASTA: Post-hoc Attention Steering for LLMs
GindaChen/prompt-lib
A set of utilities for running few-shot prompting experiments on large-language models
GindaChen/pycachesim
Python Cache Hierarchy Simulator
GindaChen/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
GindaChen/reflexion
[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning
GindaChen/scratch-finetuner
s
GindaChen/self-refine
LLMs can generate feedback on their work, use it to improve the output, and repeat this process iteratively.
GindaChen/SelfEval-Guided-Decoding
GindaChen/T-MAC
Low-bit LLM inference on CPU with lookup table
GindaChen/Thinking-Claude
Let your Claude be able to think
GindaChen/transfusion-pytorch
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
GindaChen/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention