Pinned Repositories
VPTQ
VPTQ: a flexible, extreme low-bit quantization algorithm
3DFNoC
3D NoC Emulation Model on a Single FPGA
abliterator
A simple Python library for ablating features in LLMs supported by TransformerLens
Accel-NASBench
Accel-NASBench: A Surrogate Benchmark for Accelerator-Aware NAS
AdderNet
Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"
AI-Youtube-Shorts-Generator
A Python tool that uses GPT-4, FFmpeg, and OpenCV to automatically analyze videos, extract the most interesting sections, and crop them for an improved viewing experience.
Ampere_Persistent_Cache_Eval
AX6S-unlock
tvm-models-baseline
YangWang92.github.io
YangWang92's Repositories
YangWang92/ao
PyTorch native quantization and sparsity for training and inference
YangWang92/FractalTensor
YangWang92/Megatron-LM-rocm-fork
Ongoing research training transformer models at scale
YangWang92/CodeIO
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
YangWang92/commercial_thermal_map_dataset
YangWang92/DeepSeek-V3
YangWang92/EASIER
Efficient Auto-scalable Scientific Infrastructure for Engineers and Researchers
YangWang92/flute
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
YangWang92/grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM on ROCm
YangWang92/large_concept_model
Large Concept Models: Language modeling in a sentence representation space
YangWang92/Liger-Kernel
Efficient Triton Kernels for LLM Training
YangWang92/llama3_interpretability_sae
A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.
YangWang92/Marco-o1
An Open Large Reasoning Model for Real-World Solutions
YangWang92/mfu_calculation
A simple MFU (model FLOPs utilization) calculation for LLMs.
YangWang92/MiniMax-01
YangWang92/ml-mobileclip
This repository contains the official implementation of the CVPR 2024 paper "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training"
YangWang92/occamy
A high-efficiency system-on-chip for floating-point compute workloads.
YangWang92/open-instruct
YangWang92/open-r1
Fully open reproduction of DeepSeek-R1
YangWang92/QLLM
A general 2–8 bit quantization toolbox with GPTQ/AWQ/HQQ support and easy export to ONNX/ONNX Runtime.
YangWang92/quip-sharp
YangWang92/ReST-MCTS
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
YangWang92/sglang
SGLang is a fast serving framework for large language models and vision language models.
YangWang92/simpleRL-reason
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
YangWang92/TileFusion
YangWang92/TinyZero
YangWang92/verl
veRL: Volcano Engine Reinforcement Learning for LLMs
YangWang92/VITA.dev
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
YangWang92/VPTQ
VPTQ: a flexible, extreme low-bit quantization algorithm
YangWang92/wandb
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.