Pinned Repositories
AutoSmoothQuant
An easy-to-use package for implementing SmoothQuant for LLMs
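As a rough illustration of the idea behind SmoothQuant (not this package's API), the technique migrates activation outliers into the weights with a per-channel scale so that activations become easier to quantize while the layer output is unchanged. A minimal NumPy sketch; `alpha` and the tensor shapes here are illustrative assumptions:

```python
import numpy as np

def smooth(X, W, alpha=0.5):
    # Per-input-channel smoothing scale (SmoothQuant idea):
    # s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)
    act_max = np.abs(X).max(axis=0)          # (C,) activation range per channel
    w_max = np.abs(W).max(axis=1)            # (C,) weight range per channel
    s = act_max**alpha / w_max**(1 - alpha)
    X_s = X / s                              # smoothed activations
    W_s = W * s[:, None]                     # weights absorb the scale
    return X_s, W_s, s

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                  # (tokens, channels)
W = rng.normal(size=(8, 3))                  # (channels, out_features)
X_s, W_s, s = smooth(X, W)
# The smoothing is mathematically lossless: X @ W == X_s @ W_s.
assert np.allclose(X @ W, X_s @ W_s)
```

The equivalence holds because `X_s @ W_s = X diag(1/s) diag(s) W = X W`; quantization is then applied to the smoothed tensors.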
lightseq
LightSeq: A High-Performance Library for Sequence Processing and Generation
academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
books_and_wiki_en_clean_format_and_shard
QQQ
QQQ is a hardware-optimized W4A8 (4-bit weight, 8-bit activation) quantization solution for LLMs.
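To make the W4A8 label concrete (a generic sketch of symmetric uniform quantization, not QQQ's actual kernels or packing format), weights are quantized to signed INT4 and activations to signed INT8, the matmul runs on integers, and the two scales are applied afterward:

```python
import numpy as np

def quantize_sym(t, bits, axis=None):
    # Symmetric uniform quantization to signed integers with 2^(bits-1)-1 levels.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(t).max(axis=axis, keepdims=axis is not None) / qmax
    q = np.clip(np.round(t / scale), -qmax - 1, qmax)
    return q.astype(np.int32), scale

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 4))                      # (in_features, out_features)
X = rng.normal(size=(2, 8))                      # (tokens, in_features)
Wq, sw = quantize_sym(W, bits=4, axis=0)         # per-output-channel INT4 weights
Xq, sx = quantize_sym(X, bits=8)                 # per-tensor INT8 activations
# Integer matmul, then dequantize with the activation and weight scales.
Y = (Xq @ Wq) * sx * sw
# INT4 weights are lossy, so Y only approximates X @ W.
rel_err = np.linalg.norm(Y - X @ W) / np.linalg.norm(X @ W)
assert rel_err < 0.5
```

Real W4A8 kernels pack two INT4 values per byte and fuse dequantization into the GEMM epilogue; this sketch only shows the arithmetic.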
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
marlin
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
HandH1998's Repositories
HandH1998/QQQ
QQQ is a hardware-optimized W4A8 (4-bit weight, 8-bit activation) quantization solution for LLMs.
HandH1998/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
HandH1998/academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
HandH1998/books_and_wiki_en_clean_format_and_shard
HandH1998/BUAA_Course
HandH1998/BUAA_Course_Sharing
Beihang University (BUAA) course assignment and materials sharing project — classified material must not go online, and what goes online must not be classified!!!
HandH1998/manifold_distillation
HandH1998/mct_former
HandH1998/pregenerate_bert_train_corpus
HandH1998/carInsurancePred
HandH1998/easy-scrape
HandH1998/Entity-Relation-Extraction
Entity and Relation Extraction Based on TensorFlow and BERT. A pipeline-style entity and relation extraction system; solution for the information extraction task of the 2019 Language and Intelligence Challenge (Schema-based Knowledge Extraction, SKE 2019).
HandH1998/HandH1998.github.io
Personal homepage
HandH1998/Java_learn
HandH1998/JS_learn
HandH1998/lightseq
LightSeq: A High-Performance Library for Sequence Processing and Generation
HandH1998/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
HandH1998/marlin
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
HandH1998/matplotlib
HandH1998/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
HandH1998/ML_practice
HandH1998/net2net
HandH1998/NLP-Tutorials
Simple implementations of NLP models. Tutorials are written in Chinese on my website https://mofanpy.com
HandH1998/NN-CUDA-Example
Several simple examples for popular neural network toolkits calling custom CUDA operators.
HandH1998/smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
HandH1998/soln-ml
A research framework for fast prototyping of AutoML algorithms.
HandH1998/test
just for test
HandH1998/tmp_bert_mlkd
HandH1998/zh-NER-TF
A very simple BiLSTM-CRF model for Chinese Named Entity Recognition (TensorFlow)