LiweiPE

Pinned Repositories

footprint_research-Stature_estimation_using_CNN
Language: Python 0 1 00
Object_detection_drinks
Language: Python 0 1 01
Recognition-of-Yoga-Poses-through-an-Interactive-System-with-Kinect-device
0 1 00
MInference
[NeurIPS'24 Spotlight, ICLR'25, ICML'25] Speeds up long-context LLM inference by computing attention with approximate, dynamic sparsity, which reduces pre-filling latency by up to 10x on an A100 while maintaining accuracy.
Language: Python 1.1k 8 11560
dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including CUDA, x86 and ARMv9.
Language: C 264 8 4128
NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python 13.6k 219 2.5k2.8k
KsanaLLM
Language: C++ 326 12 3131
Qwen2.5
Qwen2.5 is the large language model series developed by the Qwen team at Alibaba Cloud.
Language: Shell 16.6k 110 9451.2k