Pinned Repositories
footprint_research-Stature_estimation_using_CNN
Object_detection_drinks
Recognition-of-Yoga-Poses-through-an-Interactive-System-with-Kinect-device
MInference
Speeds up long-context LLM inference by computing attention with approximate, dynamic sparse methods, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy (see the sketch after this list).
dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including x86 and ARMv9.
KsanaLLM
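The MInference entry above attributes its pre-filling speedup to dynamic sparse attention. As a rough illustration of why computing only a few key blocks per query block cuts latency, here is a minimal, hypothetical block-sparse attention sketch in PyTorch. The block size, top-k selection, and mean-pooled importance estimate are assumptions for illustration only, not MInference's actual method, and causal masking is omitted for brevity.

```python
# Minimal sketch of dynamic block-sparse attention (illustrative only):
# cheaply estimate which key blocks matter for each query block, then
# compute exact attention only inside the selected blocks.
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block=64, topk=4):
    # q, k, v: (seq_len, head_dim); seq_len assumed divisible by `block`
    n, d = q.shape
    nb = n // block
    qb = q.view(nb, block, d)
    kb = k.view(nb, block, d)
    vb = v.view(nb, block, d)

    # Cheap importance estimate: scores between mean-pooled blocks
    q_mean = qb.mean(dim=1)                             # (nb, d)
    k_mean = kb.mean(dim=1)                             # (nb, d)
    scores = q_mean @ k_mean.T / d ** 0.5               # (nb, nb)
    keep = scores.topk(min(topk, nb), dim=-1).indices   # (nb, topk)

    out = torch.zeros_like(q).view(nb, block, d)
    for i in range(nb):
        # Gather only the selected key/value blocks for this query block
        k_sel = kb[keep[i]].reshape(-1, d)               # (topk*block, d)
        v_sel = vb[keep[i]].reshape(-1, d)
        attn = F.softmax(qb[i] @ k_sel.T / d ** 0.5, dim=-1)
        out[i] = attn @ v_sel
    return out.view(n, d)

# Usage: a toy 1k-token, 64-dim single head
q = torch.randn(1024, 64)
k = torch.randn(1024, 64)
v = torch.randn(1024, 64)
print(block_sparse_attention(q, k, v).shape)  # torch.Size([1024, 64])
```

With `topk` fixed, the attention cost per query block stays constant as the context grows, which is the intuition behind the latency reduction claimed for pre-filling.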