Pinned Repositories
AgeGenderEstimate
AICamera
Demonstration of using Caffe2 inside an Android application.
BertPunc
State-of-the-art punctuation restoration (e.g. for automatic speech recognition): a deep learning model based on the pre-trained BERT model
BigCiDian
Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.
bloaty
Bloaty McBloatface: a size profiler for binaries
CarClassifier
faceboxes
LearningMaterials
My Learning Materials
LPR-Trian
pi
caiyueliang's Repositories
caiyueliang/bloaty
Bloaty McBloatface: a size profiler for binaries
caiyueliang/chatgpt-retrieval-plugin
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
caiyueliang/Chinese-Text-Classification-Pytorch
Chinese text classification with TextCNN, TextRNN, FastText, TextRCNN, BiLSTM_Attention, DPCNN, and Transformer, based on PyTorch, ready to use out of the box.
caiyueliang/client
Triton Python, C++ and Java client libraries, and gRPC-generated client examples for Go, Java and Scala.
caiyueliang/cube-studio
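Cloud-native one-stop machine learning platform: multi-tenancy, data assets, online notebook development, drag-and-drop task pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving, multi-cluster scheduling, multiple project groups and resource groups, edge computing, real-time training of large models, and an AI app store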
caiyueliang/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
caiyueliang/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
caiyueliang/FACEGOOD-Audio2Face
http://www.facegood.cc
caiyueliang/FasterTransformer
Transformer-related optimization, including BERT and GPT
caiyueliang/grpc
The C based gRPC (C++, Python, Ruby, Objective-C, PHP, C#)
caiyueliang/How_to_optimize_in_GPU
A series of GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
caiyueliang/kaldi_demo
caiyueliang/KaldiWebrtcServer
Python server for communicating with Kaldi from the browser using WebRTC
caiyueliang/langchain
🦜🔗 Build context-aware reasoning applications
caiyueliang/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
caiyueliang/minddiffusion
A collection of diffusion models based on MindSpore
caiyueliang/nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
caiyueliang/pytorch_examples
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
caiyueliang/pytriton
PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
caiyueliang/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
caiyueliang/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
caiyueliang/sglang
SGLang is a fast serving framework for large language models and vision language models.
caiyueliang/SparrowRecSys
A Deep Learning Recommender System
caiyueliang/stock
stock: a stock system, developed in Python.
caiyueliang/TensorRT_1
TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
caiyueliang/torch2trt
An easy-to-use PyTorch-to-TensorRT converter
caiyueliang/triton
Development repository for the Triton language and compiler
caiyueliang/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
caiyueliang/web_frame
caiyueliang/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit