Pinned Repositories
2015_Face_Detection
Face detection, based on the CVPR 2015 work on cascade CNNs for face detection
ads_text
Ad detection for text
AgeGenderDeepLearning
AImgDetect
new v2.0.0 && v2.2.0
Alibaba-MIT-Speech
Alibaba speech technology
Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
caffe_image_classfication
Using Caffe
caffe_multilabel
Caffe multi-label classification for images and text
caffe_rfcn_c
Caffe R-FCN in C (no Python)
PVANet_train
code fixed from https://github.com/sanghoon/pva-faster-rcnn
lvchigo's Repositories
lvchigo/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
lvchigo/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
lvchigo/baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
lvchigo/BaiduImageSpider
A super lightweight Baidu image crawler
lvchigo/bert4torch
An elegant PyTorch implementation of transformers
lvchigo/CAT
A CRF-based ASR Toolkit
lvchigo/CLIP
Contrastive Language-Image Pretraining
lvchigo/CodeFormer
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
lvchigo/Collaborative-Diffusion
Collaborative Diffusion (CVPR 2023)
lvchigo/FastDeploy
⚡️An Easy-to-use and Fast Deep Learning Model Deployment Toolkit for Cloud and Edge. Covers 20+ mainstream scenarios across Vision, Text, Audio and Video and 150+ SOTA models, with end-to-end optimization and multi-platform, multi-framework support.
lvchigo/jsoncpp
A C++ library for interacting with JSON.
lvchigo/keyword-spot
An end-to-end voice wake-up toolkit, from model training to model inference.
lvchigo/onnx-typecast
Script to typecast ONNX model parameters from INT64 to INT32.
lvchigo/openvino
OpenVINO™ Toolkit repository
lvchigo/OpenVoice
Instant voice cloning by MyShell.
lvchigo/PaddleSlim
PaddleSlim is an open-source library for deep model compression and architecture search.
lvchigo/Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
lvchigo/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector
lvchigo/SpeechGPT
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities.
lvchigo/Squeezeformer
[NeurIPS'22] Squeezeformer: An Efficient Transformer for Automatic Speech Recognition
lvchigo/stable-diffusion
A latent text-to-image diffusion model
lvchigo/Transformer-SOD
lvchigo/trt-samples-for-hackathon-cn
Simple samples for TensorRT programming
lvchigo/trt2022_wenet
lvchigo/voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
lvchigo/wekws
Production First and Production Ready End-to-End Keyword Spotting Toolkit
lvchigo/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
lvchigo/wenet_trt8
lvchigo/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
lvchigo/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)