lvchigo

Pinned Repositories

2015_Face_Detection
Face detection. CVPR 2015 cascade CNNs for face detection
Language:HTML1 2 00
ads_text
Ad detection for text
Language:C++0 2 00
AgeGenderDeepLearning
Language:Shell0 2 00
AImgDetect
new v2.0.0 && v2.2.0
Language:C++0 1 01
Alibaba-MIT-Speech
Alibaba speech technology
0 3 00
Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
Language:Java00
caffe_image_classfication
Using Caffe
Language:C++2 2 01
caffe_multilabel
Caffe multilabel for image and text
Language:Jupyter Notebook1 4 00
caffe_rfcn_c
Caffe R-FCN for C (no Python)
Language:C++1 4 10
PVANet_train
code fixed from https://github.com/sanghoon/pva-faster-rcnn
Language:Jupyter Notebook7 2 213

lvchigo's Repositories

lvchigo/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Language:Python0 0
lvchigo/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
1 0
lvchigo/baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
Language:Python1 0
lvchigo/BaiduImageSpider
A super lightweight Baidu image crawler
lvchigo/bark
🔊 Text-Prompted Generative Audio Model
lvchigo/bert4torch
An elegant PyTorch implementation of transformers
lvchigo/CodeFormer
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
Language:Python1 0
lvchigo/Collaborative-Diffusion
Collaborative Diffusion (CVPR 2023)
lvchigo/FastDeploy
⚡️An Easy-to-use and Fast Deep Learning Model Deployment Toolkit for Cloud and Edge. Including Vision, Text, Audio and Video 20+ main stream scenarios and 150+ SOTA models with end-to-end optimization and multi-platform multi-framework support.
Language:C++1 0
lvchigo/financial_research_report
Financial research report generation
lvchigo/keyword-spot
An end-to-end voice wake-up toolkit, from model training to model inference.
Language:Python1 0
lvchigo/MiniMax-01
lvchigo/onnx-typecast
Script to typecast ONNX model parameters from INT64 to INT32.
Language:Python1 0
lvchigo/openvino
OpenVINO™ Toolkit repository
lvchigo/OpenVoice
Instant voice cloning by MyShell.
Language:Python0 0
lvchigo/PaddleSlim
PaddleSlim is an open-source library for deep model compression and architecture search.
Language:Python1 0
lvchigo/Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Language:Python1 0
lvchigo/sapiens
High-resolution models for human tasks.
lvchigo/seed-tts-eval
lvchigo/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector
lvchigo/SpeechGPT
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities.
1 0
lvchigo/stable-diffusion
A latent text-to-image diffusion model
Language:Jupyter Notebook1 0
lvchigo/Transformer-SOD
1 0
lvchigo/trt-samples-for-hackathon-cn
Simple samples for TensorRT programming
Language:Jupyter Notebook1 0
lvchigo/unsloth
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
lvchigo/voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
1 0
lvchigo/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Language:Python1 0
lvchigo/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Language:Jupyter Notebook1 0
lvchigo/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language:Python1 0
lvchigo/yolov10
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]