Pinned Repositories
3d-ken-burns
An implementation of 3D Ken Burns Effect from a Single Image using PyTorch
3DUnetCNN
Keras 3D U-Net Convolutional Neural Network (CNN) designed for medical image segmentation
6d-object-pose-estimation
This repository summarizes papers and code for 6D Object Pose Estimation.
A-Light-and-Fast-Face-Detector-for-Edge-Devices
A light and fast one-class detection framework for edge devices. We provide face, head, pedestrian, and vehicle detectors, among others.
Abnormal-Behavior-Detection-Based-On-Optical-Flow-Features
CSC2515, University of Toronto. This project applies computer vision and machine learning methods to detect abnormally behaving objects in crowds. By Hanwen Liang and Haohan Li.
abnormal-spatiotemporal-ae
Code for "Abnormal Event Detection in Videos using Spatiotemporal Autoencoder".
Abnormal_Event_Detection
Abnormal Event Detection in Videos using SpatioTemporal AutoEncoder
AbnormarCrowdDetection
Abnormal Crowd Detection Implementation with Python
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
AudioClassification-Pytorch
A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE and other models, as well as a variety of preprocessing methods.
ethanyhzhang's Repositories
ethanyhzhang/AudioClassification-Pytorch
A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE and other models, as well as a variety of preprocessing methods.
ethanyhzhang/baby-llama2-chinese
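A repository for pretraining from scratch and then running SFT on a small-parameter Chinese LLaMa2; a single 24 GB GPU is enough to obtain a chat-llama2 with basic Chinese question-answering ability.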
ethanyhzhang/bisheng
Bisheng is an open LLM devops platform for next generation AI applications.
ethanyhzhang/Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
ethanyhzhang/clip-retrieval
Easily compute clip embeddings and build a clip retrieval system with them
ethanyhzhang/CTranslate2
Fast inference engine for Transformer models
ethanyhzhang/Firefly
Firefly: a large-model training tool that supports training Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
ethanyhzhang/fontina
Data generation, model training and inference for Visual Font Recognition using PyTorch
ethanyhzhang/Fooocus
Focus on prompting and generating
ethanyhzhang/g2p-mix
Grapheme-to-Phoneme for Mixed Chinese and English
ethanyhzhang/GPT-SoVITS
Even 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
ethanyhzhang/HierSpeechpp
The official implementation of HierSpeech++
ethanyhzhang/ImageBind
ImageBind: One Embedding Space to Bind Them All
ethanyhzhang/JioNLP
A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com
ethanyhzhang/LLMs-from-scratch
Implementing a ChatGPT-like LLM from scratch, step by step
ethanyhzhang/marker
Convert PDF to markdown quickly with high accuracy
ethanyhzhang/mmyolo_tensorrt
ethanyhzhang/OpenVoice
Instant voice cloning by MyShell
ethanyhzhang/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
ethanyhzhang/Rerender_A_Video
[SIGGRAPH Asia 2023] Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
ethanyhzhang/SCINeRF
[CVPR 2024 Highlight] SCINeRF: Neural Radiance Fields from a Snapshot Compressive Image
ethanyhzhang/stable-diffusion-xl-demo
A gradio web UI demo for Stable Diffusion XL 1.0, with refiner and MultiGPU support
ethanyhzhang/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
ethanyhzhang/TEASER-plusplus
A fast and robust point cloud registration library
ethanyhzhang/tensorrtx
Implementation of popular deep learning networks with TensorRT network definition API
ethanyhzhang/ultralytics
NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
ethanyhzhang/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
ethanyhzhang/WHAM
ethanyhzhang/whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
ethanyhzhang/YuzuMarker.FontDetection
✨ First-ever CJK (Chinese, Japanese, Korean) font recognition and style extraction model: the font recognition model and its implementation, a side project of YuzuMarker