Jerrisk

Computer vision, Recommendation System, NLP

Tongji University · Hangzhou, China

Jerrisk's Stars

Tencent/HunyuanVideo
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Language: Python · 7.7k · 596
compphoto/IntrinsicCompositing
Code for the SIGGRAPH Asia 2023 paper "Intrinsic Harmonization for Illumination-Aware Compositing"
Language: Python · 6814
Stable-X/StableNormal
[SIGGRAPH Asia 2024 (Journal Track)] StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal
Language: Python · 57925
Kwai-Kolors/MPS
Language: Python · 1356
gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Language: Python · 86999
espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Language: C · 4.5k · 938
numediart/MBROLA
MBROLA is a speech synthesizer based on the concatenation of diphones
Language: C · 24560
GitYCC/g2pW
Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)
Language: Python · 29937
MCG-NKU/AMT
Official code for "AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation" (CVPR2023)
Language: Python · 23219
baowenbo/DAIN
Depth-Aware Video Frame Interpolation (CVPR 2019)
Language: Python · 8.3k · 839
Yuan-ManX/SouPyX
SouPyX: An Audio Exploration Space.🪐
Language: Python · 363
InternLM/MindSearch
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
Language: JavaScript · 5.8k · 590
modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
Language: Python · 6.8k · 629
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
1.9k · 70
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
85955
state-spaces/s4
Structured state space sequence models
Language: Jupyter Notebook · 2.5k · 305
vercel/ai
Build AI-powered applications with React, Svelte, Vue, and Solid
Language: TypeScript · 11.2k · 1.7k
2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python · 33.8k · 3.7k
antiboredom/camera-motion-detector
Uses opencv to detect when a camera is panning or zooming.
Language: Python · 1006
bytedance/particle-sfm
ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild. ECCV 2022.
Language: C++ · 29923
discus0434/aesthetic-predictor-v2-5
SigLIP-based Aesthetic Score Predictor
Language: Python · 1732
DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Language: Python · 2.9k · 265
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
Language: Python · 21.8k · 2.2k
abi/screenshot-to-code
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
Language: Python · 67.1k · 8.2k
brentyi/tyro
CLI interfaces & config objects, from types
Language: Python · 56228
meta-llama/PurpleLlama
Set of tools to assess and improve LLM security.
Language: Python · 2.8k · 470
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python · 28k · 3.2k
ActiveState/appdirs
A small Python module for determining appropriate platform-specific dirs, e.g. a "user data dir".
Language: Python · 1.1k · 97
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
13.6k · 871
ttengwang/Caption-Anything
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything
Language: Python · 1.7k · 105
