hululuzhu

Everyday ML for everyone

hululuzhu's Stars

microsoft/markitdown
Python tool for converting files and office documents to Markdown.
Language: Python · 34.1k 121 1371.5k
opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
Language: Python · 24.5k 125 8651.8k
Genesis-Embodied-AI/Genesis
A generative world for general-purpose robotics & embodied AI learning.
Language: Python · 22.8k 202 3241.9k
barry-ran/QtScrcpy
Android real-time display control software
Language: C++ · 22k 216 8832.8k
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language: Jupyter Notebook · 21.3k 211 3992.2k
openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Language: Python · 17.7k 290 111.8k
state-spaces/mamba
Mamba SSM architecture
Language: Python · 13.8k 102 5951.2k
richards199999/Thinking-Claude
Let your Claude think
Language: TypeScript · 13.5k 98 271.6k
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Language: Python · 13.1k 107 617914
facebookresearch/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Language: Python · 8.6k 157 5441.1k
ltdrdata/ComfyUI-Manager
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, this extension provides a hub feature and convenience functions to access a wide range of information within ComfyUI.
Language: Python · 8k 42 6151k
LargeWorldModel/LWM
Large World Model -- Modeling Text and Video with Millions Context
Language: Python · 7.2k 66 71554
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook · 6.7k 74 1k807
livekit/agents
Build real-time multimodal AI applications 🤖🎙️📹
Language: Python · 4.6k 55 399541
fudan-generative-vision/hallo2
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
Language: Python · 3.5k 432 57647
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Language: Python · 3.2k 29 134282
facebookresearch/audio2photoreal
Code and dataset for photorealistic Codec Avatars driven from audio
Language: Python · 2.7k 31 62263
Farama-Foundation/HighwayEnv
A minimalist environment for decision-making in autonomous driving
Language: Python · 2.7k 29 474772
chibat/chrome-extension-typescript-starter
Chrome Extension TypeScript Starter
Language: TypeScript · 2.6k 28 24436
b4rtaz/distributed-llama
Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
Language: C++ · 1.6k 32 65112
MaximeVandegar/Papers-in-100-Lines-of-Code
Implementation of papers in 100 lines of code.
Language: Python · 1.4k 27 10149
marl/crepe
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
Language: Python · 1.1k 34 78159
yangxy/PASD
[ECCV2024] Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization
Language: Python · 933 10 7562
teticio/audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.
Language: Jupyter Notebook · 734 17 4470
ai-ng/swift
Fast voice assistant powered by Groq, Cartesia, and Vercel.
Language: TypeScript · 516 5 9104
FreedomIntelligence/HuatuoGPT-Vision
Medical Multimodal LLMs
Language: Python · 229 27 1223
room-js/chrome-extension-ts-starter
Chrome Extension starter built with TypeScript
Language: TypeScript · 89 2 527
submit-paper/Danzero_plus
Language: Python · 35 3 127
indently/discord_tutorial_2024
Here's the source code from my YouTube tutorial.
Language: Python · 24 3 230
AdamEXu/OpenBot
OpenBot
Language: Python · 3
