johnwick123f

Pinned Repositories

AutoStudio
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation
Language: Jupyter Notebook
bitsandbytes
8-bit CUDA functions for PyTorch
Language: Python
bounded-attention
Language: Python
Bunny
A family of lightweight multimodal models.
Language: Python
cog-video-morpher
Generate a video that morphs between subjects, with an optional style
Language: Python
LISAKaggle
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Language: Python
MplugOwl
Language: Python
PersonalROS
Personal stuff for robots
Language: Python
Project
Simple repository for personal project
Language: Python
sussy
Code for subgoal synthesis via image editing
Language: Python

johnwick123f's Repositories

johnwick123f/PersonalROS
Personal stuff for robots
Language: Python
johnwick123f/Project
Simple repository for personal project
Language: Python
johnwick123f/AutoStudio
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation
Language: Jupyter Notebook
johnwick123f/bounded-attention
Language: Python
johnwick123f/Bunny
A family of lightweight multimodal models.
Language: Python
johnwick123f/cog-video-morpher
Generate a video that morphs between subjects, with an optional style
Language: Python
johnwick123f/coqui-ai-TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python
johnwick123f/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
Language: Python
johnwick123f/sussy
Code for subgoal synthesis via image editing
Language: Python
johnwick123f/fish-speech
Brand new TTS solution
Language: Python
johnwick123f/GLEE
GLEE: General Object Foundation Model for Images and Videos at Scale
Language: Python
johnwick123f/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
johnwick123f/gpt_sovits_python
Python wrapper for fast inference with GPT-SoVITS
johnwick123f/Grasp-Anything
Dataset and Code for "Grasp-Anything: Large-scale Grasp Dataset from Foundation Models."
Language: Python
johnwick123f/graspnetAPI
Toolbox for our GraspNet-1Billion dataset.
Language: Python
johnwick123f/GroundingDINO
Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Language: Python
johnwick123f/groundingLMM
Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
Language: Python
johnwick123f/llama-cpp-python
Python bindings for llama.cpp
Language: Python
johnwick123f/LLaMA2-Accessory
An Open-source Toolkit for LLM Development
Language: Python
johnwick123f/MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
Language: Python
johnwick123f/multi_token
Embed arbitrary modalities (images, audio, documents, etc) into large language models.
Language: Python
johnwick123f/piecewise-rectified-flow
PeRFlow, but as a library
Language: Python
johnwick123f/resemble-enhance
AI powered speech denoising and enhancement
Language: Python
johnwick123f/rich-text-to-image
Rich-Text-to-Image Generation
Language: Python
johnwick123f/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
Language: Jupyter Notebook
johnwick123f/StreamMultiDiffusion
Official code for the paper "StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control."
Language: Jupyter Notebook
johnwick123f/text-generation-webui
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, llama.cpp (GGUF), Llama models.
Language: Python
johnwick123f/tokenize-anything
Tokenize Anything via Prompting
Language: Jupyter Notebook
johnwick123f/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Language: Python
johnwick123f/videollm-online
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
Language: Python