kimx3966
🔥 ML Research Engineer (LLM Driven Diffusion Process) 🔥 - Multiple Open Source Contributor - York & UMN Alumni
Pinned Repositories
2022-1-deep-learning-applications
3D-Deep-Learning-with-Python
3D Deep Learning with Python, Published by Packt
3d-pose-baseline
A simple baseline for 3D human pose estimation in TensorFlow. Presented at ICCV 17.
a-person-mask-generator
Extension for Automatic1111 and ComfyUI to automatically create masks for Background/Hair/Body/Face/Clothes in Img2Img
data_visualization_final_project
Data Visualization - Victims of Gun Violence & Google Trends
deep-active-zoomIn-network
Open-Sora-Dataset
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
stylus
text-to-image-nemo-guadrail
kimx3966's Repositories
kimx3966/Adv-Diffusion
[AAAI-2024] Official code for work "Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model"
kimx3966/Open-Sora-Dataset
kimx3966/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
kimx3966/stylus
kimx3966/text-to-image-nemo-guadrail
kimx3966/animate-your-word
kimx3966/Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
kimx3966/cog-sdxl-clip-interrogator
An attempt at a Cog wrapper for an SDXL CLIP Interrogator
kimx3966/CSD
kimx3966/DiffSynth-Studio
Enjoy the magic of Diffusion models!
kimx3966/DiT-MoE
Scaling Diffusion Transformers with Mixture of Experts
kimx3966/DynamiCrafter
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
kimx3966/E5-V
E5-V: Universal Embeddings with Multimodal Large Language Models
kimx3966/ELLA
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
kimx3966/Face-Adapter
kimx3966/FollowYourEmoji
[arXiv 2024] Follow-Your-Emoji: This repo is the official implementation of "Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation"
kimx3966/graph-of-thoughts
Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
kimx3966/IF
kimx3966/imagetoprompt-ai
Turn your images into detailed and descriptive text prompts with AI
kimx3966/MegaVIT
The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters"
kimx3966/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
kimx3966/MoMA
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
kimx3966/NLLB-200-Distilled-350M-en-ko
NLLB-200 distilled 350M for English-to-Korean translation
kimx3966/OneTrainer
OneTrainer is a one-stop solution for all your stable diffusion training needs.
kimx3966/OpenVid-1M
kimx3966/piecewise-rectified-flow
PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator
kimx3966/ReNO
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
kimx3966/TEx-Face
kimx3966/tilemaker
kimx3966/X-Adapter
[CVPR 2024] X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model