imba-pericia's Stars
mifi/lossless-cut
The Swiss Army knife of lossless video/audio editing
facefusion/facefusion
Industry-leading face manipulation platform
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
GoogleCloudPlatform/generative-ai
Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI
google-gemini/cookbook
Examples and guides for using the Gemini API
yisol/IDM-VTON
[ECCV 2024] IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
ali-vilab/VGen
Official repo for VGen: a holistic video generation ecosystem built on diffusion models
Doubiiu/DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
TMElyralab/MuseV
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
Picsart-AI-Research/StreamingT2V
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
open-mmlab/PIA
[CVPR 2024] PIA, your Personalized Image Animator. Animate your images with text prompts, combined with DreamBooth, to produce stunning videos.
ammen99/wf-recorder
yuval-alaluf/Attend-and-Excite
Official Implementation for "Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models" (SIGGRAPH 2023)
Doriandarko/RepoToTextForLLMs
Automate the analysis of GitHub repositories for LLMs with RepoToTextForLLMs. Fetch READMEs, structure, and non-binary files efficiently. Outputs include analysis prompts to aid in comprehensive repo evaluation
G-U-N/AnimateLCM
[SIGGRAPH ASIA 2024 TCS] AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data
vtosters/lite
Modified VK client
TIGER-AI-Lab/AnyV2V
Code and data for "AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks"
ttchengab/zest_code
[ECCV-2024] This is the official implementation of ZeST.
pkoutoupis/rapiddisk
Advanced Linux RAM drive and caching kernel modules. Dynamically allocate RAM as block devices. Use them as standalone drives or map them as caching nodes to slower local disk drives. Access those volumes locally or export them across an NVMe Target network. Manage it all from a web API.
ali-vilab/Ranni
JaKooLit/OpenSuse-Hyprland
Automated Hyprland install script for openSUSE Tumbleweed. All GPUs supported.
alphacep/vosk-tts
Text To Speech Synthesis with Vosk
winniesi/tg-gemini-bot
Just a single click and you've got it set up on Vercel.
Haoming02/sd-webui-mosaic-outpaint
An Extension for Automatic1111 Webui that trivializes outpainting
dsavell/docker-grav
Docker Container for GRAV CMS
brick2face/seamless-tile-inpainting
An automatic1111 extension for making seamless tiles using Stable Diffusion inpainting
kaifcoder/Invoice-Query-Tool-using-gemini-ai
This repository contains a Python project that leverages the Gemini Pro Vision API to extract invoice information from images. The primary goal of this project is to allow users to upload images of receipts and query specific details about the invoice. The project utilizes Conda for dependency management.
tillo13/kumori_cli_engine
The Kumori CLI engine automation tool leverages InstantID and HuggingFace/Diffusers to batch-generate personalized, identity-preserving stylized images using sophisticated facial analysis and pose estimation techniques, all through a Python command-line interface.