mvasil
My name is Mariya, and I'm interested in all things computer vision, generative models, video understanding, and model fairness.
@KaiberAI · Los Angeles
Pinned Repositories
animatediff-kaiber
A fork of AnimateDiff with a number of improvements
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
fashion-compatibility
Learning Type-Aware Embeddings for Fashion Compatibility
handsoff
Learning-Similarity-Conditions
MRL
Code repository for the paper "Matryoshka Representation Learning"
nerf
Code release for NeRF (Neural Radiance Fields)
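A core trick from the NeRF paper is its positional encoding: raw 3D coordinates are mapped to sines and cosines at geometrically spaced frequencies so the MLP can represent high-frequency scene detail. A minimal sketch of that encoding (an illustration, not code from this repo; the frequency count is arbitrary):

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """NeRF-style positional encoding: map each coordinate to sines and
    cosines at frequencies pi, 2*pi, 4*pi, ... so a downstream MLP can
    fit high-frequency variation in the scene."""
    freqs = 2.0 ** np.arange(num_freqs) * np.pi   # geometrically spaced frequencies
    scaled = x[..., None] * freqs                 # (..., d, num_freqs)
    enc = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)         # (..., d * 2 * num_freqs)

pts = np.array([[0.1, 0.5, -0.2]])                # one 3D sample point
print(positional_encoding(pts).shape)             # (1, 24): 3 coords * 2 * 4 freqs
```

The full model feeds this encoding (for position and viewing direction) into an MLP that predicts density and color per sample along each camera ray.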
StableVideo
[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing
Tune-A-Video
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
visual-compatibility
Context-Aware Visual Compatibility Prediction (https://arxiv.org/abs/1902.03646)
mvasil's Repositories
mvasil/fashion-compatibility
Learning Type-Aware Embeddings for Fashion Compatibility
mvasil/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
mvasil/StableVideo
[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing
mvasil/ai-audio-startups
Community list of startups working with AI in audio and music technology
mvasil/animatable_nerf
Code for "Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies" ICCV 2021
mvasil/animatediff-cli-prompt-travel
animatediff prompt travel
mvasil/animatediff-kaiber
A fork of AnimateDiff with a number of improvements
mvasil/AnyDoor
Official implementation of the paper "AnyDoor: Zero-shot Object-level Image Customization"
mvasil/BiFormer
BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation, CVPR 2023
mvasil/CoDeF
Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
mvasil/Gen-L-Video
The official implementation for "Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising".
mvasil/Grounded-Segment-Anything
Grounded-SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment, and Generate Anything
mvasil/handsoff
mvasil/inanimate
Generate images from an initial frame and text
mvasil/MRL
Code repository for the paper "Matryoshka Representation Learning"
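The Matryoshka idea is that a single learned embedding is trained so that its nested prefixes are themselves usable representations, letting shorter prefixes trade accuracy for speed and memory at inference time. A minimal sketch of the inference-side slicing (an illustration under assumed dimensions, not the paper's training code):

```python
import numpy as np

def matryoshka_prefixes(embedding, dims=(8, 16, 32, 64)):
    """Slice one full embedding into nested, L2-normalized prefixes.

    In Matryoshka Representation Learning, the training loss encourages
    each prefix to stand alone as a representation, so retrieval or
    classification can run on a truncated vector."""
    prefixes = {}
    for d in dims:
        prefix = embedding[:d]
        prefixes[d] = prefix / np.linalg.norm(prefix)  # unit-normalize each prefix
    return prefixes

rng = np.random.default_rng(0)
full = rng.normal(size=64)          # a hypothetical 64-dim embedding
nested = matryoshka_prefixes(full)
for d, v in nested.items():
    print(d, v.shape)
```

The actual method adds a training loss over all prefix lengths; the slicing above is only the zero-cost inference step that the training makes meaningful.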
mvasil/3d-motion-mag
mvasil/BrushNet
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
mvasil/co-tracker
CoTracker is a model for tracking any point (pixel) on a video.
mvasil/CVPR23_LFDM
The PyTorch implementation of the CVPR 2023 paper "Conditional Image-to-Video Generation with Latent Flow Diffusion Models"
mvasil/cycle-diffusion
[ICCV 2023] Zero-shot image editing with stochastic diffusion models
mvasil/DreamPose
Official implementation of "DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion"
mvasil/FateZero
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
mvasil/Inpaint-Anything
Inpaint anything using Segment Anything and inpainting models.
mvasil/KGI
Code repository for the ICCV 2023 paper "Virtual Try-On with Garment-Pose Keypoints Guided Inpainting"
mvasil/lora
Using low-rank adaptation (LoRA) to quickly fine-tune diffusion models.
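LoRA freezes a pretrained weight matrix W and learns only a low-rank update B @ A, which shrinks the trainable parameter count dramatically. A minimal sketch of the idea on a single linear layer (an illustration with arbitrary dimensions, not this repo's diffusion-model code):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Linear layer with a frozen weight W plus a trainable low-rank update B @ A.

    W: (d_out, d_in) frozen pretrained weight.
    A: (r, d_in) and B: (d_out, r) with rank r << min(d_out, d_in);
    only A and B are trained, cutting trainable parameters from
    d_out * d_in down to r * (d_out + d_in)."""
    return x @ (W + alpha * (B @ A)).T

rng = np.random.default_rng(1)
d_in, d_out, r = 16, 8, 2
W = rng.normal(size=(d_out, d_in))   # pretrained weight, kept frozen
A = rng.normal(size=(r, d_in))       # trainable down-projection
B = np.zeros((d_out, r))             # initialized to zero, as in the LoRA paper
x = rng.normal(size=(4, d_in))

# With B at zero, the adapted layer matches the frozen model exactly,
# so fine-tuning starts from the pretrained behavior.
assert np.allclose(lora_forward(x, W, A, B), x @ W.T)
```

Initializing B (or equivalently A) to zero is what makes the first fine-tuning step a no-op relative to the base model; training then moves only the small factors A and B.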
mvasil/mvasil.github.io
mvasil/stable-audio-tools
Generative models for conditional audio generation
mvasil/text2cinemagraph
Official PyTorch implementation of "Artistic Cinemagraph: Synthesizing Artistic Cinemagraphs from Text"
mvasil/videocomposer
Official repo for VideoComposer: Compositional Video Synthesis with Motion Controllability
mvasil/Wav2Lip
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020.