ComfyUI-IF_MemoAvatar

Memory-Guided Diffusion for Expressive Talking Video Generation
Original repo — MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation
Longtao Zheng*, Yifan Zhang*, Hanzhong Guo, Jiachun Pan, Zhenxiong Tan, Jiahao Lu, Chuanxin Tang, Bo An, Shuicheng Yan
Project Page | arXiv | Model

This repository contains the example inference script for the MEMO-preview model. The gif demo below is compressed. See our project page for full videos.

Demo GIF


Overview

This is a ComfyUI implementation of MEMO (Memory-Guided Diffusion for Expressive Talking Video Generation), which enables the creation of expressive talking avatar videos from a single image and audio input.

Features

  • Generate expressive talking head videos from a single image
  • Audio-driven facial animation
  • Emotional expression transfer
  • High-quality video output

Demo video: mafe.mp4

Installation

*** Xformers is NOT required, but performance is better with it installed *** *** Make sure you have your Hugging Face token (HF_TOKEN) set in your environment variables ***
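For example, on Linux/macOS the token can be exported like this (the token value below is a placeholder for your own Hugging Face token):

```shell
# Linux/macOS: export the Hugging Face token for the current shell session
export HF_TOKEN="hf_xxxxxxxxxxxx"   # placeholder - use your own token

# Windows (PowerShell) equivalent, persisted across sessions:
# setx HF_TOKEN "hf_xxxxxxxxxxxx"
```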

git clone the repo into your custom_nodes folder, then:

cd ComfyUI-IF_MemoAvatar
pip install -r requirements.txt

I removed xformers from the requirements file because on Windows it only works with a specific combination of PyTorch builds.

If you are on Linux you can simply run:

pip install xformers 

Windows users who are not sure whether xformers is already in their environment can check with:

pip show xformers 
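Alternatively, here is a small Python sketch (not part of this repo) that reports whether xformers is importable in the current environment:

```python
import importlib.util

def has_xformers() -> bool:
    """Return True if the xformers package is importable in this environment."""
    return importlib.util.find_spec("xformers") is not None

if __name__ == "__main__":
    print("xformers available:", has_xformers())
```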

If no version is shown, install the latest one by following this free guide to setting up a good ComfyUI environment:

Installing Triton, Sage Attention, and Flash Attention

Watch the video

Model Files

The models will automatically download to the following locations in your ComfyUI installation:

models/checkpoints/memo/
├── audio_proj/
├── diffusion_net/
├── image_proj/
├── misc/
│ ├── audio_emotion_classifier/
│ ├── face_analysis/
│ └── vocal_separator/
└── reference_net/
models/wav2vec/
models/vae/sd-vae-ft-mse/
models/emotion2vec/emotion2vec_plus_large/
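A small sketch to check which of these model folders are present before running the node; the models root path is an assumption, so adjust it to your ComfyUI install:

```python
from pathlib import Path

# Assumed ComfyUI models root; adjust to your installation.
MODELS_ROOT = Path("models")

# Relative paths taken from the directory layout above.
REQUIRED_DIRS = [
    "checkpoints/memo/audio_proj",
    "checkpoints/memo/diffusion_net",
    "checkpoints/memo/image_proj",
    "checkpoints/memo/misc/audio_emotion_classifier",
    "checkpoints/memo/misc/face_analysis",
    "checkpoints/memo/misc/vocal_separator",
    "checkpoints/memo/reference_net",
    "wav2vec",
    "vae/sd-vae-ft-mse",
    "emotion2vec/emotion2vec_plus_large",
]

def missing_model_dirs(root: Path = MODELS_ROOT) -> list[str]:
    """Return the required model directories that do not exist under root."""
    return [rel for rel in REQUIRED_DIRS if not (root / rel).is_dir()]

if __name__ == "__main__":
    for rel in missing_model_dirs():
        print("missing:", rel)
```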

Copy the models from the face_analysis/models folder directly into face_analysis (for now, until I confirm this is no longer needed). Duplicate them rather than moving them; if Hugging Face detects an empty folder it will re-download the models every time. If you don't see a models.json file, or the node errors out, create one yourself with this content:

{
  "detection": [
    "scrfd_10g_bnkps"
  ],
  "recognition": [
    "glintr100"
  ],
  "analysis": [
    "genderage",
    "2d106det",
    "1k3d68"
  ]
}

along with a version.txt file containing 0.7.3
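The two files above can also be created with a short script; the target path passed in the example is an assumption based on the directory layout earlier in this README:

```python
import json
from pathlib import Path

# models.json content exactly as listed above.
MODELS_JSON = {
    "detection": ["scrfd_10g_bnkps"],
    "recognition": ["glintr100"],
    "analysis": ["genderage", "2d106det", "1k3d68"],
}

def write_face_analysis_metadata(face_analysis_dir: Path) -> None:
    """Create models.json and version.txt inside the face_analysis folder."""
    face_analysis_dir.mkdir(parents=True, exist_ok=True)
    (face_analysis_dir / "models.json").write_text(json.dumps(MODELS_JSON, indent=2))
    (face_analysis_dir / "version.txt").write_text("0.7.3")

# Example (assumed location; adjust to your ComfyUI install):
# write_face_analysis_metadata(Path("models/checkpoints/memo/misc/face_analysis"))
```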
