A Gradio demo of image models. Refer to the Colab notebook for setup.
The project combines image generation and editing models to perform a range of tasks.
Currently Supported Tasks:
- Image captioning
- Automatic SAM mask generation
- Object detection, segmentation, annotation
- Remove / replace background
- Inpainting
- Upscale image 4x
- Text to image
- Drawing to image
- Image to image
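The remove / replace background task ends in a compositing step: the foreground is blended over a new background using a matte. Below is a minimal sketch of that step only, not the app's actual code. It assumes the matte is a float array in [0, 1] (for example, a matting model's output after normalization) and that both images are uint8 RGB arrays of the same size. The function name `replace_background` is hypothetical.

```python
import numpy as np

def replace_background(foreground, alpha, background):
    """Composite `foreground` over `background` using an alpha matte.

    foreground, background: uint8 arrays of shape (H, W, 3)
    alpha: float array of shape (H, W), values in [0, 1]
    (These input conventions are assumptions for this sketch.)
    """
    if foreground.shape != background.shape:
        raise ValueError("foreground and background must have the same shape")
    if alpha.shape != foreground.shape[:2]:
        raise ValueError("alpha must match the image height and width")
    # Add a channel axis so the matte broadcasts over RGB.
    a = np.clip(alpha.astype(np.float32), 0.0, 1.0)[..., None]
    out = a * foreground.astype(np.float32) + (1.0 - a) * background.astype(np.float32)
    return np.rint(out).astype(np.uint8)

# Tiny 1x2 example: first pixel fully foreground, second pixel half blended.
fg = np.full((1, 2, 3), 200, dtype=np.uint8)
bg = np.full((1, 2, 3), 100, dtype=np.uint8)
alpha = np.array([[1.0, 0.5]])
result = replace_background(fg, alpha, bg)
```

To remove the background instead of replacing it, the same matte could be scaled to 0-255 and attached as the alpha channel of an RGBA image.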
Interaction modes:
- Selecting points on the image
- Text prompts
- Auto mode
- Drawing
- Upload image mask -> TBD
- Audio -> TBD
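For the point-selection mode, clicks have to be converted into the prompt format Segment Anything expects: an (N, 2) array of (x, y) pixel coordinates and a length-N label array in which 1 marks a foreground point and 0 a background point. A minimal sketch follows, assuming clicks arrive as `(x, y, is_foreground)` tuples. The helper name `clicks_to_sam_prompt` and that tuple layout are hypothetical and not taken from this project.

```python
import numpy as np

def clicks_to_sam_prompt(clicks):
    """Convert (x, y, is_foreground) clicks into SAM point prompts."""
    if not clicks:
        raise ValueError("at least one click is required")
    coords = np.array([[x, y] for x, y, _ in clicks], dtype=np.float32)
    labels = np.array([1 if fg else 0 for _, _, fg in clicks], dtype=np.int32)
    return coords, labels

# One foreground click and one background click.
coords, labels = clicks_to_sam_prompt([(120, 80, True), (30, 40, False)])

# With the segment_anything package installed and a SamPredictor on which
# set_image(...) has already been called, the arrays would be used like this:
# masks, scores, _ = predictor.predict(
#     point_coords=coords, point_labels=labels, multimask_output=True
# )
```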
Models used:
- Segment Anything (SAM)
- Grounding DINO
- Matte Anything (ViTMatte - Hust Labs)
- Stable Diffusion 2 (Hugging Face diffusers)
- Stable Diffusion ControlNet
- BLIP
- Mobile SAM
- Matting Anything Model (MAM - SHI Labs) -> TBD
TBD:
- Option to choose among checkpoints, e.g. Stable Diffusion versions
- Options to further control SD generation
- More tasks, e.g. image editing with more models
Upscale Task and Text to Image Task:
Inpainting (tea pot -> puppy || green apple -> orange || cat -> rabbit):
Remove/Replace Background (SD generated backgrounds):
Remove Background for Transparent objects:
Image to Image A. (prompt for terrace swimming pool):
Image to Image B. (prompt for 1: pool table with balls, 2: fantasy landscape on artstation):
Advanced Settings to tune the results:
This app is built with the help of the following models and libraries. Please visit their pages to learn more about them.
- Grounded Segment Anything
- Segment Anything
- GroundingDINO
- Stable Diffusion with Hugging Face diffusers
- ControlNet
- Matte-Anything
- Mobile SAM
- BLIP