This repository contains a set of example projects for image-related transformer tasks using Amazon SageMaker. The examples cover the following tasks:
- text-to-image-custom-container: Generate an image from a text prompt. Deploy using a custom Docker container on SageMaker.
- image-to-image-custom-container: Generate an image from a starting image and text prompt. Deploy using a custom Docker container on SageMaker.
- image-inpainting-custom-container: Alter a portion of an image according to a text prompt and image mask. Deploy using a custom Docker container.
- cross-modality-container-ofa: Generate a caption to describe an image, answer a specific question about an image, or generate a border around a specific object in an image.
To see the example project for each task above, take a look at the corresponding directory in this repository with the same name.
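Once one of these containers is deployed, a client would typically call the endpoint through the SageMaker runtime API. The following is a minimal sketch, not the request format used by these projects: the JSON field names `prompt` and `image` and the helper names `build_payload` and `invoke` are hypothetical, and the real schema is whatever each example's inference code expects, so check the corresponding directory before relying on it.

```python
import base64
import json


def build_payload(prompt, image_bytes=None):
    """Build a JSON request body.

    The field names "prompt" and "image" are assumptions for illustration;
    the real schema is defined by the container's inference handler.
    """
    payload = {"prompt": prompt}
    if image_bytes is not None:
        # Binary image data must be text-encoded to travel inside JSON.
        payload["image"] = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps(payload)


def invoke(endpoint_name, body):
    """Send the request to a deployed endpoint and return the raw response bytes."""
    import boto3  # imported lazily so build_payload works without AWS access

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=body,
    )
    return response["Body"].read()


# Text-only request, as a text-to-image endpoint might accept it.
body = build_payload("a photo of a dog on a beach")
```

Calling `invoke("<your-endpoint-name>", body)` requires AWS credentials and a running endpoint. How to decode the response (for example, as an encoded image) also depends on the container.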
Many of the models in this repository use the Stable Diffusion algorithm. From Wikipedia: "Stable Diffusion is a machine learning, text-to-image model developed by StabilityAI, in collaboration with EleutherAI and LAION, to generate digital images from natural language descriptions. The model can be used for other tasks too, like generating image-to-image translations guided by a text prompt. Stable Diffusion was trained on a subset of the LAION-Aesthetics V2 dataset. It was trained using 256 Nvidia A100 GPUs at a cost of $600,000."
Here is an example of each task, so that you can get an idea of what each one does.
For image inpainting, first manually create a mask that focuses the algorithm on part of an image. The blacked-out part of the image is frozen and will not be changed by the algorithm. Here, we want to keep the background but swap out the dog.
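The mask convention can be sketched in plain Python. This is an illustration under stated assumptions, not the code used by the inpainting example: black (0) marks frozen pixels as described above, white (255) is assumed to mark the editable region, and the helper names `make_rect_mask` and `composite` are hypothetical. In practice the mask would usually be drawn in an image editor or with an imaging library.

```python
def make_rect_mask(width, height, box):
    """Return a grayscale mask as rows of pixel values.

    Pixels inside `box` (left, top, right, bottom; right and bottom exclusive)
    are white (255, free to change); everything else is black (0, frozen).
    """
    left, top, right, bottom = box
    return [
        [255 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


def composite(original, generated, mask):
    """Keep original pixels where the mask is black, generated ones where it is white."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# A 4x3 mask whose editable region is the 2x2 block in the lower middle.
mask = make_rect_mask(4, 3, box=(1, 1, 3, 3))
```

The `composite` function only shows what "frozen" means for the final picture. It does not describe how the diffusion model itself uses the mask during generation.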
For image-to-image generation, we start with a simple cat drawing and enhance the image according to the prompt.
- Prompt: what does the image describe?
  Model: a cat wearing a face mask
- Prompt: What is the cat wearing?
  Model: Mask
- Prompt: which region does the text " eyes " describe?
  Model:

- Prompt: what does the image describe?
  Model: portrait of a group of pets, cats and dogs
- Prompt: What is the color of the cat?
  Model: gray
- Prompt: What is the color of the largest dog?
  Model: brown
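The prompts in these examples follow three patterns: a fixed captioning question, a free-form question about the image, and a region query that wraps the target phrase in spaced quotes. A small sketch of helpers that reproduce those strings is shown below. The function names are hypothetical, the templates are copied from the examples above, and whether the model requires this exact wording is an assumption.

```python
def caption_prompt():
    """Prompt used above to ask for a description of the whole image."""
    return "what does the image describe?"


def question_prompt(question):
    """Visual question answering: the question is passed through as written."""
    return question.strip()


def region_prompt(phrase):
    """Ask for the region that matches `phrase`.

    The spaces inside the quotes mirror the example prompt above.
    """
    return f'which region does the text " {phrase.strip()} " describe?'


prompts = [
    caption_prompt(),
    question_prompt("What is the cat wearing?"),
    region_prompt("eyes"),
]
```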
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.