A custom interior design pipeline API that combines the Realistic Vision V3.0 inpainting pipeline with segmentation and MLSD ControlNets. This repo uses Cog to create a dockerized API. See the Replicate demo to test the running API.
You will need to have Cog and Docker installed to serve your model as an API. To run a prediction:
cog predict -i image=@test_images/bedroom_3.jpg -i prompt="A bedroom with a bohemian spirit centered around a relaxed canopy bed complemented by a large macrame wall hanging. An eclectic dresser serves as a unique storage solution while an array of potted plants brings life and color to the room"
To start your server and serve the model as an API:
cog run -p 5000 python -m cog.server.http
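Once the server is up, you can request predictions over HTTP. Below is a minimal sketch using Python's requests library against Cog's standard /predictions endpoint; the parameter values are illustrative, and the image is sent as a base64 data URI:

```python
import base64
import requests

# Encode the input image as a data URI, the format Cog expects for file inputs.
with open("test_images/bedroom_3.jpg", "rb") as f:
    image_uri = "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()

resp = requests.post(
    "http://localhost:5000/predictions",
    json={
        "input": {
            "image": image_uri,
            "prompt": "A minimalist bedroom with warm wood tones and soft natural light",
            "num_inference_steps": 30,
            "guidance_scale": 7.5,
        }
    },
)
resp.raise_for_status()
print(resp.json()["output"])  # the generated image, typically as a data URI or URL
```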
The API input arguments are as follows:
- image: The input image of the room to redesign. It serves as the base for inpainting and as the reference for the ControlNet conditioning.
- prompt: The input prompt is a text description that guides the image generation process. It should be a detailed and specific description of the desired output image.
- negative_prompt: A text description of terms or elements that should be avoided in the generated image, helping steer the output away from unwanted content.
- num_inference_steps: This parameter defines the number of denoising steps in the image generation process. More steps generally yield higher-quality results at the cost of slower generation.
- guidance_scale: The guidance scale parameter adjusts the influence of the classifier-free guidance in the generation process. Higher values will make the model focus more on the prompt.
- prompt_strength: In inpainting mode, this parameter controls the influence of the input prompt on the final image. A value of 1.0 indicates complete transformation according to the prompt.
- seed: The seed parameter sets a random seed for image generation. A specific seed can be used to reproduce results, or left blank for random generation.
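For orientation, here is a rough sketch of how a multi-ControlNet inpainting pipeline of this kind could be assembled with diffusers. The checkpoint names, file paths, and pre-computed conditioning maps below are assumptions for illustration, not necessarily what this repo's predictor does:

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline

# Assumed checkpoint names for illustration; the repo may pin different ones.
controlnets = [
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-seg", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-mlsd", torch_dtype=torch.float16),
]
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V3.0_VAE",  # assumed Realistic Vision V3.0 checkpoint
    controlnet=controlnets,
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("test_images/bedroom_3.jpg")
mask = Image.open("mask.png")      # hypothetical mask of the region to repaint
seg_map = Image.open("seg.png")    # hypothetical semantic segmentation map
mlsd_map = Image.open("mlsd.png")  # hypothetical MLSD straight-line map

result = pipe(
    prompt="A bohemian bedroom with a canopy bed and potted plants",
    negative_prompt="lowres, watermark, blurry",
    image=image,
    mask_image=mask,
    control_image=[seg_map, mlsd_map],  # one conditioning image per ControlNet
    num_inference_steps=30,
    guidance_scale=7.5,
    strength=1.0,  # corresponds to the prompt_strength input above
    generator=torch.Generator("cuda").manual_seed(42),
)
result.images[0].save("output.png")
```

In the actual API the segmentation and MLSD maps are presumably computed from the input image automatically; they are loaded from disk here only to keep the sketch short.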
This is a custom pipeline, inspired by AICrowd's Generative Interior Design hackathon, that uses Realistic Vision V3.0 as its base model. See the base and ControlNet model pages for their respective licenses. This codebase is licensed under the MIT license.
From neuralwork with ❤️