Composite-Image-Evaluation

The following metrics can be used to evaluate the quality of composite images from different aspects.

  • Evaluate whether the foreground is harmonious with the background.

    • Harmony score: use an illumination encoder to extract illumination codes from the foreground and the background, and measure their similarity.

    • Inharmony hit: use an inharmonious region localization model to detect the inharmonious region, and calculate the overlap (e.g., IoU) between the detected region and the foreground region.
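
The overlap step of the inharmony hit can be sketched as follows. This is a minimal illustration, assuming both the detected region and the foreground region are available as binary masks of the same size; the function name `mask_iou` and the toy masks are hypothetical, and the localization model that would produce the detected mask is not shown.

```python
import numpy as np


def mask_iou(pred_mask, fg_mask):
    """Intersection over union between two binary masks of the same shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    fg = np.asarray(fg_mask, dtype=bool)
    if pred.shape != fg.shape:
        raise ValueError("masks must have the same shape")
    union = np.logical_or(pred, fg).sum()
    if union == 0:
        # Both masks are empty; returning 0.0 is a convention assumed here.
        return 0.0
    return float(np.logical_and(pred, fg).sum() / union)


# Toy masks (hypothetical data): a 2x2 foreground and a 2x3 detected region.
fg = np.zeros((4, 4), dtype=bool)
fg[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True

iou = mask_iou(pred, fg)  # intersection = 4 pixels, union = 6 pixels
print(f"IoU = {iou:.4f}")
```

In practice the detected mask would usually be a thresholded probability map, so the chosen threshold affects the resulting IoU.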

  • Evaluate whether the foreground object placement is reasonable.

  • Evaluate whether the foreground is compatible with the background in terms of geometry and semantics.

    • FOS score: use a foreground object search model to calculate the compatibility score between the foreground and the background in terms of geometry and semantics.

  • Evaluate the fidelity of the foreground, i.e., whether the synthesized foreground is similar to the input foreground.

    • CLIP score: use CLIP to extract embeddings from the input foreground image and the generated foreground patch, and measure their similarity.

    • DINO score: use DINO to extract embeddings from the input and generated foregrounds, and measure their average cosine similarity.
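
The similarity step shared by the two fidelity scores can be sketched as follows. This is a minimal illustration, assuming the embeddings have already been extracted (by CLIP or DINO) into two arrays with one row per image pair, and assuming cosine similarity is used for both scores; the function name `mean_cosine_similarity` and the toy vectors are hypothetical.

```python
import numpy as np


def mean_cosine_similarity(input_emb, generated_emb):
    """Average cosine similarity between row-aligned embedding matrices."""
    a = np.asarray(input_emb, dtype=np.float64)
    b = np.asarray(generated_emb, dtype=np.float64)
    if a.ndim != 2 or a.shape != b.shape:
        raise ValueError("expected two (N, D) arrays of the same shape")
    # L2-normalize each row so the row-wise dot product is the cosine similarity.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(np.sum(a * b, axis=1)))


# Toy embeddings (hypothetical data), one row per input/generated pair.
input_emb = np.array([[1.0, 0.0], [1.0, 1.0]])
generated_emb = np.array([[2.0, 0.0], [1.0, 0.0]])

score = mean_cosine_similarity(input_emb, generated_emb)
print(f"mean cosine similarity = {score:.4f}")
```

The first pair points in the same direction (similarity 1) and the second pair is 45 degrees apart (similarity 1/sqrt(2)), so the score is their mean.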

  • Evaluate the overall quality of the foreground or the whole composite image.

    • FID: use a pretrained image encoder (e.g., InceptionNet, CLIP) to extract embeddings from real images and generated images, and compute the Fréchet Inception Distance between the two sets of embeddings.

    • QS: use a quality score to measure the quality of each generated image, and average the scores over all generated images.
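
The distance computation behind FID can be sketched as follows. This is a minimal illustration, assuming the real and generated embeddings have already been extracted into two arrays with one row per image, and that each set is summarized by its mean and covariance; the function name `frechet_distance` and the random toy data are hypothetical. It uses only NumPy and obtains the trace of the matrix square root from eigenvalues, which is one possible implementation choice rather than the one used by any particular FID package.

```python
import numpy as np


def frechet_distance(real_emb, gen_emb):
    """Squared Frechet distance between Gaussians fitted to two embedding sets.

    d^2 = ||mu_r - mu_g||^2 + Tr(C_r) + Tr(C_g) - 2 * Tr((C_r C_g)^(1/2))
    """
    x = np.asarray(real_emb, dtype=np.float64)
    y = np.asarray(gen_emb, dtype=np.float64)
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    # Tr((C_r C_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of C_r C_g, which are real and non-negative in exact
    # arithmetic; clip small negative values caused by rounding.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    d2 = (np.sum((mu_x - mu_y) ** 2)
          + np.trace(cov_x) + np.trace(cov_y) - 2.0 * trace_sqrt)
    return float(max(d2, 0.0))


# Toy embeddings (hypothetical data): the "generated" set is the "real" set
# shifted by 3 in each of 4 dimensions, so only the mean term contributes.
rng = np.random.default_rng(0)
real_emb = rng.normal(size=(200, 4))
gen_emb = real_emb + 3.0

fid_same = frechet_distance(real_emb, real_emb)
fid_shifted = frechet_distance(real_emb, gen_emb)
print(f"identical sets: {fid_same:.6f}, shifted set: {fid_shifted:.6f}")
```

With real encoders the embedding dimension is large, so the covariance estimate needs many images to be stable; scores computed from different sample sizes or different encoders are not directly comparable.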