This project provides tools to compute various image quality assessment (IQA) metrics.
- Compute Image Quality Assessment Metrics: Assess quality with multiple full-reference and no-reference metrics.
- Multiple Color Spaces: Support for different color spaces (e.g., RGB, HSV) to assess image quality in various domains.
- Heatmap Generation: Generate metric maps visualizing the spatial distribution of metric values across the image.
- Image Difference: Generate thresholded difference images to highlight significant differences between two images.
- Configurable via JSON: Flexible configuration through JSON files for specifying image paths, metrics, color spaces, and output options.
It supports:
- 18 full-reference metrics
- 5 no-reference metrics
- image diffs in 8 different color spaces with flexible thresholding
The full-reference and no-reference metrics are drawn from the piq, pyiqa, and ImageHash Python packages (MSE is implemented in libra itself):
Some example comparison databases are available here: https://lanl.github.io/libra/
- Clone Repository
git clone https://github.com/lanl/libra
- Install Dependencies
pip install opencv-python-headless numpy matplotlib scikit-image torch piq pyiqa ImageHash
Note: some dependencies are not available through conda. We recommend using a virtual environment for now.
The JSON configuration file should contain the following keys:
- reference_image_path (str): Path to the reference image.
- distorted_image_path (str): Path to the distorted image.
- output_directory (str): Path to the output directory where the CSV file and metric maps will be saved.
- output_filename (str, optional): Name of the output CSV file (default: "metrics.csv").
- generate_metrics (bool, optional): Flag to generate metrics (default: False).
- generate_maps (bool, optional): Flag to generate metric maps (default: False).
- generate_image_difference (bool, optional): Flag to generate thresholded difference images (default: False).
- difference_threshold (int, optional): Threshold value for generating thresholded difference images (default: 10).
- metrics (list of str, optional): List of metrics to compute.
- color_spaces (list of str, optional): List of color spaces to use for computing metrics (default: ["RGB"]).
- map_window_size (int, optional): Window size for computing metric maps (default: 161).
- map_step_size (int, optional): Step size for computing metric maps (default: 50).
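For intuition, the two map keys describe a window that slides across both images in steps, producing one metric value per window position. A minimal NumPy sketch of that idea, using local MSE as the metric — an illustration only, not libra's actual implementation:

```python
import numpy as np

def metric_map(ref, dist, window=11, step=5):
    """Slide a window over both images and compute a per-window MSE map.

    Sketch of the map_window_size / map_step_size idea; not libra's code.
    """
    rows = range(0, ref.shape[0] - window + 1, step)
    cols = range(0, ref.shape[1] - window + 1, step)
    out = np.empty((len(rows), len(cols)))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            a = ref[r:r + window, c:c + window].astype(np.float64)
            b = dist[r:r + window, c:c + window].astype(np.float64)
            out[i, j] = np.mean((a - b) ** 2)
    return out

# Identical images produce an all-zero map
img = np.random.default_rng(0).integers(0, 256, (64, 64))
print(metric_map(img, img).max())  # 0.0
```

libra then renders such a map as a heatmap; the window/step trade-off is resolution versus compute time.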
Here is an example of the JSON configuration found in samples:
{
    "reference_image_path": "samples/data/test3/orig.png",
    "distorted_image_path": "samples/data/test3/compressed.png",
    "output_directory": "output_compression",
    "output_filename": "metrics.csv",
    "generate_metrics": true,
    "generate_maps": true,
    "generate_image_difference": true,
    "difference_threshold": 100,
    "metrics": ["SSIM", "VSI", "GMSD", "MSE", "DSS"],
    "color_spaces": ["RGB", "HSV"],
    "map_window_size": 11,
    "map_step_size": 30
}
Run the following command to compute the metrics:
python libra/main.py samples/sample.json
A command-line interface is also provided; type:
python libra/main.py -h
for more information.
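As an illustration of the generate_image_difference and difference_threshold options, a thresholded difference image can be sketched as a binary mask of pixels whose absolute difference exceeds the threshold. This is a hypothetical NumPy re-implementation of the idea, not libra's code:

```python
import numpy as np

def threshold_diff(ref, dist, threshold=10):
    """Binary mask (0/255) marking pixels that differ by more than `threshold`.

    Illustrative sketch of the generate_image_difference idea; not libra's code.
    """
    diff = np.abs(ref.astype(np.int16) - dist.astype(np.int16))
    if diff.ndim == 3:  # reduce color images to the per-pixel max channel difference
        diff = diff.max(axis=2)
    return (diff > threshold).astype(np.uint8) * 255

ref = np.zeros((4, 4), dtype=np.uint8)
dist = ref.copy()
dist[0, 0] = 200  # one large change
dist[1, 1] = 5    # one sub-threshold change
mask = threshold_diff(ref, dist, threshold=10)
print(mask[0, 0], mask[1, 1])  # 255 0
```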
This example evaluates the visualization quality of an isotropic turbulence dataset subjected to tensor compression with a maximum Peak Signal-to-Noise Ratio (PSNR) of 40. The assessment focuses on how effectively the tensor compression retains the visual fidelity of the turbulence data.
References
Dataset: https://klacansky.com/open-scivis-datasets/
Compression Technique: https://github.com/rballester/tthresh
Reference Image | Compressed Image (PSNR: 40)
---|---
Color Space | Description |
---|---|
RGB | Standard color space with three primary colors: Red, Green, and Blue. Commonly used in digital images and displays. |
HSV | Stands for Hue, Saturation, and Value. Often used in image processing and computer vision because it separates color (hue) from intensity (value). |
HLS | Stands for Hue, Lightness, and Saturation. Similar to HSV but with a different way of representing colors. |
LAB | Consists of three components: Lightness (L*), a* (green to red), and b* (blue to yellow). Mimics human vision. |
XYZ | A linear color space derived from the CIE 1931 color matching functions. Basis for many other color spaces. |
LUV | Similar to LAB but with a different chromaticity component. Used in color difference calculations and image analysis. |
YCbCr | Color space used in video compression. Separates the image into luminance (Y) and chrominance (Cb and Cr) components. |
YUV | Used in analog television and some digital video formats. Separates image into luminance (Y) and chrominance (U and V). |
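To illustrate how a color-space choice changes what a metric sees, here is the standard BT.601 RGB-to-YCbCr transform behind the last two rows, sketched in NumPy. libra's own conversions are assumed to go through OpenCV; this version is for illustration only:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an 8-bit RGB image to YCbCr using the BT.601 coefficients.

    Illustrative sketch; not libra's actual conversion path.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

# Pure white maps to maximum luminance and neutral chroma (~[255, 128, 128])
white = np.full((1, 1, 3), 255, dtype=np.uint8)
print(rgb_to_ycbcr(white)[0, 0])
```

Computing a metric on the Y channel alone weights luminance errors, which is why per-color-space results can differ for the same image pair.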
Metric | Python Package | Description | Value Ranges |
---|---|---|---|
MSE | libra | Measures the average squared difference between the reference and test images. | Range: [0, ∞). Lower MSE indicates higher similarity. |
SSIM | piq | Assesses the structural similarity between images considering luminance, contrast, and structure. | Range: [-1, 1]. Higher values indicate better similarity. |
PSNR | piq | Represents the ratio between the maximum possible power of a signal and the power of corrupting noise. | Range: [0, ∞) dB. Higher values indicate better image quality. |
FSIM | piq | Evaluates image quality based on feature similarity considering phase congruency and gradient magnitude. | Range: [0, 1]. Higher values indicate better feature similarity. |
MS-SSIM | piq | Extension of SSIM that evaluates image quality at multiple scales. | Range: [0, 1]. Higher values indicate better structural similarity. |
VSI | piq | Measures image quality based on visual saliency. | Range: [0, 1]. Higher values indicate better visual similarity. |
SR-SIM | piq | Assesses image quality using spectral residual information. | Range: [0, 1]. Higher values indicate better visual similarity. |
MS-GMSD | piq | Evaluates image quality based on gradient magnitude similarity across multiple scales. | Range: [0, ∞). Lower values indicate higher similarity. |
LPIPS | piq | Uses deep learning models to assess perceptual similarity. | Range: [0, 1]. Lower values indicate higher similarity. |
PieAPP | piq | Deep learning-based metric for perceptual image quality. | Range: [0, 1]. Lower values indicate higher quality. |
DISTS | piq | Combines deep learning features to evaluate image quality based on structure and texture similarity. | Range: [0, 1]. Lower values indicate higher similarity. |
MDSI | piq | Measures image quality based on mean deviation similarity index. | Range: [0, ∞). Lower values indicate better quality. |
DSS | piq | Computes image quality using a detailed similarity structure. | Range: [0, 1]. Higher values indicate better similarity. |
IW-SSIM | piq | Information-weighted SSIM that emphasizes important regions in images. | Range: [0, 1]. Higher values indicate better structural similarity. |
VIFp | piq | Measures image quality based on visual information fidelity. | Range: [0, 1]. Higher values indicate better preservation of information. |
GMSD | piq | Gradient Magnitude Similarity Deviation metric for assessing image quality. | Range: [0, ∞). Lower values indicate higher similarity. |
HaarPSI | piq | Uses Haar wavelet-based perceptual similarity index to evaluate image quality. | Range: [0, 1]. Higher values indicate better perceptual similarity. |
pHash | ImageHash | Generates a compact perceptual hash of each image; similarity is scored as the Hamming distance between the two hashes. | Range: [0, ∞). Higher values indicate worse perceptual similarity. |
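The MSE and PSNR rows above follow the standard definitions, which can be sketched directly in NumPy for 8-bit images (an illustration only, not the piq/libra implementations):

```python
import numpy as np

def mse(ref, dist):
    """Mean squared error between two same-shaped images."""
    return np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)

def psnr(ref, dist, max_val=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    m = mse(ref, dist)
    return float("inf") if m == 0 else 10.0 * np.log10(max_val ** 2 / m)

ref = np.zeros((8, 8), dtype=np.uint8)
dist = ref + 16                   # uniform offset of 16 gray levels
print(mse(ref, dist))             # 256.0
print(round(psnr(ref, dist), 2))  # 24.05
```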
Metric | Python Package | Description | Value Ranges |
---|---|---|---|
BRISQUE | pyiqa | Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) uses natural scene statistics to measure image quality. | Range: [0, 100]. Lower values indicate better quality. |
CLIP-IQA | piq | Image quality metric that utilizes the CLIP model to assess the visual quality of images based on their similarity to predefined text prompts. | Range: [0, 1]. Higher values indicate better quality. |
NIQE | pyiqa | Natural Image Quality Evaluator. It assesses image quality based on statistical features derived from natural scene statistics. | Range: [0, 100]. Lower values indicate better quality. |
MUSIQ | pyiqa | Multi-Scale Image Quality. An advanced metric that evaluates image quality across multiple scales to better capture perceptual quality. | Range: [0, 1]. Higher values indicate better quality. |
NIMA | pyiqa | Neural Image Assessment. A deep learning-based model that predicts the aesthetic and technical quality of images. | Range: [0, 10]. Higher values indicate better quality. |