This project provides tools to compute various image quality assessment (IQA) metrics.
- Compute Image Quality Assessment Metrics: Assess quality with multiple full-reference and no-reference metrics.
- Multiple Color Spaces: Support for different color spaces (e.g., RGB, HSV) to assess image quality in various domains.
- Heatmap Generation: Generate metric maps visualizing the spatial distribution of metric values across the image.
- Image Difference: Generate thresholded difference images to highlight significant differences between two images.
- Configurable via JSON: Flexible configuration through JSON files for specifying image paths, metrics, color spaces, and output options.
It supports:
- 18 full-reference metrics
- 5 no-reference metrics
- image diffs in 8 different color spaces with flexible thresholding
The full-reference and no-reference metrics are drawn from the piq, pyiqa, and ImageHash Python packages (MSE is implemented in libra itself):
Some example comparison databases are available here: https://lanl.github.io/libra/
- Clone Repository
git clone https://github.com/lanl/libra
- Install Dependencies
pip install opencv-python-headless numpy matplotlib scikit-image torch piq pyiqa ImageHash
Note: some dependencies are not available through conda. We recommend using a virtual environment for now.
The JSON configuration file should contain the following keys:
- reference_image_path (str): Path to the reference image.
- distorted_image_path (str): Path to the distorted image.
- output_directory (str): Path to the output directory where the CSV file and metric maps will be saved.
- output_filename (str, optional): Name of the output CSV file (default: "metrics.csv").
- generate_metrics (bool, optional): Flag to generate metrics (default: False).
- generate_maps (bool, optional): Flag to generate metric maps (default: False).
- generate_image_difference (bool, optional): Flag to generate thresholded difference images (default: False).
- difference_threshold (int, optional): Threshold value for generating thresholded difference images (default: 10).
- metrics (list of str, optional): List of metrics to compute.
- color_spaces (list of str, optional): List of color spaces to use for computing metrics (default: ["RGB"]).
- map_window_size (int, optional): Window size for computing metric maps (default: 161).
- map_step_size (int, optional): Step size for computing metric maps (default: 50).
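For intuition, the two map keys describe a window that slides across both images in steps, producing one metric value per window position. A minimal NumPy sketch of that idea, using local MSE as the metric — an illustration only, not libra's actual implementation:

```python
import numpy as np

def metric_map(ref, dist, window=11, step=5):
    """Slide a window over both images and compute a per-window MSE map.

    Sketch of the map_window_size / map_step_size idea; not libra's code.
    """
    rows = range(0, ref.shape[0] - window + 1, step)
    cols = range(0, ref.shape[1] - window + 1, step)
    out = np.empty((len(rows), len(cols)))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            a = ref[r:r + window, c:c + window].astype(np.float64)
            b = dist[r:r + window, c:c + window].astype(np.float64)
            out[i, j] = np.mean((a - b) ** 2)
    return out

# Identical images produce an all-zero map
img = np.random.default_rng(0).integers(0, 256, (64, 64))
print(metric_map(img, img).max())  # 0.0
```

libra then renders such a map as a heatmap; the window/step trade-off is resolution versus compute time.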
Here is an example of the JSON configuration found in samples:
{
    "reference_image_path": "samples/data/test3/orig.png",
    "distorted_image_path": "samples/data/test3/compressed.png",
    "output_directory": "output_compression",
    "output_filename": "metrics.csv",
    "generate_metrics": true,
    "generate_maps": true,
    "generate_image_difference": true,
    "difference_threshold": 100,
    "metrics": ["SSIM", "VSI", "GMSD", "MSE", "DSS"],
    "color_spaces": ["RGB", "HSV"],
    "map_window_size": 11,
    "map_step_size": 30
}
Run the following command to compute the metrics:
python libra/main.py samples/sample.json
A command-line interface is also provided; type:
python libra/main.py -h
for more information.
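As an illustration of the generate_image_difference and difference_threshold options, a thresholded difference image can be sketched as a binary mask of pixels whose absolute difference exceeds the threshold. This is a hypothetical NumPy re-implementation of the idea, not libra's code:

```python
import numpy as np

def threshold_diff(ref, dist, threshold=10):
    """Binary mask (0/255) marking pixels that differ by more than `threshold`.

    Illustrative sketch of the generate_image_difference idea; not libra's code.
    """
    diff = np.abs(ref.astype(np.int16) - dist.astype(np.int16))
    if diff.ndim == 3:  # reduce color images to the per-pixel max channel difference
        diff = diff.max(axis=2)
    return (diff > threshold).astype(np.uint8) * 255

ref = np.zeros((4, 4), dtype=np.uint8)
dist = ref.copy()
dist[0, 0] = 200  # one large change
dist[1, 1] = 5    # one sub-threshold change
mask = threshold_diff(ref, dist, threshold=10)
print(mask[0, 0], mask[1, 1])  # 255 0
```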
This example evaluates the visualization quality of an isotropic turbulence dataset subjected to tensor compression with a maximum Peak Signal-to-Noise Ratio (PSNR) of 40. The assessment focuses on how effectively the tensor compression retains the visual fidelity of the turbulence data.
References
Dataset: https://klacansky.com/open-scivis-datasets/
Compression Technique: https://github.com/rballester/tthresh
Reference Image | Compressed Image (PSNR: 40)
---|---
Color Space | Description |
---|---|
RGB | Standard color space with three primary colors: Red, Green, and Blue. Commonly used in digital images and displays. |
HSV | Stands for Hue, Saturation, and Value. Often used in image processing and computer vision because it separates color (hue) from intensity (value). |
HLS | Stands for Hue, Lightness, and Saturation. Similar to HSV but with a different way of representing colors. |
LAB | Consists of three components: Lightness (L*), a* (green to red), and b* (blue to yellow). Mimics human vision. |
XYZ | A linear color space derived from the CIE 1931 color matching functions. Basis for many other color spaces. |
LUV | Similar to LAB but with a different chromaticity component. Used in color difference calculations and image analysis. |
YCbCr | Color space used in video compression. Separates the image into luminance (Y) and chrominance (Cb and Cr) components. |
YUV | Used in analog television and some digital video formats. Separates image into luminance (Y) and chrominance (U and V). |
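To illustrate how a color-space choice changes what a metric sees, here is the standard BT.601 RGB-to-YCbCr transform behind the last two rows, sketched in NumPy. libra's own conversions are assumed to go through OpenCV; this version is for illustration only:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an 8-bit RGB image to YCbCr using the BT.601 coefficients.

    Illustrative sketch; not libra's actual conversion path.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

# Pure white maps to maximum luminance and neutral chroma (~[255, 128, 128])
white = np.full((1, 1, 3), 255, dtype=np.uint8)
print(rgb_to_ycbcr(white)[0, 0])
```

Computing a metric on the Y channel alone weights luminance errors, which is why per-color-space results can differ for the same image pair.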
Metric | Python Package | Description | Value Ranges |
---|---|---|---|
MSE | libra | Measures the average squared difference between the reference and test images. | Range: [0, ∞). Lower MSE indicates higher similarity. |
SSIM | piq | Assesses the structural similarity between images considering luminance, contrast, and structure. | Range: [-1, 1]. Higher values indicate better similarity. |
PSNR | piq | Represents the ratio between the maximum possible power of a signal and the power of corrupting noise. | Range: [0, ∞) dB. Higher values indicate better image quality. |
FSIM | piq | Evaluates image quality based on feature similarity considering phase congruency and gradient magnitude. | Range: [0, 1]. Higher values indicate better feature similarity. |
MS-SSIM | piq | Extension of SSIM that evaluates image quality at multiple scales. | Range: [0, 1]. Higher values indicate better structural similarity. |
VSI | piq | Measures image quality based on visual saliency. | Range: [0, 1]. Higher values indicate better visual similarity. |
SR-SIM | piq | Assesses image quality using spectral residual information. | Range: [0, 1]. Higher values indicate better visual similarity. |
MS-GMSD | piq | Evaluates image quality based on gradient magnitude similarity across multiple scales. | Range: [0, ∞). Lower values indicate higher similarity. |
LPIPS | piq | Uses deep learning models to assess perceptual similarity. | Range: [0, 1]. Lower values indicate higher similarity. |
PieAPP | piq | Deep learning-based metric for perceptual image quality. | Range: [0, 1]. Lower values indicate higher quality. |
DISTS | piq | Combines deep learning features to evaluate image quality based on structure and texture similarity. | Range: [0, 1]. Lower values indicate higher similarity. |
MDSI | piq | Measures image quality based on mean deviation similarity index. | Range: [0, ∞). Lower values indicate better quality. |
DSS | piq | Computes image quality using a detailed similarity structure. | Range: [0, 1]. Higher values indicate better similarity. |
IW-SSIM | piq | Information-weighted SSIM that emphasizes important regions in images. | Range: [0, 1]. Higher values indicate better structural similarity. |
VIFp | piq | Measures image quality based on visual information fidelity. | Range: [0, 1]. Higher values indicate better preservation of information. |
GMSD | piq | Gradient Magnitude Similarity Deviation metric for assessing image quality. | Range: [0, ∞). Lower values indicate higher similarity. |
HaarPSI | piq | Uses Haar wavelet-based perceptual similarity index to evaluate image quality. | Range: [0, 1]. Higher values indicate better perceptual similarity. |
pHash | ImageHash | Generates a compact perceptual hash of each image; similarity is scored as the Hamming distance between the two hashes. | Range: [0, ∞). Higher values indicate worse perceptual similarity. |
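The MSE and PSNR rows above follow the standard definitions, which can be sketched directly in NumPy for 8-bit images (an illustration only, not the piq/libra implementations):

```python
import numpy as np

def mse(ref, dist):
    """Mean squared error between two same-shaped images."""
    return np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)

def psnr(ref, dist, max_val=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    m = mse(ref, dist)
    return float("inf") if m == 0 else 10.0 * np.log10(max_val ** 2 / m)

ref = np.zeros((8, 8), dtype=np.uint8)
dist = ref + 16                   # uniform offset of 16 gray levels
print(mse(ref, dist))             # 256.0
print(round(psnr(ref, dist), 2))  # 24.05
```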
Metric | Python Package | Description | Value Ranges |
---|---|---|---|
BRISQUE | pyiqa | Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) uses natural scene statistics to measure image quality. | Range: [0, 100]. Lower values indicate better quality. |
CLIP-IQA | piq | Image quality metric that utilizes the CLIP model to assess the visual quality of images based on their similarity to predefined text prompts. | Range: [0, 1]. Higher values indicate better quality. |
NIQE | pyiqa | Natural Image Quality Evaluator. It assesses image quality based on statistical features derived from natural scene statistics. | Range: [0, 100]. Lower values indicate better quality. |
MUSIQ | pyiqa | Multi-Scale Image Quality. An advanced metric that evaluates image quality across multiple scales to better capture perceptual quality. | Range: [0, 1]. Higher values indicate better quality. |
NIMA | pyiqa | Neural Image Assessment. A deep learning-based model that predicts the aesthetic and technical quality of images. | Range: [0, 10]. Higher values indicate better quality. |