
UIUC Validation Metrics

This repository contains the University of Illinois at Urbana-Champaign (UIUC) validation metrics for DARPA's CriticalMAAS program, including the code for generating statistical metrics and debugging feedback on the performance of our models.

Quickstart

Installation

To get started with this repo, clone the repository and install the packages in requirements.txt. We recommend using Python >= 3.10 and a virtual environment.

git clone git@github.com:DARPA-CRITICALMAAS/uiuc-validation.git
cd uiuc-validation
python3 -m venv venv
source ./venv/bin/activate
pip install -r requirements.txt
Usage

CLI

Included are two demo scripts that can generate validation metrics for files and/or directories.

  • usgsDemo.py : Runs the original AI4CMA USGS grading algorithm. This is the algorithm that was used to grade the AI4CMA competition.
  • validationDemo.py : Generates a simple F1 score (equivalent to "difficult_weight" = None) and the USGS F1 Score.
# Example call to validationDemo.py
python validationDemo.py --pred_segmentations <mydata>/predictions --true_segmentations <mydata>/true_segmentations --map_images <mydata>/map_images --legends <mydata>/map_legends

Python

There are three provided Python methods:

  • grade_point_raster : Grades a point raster against the true raster. Provides the same score as the USGS metric.
  • grade_poly_raster : Grades a poly raster against the true raster. The score returned is equivalent to running the USGS score with "difficult_weight" = None.
  • usgs_grade_poly_raster : Grades a poly raster against the true raster. Runs an optimized version of the USGS grading metric for polygons. The score returned is the same as the USGS metric.

Metrics used for grading

Performing validation will produce a CSV file with scores for each legend feature. The scores provided are:

  • F1 Score (equivalent to USGS score with "difficult_weight" = None)
  • Precision
  • Recall
  • IoU Score (Intersection over Union) *Polygon features only
  • USGS F1 Score
  • USGS Precision
  • USGS Recall
  • Mean matched distance (in pixels) *Point features only
  • Matched Points (true positive) *Point features only
  • Unmatched Points (false positive) *Point features only
  • Missing Points (false negative) *Point features only
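For polygon features, the base scores above come down to pixel-wise confusion-matrix arithmetic. The following is a minimal illustrative sketch (not the repository's own implementation) of how precision, recall, F1, and IoU relate on a pair of binary rasters:

```python
import numpy as np

def pixel_scores(pred: np.ndarray, true: np.ndarray) -> dict:
    """Pixel-wise precision, recall, F1, and IoU for binary rasters."""
    pred = pred.astype(bool)
    true = true.astype(bool)
    tp = np.sum(pred & true)    # correct predictions
    fp = np.sum(pred & ~true)   # incorrect predictions
    fn = np.sum(~pred & true)   # missing predictions
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}

pred = np.array([[1, 1, 0], [0, 1, 0]])
true = np.array([[1, 0, 0], [0, 1, 1]])
scores = pixel_scores(pred, true)
```

Note that IoU = TP / (TP + FP + FN), so it is always at most the F1 score on the same masks.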

Visual Feedback

If the --feedback parameter is enabled, the program will also produce a debug image for each legend feature that is graded. Please note that the feedback image is the same size as the original map image. The key for these images is as follows:

${\color{#4F4}\textsf{Correct Prediction (True Positive)}}$ ${\color{#f00}\textsf{Incorrect Prediction (False Positive)}}$
${\color{#000}\textsf{Nothing Present (True Negative)}}$ ${\color{#f0f}\textsf{Missing Prediction (False Negative)}}$

Example image of feedback for a polygon feature

Example of a validation image from a UIUC model run on AR_StJoe_Mbs_poly
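The color key above amounts to painting the four confusion-matrix regions into an RGB image. A hypothetical sketch of that idea (not the repository's actual feedback code), assuming binary prediction and truth masks:

```python
import numpy as np

def feedback_image(pred: np.ndarray, true: np.ndarray) -> np.ndarray:
    """Debug visualization: green = TP, red = FP, black = TN, magenta = FN."""
    pred = pred.astype(bool)
    true = true.astype(bool)
    img = np.zeros((*pred.shape, 3), dtype=np.uint8)  # black = true negative
    img[pred & true] = (68, 255, 68)    # correct prediction (#4F4)
    img[pred & ~true] = (255, 0, 0)     # incorrect prediction (#f00)
    img[~pred & true] = (255, 0, 255)   # missing prediction (#f0f)
    return img

pred = np.array([[1, 1], [0, 0]])
true = np.array([[1, 0], [0, 1]])
img = feedback_image(pred, true)
```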

Documentation

validationDemo Parameters

  • -p, --pred_segmentations : required
    Path to file(s) and/or directory(s) containing the predicted rasters to grade. The program will grade any .tif files provided. File names are expected to match their corresponding true raster filename. E.g., if a file CA_Sage_Mbv_poly.tif is provided, there needs to be a CA_Sage_Mbv_poly.tif file in the true raster directory.
  • -t, --true_segmentations : required
    Directory containing the true raster segmentations to grade against.
  • -m, --map_images : required
    Directory containing the base map for the segmentation.
  • -l, --legends : required
    Directory containing the legend jsons for the maps.
  • -o, --output : optional
    Directory to write the validation feedback to. Default is "results". The outputs currently created include a visualization image of the validation for each legend and a csv containing the scores for each legend processed. If the directory does not exist, it will be created.
  • --log : optional
    Option to set the file that logging will write to. Default is "logs/Latest.log".
  • --min_valid_range : optional
    Maximum distance, as a % of the largest dimension of the image (the diagonal), between a predicted pixel and a true pixel for the prediction to be considered valid for scoring. Default is 0.1
  • --difficult_weight : optional
    Weight to give difficult points in the F1 score; range is a float in [0, 1]. Default is 0.7
  • --set_false_as : optional
    Sets how to treat false positives and false negatives. Options are "hard" or "easy". When set to "hard", pixels that are not within the true polygon area will be considered hard. Default is "hard"
  • --color_range : optional
    The range of color variation to consider for the legend color. Default is 4
  • -v, --verbose : optional
    Flag to change the default logging level of INFO to DEBUG.
  • --feedback : optional
    Flag to enable the saving of debugging feedback images.
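As an illustration of --min_valid_range, the percentage can be converted to a pixel distance from the image diagonal. This is a sketch of that conversion only; the interpretation of the value as a percentage of the diagonal is an assumption based on the description above, so check the source for the exact convention:

```python
import math

def valid_range_pixels(height: int, width: int, min_valid_range: float = 0.1) -> float:
    """Convert min_valid_range (assumed % of the image diagonal) to pixels."""
    diagonal = math.hypot(height, width)
    return min_valid_range / 100.0 * diagonal
```

For example, on a 3000 x 4000 map the diagonal is 5000 pixels, so the default of 0.1 corresponds to a 5-pixel matching radius under this assumption.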

Authors and acknowledgment

This repository is maintained by the UIUC team.