Measure the aesthetic quality of photographs, based on cutting-edge methodology.
An avid photographer may easily end up having dozens of seemingly identical photos. Or perhaps they wish to pick a handful of best photos from their collection. They might hope that they could give a set of photos to someone, who then ranks the photos and picks the best ones.
This is possible with modern image analysis and machine learning techniques. I'm talking about assessing the aesthetic quality of photographs -- automatically. Unfortunately, none of the applications from Adobe, Apple, or others seem to address this issue. This project aims to fill the gap, that is aiding photo management by cutting-edge methodology.
The question of photo quality is subjective to certain extent. However, most photos are taken by ordinary people who follow general rules as it comes to photo quality. In this domain, we want to see excellent sharpness, little motion blur, golden rule compositions, no compression artefacts (think of JPEG), people's faces that have their eyes open, etc.
In fact many photo features, both technical and aesthetic in nature, can be modeled and measured. Indeed, the field of measuring image quality is a well-known research topic. Most of the research, however, has concentrated on measuring quality when the true image is known. But in photography, all we have is a single photo with no ground-truth. This fairly new field of study is called no-reference image quality assessment, or NR for short, and it is in this field where we wish to make a contribution.
We have explored the scientific literature on NR and measures of image quality, and we have implemented approximately sixty such measures. We wish to use these to predict how humans rate photos.
As test data, we use the DPChallange dataset, which consists of 16504 photos, of both color and black & white. Based on measurements on photos, we train a statistical model to predict average ratings that a photo has received. Models like linear regression and RandomForest were considered.
Concentrating on color images, we achieved a correlation of 0.423 between predicted ratings and average ratings. In total 60% of data was used for training a RandomForest model, and the rest of the data was used to assess the prediction accuracy of the model.
Each photo was rated by several people, so a true rating is in fact a distribution of ratings, rather than average. When we sampled ratings randomly from these distributions, as we assume a human would do, we achieved a correlation of 0.468 between sampled ratings and averages. Based on these correlations, we claim, our method works at 90% level when compared to a human performance.
We used the stdev of ratings to measure the prediction error. On an average photo in the test set, the differences between individual ratings and the average have stdev 1.628. As the absolute prediction error is only 0.615 (rating difference from average) on the test set, we end up with relative error of only 0.380 stdev's. With 95% confidence, the predictions fall within one stdev [-0.86;1.07] around average rating, as shown in the plot:
Original 755 x 501 JPG photo:
-
Here are the blur-related measurements we have for the photo. The blur measurements show that the photo has good quality. In practice the photos always contain some amount of blur -- here the radius of blur is estimated to be 3 pixels.
Blur measures | Value | Explanation |
---|---|---|
Blur annoyance quality (0--1): | 0.84759 | the blur does not distract too much |
MDWE horizontal blur width: | 3.29405 | the amount of blur is still modest |
MDWE Gaussian quality (0--1): | 1 | blurry appearance is still appealing |
MDWE JPEG2000 quality (0--1): | 1 | there are not JPEG2000 artefacts to speak of |
By measuring the blurriness properties we can arrive at some conclusions on a photo's attractiveness. From a set of dozen similar photos, automatically picking the sharpest one saves us the trouble of pixel-peeping the minor differences between candidates.
Hues in the photo:
-
The image below shows the hues in the photo. The block-like structure probably originates from the JPEG compression. In the HSV color space the variation within those blocks is fully conveyed by the Saturation and Value dimensions. Note that colors that are almost white or black are quite insensitive to hue.
-
Let's have a look at the hue amounts in a polar coordinates. The image shows that the most dominant hues are lime green and orange. A best-fit von Mises distribution (black line) is used to identify dominating hues.
Hue Measurements | Value | Explanation |
---|---|---|
Color dispersion(mu) | 0.6655 (rad), or 38.13° | dominant hue is orange |
Color dispersion(kappa) | 1.3797 | the dominance is not particularly concentrated |
Color dispersion(pi) | 0.72621 | dominant hues cover 72.6 % of the photo's pixels |
Color dispersion(ds) | 323.5 px | average distance between two pixels that have dominant hue |
Color dispersion(custom.ds) | 0.970699 | pixels with dominant hues are almost as scattered as any two pixels |
Detecting dominant colors may help to decide whether a photo has incorrect color temperature or other type of color cast. By measuring the scatter, we are able to distinguish natural colors, such as blue sky and green forest, from an incorrect color temperature.
Sharpness and blurriness:
- In some sense sharpness and blurriness are the opposites of each other. A photograph may very well contain both, and still be of a high quality.
- In the sharpness image the sharpest pixels (white) are those at the strong edges:
- In the blurriness image the background of the photo appears as highly blurry (white):
Image segmentation:
- Image segmentation offers a way to split the image so that objects and coherent areas can be detected. In many photos, we have some areas filled with more or less the same color, while the areas may differ from each other. By detecting the segments, we simultaneously split the image into distinguishable parts.
- In the following image we have decided that the puffin photo is naturally split into 26 clusters, each of which corresponds to a single color. Using these clustered colors, the image has 20212 connected color segments, especially below in the grass area. Here they are, each shown with a random color to make them stand out:
- In the connected segments image, we show ten largest connected components against a black background:
- In the reconstructed image we have used only 26 clusters, that is colors, to reconstruct the original photo:
Exposure and contrast:
Three simple measures for detecting whether image has pleasing exposure and contrast:
Measurement | Value | Explanation |
---|---|---|
Basic exposure | 0.933 | Good exposure, near middle-grey |
Basic RMS contrast | 0.136 | Natural luminance spread (dark vs. light, near 0.18) |
Basic interval contrast | 0.570 | moderate dynamics: middle 95% of photo has 57% of full dynamic range |
More measures:
In Datta et. al (2006), the authors propose dozens of measures that are potentially related to photo quality. We have implemented most of them, and for the puffin image we obtained the following values.
Datta R., Joshi D., Li J. and Wang J.: Studying Aesthetics in Photographic Images Using a Computational Approach, 2006.
Measurement | Datta variant | Value | Explanation |
---|---|---|---|
Average Intensity | 1 | 0.394 | Slightly darker than neutral grey |
Colorfulness | 2 | 61.03 | Ordinary distance to equally distributed colors |
Colorfulness-Grey | 2 (grey, n=6) | 25.14 | Quite colorful (distance from a greyscale image) |
Average saturation | 3 | 0.244 | |
Average hue | 4 | 103.02 | Average hue is green |
Average central hue | 5 | 91.55 | In the image center we have green with a hint of yellow |
Average central saturation | 6 | 0.196 | Center is less saturated than the rest of photo |
Average central intensity | 7 | 0.475 | Center is brigher than edge areas |
Texture, hue | 10,11,12;19 | 0.00803, 0.0259, 0.0921; sum 0.126 | |
Texture, saturation | 13,14,15;20 | 0.0125, 0.0316, 0.0747; sum 0.119 | |
Texture, value | 16,17,18;21 | 0.00955, 0.0248, 0.0584; sum 0.0927 | |
Size feature | 22 | 1256 | Modest-sized photo: sum of rows and columns |
Aspect ratio | 23 | 0.664, 1.51 | Classical 3:2 aspect ratio |
Number of large patches | 24 | 5 | Full number of large segments (max 5) |
Number of clusters | 25 | 26 | Photo is complex: best described by 26 clusters (random variation) |
Average patch HSV values | 26--40 | 272.6, 63.2, 30.1, 25.3, 224.4; 0.0586, 0.0867, 0.194, 0.137, 0.0438; 0.238, 0.276, 0.382, 0.329, 0.376 | |
Relative patch sizes | 41--45 | 0.112, 0.0616, 0.0604, 0.0454, 0.0389 | None of the segments fills the photo fully |
Segment position codes | 48--52 | 12, 12, 21, 12, 21 | Five largest segment centers reside on top, left-middle part of photo |
Segment distances from center | Esa proxy 48--52 | 0.492, 0.446, 0.443, 0.324, 0.353 | Five largest segments not in the center nor corners |
Only center within DOF for H, S, V | 53--55 rescaled | 0.113, -0.109, 0.109 |
The project leans towards the scientific research on no-reference image quality. It combines the scattered results from numerous articles and provides an implementation for the methods that lack one. Together, a number of photographic features can be measured from a photo. To assess the quality of a photo, we use machine learning techniques on photo features to arrive at a fair quality assessment. As training data, we use publicly available photo collections that contain individual photo ratings.
- Find relevant articles and measures for no-reference quality assessment.
- Implement the most relevant methods.
- Download a collection of photos with ratings from a public source.
- Train a machine learning model to map measurements into a photo rating.
- Learn and improve on what we have.
This software has been written in R version 3.4.3. It serves as a prototype for photo quality assessment, and as a library of working implementations. Keeping a future production version in mind, the code contains a lot of details that support implementation work in lower-level languages.
In current code, we use the following R packages along with their dependencies: emdist, waveslim, NbClust, jpeg, png, ggplot2, circular and mmand.