/art-embeddings-by-style

Visualize a collection of artwork images by style similarity.

Primary LanguagePythonMIT LicenseMIT

Art embedding by style

Visualize a collection of artwork images by style similarity, as defined in A Neural Algorithm of Artistic Style. Specify a folder of images, and the program generates a two-dimensional scatterplot in which each data point represents an image in the folder, and images that are stylistically similar appear close together. Click on the data points to see the images they represent.

Embeddings are generated by feeding each image through the 19-layer VGG convolutional neural network, embedding them in a high-dimensional vector space as the concatenation of their flattened Gram matrices, defining the distance between two vectors to be the "style loss" between them, and finally projecting onto two dimensions with TSNE.

Prerequisites

  • All package dependencies are specified in environment.yml. Users of the conda package management system can copy my exact environment with the command
conda env create -f environment.yml
  • Additionally, you must download the file imagenet-vgg-verydeep-19.mat into the project directory, found here

Usage

Execute the program with

python embed_by_style.py [path_to_image_directory]

All files in the specified directory with a .png or .jpg extension will be included in the visualization.

Options

The optional flag

-l [label_csv_file]

takes a csv file of numerical image labels. The csv file must have a heading of the form "filename,label" and must contain an entry for every image file present in the image directory. If specified, the labels will be used to color the data points in the scatter plot. Possible labels include:

  • year in which the artwork was created
  • numerical identifier of the artist (e.g. 0 for Picasso, 1 for Van Gogh, etc)
  • artistic school

The optional flag

-d [path_to_dump_location]

will dump a pickled version of the two-dimensional embedding data into the specified location with the name [image_folder_name]_embed.pickle. The pickle file contains a dictionary with keys 'embeddings', 'filenames', and 'labels' (if specified).

Help

List command line options with

python embed_by_style.py --help

Example

A sample image folder containing selected works of Picasso (courtesty of WikiArt) can be found in sample/picasso. The file picasso.csv labels each image by year of creation. Running the command

python embed_by_style.py sample/picasso -d sample -l sample/picasso.csv

uses the labels in the csv file to color images on the scatter plot and dumps a pickle file picasso_embed.pickle in the folder sample. The following scatterplot is generated: