Minimalistic DeepDream with C89

deepdream.c

This is an artistic experiment in implementing Convolutional Neural Network inference and back-propagation using a minimal subset of the C89 language and standard library features. As of now this program can:

  • Classify images using the ImageNet pre-trained Inception-V1 (GoogleNet) architecture

  • Generate adversarial examples (versions of the input image with almost imperceptible modifications that nevertheless cause the neural network to classify them incorrectly)

  • Apply the DeepDream effect to an input image. This effect tries to amplify the magnitude of the activations of a particular layer of the neural network, which fills the image with psychedelic patterns
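To make the last bullet more concrete, here is a toy sketch of amplifying activations by gradient ascent on the input. It is not taken from deepdream.c: the one-layer "network", its weights W, the objective (half the sum of squared ReLU activations), the gradient scaling by mean absolute value and all function names are assumptions chosen to keep the example tiny.

```c
/* Toy activation maximization by gradient ascent.
   Everything here (sizes, weights, names) is hypothetical
   and only illustrates the idea; it is not code from deepdream.c. */

#define N_IN 4
#define N_OUT 3

/* Fixed weights of a made-up one-layer network: y = relu(W x). */
static const float W[N_OUT][N_IN] = {
  { 0.5f, -0.2f,  0.1f,  0.3f},
  {-0.4f,  0.6f,  0.2f, -0.1f},
  { 0.3f,  0.1f, -0.5f,  0.4f}
};

/* Forward pass: fills y and returns the objective 0.5 * sum(y^2). */
float forward(const float *x, float *y) {
  float objective = 0.0f;
  int i, j;
  for (i = 0; i < N_OUT; ++i) {
    float s = 0.0f;
    for (j = 0; j < N_IN; ++j) s += W[i][j] * x[j];
    y[i] = s > 0.0f ? s : 0.0f;
    objective += 0.5f * y[i] * y[i];
  }
  return objective;
}

/* Backward pass: d(objective)/dx = W^T y.
   y is already zero wherever the ReLU is inactive. */
void backward(const float *y, float *grad_x) {
  int i, j;
  for (j = 0; j < N_IN; ++j) {
    grad_x[j] = 0.0f;
    for (i = 0; i < N_OUT; ++i) grad_x[j] += W[i][j] * y[i];
  }
}

/* One ascent step: scale the gradient by its mean absolute value,
   then move the input along it. */
void ascent_step(float *x, float step) {
  float y[N_OUT], g[N_IN], mean_abs = 0.0f;
  int j;
  forward(x, y);
  backward(y, g);
  for (j = 0; j < N_IN; ++j) mean_abs += (g[j] < 0.0f ? -g[j] : g[j]) / N_IN;
  for (j = 0; j < N_IN; ++j) x[j] += step * g[j] / (mean_abs + 1e-8f);
}

/* Usage sketch: run ten steps from a fixed input and return
   objective_after - objective_before. */
float demo_gain(void) {
  float x[N_IN] = {1.0f, 1.0f, 1.0f, 1.0f};
  float y[N_OUT], before, after;
  int step;
  before = forward(x, y);
  for (step = 0; step < 10; ++step) ascent_step(x, 0.1f);
  after = forward(x, y);
  return after - before;
}
```

In the real program the input is an image and the objective is computed from a layer of Inception-V1, but the loop has the same shape: a forward pass, a backward pass, then a small step on the input in the direction of the gradient.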

Main coding principles were:

  • Use as few dependencies or assumptions about hardware as possible
  • Keep code simple and concise

With an optimizing compiler, the observed performance is comparable to what I had with CPU Caffe in 2015. In particular, rendering 7 octaves of DeepDream on the default input image takes around 25 minutes on a 2019 MacBook Pro.

History and motivation

A detailed backstory can be found here. Below is the short summary.

This project was developed by me, Alexander Mordvintsev, to celebrate the 6th anniversary of the creation of DeepDream. It all started with an art commission request from the k21.kanon.art project. Originally we planned to recreate a few of the first DeepDream images by running the original code from 2015. However, it turned out to be surprisingly challenging to bring code that was only a few years old back to life, given all its dependencies. This resonated with my overall frustration with the excessive complexity, bloat and over-reliance on frameworks in modern software. I decided to see how difficult it would be to implement the whole thing from scratch, using technologies that are at least 30 years old but still in active use today.

Usage

This code is supposed to be as easy to build and run as a "Hello, World!" program, which means that just passing deepdream.c to a C compiler does the job. Compilation typically takes no more than a second on most systems I've tried. In practice it's preferable to add compiler-specific optimization flags that make execution a few times faster. Here are the command lines I was using to build the code on various systems:

Linux/Mac (gcc or clang):

gcc -O3 deepdream.c -o deepdream
clang -O3 deepdream.c -o deepdream

Windows (MSVC):

cl /O2 deepdream.c

todo: add pedantic command lines

The deepdream binary uses the following command line format:

deepdream [mode] [-h] [-i input.bmp] [-o output.bmp]

mode is one of:

  • dream (default) -- apply DeepDream effect to the input image and store it as output. Also stores progressive image scales as octave_<i>.bmp.
  • classify -- classify the input image and print the top-5 labels and scores
  • adversarial -- generate an altered version of the input image that fools the neural network into classifying it as "tennis ball".
  • test -- run forward and backward passes through the network and compare the tensor values with reference files in the test/ folder

The -i and -o flags specify the input and output images, respectively. Only 24-bit color BMP files are supported. The default input is cat_dog224.bmp
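For readers unfamiliar with the format, the sketch below shows what checking for a 24-bit uncompressed BMP can look like when only byte-level operations are used. The field offsets follow the standard BMP file header and BITMAPINFOHEADER layout; the function names are hypothetical and the code is not copied from deepdream.c.

```c
/* Hypothetical helpers for inspecting a BMP header; not taken from deepdream.c. */

/* Read little-endian unsigned values from a byte buffer. */
unsigned long bmp_u16(const unsigned char *p) {
  return (unsigned long)p[0] | ((unsigned long)p[1] << 8);
}
unsigned long bmp_u32(const unsigned char *p) {
  return bmp_u16(p) | (bmp_u16(p + 2) << 16);
}

/* Interpret a 32-bit little-endian field as a signed value
   without relying on the width of long. */
long bmp_s32(const unsigned char *p) {
  unsigned long u = bmp_u32(p);
  if (u & 0x80000000UL) return -(long)(0xFFFFFFFFUL - u) - 1;
  return (long)u;
}

/* Bytes per pixel row: 3 bytes per pixel, padded to a multiple of 4. */
long bmp_row_stride(long width) {
  return (width * 3 + 3) / 4 * 4;
}

/* Checks the first 54 bytes of a file. Returns 1 and fills the outputs
   if this is an uncompressed 24-bit BMP, 0 otherwise. */
int bmp_parse_header(const unsigned char *h, long *width, long *height,
                     unsigned long *pixel_offset) {
  if (h[0] != 'B' || h[1] != 'M') return 0;
  if (bmp_u32(h + 14) < 40) return 0;   /* need at least BITMAPINFOHEADER */
  if (bmp_u16(h + 28) != 24) return 0;  /* bits per pixel */
  if (bmp_u32(h + 30) != 0) return 0;   /* compression must be BI_RGB (0) */
  *pixel_offset = bmp_u32(h + 10);
  *width = bmp_s32(h + 18);
  *height = bmp_s32(h + 22);            /* negative means top-down rows */
  return 1;
}

/* Usage sketch: build the header of a 224x224 image in memory and parse it.
   Returns the parsed width, or -1 if the header is rejected. */
long bmp_demo_width(void) {
  unsigned char h[54];
  long w = 0, ht = 0;
  unsigned long off = 0;
  int i;
  for (i = 0; i < 54; ++i) h[i] = 0;
  h[0] = 'B'; h[1] = 'M';
  h[10] = 54;   /* pixel data starts right after the two headers */
  h[14] = 40;   /* BITMAPINFOHEADER size */
  h[18] = 224;  /* width */
  h[22] = 224;  /* height */
  h[26] = 1;    /* color planes */
  h[28] = 24;   /* bits per pixel */
  return bmp_parse_header(h, &w, &ht, &off) ? w : -1;
}
```

Pixels in such a file are stored as blue, green, red bytes, and rows are stored bottom-up when the height field is positive, which any reader has to account for when converting to the network's input layout.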

Setting other options

As of now all other parameters are configured through the global constants at the beginning of deepdream.c. Given the negligible compilation times this seems to be a viable mechanism. Later some of these constants may be promoted to flags.

Architecture

The repository contains the following files:

  • deepdream.c -- main source file, implements DeepDream algorithm, all required neural network layers and auxiliary functions like image manipulation. See comments in that file for the details. The only includes are <stdio.h> and "inception.inc". The network weights are loaded from InceptionV1.pb.

  • inception.inc -- code generated by gen_inception.py that defines the neural network structure, tensors and layers. Importantly, it contains the offsets of each layer's network parameters in InceptionV1.pb. 1000 text labels for the ImageNet classes are also stored here.

  • gen_inception.py -- Python script that uses TensorFlow to parse the neural model graph from InceptionV1.pb and generate inception.inc. It also populates the test/ directory with the values of some tensors obtained by running the cat_dog224.bmp image through the network. This is used to validate the C implementation against TensorFlow.

  • InceptionV1.pb -- TensorFlow protobuf file that contains the network architecture and parameters for the ImageNet-pre-trained Inception-V1 model. The architecture was proposed by Christian Szegedy. This particular version was trained by Sergio Guadarrama using the Caffe framework. It was later converted to the TensorFlow format by the authors of the Lucid project.

  • cat_dog224.bmp -- the 224x224 image that I was using as the starting point for generating the first DeepDream images in May 2015. It contains a cat and a dog, but all top-5 labels assigned to this image by the Inception-V1 network are dog breeds. I was trying to "reverse-engineer" the model to find the point where "dogs" overtake "cats" in the inference process. I downloaded this image from some free desktop wallpapers website and don't have any other information about its origin.

  • ImageNet_standard.txt -- text labels for 1000 ImageNet classes.

  • test/* -- a few reference tensor values to validate the C network implementation against TensorFlow. Test files can be generated by running gen_inception.py test. The test files in the repository were produced on a CPU Colab kernel, and the C operations were implemented to reproduce them. Later I noticed that test data generated on a GPU (cuDNN backend) or on a Mac CPU (oneDNN backend) differ in the way the gradient of the MaxPool operation is computed. I decided to keep the original behaviour, so please be aware that test data produced on your system may differ.
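Since the weights are read from InceptionV1.pb at offsets recorded in inception.inc, each weight has to be turned from raw bytes into a float at some point. The sketch below shows one portable way to do that without relying on how the host stores floats. It assumes the bytes are a little-endian IEEE-754 single-precision value; the function names are hypothetical and this is not necessarily how deepdream.c does it.

```c
/* Decode a 32-bit IEEE-754 single-precision bit pattern using only
   integer operations and float multiplication, so the result does not
   depend on the host's float representation. Hypothetical helper. */
float decode_f32_bits(unsigned long bits) {
  int negative = (int)((bits >> 31) & 1UL);
  int exponent = (int)((bits >> 23) & 0xFFUL);
  unsigned long fraction = bits & 0x7FFFFFUL;
  float value;
  int shift;

  if (exponent == 0) {
    /* zero or subnormal: no implicit leading 1, exponent fixed at -126 */
    shift = -126 - 23;
  } else {
    /* normal number: restore the implicit leading 1.
       Inf/NaN encodings (exponent 255) are not handled specially here. */
    fraction |= 0x800000UL;
    shift = exponent - 127 - 23;
  }

  /* value = fraction * 2^shift, built by repeated doubling or halving */
  value = (float)fraction;
  while (shift > 0) { value *= 2.0f; --shift; }
  while (shift < 0) { value *= 0.5f; ++shift; }
  return negative ? -value : value;
}

/* Assemble the bit pattern from 4 little-endian bytes, then decode it. */
float decode_f32_le(const unsigned char *b) {
  unsigned long bits = (unsigned long)b[0] |
                       ((unsigned long)b[1] << 8) |
                       ((unsigned long)b[2] << 16) |
                       ((unsigned long)b[3] << 24);
  return decode_f32_bits(bits);
}
```

For example, the bytes 00 00 80 3F decode to 1.0 and 00 00 20 C0 decode to -2.5. The loops trade speed for portability, which matters little when the decoding happens once at load time.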

Code features

  • The only library include is <stdio.h> (for file I/O and printing)
  • No use of dynamic memory allocation
  • No assumption that floats use IEEE-754 standard, parsing binary network weights manually
  • No <math.h>, use iterative approximations for exp() and pow(..., 0.75) (for LRN layers of Inception-V1)
  • Forward and backward operations share most of the code, so implementing back-propagation was relatively simple
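As an illustration of what doing without <math.h> can look like, here is a minimal sketch of exp() and pow(x, 0.75) built from basic arithmetic. The list above only says that iterative approximations are used; the specific methods below (scaling and squaring with a short Taylor series, and Newton's iteration for square roots) and the function names are assumptions made for this example, not a description of deepdream.c.

```c
/* Hypothetical <math.h>-free approximations; inputs are assumed finite. */

/* exp(x) by scaling and squaring: exp(x) = exp(x / 2^k)^(2^k). */
float approx_exp(float x) {
  float term = 1.0f, sum = 1.0f;
  int k = 0, i;
  /* halve x until it lies in [-0.5, 0.5], where the series converges fast */
  while (x > 0.5f || x < -0.5f) { x *= 0.5f; ++k; }
  /* Taylor series 1 + x + x^2/2! + ... on the reduced argument */
  for (i = 1; i <= 8; ++i) {
    term *= x / (float)i;
    sum += term;
  }
  /* undo the halving by squaring the result k times */
  while (k-- > 0) sum *= sum;
  return sum;
}

/* Square root by Newton's iteration r <- (r + x / r) / 2. */
float approx_sqrt(float x) {
  float r;
  int i;
  if (x <= 0.0f) return 0.0f;
  r = x > 1.0f ? x : 1.0f;  /* start at or above the root */
  for (i = 0; i < 40; ++i) r = 0.5f * (r + x / r);
  return r;
}

/* x^0.75 = x^(1/2) * x^(1/4) = sqrt(x) * sqrt(sqrt(x)). */
float approx_pow075(float x) {
  float s = approx_sqrt(x);
  return s * approx_sqrt(s);
}
```

The fixed iteration counts keep the code branch-free and predictable; 40 Newton steps are far more than needed for values near 1 and are there to cover inputs that start many orders of magnitude away from their root.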

Acknowledgments

Thanks to:

  • KANON for inspiring and supporting this project

  • Christian Szegedy for Inception architecture and discovery of adversarial examples

  • Caffe authors for creating a framework that I was using in my early experiments

  • Sergio Guadarrama for releasing the Caffe version of the Inception-V1 model

  • Google for supporting my work on reverse-engineering Deep Neural Networks

  • Chris Olah, Ludwig Schubert, Shan Carter, Nick Cammarata and other people who work on Lucid and Distill for pushing DNN visualization research further, to a level I could never have reached alone