netsounds

Sonification of convolutional neural networks

Functionality

Model

Currently, we've got a modified pretrained SqueezeNet v1.0 running some home images, and dumping the activations at certain layers out as numpy pickles.

To do this, run python3 test_squeezenet.py from /src/models/

Sound

utils.py has a function that writes an activation to a sound. It's pretty rad. Right now it wrote out the audio for the first activation from a 1024 px image of mixing_bowl.jpg

TODO

Dump images in test_squeezenet.py
Explore the activations at different layers
- Made sounds for each activation using two generation schemes:
  - Concatenating each filter at a level
  - Summing all filters at a level
Put as much of this on a GPU as possible:
- Use PyTorch as much as possible
- Make sure we can still use CPU (in case of use on Pi, for instance)

Notes

activations in this repo are generated from images with small edge length of 256:
- This means that the activations are very small
- To get higher-fidelity activations, we must use larger images, but we quickly run out of memory (and storage on GitHub)
I've tried putting larger activations in, but my machine (8GB RAM) runs out of memory on anything larger than 1024
I put the activations for image_size=1024 in here

AudreyBeard/netsounds

netsounds

Functionality

Model

Sound

TODO

Notes