/mayfest2018

"Enchanted with Machine Learning," the Exhibition of Engineering, 91st May Festival, the University of Tokyo


Exhibition of Algorithm Section, the Exhibition of Engineering, 91st May Festival, the University of Tokyo

This repository contains the materials exhibited at "the Exhibition of Engineering" of the 91st May Festival of the University of Tokyo, held on May 19 (Sat) and May 20 (Sun), 2018. The exhibition was hosted by the applied physics departments of the School of Engineering (the Department of Applied Physics and the Department of Mathematical Engineering and Information Physics). The handouts and posters shown and distributed at the exhibition are uploaded below (in Japanese; an English version is in preparation):

Overview: Enchanted with Machine Learning

Artificial intelligence is in the news every day. It is hailed as magic and celebrated as if it were alchemy, but in the end it is just computation. Our exhibit at the May Festival focused on convolutional neural networks (CNNs), taking visitors from their historical background and theoretical basis to one of their applications, neural style transfer. This repository mainly covers neural style transfer, implemented in Python 3 and PyTorch. Trained models are also available.

Neural Style Transfer

Neural style transfer is an algorithm that, given a content image and a style image, outputs an image that preserves the content of the content image as much as possible while rendering it in the style (texture) of the style image. For example, a photo of the Akamon Gate can be rendered in the style of Van Gogh or Munch. The algorithm exploits the fact that, as an image passes through the layers of VGG, information about style is gradually discarded and mainly information about content is extracted.
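The split between content and style can be made concrete with a small sketch. In Gatys et al. (2015), content is compared directly on feature maps, while style is compared through Gram matrices (channel-to-channel correlations) of feature maps. The example below is a minimal illustration using plain Python lists instead of PyTorch tensors so that it runs without dependencies. The function names, the toy feature maps, and the normalization by the number of entries are assumptions for illustration and are not taken from this repository's code.

```python
def gram_matrix(features):
    """Channel-by-channel inner products of a feature map.

    features: list of C channels, each a flat list of N activations.
    The division by C * N is an assumed normalization for this sketch.
    """
    c = len(features)
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / (c * n) for fj in features]
        for fi in features
    ]


def mse(x, y):
    """Mean squared error between two equally shaped 2-D lists."""
    flat_x = [v for row in x for v in row]
    flat_y = [v for row in y for v in row]
    return sum((a - b) ** 2 for a, b in zip(flat_x, flat_y)) / len(flat_x)


def content_loss(generated, content):
    # Content: compare activations position by position.
    return mse(generated, content)


def style_loss(generated, style):
    # Style: compare Gram matrices, which ignore where activations occur.
    return mse(gram_matrix(generated), gram_matrix(style))


# Toy feature map with 2 channels and 4 spatial positions,
# and the same activations with the spatial positions reversed.
features = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
shuffled = [[4.0, 3.0, 2.0, 1.0], [1.0, 0.0, 1.0, 0.0]]

print(content_loss(features, shuffled))  # 3.0: the layout changed
print(style_loss(features, shuffled))    # 0.0: the channel statistics did not
```

Rearranging spatial positions changes the content loss but leaves the Gram matrix untouched, which is why the Gram matrix serves as a description of texture rather than of layout.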

Original

The original algorithm was proposed by Gatys et al. (2015). This implementation closely follows the official PyTorch tutorial. You can run it with:
$ python neural_style_transfer.py style_img content_img
It takes from tens of minutes to several hours on a CPU.

Real-Time Style Transfer

While the original algorithm can render any content image in any style, it is slow because the transformation is learned anew for every conversion. Johnson et al. (2016) solved this problem by training an "Image Transform Net" beforehand. Although a separate Image Transform Net must be trained for each style image, the style transfer itself becomes roughly 1000 times faster.

To try the pretrained models, download them and run:
$ python fast_style_change.py content_img model
'content_img' is the path to the content image (.jpg) you want to convert, and 'model' is the index of a style: {0:Balla, 1:Dubuffet, 2:Gogh, 3:Munch}.
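The two arguments can be read as a file path plus an index into the four trained styles. The sketch below shows one way such a command line could be parsed with `argparse`. It is an assumption for illustration only, and the actual `fast_style_change.py` may handle its arguments differently. The names `STYLE_MODELS` and `parse_args` and the file name `akamon.jpg` are hypothetical.

```python
import argparse

# Index-to-style mapping as listed in this README.
STYLE_MODELS = {0: "Balla", 1: "Dubuffet", 2: "Gogh", 3: "Munch"}


def parse_args(argv=None):
    """Parse 'content_img model' (hypothetical helper, not the repo's code)."""
    parser = argparse.ArgumentParser(
        description="Apply a pretrained style model to a content image."
    )
    parser.add_argument("content_img", help="path to the content image (.jpg)")
    parser.add_argument(
        "model",
        type=int,
        choices=sorted(STYLE_MODELS),  # rejects indices other than 0-3
        help="style index: 0=Balla, 1=Dubuffet, 2=Gogh, 3=Munch",
    )
    return parser.parse_args(argv)


# Equivalent to: python fast_style_change.py akamon.jpg 2
args = parse_args(["akamon.jpg", "2"])
print(args.content_img, STYLE_MODELS[args.model])
```

Restricting `model` to the known indices makes an out-of-range value fail immediately with a usage message instead of failing later when a model file is loaded.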

To train a new model, run:
$ python fast_style_train.py train_data_dir style_img
We used train2014 of the Microsoft COCO Dataset (~80k images, 13 GB) for training. Training for 2 epochs takes roughly 9 to 10 hours on a GPU.
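As a rough sanity check on these numbers, the implied training throughput can be worked out directly. The calculation below uses only the approximate figures quoted above (about 80k images, 2 epochs, about 10 hours), so the result is an order-of-magnitude estimate rather than a measurement. The variable names are illustrative.

```python
num_images = 80_000   # approximate size of COCO train2014, as quoted above
epochs = 2
hours = 10            # approximate wall-clock training time on a GPU

total_images = num_images * epochs           # images processed in total
seconds = hours * 3600                       # training time in seconds
images_per_second = total_images / seconds   # about 4.4
seconds_per_image = seconds / total_images   # about 0.225

print(f"{images_per_second:.1f} images/s, {seconds_per_image * 1000:.0f} ms per image")
```

At roughly 4 to 5 images per second, training on a CPU would be impractical, which is consistent with the GPU requirement stated above.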

You can also refer to:

Bonus: Dog or Cat? Another Application of VGG16: Transfer Learning

Until the early 2010s, classifying dogs and cats was a standard example of something humans can do but AI cannot. Today, the task can be implemented easily with a pretrained VGG16, even on a laptop.

References

[1] L. Gatys, A. Ecker and M. Bethge. A Neural Algorithm of Artistic Style. 2015. http://arxiv.org/abs/1508.06576
[2] J. Johnson, A. Alahi and L. Fei-Fei. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. 2016. http://arxiv.org/abs/1603.08155
[3] D. Ulyanov, A. Vedaldi and V. Lempitsky. Instance Normalization: The Missing Ingredient for Fast Stylization. 2016. https://arxiv.org/abs/1607.08022