This project was written for the Udacity Machine Learning Nanodegree capstone and aims to solve the problem of detecting digits in real-world images. Given a real-world image containing any number of digits, in any font, color, or size, anywhere in the image, the project tries to identify the location and the class of each digit. It breaks the problem down into two sub-problems. First, it uses the state-of-the-art OverFeat model to propose digit regions. Second, it uses a neural network to classify the digit in each proposed region. Special thanks to Russell Stewart for open-sourcing TensorBox, an implementation of OverFeat in TensorFlow.
Make sure you have TensorFlow installed on your computer before you start. You may also need to install `numpy`, `scipy`, `PIL`, and various other Python libraries if you don't already have them.
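One way to see which of these libraries are still missing is a quick import check. The following is a minimal sketch and not part of this repository: the helper name `missing_modules` is hypothetical, and the list assumes the usual import names (`tensorflow`, `numpy`, `scipy`, `PIL`).

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names are assumptions; PIL is usually provided by the Pillow package.
    required = ["tensorflow", "numpy", "scipy", "PIL"]
    missing = missing_modules(required)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries were found.")
```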
- Clone this repository.
- Download `googlenet.pb`, `classification_model.ckpt`, and `overfeat_checkpint.ckpt` into the `data` directory by running `download_models.sh`.
- Put your image(s) in the `images_input` folder.
- If you only have one image to evaluate, enter `python evaluation.py <image.jpg>` in Terminal. Make sure to replace `<image.jpg>` with the file name of your image, including the filename extension.
- If you want to evaluate all images in the folder, simply enter `python evaluation.py` in Terminal.
- There are already 21 images in the `images_input` folder that you can try out.
- Once all the images are evaluated, the Terminal will show the average evaluation time per image. The results will be in the `images_output` folder. A .json file with all detected digits will also be generated in the `data` folder.
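The two ways of invoking the script can be wrapped in a small helper if you want to call it from other Python code. This is an illustrative sketch only: the function `evaluation_command` is hypothetical, and the only details taken from the steps above are the script name `evaluation.py` and its optional image argument.

```python
import sys


def evaluation_command(image_name=None):
    """Build the command for evaluating one image, or all images if none is given."""
    # Use the interpreter that is running this code instead of a bare "python".
    command = [sys.executable, "evaluation.py"]
    if image_name is not None:
        # The file name must include its extension.
        command.append(image_name)
    return command
```

From the root of the repository, the result can be passed to `subprocess.run(evaluation_command("<image.jpg>"), check=True)`, with `<image.jpg>` replaced by a real file name.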
Here I will go through the steps to train on the SVHN dataset. You can replace the data with your own dataset.
- Download and extract train.tar.gz and test.tar.gz into the root of this repository.
- If you have Matlab installed on your computer, you can copy `mat_to_txt.m` into the `train` and `test` directories to convert `digitStruct.mat` to a txt file.
- You can also get my generated version from train and test. Move each txt into its corresponding folder. If you are interested in training on a bigger set, I also have extra available.
- Run `overfeat_data_processing.py` in the `SVHN` folder of this repository. It will automatically generate resized images, .idl, and .json for TensorBox to train on.
- Copy `overfeat_rezoom.json` from the root of this repository into TensorBox's `hypes` folder and copy the `SVHN` folder from this repository into TensorBox's `data` folder. You should be ready to train with TensorBox.
- Run `character_classification_data_processing.py` in this repository. It should automatically generate the pickle files of image data and labels for our classification networks.
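When images are resized for training, their bounding-box annotations have to be scaled by the same factors. The sketch below illustrates that arithmetic only. It is not taken from `overfeat_data_processing.py`; the function name `scale_box` and the `(x1, y1, x2, y2)` box layout are assumptions made for the example.

```python
def scale_box(box, original_size, resized_size):
    """Scale an (x1, y1, x2, y2) box from the original image size to the resized one.

    Both sizes are (width, height) tuples in pixels.
    """
    x1, y1, x2, y2 = box
    original_width, original_height = original_size
    resized_width, resized_height = resized_size
    # Horizontal and vertical factors differ when the aspect ratio changes.
    scale_x = resized_width / original_width
    scale_y = resized_height / original_height
    return (x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y)
```

For example, a box `(10, 20, 30, 40)` in a 100x50 image becomes `(20.0, 40.0, 60.0, 80.0)` after the image is resized to 200x100.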
- Region proposal will be trained using TensorBox. Please refer to TensorBox for more detail.
- After TensorBox is trained, rename the .ckpt file to `overfeat_checkpint.ckpt` and copy it into the `data` folder of this repository.
- Run `classification_networks.py` in this repository to train the classification networks.
- The checkpoint file will be stored in the `/tmp` folder. Rename the .ckpt file to `classification_model.ckpt` and copy it into the `data` folder of this repository.
- If you don't already have `googlenet.pb` in the `data` folder of this repository, download it from here.
- After all files are trained and in place, run `python evaluation.py` to evaluate all images in the `images_input` folder or `python evaluation.py <image.jpg>` for a specific image. Results will be shown in the `images_output` folder.
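The rename-and-copy steps for the trained checkpoints can also be scripted. The following is a minimal sketch, assuming each checkpoint is a single .ckpt file; the helper `install_checkpoint` is hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path


def install_checkpoint(source, data_dir, new_name):
    """Copy a trained checkpoint file into `data_dir` under `new_name`."""
    source = Path(source)
    if not source.is_file():
        raise FileNotFoundError(f"Checkpoint not found: {source}")
    data_dir = Path(data_dir)
    # Create the target folder if it does not exist yet.
    data_dir.mkdir(parents=True, exist_ok=True)
    destination = data_dir / new_name
    # Copy rather than move, so the original file is left in place.
    shutil.copyfile(source, destination)
    return destination
```

Called from the root of this repository, `install_checkpoint(path_to_ckpt, "data", "classification_model.ckpt")` would place the classification checkpoint where `evaluation.py` expects it, where `path_to_ckpt` stands for wherever your training run saved its file.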