UChicago CS122 Project

#IMAGEine: Cooper Nederhood, Liqiang Yu, Laurence Warner

##########################################################################
# Part of project: data gathering ########################################
# Location in folder: /data ##############################################
# Author: Cooper Nederhood* ##############################################
##########################################################################
* Three small utility functions from the web-scraping PA are used; see data_fns.py for details. All other code is original.

# Documentation for data gathering and validation process
# Relevant scripts contained in "IMAGEine/data/"

(1) - External programs to be installed

	geckodriver: Launching a Firefox instance through Selenium for the automated
	picture scraping requires the user to install 'geckodriver' from Mozilla.
	Earlier versions of Selenium did not need an additional driver, but current versions do.
	Note that every browser has a different driver, so 'geckodriver' works
	for Firefox only. The executable needs to be on the "PATH". When I installed it from
	the link immediately below, it was placed in the correct location automatically
	and no further changes were needed.

	Available for download at: https://github.com/mozilla/geckodriver/releases

	For more information see: http://selenium-python.readthedocs.io/installation.html
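	A quick way to confirm that the driver is visible before starting a long scrape is to look it up on the PATH. This is a minimal sketch using only the standard library; the helper name `find_driver` is hypothetical and is not part of the project scripts.

```python
import shutil


def find_driver(name="geckodriver"):
    """Return the full path to the driver executable, or None if it is not on PATH."""
    return shutil.which(name)


path = find_driver()
if path is None:
    print("geckodriver not found on PATH; install it from the releases page")
else:
    print("geckodriver found at", path)
```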


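(2) - Python packages to be installed if not installed already. The third-party
		packages (wikipedia, selenium, requests, bs4, pillow) can be installed via pip3
		according to the permissions of the current user; the remaining modules ship
		with Python's standard library and need no installation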

	wikipedia: 		wikipedia API to gather landmark information

	selenium: 		Selenium web automation

	urllib.request: for html retrieval and parsing
	urllib.parse: 	for html retrieval and parsing
	requests:		for html retrieval and parsing
	bs4:			for html retrieval and parsing

	time: 			to slow the scraping script and avoid being shut down

	json, io, os:	general libraries for saving data

	sys:			used to accept command line arguments

	pillow:			PIL (Python Imaging Library), used to check whether a file is an image
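
	To see which of the third-party packages still need installing, something like the following can be run first. It is only a sketch: the helper `missing_packages` is hypothetical, and the list uses import names (pillow is imported as PIL).

```python
import importlib.util

# Import names of the third-party packages listed above (pillow is imported as PIL).
THIRD_PARTY = ["wikipedia", "selenium", "requests", "bs4", "PIL"]


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing packages:", missing_packages(THIRD_PARTY))
```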


(3) - How to get data for a given state?
	From the command line execute the single following command:
		$ python3 get_data.py "{State}" {photo_count} {test_count} {landmark_start} {get_wiki_boolean}
			note: {get_wiki_boolean} is optional

		Examples:
			python3 get_data.py "Illinois" 30 1 0
				- gets 30 pictures for each Illinois landmark, 1 test photo. Starts at landmark #0

			nohup python3 get_data.py "North Dakota" 10 5 10 True
				- gets 10 pictures for each North Dakota landmark, 5 test photos. 
					Starts at landmark #10 and also gets the wikipedia information.
					Suppresses output and appends to .txt file

		See get_data.py for more information on command arguments
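
		For orientation, the argument order above could be parsed roughly as follows. This is a sketch under assumptions, not the actual code in get_data.py: the function name `parse_args` is hypothetical, and it assumes the optional flag is passed as the literal string "True".

```python
def parse_args(argv):
    """Parse: state photo_count test_count landmark_start [get_wiki_boolean]."""
    if len(argv) not in (4, 5):
        raise ValueError(
            "usage: get_data.py STATE PHOTO_COUNT TEST_COUNT LANDMARK_START [GET_WIKI]"
        )
    state = argv[0]
    photo_count, test_count, landmark_start = (int(x) for x in argv[1:4])
    # Assumption: the optional flag is the literal string "True"; anything else is False.
    get_wiki = len(argv) == 5 and argv[4] == "True"
    return state, photo_count, test_count, landmark_start, get_wiki


# The list below mirrors what sys.argv[1:] would hold for the Illinois example.
print(parse_args(["Illinois", "30", "1", "0"]))
```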


(4) - High-level summary of code structure, class design, and features beyond the basic
	docstrings in the actual '.py' files

	Running "get_data.py" as shown above first calls the "gen_state_location_list" function
	from 'data_fns.py', which creates a list of Landmark class instances. 

	The Landmark class is defined in "Landmark.py". Essentially, this is a self-contained class, with
	each instance representing a landmark in the given state. Each instance is initialized with just the
	name and the state. Landmark methods then perform the other relevant operations on the Landmark.
	These operations include getting the wiki information, getting the photo urls, and saving the photo 
	urls to disk in a structure most readily incorporated into later analyses. Querying
	wikipedia for information and getting the photo urls are both very time consuming. To increase efficiency
	and avoid unnecessary re-running of code, the "save_to_file" method in the Landmark class saves out as
	much relevant information as possible. Each landmark has an associated 'image_info.json' file with details 
	on the data scraping process, as well as a 'landmark_info.json' file with info about the landmark. 
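
	The shape of this design can be sketched as below. It is a hypothetical stand-in rather than the code in Landmark.py: the attribute names, the JSON keys, and the folder layout are all assumptions made for illustration. Only the class name, the "save_to_file" method name, and the two JSON file names come from the description above.

```python
import json
import os
import tempfile


class Landmark:
    """Hypothetical stand-in for the class in Landmark.py; field names are assumptions."""

    def __init__(self, name, state):
        self.name = name
        self.state = state
        self.wiki_summary = None  # would be filled in by a wiki lookup (not shown)
        self.photo_urls = []      # would be filled in by the photo scrape (not shown)

    def save_to_file(self, root):
        """Write landmark_info.json and image_info.json under root/<state>/<name>/."""
        folder = os.path.join(root, self.state, self.name)
        os.makedirs(folder, exist_ok=True)
        landmark_info = {"name": self.name, "state": self.state,
                         "wiki_summary": self.wiki_summary}
        image_info = {"url_count": len(self.photo_urls), "urls": self.photo_urls}
        with open(os.path.join(folder, "landmark_info.json"), "w") as f:
            json.dump(landmark_info, f, indent=2)
        with open(os.path.join(folder, "image_info.json"), "w") as f:
            json.dump(image_info, f, indent=2)
        return folder


# Usage: save one made-up landmark into a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    lm = Landmark("Example Landmark", "Illinois")
    lm.photo_urls = ["http://example.com/a.jpg"]
    print(sorted(os.listdir(lm.save_to_file(root))))
```

	Saving after each landmark means an interrupted run can resume from a later landmark index without repeating the slow queries.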

	As discussed, the user may want to separate the tasks of gathering the photos and gathering the wiki
	data. If the wiki data is bypassed in the run of "get_data.py" as shown above, the user can run
	the script "get_wiki_data.py" to create a csv file of the relevant wiki information. The implementation
	of the Illinois quiz uses this approach. This allowed other streams of work, like the machine learning
	algorithm, to proceed.

	Finally, as discussed in our presentation, the data retrieval process took days of running, and validating
	that it produced a successful representation of the available images required generating summary statistics
	during scraping. The "gen_image_summary" function in 'data_fns.py' creates a simple, easy-to-read summary
	for the given state, listing the successes and failures for each image. The current incarnation of the quiz 
	uses only Illinois data, but the structure and flexible functionality can easily be scaled to other states. In
	fact, to test that the code was still working, I ran preliminary scrapes on North Dakota and Connecticut and the 
	corresponding output file structure populated as planned. 
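
	The kind of tally such a summary needs can be sketched in a few lines. This is not the "gen_image_summary" implementation; the function name `summarize_downloads` and the (landmark, succeeded) record format are assumptions chosen for illustration.

```python
from collections import Counter


def summarize_downloads(records):
    """Count successes and failures per landmark.

    records: iterable of (landmark_name, succeeded) pairs, one per attempted image.
    Returns {landmark_name: {"success": n, "fail": m}}.
    """
    counts = {}
    for name, succeeded in records:
        per_landmark = counts.setdefault(name, Counter())
        per_landmark["success" if succeeded else "fail"] += 1
    return {name: {"success": c["success"], "fail": c["fail"]}
            for name, c in counts.items()}


# Made-up records, for illustration only.
demo = [("lm1", True), ("lm1", False), ("lm2", True)]
for landmark, tally in summarize_downloads(demo).items():
    print(landmark, tally)
```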

##############################################################################
# Part of project: image classification ######################################
# Location in folder: /transfer_learning #####################################
# Author: Liqiang Yu* ########################################################
##############################################################################
#  Liqiang Yu is responsible for /transfer_learning and part of ./django_shell.
#
#                     Documentation of Code Ownership  
#
#  "Direct copy" ~ Provided by TensorFlow tutorial and few edits made
#
#  ./transfer_learning/label_image.py
#
#  "Modified" ~ Heavily utilized templates provided by TensorFlow tutorial
#               and meaningful edits made
#
#  ./transfer_learning/retrain.py
#
#  "Original"     ~ Original code
#
#  ./django_shell/create_quiz.py


# Documentation for transfer learning process
# Relevant scripts contained in "IMAGEine/transfer_learning/"

###Transfer Learning

(1) - Python packages to be installed if not installed already. All packages
      can be installed via pip3 according to permissions of current user

	TensorFlow:  an open-source software library for dataflow programming 
	across a range of tasks. It is a symbolic math library and is also used 
	for machine learning applications such as neural networks.

	Before starting, install TensorFlow by typing 

	$sudo pip3 install tensorflow 


(2) - Training a new model and using the retrained model

	To train the model:

	$cd IMAGEine/transfer_learning
	$python retrain.py --image_dir ./data/landmark

	The model evaluation is at IMAGEine/transfer_learning/model_training.png
	and IMAGEine/transfer_learning/model_evaluation.png

	To use the retrained model:

	After training, you can apply the model to a test image with

	$python label_image.py \
	--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
	--input_layer=Mul \
	--output_layer=final_result \
	--input_mean=128 --input_std=128 \
	--image=$HOME/IMAGEine/data/Illinois/Test/lm1/FILE_0.jpg

	Replace the image path with your own. An example
	output is:

	lm1 0.89462405
	lm81 0.058848068
	lm70 0.006154099
	lm52 0.0041986946
	lm67 0.0025173887
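
	If the predictions need to be consumed by another script, the text output can be parsed back into label/score pairs. The sketch below assumes the output has exactly the "label score" line format shown above; the helper name `parse_predictions` is hypothetical.

```python
def parse_predictions(text):
    """Turn 'label score' lines into a list of (label, float_score) pairs."""
    predictions = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        label, score = line.rsplit(None, 1)
        predictions.append((label.strip(), float(score)))
    return predictions


sample = """
lm1 0.89462405
lm81 0.058848068
lm70 0.006154099
"""
best_label, best_score = max(parse_predictions(sample), key=lambda p: p[1])
print(best_label, best_score)
```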

	To view TensorBoard summaries:

	$tensorboard --logdir /tmp/retrain_logs

	Then navigate your web browser to localhost:6006 to view TensorBoard.

(3) - High level summaries about model selection and code structure
      beyond the basic doc string in the actual '.py' files

	(1) Algorithm selection: Initially we proposed to use Histogram of Oriented 
	Gradients (HOG)[1] to extract landmark features and to apply an SVM[2] to the 
	feature matrix for landmark recognition. HOG requires all objects to have 
	the same feature dimension, which means images need to have a similar aspect 
	ratio (such as 2:5 for skyscrapers). Another study[7] on this problem adds 
	SURF to extract the features, but it also requires a fixed aspect ratio. After 
	examining the landmark image data we obtained, we found that the images do not 
	usually share a similar aspect ratio, so we would have had to crop them manually. We
	therefore gave up on this method and moved forward with other possible approaches.

	Google[5] builds its model using additional GPS information, but most online 
	images do not have GPS metadata associated with them. Yunpeng Li et al.[6] use
	a 3D point cloud approach on over 2 million images. Tobias Weyand et al.[8] and
	David J. Crandall et al.[10] approach the problem using large-scale image 
	collections. These approaches are not feasible for our classification problem, 
	since we have only around 2,500 images and most of them do not contain GPS 
	information. 

	Google also provides a landmark detection API[9], which works well for some 
	famous landmarks around the world. It would not work well for our problem because
	approximately half of the 87 (not famous) landmarks in Illinois have fewer than 30 
	valid results in a Google Image search.

	Usually, training an image classification model from scratch takes several
	days on multiple GPUs. We found that transfer learning[12][13][14][19][21] is a
	good solution for our problem. Transfer learning is a technique that shortcuts a
	lot of training work by taking a model pre-trained on a set of categories, such as 
	ImageNet, and retraining from the existing weights for new classes. In our approach,
	we modify only the last layer of the Inception V3 model, since we have a 
	relatively small dataset. This lets us retrain the model on a single CPU
	with a much smaller image set.
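
	To make the idea of retraining only the last layer concrete, here is a toy sketch in pure Python. It is not the code in retrain.py and uses no TensorFlow: the small 2-D vectors stand in for bottleneck features produced by the frozen network, and only a softmax layer on top of them is trained by gradient descent. All names and numbers are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, b, x):
    """Linear scores W.x + b, one per class."""
    return [sum(w_j * x_j for w_j, x_j in zip(W[k], x)) + b[k] for k in range(len(W))]


def train_final_layer(features, labels, n_classes, steps=200, lr=0.5):
    """Fit the final layer (W, b) by gradient descent on mean cross-entropy.

    The features are treated as fixed; nothing upstream of this layer is updated.
    """
    dim, n = len(features[0]), len(features)
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(steps):
        grad_W = [[0.0] * dim for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, y in zip(features, labels):
            probs = softmax(class_scores(W, b, x))
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == y else 0.0)  # d(loss)/d(score_k)
                grad_b[k] += err / n
                for j in range(dim):
                    grad_W[k][j] += err * x[j] / n
        for k in range(n_classes):
            b[k] -= lr * grad_b[k]
            for j in range(dim):
                W[k][j] -= lr * grad_W[k][j]
    return W, b


def predict(W, b, x):
    """Index of the highest-scoring class."""
    scores = class_scores(W, b, x)
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up "bottleneck" vectors for two landmark classes.
feats = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
labels = [0, 0, 1, 1]
W, b = train_final_layer(feats, labels, n_classes=2)
print([predict(W, b, x) for x in feats])
```

	Because the expensive feature extractor is never updated, each training step touches only a small weight matrix, which is why a single CPU is enough.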

	(2) retrain.py: The script first downloads the Inception V3 model and gets the 
	TensorFlow Graph object, the penultimate layer, and the last layer (tensor). 
	Then it splits the image data into a training set, a validation set, and a testing
	set. 'add_jpeg_decoding' uses the JPEG decoding function provided by TensorFlow
	and creates a tensor for feeding in the image and an output tensor after resizing 
	and decoding. After preprocessing, 'create_bottleneck_file' and 
	'run_bottleneck_on_image' calculate the bottleneck values for the 
	penultimate layer and store them locally. We then specify the train step, loss function,
	bottleneck tensor, and a new layer for training, as well as the evaluation step.
	Next, we train the final layer using backpropagation. 
	Finally, we evaluate the model performance on the test data and save
	the retrained graph to a local folder.
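
	The description above does not say how images are assigned to the three sets. One common approach, shown here purely as an assumption, is to hash each file name so that a given image always lands in the same set across runs. The function name `assign_split` and the 10%/10% defaults are illustrative.

```python
import hashlib


def assign_split(filename, validation_pct=10, testing_pct=10):
    """Deterministically map a file name to 'training', 'validation', or 'testing'."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable value in 0..99 for this file name
    if bucket < validation_pct:
        return "validation"
    if bucket < validation_pct + testing_pct:
        return "testing"
    return "training"


for name in ["FILE_0.jpg", "FILE_1.jpg", "FILE_2.jpg"]:
    print(name, assign_split(name))
```

	A stable assignment matters because cached bottleneck values are reused between runs; if images moved between sets, test images could leak into training.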


(4) - Papers and documentation read for landmark recognition

	Histogram of Oriented Gradients:
	[1] Histogram of Oriented Gradients: 
	https://www.learnopencv.com/histogram-of-oriented-gradients/

	SVM:
	[2] Handwritten Digits Classification : An OpenCV ( C++ / Python ) Tutorial:
	https://www.learnopencv.com/handwritten-digits-classification-an-opencv-c-python-tutorial/

	[3] Introduction to SURF (Speeded-Up Robust Features):
	http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_feature2d/py_surf_intro/py_surf_intro.html

	[4] Object Detection using Single Shot Multibox Detector:
	http://cv-tricks.com/object-detection/single-shot-multibox-detector-ssd/

	Landmark recognition:
	[5] Tour the World: building a web-scale landmark recognition engine
	./transfer_learning/google_landmark_recognition.pdf

	[6] Worldwide Pose Estimation using 3D Point Clouds:
	./transfer_learning/global_pose.pdf

	[7] Study Impact of Architectural Style and Partial View on Landmark Recognition:
	./transfer_learning/Chen-StudyImpactofArchitectural StyleandPartialViewonLandmarkRecognition-report.pdf

	[8] Visual Landmark Recognition from Internet Photo Collections: A Large-Scale Evaluation:
	./transfer_learning/1409.5400.pdf

	[9] Detecting Landmarks:
	https://cloud.google.com/vision/docs/detecting-landmarks

	[10] Recognizing Landmarks in Large-Scale Social Image Collections:
	./transfer_learning/landmarks2015book.pdf

	Deep Neural Network:
	[11] Going Deeper with Convolutions:
	./transfer_learning/GoogLeNet.pdf

	Transfer learning and TensorFlow: 
	[12] DeCAF: A Deep Convolutional Activation Feature for Generic Visual 
	Recognition
	./transfer_learning/1310.1531v1.pdf

	[13] How transferable are features in deep neural networks?
	./transfer_learning/1411.1792.pdf

	[14] TRANSFER LEARNING IN TENSORFLOW USING A PRE-TRAINED INCEPTION-RESNET-V2 MODEL:
	https://kwotsin.github.io/tech/2017/02/11/transfer-learning.html

	[15] Inception:
	https://github.com/tensorflow/models/tree/master/research/inception

	[16] Object Detection:
	https://github.com/tensorflow/models/tree/master/research/object_detection

	[17] MobileNet:
	https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html
	
	[18] Tensorflow Tutorial 2: image classifier using convolutional neural network:
	http://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/

	[19] Using Transfer Learning to Classify Images with TensorFlow:
	https://medium.com/@st553/using-transfer-learning-to-classify-images-with-tensorflow-b0f3142b9366

	[20] Bringing in your own dataset:
	https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md

	[21] Deep learning on Coursera by Andrew Ng:
	https://www.coursera.org/learn/machine-learning-projects

##########################################################################
# Part of project: Prototype Django web application: "Know Your State" ###
# Location in folder: /prototype #########################################
# Author: Laurence Warner ################################################
##########################################################################

#                     Documentation of Code Ownership  
#
#  All file paths are within prototype/
#
#  "Direct copy" ~ 
#
#  All files except those noted below
#
#  "Modified" ~ 
#
#  prototype/
#      settings.py
#      urls.py
#  quiz/
#      urls.py
#      views.py
#
#  "Original"     ~ Original code
#
#  quiz/
#      models.py
#      static/
#      templates/
#
#  Also, within django_shell/:
#
#  "Original" ~ create_quiz.py 
#  Collaborative effort with Cooper & Rex.


(1) External packages to be installed

Django 2.0
	pip3 install --user django

(2) Python packages used (os and sys ship with Python; install pandas if not installed already)

os
sys
pandas

(3) How to run app

From the top prototype/ folder, run the following commands:

python3 manage.py runserver

In your web browser, go to the url printed in the terminal.

To create the database tables for the quiz app:

python3 manage.py makemigrations quiz
python3 manage.py migrate

To launch the ipython3 shell:

python3 manage.py shell

From within the shell:

run ../django_shell/create_quiz.py

This will populate the website with questions.
Click on a question number to play that question. Enjoy trying to outguess the machine!

(4) Notes on code.

prototype/quiz/
	models.py 
		creates the two classes for the quiz: Landmark & Choice. Note: photo_url is a string.
	views.py
		how pages are created. Note how user choice is collected: each time a user votes, all choice objects have the guess attribute reset to False. Then only the one chosen changes to True. 
	templates/quiz/
		All three files use bootstrap styling. 

		detail.html
			A question page for given landmark passed to template. Ingenuity: passing image url as attribute of landmark into img tag.
			A form to collect user vote. For the given landmark passed to the template, loop through choices and display. 

		results.html
			Results page. Uses if statements to show different outcomes depending on the user's choice and whether the machine is correct. E.g. for each choice, check whether the machine guessed it. If it is not the correct landmark, display info on this choice too.
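
The vote handling described for views.py above can be illustrated without Django. This is a sketch only: the dataclass below is a plain-Python stand-in for the Choice model, and the field name `text` and the function name `record_vote` are hypothetical. Only the `guess` attribute comes from the notes above.

```python
from dataclasses import dataclass


@dataclass
class Choice:
    """Plain-Python stand-in for the Django Choice model; only the fields used here."""
    text: str
    guess: bool = False


def record_vote(choices, chosen_index):
    """Reset every choice's guess flag, then mark only the chosen one."""
    for choice in choices:
        choice.guess = False
    choices[chosen_index].guess = True
    return choices


# A stale vote on "A" is cleared when the user now picks "C".
options = [Choice("A", guess=True), Choice("B"), Choice("C")]
record_vote(options, 2)
print([(c.text, c.guess) for c in options])
```

Resetting first guarantees that at most one choice is flagged, even if the same question is played more than once.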

django_shell/
	create_quiz.py
		1). Read csv as pandas dataframe
		2). Loop over rows. For each row:
		a) collect info on that Landmark object's attributes
		b) if machine guessed a different landmark, collect attributes from that row for Choice
		c) randomly generate unused rows to ensure four choices in total
		d) Create the Landmark and Choice objects and save them into the database.
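
Steps (a) to (c) can be sketched as follows, leaving out pandas and the database. The dictionary keys 'name' and 'machine_guess', the function name `build_choices`, and the final shuffle are assumptions made for illustration; the actual column names in the csv may differ.

```python
import random


def build_choices(rows, index, n_choices=4, rng=random):
    """Return n_choices distinct landmark names for the question at rows[index].

    rows: list of dicts with hypothetical keys 'name' (the true landmark) and
    'machine_guess' (the classifier's prediction for that landmark's test photo).
    """
    row = rows[index]
    choices = [row["name"]]
    if row["machine_guess"] != row["name"]:
        choices.append(row["machine_guess"])
    # Fill the remaining slots with randomly chosen landmarks not already used.
    unused = [r["name"] for r in rows if r["name"] not in choices]
    choices.extend(rng.sample(unused, n_choices - len(choices)))
    rng.shuffle(choices)  # assumption: avoid always listing the answer first
    return choices


# Made-up rows: the machine is wrong on lm0 and right on the rest.
demo_rows = [{"name": f"lm{i}", "machine_guess": f"lm{i}"} for i in range(6)]
demo_rows[0]["machine_guess"] = "lm3"
print(build_choices(demo_rows, 0))
```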