HCI_lite

Description: A lightweight human-computer interaction system, driven mainly by the 'handler' you hold in front of your webcam.

Author: Peng Zheng.

Duration:

| Progress | Start Date | Deadline |
| --- | --- | --- |
| Building basic video flow frame skeleton, including controller tracking, controller tailing... | 5/9/2018 | 5/9/2018 |
| Appending top menu. | 5/10/2018 | 5/10/2018 |
| Inserting real time image_style_transfer mode. | 5/10/2018 | 5/11/2018 |
| Implementing clothes region extraction for styleTransfer. | 5/11/2018 | 5/12/2018 |
| Implementing formula evaluation: integrate my own dataset for training LeNet to do single character recognition, split the handwritten expression, then evaluate it. | 5/13/2018 | 5/14/2018 |
| Finishing sunglasses wearing mode. | 5/14/2018 | 5/15/2018 |
| Refining my documents and improving the stability. | 5/13/2018 | ... |
| Development SumUp | 5/10/2018 | 5/15/2018 |
| Substitute 'grabCut' for 'formula_evaluation' | 6/6/2018 | 6/6/2018 |
| AR mode | 6/7/2018 | 6/7/2018 |


Dependencies:

OpenCV==4.0.0-pre	# With opencv-contrib
numpy==1.14.3
matplotlib==2.2.2
tensorflow-gpu==1.8.0    # CUDA=9, CUDNN=7

Outline:

Project Structure:

Mode:

Guide:

video mode setting: {
    "display": Draw in random colors; the ink fades over time, like a tail,
    "styleTransfer": Stylize the whole webcam input or only your clothes,
    "grabCut": Cut you out of the scene and move you into a new video,
    "glass": Put a pair of glasses on your face,
    "AR": Build a roof on the plane you choose,
}

Overall:

Display mode:

Algorithms: Nothing worth mentioning. I had hoped to use my hands as the controllers so that gestures could drive everything. However, my little ThinkPad with a GeForce 940M GPU didn't approve of that suggestion..., so I went with the controller idea I bumped into on Adrian Rosebrock's blog.
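
The fading "tails" effect can be sketched roughly as follows. This is a minimal sketch rather than the project's actual code: it assumes the controller's centre point has already been located in each frame, keeps only the most recent points in a fixed-length deque, and gives older segments a smaller thickness. The names `Trail`, `max_len`, and `segments` are hypothetical.

```python
from collections import deque
import math


class Trail:
    """Keeps the most recent controller positions and fades older ones."""

    def __init__(self, max_len=32):
        self.max_len = max_len
        # Newest point is stored at the left; old points fall off the right.
        self.points = deque(maxlen=max_len)

    def add(self, point):
        """Record the controller centre (x, y) found in the current frame."""
        self.points.appendleft(point)

    def segments(self):
        """Return (start, end, thickness) tuples, thickest for the newest."""
        result = []
        for i in range(1, len(self.points)):
            # Thickness shrinks with age, so the ink appears to fade out.
            thickness = max(1, int(math.sqrt(self.max_len / (i + 1)) * 2.5))
            result.append((self.points[i - 1], self.points[i], thickness))
        return result
```

Each returned segment could then be drawn on the frame with `cv2.line(frame, start, end, color, thickness)`, picking a random color per stroke.
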
StyleTransfer mode:

Style image: Yes..., it's "The Starry Night" again (@...@)! Here she comes:

Algorithm: Clarification by my roommates' team.

Whole input is stylized except my body:

Algorithms: HSL Color Space, Basic Morphology operations, etc.
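
Once a body mask is available, the final compositing step might look like the sketch below. It assumes the mask has already been produced (for example by HSL thresholding followed by morphological clean-up, neither of which is shown here) and that the stylized frame has the same shape as the original. The function name `composite` and its parameters are hypothetical.

```python
import numpy as np


def composite(original, stylized, body_mask):
    """Keep original pixels where body_mask is set, stylized ones elsewhere."""
    if original.shape != stylized.shape:
        raise ValueError("original and stylized frames must have the same shape")
    # Broadcast the 2-D mask over the colour channels.
    keep = body_mask.astype(bool)[..., np.newaxis]
    return np.where(keep, original, stylized)
```

Inverting the mask before the call gives the opposite effect, where only the masked region is stylized.
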
Only clothes stylized:

Algorithms: Background Subtraction (LSBP), HSL Color Space

~~Simple Formula Evaluation (Removed):~~

Given my laptop, a ThinkPad T450 with an i5-5200U and a GeForce 940M, I used LeNet to recognize each single character. (Since this is only a very simple formula evaluation, I took only some basic operations into account.)

The shuffled dataset consists of MNIST and handwrittenMathSymbol. BTW, if you're interested in recognizing complex mathematical expressions, take a look at the MathSymbol dataset, which comes from a competition of this kind on Kaggle.
1. ~~The well-trained MobileNetV2:~~
2. ~~Then I split the formula horizontally, just as I did in VehicleLicensePlateRecognition.~~
3. ~~Afterwards, recognize each single character.~~
4. ~~Finally, evaluate the stitched string.~~
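
The split and evaluate steps of the removed pipeline above can be sketched as follows. This is only an illustration under assumptions, not the original code: characters are assumed to be separated by at least one blank column in a binarized image, the per-character recognizer is left out, and only `+ - * /` on plain numbers is supported. The names `split_columns` and `evaluate` are hypothetical.

```python
import ast
import operator

import numpy as np

# Only these basic binary operations are evaluated.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def split_columns(binary):
    """Split a binary image (ink > 0) into (start, end) column ranges."""
    has_ink = (binary > 0).any(axis=0)
    ranges, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x  # a character begins
        elif not ink and start is not None:
            ranges.append((start, x))  # a blank column ends the character
            start = None
    if start is not None:
        ranges.append((start, len(has_ink)))
    return ranges


def evaluate(expression):
    """Evaluate a string of numbers and + - * / without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval"))
```

Each column range would be cropped and passed to the character recognizer, and the predicted characters joined into the string handed to `evaluate`. Walking the syntax tree instead of calling `eval()` keeps a misrecognized character from executing arbitrary code.
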
Figure Extraction:

Algorithm: grabCut.

Extension: Mask R-CNN
Glass mode:

Algorithms: Haar cascade.
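
After a face or eye box has been detected (OpenCV's `CascadeClassifier.detectMultiScale` returns `(x, y, w, h)` boxes), pasting the glasses is essentially an alpha blend. The sketch below assumes the glasses image is an RGBA array already resized to fit the detected box. The function name `overlay_rgba` and the way the position is chosen are assumptions, not the project's code.

```python
import numpy as np


def overlay_rgba(frame, sticker, x, y):
    """Alpha-blend an RGBA sticker onto a 3-channel frame, top-left at (x, y)."""
    h, w = sticker.shape[:2]
    # Clip the sticker to the part that actually lies inside the frame.
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, frame.shape[1]), min(y + h, frame.shape[0])
    if x0 >= x1 or y0 >= y1:
        return frame  # sticker is completely outside the frame
    part = sticker[y0 - y:y1 - y, x0 - x:x1 - x].astype(np.float32)
    alpha = part[..., 3:4] / 255.0
    region = frame[y0:y1, x0:x1].astype(np.float32)
    blended = alpha * part[..., :3] + (1.0 - alpha) * region
    frame[y0:y1, x0:x1] = blended.astype(frame.dtype)
    return frame
```

The frame is modified in place and also returned, so the call can be chained inside the video loop.
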
AR mode (building a roof on a plane):

Modified from the plane_ar sample in OpenCV.

Algorithms: 3d_calibration.

TODO:

Use OpenPose to estimate my posture, especially the hands.
Modify the roof in AR mode into a more general object, such as an image or a video.
Yet, above all, get a fairly good computer with a camera.