individual_path: A Jupyter Notebook repository from LilianYou

This is Lily Cheng's report for the UCI winter20 psych239 class final project.

The goal of this project is to classify two individuals' paths. The dataset comes from a mobile game app called Sea Hero Quest (Hyde et al., 2016) [links: https://play.google.com/store/apps/details?id=com.glitchers.catchhero, https://itunes.apple.com/gb/app/sea-hero-quest/id1034383306?mt=8], which contains 45 levels of wayfinding tasks. Difficulty increases from level 1 to level 45. Each task starts by showing the player a map with numbered flags that indicate the order of the goal locations. Players are supposed to plan their route in advance while viewing the map. After viewing the map, players take a first-person view and drive a boat through a river maze. Tapping to the left or right of the boat turns it, and sliding up or down speeds it up or slows it down. The game starts in a very easy mode with a single goal location, as in level 1 and level 2. Later levels add fog, which hides half of the river, and more complex river branches. Each player's position (x, y coordinates) during the navigation task is recorded every 0.5 s. Players can also voluntarily report demographic information (e.g., sex, age, location) for an extra bonus in the game. Previous research on this dataset suggests that people's performance in Sea Hero Quest is similar to their performance in real-world navigation (Coutrot et al., 2019).
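As a rough illustration of the recording format described above, a trajectory can be treated as an ordered list of (x, y) samples taken 0.5 s apart. The sketch below is not taken from the project notebooks: the function names and the example coordinates are hypothetical, and only the 0.5 s interval comes from the text.

```python
import math

SAMPLE_INTERVAL_S = 0.5  # sampling period stated in the text


def timestamps(n_samples, interval=SAMPLE_INTERVAL_S):
    """Reconstruct sample times (in seconds) for a fixed-rate recording."""
    return [i * interval for i in range(n_samples)]


def path_length(points):
    """Sum the straight-line distances between consecutive (x, y) samples."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical trajectory; these coordinates are made up for illustration.
trajectory = [(0.0, 0.0), (3.0, 4.0), (3.0, 8.0)]

print(timestamps(len(trajectory)))  # [0.0, 0.5, 1.0]
print(path_length(trajectory))      # 9.0
```

Because the sampling rate is fixed, the index of a sample is enough to recover its time, which is the temporal information that a line plot of the path keeps and a scatter plot discards.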

Sex differences have been found frequently in navigation studies. Males generally perform better (Astur, Ortiz, & Sutherland, 1998; Moffat, Hampson, & Hatzipantelis, 1998), use allocentric strategies more frequently (Coluccia & Louse, 2004; Lawton, 1994; Lawton & Kallai, 2002), and report lower general navigation anxiety (Lawton, 1994) than females. The author's ultimate research goal is to examine sex differences in travel trajectories by using a machine learning algorithm to classify the travel trajectories of males and females at different game levels. However, because access to the full Sea Hero Quest dataset is currently limited, the author took a step back and asked a smaller question: Can we use a machine learning algorithm to classify individual differences in travel trajectories (i.e., paths)?

Whether individuals differ in their travel trajectories is a valid question. Researchers have found large individual differences in navigation performance in a maze (unpublished data from the Chrastil Lab). However, it is unknown whether there are individual signatures in travel trajectories that lead to different navigation performance. To investigate this, I started with a very small sample: data from 2 subjects (one of whom is the author) in the first 14 levels, extracted directly from the developer version of the Sea Hero Quest game.

Here are the main steps of the data analysis, which also correspond to the step titles in the code. All data were saved in the author's Google Drive and are available via a shared link: https://drive.google.com/drive/folders/13tXB6uLEg60Tl01sSNlP7hthITKdgKiw?usp=sharing . The author mainly used TensorFlow to build neural networks on the Google Colaboratory (Colab) platform.

The author first extracted each individual's coordinate data at each level separately and plotted them as separate travel trajectory figures. The visualizations show quite obvious individual differences between the two players: one is a more efficient navigator who took smooth paths in most tasks, while the other is a less efficient navigator who moved back and forth along uncertain routes in most cases. Next, half of the travel trajectory images were saved as the training dataset and the other half were saved as the validation dataset.

After loading the images for both training and validation, the author built a neural network with 3 convolutional blocks, each containing a max pooling layer. On top of these is a fully connected layer with 512 units and a ReLU activation function. The batch size was set to 2 and the number of epochs was set to 10.

After training and validation, the author plotted two figures. One shows how training and validation accuracy change across epochs. The other shows how training and validation loss change across epochs. The two accuracy lines in the first figure stay almost constant over the 10 epochs, with training accuracy always high and validation accuracy always low. This large gap between the two lines is an obvious sign that the model is overfitting. Therefore, extra steps were needed to reduce the overfitting.
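The architecture described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the notebook's actual code: the text gives only the number of convolutional blocks, the 512-unit ReLU layer, the batch size, and the number of epochs. The image size (150 x 150), the filter counts (16, 32, 64), the 3 x 3 kernels, the sigmoid output, the Adam optimizer, and the helper names are all illustrative assumptions.

```python
BATCH_SIZE = 2  # from the text
EPOCHS = 10     # from the text


def feature_map_size(size, n_blocks=3, pool=2):
    """Side length left after n_blocks of 'same'-padded conv + max pooling."""
    for _ in range(n_blocks):
        size //= pool  # each pooling layer halves the size, rounding down
    return size


def build_model(img_height=150, img_width=150, filters=(16, 32, 64)):
    """Three conv + max-pool blocks, then a 512-unit ReLU dense layer."""
    # Imported here so the shape helper above can be used without TensorFlow.
    import tensorflow as tf

    layers = tf.keras.layers
    model = tf.keras.Sequential(
        [tf.keras.Input(shape=(img_height, img_width, 3))]
    )
    for n_filters in filters:  # assumed filter counts, one per block
        model.add(layers.Conv2D(n_filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    model.add(layers.Dense(512, activation="relu"))
    # Assumed output layer: one sigmoid unit for the two-player decision.
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(
        optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"]
    )
    return model


print(feature_map_size(150))  # 18
```

Under these assumptions the last pooling layer outputs an 18 x 18 x 64 feature map, so the 512-unit layer receives 20,736 inputs. That is a large number of weights to fit from a few dozen images, which is consistent with the overfitting described above.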
The author first tried adding augmentation to the training data generator, including horizontal flip, zoom range, height shift range, width shift range, and rotation range. However, the overfitting did not improve after augmentation, so augmentation was judged unnecessary during training and was removed. Next, the author added dropout (20%) after the first and third max pooling layers in the neural network. This time, the training accuracy increased and then plateaued at 0.5, while the validation accuracy stayed constant at around 0.33. Compared with the previous model, the gap between the two accuracy lines decreased, so adding dropout improved the model. There is still obvious overfitting, but the dataset in this project is very small, so it is not surprising that the model did not classify well. I expect that the same algorithm will work much better if it can be trained on the full Sea Hero Quest dataset. Alternatively, it is also possible that there is no path signature strong enough to support valid classification at the individual level.
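To make the dropout step concrete, here is a small library-free sketch of what a 20% dropout layer does to a vector of activations. It illustrates the general technique (inverted dropout, which is also the behavior documented for Keras `Dropout` layers) and is not the notebook's code. The function name, the seed, and the example activations are hypothetical.

```python
import random


def dropout(activations, rate=0.2, training=True, rng=random):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(activations)  # dropout is switched off outside training
    keep = 1.0 - rate
    # Survivors are divided by `keep` so the expected sum stays the same.
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)    # fixed seed so the example is repeatable
activations = [1.0] * 10  # made-up activations for illustration

dropped = dropout(activations, rate=0.2, rng=rng)
print(dropped)  # every entry is either 0.0 or 1.25

print(dropout(activations, training=False))  # unchanged at validation time
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single feature of a training image, which is why dropout tends to narrow the gap between training and validation accuracy.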

Additionally, I tried plotting the navigation data in two different ways. The first can be seen in the code file "psych239_final_proj.ipynb". In this file, people's travel trajectories were plotted as path lines, which include temporal information. The second was used in the code file "psych239_final_proj2.ipynb". In this file, temporal information was removed and people's travel trajectories were represented as scatter plots. The results reported above come from the scatter plots. Training on line plots, which include temporal information, gave a worse fit: training accuracy was high at the very beginning, dipped sharply, and then stayed high for the remaining epochs. Therefore, a scatter plot of each individual travel trajectory, with temporal information removed, is the preferred input for classification.
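The difference between the two encodings can be illustrated without any plotting library. The sketch below is an assumption-laden stand-in for the notebooks' figures, not their actual plotting code: it rasterizes hypothetical integer grid coordinates, marking only the sampled cells for the scatter version and also filling the cells between consecutive samples for the line version.

```python
def scatter_raster(points, width, height):
    """Mark only the sampled positions; the order of samples has no effect."""
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        grid[y][x] = 1
    return grid


def line_raster(points, width, height):
    """Also fill cells between consecutive samples, so sample order matters."""
    grid = scatter_raster(points, width, height)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        steps = max(abs(x1 - x0), abs(y1 - y0))
        for s in range(1, steps):  # interior cells of the segment
            x = round(x0 + (x1 - x0) * s / steps)
            y = round(y0 + (y1 - y0) * s / steps)
            grid[y][x] = 1
    return grid


# Hypothetical samples on a 5 x 4 grid, and the same samples in another order.
path = [(0, 0), (4, 0), (4, 3)]
shuffled = [(4, 3), (0, 0), (4, 0)]

print(scatter_raster(path, 5, 4) == scatter_raster(shuffled, 5, 4))  # True
print(line_raster(path, 5, 4) == line_raster(shuffled, 5, 4))        # False
```

The scatter encoding gives the same image for any ordering of the same samples, while the line encoding changes with the order in which positions were visited. That is the temporal information the second notebook removes.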

In the near future, I will port the code to a Python Spark version and translate it into NumPy. Then I will use the same model to analyze sex differences in travel trajectories in the full Sea Hero Quest dataset.

The author would like to thank Professor Emre Neftci, who helped her develop ideas and grasp the course materials to keep the project moving forward.

References:

Hyde, M., Scott-Slade, M., Scott-Slade, H., Hornberger, M., Spiers, H., Dalton, R., ... & Bohbot, V. (2016). Sea Hero Quest: The World’s first mobile game where anyone can help scientists fight dementia.

Coutrot, A., Schmidt, S., Coutrot, L., Pittman, J., Hong, L., Wiener, J. M., ... & Spiers, H. J. (2019). Virtual navigation tested on a mobile app is predictive of real-world wayfinding navigation performance. PLoS ONE, 14(3).

Astur, R. S., Ortiz, M. L., & Sutherland, R. J. (1998). A characterization of performance by men and women in a virtual Morris water task: A large and reliable sex difference. Behavioural Brain Research, 93(1–2), 185–190.

Moffat, S. D., Hampson, E., & Hatzipantelis, M. (1998). Navigation in a “virtual” maze: Sex differences and correlation with psychometric measures of spatial ability in humans. Evolution and Human Behavior, 19(2), 73–87.

Coluccia, E., & Louse, G. (2004). Gender differences in spatial orientation: A review. Journal of Environmental Psychology, 24(3), 329–340.

Lawton, C. A. (1994). Gender differences in way-finding strategies: Relationship to spatial ability and spatial anxiety. Sex Roles, 30(11–12), 765–779.

Lawton, C. A., & Kallai, J. (2002). Gender differences in wayfinding strategies and anxiety about wayfinding: A cross-cultural comparison. Sex Roles, 47(9–10), 389–401. https://doi.org/10.1023/A:1021668724970
