carissalow/rapids

1.4/features/phone-locations/

utterances-bot opened this issue · 5 comments

Hi RAPIDS team! I hope this is the right place to post...I'm playing around at the moment with using RAPIDS to extract behavioural location features using the Barnett provider. I am getting an error for (at least) one of my participants:

Error in if (hourofday >= 20 || hourofday <= 8) { :
missing value where TRUE/FALSE needed
Calls: barnett_daily_features ... MobilityFeatures -> GetMobilityFeaturesMat -> SigLocs
Execution halted
[Thu Jul 22 17:47:00 2021]
Error in rule phone_locations_barnett_daily_features:
jobid: 2251
output: data/interim/p451/phone_locations_barnett_daily.csv

RuleException:
CalledProcessError in line 401 of /rapids/rules/features.smk:
Command 'set -euo pipefail; Rscript --vanilla /rapids/.snakemake/scripts/tmp1ialzxof.daily_features.R' returned non-zero exit status 1.
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2339, in run_wrapper
File "/rapids/rules/features.smk", line 401, in __rule_phone_locations_barnett_daily_features
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 560, in _callback
File "/opt/conda/envs/rapids/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 546, in cached_or_run
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2351, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

I wonder if you could shed some light on what's going on here...I can't work it out myself. I adjusted the code slightly to print hourofday and xx at the end of the 'determine which is home' for-loop in SigLocs.R to see what was happening, but there are no missing values, and my participant clearly has times later than hour 20 and earlier than hour 8. I can't see any reason why this participant is any different from the others. Short of removing this participant, do you have any ideas on what I can try to fix this?
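A note for anyone debugging a similar failure: R raises "missing value where TRUE/FALSE needed" when an if condition evaluates to NA, so hourofday was presumably NA on at least one iteration. If the added print sits after that if statement, it would never show the offending value, because execution halts first. The Python sketch below is one way to scan a participant's timestamps for rows whose hour of day cannot be computed. It is not RAPIDS or Barnett code: the function names are hypothetical, and it assumes timestamps are epoch milliseconds and reads the hour in UTC (the real code may use local time).

```python
import math
from datetime import datetime, timezone


def hour_of_day(epoch_ms):
    """Return the UTC hour (0-23) of an epoch-millisecond timestamp, or None if it is missing."""
    if epoch_ms is None:
        return None
    if isinstance(epoch_ms, float) and math.isnan(epoch_ms):
        return None
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).hour


def find_missing_hours(timestamps):
    """Return the indices of timestamps whose hour of day cannot be computed."""
    return [i for i, ts in enumerate(timestamps) if hour_of_day(ts) is None]


def is_night(hour):
    """Mirror the failing R condition (hourofday >= 20 || hourofday <= 8), treating a missing hour as False."""
    return hour is not None and (hour >= 20 or hour <= 8)


if __name__ == "__main__":
    # Hypothetical data: two valid timestamps and two missing ones.
    sample = [0, None, float("nan"), 21 * 3600 * 1000]
    print("rows with no usable hour:", find_missing_hours(sample))
```

Printing the value before the comparison, rather than after it, is the equivalent check inside the R loop itself.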

Hi @heroberts, thanks for reaching out! I wouldn't be able to say for sure what is going on here since that is code I adapted from the original implementation. It's interesting tho' because I've never seen that error before with our datasets.

However, we do have good news: Ian Barnett and Shirley Hayati came up with a re-implementation of these features in Python that fixes some bugs and is faster. The patch is almost ready (I need to clean it up a little bit), and hopefully it fixes the problem you are having.

Once the patch is merged into develop, I'll let you know, and you can give it a try. If that does not work, we could take a closer look at the data from that participant to try to debug the problem.

Thanks for your response @JulioV! I did see the Python version pull request so as soon as that is available I will give it a try :-) If I figure it out on my side I'll also let you know in case it can help others in future.

PS Really great work you're all doing with this project - looking forward to seeing it go from strength to strength.

Thanks for your kind words @heroberts, we are glad you found RAPIDS useful.

A couple of updates:

A preliminary version of the Python code is in this branch. It can be configured and run in the same way as the old R implementation. Please feel free to give it a try. So far there is one last pesky bug that might be related to location traces without any travels, but I think this problem is rare. It'd still be informative to know whether or not you get the same error with your data. FYI, the error message is IndexError: index -1 is out of bounds for axis 0 with size 0.
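For context, that wording is what NumPy reports when an empty array is indexed with -1, which fits the idea of a trace with no travel segments. The sketch below only illustrates the kind of guard that avoids it. It is not the code in the branch, the names are hypothetical, and it uses a plain list so that it runs without NumPy (an empty list raises "list index out of range" instead).

```python
def last_or_none(values):
    """Return the last element of a sequence, or None when the sequence is empty."""
    if len(values) == 0:
        # Nothing to index: values[-1] would raise IndexError here.
        return None
    return values[-1]


if __name__ == "__main__":
    # Hypothetical travel segments for two participants.
    with_travel = [(0, 5), (9, 12)]
    without_travel = []
    print(last_or_none(with_travel))
    print(last_or_none(without_travel))
```

Whether returning None, skipping the participant, or emitting empty features is the right fallback depends on how the downstream feature code treats days without movement.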

Also thanks again for your PR, we always try to recognize everyone's contributions so also feel free to add your name and profile to our list of Community Contributors in the team and main pages.

Closing this now that the Python implementation is out in v1.5.0.