dkoslicki/NCATS

API caching for Q1 and Q2?

Closed this issue · 8 comments

@saramsey can someone run Q1Solution.py -a and Q2Solution.py -a on lysine so the sqlite API caching is updated for all the relevant queries?

I'm on it

Looks like they will need to run sequentially rather than concurrently. Right?

Just started this on lysine:
python3 Q1Solution.py -a 1>q1_stdout.log 2>q1_stderr.log && python3 Q2Solution.py -a 1>q2_stdout.log 2>q2_stderr.log
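For anyone who wants per-script timings next time, here is a rough Python equivalent of the `&&` chain above. It is only a sketch: `run_sequentially` is a hypothetical helper, not something in the repo. It runs each command in order, redirects stdout and stderr to separate files, and stops at the first non-zero exit code, which is what `&&` does in the shell.

```python
import subprocess
import time


def run_sequentially(jobs):
    """Run each (command, stdout_path, stderr_path) job in order.

    Stops after the first job that exits non-zero, like `&&` in the shell.
    Returns a list of (command, return_code, elapsed_seconds) for the jobs
    that were actually started.
    """
    results = []
    for command, stdout_path, stderr_path in jobs:
        start = time.monotonic()
        # Redirect stdout and stderr to separate log files (1>... 2>...).
        with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
            completed = subprocess.run(command, stdout=out, stderr=err)
        elapsed = time.monotonic() - start
        results.append((command, completed.returncode, elapsed))
        if completed.returncode != 0:
            break  # do not start later jobs after a failure
    return results


# Example usage mirroring the command in this thread (not executed here):
# jobs = [
#     (["python3", "Q1Solution.py", "-a"], "q1_stdout.log", "q1_stderr.log"),
#     (["python3", "Q2Solution.py", "-a"], "q2_stdout.log", "q2_stderr.log"),
# ]
# for command, code, seconds in run_sequentially(jobs):
#     print(" ".join(command), "exit", code, "hours", round(seconds / 3600, 2))
```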

OK, the two scripts ran sequentially for approx. 5.75 hours in total and appear to have completed without exiting prematurely. The orangeboard.sqlite cache file appears to have been updated (based on its file timestamp), though it is already so large that the relative size increase is negligible.
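For readers unfamiliar with the idea, the following is a minimal sketch of a SQLite-backed response cache. The actual mechanism behind orangeboard.sqlite is not shown in this thread, so the class name `SqliteCache`, the `responses` table, and the `fetch` callback are all assumptions made for illustration, using only the standard-library `sqlite3` and `json` modules.

```python
import json
import sqlite3


class SqliteCache:
    """Tiny key/value cache stored in a SQLite file (hypothetical helper)."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS responses "
            "(key TEXT PRIMARY KEY, value TEXT)"
        )
        self.conn.commit()

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) only on a miss."""
        row = self.conn.execute(
            "SELECT value FROM responses WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])  # cache hit: no remote call needed
        value = fetch(key)  # cache miss: do the slow call once
        self.conn.execute(
            "INSERT OR REPLACE INTO responses (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.conn.commit()
        return value
```

With a cache like this, the first full run pays for every remote call, and later runs only pay for queries that were not seen before.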

Here is a link to the logfiles (stdout and stderr separately) for the two scripts:
Google Drive NCATS/RT CLI

Now I propose to restart the scripts so we can see if there is a speed improvement due to the caching.

OK per previous comment, I have just restarted:
python3 Q1Solution.py -a 1>q1_stdout.log 2>q1_stderr.log && python3 Q2Solution.py -a 1>q2_stdout.log 2>q2_stderr.log

@dkoslicki: After the cache has been built up, running these two scripts sequentially takes 4.0 hours, down from approx. 5.75 hours on the previous run (roughly a 30% reduction). So that is a pretty significant reduction in running time. Caching FTW.