simon-andrews/umass-toolkit

Results from large/slow/expensive calls should be cached

simon-andrews opened this issue · 5 comments

Currently, when a user requests data from UMass Toolkit, the functions will go to the server, download the data, and return it to the user. However, this is probably unnecessary since the data doesn't change all that often (example: dining menus only change once per day). We should figure out some system for saving data in a variable somewhere when a function is called for the first time. Then, when the function is called again, check to see if some amount of time has passed. If that time has not passed, return the cached data. If it has passed, download the data again and update the cache.

Why?

  • To speed up programs
  • So that we don't request stuff from the UMass servers too often

Recommended reading:

Recommended experience:

  • Programming in any language. Basically knowing what variables are. Bonus points if you took AP CompSci and already know your HashMaps!

This issue can be solved by using two scripts: one to serve the result from a database, and another to update the database after some time. That way we simply query the database and don't have to worry about how much time has passed or about retrieving the cache. I'll translate this to code after a while.

That's sort of the idea here, but a database would be a little much since a variable would be sufficient for what we're doing. Also, by keeping track of time since last access we wouldn't need to worry about threading or multiple scripts and having some background updater process running all the time.

Only updating if n seconds have passed means we're only updating when the user actually calls the function. If they never call it, we never download the data, which saves some computing time/power/space.

Here's basically what my idea is:
Image of my idea as a flowchart
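The check-on-call flow described in this comment could also be written as a reusable decorator, so no background updater is needed. This is only a sketch under assumptions: `cached_for`, its `clock` parameter, `get_menu`, and the sample hall names are hypothetical and not part of UMTK.

```python
import functools
import time


def cached_for(seconds, clock=time.monotonic):
    """Cache a function's results, per argument tuple, for `seconds` seconds."""
    def decorator(func):
        entries = {}  # maps positional args -> (fetched_at, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            entry = entries.get(args)
            if entry is not None and now - entry[0] < seconds:
                return entry[1]  # still fresh: skip the download
            value = func(*args)  # missing or stale: download and store
            entries[args] = (now, value)
            return value

        return wrapper
    return decorator


calls = []  # records every simulated download, for demonstration only


@cached_for(seconds=3600)
def get_menu(dining_hall):
    """Hypothetical stand-in for a slow request; no network access here."""
    calls.append(dining_hall)
    return f"menu for {dining_hall}"


# Nothing has been downloaded yet, because nothing has been called.
get_menu("worcester")  # first call: downloads
get_menu("worcester")  # within the time limit: served from the cache
get_menu("franklin")   # different argument: its own cache entry
```

The work happens only inside the call itself, so a program that never calls `get_menu` never triggers a download.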

This approach would mean that the service will be slow for some (those who call it after n seconds have passed) and fast for others (those who call it before n seconds have passed).
I would still highly recommend using a database, because maintaining and updating variables is difficult on the server side: each time the server runs, it's a different instance of the same code.

Oh I gotcha. UMTK is a client-side library for interacting with servers run by UMass. We don't operate any servers or databases ourselves. There's only one "user" in our scenario, who is the programmer using UMTK in their own program.