Persistence Problem
I expect to be able to reuse a cache persisted to disk. To simulate restarting Julia (and because my use case will involve a wrapper function that gets redefined anyway), I define a new function to test it, but that fails.
Steps to reproduce:
using Caching, Unitful, Logging
using Unitful: V, A
# turn on debug output
global_logger(ConsoleLogger(stderr, Logging.Debug))
# create cache entries and sync
power(U::Unitful.Voltage, I::Unitful.Current) = U * I
cache = Caching.Cache(power; name="power", filename="cache-power.bin")
cache(1V, 1A)
syncache!(cache; with="both")
# now try to reuse the cache
power2(U::Unitful.Voltage, I::Unitful.Current) = U * I
cache2 = Caching.Cache(power2; name="power", filename="cache-power.bin")
syncache!(cache2; with="both")
cache2(1V, 1A)
The output indicates that cache2 is empty and the function call to it produces a "Full cache miss." Am I using Caching incorrectly, or is this a bug?
Unitful can be removed from the example to make it minimal (and more general):
using Caching, Logging
# turn on debug output
global_logger(ConsoleLogger(stderr, Logging.Debug))
# create cache entries and sync
power(U, I) = U * I
cache = Caching.Cache(power; name="power", filename="cache-power.bin")
cache(1, 1)
syncache!(cache; with="both")
# now try to reuse the cache (this fails)
power2(U, I) = U * I
cache2 = Caching.Cache(power2; name="power", filename="cache-power.bin")
syncache!(cache2; with="both")
cache2(1, 1)
This issue is also independent of having a new function definition, as the following two listings (to be executed in consecutive Julia REPL sessions) show:
using Caching, Logging
# turn on debug output
global_logger(ConsoleLogger(stderr, Logging.Debug))
# create cache entries and sync
power(U, I) = U * I
cache = Caching.Cache(power; name="power", filename="cache-power.bin")
cache(1, 1)
syncache!(cache; with="both")
exit()
using Caching, Logging
# turn on debug output
global_logger(ConsoleLogger(stderr, Logging.Debug))
# sync cache and try to use it
power(U, I) = U * I
cache = Caching.Cache(power; name="power", filename="cache-power.bin")
syncache!(cache; with="both")
cache(1, 1)
The result is again a "Full cache miss." I expected a cache hit (from disk).
You are correct; however, it is not a bug: loading an existing disk cache is not supported. This should perhaps be stated more explicitly.
Directly associating a cache file with a cache object is more complicated, as it would need a specific file format (entry separators) and a parser for the file. It would be even harder to guarantee that no mismatch occurs (i.e. that the wrong cache file is not picked up). In short, this would complicate matters a lot and would be a recipe for errors.
The current implementation allows for using the disk cache as long as the Cache object is in memory, as the file is just a stream of serialized objects with separators, i.e. offsets, held inside the Cache object.
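To illustrate the idea, here is a minimal sketch using only the Serialization standard library. It is not the Caching.jl implementation: the helper names write_entries and read_entry and the layout of the offsets Dict are assumptions made for this example. It shows why a data file alone is not enough when the index lives only in memory.

```julia
using Serialization

# Hypothetical stand-in for the on-disk cache: values are appended to one file,
# and the byte range of each entry is remembered only in an in-memory Dict.
function write_entries(path, entries)
    offsets = Dict{UInt64, Tuple{Int, Int}}()   # argument hash => (start, stop)
    open(path, "w") do io
        for (args, value) in entries
            start = position(io)
            serialize(io, value)
            offsets[hash(args)] = (start, position(io))
        end
    end
    return offsets
end

# Look an entry up by seeking to its recorded start position.
function read_entry(path, offsets, args)
    haskey(offsets, hash(args)) || return nothing   # no index entry: a miss
    start, _ = offsets[hash(args)]
    open(path, "r") do io
        seek(io, start)
        deserialize(io)
    end
end

path = tempname()
offsets = write_entries(path, [((1, 1), 1), ((2, 3), 6)])

# With the index in memory the entry can be found.
hit = read_entry(path, offsets, (2, 3))

# A fresh, empty index over the same file finds nothing.
miss = read_entry(path, Dict{UInt64, Tuple{Int, Int}}(), (2, 3))
```

In this sketch the second lookup misses even though the data is on disk, which mirrors the "Full cache miss" reported above.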
The cached entries check can be found in utils.jl, lines 105-106, and relies entirely on the Cache object information.
memonly = setdiff(cache.history, keys(cache.offsets))
diskonly = setdiff(keys(cache.offsets), cache.history)
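As a rough illustration of what these two lines compute, the sketch below uses plain collections in place of the Cache fields. The element types and values are assumptions chosen for the example, not the actual field definitions in Caching.jl.

```julia
# Assumed shapes, for illustration only: `history` lists the hashes held in
# memory, `offsets` maps hashes to positions in the cache file.
history = UInt64[0x01, 0x02]
offsets = Dict{UInt64, Tuple{Int, Int}}(0x02 => (0, 9), 0x03 => (9, 18))

memonly  = setdiff(history, keys(offsets))   # in memory, not on disk
diskonly = setdiff(keys(offsets), history)   # on disk, not in memory

# A freshly constructed object would start with both collections empty,
# so nothing is reported as being on disk, whatever the file contains.
fresh_history  = UInt64[]
fresh_offsets  = Dict{UInt64, Tuple{Int, Int}}()
fresh_diskonly = setdiff(keys(fresh_offsets), fresh_history)
```

Here memonly contains only the first hash and diskonly only the third, while fresh_diskonly is empty.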
Closing as this is not a bug.
PS: A solution for this is to serialize the Cache object (i.e. cache) itself. After deserialization the file cache "cache-power.bin" can be safely used ;) (the Caching module and power method have to be in scope)
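A minimal sketch of that round trip, using a hypothetical ToyCache struct instead of a real Caching.Cache so that it runs with the Serialization standard library alone. Whether a real Cache object serializes as smoothly has not been verified here.

```julia
using Serialization

# Hypothetical stand-in for a cache object: it only knows the data file name
# and where each entry starts inside that file.
struct ToyCache
    filename::String
    offsets::Dict{UInt64, Int}
end

# First "session": write one entry and remember its offset.
datafile = tempname()
toy = ToyCache(datafile, Dict{UInt64, Int}())
open(datafile, "w") do io
    toy.offsets[hash((1, 1))] = position(io)
    serialize(io, 1 * 1)
end

# Persist the object itself in a second file.
objfile = tempname()
serialize(objfile, toy)

# Second "session": restore the object, then use its offsets to read the data.
restored = deserialize(objfile)
value = open(restored.filename, "r") do io
    seek(io, restored.offsets[hash((1, 1))])
    deserialize(io)
end
```

In the same way that the Caching module and power have to be in scope, the ToyCache type must be defined before deserialize is called in a new session.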
Thanks!
It would be great to have a persistent cache. I was basically hoping your module could work similarly to the joblib cache in Python. Are you taking feature requests?
Apparently persistence is a thing. I'll look into it in the following weeks; if you have a solution or an idea, or want to hack on it, PRs are welcome as well ;)
Reopening as well
Cool, thanks! I guess it's not going to be easy... Will make a PR if I can come up with something.