ExampleWasTaken/reactify

add caching

Caching

Intelligent, efficient caching is an important step towards reducing API calls and making the application faster and more efficient.
This issue serves as a first design outline for caching functionality.

This design assumes manual API handling is in place and that the cache has:

  • access to response headers returned by the Spotify API
  • complete control over network requests

It does not consider API handling in general. See #12 for that.

Design

The cache is divided into two kinds of caches: the memory cache and the disk cache.

Memory Cache

The memory cache is for high-speed data delivery. The memory cache acts as a gateway to the disk cache. It keeps all data for a short period of time before flushing it to the disk cache.

Disk Cache

The disk cache uses localStorage or sessionStorage, depending on the data being cached. It holds all cached data, but because it is downstream of the memory cache, it may not always contain the most recent data. (See Data priority further down for more info.)

Data flow

When data is requested, the cache first checks whether it can deliver that data.
If a cached entry exists and is still valid, the cache returns it and skips the network request.
If the requested data is not available, the cache requests it from the origin and returns it as soon as it arrives. Afterwards, the cache starts to process the data.
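The lookup order described above could be sketched roughly as follows. This is a minimal illustration, not a settled API: `TierEntry`, `Tier`, `LookupResult`, and `lookup` are hypothetical names, and plain `Map` objects stand in for both caches. The memory cache is checked first because it acts as the gateway to the disk cache.

```typescript
// Hypothetical shapes for illustration; none of these names come from the issue.
interface TierEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds after which the entry counts as stale
}

type Tier<T> = Map<string, TierEntry<T>>;

type LookupResult<T> =
  | { hit: true; source: "memory" | "disk"; value: T }
  | { hit: false };

// Returns a still-valid entry, preferring the memory tier over the disk tier.
function lookup<T>(key: string, memory: Tier<T>, disk: Tier<T>, now: number): LookupResult<T> {
  const fromMemory = memory.get(key);
  if (fromMemory !== undefined && fromMemory.expiresAt > now) {
    return { hit: true, source: "memory", value: fromMemory.value };
  }
  const fromDisk = disk.get(key);
  if (fromDisk !== undefined && fromDisk.expiresAt > now) {
    return { hit: true, source: "disk", value: fromDisk.value };
  }
  // Miss: the caller requests the data from the origin and stores the result.
  return { hit: false };
}
```

On a miss, the caller would perform the network request, return the response, and only then hand it to the memory cache for processing.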

Data processing

After receiving new data, the memory cache stores all of it. If any of the flushing conditions are met, it flushes the data to the disk cache. Flushing does not necessarily mean that data is sent to the disk cache and then deleted from memory. Whether a given data object is deleted from the memory cache after a flush depends on its individual caching strategy (see further down for caching strategies). For example, the user's profile image may be kept in the memory cache to ensure fast loading should the user visit a view that displays their profile picture.

Data priority

Data in the memory cache takes priority over data in the disk cache. This ensures (1) the fastest possible response time and (2) that the most recent cached data is returned.

Data flushing

Flushing means updating the disk cache with new data from the memory cache. Since the memory cache does not know what data the disk cache holds, flushing always overwrites data in the disk cache.

Note

Overwriting means replacing existing entries that the memory cache also holds, not wiping the disk cache. Flushing with an empty memory cache leaves the disk cache as is.

Flushing does not mean deleting data from the memory cache. Data can be deleted from the memory cache during a flush, but this is not guaranteed.
Conversely, deleting data from the memory cache always triggers a flush first to ensure that the data is cached on disk.
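The flushing rules above can be illustrated with a small sketch. Everything here is an assumption made for illustration: `MemoryEntry`, its `keepInMemory` flag, `flush`, and `evict` are hypothetical names, and a `Map` stands in for localStorage or sessionStorage.

```typescript
// Hypothetical types; a Map stands in for localStorage/sessionStorage.
interface MemoryEntry {
  value: string;
  keepInMemory: boolean; // assumed per-entry strategy flag
}

// Copies every memory entry to disk, overwriting entries with the same key.
// Disk entries without a counterpart in memory are left untouched.
function flush(memory: Map<string, MemoryEntry>, disk: Map<string, string>): void {
  memory.forEach((entry, key) => {
    disk.set(key, entry.value);
    if (!entry.keepInMemory) {
      memory.delete(key); // eviction is optional and decided per entry
    }
  });
}

// Deleting from memory flushes first so the data still exists on disk.
function evict(key: string, memory: Map<string, MemoryEntry>, disk: Map<string, string>): void {
  flush(memory, disk);
  memory.delete(key);
}
```

Calling `flush` with an empty memory map iterates over nothing, so the disk map stays exactly as it was.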

The memory cache regularly flushes its data to disk. Exactly when that happens is only predictable to a certain degree (see requestIdleCallback() and the list of conditions for flushing below). However, a flush can always be triggered manually by calling the flush() method.

Conditions for flushing include:

  • The browser reports an idle state.
  • A certain period of time has passed since the last flush.
  • A certain amount of unflushed data has accumulated.
  • The user requests a flush using a provided flush() method.
  • The user requests that the memory be cleared.
  • ...this list may be expanded
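One way to keep these conditions testable is to evaluate them in a single pure function, separate from the timers and idle callbacks that feed it. The sketch below assumes hypothetical names (`FlushState`, `FlushLimits`, `flushReason`); the thresholds are parameters because the issue does not specify any values, and the order of the checks is an arbitrary choice.

```typescript
type FlushReason = "manual" | "clear" | "size" | "interval" | "idle";

interface FlushState {
  browserIdle: boolean;          // e.g. set from a requestIdleCallback() handler
  lastFlushAt: number;           // epoch milliseconds of the last flush
  unflushedBytes: number;        // amount of data not yet written to disk
  manualFlushRequested: boolean; // flush() was called
  clearRequested: boolean;       // the memory cache is about to be cleared
}

interface FlushLimits {
  maxIntervalMs: number;     // assumed setting, not defined in the issue
  maxUnflushedBytes: number; // assumed setting, not defined in the issue
}

// Returns the first matching reason to flush, or null if no condition is met.
function flushReason(state: FlushState, limits: FlushLimits, now: number): FlushReason | null {
  if (state.manualFlushRequested) return "manual";
  if (state.clearRequested) return "clear";
  if (state.unflushedBytes >= limits.maxUnflushedBytes) return "size";
  if (now - state.lastFlushAt >= limits.maxIntervalMs) return "interval";
  if (state.browserIdle) return "idle";
  return null;
}
```

Because the list of conditions may be expanded, new conditions would become additional fields on the state and additional members of `FlushReason`.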

Cache invalidation

To remove stale data, both caches regularly (at a set interval) invalidate stale entries by discarding them.
This happens asynchronously and is performed independently by each cache. The potential for race conditions when fetching data can be neglected because only stale data is deleted, which would not be returned anyway.

Note

Only data that is definitely stale will be deleted. Data that has been cached with directives that allow stale data to be returned, such as stale-while-revalidate, will be kept.
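A periodic sweep along these lines might look like the sketch below. `SweepEntry` and `sweep` are hypothetical names, and the sketch assumes that a stale-while-revalidate allowance is stored per entry as a number of milliseconds during which stale data may still be served.

```typescript
interface SweepEntry {
  expiresAt: number;               // epoch ms at which the entry becomes stale
  staleWhileRevalidateMs?: number; // extra window in which stale data may still be served
}

// Removes entries that are past both their expiry and any stale-while-revalidate window.
// Returns the keys that were removed.
function sweep(cache: Map<string, SweepEntry>, now: number): string[] {
  const removed: string[] = [];
  cache.forEach((entry, key) => {
    const usableUntil = entry.expiresAt + (entry.staleWhileRevalidateMs ?? 0);
    if (now >= usableUntil) {
      removed.push(key);
    }
  });
  removed.forEach((key) => cache.delete(key));
  return removed;
}
```

Each cache would run its own sweep on its own schedule, matching the independence described above.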

Manually clearing cache

Manually clearing the cache means deleting all data. The only distinction that can be made is whether to clear only the memory cache or both the memory and the disk cache. Clearing only the disk cache does not make sense, as it would be repopulated by the memory cache on the next flush.

Note

The ability to clear only the memory cache is provided as a devtool to rule out possible issues caused by the memory cache. It usually makes more sense to clear the entire cache.
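The two clearing modes could be expressed as a scope type that simply has no disk-only option. This is a sketch under assumptions: `ClearScope` and `clearCache` are hypothetical names, `Map` objects stand in for both caches, and the memory-only mode flushes first because deleting from the memory cache is described as always causing a flush.

```typescript
// "disk" alone is deliberately not a valid scope.
type ClearScope = "memory" | "all";

function clearCache(scope: ClearScope, memory: Map<string, string>, disk: Map<string, string>): void {
  if (scope === "memory") {
    // Flush before deleting so the data is still cached on disk.
    memory.forEach((value, key) => disk.set(key, value));
  }
  memory.clear();
  if (scope === "all") {
    disk.clear();
  }
}
```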

Common data types and their caching strategy

Important

Cache-control headers should always take priority over custom caching strategies!
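To illustrate that precedence, the sketch below derives a policy from a Cache-Control header value and only falls back to a custom strategy when the header gives no usable freshness information. `Policy` and `resolvePolicy` are hypothetical names, and the parser is deliberately simplified: it only understands `no-store`, `max-age`, and `stale-while-revalidate`, and ignores every other directive.

```typescript
interface Policy {
  store: boolean;
  ttlSeconds: number;
  staleWhileRevalidateSeconds: number;
}

// Header directives win; the custom fallback is used only when the header is
// missing or carries no max-age.
function resolvePolicy(cacheControl: string | null, fallback: Policy): Policy {
  if (cacheControl === null || cacheControl.trim() === "") {
    return fallback;
  }
  const directives = new Map<string, string>();
  cacheControl.split(",").forEach((part) => {
    const [name, value = ""] = part.trim().toLowerCase().split("=");
    if (name) {
      directives.set(name, value);
    }
  });
  if (directives.has("no-store")) {
    return { store: false, ttlSeconds: 0, staleWhileRevalidateSeconds: 0 };
  }
  const maxAge = Number.parseInt(directives.get("max-age") ?? "", 10);
  if (Number.isNaN(maxAge)) {
    return fallback;
  }
  const swr = Number.parseInt(directives.get("stale-while-revalidate") ?? "", 10);
  return {
    store: true,
    ttlSeconds: maxAge,
    staleWhileRevalidateSeconds: Number.isNaN(swr) ? 0 : swr,
  };
}
```

A real implementation would also need to decide how to treat directives such as `no-cache` and `private`, which this sketch does not cover.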

Playlist

Playlists have a snapshot ID that identifies their version. Whenever a playlist is fetched it should be cached together with its snapshot ID. This allows for efficient retrieval as we only need to check against the snapshot ID and return the cached playlist if the cached snapshot matches the latest returned one.
Because playlists can get very long, we only store the items that were fetched because the user requested them. This way we can accumulate playlist data as the user scrolls through it.

Playlists the user has saved to their library should be cached indefinitely until a new snapshot is found. All other playlists should have an expiration date at which they will be discarded.
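The snapshot check and the incremental accumulation of items could be sketched as follows. `CachedPlaylist`, `isCurrent`, and `mergePage` are hypothetical names, and the sketch assumes that items are keyed by their position in the playlist and that a changed snapshot ID invalidates all previously cached pages.

```typescript
interface CachedPlaylist {
  snapshotId: string;
  items: Map<number, string>; // position in playlist -> track ID, only for fetched pages
}

// True if the cached copy matches the latest snapshot ID reported for the playlist.
function isCurrent(cached: CachedPlaylist | undefined, latestSnapshotId: string): boolean {
  return cached !== undefined && cached.snapshotId === latestSnapshotId;
}

// Adds a fetched page to the cached playlist. If the snapshot ID differs,
// earlier pages are discarded because they belong to an older version.
function mergePage(
  cached: CachedPlaylist | undefined,
  snapshotId: string,
  offset: number,
  trackIds: string[],
): CachedPlaylist {
  const items =
    cached !== undefined && cached.snapshotId === snapshotId
      ? new Map(cached.items)
      : new Map<number, string>();
  trackIds.forEach((id, index) => items.set(offset + index, id));
  return { snapshotId, items };
}
```

As the user scrolls, each newly fetched page is merged in, so the cached playlist grows without ever requesting items the user has not looked at.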

Releases

Releases are albums, singles, etc.
As release takedowns are relatively rare we can cache releases for a longer period.

Releases the user has saved to their library should be cached for longer periods than those that are not.

Artists

Artist data may change regularly because of gained or lost followers, changes in popularity count, etc. Therefore, caching should be limited to a session. (It may make sense to cache this data in sessionStorage.)

Artist's albums and top tracks

As this data does not change too often (it also may not be updated by Spotify in real time) cache-control instructions should be applied.

Player Data

Player data is fetched at very short intervals. It should therefore never be cached, as it may change very frequently.
It may make sense to store the latest player data in memory for the entire application to access, but this should be kept separate from caching logic, as caching does not apply here (i.e. the data is updated almost constantly).

Profile of current user

For security and privacy reasons, only a select few non-sensitive fields should be cached, and only for a short period. (Again, cache-control instructions may override this behavior.)
This may include:

  • Display name
  • Follower count
  • Profile picture (this should probably be handled as stale-while-revalidate)

Other user profiles should never be stored.
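The default strategies from this section could be summarized in a single lookup table. This is only a sketch: `DataKind`, `Strategy`, and `defaultStrategies` are hypothetical names, and lifetimes are given as relative labels because the issue does not define concrete durations. Cache-control headers would still take priority over any entry in this table.

```typescript
type DataKind =
  | "savedPlaylist"
  | "otherPlaylist"
  | "savedRelease"
  | "otherRelease"
  | "artist"
  | "artistAlbumsAndTopTracks"
  | "player"
  | "currentUserProfile"
  | "otherUserProfile";

type Strategy =
  | { kind: "never" }                // do not cache at all
  | { kind: "session" }              // keep for the current session only
  | { kind: "untilSnapshotChanges" } // keep until a new snapshot ID is seen
  | { kind: "cacheControlOnly" }     // rely on cache-control instructions
  | { kind: "expiring"; lifetime: "short" | "unspecified" | "long" | "longer" };

// Relative labels only; concrete durations are not defined in the issue.
const defaultStrategies: Record<DataKind, Strategy> = {
  savedPlaylist: { kind: "untilSnapshotChanges" },
  otherPlaylist: { kind: "expiring", lifetime: "unspecified" },
  savedRelease: { kind: "expiring", lifetime: "longer" },
  otherRelease: { kind: "expiring", lifetime: "long" },
  artist: { kind: "session" },
  artistAlbumsAndTopTracks: { kind: "cacheControlOnly" },
  player: { kind: "never" },
  currentUserProfile: { kind: "expiring", lifetime: "short" },
  otherUserProfile: { kind: "never" },
};
```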

Note regarding podcasts

Podcasts are not supported at the moment. If support is added in the future, specific caching strategies will need to be defined for episodes, etc.