mxsasha/nrtmv4

A better way of dealing with deltas

mxsasha opened this issue · 7 comments

Current design

To make sure we're all on the same page, the draft at time of writing basically says:

  • Mirror servers generate Delta Files as often as once per minute, but possibly much less often for IRR Databases that get few changes.
  • Delta Files are stored at a unique URL and are immutable, so they can be cached aggressively.
  • The Update Notification File lists all current Delta Files.
  • The Update Notification File is mutable on the same URL and therefore allows only very limited caching. Therefore, we try to keep it small.
  • The mirror server will trim old Delta Files when their total size exceeds the Snapshot File size. If a client is that far behind, it is more efficient (fewer bytes transferred) for it to get the latest Snapshot File.

(This is also how RRDP works, which is where I took it from.)
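
To make the trimming rule concrete, here is a rough sketch of it. The `DeltaFile` type and `trim_deltas` function are names made up for this illustration and do not come from the draft; the only thing taken from the design above is the rule "drop the oldest Delta Files while their total size exceeds the Snapshot File size".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeltaFile:
    # Hypothetical record for illustration; not a structure from the draft.
    version: int      # version the client reaches after applying this file
    size_bytes: int   # size of the Delta File


def trim_deltas(deltas: list[DeltaFile], snapshot_size_bytes: int) -> list[DeltaFile]:
    """Return the Delta Files to keep, dropping the oldest first until
    their combined size no longer exceeds the Snapshot File size."""
    kept = sorted(deltas, key=lambda d: d.version)
    total = sum(d.size_bytes for d in kept)
    while kept and total > snapshot_size_bytes:
        oldest = kept.pop(0)
        total -= oldest.size_bytes
    return kept
```

With three Delta Files of 40 bytes each and a 100 byte snapshot, only the oldest one is dropped, because the remaining 80 bytes fit under the snapshot size.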

Application to IRR

Some assumptions on behaviour:

  • A lot of clients will keep very closely up to date, and download around 0 to 5 Delta Files every time they check for updates.
  • Some clients will run behind, and need to catch up with days of Delta Files, maybe longer.
  • In IRR, individual Delta Files are probably very small. Especially compared to the Snapshot File.

Some numbers on the last point, volume of updates on November 6-12:

    source    | per_week | per_day | per_hour
--------------+----------+---------+----------
 AFRINIC      |     1830 |     261 |       10
 ALTDB        |      222 |      31 |        1
 ARIN         |     1566 |     223 |        9
 ARIN-NONAUTH |      161 |      23 |        0
 CANARIE      |        1 |       0 |        0
 RIPE         |    34070 |    4867 |      202
 RIPE-NONAUTH |        6 |       0 |        0
 TC           |      539 |      77 |        3

Due to NRTM I can't pin down exactly how many Delta Files would have been produced, but certainly not more than the numbers above.

For RIPE, the volume would be many small files: if exactly equally spread out, one Delta File every minute with 3 IRR object changes each. They're probably more clustered, so not quite as bad. For other IRRs, it is less of an issue.

RIPE has 6395804 objects in my copy. So if we assume RIPE always changes 5000 objects per day, and all objects are the same size, Delta File expiration will not start until 1279 days of Deltas have been gathered, because only at that point will the Deltas together be bigger than the Snapshot File.

This means that a client running three years behind on RIPE will still find all Delta Files there, and will use them to catch up rather than reinitialise from the Snapshot File. In that process, the client will download 250.000 Delta Files, optimistically assuming one Delta File per 5 minutes due to clustering. These are pretty rough numbers, but the order of magnitude holds.
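
The back-of-envelope arithmetic above can be reproduced with a few lines. This is only a sketch under the same simplifying assumptions (5000 changes per day, equally sized objects, one Delta File per 5 minutes); the function names are made up for the example.

```python
def days_until_expiry(total_objects: int, changes_per_day: int) -> int:
    """Days of Deltas needed before their total size reaches the snapshot
    size, assuming every object and every change has the same size."""
    return total_objects // changes_per_day


def delta_files_over(days: int, minutes_per_delta: int) -> int:
    """Number of Delta Files produced over `days` at a fixed interval."""
    files_per_day = (24 * 60) // minutes_per_delta
    return days * files_per_day


retention_days = days_until_expiry(6_395_804, 5_000)   # 1279
three_years = delta_files_over(3 * 365, 5)             # 315360
full_retention = delta_files_over(retention_days, 5)   # 368352
print(retention_days, three_years, full_retention)
```

Depending on the exact window, this comes out somewhat above 250.000, but in the same order of magnitude, which is all the argument needs.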

(Notable large ones missing in my list are RADB and APNIC, I think my mirror might have broken.)

Impact and possible solutions

Even if the total file size is reasonable, having a client download 250.000 Delta Files is rather impractical, so this requires a solution in the standard. We're still trying to meet a number of needs:

  • The server should not need to generate data differently for different clients.
  • It's nice if we can benefit from caching in HTTP layers. But maybe Delta Files are so small it's not too important.
  • The Update Notification File will be the most often requested - mirror clients request it every time they want to check for new content, so it should remain small.
  • Clients should be able to fetch updates in a reasonably optimal way, both those that are following closely and those that lag behind. Optimal does not have to be perfect, but should avoid reinitialising from the Snapshot File too often, downloading a lot more data than needed, or downloading a ridiculous number of small files.

I had two possible thoughts for now:

  • Delta File aggregation. Periodically, have the mirror server aggregate older (6 hours old?) Delta Files into one, say in 1 hour blocks, so aggregating up to 60 Delta Files into one. Inside the aggregate file, each Delta File segment should be split out by version number, so that clients can pick up halfway. In the Update Notification File, list the file with a range of version numbers rather than one specific. No cost for clients who stay closely up to date. Clients further behind will need to download fewer files. Downside: has implementation complexity - an extra background task that aggregates Delta Files all the time.
  • Faster expiry. Instead of trimming Delta Files when their total size exceeds the Snapshot File size, just do it after a few days (2-3?). Upside: really simple to understand and implement. Downside: even if one day behind, there could be up to 1440 files to download for the client, which still takes a while.
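
To give an idea of what the aggregation option would involve, here is a minimal sketch. Everything in it is an assumption for illustration: the `Delta` and `Aggregate` types, the function names, and the idea that Delta File creation times increase along with version numbers. It groups Delta Files older than 6 hours into 1 hour blocks, keeps each version as its own segment, and shows how a client could pick up halfway through a block.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import groupby


@dataclass
class Delta:
    # Hypothetical in-memory form of one Delta File.
    version: int
    created: datetime
    changes: tuple[str, ...]


@dataclass
class Aggregate:
    # Hypothetical aggregate: a version range plus one segment per version.
    first_version: int
    last_version: int
    segments: dict[int, tuple[str, ...]]


def aggregate_old_deltas(deltas, now, min_age=timedelta(hours=6)):
    """Return (aggregates, recent): Deltas older than min_age are merged
    per clock hour, newer ones are left as individual files."""
    cutoff = now - min_age
    ordered = sorted(deltas, key=lambda d: d.version)
    old = [d for d in ordered if d.created < cutoff]
    recent = [d for d in ordered if d.created >= cutoff]

    def hour_block(delta):
        return delta.created.replace(minute=0, second=0, microsecond=0)

    aggregates = []
    # groupby only merges adjacent items, which is fine under the
    # assumption that creation time increases with version number.
    for _, block in groupby(old, key=hour_block):
        block = list(block)
        aggregates.append(Aggregate(
            first_version=block[0].version,
            last_version=block[-1].version,
            segments={d.version: d.changes for d in block},
        ))
    return aggregates, recent


def changes_after(aggregate, client_version):
    """Changes a client at client_version still needs from this aggregate."""
    needed = []
    for version in sorted(aggregate.segments):
        if version > client_version:
            needed.extend(aggregate.segments[version])
    return needed
```

The background task is the `aggregate_old_deltas` part; the per-version segments are what lets `changes_after` skip the versions a client already has. The faster expiry option needs none of this, which is its main appeal.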

@stkonst also mentioned some alternate ideas about Delta Files, but I thought I'd lay out the current goals and issues properly here :)

job commented

I think "faster expiry" is the idea to put forward to the working group in the initial draft, simply because it is simpler. If WG participants have appetite for more complexity / efficiency, we can try the 'delta aggregation' route.

Hi Sasha,

Your numbers regarding the amount of changes per hour in the RIPE DB are fairly well aligned with the numbers I extracted in past research of mine. Based on the analysis you provided above:

  • I vote for faster expiry as well, but I am afraid that we will end up implementing both strategies. But let's see what the community will advise us.

  • Perhaps faster expiry should be 24h? If a client is more than a day behind, it simply gets the latest Snapshot File and starts updating from that point on.
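
On the client side, that rule could look roughly like the sketch below. It assumes version numbers are consecutive integers and that the client knows the current version and the Delta File versions listed in the Update Notification File; the `plan_update` name and its return values are invented for the example.

```python
def plan_update(client_version: int, current_version: int,
                delta_versions: list[int]) -> tuple[str, list[int]]:
    """Decide how a client catches up.

    Returns ("up_to_date", []), ("deltas", versions_to_fetch)
    or ("snapshot", []).
    """
    if client_version >= current_version:
        return ("up_to_date", [])
    needed = [v for v in sorted(delta_versions) if v > client_version]
    # Deltas are only usable if they form an unbroken chain from the
    # client's next version up to the current version.
    if needed == list(range(client_version + 1, current_version + 1)):
        return ("deltas", needed)
    # Some of the Deltas the client needs have expired: reinitialise.
    return ("snapshot", [])
```

For example, with Deltas 8 to 10 still listed, a client at version 8 fetches 9 and 10, while a client at version 3 falls back to the Snapshot File.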

I agree. I'll write a PR.

In the wonderful analysis above you write:

"The Snapshot File will be the most often requested - mirror clients request it every time they want to check for new content, so it should remain small."

I guess you wanted to say "The Update Notification File..." since according to the design this is the one that mirror clients will consult every time they want to check for new changes.

Ah yes, indeed. Snapshots should ideally be our least requested file ;)

Actually, since we're fully in agreement I went ahead and committed it to main in 398b156 6dc9240.

Discussed and we are staying with deleting deltas after 24 hrs, so this is finished.
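
For reference, the server side of the 24 hour rule is small. A minimal sketch, assuming the server tracks a creation time per Delta File version; the `expire_deltas` name and the mapping layout are made up here and are not from the draft.

```python
from datetime import datetime, timedelta


def expire_deltas(deltas: dict[int, datetime], now: datetime,
                  max_age: timedelta = timedelta(hours=24)) -> dict[int, datetime]:
    """Keep only the Delta File versions created within max_age of now.

    `deltas` maps a version number to that Delta File's creation time.
    """
    return {version: created
            for version, created in deltas.items()
            if now - created <= max_age}
```

Compared to the size-based trimming rule, this needs no knowledge of file sizes or of the Snapshot File at all.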