Investigate metadata scalability
trishankkarthik opened this issue · 5 comments
How does the implementation plan to handle metadata for a software update repository with a large number of targets and target delegations? At present, it looks like the metadata will be quite large if left uncompressed once the number of targets and delegations grows sufficiently.
A few solutions:
- Compress metadata with standard techniques (e.g. GZIP); a rough sketch of the idea follows this list.
- Investigate metadata difference schemes.
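A quick way to gauge what the compression option buys on repetitive JSON. The sample metadata below is hypothetical and only meant to illustrate the size reduction, not the repository's actual format:

```python
# Minimal sketch: gzip-compress JSON metadata and compare sizes.
# The metadata structure here is made up for illustration.
import gzip
import json

# Hypothetical targets metadata with many similar entries.
metadata = {
    "signed": {
        "_type": "Targets",
        "targets": {
            "pkg/example-%d.tar.gz" % i: {
                "length": 1024 * i,
                "hashes": {"sha256": "%064x" % i},
            }
            for i in range(10000)
        },
    }
}

raw = json.dumps(metadata, sort_keys=True).encode("utf-8")
compressed = gzip.compress(raw)

print("uncompressed: %d bytes" % len(raw))
print("gzip:         %d bytes (%.1f%% of original)"
      % (len(compressed), 100.0 * len(compressed) / len(raw)))
```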
#44 will give us some data about this issue.
Two things we need to do efficiently: download only the subset of target metadata relevant to the target file in question, and fetch as much as possible in as few HTTP requests as possible. A sketch of the first point follows.
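A minimal sketch of downloading only the relevant subset: match the requested target path against each delegation's path patterns and fetch just the matching delegated role files. The `delegations`/`roles`/`paths` keys and the glob-style matching are assumptions for illustration, not necessarily the implementation's actual schema:

```python
# Sketch: determine which delegated role files actually need to be
# downloaded for a given target path.
import fnmatch


def relevant_delegations(targets_metadata, target_path):
    """Return names of delegated roles whose path patterns cover target_path."""
    matching = []
    for role in targets_metadata.get("delegations", {}).get("roles", []):
        if any(fnmatch.fnmatch(target_path, pattern) for pattern in role["paths"]):
            matching.append(role["name"])
    return matching


# Example: only "django-packages" would be downloaded for this target.
targets_metadata = {
    "delegations": {
        "roles": [
            {"name": "django-packages", "paths": ["packages/Django-*.tar.gz"]},
            {"name": "numpy-packages", "paths": ["packages/numpy-*.tar.gz"]},
        ]
    }
}
print(relevant_delegations(targets_metadata, "packages/Django-1.6.tar.gz"))
```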
See #57 for a method to reduce metadata size in the common case where a delegated role is trusted with wildcard target paths.
Maybe consider binary data exchange formats, such as Protocol Buffers or Cap'n Proto.
The tentatively named "lazy bin walk" scheme to address metadata scalability is discussed in our design document for PyPI+TUF; a rough sketch of the underlying idea is below.
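For reference, a minimal sketch of the hashed-bin idea behind that scheme: each target path hashes into exactly one bin, so a client only downloads the one small bin role covering the file it wants instead of one huge targets file. The bin count and naming convention below are assumptions for illustration; see the design document for the actual parameters.

```python
# Sketch: map a target path to its hashed bin.
import hashlib

NUMBER_OF_BINS = 256   # assumed; a power of 16 keeps bin names as plain hex prefixes
PREFIX_LENGTH = 2      # 256 bins -> first 2 hex digits of the digest


def bin_for_target(target_path):
    """Return the name of the delegated bin role covering target_path."""
    digest = hashlib.sha256(target_path.encode("utf-8")).hexdigest()
    return "bin-" + digest[:PREFIX_LENGTH]


# The client walks root -> targets -> bin-XY, downloading only that one bin.
print(bin_for_target("packages/Django-1.6.tar.gz"))
```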