webrecorder/specs

[Use Case]: Locally hosted copies of at risk video to guard against link rot

edsu opened this issue · 1 comment

edsu commented

Describe a use case for the WACZ format.

A news site uses the browser to archive a tweet containing a video they would like to use in a news story. They download the resulting WACZ file and embed it in their page instead of the live tweet, which may be deleted. The news site hosts the WACZ file in their CMS along with other content.
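To make the embedding step concrete, here is a minimal sketch of how a CMS template helper might generate the markup. It assumes the page uses a replay web component along the lines of ReplayWeb.page's `<replay-web-page>` element with `source` and `url` attributes. That element, the helper name `build_wacz_embed`, and all paths shown are assumptions for illustration, not something this issue specifies. A real deployment would likely also need the player's service worker served from the same origin.

```python
from html import escape


def build_wacz_embed(wacz_url: str, page_url: str, script_url: str) -> str:
    """Return an HTML fragment that replays one archived page from a WACZ file.

    wacz_url   -- where the CMS serves the WACZ file (hypothetical path)
    page_url   -- the original URL of the archived tweet to start replay from
    script_url -- location of the replay player's UI script (assumed, site-specific)
    """
    # Escape every value so URLs with '&' or quotes cannot break the attributes.
    return (
        f'<script src="{escape(script_url, quote=True)}"></script>\n'
        f'<replay-web-page source="{escape(wacz_url, quote=True)}" '
        f'url="{escape(page_url, quote=True)}"></replay-web-page>'
    )


if __name__ == "__main__":
    print(
        build_wacz_embed(
            wacz_url="/media/archives/tweet-video.wacz",
            page_url="https://example.com/some-user/status/1",
            script_url="/static/replay/ui.js",
        )
    )
```

Because the WACZ file is served by the CMS like any other static asset, the story keeps working even if the original tweet is deleted.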

Additional Requirements

  • List of entry pages to start browsing from
  • Full-text search index
  • Technical metadata about the web archive
  • User-defined descriptive metadata
  • Screenshots of key pages
  • Encryption of data
  • Proof of Authenticity (Signing and Verification)
  • Fast access to multiple WACZ files in aggregate
  • Crawl or capture logs

How will web archives be created for this use case?

  • Manually, using a browser to capture exact content as directed by the user.
  • Automatically, using a crawler to crawl desired content, either once or on a specified schedule.

Sensitive private content and access

  • No, this use case focuses on archiving publicly accessible data only, and the web archive can be made public.
  • No, this use case focuses on archiving publicly accessible data only, but the web archive is not intended to be public.
  • Yes, this use case involves archiving data that is not public, and the web archive should not be made public.

edsu commented

This has been added to the current use cases document.