/twitter-preserver

Converts downloaded twitter archives into preservable HTML.

Primary LanguageJavaScriptApache License 2.0Apache-2.0

Unit tests

Twitter preserver

An open-source project funded by the Alfred P. Sloan Foundation as part of the preservation of The COVID Tracking Project.

When you download an archive of your user data from Twitter, the ZIP file you are provided is fine for personal use, but not useful for long-term preservation (over 50 years) in an archive. For example:

  • The iconography and design of the interface assumes the user is familiar with Twitter, which may not exist or be drastically different.
  • Some content like DMs might not be appropriate for preservation.
  • Javascript or browser security rules might change in the future, breaking the way the archive includes content.
  • The actual HTML files are not easily parseable by future researchers.
  • Printing the archive pages to a stable format (acid-free paper) is not pretty.

This project converts a downloaded ZIP file from Twitter into a collection of simple HTML files made to be either hosted online, or opened locally. It also converts the data into a Markdown and PDF.

View a demonstration of a preserved Twitter account

Usage

twitter-archiver [commands]

Usage: twitter-preserver path/to/archive.zip
twitter-preserver --source=path/to/archive.zip

Arguments:

--source
  The source Zip file or an expanded Zip file that was downloaded from Twitter.

--expandUrls
  Enabled by default. Will replace all "t.co" URLs in Tweet text with
  the actual URL the link was pointing to.

--clean
  Remove cache for URLs.

--output=./public
  Write the results of the archive to this folder (default ./public)

--pdf
  Write output to PDF files. Will appear as a `./pdf` directory within the output directory

--include=tweets,dms,likes
  Comma-separated list of what data to include in the export.
  Possible values are: tweets, dms, likes, group-dms

--templates=./src/templates
  Use this directory for Nunjucks template files.
  Useful if you want to use your own templates to format the output.

--dev
  Run a web server that live updates for easier template development.
  
  

Using your own templates