twick is a command-line tool for fetching and storing tweets on short notice.
twick fetches tweets that match a given search query and stores them in any SQLAlchemy-supported database (SQLite, PostgreSQL, MySQL, and more).
Developed at BuzzFeed.
pip install twick
To authenticate its API requests, twick requires the standard set of Twitter credentials: API key, API secret, access token, and access token secret. (For instructions on how to obtain these credentials, read here.) You can supply them either via the --credentials command-line argument (as four space-separated strings) or by setting the following environment variables in your shell:
export TWICK_API_KEY="[replace me]"
export TWICK_API_SECRET="[replace me]"
export TWICK_ACCESS_TOKEN="[replace me]"
export TWICK_ACCESS_TOKEN_SECRET="[replace me]"
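
As an illustration of the two ways to supply credentials, here is a minimal Python sketch of how such a lookup could work. It is not twick's actual code: the helper name resolve_credentials is hypothetical, and the rule that command-line values take precedence over the environment is an assumption. Only the four variable names come from the text above.

```python
import os

# Environment variable names documented above, in the order
# in which --credentials expects its four values.
ENV_VARS = (
    "TWICK_API_KEY",
    "TWICK_API_SECRET",
    "TWICK_ACCESS_TOKEN",
    "TWICK_ACCESS_TOKEN_SECRET",
)


def resolve_credentials(cli_values=None, environ=None):
    """Hypothetical helper: return the four credentials as a tuple.

    Assumption: values given on the command line win; otherwise the
    environment is used. Raises ValueError if anything is missing.
    """
    if environ is None:
        environ = os.environ
    if cli_values:
        if len(cli_values) != 4:
            raise ValueError("--credentials expects exactly four values")
        return tuple(cli_values)
    missing = [name for name in ENV_VARS if not environ.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return tuple(environ[name] for name in ENV_VARS)
```

Calling resolve_credentials() with no arguments reads the current shell environment, so a missing export shows up as an error naming the absent variable.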
twick has two subcommands:
- twick fetch polls for new tweets at a regular interval.
- twick backfill pulls earlier tweets, and stops when it can find no more.
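
The difference between the two subcommands can be sketched in Python as two loops over a search function. This is an illustration only, not twick's implementation: the search callable and its signature are hypothetical stand-ins for a Twitter search request, and the max_polls and sleep parameters exist only to make the sketch easy to run. It assumes the usual search API cursors, where since_id excludes the given tweet and max_id includes it.

```python
import time


def fetch(search, query, throttle=15, max_polls=None, sleep=time.sleep):
    """Poll forward: ask only for tweets newer than the newest one seen."""
    since_id = None
    seen = []
    polls = 0
    while max_polls is None or polls < max_polls:
        tweets = search(query, since_id=since_id)
        if tweets:
            seen.extend(tweets)
            # Remember the newest id so the next poll skips what we have.
            since_id = max(t["id"] for t in tweets)
        polls += 1
        sleep(throttle)
    return seen


def backfill(search, query, throttle=15, sleep=time.sleep):
    """Page backward: ask only for tweets older than the oldest one seen."""
    max_id = None
    seen = []
    while True:
        tweets = search(query, max_id=max_id)
        if not tweets:
            break  # nothing older is available, so stop
        seen.extend(tweets)
        # max_id is inclusive, so step just below the oldest id seen.
        max_id = min(t["id"] for t in tweets) - 1
        sleep(throttle)
    return seen
```

With max_polls left as None, fetch runs until interrupted, while backfill ends on its own once a request returns no tweets.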
Both subcommands store basic data on each tweet (id, text, created_at, user_name, screen_name, and user_location) and each API response (query, count, completed_in, max_id, since_id, refresh_url, next_results).
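
To make the stored fields concrete, the following Python sketch shows how they could be excerpted from a raw search response. It is not taken from twick's source. It assumes the standard search response layout, with user details nested under "user" and response details under "search_metadata"; the function names and the mapping of user_name, screen_name, and user_location onto the nested user keys are assumptions.

```python
# Response-level fields listed above.
RESPONSE_FIELDS = (
    "query",
    "count",
    "completed_in",
    "max_id",
    "since_id",
    "refresh_url",
    "next_results",
)


def excerpt_tweet(tweet):
    """Flatten one raw tweet dict into the six per-tweet fields."""
    user = tweet.get("user") or {}
    return {
        "id": tweet["id"],
        "text": tweet["text"],
        "created_at": tweet["created_at"],
        # Assumed mapping from the nested user object.
        "user_name": user.get("name"),
        "screen_name": user.get("screen_name"),
        "user_location": user.get("location"),
    }


def excerpt_response(response):
    """Pick the response-level fields; absent keys become None."""
    meta = response.get("search_metadata") or {}
    return {name: meta.get(name) for name in RESPONSE_FIELDS}
```

Anything outside these fields is dropped by such an excerpt, which is why keeping the raw JSON alongside it can be useful.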
Your search query will be the first argument after each subcommand. You can also supply any of these optional arguments:
- --db [connection string]: Any valid SQLAlchemy connection string, describing where to store your results. Default: sqlite:///twick.sqlite
- --throttle [num]: Wait [num] seconds between API requests. Defaults to 15 to stay under standard rate limits.
- --store-raw: Store raw tweet JSON, in addition to the excerpted fields described above.
- --quiet: Silence logging.
- --credentials [api_key, api_secret, access_token, access_token_secret]: See "Setup" above.
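
For readers who want to see how these arguments fit together, here is a rough Python sketch of an equivalent command-line interface built with argparse. It only mirrors the options and defaults listed above; it is not twick's own parser, and details such as treating --throttle as an integer are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented subcommands and options."""
    parser = argparse.ArgumentParser(prog="twick")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in ("fetch", "backfill"):
        sub = subparsers.add_parser(name)
        # The search query is the first argument after the subcommand.
        sub.add_argument("query")
        sub.add_argument("--db", default="sqlite:///twick.sqlite")
        # Assumption: whole seconds; the documented default is 15.
        sub.add_argument("--throttle", type=int, default=15)
        sub.add_argument("--store-raw", action="store_true")
        sub.add_argument("--quiet", action="store_true")
        # Four space-separated strings, in this order.
        sub.add_argument(
            "--credentials",
            nargs=4,
            metavar=("API_KEY", "API_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET"),
        )
    return parser
```

Parsing ["fetch", "some query", "--throttle", "60"] with this sketch yields the query as a single string, which is why multi-word queries need quoting in the shell.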
twick fetch "harlem building collapse" --db sqlite:///tweets.db
twick fetch "drone from:buzzfeedben" --db sqlite:///ben-drone-tweets.sqlite --throttle 60
twick backfill "to:davidplotz pandas" --store-raw --throttle 5