Export your personal Reddit data: saves, upvotes, submissions etc. as JSON.
pip3 install --user -r requirements.txt
- To use the API, you need to register a custom "personal script" app and get the client_id and client_secret parameters. See more here.
- To access a user's personal data (e.g. saved posts/comments), the Reddit API also requires username and password. Yes, unfortunately it wants your plaintext Reddit password; you can read more about it here.
Usage:
Recommended: create secrets.py keeping your API parameters, e.g.:
username = "USERNAME"
password = "PASSWORD"
client_id = "CLIENT_ID"
client_secret = "CLIENT_SECRET"
After that, use:
./export.py --secrets /path/to/secrets.py
That way you type less and have control over where you keep your plaintext secrets.
Alternatively, you can pass parameters directly, e.g.
./export.py --username <username> --password <password> --client_id <client_id> --client_secret <client_secret>
However, this is verbose and prone to leaking your keys/tokens/passwords in shell history.
You can also import export.py as a module and call the get_json function directly to get raw JSON.
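A minimal sketch of the module route, under stated assumptions. It assumes get_json accepts the four API parameters as keyword arguments and returns JSON-serializable data; neither is confirmed here, so check export.py first. load_secrets and export_to_file are hypothetical helper names, and the get_json callable is passed in as an argument so the sketch does not depend on how you import it.

```python
import json
import runpy
from pathlib import Path

# The four parameter names come from the secrets.py example above.
SECRET_KEYS = ("username", "password", "client_id", "client_secret")


def load_secrets(path):
    """Execute a secrets.py-style file and return only the API parameters."""
    namespace = runpy.run_path(str(path))
    missing = [key for key in SECRET_KEYS if key not in namespace]
    if missing:
        raise KeyError(f"missing in {path}: {', '.join(missing)}")
    return {key: namespace[key] for key in SECRET_KEYS}


def export_to_file(get_json, secrets_path, out_path):
    """Call get_json with the loaded secrets and save the result as JSON.

    ASSUMPTION: get_json takes the parameters as keyword arguments and
    returns data that json.dumps can serialize.
    """
    data = get_json(**load_secrets(secrets_path))
    text = json.dumps(data, indent=1, ensure_ascii=False)
    Path(out_path).write_text(text, encoding="utf-8")
    return data
```

If export.py is importable, usage would look roughly like: from export import get_json, then export_to_file(get_json, "/path/to/secrets.py", "export.json").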
I highly recommend checking exported files at least once just to make sure they contain everything you expect from your export. If not, please feel free to ask or raise an issue!
WARNING: the Reddit API limits your queries to 1000 entries.
I highly recommend backing up regularly and keeping old exports. An easy way to achieve this is a command like this:
./export.py --secrets /path/to/secrets.py >"export-$(date -I).json"
Or, you can use arctee, which automates this.
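Because of the 1000-entry limit, older exports can hold items that newer ones no longer return. A rough sketch of combining them follows. It assumes each export is a JSON object mapping a category name (e.g. saved) to a list of items that each carry an id field, plus possibly some non-list values; that layout is an assumption, so check it against your own files. merge_exports is a hypothetical helper, not part of this project.

```python
import json
from pathlib import Path


def merge_exports(paths):
    """Merge several export files, keeping the newest copy of each item.

    ASSUMPTION: list-valued categories contain dicts with an "id" key
    that is unique within the category. Non-list values are not merged;
    the one from the last file wins.
    Files are processed in sorted path order, so with ISO-dated names
    (export-2020-01-31.json) later exports override earlier ones.
    """
    merged_lists = {}
    latest_other = {}
    for path in sorted(map(Path, paths)):
        export = json.loads(path.read_text(encoding="utf-8"))
        for category, value in export.items():
            if isinstance(value, list):
                by_id = merged_lists.setdefault(category, {})
                for item in value:
                    by_id[item["id"]] = item
            else:
                latest_other[category] = value
    result = dict(latest_other)
    # Convert the per-category id maps back to plain lists.
    result.update({cat: list(by_id.values()) for cat, by_id in merged_lists.items()})
    return result
```

Sorting by file name is a deliberate simplification: it only gives chronological order when the names embed an ISO date, as in the command above.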
Check out these links if you're interested in getting older data that's inaccessible by API:
- comment by /u/binkarus
- Reddit admins say that the rationale behind the API limitation is performance and caching
- perhaps you can request all of your data under GDPR? I haven't tried that personally though.
- pushshift can potentially help you retrieve old data
See example-output.json for some example data you might find in your data export. I've cleaned it up a bit, as the raw data has lots of different fields, many of which are probably not relevant.
However, this is pretty API-dependent and changes all the time, so better check with the Reddit API if you are looking for something specific.
You can use dal.py (stands for "Data Access/Abstraction Layer") to access your exported data, even offline.
I elaborate on the motivation behind it here.
- the main use case is to be imported as a Python module to allow for programmatic access to your data.
You can find some inspiration in the my. package that I'm using as an API to all my personal data.
- to test it against your export, simply run:
./dal.py --source /path/to/export
- you can also try it interactively:
./dal.py --source /path/to/export --interactive
Example output:
Your most saved subreddits: [('orgmode', 50), ('emacs', 36), ('QuantifiedSelf', 33), ('AskReddit', 33), ('selfhosted', 29)]
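A figure like the one above could be computed along these lines. This is only a sketch working on the raw export, not how dal.py does it. It assumes the export has a saved list whose items carry a subreddit field, either as a plain name or as an object with a display_name key; both the field names and the helper names (subreddit_name, most_saved_subreddits) are assumptions for illustration.

```python
from collections import Counter


def subreddit_name(item):
    """Return the subreddit name of one exported item.

    ASSUMPTION: "subreddit" is either a plain string or a dict with a
    "display_name" key; adjust this to match your actual export.
    """
    sub = item["subreddit"]
    return sub["display_name"] if isinstance(sub, dict) else sub


def most_saved_subreddits(export, top=5):
    """Count saved items per subreddit and return the `top` most common."""
    counts = Counter(subreddit_name(item) for item in export.get("saved", []))
    return counts.most_common(top)
```

Loading an export file with json.load and passing the result to most_saved_subreddits returns a list of (name, count) pairs in the same shape as the example output.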