PED Project
Getting Posts
The posts have already been collected, but if you'd like to use our scripts to get other posts, here is how to do it:
- Go to https://www.reddit.com/prefs/apps/
- Create app: give it any name, choose script, and set the redirect URI to http://localhost:8080. You should see something like this:
- Clone this repo
- Create a .env file in the root directory and put the following environment variables into it:
CLIENT_ID=<the string under "personal use app" in the screen>
CLIENT_SECRET=<secret from the screen>
USER_AGENT=<anything, e.g. testscript by u/your_user_name> (this one doesn't really need to be an environment variable)
REDDIT_USERNAME=<username of reddit account used for the creation of app in screen>
PASSWORD=<the password for this username>
It should look like this:
CLIENT_ID=FZ__Cy<rest of id>
CLIENT_SECRET=10Z7kV4btNlpo<rest of secret>
USER_AGENT=testscript by u/deepfuckingvalue
REDDIT_USERNAME=deepfuckingvalue
PASSWORD=hunter2
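Before running the scripts, it can help to confirm that the .env file defines all five variables. The snippet below is a minimal sketch using only the standard library. The helper names parse_env and missing_keys are hypothetical and not part of this repo, and the parser assumes the plain KEY=value format shown above (no quoting, no export prefixes).

```python
from pathlib import Path

# Variable names taken from the .env template above.
REQUIRED_KEYS = ["CLIENT_ID", "CLIENT_SECRET", "USER_AGENT", "REDDIT_USERNAME", "PASSWORD"]


def parse_env(text):
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")  # assumed to sit in the repo root
    if env_path.exists():
        missing = missing_keys(parse_env(env_path.read_text()))
        print("missing:", missing if missing else "none")
    else:
        print(".env not found")
```

Run from the repo root, this prints the names of any variables that are still missing or empty.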
- Run get_ids.py
  - This can take a long time. Should you need to pause it:
    - copy the last line printed (e.g. 100 1986055),
    - kill the process.
  - When you want to resume the process:
    - change the filename in f = open("ids.txt", "w") to e.g. ids2 (yeah, I know I could probably append to the file),
    - comment out or remove end_ts = int(dt.datetime(2021, 2, 11, 12).timestamp()),
    - uncomment # end_ts = start_ts + 1093144 and put the value from the copied last line in place of 1093144.
- Once get_ids finishes, if you have more than one file, use merge_ids.py to merge them into one: replace ['ids1.txt','ids2.txt','ids3.txt'] with the list of your filenames.
- Use filter.py to filter out removed posts. In my case they were more than 95% of all posts, so it was worth it. Note that this also creates a file with ids, timestamps, and a flag marking whether a post is deleted, for plotting purposes; if you do not want this, comment out the lines that use the f2 variable.
- Use mapper.py to map the ids of unremoved posts to their current statistics. If you need more (or less) information, edit line = [...]; i in this case is a praw.models.Submission.
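As an illustration of editing line = [...], the sketch below builds one row from a few attributes that PRAW exposes on a submission (id, created_utc, score, upvote_ratio, num_comments, title). The function name submission_row and the choice of fields are assumptions made for illustration, not the actual contents of mapper.py. A SimpleNamespace with made-up values stands in for a real submission so the snippet runs without Reddit credentials.

```python
from types import SimpleNamespace


def submission_row(i):
    """Build one output row from a submission-like object `i`."""
    line = [
        i.id,
        int(i.created_utc),          # creation time as a Unix timestamp
        i.score,
        i.upvote_ratio,
        i.num_comments,
        i.title.replace("\n", " "),  # keep each row on a single line
    ]
    return line


# Stand-in object with made-up values; in mapper.py `i` would be a real submission.
fake = SimpleNamespace(
    id="abc123",
    created_utc=1612345678.0,
    score=42,
    upvote_ratio=0.97,
    num_comments=7,
    title="Example\ntitle",
)
row = submission_row(fake)
print(",".join(str(field) for field in row))
```

Adding or removing an entry in the list is all it takes to change which statistics end up in the output.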