This is to automate the process of collecting data for:
So far, three types of compressed files can be found on the Pushshift website, and all three are handled:
- zst
- bz2
- xz
For now, to download a specific type of compressed file, run the corresponding command:
- For zst files:
python deal_with_zst.py pushshift_URL SUBREDDIT
- For bz2 files:
python deal_with_bz2.py pushshift_URL SUBREDDIT
- For xz files:
python deal_with_xz.py pushshift_URL SUBREDDIT
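
Since the three commands differ only in the script name, the choice could be made from the file extension of the URL. The following is a minimal sketch of that idea, not part of the repository: `build_command` and the `SCRIPTS` mapping are hypothetical names, and the only things taken from the commands above are the three script names and the `pushshift_URL SUBREDDIT` argument order.

```python
import sys
from urllib.parse import urlparse

# Script names are taken from the commands above; the extension-to-script
# mapping itself is an assumption made for this sketch.
SCRIPTS = {
    ".zst": "deal_with_zst.py",
    ".bz2": "deal_with_bz2.py",
    ".xz": "deal_with_xz.py",
}


def build_command(pushshift_url, subreddit, python=sys.executable):
    """Return the argument list for the script matching the URL's extension."""
    # Look only at the path so that a query string does not hide the extension.
    path = urlparse(pushshift_url).path
    for ext, script in SCRIPTS.items():
        if path.endswith(ext):
            return [python, script, pushshift_url, subreddit]
    raise ValueError(f"unsupported archive type: {path!r}")
```

The returned list can be passed to `subprocess.run(..., check=True)` from the directory that contains the three scripts. A URL ending in anything other than `.zst`, `.bz2`, or `.xz` raises `ValueError` rather than guessing.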