saveface

A Python script to download Facebook posts, comments, etc. using the Graph API through facepy.

This is a work in progress; the git version history has a working version on the branch 'working2', which I will use in the meantime. I'll sort this out later.

usage: python saveface.py [-h] [-g [Where to source the pages from]]
                          -a [facebook auth token]
                          [-r [rest api request string]]
                          [-f [output format for results]]
                          [-o [output to stdout]]
                          [-s [pickle the array of pages]]
                          [-n [filename for the output]]
                          [-l [filepath for the output]]
                          [-p [pprint options [pprint options ...]]]
                          [-i [download images?]]
                          [-d [path to images]]
                          [-c [css filename]]

Download Facebook posts, comments, images, etc. The default request string is:

    me?fields=posts.include_hidden(true){created_time,from,message,comments{created_time,from,message,comments{created_time,from,message},attachment},full_picture}

optional arguments:
  -h, --help            Show this help message and exit.
  -g [Where to source the pages from], --getfrom [Where to source the pages from]
        Optional. Can be one of facebook or pickle. Defaults to facebook.
  -a [facebook auth token], --auth_tkn [facebook auth token]
        Optional. Your app’s Facebook authorisation token. Must be present
        if you are not sourcing results from a pickle.
  -r [rest api request string], --request_string [rest api request string]
        Optional. The request string to query Facebook’s API. Defaults to
        posts, comments and images.
  -f [output format for results], --format [output format for results]
        Optional. Can be one of json, pjson (pretty-printed), xml or html.
        Defaults to json.
  -o [output to stdout], --stdout [output to stdout]
        Optional. Output to stdout. Defaults to False.
  -s [pickle the array of pages], --save [pickle the array of pages]
        Optional. Use pickle to store the array of pages. Defaults to False.
  -n [filename for the output], --filename [filename for the output]
        Optional. A filename for the results. Results will not be saved
        unless a filename is specified.
  -l [filepath for the output], --location [filepath for the output]
        Optional. A filepath for the results file. Defaults to ./
  -p [pprint options [pprint options ...]], --pprint_options [pprint options [pprint options ...]]
        Optional. Options for the pprint module, given as comma-separated
        key=value pairs, e.g. -p [indent=4, depth=80]
  -i [download images?], --images [download images?]
        Optional. A boolean indicating whether or not to download images.
        Defaults to False.
  -d [path to images], --image_path [path to images]
        Optional. The path to the images folder. Defaults to ./images
  -c [css filename], --css [css filename]
        Optional. The filename of the css file. Defaults to saveface.css
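
For example, a typical invocation (the auth token and filename here are placeholders) might look like:

    python saveface.py -a <your facebook auth token> -f pjson -n myposts -s -i

This queries the default fields, pretty-prints the JSON, saves the results under the given filename, pickles the array of pages for later reuse, and downloads any images to ./images.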

Saving Face with saveface.py

Currently this will download JSON, XML, or HTML (which I'm in the process of styling).
The next step is a config file, which will include a templated HTML representation.
Eventually it will be able to make local copies of images, referenced from
the XML file by relative file paths.
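
Under the hood the script talks to the Graph API through facepy. As a rough illustration of the kind of call involved (a minimal sketch rather than saveface's actual internals; the token value is a placeholder):

    from facepy import GraphAPI

    graph = GraphAPI('your-access-token')  # placeholder token

    # page=True makes facepy follow the paging cursors and yield one
    # dict per page of results, i.e. the "array of pages" that can be pickled
    pages = list(graph.get(
        'me',
        page=True,
        fields='posts.include_hidden(true){created_time,from,message,'
               'comments{created_time,from,message,comments'
               '{created_time,from,message},attachment},full_picture}'
    ))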

Disclaimer

This project has been a way to teach myself Python.
I chose Python largely because I can develop it into an app using a framework that compiles to both iOS and Android. Also, I've wanted to play with Python for ages.
The inheritance structure is a bit wonky: inheriting the HTML class from the XML class is the wrong way round, and they should really be separate classes, along with the JSON class, all inheriting from a common superclass. But as I say, it's been something to play with.
It works, but after the Cambridge Analytica scandal Facebook closed down the ability to easily get at so-called 'public' groups, so I decided not to continue developing it; it has served its purpose in downloading my own data.
You can still use it for that.
I had intended to make the program store its request string as a script that could be called to update the pickle with the latest posts, and to display the results with an XMLPullParser and a deque, so as to reduce memory consumption in the browser; a sketch of that idea follows below.
I'd also wanted to extend it with a number of other features, but time has prevented me: I'm forced to move on to other things, and I don't expect to come back to this again.
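
For reference, here is a minimal sketch of that pull-parse-plus-deque idea. The element and file names ('post', 'message', 'posts.xml') are hypothetical stand-ins, not saveface's actual XML schema:

    import collections
    import xml.etree.ElementTree as ET

    # Keep only the newest 50 posts in memory for display
    recent = collections.deque(maxlen=50)
    parser = ET.XMLPullParser(events=('end',))

    with open('posts.xml', 'rb') as f:              # hypothetical file name
        for chunk in iter(lambda: f.read(8192), b''):
            parser.feed(chunk)
            for _event, elem in parser.read_events():
                if elem.tag == 'post':              # hypothetical tag name
                    # copy out what is needed, then free the parsed subtree
                    recent.append(elem.findtext('message'))
                    elem.clear()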