charlesroelli/org-board

Improved support for response-type websites

Jdogzz opened this issue · 0 comments

I've noticed that when archiving webpages from websites like Stack Exchange, Reddit, etc., the obtained copy doesn't have all the info from the responses (sometimes sub-responses aren't loaded, and sometimes the responses are split across additional pages). It would perhaps be a nice enhancement to org-board to use the website's API, if available, to build a page-specific repository of the threads that develop. The flow could go something like this:

  • User sets an option in the config file to enable this extra processing for a given domain (per-site functionality will probably have to be written)
  • org-board-new is run on a URL
  • The raw website is retrieved with wget as usual
  • Something like an mbox file is made and the website's API is accessed to populate the file with the threaded conversation intact
  • Later, when the user runs org-board-open, the user can choose between opening the raw website as usual and opening the mbox file in a mail reader
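
To make the mbox step a bit more concrete, here is a rough sketch in Python (not Emacs Lisp, and not anything org-board does today). It assumes the site's API responses have already been fetched and flattened into a list of dicts; the keys `id`, `parent_id`, `author`, `created` and `body`, the function name `thread_to_mbox`, and the `example.invalid` domain are all made up for illustration. The idea is only that each response becomes one message and the reply structure is carried by the `In-Reply-To` and `References` headers.

```python
import mailbox
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime, formataddr


def thread_to_mbox(comments, mbox_path, subject, domain="example.invalid"):
    """Append a list of comments to an mbox file, one message per comment.

    `comments` is a list of dicts with hypothetical keys:
    id, parent_id (None for the top-level post), author,
    created (Unix timestamp) and body.
    """
    box = mailbox.mbox(mbox_path)  # creates the file if it does not exist
    try:
        for c in comments:
            msg = EmailMessage()
            # Placeholder address: the site only gives us a display name.
            msg["From"] = formataddr((c["author"], f"noreply@{domain}"))
            is_reply = c["parent_id"] is not None
            msg["Subject"] = f"Re: {subject}" if is_reply else subject
            created = datetime.fromtimestamp(c["created"], tz=timezone.utc)
            msg["Date"] = format_datetime(created)
            # The comment id doubles as the Message-ID so replies can point at it.
            msg["Message-ID"] = f"<{c['id']}@{domain}>"
            if is_reply:
                parent = f"<{c['parent_id']}@{domain}>"
                msg["In-Reply-To"] = parent
                msg["References"] = parent
            msg.set_content(c["body"])
            box.add(msg)
    finally:
        box.close()  # flushes pending messages to disk
```

A mail reader that threads on `In-Reply-To`/`References` should then be able to display the conversation as a tree. Fetching the comments is the per-site part and is left out here.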

As a potential further option, though this may conflict with the way org-board treats website updates, the user could request that the mbox file be updated with responses that are new or have been altered on the webpage since the original retrieval.