lengstrom/falcon

Wild ideas for the future: Persisting whole website states offline (with images) + content sharing

niieani opened this issue · 3 comments

It's a known fact that the web is broken: sites and articles disappear daily. Links are not permanent, so it's not enough to just index and save the text if you want to reliably go back to the content at some point in the future.

It would be really awesome if, by clicking the Falcon button on a website, one could persist the state of that website locally — everything, including CSS and pictures (and maybe even videos, if we're feeling wild and have too much space!). If we combined this with a private server for storing the data (#49), added the ability to manually tag specific sites for future reference, and added commenting, we could potentially turn this into a tool where whole teams collaborate while researching a given topic. One would then also need a GUI for browsing that saved history manually, but it would essentially be the best collaborative research tool available.
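For the capture step, here's a rough sketch of one way a content script could snapshot a page with its images inlined as data URLs, so the snapshot renders without network access. Everything here (function names, the data-URL approach) is illustrative, not how Falcon currently works, and a real implementation would also have to handle CSS `url(...)` references, fonts, and iframes:

```typescript
// Hypothetical sketch: snapshot the current page with <img> resources inlined.
async function toDataUrl(src: string): Promise<string> {
  const blob = await (await fetch(src)).blob();
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result as string);
    reader.onerror = reject;
    reader.readAsDataURL(blob);
  });
}

async function archivePage(): Promise<string> {
  const root = document.documentElement.cloneNode(true) as HTMLElement;
  for (const img of Array.from(root.querySelectorAll("img[src]"))) {
    try {
      img.setAttribute("src", await toDataUrl((img as HTMLImageElement).src));
    } catch {
      // Keep the original src if the fetch fails (e.g. CORS restrictions).
    }
  }
  return "<!DOCTYPE html>\n" + root.outerHTML;
}
```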

I'd totally be up for helping develop this.

dldx commented

I would absolutely love this feature too! I'm guessing that the HTML is already stored offline, so I can't imagine it would take much more than the current setup, but perhaps I'm wrong.

Hi guys! This sounds like a really cool idea, and it'd be great to integrate something like this into Falcon. The system I'm envisioning is:

  1. Press a button on Falcon menu to "archive" website
  2. View archived websites in some kind of separate history-viewing page, or maybe somehow integrate with the f<tab>cmd functionality already in place (a storage sketch follows this list).
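To make step 2 concrete, here's a minimal sketch of the storage side using IndexedDB, so snapshots survive browser restarts and a viewer page can list and reopen them. The database and store names are placeholders, not anything Falcon defines today:

```typescript
// Hypothetical sketch: persist archived snapshots so a history viewer can list them.
interface ArchiveEntry {
  url: string;
  title: string;
  archivedAt: number; // epoch milliseconds
  html: string;       // snapshot produced by the archive step
}

function openArchiveDb(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open("falcon-archive", 1); // placeholder name
    req.onupgradeneeded = () => {
      req.result.createObjectStore("pages", { keyPath: "url" });
    };
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

async function saveArchive(entry: ArchiveEntry): Promise<void> {
  const db = await openArchiveDb();
  return new Promise((resolve, reject) => {
    const tx = db.transaction("pages", "readwrite");
    tx.objectStore("pages").put(entry); // keyed by URL, newer snapshot wins
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```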

Some other thoughts:

  • Storing data on the server is messy, high effort, and inherently insecure
  • I am a full-time college student, so I can't work on this until January, but I'm happy to vet pull requests

The click-to-archive method sounds good.

Not sure what you mean about a server being insecure. With relatively low effort, all of the security risks can be mitigated:

  • use key-pair encryption and encrypt data locally, before uploading to the server; that way, even if somebody breaks into the server, they are unable to decrypt the data
  • the search cache and all metadata are likewise encrypted/decrypted on the client, with a decrypted instance kept locally for fast access
  • use HTTPS for all client<->server communication and authenticate the user for extra security
  • in case you'd want to share data with other users in the system, you'd generate a new key-pair for each website/entry and re-encrypt those keys with that person's public key (see the sketch after this list)
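To illustrate, here's a minimal sketch of that scheme using the standard Web Crypto API. It uses a fresh symmetric (AES) key per entry rather than a full key-pair — the usual hybrid-encryption variant of the same idea — and it assumes each user already has an RSA-OAEP key pair; all names are illustrative:

```typescript
// Hypothetical sketch: encrypt one archive entry client-side with a fresh
// AES key, then wrap that key with the reader's RSA public key. Only the
// ciphertext, IV, and wrapped key would ever be uploaded to the server.
async function encryptEntry(
  plaintext: Uint8Array,
  recipientPublicKey: CryptoKey, // RSA-OAEP key with the "wrapKey" usage
) {
  // Fresh symmetric key, used for this entry only.
  const aesKey = await crypto.subtle.generateKey(
    { name: "AES-GCM", length: 256 },
    true, // extractable, so it can be wrapped below
    ["encrypt", "decrypt"],
  );
  const iv = crypto.getRandomValues(new Uint8Array(12));
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    aesKey,
    plaintext,
  );
  // Encrypt the per-entry key for the recipient.
  const wrappedKey = await crypto.subtle.wrapKey(
    "raw",
    aesKey,
    recipientPublicKey,
    { name: "RSA-OAEP" },
  );
  return { ciphertext, iv, wrappedKey };
}
```

Sharing an entry with another user is then just one more `wrapKey` call with their public key; the entry's ciphertext itself never has to be re-encrypted.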

With encryption in place, the role of the server could technically even be filled by public shared storage like Dropbox, because the encrypted files would be useless to any evil-doers: you need the private key to access their contents. The private key would obviously need to be kept secret by the user.

I have experience creating secure, encrypted client<->server communication, so given some spare time, I'd be happy to work on this.