bookieio/Bookie

inconsistent dates in API output

Closed this issue · 7 comments

First off, the updated field doesn't seem to have any value in it. What is it for?

Second, I sometimes see a date field in the output, e.g.:

headers: {"bmark": {"username": "anarcat", "updated": "", "extended": "", "description": "Example Domain", "tags": [], "bid": 4873, "stored": "2014-11-29 22:27:51", "inserted_by": "unknown_api", "tag_str": "", "clicks": 0, "is_private": true, "hash_id": "2a1b402420ef46"}, "location": "https://example.com/bookie/bmark/readable/2a1b402420ef46"}, body: {'content-length': '332', 'keep-alive': 'timeout=5, max=100', 'server': 'Apache/2.4.10 (Debian)', 'connection': 'Keep-Alive', u'_http_status': 200, 'date': 'Sat, 29 Nov 2014 22:28:04 GMT', 'access-control-allow-origin': '*', 'access-control-allow-headers': 'X-Requested-With', 'content-type': 'application/json; charset=UTF-8'}

What is that date field, and why isn't it in the same format as the stored field?

I am trying to figure out whether I can detect, through /:username/bmark alone, if a bookmark is new or already known. Relying on the timestamps seems flaky...

And all this is so that I can make sure the content is updated when I update a bookmark, but maybe that's already the case?

It looks like you're seeing output from the tool you're using to request the API, not just the API response itself. I'm not sure which tool that is. Using httpie (a Python tool) helps make this clearer, since it separates the HTTP client data (normally handled by a browser) from the actual JSON API response.

http://paste.ubuntu.com/9306746/

That is an example of fetching one of my own bookmarks. Now, the thing is that you should only get a bookmark back if it exists. If I change that hash to an invalid one, you get a 404 response with a nice error message from the API.

http://paste.ubuntu.com/9306770/

Hopefully this helps clear it up. I'm marking this as invalid since I believe it's about the client you're using to call the API rather than the API itself. Please let me know if I've read this incorrectly.
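To make the two date formats concrete: the date field in that output is the standard HTTP Date response header (RFC 1123 format) added by the web server, while stored is a plain timestamp in the API's JSON body. A minimal Python sketch parsing both, using the values from the paste above:

```python
from datetime import datetime
from email.utils import parsedate_to_datetime

# 'date' is the HTTP Date response header (RFC 1123 format),
# set by the web server, not by the Bookie API itself.
http_date = parsedate_to_datetime("Sat, 29 Nov 2014 22:28:04 GMT")

# 'stored' comes from the JSON body and is a plain local timestamp.
stored = datetime.strptime("2014-11-29 22:27:51", "%Y-%m-%d %H:%M:%S")

print(http_date.isoformat())  # 2014-11-29T22:28:04+00:00
print(stored.isoformat())     # 2014-11-29T22:27:51
```

So the two are produced by different layers and will never share a format; compare them only after parsing each with its own parser.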

Oh, I see, I confused the HTTP headers with the JSON output, sorry.

Or actually, the examples you pasted are for a hash you already know. But what if I only have a URL? How can I tell if it's already been submitted? And can I do that while submitting the URL?

I think having a "created" field would be useful, for example.

And what's up with the updated field?

OK, so the updated field is supposed to be set after a bookmark that already existed is changed. I tried it out on the sample bookmark in the paste above and verified that when I added a tag, the updated field was set to the current date/time. So say you imported old bookmarks and we kept the original timestamps; if you later go through and mass-update tags or something, the original timestamp is kept separate from the last-edit timestamp.

'created' is really what 'stored' already is.
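Putting the above together, a client can tell whether a bookmark has been edited since it was first stored just by checking whether updated is non-empty. A small sketch (the helper name `was_edited` is mine, not part of the API):

```python
def was_edited(bmark):
    """Return True if a bookmark dict (as returned by the API) has been
    changed since it was first stored.

    Per the behavior described above: 'stored' keeps the original
    timestamp, and 'updated' is only filled in once an existing
    bookmark is modified (e.g. a tag is added)."""
    return bool(bmark.get("updated"))

# The bookmark in the paste earlier in this thread has an empty
# 'updated' field, so it has never been edited:
bmark = {"stored": "2014-11-29 22:27:51", "updated": ""}
print(was_edited(bmark))  # False
```

An edited bookmark would carry a timestamp in updated and the check would return True.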

As for testing whether a URL is bookmarked, you're right: we don't have a good way to do that atm. There are basically two paths I can offer you. One, hash the URL yourself using the algorithm here:

https://github.com/bookieio/Bookie/blob/develop/bookie/lib/urlhash.py#L5
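For reference, here's a self-contained sketch of that hashing approach. I'm assuming it's a SHA-256 of the URL truncated to 14 hex characters, which matches the 14-character hash_ids seen in this thread; the linked urlhash.py is the authoritative version:

```python
import hashlib

def generate_hash(url):
    """Hash a URL into a short bookmark id, the way Bookie's urlhash
    module appears to: SHA-256 of the URL, truncated to 14 hex chars.
    (Sketch only; check the linked urlhash.py for the real algorithm.)"""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:14]

h = generate_hash("http://tornado.readthedocs.org/en/latest/")
print(len(h))  # 14
```

The hash is deterministic, so a client can compute it locally and then GET /:username/bmark/:hash_id to see whether the bookmark already exists.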

Two, that's kind of done for you by the bookie_parser project, which I haven't updated yet to bookie.io:

https://github.com/bookieio/bookie_parser

It has an API you can pass a URL to. It'll hash the URL, load the readable content, and send back that response.

https://github.com/bookieio/bookie_parser/blob/master/bookie_parser/__init__.py#L30

You can try it out against the heroku served instance:

http://r.bmark.us/

The hash it gives you back is the same as a Bookie hash.

http -j --follow POST "http://r.bmark.us/api/v1/parse" url="http://tornado.readthedocs.org/en/latest/"

for instance, will load the readable version of the Tornado docs homepage, tell you the hash_id is 8d8cd9a9507053, and give you some extra info about the request itself.

If you think it'd be helpful, we can look at adding some sort of "url -> hash_id" endpoint to the API.

Understood, thanks for the feedback!

One last question while I have you here... ;) Is the content field re-parsed when you submit a URL through the bmark endpoint a second time? I.e., is it possible to "refresh" the static copy of the HTML that way?

thanks!

@anarcat no, the only times it's parsed are:

  • if it's a new bookmark and there's no readable content yet, in which case the celery process picks it up
  • if the bookmark is new and is sent with content
  • if it's an edit of an existing bookmark and new content is sent in the POST update

Understood, thanks!

And I think that makes perfect sense too... I guess the docs should be updated from this issue to be really complete, but I feel lazy now. :)