ToDo list

Question

ToDo list

Closed this issue 7 years ago · 48 comments

Answer 1 · 2016-12-06T18:27:43.000Z

@jarun I can do pypi, it's really easy for me to maintain for you. Just need access to the "repo".
You can do that here and add the user name shaggytwodope to the maintainer role.

Answer 2 · 2016-12-06T19:19:51.000Z

Done! Thank you!!! I thought of mailing you once but held back lest I would be imposing. :)

Answer 3 · 2016-12-06T19:30:15.000Z

@jarun No worries mate, started a new job tho so weekend releases might get delayed for me. But short of that I'll gladly update asap. It's now updated to the 2.7 release here. I've tested the install process and it appears to work wonderfully.

Feel free to email or @ ping me anytime. I try to respond quickly. Also we/someone can add pip install buku instructions to readme at some stage now.

Answer 4 · 2016-12-06T19:36:36.000Z

@jarun oh yeah, so the readme isn't PyPI friendly at the moment, I'll look into this later on. I see it as a very low priority. The PyPI site is in a transition stage to a new platform. Better markdown support may come. But for now I think it's ok to look a little off. Most likely tomorrow I'll see what options exist to clean it a bit.

Answer 5 · 2016-12-06T19:46:50.000Z

No hurries. I'll add the pip installation procedure right away. Thanks again!

Answer 6 · 2016-12-19T21:22:54.000Z

@jarun what about storing additional metadata per bookmark such as date_added, date_updated and possibly track how many times a bookmark was opened from buku?

These 3 metadata details seem useful to me because, while tags are useful, they don't convey when I did something, nor can I trend tag usage over time (as a web developer I create bookmarks per development project). Similarly, as tags get added it might be nice to be able to say "I never went back to this bookmark" or "I opened this bookmark a lot".

Answer 7 · 2016-12-20T02:24:54.000Z

Hi, please read the project policy:

no history, obsolete records, usage analytics or homing.

Answer 8 · 2016-12-20T02:44:09.000Z

@jarun thank you for the clarification. I believe you value privacy above all else, and I salute you for that. I would argue that similar to an Email or a blog post -- knowing the date of something does not imply user tracking; but is a useful fact of the data. Data is only as useful as it is searchable, tags and text searching will take you only so far.

Regardless, thank you for the prompt reply. I would kindly suggest that you provide a title to the last line of your Readme file with a heading of "Project Policy" or "Project Goals" as the single sentence doesn't express the depth of your passion on these ideals and may be easily overlooked by future users with similar wishes.

Answer 9 · 2016-12-20T02:48:52.000Z

Thanks for your suggestion and understanding. BTW, if you wish to use Buku as your backend you can extend it in your project. I do that already for some columns. Also, would you be interested in the REST API task?

Answer 10 · 2016-12-20T17:10:46.000Z

The suggestion of extending it sounds like a good idea for custom columns, I will try that.
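For anyone trying the same route, here is a minimal sketch of keeping custom columns in a side table owned by your own project instead of in Buku's schema. Everything in it is an assumption made for illustration: the table name `bookmark_meta`, its columns, and keying on the URL are hypothetical and are not part of Buku.

```python
import sqlite3


def open_sidecar(path=":memory:"):
    """Open (or create) a side database that lives next to the bookmark store."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row of extra metadata per bookmarked URL.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bookmark_meta ("
        " url TEXT PRIMARY KEY,"
        " date_added TEXT NOT NULL,"
        " date_updated TEXT NOT NULL)"
    )
    return conn


def touch(conn, url, today):
    """Record that `url` was added or updated on `today` (ISO date string)."""
    # The first call inserts the row; later calls leave date_added untouched.
    conn.execute(
        "INSERT OR IGNORE INTO bookmark_meta (url, date_added, date_updated) "
        "VALUES (?, ?, ?)",
        (url, today, today),
    )
    conn.execute(
        "UPDATE bookmark_meta SET date_updated = ? WHERE url = ?",
        (today, url),
    )
    conn.commit()
```

Calling `touch()` from the wrapper that adds or edits a bookmark keeps the dates current without changing anything inside Buku itself.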

Unfortunately I'm not a python developer so I'm out of my element with python+REST :(. Best regards.

Answer 11 · 2016-12-20T18:08:21.000Z

Unfortunately I'm not a python developer so I'm out of my element with python+REST

In its simplest form I think you'd need to (de)serialize the input and the return value for each API.
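A rough sketch of that idea, assuming JSON as the wire format: deserialize the request body, call the backend, and serialize what it returns. `add_bookmark` is a hypothetical stand-in written for this example, not Buku's actual function, and the response shape is likewise made up.

```python
import json


def add_bookmark(url, title="", tags=None):
    """Hypothetical backend call; a real wrapper would call into Buku here."""
    return {"index": 1, "url": url, "title": title, "tags": sorted(tags or [])}


def handle_add(request_body):
    """Deserialize a JSON request, call the backend, serialize the result."""
    try:
        payload = json.loads(request_body)
    except json.JSONDecodeError:
        return json.dumps({"status": "error", "message": "invalid JSON"})
    if not isinstance(payload, dict) or "url" not in payload:
        return json.dumps({"status": "error", "message": "missing 'url'"})
    record = add_bookmark(
        payload["url"], payload.get("title", ""), payload.get("tags")
    )
    return json.dumps({"status": "ok", "bookmark": record})
```

Each endpoint would follow the same three steps; only the backend call and the validated fields change.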

Answer 12 · 2016-12-24T19:20:09.000Z

Maybe an API specification first?

Answer 13 · 2016-12-24T22:36:47.000Z

@Qu4tro yes, item 2 in the list.

Answer 14 · 2017-01-30T17:24:27.000Z

Hi, jarun--

Here's a feature idea: when buku attempts to pull down the title from the URL, if the server does not respond or produces other HTTP errors, add special tags to the broken entries. This, combined with the delete bookmark by search feature, will allow users to cull obsolete/broken bookmarks from their database. It is even possible to use a series of numbered tags like NoResponse1, NoResponse2, NoResponse3 to handle the possibility that a URL is only temporarily offline. (For each failure to connect, buku replaces the old tag NoResponse<n> with NoResponse<n+1>; users can search for NoResponse5 and delete all of them.)

Other tags are HTTP404 for the pages that have been removed (users will probably want to purge those bookmarks or update them with new URLs), etc.
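To make the counter idea concrete, here is a small sketch of the tag bookkeeping only. It works on a plain list of tag strings; the function names are invented for this example and nothing here is existing buku behaviour.

```python
import re

# Matches the proposed counter tags: NoResponse1, NoResponse2, ...
NO_RESPONSE = re.compile(r"^NoResponse(\d+)$")


def bump_no_response(tags):
    """Return a new tag list with the NoResponse counter incremented by one."""
    count = 0
    kept = []
    for tag in tags:
        match = NO_RESPONSE.match(tag)
        if match:
            # Remember the highest counter seen and drop the old tag.
            count = max(count, int(match.group(1)))
        else:
            kept.append(tag)
    kept.append(f"NoResponse{count + 1}")
    return kept


def clear_no_response(tags):
    """Drop the counter tag once the URL responds again."""
    return [tag for tag in tags if not NO_RESPONSE.match(tag)]
```

A failed fetch would call `bump_no_response()`, a successful one `clear_no_response()`, so only URLs that fail several passes in a row reach a high counter.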

Answer 15 · 2017-01-31T02:42:10.000Z

@dchang0 Thanks for the suggestion! Having a title or not due to HTTP errors becomes irrelevant because of the following options:

set manual tags and mark immutable
option to search blank titles using buku -S blank
silent full refresh showing error codes for all bookmarks using buku -u --tacit

I'm afraid any additional options would be feature bloat around titles.

Answer 16 · 2017-01-31T02:48:04.000Z

Thanks for the reply. Perhaps I should not have mentioned titles. My actual goal is to make it so that it is easy to automatically cull dead URLs in one or two or three passes. (The only reason for multiple passes is that some URLs might only be temporarily broken.) And there is not necessarily a need to use tags to store the state of broken URLs, either. If there were some other more elegant means of marking dead or alive URLs between passes, that would be fine by me too.


Answer 17 · 2017-01-31T02:57:21.000Z

Why is it important for a bookmark management utility to store the broken/temporarily broken status of pages?

Answer 18 · 2017-01-31T03:13:16.000Z

Well, that's not actually important. The final goal is the only important part: making it easy to clear out bad bookmarks quickly and without lots of manual labor. The only reason to store a state is that a server could be down temporarily. If we ignore that possibility, there is no need to store any state; the bad bookmarks could simply be deleted immediately.


Answer 19 · 2017-01-31T03:17:33.000Z

All I am saying is these statuses are not permanent. They keep changing and if we were to track those properly (in real time) we should find some way to run buku -u with some frequency as a daemon in the background.

Let's say a page is showing 404. We mark it in Buku accordingly. Later the page comes back online but the status is still 404 in Buku. Users should never RELY on the stored data and take a decision to remove or not to visit the page. The ONLY way to know for sure a page is down (at the moment) is to visit the page. Why should users care about the last time Buku checked the status?

Answer 20 · 2017-01-31T03:29:48.000Z

Presumably one would run the check just before running the mass deletion, to confirm that the page is really down. And there would always be the possibility of getting a false positive (user thinks the site is permanently down but it was only temporarily down).


Answer 21 · 2017-01-31T03:34:33.000Z

Presumably one would run the check just before running the mass deletion

That's what buku -u --tacit does for you. For all bookmarks at once, multi-threaded. Please run it once first.

And there would always be the possibility of getting a false positive

I don't want it coming from Buku. Not for stuff not under Buku's control.

Answer 22 · 2017-01-31T03:59:50.000Z

Understood. I'll have to store the state outside of buku. As it stands, it's too much work to clean up my old database manually.


Answer 23 · 2017-01-31T04:09:02.000Z

Thanks for your understanding!

Answer 24 · 2017-01-31T07:37:17.000Z

How about this idea instead? Have buku store a "Last seen on [datestamp]" value for each bookmark. This would never be false, as buku would only update the value if the URL is seen. Thus, it would never produce false positives. Then, if a user could run searches like this: "show me all the bookmarks that have a Last seen on date older than 90 days" and the user knows for sure that buku had been updated several times in those 90 days, they could safely choose to delete all those bookmarks found in this search. What do you think? buku would never lie nor produce a false positive. The false positives, if any, would be within the user's logic/assumptions.
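As a sketch of the search described above, assuming the "last seen" dates were available as a mapping from URL to datetime (where they would be stored is left open, and the function name is made up for this example):

```python
from datetime import datetime, timedelta


def stale_bookmarks(last_seen, now, max_age_days=90):
    """Return URLs whose last successful fetch is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(url for url, seen in last_seen.items() if seen < cutoff)


# Example data, invented for illustration.
seen = {
    "https://example.com/alive": datetime(2017, 1, 20),
    "https://example.com/gone": datetime(2016, 9, 1),
}
candidates = stale_bookmarks(seen, now=datetime(2017, 1, 31))
```

Whether the candidates are really dead is still the user's call; the filter only reports when each URL was last reached.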


Answer 25 · 2017-01-31T13:32:51.000Z

Buku is too busy to track you - no history, obsolete records, usage analytics or homing.

However, if you are too eager to have this, please extend the flags column with a new bit for last_checked_status. 0 = OK (default), 1 = some HTTP failure.
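A minimal sketch of what such a bit could look like in a fork or wrapper. The bit position `0x02` is an assumption picked for the example; which bits of the flags column are already in use is not stated in this thread, so check before choosing one.

```python
# ASSUMPTION: hypothetical bit position for last_checked_status.
LAST_CHECK_FAILED = 0x02


def set_last_checked_status(flags, failed):
    """Return flags with the status bit set (failure) or cleared (OK)."""
    if failed:
        return flags | LAST_CHECK_FAILED
    return flags & ~LAST_CHECK_FAILED


def last_check_failed(flags):
    """True if the last check recorded some HTTP failure."""
    return bool(flags & LAST_CHECK_FAILED)
```

Because the default is 0 = OK, existing rows need no migration; only the bit for failed checks is ever written.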

Answer 26 · 2017-01-31T16:42:35.000Z

I'm curious as to why "obsolete records" would be considered "tracking you." A record being obsolete does not reveal anything about the habits of the user to a spying third party. The others, I could understand could expose a user's behavior. Thanks for the suggestion to extend the flags column. It would have to be a datestamp to be of use.


Answer 27 · 2017-01-31T17:04:37.000Z

I'm curious as to why "obsolete records" would be considered "tracking you."

The same way Google may mine your 2016 search history to know you. Anyway, I am not aligned to the datestamp stuff because it is a usage record.

From your earlier comment:

it's too much work to clean up my old database manually.

It's actually not, if that's what you want to do and by old database you mean a Buku (or importable) database. Run buku -u --tacit and redirect the output to a file. Parse (or manually check) the output file for failures and get the indices. Pass all the indices to buku -d.
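The parsing step could be sketched roughly as below. Note the big assumption: the exact output format of buku -u --tacit is not shown in this thread, so the line format matched by `FAILURE_PATTERN` is invented for illustration and must be adapted to what buku actually prints.

```python
import re

# ASSUMPTION: invented failure-line format, e.g. "[ERROR] ... index 12".
# Replace this pattern after inspecting the real redirected output.
FAILURE_PATTERN = re.compile(r"^\[ERROR\].*\bindex (?P<index>\d+)\b")


def failed_indices(lines, pattern=FAILURE_PATTERN):
    """Collect the bookmark indices from lines that report a failure."""
    indices = set()
    for line in lines:
        match = pattern.search(line)
        if match:
            indices.add(int(match.group("index")))
    return sorted(indices)


def delete_command(indices):
    """Build the argument list that passes all the indices to buku -d."""
    return ["buku", "-d", *map(str, indices)]
```

Reviewing the list returned by `failed_indices()` before running the delete command is a cheap guard against removing pages that were only temporarily down.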

Answer 28 · 2017-01-31T17:15:24.000Z

Hmm. Well, it seems that we have different definitions for the term "obsolete record," then. Please allow me to illustrate: Let's say I bookmark sites A, B, C, and D. Assume no one other than me can see my bookmarks. If B were to go obsolete, that is not through any user action. It happened out of my control (server was shut down). Then, a third party could not deduce any user behavior from B going obsolete, because B going obsolete was not caused by any user action.

I understand and agree with your "I am not aligned to the datestamp stuff because it is a usage record."

Re: "Run buku -u --tacit and redirect the output to a file. Parse (or manually check) the output file for failures and get the indices. Pass all the indices to buku -d." Understood.


Answer 29 · 2017-02-01T17:54:37.000Z

Is the REST API todo still open? Or is someone working on it?

Answer 30 · 2017-02-01T22:52:59.000Z

I was thinking about implementing it. I made a small project recently to get the hang of Flask-RESTful, which seems to be a good choice for the problem at hand.

There are still a few things that are not clear to me, but I might just be overthinking it. I'm thinking about implementing a small, functional subset first. I have a list of the endpoints that I planned somewhere, which I can post here.

PS: Also thinking about swagger support. Swagger is a great idea!

Answer 31 · 2017-02-01T22:55:56.000Z

@Qu4tro So you are going to work on it? Because, I am also familiar with building RESTful APIs with Flask, and was wondering if I could contribute. Yes it will be great if you can post the list of endpoints, I am curious.

Swagger support might be a little bit overkill IMO.

Answer 32 · 2017-02-01T23:14:04.000Z

If you're already familiar I think you should go ahead and create the repository. I will contribute to it as well.
I will post the list later when I get home.

I might be wrong about swagger, but it gives so much: a web GUI, API clients for other languages, etc.

Answer 33 · 2017-02-02T02:28:02.000Z

@kishore-narendran please start with the REST APIs and @Qu4tro can contribute. It is a long-pending task item and I have come across people interested in this (to have Buku as a backend).

REST would be our first target. We'll have a discussion around swagger in phase 2 and take a call. I'm not familiar with it so I'd have to take a look to understand the pros and cons myself.

Answer 34 · 2017-02-02T04:28:06.000Z

@kishore-narendran @Qu4tro I have added both of you as the owner of this task item. Please raise a bug and discuss the design issues, pros and cons, conflicts etc. there. I will pitch in wherever I can.

Answer 35 · 2017-03-07T16:45:41.000Z

Hello jarun, I am new to the open source world. I want to contribute to your project. I know Python. How can I help you?

Answer 36 · 2017-03-07T17:42:48.000Z

@dheerajkrishna90 unfortunately our test cases are far behind our implementation. Can you start with adding more test cases for new APIs/functionality?

Answer 37 · 2017-03-07T23:55:48.000Z

yeah okay, @jarun should I add test cases to test_helpers.py or test_bukuDb.py?

Answer 38 · 2017-03-08T02:23:02.000Z

You have to write the new cases for class BukuDb in test_bukuDb.py, and those for generic APIs in test_helpers.py.

Answer 39 · 2017-03-08T02:25:22.000Z

Yeah got it

Answer 40 · 2017-03-08T04:46:23.000Z

Hi. I am a somewhat new contributor to open-source software. I have always wanted to contribute to a python project. Is there a task that I can do to be familiar with your codebase? Thanks.

Answer 41 · 2017-03-08T05:04:33.000Z

@castellanprime can you team up with @dheerajkrishna90 and work on adding new test cases to strengthen our automated build process? We really need to beef it up with solid test cases that take care of corner cases etc. One of you can open an issue and discuss your plan there.

Answer 42 · 2017-03-08T05:11:53.000Z

@jarun Okay

Answer 43 · 2017-03-08T09:58:08.000Z

yeah @castellanprime we will work together

Answer 44 · 2017-03-08T10:00:51.000Z

I would love to see test cases where the current logic fails.

Answer 45 · 2017-03-18T02:28:41.000Z

@jarun, @castellanprime, @dheerajkrishna90 is anyone working on tests? Can I also work on them? I will start with the BukuHTMLParser class.

Answer 46 · 2017-03-18T02:33:50.000Z

@rachmadaniHaryono AFAIK, no one is right now. Please start right away.

We are far behind implementation in tests. I would love to see dozens fail.

Answer 47 · 2017-03-18T03:33:05.000Z

@jarun, I added a PR for this: #130.

Is it to your liking? If everything is correct I will add tests for another class/function.

Answer 48 · 2017-03-27T12:53:35.000Z

Rolled at #135.

ToDo list

Notes

Identified tasks