jarun/buku

ToDo list

Closed this issue · 48 comments

jarun commented

Continued from #78.

Notes

The list below is a growing one. While suggesting new features please consider contributing to Buku. The code is intentionally kept simple and easy to understand with comments. We'll be happy to help out any new contributor.

Some of the fresh-baked features may not have been released yet. Grab the master branch for those.

Identified tasks

  • Ubuntu Unity scope
  • A browser plugin (probably a new project; see #122)
  • Add more tests
  • API documentation
  • Rest API for webapps
  • Android app (using the same database)
  • Text editor support (thanks @ZwodahS)
  • Need a PyPI maintainer
  • Make refreshdb faster using threads (record updates should be synchronized).
  • Show usage count in tag list
  • Proxy support
  • Continuous search at prompt
  • Add prompt help
  • Specify custom DB file to class BukuDb (library usage, no exposed option)
  • Move to urllib3
  • Handle redirects using referrer masking. Example URL. Fixed with urllib3.
  • Support URL shortening. This helps to share URLs. (see #92 for limitations)
  • Make a bookmark title immutable via refreshdb()
  • Markdown import/export
  • Regex search
  • Ubuntu PPA
  • Export specific tags to HTML
  • Exact word match using REGEX. Make substring match optional.
  • Delete all records based on a search result
  • Delete multiple items, support combination of indices and ranges
  • Append tags
  • Travis CI integration
  • Ubuntu deb package generation on new tag
  • Merge bookmark database files (for users who work on multiple systems)
  • Export bookmarks in FF or Chrome html format.
  • Option to add folder names as tags while importing HTML (see #80)
  • Implement self-upgrade (see #83)
  • Anything else which would add value (please discuss in this thread)

@jarun I can do pypi, it's really easy for me to maintain for you. Just need access to the "repo".
You can do that here and add the user name shaggytwodope to the maintainer role.

jarun commented

@jarun No worries mate, started a new job tho so weekend releases might get delayed for me. But short of that I'll gladly update asap. It's now updated to the 2.7 release here. I've tested the install process and it appears to work wonderfully.

Feel free to email or @ ping me anytime. I try to respond quickly. Also we/someone can add pip install buku instructions to readme at some stage now.

@jarun oh yeah, so the readme isn't pypi friendly at the moment, I'll look into this later on. I see it as a very low priority. Pypi site is in a transition stage for a new platform. Better markdown support may come. But for now I think it's ok to look a little off. Most likely tomorrow I'll see what options exist to clean it a bit.

jarun commented

No hurries. I'll add the pip installation procedure right away. Thanks again!

@jarun what about storing additional metadata per bookmark such as date_added, date_updated and possibly track how many times a bookmark was opened from buku?

These three metadata details seem useful to me because, while tags are useful, they don't convey when I did something, nor can I trend tag usage over time (as a web developer I create bookmarks per development project). Similarly, as tags get added it might be nice to be able to say "I never went back to this bookmark" or "I opened this bookmark a lot".

jarun commented

Hi, please read the project policy:

no history, obsolete records, usage analytics or homing.

@jarun thank you for the clarification. I believe you value privacy above all else, and I salute you for that. I would argue that similar to an Email or a blog post -- knowing the date of something does not imply user tracking; but is a useful fact of the data. Data is only as useful as it is searchable, tags and text searching will take you only so far.

Regardless, thank you for the prompt reply. I would kindly suggest that you provide a title to the last line of your Readme file with a heading of "Project Policy" or "Project Goals" as the single sentence doesn't express the depth of your passion on these ideals and may be easily overlooked by future users with similar wishes.

jarun commented

Thanks for your suggestion and understanding. BTW, if you wish to use Buku as your backend you can extend it in your project. I do that already for some columns. Also, would you be interested in the REST API task?

The suggestion of extending it sounds like a good idea for custom columns, I will try that.

Unfortunately I'm not a python developer so I'm out of my element with python+REST :(. Best regards.

jarun commented

Unfortunately I'm not a python developer so I'm out of my element with python+REST

In its simplest form I think you'd need to (de)serialize the input and the return value for each API.
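
A minimal sketch of that idea, using only the standard `json` module. The `BOOKMARKS` dict and the `add_bookmark_endpoint` function are hypothetical stand-ins for illustration, not part of Buku; a real endpoint would call into class BukuDb instead.

```python
import json

# Hypothetical in-memory stand-in for the bookmark store (not Buku's API).
BOOKMARKS = {}


def add_bookmark_endpoint(request_body: str) -> str:
    """Deserialize a JSON request, store the record, serialize the reply."""
    try:
        data = json.loads(request_body)  # deserialize the input
    except json.JSONDecodeError:
        return json.dumps({"status": "error", "message": "invalid JSON"})

    url = data.get("url") if isinstance(data, dict) else None
    if not url:
        return json.dumps({"status": "error", "message": "url is required"})

    index = len(BOOKMARKS) + 1
    BOOKMARKS[index] = {
        "url": url,
        "title": data.get("title", ""),
        "tags": data.get("tags", []),
    }
    return json.dumps({"status": "ok", "index": index})  # serialize the return
```

Each API would follow the same shape: parse the request, call the library, and encode whatever it returns.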

Maybe an API specification first?

jarun commented

@Qu4tro yes, item 2 in the list.

Hi, jarun--

Here's a feature idea: when buku attempts to pull down the title from the URL, if the server does not respond or produces other HTTP errors, add special tags to the broken entries. This, combined with the delete bookmark by search feature will allow users to cull obsolete/broken bookmarks from their database. It is even possible to use a series of numbered tags like: NoResponse1, NoResponse2, NoResponse3 to handle the possibility that a URL is only temporarily offline. (For each failure to connect, buku replaces the old tag NoResponsen with NoResponsen+1; users can search for NoResponse5 and delete all of them.)

Other tags are HTTP404 for the pages that have been removed (users will probably want to purge those bookmarks or update them with new URLs), etc.
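
To make the numbered-tag idea concrete, here is a rough sketch of the counter logic on a plain list of tags. The function names are made up for illustration and nothing here is existing Buku code.

```python
import re

# Matches the proposed counter tags: NoResponse1, NoResponse2, ...
NO_RESPONSE = re.compile(r"^NoResponse(\d+)$")


def bump_no_response(tags):
    """Return a new tag list with the NoResponse counter advanced by one."""
    count = 0
    kept = []
    for tag in tags:
        match = NO_RESPONSE.match(tag)
        if match:
            # Remember the highest counter seen and drop the old tag.
            count = max(count, int(match.group(1)))
        else:
            kept.append(tag)
    kept.append(f"NoResponse{count + 1}")
    return kept


def clear_no_response(tags):
    """Drop all NoResponse counter tags, e.g. after a successful fetch."""
    return [tag for tag in tags if not NO_RESPONSE.match(tag)]
```

A failed fetch would call `bump_no_response` on the bookmark's tags, and a successful one would call `clear_no_response`, so only URLs that fail repeatedly reach a high count.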

jarun commented

@dchang0 Thanks for the suggestion! Having a title or not due to HTTP errors becomes irrelevant because of the following options:

  • set manual tags and mark immutable
  • option to search blank titles using buku -S blank
  • silent full refresh showing error codes for all bookmarks using buku -u --tacit

I'm afraid any additional options would be a feature bloat around titles.

jarun commented

Why is it important for a bookmark management utility to store the broken/temporarily broken status of pages?

jarun commented

All I am saying is these statuses are not permanent. They keep changing and if we were to track those properly (in real time) we should find some way to run buku -u with some frequency as a daemon in the background.

Let's say a page is showing 404. We mark it in Buku accordingly. Later the page comes back online but the status is still 404 in Buku. Users should never RELY on the stored data and take a decision to remove or not to visit the page. The ONLY way to know for sure a page is down (at the moment) is to visit the page. Why should users care about the last time Buku checked the status?

jarun commented

Presumably one would run the check just before running the mass deletion

That's what buku -u --tacit does for you. For all bookmarks at once, multi-threaded. Please run it once first.

And there would always be the possibility of getting a false positive

I don't want it coming from Buku. Not for stuff not under Buku's control.

jarun commented

Thanks for your understanding!

jarun commented

Buku is too busy to track you - no history, obsolete records, usage analytics or homing.

However, if you are too eager to have this, please extend the flags column with a new bit for last_checked_status. 0 = OK (default), 1 = some HTTP failure.
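
A rough sketch of what such a bit could look like in an extension. The bit position and the helper names are assumptions made for illustration; check which bits of the flags column are already in use before picking one.

```python
# Hypothetical bit position for last_checked_status; verify it is unused.
LAST_CHECK_FAILED = 1 << 1


def set_last_checked_status(flags: int, failed: bool) -> int:
    """Return flags with the status bit set (failure) or cleared (OK)."""
    if failed:
        return flags | LAST_CHECK_FAILED
    return flags & ~LAST_CHECK_FAILED


def last_check_failed(flags: int) -> bool:
    """True if the last check recorded some HTTP failure."""
    return bool(flags & LAST_CHECK_FAILED)
```

The other bits of the column are left untouched, so existing flags keep working.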

jarun commented

I'm curious as to why "obsolete records" would be considered "tracking you."

The same way Google may mine your 2016 search history to know you. Anyway, I am not aligned to the datestamp stuff because it is a usage record.

From your earlier comment:

it's too much work to clean up my old database manually.

It's actually not, if that's what you want to do and by old database you mean a Buku (or importable) database. Run buku -u --tacit and redirect the output to a file. Parse (or manually check) the output file for failures and get the indices. Pass all the indices to buku -d.
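
For the last step, a small helper can turn the collected indices into a compact argument list. This is only a sketch: `collapse_indices` is a made-up name, and the hyphenated range form (e.g. `3-5`) is an assumption about how buku -d accepts ranges, so check the usage text first. Extracting the failing indices from the output file is left out because it depends on the exact output format.

```python
def collapse_indices(indices):
    """Collapse indices into sorted singles and ranges, e.g. '3-5' and '9'."""
    result = []
    run_start = prev = None
    for i in sorted(set(indices)):
        if prev is not None and i == prev + 1:
            prev = i  # extend the current run
            continue
        if run_start is not None:
            # Close the previous run before starting a new one.
            result.append(
                str(run_start) if run_start == prev else f"{run_start}-{prev}"
            )
        run_start = prev = i
    if run_start is not None:
        result.append(
            str(run_start) if run_start == prev else f"{run_start}-{prev}"
        )
    return result
```

Joining the result with spaces gives the arguments to append to buku -d.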

Is the REST API todo still open? Or is someone working on it?

I was thinking about implementing it. I recently made a small project to get to grips with Flask-RESTful, which seems to be a good choice for the problem at hand.

There are still a few things that are not clear to me, but I might just be overthinking it. I'm thinking about implementing a small, functional subset first. I have a list of the endpoints I planned somewhere, which I can post here.

PS: Also thinking about swagger support. Swagger is a great idea!

@Qu4tro So you are going to work on it? Because, I am also familiar with building RESTful APIs with Flask, and was wondering if I could contribute. Yes it will be great if you can post the list of endpoints, I am curious.

Swagger support might be a little bit overkill IMO.

If you're already familiar I think you should go ahead and create the repository. I will contribute to it as well.
I will post the list later when I get home.

I might be wrong about swagger, but it gives so much: web GUI, API clients for other languages, etc...

jarun commented

@kishore-narendran please start with the REST APIs and @Qu4tro can contribute. It is a long-pending task item and I have come across people interested in this (to have Buku as a backend).

REST would be our first target. We'll have a discussion around swagger in phase 2 and take a call. I'm not familiar with it, so I'd have to take a look to understand the pros and cons myself.

jarun commented

@kishore-narendran @Qu4tro I have added both of you as owners of this task item. Please raise a bug and discuss the design issues, pros and cons, conflicts etc. there. I will pitch in wherever I can.

Hello jarun, I am new to the open source world. I want to contribute to your project. I know Python. How can I help you?

jarun commented

@dheerajkrishna90 unfortunately our test cases are much behind our implementation. Can you start with adding more test cases for new APIs/functionality?

yeah okay, @jarun should I add test cases to test_helpers.py or test_bukuDb.py?

jarun commented

You have to write the new cases for class BukuDb in test_bukuDb.py, and those for generic APIs in test_helpers.py.

Yeah got it

Hi. I am a somewhat new contributor to open-source software. I have always wanted to contribute to a python project. Is there a task that I can do to be familiar with your codebase? Thanks.

jarun commented

@castellanprime can you team up with @dheerajkrishna90 and work on adding new test cases to strengthen our automated build process? We really need to beef it up with solid test cases that take care of corner cases etc. One of you can open an issue and discuss your plan there.

yeah @castellanprime we will work together

jarun commented

I would love to see test cases where the current logic fails.

@jarun, @castellanprime, @dheerajkrishna90 is anyone working on tests? Can I also work on them? I will start with the BukuHTMLParser class.

jarun commented

@rachmadaniHaryono AFAIK, no one is right now. Please start right away.

We are far behind implementation in tests. I would love to see dozens fail.

@jarun, I added a PR for this: #130.

Is it to your liking? If everything is correct I will add more tests for other classes/functions.

jarun commented

Rolled at #135.