jarun/buku

ToDo List

jarun opened this issue · 52 comments

jarun commented

Continued from #103.

Notes

The list below is a growing one. While suggesting new features please consider contributing to Buku. The code is intentionally kept simple and easy to understand with comments. We'll be happy to assist any new contributor. We need your help!

Some of the fresh-baked features may not have been released yet. Grab the master branch for those.

Identified tasks

  • Option to scan and import Firefox and Google Chrome bookmarks DB file (URL, title, tags, description fields) to Buku database (@alex-bender)
  • Suggest tags that go together
  • Support custom colours (refer to googler)
  • API documentation (comments need to be in doxygen format)
  • Android app (using the same database)
  • Search multiple tags, exclusion in tag search (not sure if the RoI makes sense)
  • Append/overwrite/remove tags from prompt
  • Rest API for webapps (ongoing activity @kishore-narendran)
  • Add more tests (ongoing activity @rachmadaniHaryono)
  • A browser plugin (thanks @samhh for bukubrow)
  • Text editor support (thanks @ZwodahS)
  • Need a PyPI maintainer (thanks @shaggytwodope)
  • Make refreshdb faster using threads (record updates should be synchronized)
  • Show usage count in tag list
  • Proxy support (thanks @denisfalqueto)
  • Continuous search at prompt
  • Add prompt help
  • Specify custom DB file to class BukuDb (library usage, no exposed option)
  • Move to urllib3
  • Handle redirects using referrer masking. Example URL. Fixed with urllib3.
  • Support URL shortening. This helps to share URLs. (see #92 for limitations)
  • Make a bookmark title immutable via refreshdb()
  • Markdown import/export
  • Regex search
  • Ubuntu PPA (thanks @shaggytwodope)
  • Export specific tags to HTML
  • Exact word match using REGEX. Make substring match optional.
  • Delete all records based on a search result
  • Delete multiple items, support combination of indices and ranges
  • Append tags
  • Travis CI integration
  • Ubuntu deb package generation on new tag
  • Merge bookmark database files (for users who work on multiple systems)
  • Export bookmarks in FF or Chrome html format.
  • Option to add folder names as tags while importing HTML (thanks @Mohammadkhalifa)
  • Check and show upstream version
  • Anything else which would add value (please discuss in this thread)

Anything else which would add value (please discuss in this thread)

Maybe make the web browser configurable? i.e. when you're on a system where your default browser is Firefox but you want to open bookmarks in Lynx because you're on a terminal anyway. Probably:

  1. Add a --browser option with the binary as an argument.
  2. Move the webbrowser.open() into an else case and add an if case for whether --browser was passed and the argument is a valid binary.
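A minimal sketch of the two steps above, not buku's actual code. The `--browser` flag, the `open_url` helper, and its `launcher`/`fallback` parameters are hypothetical names introduced for illustration. It uses `shutil.which()` to check that the argument resolves to an executable, and falls back to `webbrowser.open()` otherwise.

```python
import argparse
import shutil
import subprocess
import webbrowser


def build_parser():
    # Hypothetical option; buku's real argument parser may look different.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument('--browser', metavar='BINARY', default=None)
    return parser


def open_url(url, browser=None, launcher=subprocess.Popen,
             fallback=webbrowser.open):
    """Open url with the given browser binary, else with the default browser.

    Returns 'custom' or 'default' so callers can tell which path was taken.
    """
    if browser is not None:
        binary = shutil.which(browser)  # None if not an executable
        if binary is not None:
            launcher([binary, url])
            return 'custom'
    # No --browser given, or it was not a valid binary: use the default.
    fallback(url)
    return 'default'
```

With this shape, something like `open_url(url, build_parser().parse_known_args()[0].browser)` would open the bookmark in Lynx when `--browser lynx` is passed and Lynx is on the PATH.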

jarun commented

Try the BROWSER environment variable:

BROWSER=firefox buku -o 3

Does not work on Windows.

set BROWSER="C:\Program Files\Some Browser\Browser.exe" doesn't work either.

jarun commented

Not really a buku problem. A quick google search leads to http://stackoverflow.com/questions/25875093/how-to-set-browser-environmental-variable-for-python-webbrowser

but I don't pay and use windows ;). So can't try it out myself.

The last time I had to pay for Windows was in 2001. Later versions came preinstalled (for free) or over MSDNAA (for free) or as a free update (for free).

I can confirm that the SET BROWSER=Something not in quotes answer works for me indeed.

jarun commented

The last time I had to pay for Windows was in 2001.

LoL!!! 👍
Thanks for the confirmation.

@jarun you should mark "Option to add folder names as tags while importing HTML" as done

jarun commented

Done!

  1. When using --np mode exit code should be different than 0 when search or action fails - this would allow for using Buku in custom scripts
  2. It would be awesome if there was a way to perform a search and immediately open all found results. Currently it seems impossible to use -s and -o flags together.
    Buku <some query> -o opens a random bookmark.

jarun commented
  1. It's not so simple because it's not a direct API call. The API searchdb() indeed returns number of results or None. I would suggest you write yourself a small wrapper that calls the API directly.

  2. Did you try buku -s hello --oa?
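A minimal sketch of such a wrapper, assuming only what is stated above: that `searchdb()` returns something falsy (such as `None`) when nothing matches. The helper names `exit_code_for` and `run_search` are made up for illustration, and passing a list of keywords to `searchdb()` is an assumption; check the signature in your buku version.

```python
def exit_code_for(results):
    """Map a search result to a shell exit code: 0 on a match, 1 otherwise."""
    # None, 0 and an empty list all count as "nothing found".
    return 0 if results else 1


def run_search(keywords, search=None):
    """Run a search and return an exit code suitable for sys.exit()."""
    if search is None:
        # Imported lazily so the helpers can be used without buku installed.
        import buku
        search = buku.BukuDb().searchdb  # assumption: takes a keyword list
    return exit_code_for(search(keywords))
```

A script could then end with `sys.exit(run_search(sys.argv[1:]))`, so that a shell caller can branch on the exit status.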

It'd be nice to have created_at and updated_at fields in the bookmarks table of the database to timestamp the records. Then you could also search by date range, especially since records' index values don't provide old/new context once they've been moved to fill in holes from deleted records.

created_at would only be set at the time the record is first created. Upon creation, updated_at would mirror created_at but would be updated any time the record is modified (tag changes, updated comment, changed title, etc).

These fields could also be used to retrieve or provide timestamps when importing and exporting browser bookmark files. The example file I was looking at had attributes for ADD_DATE and LAST_MODIFIED, but they were all the same (I assume the times represented correlate with the date/time the export created the file).
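To make the proposal concrete, here is a rough sketch using an in-memory SQLite database. The table layout and the helper names (`add_bookmark`, `update_title`, `created_between`) are hypothetical and are not buku's actual schema; timestamps are passed in as plain integers to keep the example deterministic.

```python
import sqlite3

# Hypothetical schema for illustration only; not buku's real table.
SCHEMA = """
CREATE TABLE bookmarks (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL UNIQUE,
    title TEXT DEFAULT '',
    created_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL
)
"""


def add_bookmark(conn, url, title, now):
    # On creation, updated_at mirrors created_at.
    cur = conn.execute(
        'INSERT INTO bookmarks (url, title, created_at, updated_at) '
        'VALUES (?, ?, ?, ?)', (url, title, now, now))
    return cur.lastrowid


def update_title(conn, rec_id, title, now):
    # Any modification bumps updated_at; created_at is never touched.
    conn.execute('UPDATE bookmarks SET title = ?, updated_at = ? WHERE id = ?',
                 (title, now, rec_id))


def created_between(conn, start, end):
    # Date-range search on creation time, bounds inclusive.
    rows = conn.execute(
        'SELECT id FROM bookmarks WHERE created_at BETWEEN ? AND ? ORDER BY id',
        (start, end))
    return [row[0] for row in rows]


conn = sqlite3.connect(':memory:')
conn.execute(SCHEMA)
```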

jarun commented

@drbraden from the Introduction:

Buku is too busy to track you - no history, obsolete records, usage analytics or homing.

Please read it. It's there for a reason.

Because there is no example of a Python script using buku, I wrote a simple script that prints the HTTP status code of each URL saved in buku. I'm not quite sure where to put it, so I'm posting it here until there are more examples.

The script uses grequests from https://github.com/kennethreitz/grequests

import buku
import grequests

bdb = buku.BukuDb()
recs = bdb.get_rec_all()
recs[0]
# output: (1, 'example.com', 'example', 'tag1,tag2', 'page description', 0)
# Records have following structure:
# - id,
# - url,
# - metadata,
# - tags,
# - description,
# - flags
urls = [x[1] for x in recs]
rs = (grequests.get(u) for u in urls)
gr_results = grequests.map(rs)
for resp, url in zip(gr_results, urls):
    # grequests.map() yields None in place of a response when a request fails
    stat_code = None if resp is None else resp.status_code
    print('{}: {}'.format(stat_code, url))
# output
# 200: http://website1.com/
# None: http://website2.com/
# 200: http://website3.com/
# ...

jarun commented

Very useful. Please add it under the As a library section in README.

I love Buku. What would you think about adding a field to the sqlite3 db for the text of the page? The page's text could be extracted while fetching the title, and all the html tags could be stripped off. It would then be available for searching.

Example: There are several pages that I bookmark to read later or have already read, but the keyword I'm searching for isn't always contained in the title. And sometimes I can't enumerate every possible tag. If I could include the page's text in my search, then I could find my pages faster and easier.
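For illustration, stripping the tags could look roughly like the sketch below, which uses only the standard library's `html.parser`. The `TextExtractor` class and `page_text` function are hypothetical names, and this is not something buku does today; note that text from adjacent inline elements is joined with spaces, which is good enough for keyword search but not for faithful rendering.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {'script', 'style'}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace to single spaces.
        return ' '.join(' '.join(self._chunks).split())


def page_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```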

Is this too much for a bookmark manager? I know it would increase the size of the DB.

jarun commented

Is this too much for a bookmark manager?

Yes. You are asking for a local Google search engine. Search wouldn't remain a plaintext search anymore. The RoI isn't reasonable.

Awesome work here! I'm starting to rely on this more and more, and I'm really amazed how feature rich this is already 👍

One thing I'd like to request, is a flag to specify the db location. I know that it looks in specific locations already - including local directory - but this would let me have multiple, completely separate sets of bookmarks (e.g. Work, personal, temporary/todo). I'm currently achieving this just fine with tags, but the option to logically separate them by file would be a great addition, especially for scripted scenarios.

Thanks again for the great work here!

jarun commented

@absolutejam Issue #171 was the same request. Though I am a bit reluctant, I think it's time to enable it based on public demand. ;)

Suggested Feature. It would be really nice to have some way to get a set of tags automatically assigned when one tag is assigned. Here are some examples.

I could tell buku that whenever I assign the tag "django", it should automatically assign the tags "python" and "programming language"

I could tell buku that whenever I assign the tag "iPhone" to an article, it automatically assigns the tags "Apple" and "Tech Companies"

That way, you don't have to keep typing the list of tags every time. This is one of the main benefits of a folder-based bookmark manager. When you assign a bookmark to a folder, the parent folders are automatically associated with the bookmark.
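A rough sketch of what I mean, assuming a user-maintained rule table. The `IMPLIES` mapping and the `expand_tags` function are made-up names, nothing like this exists in buku as far as I know.

```python
# Hypothetical rule table: assigning the key also assigns every tag in the value.
IMPLIES = {
    'django': {'python', 'programming language'},
    'iphone': {'apple', 'tech companies'},
}


def expand_tags(tags, implies=IMPLIES):
    """Return the given tags plus everything they imply, transitively."""
    result = set()
    pending = [tag.strip().lower() for tag in tags]
    while pending:
        tag = pending.pop()
        # Skipping tags already seen also protects against cyclic rules.
        if tag and tag not in result:
            result.add(tag)
            pending.extend(implies.get(tag, ()))
    return sorted(result)
```

Tags are lower-cased here so that "iPhone" and "iphone" hit the same rule; that is a design choice of the sketch, not a requirement.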

As a second idea, buku could suggest other tags based on your past bookmarks when you add a tag.

jarun commented

That way, you don't have to keep typing the list of tags every time.

Did you notice that tags can be set using simple redirection at the prompt?

This is one of the main benefits of a folder-based bookmark manager.

Yes, but parsing/reading/writing to a filesystem numerous times is inherently slower than using a flat file. And listing tags would be scanning all those directories.

As a second idea, buku could suggest other tags based on your past bookmarks when you add a tag.

We can do this, suggest optionally. When a tag is added find other tags that go together and show the tag names + indices. User can use redirection at the prompt and add the tags.
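Finding tags that go together could be sketched as a simple co-occurrence count. This assumes the tags of each record are available as a comma-separated string (the fourth field of a record, as in the record structure shown earlier in this thread); `suggest_tags` is a hypothetical name, not an existing buku API.

```python
from collections import Counter


def suggest_tags(new_tag, tag_fields, limit=5):
    """Rank tags that most often appear alongside new_tag in existing records."""
    counts = Counter()
    for field in tag_fields:
        tags = {tag.strip() for tag in field.split(',') if tag.strip()}
        if new_tag in tags:
            counts.update(tags - {new_tag})
    # Sort by frequency, then alphabetically, so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:limit]]
```

The returned names could then be listed with indices at the prompt for the user to pick from.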

Would you be interested in contributing the feature?

By redirection, do you mean the --tag flag? Would definitely like more clarity on this.

I agree with point 2, which is why I do like tags better than folders. But a possible alternative is a single file with indented tags showing the hierarchical relationship between tags (ex. Company:Apple:iPhone). Then, when searching, you could have a strict search for the tags assigned to the bookmark only and a fuzzy search, which also includes bookmarks matching to parent or children tags with the results.

Do like the sound of 3. Not sure if I would be able to contribute the feature (pretty novice programmer, so probably unlikely), but can always try to take a look.

Two other questions: Can you add Markdown and URLs in comments field (Most concerned about using the # sign for headings)? Can comments from a bookmark or set of bookmarks be exported to a text file? Buku could be pretty useful for taking more detailed notes on the webpage you have bookmarked while reading them.

jarun commented

By redirection, do you mean the --tag flag? Would definitely like more clarity on this.

No, I meant the smart tag editing with >>, > or << symbols at prompt.

Then, when searching, you could have a strict search for the tags assigned to the bookmark only and a fuzzy search, which also includes bookmarks matching to parent or children tags with the results.

The RoI isn't great.

Not sure if I would be able to contribute the feature (pretty novice programmer, so probably unlikely), but can always try to take a look.

You should if you really want it. I am not sure if I can get back to this.

Can comments from a bookmark or set of bookmarks be exported to a text file?

You have to add a new filter to do that. Check option -f.

Most concerned about using the # sign for headings

Buku could be pretty useful for taking more detailed notes on the webpage you have bookmarked while reading them.

You can add anything in a note. Try the -w option where you can edit it. You can edit a bookmark in your favourite editor.

Okay. Thanks.

Hello!
Thanks for such a good tool.
I would like to implement importing Chrome bookmarks to the Buku database but I'm new to Buku.
Could you please advise where I should start?

jarun commented

@alex-bender are you familiar with the Firefox and Chrome bookmark sqlite schema? If yes, you can check the importdb() API which does all import stuff.

I've checked Chrome bookmarks file. It's simple json. I don't know anything about FF, but let me check.

@jarun Yeah, I found it. Are Buku/.github guides the only guides which I should follow?

jarun commented

I've checked Chrome bookmarks file. It's simple json. I don't know anything about FF, but let me check.

AFAIK the JSON file is the latest backup. Is the data same as that in the bookmark SQLite file?

Are Buku/.github guides the only guides which I should follow?

Yep!

Sorry but I can't find a bookmark SQLite file for Chrome. But the data in the Bookmark file are the same as what I can see in my Bookmark Manager

jarun commented

Oh no, sorry. Data are not the same. You are right.. Looking for the database..

Please confirm if the data are same or not. Consider this scenario: you added a new bookmark in Chrome but didn't close the browser. Do you see that latest bookmark in the json file?

Sorry for the mess. Here is what i've done:

  • Exported bookmarks as html and calculated number of elements there
  • Calculated number of elements in the Bookmarks json file
    They are equal

Also, I've done what you asked, and result is positive: After adding a new bookmark it does appear in Bookmark file without closing the browser.
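For reference, counting the entries in the JSON file can be done with a short recursive walk. This sketch assumes the layout I saw in my file: a top-level `roots` object whose nodes carry `type` (`url` or `folder`), `name`, `url` and `children`. The function names are mine, and the folder names are carried along only because they could later become tags.

```python
import json


def walk(node, folders=()):
    """Yield (url, title, folder_names) for every bookmark at or below node."""
    if node.get('type') == 'url':
        yield node['url'], node.get('name', ''), folders
    for child in node.get('children', []):
        # Pass the enclosing folder names down to the children.
        yield from walk(child, folders + (node.get('name', ''),))


def chrome_bookmarks(text):
    """Parse the contents of a Chrome Bookmarks file and yield its bookmarks."""
    data = json.loads(text)
    for root in data.get('roots', {}).values():
        # Skip any non-folder values that may sit next to the root folders.
        if isinstance(root, dict):
            yield from walk(root)
```

`len(list(chrome_bookmarks(open(path, encoding='utf-8').read())))` then gives the number of bookmarks to compare against the HTML export.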

jarun commented

Awesome!

Then here's what we need to do - check the same with Firefox. If everything looks fine, add a new option to Buku to auto-import. We need to scan the usual locations (on Mac, Linux and Windows) for FF and Chrome. If any of them is not found, ask the user to specify the location of the file (for the specific browser) manually.
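The lookup could be sketched roughly as below, shown for Google Chrome only. The paths are the commonly documented default-profile locations and should be verified on each platform; `chrome_candidates` and `locate` are placeholder names, not existing buku functions. Firefox would additionally need its profile directory discovered before the database file can be found.

```python
import os
import sys


def chrome_candidates(platform=sys.platform, env=os.environ):
    """Likely locations of the default-profile Bookmarks file for Chrome."""
    home = os.path.expanduser('~')
    if platform.startswith('linux'):
        return [os.path.join(home, '.config', 'google-chrome',
                             'Default', 'Bookmarks')]
    if platform == 'darwin':
        return [os.path.join(home, 'Library', 'Application Support',
                             'Google', 'Chrome', 'Default', 'Bookmarks')]
    if platform == 'win32':
        base = env.get('LOCALAPPDATA', '')
        if base:
            return [os.path.join(base, 'Google', 'Chrome', 'User Data',
                                 'Default', 'Bookmarks')]
    return []


def locate(candidates, ask=input):
    """Return the first existing candidate, else ask the user for a path."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    answer = ask('Bookmark file not found; enter its location: ').strip()
    return answer or None
```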

Yeah, FF also updated bookmarks db without closing browser.

jarun commented

Great! Waiting for your PR! 👍

@jarun hello, work is in progress, but you can comment anything you consider important here:
https://github.com/alex-bender/Buku/commit/8102f84fb4a03117594bea076577a0421845eed1
I'm not sure that using webbrowser module for this case is the best decision, but anyway.
Please advise.

@jarun, is there a plan to refactor the parser in the importdb function?

https://github.com/jarun/Buku/blob/730b80f7388fa6d837aaa3c6455ba3387f9c8c60/buku.py#L1648

i want to test #172, but kinda hard with class method.

jarun commented

@alex-bender thanks! I have reviewed.

I'm not sure that using webbrowser module for this case is the best decision

If it gives you the installation location of the browser, go ahead. We already import it. Do you have anything else in mind?

jarun commented

@rachmadaniHaryono please feel free to make the necessary changes and refactor. i will review in the PR. Please ensure you test the following cases:

  1. <a> </a> in desc
  2. <a> some text </a> in desc
  3. Multi-line desc
  4. Multi-line desc followed by another BM data with text.

Please use the bookmark paste I linked in the issue.

@jarun Thanks for reviewing!

If it gives you the installation location of the browser

As far as I understand webbrowser doesn't give us location, only absence of browser.

Do you have anything else in mind?

I have a question according to the naming: should we check browsers like chromium or only chrome?
And a question about profiles: there could be many profiles in Chrome and FF, so should we ask the user about the desired profile or import bookmarks from all of them?

I'll implement rest (+tests, rebase\squash, docs) and will make PR.

jarun commented

As far as I understand webbrowser doesn't give us location, only absence of browser.

OK. I misunderstood earlier. Anyway, it would be local data so we have to check in default location ourselves.

I have a question according to the naming: should we check browsers like chromium or only chrome?

I guess Google Chrome is sufficient as long as we provide the option to specify manually (documentation would play a really significant role in this feature). Everyone from Google to Opera and Yandex has a fork of Chromium nowadays.

so should we ask user about desired profile or import bookmarks from all of them?

Does FF or GC show bookmarks from all profiles in the UI? If yes, we grab all!

@jarun here is GC behavior: In my case two different profiles (chrome://settings/ People section) have different config folders: ~/.config/google-chrome/Default and ~/.config/google-chrome/Profile 1.
Adding bookmark to one of them -- does not affect another one.

jarun commented

Thanks! Here's what I think - if a user has 10 profiles all of the bookmarks belong to him. I think it's OK to import all the bookmarks. I think even when FF/GC is installed and it imports bookmarks from other browsers, it imports all profiles. Can you please verify this?

jarun commented

By user above I meant OS user, not GC profile/user.

But what if there are different profiles for different family members? In case when they are using single OS account (Like admin on windows)
I suppose that we should ask user which of the profiles we should work with.

jarun commented

You are saying they are using the same login but Chrome is customized for each to have different profiles? Sorry, I think the idea is very far fetched and a big privacy issue. It's like two people using a single diary to write their journals.

Anyway, what happens in the install -> auto import scenario? We can follow that.

Ok, I'll check auto import scenario.

jarun commented

From this discussion thread it seems like they only import from the default profile. We can do the same.

jarun commented

No, I guess it's not absolute.

I see here 3 options:

  • Import only default
  • Import everything we've found
  • Ask user which accounts should be imported.

Of course first one is the simplest one. We can stick with it and see what'll happen next.

jarun commented

Yes, let's go default. We'll document this under operational notes. As there will be an option to specify manually I don't see a problem.

jarun commented

Please open a new issue for this. We can continue discussing there. I will roll this general discussion thread.

jarun commented

Rolled at #174