jarun/buku

ToDo list

jarun opened this issue · 51 comments

jarun commented

Continued from #251.

Notes

The list below is a growing one. While suggesting new features please consider contributing to Buku. The code is intentionally kept simple and easy to understand with comments. We'll be happy to assist any new contributor. We need your help!

Some of the fresh-baked features may not have been released yet. Grab the master branch for those.

Identified tasks

  • Add option --preserve to ignore specific fields during auto-update [see API update_rec() and issue #327]
  • Import keywords from browser html (see #311)
  • bukuserver: generate separate package (see #307)
  • bukuserver: add login feature (see #309)
  • Test cases for exportdb() API (@rachmadaniHaryono)
  • Fix tests that take a long time to finish (@rachmadaniHaryono)
  • Android app (with the same schema) (probably a separate project; we'll be glad to help)
  • Import firefox exported json (title, url, tags, desc; parent folder if opted) [see API importdb()] (thanks @ckolumbus)
  • Disable fetch from web during auto-import, import and merge
  • Honor -n with -p (thanks @jpdasma)
  • Support search options with --export
  • Import "description" (into description field) and "keywords" (as tags) from HTML <meta> Tag
  • Result pagination
  • Convert bukuserver README from rst to markdown
  • Browse a bookmark (possibly dead URL) on Wayback machine
  • Show bookmarks to be deleted before deletion
  • Support keyword filtering (records having keywords a and b but not c and d) (thanks @saltyCatfish)
  • Support filtering by tags with search options (see #250) (thanks @saltyCatfish)
  • Port feature from googler/ddgr - omniprompt key O
  • Show results with most search keyword matches on top (thanks @mosegontar)
  • Text-mode user agent for Buku
  • Copy search result URL to clipboard
  • Read default Firefox profile name from profiles.ini (see #212, thanks @alex-bender)
  • Support --format in search results (ref, thanks @mosegontar)
  • API documentation (comments need to be in NumPy format) (thanks @mosegontar)
  • Auto-import: optionally add parent folder name as tag, ask for unique tag [like importdb()]
  • Support custom colours (thanks @shv-q3)
  • Generate packages on Travis-CI using PackageCore (see #189) (thanks @shaggytwodope)
  • Search multiple tags, exclusion in tag search (thanks @mosegontar)
  • Auto-import Firefox and Google Chrome bookmarks (thanks @alex-bender)
  • Suggest tags that go together
  • Append/overwrite/remove tags from prompt
  • Add more tests (ongoing activity @rachmadaniHaryono)
  • A browser plugin (thanks @samhh for bukubrow)
  • Text editor support (thanks @ZwodahS)
  • Need a PyPI maintainer (thanks @shaggytwodope)
  • Make refreshdb faster using threads (record updates should be synchronized)
  • Show usage count in tag list
  • Proxy support (thanks @denisfalqueto)
  • Continuous search at prompt
  • Add prompt help
  • Specify custom DB file to class BukuDb (library usage, no exposed option)
  • Move to urllib3
  • Handle redirects using referrer masking. Example URL. Fixed with urllib3.
  • Support URL shortening. This helps to share URLs. (see #92 for limitations)
  • Make a bookmark title immutable via refreshdb()
  • Markdown import/export
  • Regex search
  • Ubuntu PPA (thanks @shaggytwodope)
  • Export specific tags to HTML
  • Exact word match using REGEX. Make substring match optional.
  • Delete all records based on a search result
  • Delete multiple items, support combination of indices and ranges
  • Append tags
  • Travis CI integration
  • Ubuntu deb package generation on new tag
  • Merge bookmark database files (for users who work on multiple systems)
  • Export bookmarks in FF or Chrome html format.
  • Option to add folder names as tags while importing HTML (thanks @Mohammadkhalifa)
  • Check and show upstream version
  • Anything else which would add value (please discuss in this thread)

model 1 is the currently used template; see either #251 or the master branch

model 2

buku server - chromium_013

model 3

buku server - chromium_014


i'm playing with the bookmark view to see if there is anything that can be changed

i actually want to ask users about the current view

  • does it need favicon?
  • do they want to create their own view or only configure each part of bookmark view?
  • how do they want to view url? full, shortened, netloc only or None?
  • do they want to see description?
  • do they want to see the tag on the same cell (model 1, and model 2) or on different cell (model 3)?

also netloc only for url is inspired by reddit.

  • this can be expanded further by adding a netloc filter
  • works really well only on entries with a good title
  • could eliminate the url entry
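The URL display options asked about above (full, shortened, netloc only, or none) can be sketched as a single helper. This is only an illustration: the mode names and the 30-character cut-off are assumptions made for this example, not existing bukuserver settings.

```python
from urllib.parse import urlparse


def display_url(url, mode="netloc"):
    """Return the text to show for a bookmark URL.

    The mode names are hypothetical; they mirror the options listed in the question.
    """
    if mode == "none":
        return ""
    if mode == "full":
        return url
    if mode == "netloc":
        # "https://github.com/jarun/Buku" becomes "github.com"
        return urlparse(url).netloc
    if mode == "shortened":
        # Arbitrary 30-character limit, chosen only for illustration.
        return url if len(url) <= 30 else url[:27] + "..."
    raise ValueError("unknown mode: " + mode)
```

A netloc filter, as suggested in the list, could then compare `urlparse(url).netloc` across bookmarks.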

also i found a bug where bukuserver can't find chart.js, but i can't reproduce it.

jarun commented

I think model 2 without favicon and short netloc is model 1, right?

  • if we add favicon do we have to fetch for every bookmark each time?
  • only configure each part of bookmark view should be fine. We should see if we really need to provide any configuration options. We should keep minimal.
  • keep netloc only, and show full link on mouse hover. Saves space.
  • I think we can have a config to show or not show description OR we can limit the desc to 1 line by default and expand on click (I am not sure of the effort).
  • same cell as in model 1 or 2. If there are too many tags the vertical height would be too much.

if we add favicon do we have to fetch for every bookmark each time?

kinda. basically it just adds a filter to get the netloc and adds this line to the html

<img src='http://www.google.com/s2/favicons?domain={netloc}' />

related https://stackoverflow.com/questions/5119041/how-can-i-get-a-web-sites-favicon
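A minimal sketch of that filter, assuming it receives the bookmark URL as a string. The function name `favicon_img` is made up for this example. Note that with this approach the browser, not bukuserver, requests the icon when the page is rendered, and it can cache the result.

```python
from html import escape
from urllib.parse import quote, urlparse


def favicon_img(url):
    """Build the <img> tag quoted above for the netloc of a bookmark URL."""
    netloc = urlparse(url).netloc
    # Percent-encode the netloc so it is safe inside the query string.
    src = "http://www.google.com/s2/favicons?domain=" + quote(netloc)
    # Escape the attribute value before placing it in the HTML.
    return "<img src='{}' />".format(escape(src, quote=True))
```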

keep netloc only, and show full link on mouse hover. Saves space.

netloc only is easy. to show the link there are actually 2 options.

...we can limit the desc to 1 line by default and expand on click (I am not sure of the effort).

for expand on click maybe bootstrap's collapse https://getbootstrap.com/docs/3.3/javascript/#collapse

only configure each part of bookmark view should be fine. We should see if we really need to provide any configuration options. We should keep minimal.

I think we can have a config to show or not show description OR we can limit the desc to 1 line by default and expand on click (I am not sure of the effort).

2 options for configuring bukuserver

  • use os env variable. this is similar to flask setting
  • use config file. load config file first time server started
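The two options can also be combined. The sketch below reads an optional JSON config file once at start-up and then lets environment variables override it. The setting names, the `BUKUSERVER_` prefix and the JSON format are assumptions made for this example only.

```python
import json
import os

# Hypothetical setting names and defaults, used only for illustration.
DEFAULTS = {"per_page": 10, "url_render_mode": "full"}


def load_settings(environ=None, config_path=None):
    """Merge defaults, an optional JSON config file and environment variables."""
    environ = os.environ if environ is None else environ
    settings = dict(DEFAULTS)
    # Config file option: loaded once, when the server starts.
    if config_path and os.path.isfile(config_path):
        with open(config_path, encoding="utf-8") as handle:
            settings.update(json.load(handle))
    # Environment option: variables such as BUKUSERVER_PER_PAGE take precedence.
    for key, default in DEFAULTS.items():
        env_name = "BUKUSERVER_" + key.upper()
        if env_name in environ:
            # Convert the string from the environment to the default's type.
            settings[key] = type(default)(environ[env_name])
    return settings
```

Letting the environment win keeps one-off overrides easy while the file holds the long-lived configuration; the opposite precedence would work just as well.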
jarun commented

@rachmadaniHaryono once these are done we can plan for the next release. The server seems to be in good shape. What do you think?

OK hopefully this will be the last thing before next release

jarun commented

Cool!

Hello, please consider the following issues/ideas.

  • Browser integration through add-on: I don't think Bukubrow's Cargo.lock structure will work with *BSD OS, but don't quote me on that. Rusqlite will have to be a separate port, plus there needs to be a list of Depends under "Building".

Wish list suggestions (by order of importance):

  • http front end so as to access BM's from tablet/phone (example: searx/webapp.py). The Rest API probably does this but documentation is missing?
  • Scrapbook function: Preserve page or site, index it as special BM with option to not refresh (immutable).
  • User based authentication if needed, for multiple users from webapp interface. Probably difficult, as DB structure will need some modification, location of buku.sqlite becomes issue, and "run-as user" must be addressed?

These probably don't belong here, but I'll mention them anyway:

  • PDF: Save single web page as pdf, then text search ability for saved/indexed pdfs.
  • Note taking: If working on a specific topic, you want to jot down notes and place urls relevant to topic inside the notes.

http front end so as to access BM's from tablet/phone (example: searx/webapp.py). The Rest API probably does this but documentation is missing?

currently working on that bukuserver.

@tom-i mentioned he can run it on an espresso server and access it.

to run it locally and make it accessible to devices on the same network, just host it on 0.0.0.0

User based authentication if needed, for multiple users from webapp interface. Probably difficult, as DB structure will need some modification, location of buku.sqlite becomes issue, and "run-as user" must be addressed?

this feature may or may not be included in bukuserver's goals. the actual reason it is being developed is just to have an accessible frontend through a hosted server.

currently bukuserver doesn't need an additional database, and it may be kept that way for some time.

adding more features is possible as long as they are simple (see the statistic page for example)

Scrapbook function: Preserve page or site, index it as special BM with option to not refresh (immutable).

PDF: Save single web page as pdf, then text search ability for saved/indexed pdfs.

not quite sure i can say much about this, but there is a similar project that is more focused on archiving links, see https://github.com/pirate/bookmark-archiver

it has pdf, webarchive.org and html saving.

there is a planned search feature but it looks like it will not be merged

ArchiveBox/ArchiveBox#24

e: related, maybe it's possible to create a buku integration with this?

https://www.gregmcleod.com/rip-xmarks-hello-syncmarx/

jarun commented

I don't think Bukubrow's Cargo.lock structure will work with *BSD OS

Please check with the Bukubrow project on this one. And to be frank I don't see the need for another browser plugin. You can add any bookmarks you want directly to buku from the browser OR import later OR export from Buku to the browser later.

Scrapbook function: Preserve page or site, index it as special BM with option to not refresh (immutable).

@rachmadaniHaryono I believe he is requesting for this flag on the webpage. We already have immutable support at an API level.

User based authentication if needed, for multiple users from webapp interface. Probably difficult, as DB structure will need some modification, location of buku.sqlite becomes issue, and "run-as user" must be addressed?

Is there a web-plugin with its own database that can handle this? Otherwise, we can add this to the ToDo list for bukuserver. It makes the server operations truly private.

PDF: Save single web page as pdf, then text search ability for saved/indexed pdfs.

Re-iterating - we are not writing a private google search engine alternative. Use the notes field for additional text.

Note taking: If working on a specific topic, you want to jot down notes and place urls relevant to topic inside the notes.

Please use the notes option and relevant tags.

tom-i commented

Bukuserver being accessible from the local network is an awesome feature, but the next level (from my side) would be to open a port (5001) on my router and access the bukuserver page from the Internet.
So I have questions:

  1. Is it safe to open that port to the Internet?
  2. Are there any security issues for the front end? I mean, could someone hack the page and easily get access to my router?
  3. If there is new update (via git or so) for bukuserver / buku, should I install newer version via pip3 again? Or which steps I should follow? E.g. uninstall older version via pip3 and then git pull repository and then install new bukuserver?

Is there a web-plugin with its own database that can handle this? Otherwise, we can add this to the ToDo list for bukuserver. It makes the server operations truly private.

If additional database is on the plan, I'm thinking about adding this https://pythonhosted.org/Flask-Security/quickstart.html

Is it safe to open that port to the Internet?

  • currently no
    • forms are not using csrf protection
    • everyone can edit / view your saved bookmarks
    • the app's secret key is a random number
    • other security issues I'm not aware of

Are there any security issues for the front end? I mean, could someone hack the page and easily get access to my router?

The main issue when opening the port to the internet is user access, as mentioned above.
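On the random secret key point, one common pattern is to read a fixed key from the environment and fall back to a random one only when none is set. The sketch below assumes a hypothetical variable name, `BUKUSERVER_SECRET_KEY`. It does not address the missing login or CSRF protection, so on its own it does not make the server safe to expose.

```python
import os
import secrets


def get_secret_key(environ=None):
    """Return a stable secret key if one is configured, else a random one."""
    environ = os.environ if environ is None else environ
    key = environ.get("BUKUSERVER_SECRET_KEY")  # hypothetical variable name
    if key:
        return key
    # Fallback: a fresh random key, which invalidates sessions on every restart.
    return secrets.token_hex(32)
```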

If there is new update (via git or so) for bukuserver / buku, should I install newer version via pip3 again? Or which steps I should follow? E.g. uninstall older version via pip3 and then git pull repository and then install new bukuserver?

Cmiiw on this

Using git + virtualenv will always get the latest files, which are put in the local main folder, so it is no problem

I still have doubts about pip installation, but it should also update bukuserver in the user installation folder if it is installed correctly (not tested for installation using sudo pip, but that is discouraged anyway)

Hi and thanks for the input.

Re my PDF/Notes comment: Was only hoping to get some input on how to solve this.

@rachmadaniHaryono suggestion looks like the exact thing I was hoping for - much thanks!

Please use the notes option and relevant tags.

That would be the reverse of what I'm trying to figure out.
As mentioned, it was a shot in the dark to see if any idea would surface.

You can add any bookmarks you want directly to buku from the browser

You're talking about key-binding + shell script here as I understand?
Direct-access to the sqlite db would be a personal preference at read/review stage. I'll take it up on Bukubrow issues.

HTTP server-side (Flask-Security) would be a better choice IMHO; it probably has better credential check integration, and one could define a custom db path for each user from there so that Buku itself would not need modification for multi-user support.

@tom-i: I would suggest structuring it differently and NOT exposing your servers directly to internet unless you have a specific reason for public instances.

  • Install some VPN software on device facing the internet (or forward the traffic from router to VPN server).
  • Create self-signed certificates and install client certs to your laptop/tablet.
  • Connect to your home LAN through VPN (also means traffic gets encrypted) and access bukuserver as if you were resident on your LAN.

Thanks again & regards

As a note, there is this doc on how to deploy flask http://flask.pocoo.org/docs/1.0/deploying/ but I haven't tried it yet (mostly due to my internet connection issues)

https://github.com/rachmadaniHaryono/Buku/tree/feature/append-tag-on-import

current behaviour

  • import file
  • bookmark exist
  • nothing changed

in my use case, i move the folder and expect that to be reflected in the buku tags as well.

there are actually two options, replace or append, but i chose append to avoid any step that can't be undone and the risk of losing tags.

currently the branch only works on html import

jarun commented

Are duplicate tags parsed out and final tags ordered?

If you call parse_tags() all the duplicates will be removed and they will be ordered.

Yes, I think preserving all tags is a great idea.

If you call parse_tags() all the duplicates will be removed and they will be ordered.

tags = row[1] + tags_in[1:]
tags = parse_tags([tags])

doesn't that part of the append_tag_at_index method already remove duplicate tags?

jarun commented

Yes it does. Just wanted to ensure. Didn't look at the code. :)
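The append behaviour discussed here can be illustrated with a simplified stand-in. This is not buku's `parse_tags()`; it only assumes, as the snippet above suggests, that a tag string is comma-delimited with a leading and trailing delimiter.

```python
DELIM = ","  # assumed delimiter; tag strings are assumed to look like ",tag1,tag2,"


def merge_tags(existing, incoming):
    """Append incoming tags to existing ones, dropping duplicates and sorting."""
    names = (existing + DELIM + incoming).split(DELIM)
    # Ignore the empty pieces produced by leading, trailing or doubled delimiters.
    unique = sorted({name.strip() for name in names if name.strip()})
    if not unique:
        return DELIM
    return DELIM + DELIM.join(unique) + DELIM
```

For example, merging `",a,b,"` with `",b,c,"` yields `",a,b,c,"`, so a bookmark that is imported again after moving to another folder keeps its old tags and gains the new one.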

Hi Jarun, I am newbie to opensource. Please give me beginner tasks.

jarun commented

@kirakrishnan I guess the core library is already quite matured and stable. I was thinking of pagination but I guess the server with GUI already does it very well.

@rachmadaniHaryono do you have anything in the server side... or need help with testing the features?

can't think anything simple for server.

imo the next step should be designing the additional database, adding more javascript capability, pluginize some parts

jarun commented

@kirakrishnan let us know if you are interested in the items @rachmadaniHaryono mentioned and more importantly, you think you can handle those. If not sure, feel free to ask.

@jarun I am interested in the items. I am very comfortable with sql and I have some hands-on experience in javascript. Please tell me little more about the task.

jarun commented

@rachmadaniHaryono can you check if it's possible to create small tasks for the items?

@jarun @kirakrishnan not quite sure how small it is, but here is some items that can be done:

  • ajax call to edit/create bookmark
  • example urls for bookmark
jarun commented

@rachmadaniHaryono seeing that you have responded 4 days back without any reply yet, I believe @kirakrishnan has demonstrated his enthusiasm. Please work on the ideas yourself.

People tend to love the idea they would write amazing code in amazing projects, only they never really find the time to start. ;)

will be late for this and not sure how long. this may affect that image/screenshot task for bukuserver. hope some irl problem will be resolved fast so i can get back to it.

jarun commented

will be late for this and not sure how long.

no hurry! we just made a release.

this may affect that image/screenshot task for bukuserver.

we need the screenshot as soon as possible. people are visiting the page and don't know what to expect. not everyone would install a server by looking at the description, they need something more attractive than that.

@jarun some reply here to not make it cluttered


a481e9c

name and description should be separated.

not quite sure why it have to be changed to md, but i will follow it.

also unrelated to this edit, but this part should be added

#278


aea87cb

i'm ok with moving it imgur.


88b0467

ok with that


also i will hopefully be able to replace the included js file this week based on @szlin's advice


e: unrelated to this, but imgur is blocked here in indonesia. i don't know how many indonesian buku users are affected.

jarun commented

Updated the documentation at c1c43a0. Please review.

Moved it to markdown as the main page is also in markdown.

Is there a working alternative to imgur? I used another site earlier but it went down. imgur still lives. ;)

The update is ok

Moved it to markdown as the main page is also in markdown

Ok that is fine

Is there a working alternative to imgur? I used another site earlier but it went down. imgur still lives. ;)

Nothing as simple as imgur. Maybe keep imgur for now and add an alternative when users ask

jarun commented

OK!

jarun commented

Can you please add server-side identified tasks to the task list? In case someone wants to contribute.

  • designing the additional database,
  • adding more javascript capability,
    • ajax call to edit/create bookmark
    • example urls for bookmark
  • pluginize some parts
  • use non minified JavaScript
  • use buku library
  • login/permission

That is just copy paste from comments above.


Some idea

  • image screenshot
  • web status (404, 200)
  • GitHub link details (programming language, stars, etc.)

Some crazy idea that I write, so I don't forget it next time (maybe idea for plugin)


Can't go into too much detail because I'm on mobile right now

jarun commented

I think you'll have to make the points more elaborate. For example:

  • designing the additional database - which database, where does it help etc.
  • adding more javascript capability - which capability

like that.

tom-i commented

Hello guys,
is it possible that I have 2 different versions of bukuserver?
I updated bukuserver yesterday and today too, killed the running bukuserver process and ran bukuserver via:

source env/bin/activate
bukuserver run --host 0.0.0.0 --port 5000

I saw that there are some icons before URLs, so I realized, that it's some kind of new feature.
Then I restarted my NAS server, and the bukuserver started via cron after reboot:
@reboot cd /home/nashome/programs/Buku/bukuserver/env/bin && ./bukuserver run --host 0.0.0.0 --port 5000
showed no website icons before the URLs on the bookmarks page.
So can I have different versions there? Or how should I start bukuserver via cron correctly, please?
Thx guys

EDIT:
Ok, so I've made some small shell script with

source env/bin/activate
bukuserver run --host 0.0.0.0 --port 5000

inside and I run it each reboot via cron now.

But the question is still unanswered: why do I have 2 versions there? :)

this is just a guess, but the buku installation in this virtualenv /home/nashome/programs/Buku/bukuserver/env/bin is not updated. after updating buku to the latest version, try running pip install -e . once again in the virtualenv to reinstall it.

based on your description, the upgrade should be

  • kill bukuserver running instance
  • activate virtualenv
  • git pull && pip install -e . in the buku repo, or just run pip install -U buku to get the latest version
  • restart
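To check the guess that the cron job picks up a stale install, it may help to print which interpreter is running and which version of the package it sees, once from the activated virtualenv and once from the environment cron uses. A minimal sketch, assuming the distribution is named `buku` and Python 3.8 or later:

```python
import sys
from importlib import metadata


def describe_install(package="buku"):
    """Report the running interpreter and the installed version of a package."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed for this interpreter.
        version = None
    return {"interpreter": sys.executable, "prefix": sys.prefix, "version": version}
```

Calling `print(describe_install())` in both environments and comparing the output shows whether they resolve to the same interpreter and the same installed version.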

e:

is it possible that I have 2 different versions of bukuserver?

yes: create 2 different virtualenvs and install the version/commit you want in each one

i.e. using v3.7 and v3.8

  • mkvirtualenv venv1
  • workon venv1
  • git checkout v3.7
  • pip install -e .
  • deactivate
  • mkvirtualenv venv2
  • workon venv2
  • git checkout v3.8
  • pip install -e .
  • deactivate

(based on your description) the cron command would be

@reboot cd /home/nashome/programs/Buku/bukuserver/venv1/bin && ./bukuserver run --host 0.0.0.0 --port 5000
@reboot cd /home/nashome/programs/Buku/bukuserver/venv2/bin && ./bukuserver run --host 0.0.0.0 --port 5001

not sure the last part is correct because i use a different virtualenv setup, but basically just put in the correct virtualenv path for your configuration.

also use different port or host

Random thought :

This part is not filled yet

https://github.com/jarun/Buku/community

Also Imo #287 can be closed now

jarun commented

I saw that earlier but left untouched. Closed the issue.

jarun commented

I don't think they will remove the particular field from the specification. They will probably stop importing the description field.

jarun commented

I have added a new item in the ToDo list, thanks to your link. ;)

doc for new feature taken from #298

Noting down stuff we need to handle if we do this (to be done gradually, not in one PR):

- Trigger auto-fetch of tags and description only when the options are not provided in cmdline
- Provide options in editor to update description and tags from web
- Extend immutable to keep description and tags untouched
- ~Limit auto-fetched tags to certain word count?~
- Flag `immutable` will now apply to title, desc and tags.
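One possible reading of these rules, written as a small decision helper. The function name and the way the rules are combined are assumptions made for illustration; this is not how buku implements it.

```python
def fields_to_refresh(title=None, desc=None, tags=None, immutable=False):
    """Return the fields that may be auto-fetched from the web.

    A field is fetched only when no value was given on the command line,
    and an immutable record keeps title, desc and tags untouched.
    """
    if immutable:
        return []
    provided = {"title": title, "desc": desc, "tags": tags}
    return [name for name, value in provided.items() if value is None]
```

With this reading, passing only a title leaves `desc` and `tags` to be fetched, while setting `immutable` fetches nothing.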

doc:

doc on slow test

based on this rachmadaniHaryono/Buku@f6cbbe8 here are slow tests

124.81s call     tests/test_bukuDb.py::test_delete_rec_range_and_delay_commit
61.51s call     tests/test_bukuDb.py::TestBukuDb::test_search_by_multiple_tags_search_all
61.49s call     tests/test_bukuDb.py::TestBukuDb::test_add_rec
61.41s call     tests/test_bukuDb.py::TestBukuDb::test_search_by_multiple_tags_search_any
61.38s call     tests/test_bukuDb.py::TestBukuDb::test_search_keywords_and_filter_by_tags
61.36s call     tests/test_bukuDb.py::TestBukuDb::test_search_by_tags_exclusion
61.32s call     tests/test_bukuDb.py::TestBukuDb::test_get_rec_id
61.25s call     tests/test_bukuDb.py::TestBukuDb::test_append_tag_at_index
61.24s call     tests/test_bukuDb.py::TestBukuDb::test_searchdb
61.24s call     tests/test_bukuDb.py::test_delete_rec_on_non_interger[a-a-1-True]
61.24s call     tests/test_bukuDb.py::TestBukuDb::test_suggest_tags
61.23s call     tests/test_bukuDb.py::TestBukuDb::test_delete_tag_at_index
61.23s call     tests/test_bukuDb.py::TestBukuDb::test_append_tag_at_all_indices
61.22s call     tests/test_bukuDb.py::TestBukuDb::test_replace_tag
61.22s call     tests/test_bukuDb.py::test_delete_rec_index_and_delay_commit
61.21s call     tests/test_bukuDb.py::test_compactdb
61.21s call     tests/test_bukuDb.py::TestBukuDb::test_get_rec_by_id
61.20s call     tests/test_bukuDb.py::TestBukuDb::test_search_by_tag
61.20s call     tests/test_bukuDb.py::TestBukuDb::test_search_and_open_in_broswer_by_range
61.19s call     tests/test_bukuDb.py::TestBukuDb::test_search_and_open_all_in_browser
60.33s call     tests/test_bukuDb.py::test_print_rec_hypothesis
14.83s call     tests/test_bukuDb.py::test_browse_by_index
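The listing above appears to follow the layout of a pytest durations report (seconds, phase, test id). A short sketch for parsing such lines and picking out tests above a threshold, which may help when deciding which tests to look at first:

```python
import re

# Matches lines such as "61.49s call     tests/test_bukuDb.py::TestBukuDb::test_add_rec".
LINE = re.compile(r"^(?P<seconds>\d+(?:\.\d+)?)s\s+(?P<phase>\w+)\s+(?P<test>\S+)$")


def parse_durations(text):
    """Turn a durations listing into (seconds, phase, test id) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            rows.append((float(match.group("seconds")), match.group("phase"), match.group("test")))
    return rows


def slower_than(rows, threshold):
    """Return the ids of tests that took at least `threshold` seconds."""
    return [test for seconds, _phase, test in rows if seconds >= threshold]
```

Most entries cluster just above 61 seconds, which could point to a shared fixed cost rather than many independently slow tests, though the listing alone does not confirm that.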

related pr

notes:

I have some free time this weekend. I'll try to work on -n and -p if nobody is still doing it. :D

jarun commented

@jpdasma please go ahead. That would be a great boost to user experience.

T4P4N commented

using urwid for frontend? argument interface is good but sometimes it's painful to edit bookmarks. i think it should be in todo list.

jarun commented

Sorry, at this stage of the project I wouldn't like someone to spend valuable bandwidth on interface. There are many features to be implemented.

jarun commented

Rolled at #343.