sunaku/dasht

Versioned Docsets

RobertAudi opened this issue · 12 comments

This is related to #4. In Dash, there is the possibility to install a specific version of a docset (e.g.: Ruby 2.2.4 docset). Looking into the docset feeds, I can see that there is a <other-versions> tag in (some) docset files (e.g.: the Ruby 2 docset), but I couldn't figure out how to retrieve pre-built specific versions.

One workaround – which would apply to #4 as well – would be to be able to import local docsets by looking in common places.
For instance, on OS X, Dash docsets are in "$HOME/Library/Application Support/Dash", which contains a "Versioned DocSets" directory:

"$HOME/Library/Application Support/Dash/"
├── "Cheat Sheets"
├── "DocSets"
├── "User Contributed"
└── "Versioned DocSets"
    └── "Ruby 2 - DHDocsetDownloader"
        └── "2-2-4"
            └── "Ruby.docset"

Or maybe an additional configuration variable could be available to specify additional lookup paths.

Try the issue-15 branch which searches $DASHT_DOCSETS_DIR recursively, resolving symlinks therein. 🎅 To use this, go into your $DASHT_DOCSETS_DIR and create symlinks to places that contain (either directly within or anywhere beneath) your *.docset folders.

However, note that ^ anchors in docset name regexps are currently broken in this branch because they're matched against an entire file path instead of just the docset name. 😓 I'll need to think of a better solution for that (sigh... if only POSIX shell had a filter_by(lambda) ✨ operation... 😞).

Fixed ability to use ^ anchors in docset name regexps now! 😺

That is really cool. Just tried it and it works. 👏

By the way, I think too many users take the work of project owners/maintainers for granted. I don't. Thanks a lot for the short response time and more importantly, thanks a lot for this amazing project. I think people don't say this enough.

Thanks for your kind words. 😺 I'm glad to help and to hear that it worked correctly. :neckbeard:

This feature needs more refactoring before I can release it, maybe next weekend. 🎁

Hey @RobertAudi, I noticed that this approach slows down dasht on my low-powered system. 🐌

So I want to try creating a dasht-docsets-inherit script that looks for existing docsets in specified locations and symlinks them (flat) into dasht's docset installation directory. 💡 This way, we don't have to pay the price of searching the filesystem with find every time docsets need to be enumerated or searched. :neckbeard:

Would this be an acceptable solution for this problem?
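The flat-symlink idea above could be sketched roughly as follows. This is only an illustration, not the actual dasht-docsets-inherit script: the function name link_docsets and its arguments are hypothetical, and it assumes docsets are directories whose names end in .docset.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real dasht-docsets-inherit script.
# Usage: link_docsets DEST SRC...
# Symlinks every *.docset folder found at or beneath each SRC directly
# (flat) into DEST, so later lookups never need to walk the filesystem.
link_docsets() {
  dest=$1; shift
  mkdir -p "$dest" || return
  for src do
    # Resolve to an absolute path so the symlink targets stay valid.
    src=$(cd "$src" 2>/dev/null && pwd) || continue
    # -prune stops find from descending into a docset once it is matched.
    find "$src" -type d -name '*.docset' -prune -exec sh -c '
      dest=$1; shift
      for docset do
        link=$dest/${docset##*/}
        if [ -e "$link" ] || [ -L "$link" ]; then
          echo "skipped, name already taken: $docset" >&2
        else
          ln -s "$docset" "$link"
        fi
      done
    ' _ "$dest" {} +
  done
}
```

Because the links are flat, two docsets with the same folder name (for example, two versions of Ruby.docset) collide; this sketch keeps the first one found and reports the rest on stderr.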

As far as I'm concerned, any working solution would be acceptable. However, when I think of "the end user", I think an acceptable solution would be one that is transparent, or at least seamless.

In that sense, I believe that the location of existing docsets can often be inferred, and thus the whole symlinking process can be transparently taken care of.

What I mean by inferred is the following: IF there are any existing docsets, then they are likely to have been installed by either Dash (on OS X) or Zeal (on *nix and Windows). Those two applications install docsets in specific default locations which, as I said before, can be inferred.

The inference would only take place when installing new docsets, it would just be (virtually) additional docsets "repositories".

The way I see it, it's similar to the way "taps" work in Homebrew, or PPAs in Debian-based systems, except that the "repositories" are local. And this could potentially be a gateway to a much more powerful docsets packaging system.

(Sorry for the formatting and the lack of emojis, I'm writing this on my iPhone :/ )

As a follow-up, I know that my last comment is more relevant to #4 than versioned docsets per se, but the lookup logic is the same in my opinion: Versioned docsets (essentially legacy/archived official docsets) ARE third party docsets.

Hey @RobertAudi, thanks for your feedback. 📞 After much consideration, I decided to implement the symlink approach because it requires the least amount of changes to the codebase. 🚶 Please try the new "inherit" branch instead of the old "issue-15" branch. 🎅

@RobertAudi Please try the issue-15-try3 branch, which introduces a new environment variable:

DASHT_DOCSETS_PATH
Defines additional filesystem locations where [Dash] docsets may be found.
These locations are not recursively searched and they must be delimited
by one or more colon : characters, like the PATH environment variable.

You can set it to include all locations that contain *.docset folders, such as:

export DASHT_DOCSETS_PATH=$(
  find "$HOME/Library/Application Support/Dash/" -type d -name '*.docset' | 
  sed 's|[^/]*$||' | sort -u | tr '\n' :
)
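To illustrate the colon-delimited, non-recursive lookup described above, here is a minimal sketch of how such a value might be walked in POSIX shell. It is not necessarily how the issue-15-try3 branch does it; the function name list_docsets_in_path is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: print every *.docset folder that sits directly
# inside each entry of a colon-delimited list (no recursion).
# Empty entries, such as those left by repeated or trailing colons,
# are skipped.
list_docsets_in_path() {
  old_ifs=$IFS
  IFS=:
  set -f            # keep entries containing glob characters literal
  set -- $1         # split the list on colons into $1, $2, ...
  set +f
  IFS=$old_ifs
  for dir do
    [ -n "$dir" ] || continue
    dir=${dir%/}    # tolerate a trailing slash on the entry
    for docset in "$dir"/*.docset; do
      if [ -d "$docset" ]; then
        printf '%s\n' "$docset"
      fi
    done
  done
}
```

For example, `list_docsets_in_path "$DASHT_DOCSETS_PATH"` would list the docsets visible through the variable set above, including its trailing colon and trailing slashes.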

Let's get this long awaited issue moving again! 👷‍♂️ 🚀 ✨

@sunaku Sorry for the super slow response time, I finally tried the issue-15-try3 branch. It does partially work. Here is a list of remaining problems.

Problems

Broken links

Some docsets are present in the form of an archive with the name tarix.tgz. For those docsets, all links will be broken because the *.docset/Contents/Resources/Documents/ directory doesn't exist. The tarix.tgz contains that exact directory, so unarchiving it would solve this problem automagically.
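A rough sketch of that unarchiving step, under two assumptions that are not confirmed here: that the archive lives at *.docset/Contents/Resources/tarix.tgz, and that somewhere inside it there is a Contents/Resources/Documents directory. The function name unpack_tarix is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: restore the Documents directory of a docset that
# ships its pages as tarix.tgz.
# ASSUMPTIONS: the archive is at Contents/Resources/tarix.tgz and it
# contains a .../Contents/Resources/Documents directory at some depth.
unpack_tarix() {
  docset=${1%/}
  res=$docset/Contents/Resources
  [ -f "$res/tarix.tgz" ] || return 0   # nothing to unpack
  [ -d "$res/Documents" ] && return 0   # already unpacked
  work=$(mktemp -d) || return
  status=1
  if tar -xzf "$res/tarix.tgz" -C "$work"; then
    # Locate the Documents directory wherever the archive rooted it.
    docs=$(find "$work" -type d -path '*/Contents/Resources/Documents' -prune | head -n 1)
    if [ -n "$docs" ] && mv "$docs" "$res/Documents"; then
      status=0
    fi
  fi
  rm -rf "$work"
  return "$status"
}
```

It is probably safer to run this on a copy of the docset rather than inside Dash's own directory, since it modifies the docset in place.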

Unlisted docsets

The only docsets listed by the dasht-docsets command and in the docsets selection dropdown in the browser are the ones in the DASHT_DOCSETS_DIR (or $XDG_DATA_HOME/dasht/docsets). Docsets from the DASHT_DOCSETS_PATH are not listed, which means that we can't filter searches to look in those docsets specifically.

Distinct search results with the same name

If a search result is found in two different versioned docsets, both results will appear under the same name (including the docset name). For example, searching for sample will show results from all the Ruby docsets (among others), but there will be no indication of the difference between each result (namely the version of Ruby). That is because the docset name is actually the same, but is located under a directory specifying the actual version.
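To make the point concrete: assuming the Dash layout shown at the top of this issue, where the version is the docset's parent directory (e.g. .../2-2-4/Ruby.docset), a distinguishing name could be derived from the path. The function name versioned_name and the Name-Version.docset scheme are made up for illustration only.

```shell
#!/bin/sh
# Hypothetical sketch: build a version-qualified docset name from a path
# of the form .../<version>/<Name>.docset.
# ASSUMPTION: the parent directory name is the version, as in Dash's
# "Versioned DocSets" layout; the path has at least two components.
versioned_name() {
  docset=${1%/}            # drop a trailing slash, if any
  name=${docset##*/}       # Ruby.docset
  name=${name%.docset}     # Ruby
  parent=${docset%/*}      # .../2-2-4
  version=${parent##*/}    # 2-2-4
  printf '%s-%s.docset\n' "$name" "$version"
}

versioned_name "Versioned DocSets/Ruby 2 - DHDocsetDownloader/2-2-4/Ruby.docset"
# prints: Ruby-2-2-4.docset
```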

Suggested solutions

Taking into account the problems above, I came up with a couple of simple solutions:

  • Use sources instead of paths, so that installing/updating would check there in addition to the default one.
    • A source can be a local directory or a url, or even some kind of repository. Ideally this should be irrelevant.
  • Add an option to dasht-docsets-install to customize the name of docsets. This would have a couple of interesting benefits:
    • No duplicate docsets
    • Docsets from different sources can be installed under the name of an official docset. A very stupid example: Install the PHP docset under the name Ruby.

This would not change any of the existing dasht-* commands, except the dasht-docsets-install command with the additional option. It will also avoid dead links and lookup hell because all the docsets would be installed in the same location (DASHT_DOCSETS_DIR).

Hey @RobertAudi, thanks for your detailed feedback and sorry about the long delay in my response. 😓

After much thought, I feel most confident about the symlink creation approach 🔗 (see the dasht-docsets-inherit script I provided earlier in the inherit branch) to address the problems you have listed.

To customize the names of docsets during/after installation (also requested in PR #19), I have an idea. 💡

Stay tuned. 🏃‍♂️

Crude but effective:

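# Warning: this wipes and recreates the dasht docsets directory.
export DASHT_DOCSETS_DIR="$HOME/Library/Application Support/dasht"
rm -rf "$DASHT_DOCSETS_DIR"
mkdir -p "$DASHT_DOCSETS_DIR"
# Only run find if the cd succeeded; copies land in ../dasht relative to here.
cd "$HOME/Library/Application Support/Dash" &&
find . -depth -name '*.docset' -exec sh -c '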
    for source; do
      case $source in ./*/*.docset)
        target="$(printf %sz "${source#./}" | tr / -)";
        cp -R "$source" "../dasht/${target%z}";;
      esac
    done
' _ {} +

Presumes Mac OS directory structures of course.

Inspiration: https://unix.stackexchange.com/questions/45644/flatten-directory-but-preserve-directory-names-in-new-filename