jojojames/fussy

Error in post-command-hook (vertico--exhibit)

Closed this issue · 23 comments

Started happening when I enter a space in my completion after 2148fd7

Error in post-command-hook (vertico--exhibit): (wrong-type-argument window-valid-p #<window 15>)

CleanShot 2022-07-04 at 20 45 56@2x

Haven't had time to debug further yet.

My config:

(use-package orderless
  :straight t
  :commands (orderless-filter))

(use-package fuz
  :demand t
  :config
  (unless (require 'fuz-core nil t)
    (fuz-build-and-load-dymod)))

(use-package fussy
  :straight (fussy :type git :host github :repo "jojojames/fussy")
  :demand t

  :config
  (setq fussy-filter-fn 'fussy-filter-orderless
        fussy-score-fn 'fussy-fuz-score)
  (push 'fussy completion-styles)
  (setq
   ;; For example, project-find-file uses 'project-files which uses
   ;; substring completion by default. Set to nil to make sure it's using
   ;; flx.
   completion-category-defaults nil
   completion-category-overrides nil))

Thanks, looks like '(nil) can also be returned, I added a check for that case.

Thanks, that no longer crashes, but it does not filter properly either. As soon as I have a space in there it filters everything out.

Could you provide an example so we're on the same page? I don't usually enter spaces in my queries and don't expect the fuzzy matchers to give scores to those queries.

I use orderless, so something like M-x tog de should narrow to toggle-debug-on-error

(as opposed to orderless-flex, which is too slow for my taste)

So you want to filter with 'tog' 'de' (however orderless does it) and then apply scoring to the results that come back right? What query would you think maps to the 'fuzzy' part in this case?

Right now, it'd (probably) filter properly getting you a collection from 'tog' 'de' and then at the same time score a query like 'tog de' against 'toggle-debug-on-error'. Not sure if there are any missing assumptions in between.

Good question, I hadn't thought too much about it, but I imagine the fuzzy algorithms don't really know how to handle the space. I wonder if the spaces could be removed, so the filter on tog de ends up scoring based on togde.

Maybe if the caching I mentioned were possible I could switch back to orderless-flex and this would be moot. I imagine fussy-filter-orderless would need to come with a warning then?

Good question, I hadn't thought too much about it, but I imagine the fuzzy algorithms don't really know how to handle the space. I wonder if the spaces could be removed, so the filter on tog de ends up scoring based on togde.

Yeah, I was thinking one approach was to remove the space and have the query be 'togde' instead of 'tog de', think we'd have to do it right after the filtering and only remove the space for scoring.

Added something for this. TBH, not super sure I will keep this but it also doesn't seem harmful at the moment.

I’ll try it out. One thing I realized is that “de tog” also matches toggle-debug-on-error. That’s where orderless gets its name from. If I’m not mistaken, this would still incorrectly filter.

I guess I’m curious what the reason for using the scoring function to filter is? And, I could answer this myself by looking at the source later, but does fussy filter or sort first? Id expect it to filter first and then sort/rank the remainder. Is that how that works?

I think the only way to rank properly with orderless may be to rank for each term and combine those ranks in some way. Or to rank by all possible orderings of the terms. Seems nasty.

I just read your comment on #18 so that answers my question about ordering, but it's still unclear to me why there is additional filtering during the scoring.

And I did confirm that the recent patch helps with tog de but not with de tog (or debug toggle error more specifically)

One thing I realized is that “de tog” also matches toggle-debug-on-error. That’s where orderless gets its name from. If I’m not mistaken, this would still incorrectly filter.

In this case, the filtering is correct as that is what/how orderless is filtering. After that, we score whatever comes back from the filter.

Id expect it to filter first and then sort/rank the remainder. Is that how that works?

Filter -> [batch of candidates] -> Score

And I did confirm that the recent patch helps with tog de but not with de tog (or debug toggle error more specifically)

What is that supposed to match to?

but it's still unclear to me why there is additional filtering during the scoring.

It's to try to cut down on the additional entries we have to process. After-all, after scoring, we need to sort the collection. It's not really necessary though.

What is that supposed to match to?

debug toggle error should match toggle-debug-error. Orderless doesn't care about the order, but all of the fuzzy scorers do.

I see, well it definitely served to highlight a difference between the scoring and how orderless works.

debug toggle error should match toggle-debug-error. Orderless doesn't care about the order, but all of the fuzzy scorers do.

You can likely disable the filtering, there's a defcustom for it (not sure if I added it everywhere like I should have though.)

Either way though, that entry would match below everything else since it'd be a score of 0, effectively filtering it out.

For orderless, what about just using the first term? e.g. debug toggle error would score for simply debug.

Scoring 0 isn't quite the same as filtering out when it's one of the only entries. What I've found on master is that many of the things I search for in the way I search for them just don't show up in a short list.

For example, in a rails project, I may search for something like user then realize there's too much, so I will add model -> user model, which I want to match app/model/user.rb.

I do, as an aside, wonder if this style of searching is more compatible with prescient than with something like fussy.

For orderless, what about just using the first term? e.g. debug toggle error would score for simply debug.

Right.. that starts getting into big questions on how the integration should work. For example what if the user sets up orderless with style dispatchers?

Scoring 0 isn't quite the same as filtering out when it's one of the only entries. What I've found on master is that many of the things I search for in the way I search for them just don't show up in a short list.

I agree which is why I mentioned you can disable the filtering if you want. My point was that for collections returned with a decent number of candidates, it'd be at the bottom, out of sight. You'd essentially only get it with no candidates/short list.

Yeah, I'm really starting to think that the filtering and the scoring have to be the same algorithm like they are in something like fzf. At first I was thinking the score based filtering you added was odd, but maybe it's the only filtering that should be there. Let the scoring algorithm decide what's in and what's out. If you want to use orderless, use orderless, not fussy. Does that make sense or am I missing something?

Yeah, I'm really starting to think that the filtering and the scoring have to be the same algorithm like they are in something like fzf.

fzf does a two pass similar to fussy IIRC.

get list of completions from somewhere (the filtering) (git ls, find ., fd, etc) -> fzf scores/sorts (not sure if it also filters here or just puts it in the back)

If you want to use orderless, use orderless, not fussy. Does that make sense or am I missing something?

Yeah, there could probably be some more synergy there (e.g. orderless style dispatcher flex -> fussy scores) while keeping all the other style dispatchers working. I don't really use orderless anymore so got no motivation to write anything related to that though.

Tangential but related, I wonder if it's more performant to just do (all-completions "") instead of passing a flex prefix in against the entire collection.

e.g.

Current:

(all-completions FLEX-PATTERN) against entire collection to filter -> hands off to fzf/etc to score / filter

To:

(all-completions "") get entire to collection -> hand off to fzf/etc to score/filter

https://github.com/jojojames/fussy/tree/all-completions

I added it here ^ real quick but didn't do any exhaustive testing. See if you can try it out and see if there are improvements. Think doing that, we do zero filtering and let fussy use the backends to score/filter. (Have to use the fussy-filter-fast filter function though instead of the orderless one you're already using).

It might work reasonably well if there's a cache in place so that the results of the previous entry becomes the new collection to score/filter.

If you then tune the amount of candidates to score, you could probably get pretty good performance in completion-in-region functions. Not sure though, just a musing at this point but seems possible unless I'm missing something.

Tangential but related, I wonder if it's more performant to just do (all-completions "") instead of passing a flex prefix in against the entire collection.

Yeah, this is more or less what I was imagining. It's faster. Usable for M-x (though not as fast as orderless w/o flex, but that's not surprising). Still not usable for emacs-lisp completion (also not surprising).

I wonder if for completion-in-region you could force the first letter to be anchored to beginning of word. e.g. foo matches flow-orb but not some-flow-orb (it becomes ^f.*o.*o, rather than f.*o.*o). That would greatly cut down on the search space and be just as useful (at least for me). I have no idea if the scorers allow anchoring.

Also, don't worry about making it work with orderless, that seems very complicated and I don't even know if I'd use it. I've gotten used to orderless's way of working, for the most part, though I do sometimes miss fuzziness.

I wonder if for completion-in-region you could force the first letter to be anchored to beginning of word. e.g. foo matches flow-orb but not some-flow-orb (it becomes ^f.*o.*o, rather than f.*o.*o). That would greatly cut down on the search space and be just as useful (at least for me). I have no idea if the scorers allow anchoring.

Can you try the latest on the same branch to see if performance is acceptable? (again you'd have to use the fussy-filter-fast filter instead of orderless) I experimented with it a while back but never really followed through on it because I was more focused on completing-read where ideally the match is a.*b.c. instead of ^a.*b.c. but I can see the latter being a reasonable optimization/default for completion-in-region. I also don't think it's necessary to check every letter but only the first. My guess is that the filtering would be much faster just checking for the first letter instead of the existence of every letter. We can then let fussy score/filter after we get an initial list of candidates.

We'd still need to know when it's completion-in-region though vs completing-read. company makes it easy because you can probably write an advice on top to switch a flag to T which we can then read in our filter function.

After doing that, could also tweak the # of candidates to actually score and it could be acceptable.

It's much faster and pretty much usable. The very first search (at 2 or 3 characters) stutters slightly, but not much.

Not sure if it'd be faster to do the full regex filter for the first few characters with ^prefix or just do what I mentioned earlier and use the first letter of the query.

e.g.

(regexp
          (if (< (length infix) 3)
              (concat "\\`"
                      (mapconcat
                       (lambda (x)
                         (setq x (string x))
                         (concat "[^" x "]*" (regexp-quote x)))
                       infix
                       ""))
            (format "^%s" (substring infix 0 1)) ))```

For initial filtering.

Polished it up a bit, I do something like this right now for company with it

  (defun j-company-capf (f &rest args)
    "Manage `completion-styles'."
    (let ((fussy-max-candidate-limit 5000)
          (fussy-completing-at-point t))
      (apply f args)))