joddie/pcre2el

Case insensitive modifier?

Closed this issue · 8 comments

It would be convenient to have one: placing i at the end when you’re not sure about the case of a word, instead of writing [aA] combinations for every letter. I didn’t quite understand: are modifiers not supported at all?

Sorry for the delayed response. Supporting /i would be tricky to do right because Emacs controls case-folding using a dynamically-bound variable, case-fold-search: as far as I know there's no way to embed case-insensitive behavior in the string or regexp object itself, as there is in Perl, JS, etc.

I'm not an expert, but I gather that case-insensitive searching is rather tricky to do well for all the different alphabets in Unicode. I assume that Emacs does a good job with case-fold-search set to t, but I can't think of a good way to embed that flag in a translated regexp (short of advising all the regexp primitives in Elisp, which is not appealing).

For basic purposes, I guess it might be good enough to replace every literal character c with [cC] in the translated regexp: at least it might be good enough to be useful even if not fully correct. I will try to work on adding this, although I have limited time for the next two months.
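To make the [cC] idea concrete, here is a minimal sketch in Python rather than Elisp. It is not pcre2el's implementation: the function name fold_literals is hypothetical, escapes and existing bracket expressions are simply copied through unfolded, and characters whose upper or lower case form is longer than one character are left alone.

```python
import re


def fold_literals(pattern: str) -> str:
    """Naively emulate /i by rewriting each cased literal c as [cC]."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Copy escapes such as \d or \1 unchanged.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        if ch == "[":
            # Copy an existing bracket expression unchanged (a known gap:
            # its contents are NOT case-folded by this sketch).
            end = pattern.find("]", i + 2)
            end = len(pattern) - 1 if end == -1 else end
            out.append(pattern[i:end + 1])
            i = end + 1
            continue
        lower, upper = ch.lower(), ch.upper()
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            # A cased letter: match either case.
            out.append("[" + lower + upper + "]")
        else:
            # Digits, punctuation, and regexp metacharacters pass through.
            out.append(ch)
        i += 1
    return "".join(out)


folded = fold_literals("foo")  # "[fF][oO][oO]"
matches_mixed_case = re.fullmatch(fold_literals("ros"), "rOS") is not None
```

With this rewrite, "foo" becomes "[fF][oO][oO]", so the folded form of "ros" matches "rOS". A pattern like r[o]s would keep its bracket expression as written, which is one of the ways the approach is useful but not fully correct.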

After a bit of experimentation, this is not as bad to implement as I had supposed. There are some cases which will never work right (notably backreferences) but I hope we can provide enough to be useful for simple cases. I will push a cleaned up next branch containing this and several other cleanups/enhancements in the next few days.

This will also relate to issue #13 since all three modifier flags (x, s, i) need to allow toggling on and off when reading from the minibuffer.

This is now merged into master and should be available in the minibuffer using the C-c i keybinding. Let me know if it works!

Er… I’m not sure if I’m doing it right, but r[o]s now matches ros, rOs and rOS. Is case-insensitive search the default now? A-and C-c i is undefined when I’m in the minibuffer.

What command / key sequence are you using? Also, what's the value of case-fold-search (which enables Emacs's built-in case folding)? I think it defaults to t, so many searches are case-insensitive by default...

I use isearch-forward. case-fold-search is set to t.

I think what you are seeing is Emacs's default out-of-the-box behavior: all searches are case-insensitive by default, unless you customize case-fold-search to nil. Normally, isearch will also go into case-sensitive mode if you type any uppercase characters, or you can enable it explicitly using isearch-toggle-case-fold (M-s c).

So it may be that the addition of an emulated /i flag isn't as useful as anticipated ;-) However, if you want to use it, you can customize case-fold-search to nil and then use the C-c i binding to toggle it. This binding wasn't enabled in isearch-mode before, but it should be now.

Note that the fake case-folding behavior enabled by the /i flag does not work with backreferences ((foo)\1 matches foofoo and FOOFOO, but not fooFOO), where Emacs's case-fold-search setting does the right thing.
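The backreference limitation can be reproduced outside Emacs. The sketch below uses Python's re module as a stand-in (an assumption made for illustration; Emacs regexp syntax and case-fold-search are not involved), comparing the [cC]-style rewrite of (foo)\1 with the engine's own case-insensitive flag.

```python
import re

# What a [cC]-style rewrite of (foo)\1 looks like: the group accepts either
# case, but \1 must repeat the captured text exactly.
emulated = re.compile(r"([fF][oO][oO])\1")

# Engine-level case folding: the backreference is compared case-insensitively.
native = re.compile(r"(foo)\1", re.IGNORECASE)

results = {
    text: (bool(emulated.fullmatch(text)), bool(native.fullmatch(text)))
    for text in ("foofoo", "FOOFOO", "fooFOO")
}

for text, (emulated_ok, native_ok) in results.items():
    print(f"{text}: emulated={emulated_ok} native={native_ok}")
```

Only the mixed-case input fooFOO differs: the emulated pattern rejects it, while the engine-level flag accepts it.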

> or you can enable it explicitly using isearch-toggle-case-fold (M-s c).

Oh, I’ve been looking for that option for so long.

> So it may be that the addition of an emulated /i flag isn't as useful as anticipated ;-)

Er… I don’t remember exactly what problems I had with case sensitivity, but they definitely were there :-)

> However, if you want to use it, you can customize case-fold-search to nil and then use the C-c i binding to toggle it.

Yay, it works, thank you!