Bug: get_word_before_cursor() doesn't return strings starting with hyphen/s
Consider a simple application such as the one below.
from prompt_toolkit.completion import WordCompleter
from prompt_toolkit import prompt
mycompleter = WordCompleter(['-a', '-app', '--apple', '--apricot', 'build', 'builder', 'buildx'])
text = prompt('prompt: ', completer=mycompleter)
print('You said: %s' % text)
Running the above code and then entering - or -- at the prompt generates completions correctly. However, if an alphanumeric character is entered after the hyphen(s), no completions are generated. For example, --a does not cause --apple and --apricot to be displayed.
While testing and debugging, I found that the get_word_before_cursor() function used by WordCompleter does not correctly return the hyphen-prefixed words described above, which causes this issue.
On further investigation, the culprit seems to be the regexp used by the find_start_of_previous_word() function defined at line 475 of src/prompt_toolkit/document.py.
The current regexp is ([a-zA-Z0-9_]+|[^a-zA-Z0-9_\s]+). The part after the |, which is meant to pick up special characters, matches special characters only. If an alphanumeric character follows the special characters, the expression fails and nothing is returned. The first part comes into play when the string starts with an alphanumeric character. It also fails on strings that contain special characters, returning only the alphanumeric characters up to the first special character. Either of these 2 replacement regexps can be used instead:
- ([\S]+)
  This will match all special and alphanumeric characters, excluding whitespace such as spaces, tabs, and carriage returns.
- ([a-zA-Z0-9_=-]+)
  This just adds - and = to the accepted characters, which are the 2 most common characters used in command-line switches. I removed the 2nd expression (the part after the |) because it does not seem to serve any purpose. If it does, it can be added back and the issue with - and = characters will still be solved.
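To make the difference between the three patterns concrete, here is a minimal sketch that uses only the standard re module. It does not call prompt_toolkit and does not reproduce how find_start_of_previous_word() applies the pattern to the text before the cursor. It only shows how each regexp splits a sample input. The names CURRENT, NON_SPACE, SWITCH_CHARS and tokens() are made up for this illustration.

```python
import re

# Pattern quoted above as the current default.
CURRENT = re.compile(r"([a-zA-Z0-9_]+|[^a-zA-Z0-9_\s]+)")
# The two proposed replacements.
NON_SPACE = re.compile(r"([\S]+)")
SWITCH_CHARS = re.compile(r"([a-zA-Z0-9_=-]+)")


def tokens(pattern, text):
    """Return every non-overlapping match of `pattern` in `text`, left to right."""
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    for sample in ("build --apple", "--opt=1 a.b"):
        print(repr(sample))
        for name, pattern in (
            ("current", CURRENT),
            ("non-space", NON_SPACE),
            ("switch chars", SWITCH_CHARS),
        ):
            print("  %-12s %s" % (name, tokens(pattern, sample)))
```

With the current pattern, --apple is split into -- and apple, so the hyphens and the letters are never seen as one word. Both replacements keep --apple together. They differ on other punctuation: ([\S]+) keeps a.b as one token, while ([a-zA-Z0-9_=-]+) splits it at the dot.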
Another related problem occurs if a custom regex pattern is passed to get_word_before_cursor() to work around the above issue. The function _is_word_before_cursor_complete(), which get_word_before_cursor() calls, should use the return statement in the else block in all cases. Right now it tries to evaluate the regexp, and a space in the prompt means the condition in the if block evaluates to True every single time. Hence no completions are ever displayed after a whitespace character. An easy way to test this is to pass the default regexp as the pattern= argument.