prompt-toolkit/python-prompt-toolkit

get_word_before_cursor() incorrectly returns a string containing trailing whitespace when a `pattern` value is provided


Problem

Library version: 3.0.52

Document.get_word_before_cursor() incorrectly returns a string containing trailing whitespace when a pattern value is provided.

This happens because Document._is_word_before_cursor_complete() does not check for whitespace at the word boundary when pattern is truthy.

The docstring claims:

"If we have whitespace before the cursor this returns an empty string."

So this behavior contradicts the documented contract.

Repro

Use the following program:

from __future__ import annotations  # lets the `X | None` annotation below work on Python < 3.10

import re
from prompt_toolkit.document import Document


def print_word_before_cursor(document: Document, pattern: re.Pattern | None = None):
    text = document.text
    word = document.get_word_before_cursor(pattern=pattern)
    print(f"Text: {text!r}, Cursor Position: {document.cursor_position}, Word Before Cursor: {word!r}")


if __name__ == "__main__":
    text = "Fubar "
    document = Document(text=text, cursor_position=len(text))


    # Show correct documented behavior:
    #     The get_word_before_cursor() method returns '' since the cursor
    #     is at a whitespace character.

    print_word_before_cursor(document)  # Correct behavior: returns ''


    # Show incorrect behavior:
    #     Using the same word search pattern as the library does,
    #     the get_word_before_cursor() method now returns 'Fubar ' even though the cursor
    #     is at a whitespace character.

    # Copied verbatim from prompt_toolkit/document.py.
    _FIND_WORD_RE = re.compile(r"([a-zA-Z0-9_]+|[^a-zA-Z0-9_\s]+)")

    print_word_before_cursor(document, pattern=_FIND_WORD_RE)  # Incorrect behavior: returns 'Fubar '

Resulting Output

Text: 'Fubar ', Cursor Position: 6, Word Before Cursor: ''
Text: 'Fubar ', Cursor Position: 6, Word Before Cursor: 'Fubar '

Expected Output

Text: 'Fubar ', Cursor Position: 6, Word Before Cursor: ''
Text: 'Fubar ', Cursor Position: 6, Word Before Cursor: ''

Workaround

text = "" if doc.char_before_cursor.isspace() else doc.get_word_before_cursor(pattern=mypattern)
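If the workaround is needed in several places, it could be wrapped in a small helper. The sketch below is only a convenience around the one-liner above; the helper name is hypothetical and not part of prompt_toolkit. It relies only on the char_before_cursor attribute and the get_word_before_cursor() method already used in the workaround.

```python
def word_before_cursor_or_empty(doc, pattern=None):
    """Hypothetical helper: apply the whitespace check before delegating.

    `doc` is expected to be a prompt_toolkit Document (or any object that
    exposes `char_before_cursor` and `get_word_before_cursor`).
    """
    # Enforce the documented contract ourselves: whitespace directly
    # before the cursor means there is no word before the cursor.
    if doc.char_before_cursor.isspace():
        return ""
    return doc.get_word_before_cursor(pattern=pattern)
```

Called on the repro's document as word_before_cursor_or_empty(document, pattern=_FIND_WORD_RE), this returns '' because the character before the cursor is a space.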

Related: #1609

Hi @jonathanslenders
Could you please review this PR: #2025
Thank you.