rain-1/linenoise-mob

Hangs on ctrl-t with utf-8 input

weinholt opened this issue · 4 comments

Hello,

Linenoise hangs when transposing two utf-8 characters with ctrl-t.

To reproduce, run "make example; ./example" and enter "åä" so your screen looks like this:

こんにちは> åä

Move the cursor to ä (positioning it between å and ä). Press ctrl-t.

Confirmed. I looked at lldb backtraces; what they have in common is linenoiseUtf8NextCharLen.

        case CTRL_T:    /* ctrl-t, swaps current character with previous. */
            if (l.pos > 0 && l.pos < l.len) {
                int aux = buf[l.pos-1];
                buf[l.pos-1] = buf[l.pos];
                buf[l.pos] = aux;
                if (l.pos != l.len-1) l.pos++;
                refreshLine(&l);
            }
            break;

The ctrl-t code that was merged does not work with the new Unicode API: it swaps single bytes, so transposing multi-byte characters produces invalid UTF-8.
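For reference, a minimal sketch of what a character-aware transpose could look like. This is not the code from the repo: transpose_utf8 and its helpers are hypothetical names, it works on raw byte offsets, and it only treats 10xxxxxx bytes as continuation bytes (no combining marks or grapheme clusters).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if b is a UTF-8 continuation byte (10xxxxxx). */
static int is_cont(unsigned char b) { return (b & 0xC0) == 0x80; }

/* Hypothetical helper: byte length of the character ending just before pos. */
static size_t prev_char_len(const char *buf, size_t pos) {
    size_t start = pos;
    while (start > 0) {
        start--;
        if (!is_cont((unsigned char)buf[start])) break;
    }
    return pos - start;
}

/* Hypothetical helper: byte length of the character starting at pos. */
static size_t next_char_len(const char *buf, size_t len, size_t pos) {
    size_t end = pos + 1;
    while (end < len && is_cont((unsigned char)buf[end])) end++;
    return end - pos;
}

/* Swap the character before pos with the character at pos, moving whole
 * byte sequences instead of single bytes. Returns the byte offset just
 * past the swapped pair, or pos unchanged if there is nothing to swap. */
static size_t transpose_utf8(char *buf, size_t len, size_t pos) {
    if (pos == 0 || pos >= len) return pos;
    size_t plen = prev_char_len(buf, pos);
    size_t nlen = next_char_len(buf, len, pos);
    char tmp[8];
    if (plen > sizeof tmp) return pos;            /* malformed run, give up */
    memcpy(tmp, buf + pos - plen, plen);          /* save previous char */
    memmove(buf + pos - plen, buf + pos, nlen);   /* slide current char left */
    memcpy(buf + pos - plen + nlen, tmp, plen);   /* put previous char after it */
    return pos + nlen;
}
```

Note that the sketch always returns the offset after the swapped pair; the original code leaves the cursor in place when it is on the last character, and that case would need its own handling because the two characters can have different byte lengths.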

So there are 2 bugs here:

  • ctrl-t implementation
  • invalid unicode data causes infinite loop

and actually 3

  • pasting in unicode text is very unpredictable and buggy

other functions which make use of buf directly:

        case CTRL_U: /* Ctrl+u, delete the whole line. */
            buf[0] = '\0';
            l.pos = l.len = 0;
            refreshLine(&l);
            break;
        case CTRL_K: /* Ctrl+k, delete from current to end of line. */
            buf[l.pos] = '\0';
            l.len = l.pos;
            refreshLine(&l);
            break;

I believe both are Unicode-safe though: I think 0 and l.pos will always be on a codepoint boundary.
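One way to check that assumption while debugging is a small boundary test. This is only a sketch: at_char_boundary is a hypothetical helper, not part of linenoise, and it assumes the buffer already holds valid UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if byte offset pos in buf (of length len)
 * sits on a codepoint boundary, i.e. it is the start or end of the buffer
 * or the byte there is not a continuation byte (10xxxxxx). */
static int at_char_boundary(const char *buf, size_t len, size_t pos) {
    if (pos > len) return 0;
    if (pos == 0 || pos == len) return 1;
    return ((unsigned char)buf[pos] & 0xC0) != 0x80;
}
```

Something like this could be asserted on l.pos before the CTRL_K truncation to catch a cursor that has drifted into the middle of a character.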

The loop is caused by utf8BytesToCodePoint returning 0 on invalid UTF-8 text. I'm not sure what the best thing to do in a case like that is.
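One possible approach, sketched below, is to treat an undecodable byte as a one-byte unit so that any loop stepping through the buffer always moves forward. The names decode_len, safe_char_len and count_chars are hypothetical; decode_len only stands in for a decoder that reports 0 on bad input, and it does not reject overlong encodings or surrogates. This is not necessarily what the actual fix does.

```c
#include <stddef.h>

/* Hypothetical decoder: returns the number of bytes in the UTF-8 sequence
 * at buf, or 0 if the bytes there do not form a well-shaped sequence. */
static size_t decode_len(const unsigned char *buf, size_t len) {
    if (len == 0) return 0;
    unsigned char b = buf[0];
    size_t need;
    if (b < 0x80) need = 1;
    else if ((b & 0xE0) == 0xC0) need = 2;
    else if ((b & 0xF0) == 0xE0) need = 3;
    else if ((b & 0xF8) == 0xF0) need = 4;
    else return 0;                      /* stray continuation or bad lead byte */
    if (need > len) return 0;           /* truncated sequence */
    for (size_t i = 1; i < need; i++)
        if ((buf[i] & 0xC0) != 0x80) return 0;
    return need;
}

/* Never returns 0 while bytes remain: an invalid byte counts as length 1. */
static size_t safe_char_len(const unsigned char *buf, size_t len) {
    size_t n = decode_len(buf, len);
    if (n == 0 && len > 0) return 1;
    return n;
}

/* Example loop: counts units in buf and terminates even on invalid input,
 * because pos grows by at least 1 on every iteration. */
static size_t count_chars(const unsigned char *buf, size_t len) {
    size_t pos = 0, count = 0;
    while (pos < len) {
        pos += safe_char_len(buf + pos, len - pos);
        count++;
    }
    return count;
}
```

With the bytes that the byte-wise swap produces from "åä" (c3 c3 a5 a4), count_chars sees three units (an invalid lead byte, one valid character, a stray continuation byte) instead of spinning at offset 0.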

fa0de5c improves the behavior on invalid text, stopping infinite loops.

6d043da fixes the transpose bug for åä and こんにちは. I just realized after pushing that it may not solve the issue when we have more complex clusters (?).

The https://github.com/yhirose/linenoise/tree/utf8-support repo has this same issue. Need to contact him.