swsnr/mdcat

Wrap text to column limit.

swsnr opened this issue · 12 comments

swsnr commented

Use textwrap, perhaps, to wrap all content to the column size of the TTY.

Will perhaps be tricky with termion formatting characters.

swsnr commented

We can't directly wrap text (eg, w/ textwrap), because we need to account for formatting escapes and for indentation.

But we can use unicode_width to compute the width of text as we write it, and keep track of the current column being written to. If writing would exceed the column limit, we can scan backwards for the first whitespace before the column limit, wrap the text there, and try again until the entire text is written.
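A minimal sketch of that column-tracking idea, using only the standard library. `WrapWriter`, `write_text`, and `display_width` are hypothetical names, not mdcat code. `display_width` counts `char`s as a stand-in for `unicode_width`, so it is wrong for wide or zero-width characters. The sketch also breaks forward, word by word, instead of scanning backwards, which should give the same greedy result for plain text. It assumes every chunk boundary is a word boundary.

```rust
/// Stand-in for `unicode_width::UnicodeWidthStr::width` (assumption: one
/// column per `char`), so the sketch has no external dependencies.
fn display_width(s: &str) -> usize {
    s.chars().count()
}

/// Hypothetical writer that remembers the current column between calls.
struct WrapWriter {
    limit: usize,  // column limit of the terminal
    indent: usize, // indentation applied to every line
    column: usize, // column the next character would be written to
    out: String,   // collected output (a real writer would stream instead)
}

impl WrapWriter {
    fn new(limit: usize, indent: usize) -> Self {
        WrapWriter { limit, indent, column: 0, out: String::new() }
    }

    /// Write one chunk of text, starting a new line whenever the next word
    /// would exceed the limit. State carries over to the next call.
    fn write_text(&mut self, text: &str) {
        for word in text.split_whitespace() {
            let width = display_width(word);
            if self.column == 0 {
                // Very first word: only emit the indentation.
                self.out.push_str(&" ".repeat(self.indent));
                self.column = self.indent;
            } else if self.column + 1 + width > self.limit {
                // Word does not fit: break the line and indent again.
                self.out.push('\n');
                self.out.push_str(&" ".repeat(self.indent));
                self.column = self.indent;
            } else {
                // Word fits: separate it from the previous one.
                self.out.push(' ');
                self.column += 1;
            }
            self.out.push_str(word);
            self.column += width;
        }
    }
}
```

Because the column lives in the writer rather than in the text, formatting escapes could be written straight to the output between calls without touching `column`.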

That would be great to have. E.g. in cases where links get transferred to the bottom of the document, the shorter lines are very disturbing.

Just give me 3 or 4 months to learn a little Rust (I really want that feature ;) )

consolemd does indeed use the Python equivalent of textwrap: kneufeld/consolemd@6a2c6ee
which means it'll occasionally go under the provided wrap limit, but, eh, it's good enough as a start

swsnr commented

@agschaid Take all the time you need; I don't think I'll fix this anytime soon.

@igalic I don't think that's good enough for me. It ought to be done right.

I understand, @lunaryorn!

I suppose the Rust way would be to create a wrapper type that can represent the output on the console, but can also be used as a String or &str for feeding into textwrap

I believe there is no need to create a wrapper type, as textwrap has supported ANSI escape codes for a few months now (mgeisler/textwrap#179).
I don't know how well that works with Windows though.

swsnr commented

@orasunis I guess that'd be a good start. I haven't really looked at it, but if it handles colours, OSC 8 hyperlinks and iterm marks well, it might even be a perfect solution 🙂

just to put my "commitment" into perspective: I have two kids. So my "3 or 4 months" can quickly grow into "2 or 3 years" ;)

I think this is a good project/motivation for me to get a little into rust. But don't count on me.

swsnr commented

@agschaid There's no commitment here. We all do this in our free time, we all have a life and priorities 🙂

Hi all, I saw a link to this on mgeisler/textwrap#179 -- just wanted to say that textwrap will indeed ignore all ANSI escape sequences since version 0.12. Ignoring means "they don't contribute to the string width", so the wrapping computations are not affected by the escape characters any longer.
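To illustrate what "they don't contribute to the string width" means, here is a rough standard-library sketch. It is not textwrap's implementation, and `visible_width` is a hypothetical name. It only recognises CSI sequences (ESC `[` ... final byte) and counts one column per `char`. OSC sequences, such as the OSC 8 hyperlinks mentioned earlier in the thread, are not handled.

```rust
/// Count the visible characters of `s`, skipping CSI escape sequences.
/// Assumption: every visible `char` is one column wide.
fn visible_width(s: &str) -> usize {
    let mut width = 0;
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        if c == '\x1b' {
            // CSI: ESC '[' followed by parameters, ended by a final
            // character in the range '@'..='~' (e.g. 'm' for colours).
            if chars.next() == Some('[') {
                for e in chars.by_ref() {
                    if ('@'..='~').contains(&e) {
                        break;
                    }
                }
            }
            // Any other character after ESC is dropped together with it.
        } else {
            width += 1;
        }
    }
    width
}
```

With a helper like this, coloured and plain versions of the same text report the same width, so break points do not move when colours are added.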

> Will perhaps be tricky with termion formatting characters.

I've actually been playing around with this recently and I wrote a little demo program: https://github.com/mgeisler/textwrap/blob/master/examples/interactive.rs

It uses termion and if you modify it to use colored text, then you'll see that you can indeed very easily run into problems. Basically, textwrap::wrap will give you back a Vec of strings, complete with the original escape codes. If you just print those to the terminal everything works fine. However, my example program draws a red border around the text and so I use code like

        write!(
            stdout,
            "{}{}│{}",
            cursor::Goto(col - 1, row),
            color::Fg(color::Red),
            color::Fg(color::Reset),
        )?;

This is a problem if there is, say, blue text which was supposed to be wrapped over two lines: the color now stops at the point of the color::Fg(color::Reset) code.
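One possible workaround, sketched here with raw escape strings instead of termion, is to remember the last colour sequence seen in the wrapped text and re-emit it after drawing the border. This is not what the demo program does. `last_sgr` and `with_border` are hypothetical helpers, and the sketch assumes the only escapes in the text are SGR colour sequences ending in `m`.

```rust
const RED: &str = "\x1b[31m"; // foreground red
const RESET_FG: &str = "\x1b[39m"; // default foreground colour

/// Return the last escape sequence in `line`, assumed to be an SGR
/// sequence (ESC '[' ... 'm'). This is the style still active at line end.
fn last_sgr(line: &str) -> Option<&str> {
    let start = line.rfind("\x1b[")?;
    let len = line[start..].find('m')?;
    Some(&line[start..=start + len])
}

/// Prefix every wrapped line with a red border, then restore whatever
/// colour the previous line left active.
fn with_border(lines: &[&str]) -> Vec<String> {
    let mut active: Option<String> = None;
    lines
        .iter()
        .map(|line| {
            let restore = active.clone().unwrap_or_default();
            let out = format!("{}│{}{}{}", RED, RESET_FG, restore, line);
            // Carry the last colour of this line over to the next one.
            if let Some(sgr) = last_sgr(line) {
                active = Some(sgr.to_string());
            }
            out
        })
        .collect()
}
```

For blue text wrapped over two lines, the second line then starts with the border followed by the blue sequence again, so the colour continues past the reset.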

Perhaps you don't have such borders and then things are easy...?

swsnr commented

ANSI sequences aren't the issue; the issue is more that pulldown is a pull parser, so we don't get the text at once but rather scattered over many different events.

Hi @lunaryorn,

I just realized that we talked about stateful wrapping over in mgeisler/textwrap#224 :-)

So yeah, if you get your text one piece at a time, then it'll be harder to use textwrap. I "cheated" and simply accumulated the text in mgeisler/textwrap#140 (comment), but I understand that you have a different architecture in mdcat.
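The accumulate-then-wrap approach could look roughly like the sketch below. The `Event` enum is a simplified, hypothetical stand-in and not pulldown's actual event type. `wrap_greedy` is a plain first-fit wrapper standing in for `textwrap::wrap`, counting one column per `char`.

```rust
/// Hypothetical, simplified stand-in for a pull parser's events.
enum Event<'a> {
    StartParagraph,
    Text(&'a str),
    EndParagraph,
}

/// First-fit wrapping of whitespace-separated words (stand-in for a real
/// wrapping library; assumes one column per `char`).
fn wrap_greedy(text: &str, limit: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let needed = current.chars().count() + 1 + word.chars().count();
        if !current.is_empty() && needed > limit {
            lines.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}

/// Buffer all text of a paragraph and wrap it once the paragraph ends.
fn wrap_events<'a>(events: impl IntoIterator<Item = Event<'a>>, limit: usize) -> Vec<String> {
    let mut buffer = String::new();
    let mut lines = Vec::new();
    for event in events {
        match event {
            Event::StartParagraph => buffer.clear(),
            Event::Text(t) => buffer.push_str(t),
            Event::EndParagraph => lines.extend(wrap_greedy(&buffer, limit)),
        }
    }
    lines
}
```

The trade-off is that nothing is printed until the paragraph is complete, and any formatting applied between text events would have to be carried inside the buffer.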

With mgeisler/textwrap#234, I'm introducing a new, more advanced wrapping algorithm which selects optimal break points for an entire paragraph at a time, where "optimal" is defined by penalties which discourage short lines. This is by its nature also quite stateful. I will see if I can make the original wrapping algorithm work in an incremental fashion again.
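To show why whole-paragraph optimisation needs all the text up front, here is a textbook minimum-raggedness sketch. It is not the algorithm from that pull request. `wrap_optimal` is a hypothetical name. The penalty (squared gap at the end of every line except the last) and the one-column-per-`char` width are assumptions chosen for brevity.

```rust
/// Choose line breaks that minimise the sum of squared trailing gaps.
/// The last line is free, and an overlong word gets a line of its own.
fn wrap_optimal(words: &[&str], limit: usize) -> Vec<String> {
    let n = words.len();
    // cost[i]: best total penalty for laying out words[i..].
    // next[i]: index of the first word after the line starting at i.
    let mut cost = vec![u64::MAX; n + 1];
    let mut next = vec![n; n + 1];
    cost[n] = 0;
    for i in (0..n).rev() {
        let mut width = 0;
        for j in i..n {
            width += words[j].chars().count() + if j > i { 1 } else { 0 };
            if width > limit && j > i {
                break; // words[i..=j] no longer fits on one line
            }
            let gap = limit.saturating_sub(width) as u64;
            let penalty = if j + 1 == n { 0 } else { gap * gap };
            let total = penalty.saturating_add(cost[j + 1]);
            if total < cost[i] {
                cost[i] = total;
                next[i] = j + 1;
            }
        }
    }
    // Follow the recorded break points from the first word onwards.
    let mut lines = Vec::new();
    let mut i = 0;
    while i < n {
        lines.push(words[i..next[i]].join(" "));
        i = next[i];
    }
    lines
}
```

For the words `aaa bb cc ddddd` and a limit of 6, a first-fit wrapper produces `aaa bb`, `cc`, `ddddd`, while this sketch prefers `aaa`, `bb cc`, `ddddd` because it avoids the very short middle line. The first break depends on words that come later, which is what makes this hard to do incrementally.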