[tui][examples] Example #5, fix editor component, typing more than 6 "#" produces janky output
Closed this issue · 8 comments
In the top-level folder of the r3bl-open-core repo, run ./run.nu run. Then select 5 and press enter.
This is the r3bl-cmdr demo. When you type ###### you will see some strange artifacts displayed on the screen. This might have something to do w/ a commit from the last month that fixed the markdown parser to handle headings that have more than 6 "#" marks.
tui-rc-demo-bug-2023-11-18_19.59.13.mp4
Even more simply, when you type _this is not italic as the only text in the editor component, it breaks: the content is displayed as italic.
Test cases to make the parser break:
- Put _ or * at the start, at the end, or in the middle of a word.
- To break the heading, add an extra "#" (7 hashes break the heading).
The parser for headings might be conflicting w/ the parsers for bold and italic.
headings.mp4
star.and.underscore.mp4
@e0lithic It doesn't look like there are issues w/ the headings. However, it does look like italic parsing (and bold parsing) has been broken since the beginning.
Just typing _this should not be italic will break it. The screenshot below shows this. It should not be rendered as italic, since there is no closing _.
@e0lithic This test highlights some of the issues:
#[cfg(test)]
mod tests {
    use crossterm::style::Stylize;
    use r3bl_rs_utils_core::*;

    use super::*;

    #[test]
    fn fix_italic() {
        let input = ["_this should not be italic"].join("\n");
        let (remainder, blocks) = parse_markdown(&input).unwrap();
        println!("{:?}", remainder);
        println!("{:?}", blocks);
    }
}
Output from test:
"_this should not be italic"
List { items: [] }
Observation regarding the usage of alt in parsers:
alt returns the last error, so if none of the alternatives match, the error from the last delimiter parser is the one that is returned. As a result, parse_element_italic and other parsers that use alt will return different errors depending on the order of their alternatives.
This test case should clarify it.
#[test]
fn test_delimiter() {
    let parser = parse_element_italic;
    let mut parser_star = delimited(tag(ITALIC_1), is_not(ITALIC_1), tag(ITALIC_1));
    let mut parser_underscore = delimited(tag(ITALIC_2), is_not(ITALIC_2), tag(ITALIC_2));

    // The combined parser reports the error of the alternative that was tried last.
    assert_eq2!(
        parser("*here is italic"),
        Err(NomErr::Error(Error {
            input: "*here is italic",
            code: ErrorKind::Tag
        }))
    );

    // The star parser consumes the whole line, then fails on the missing closing delimiter.
    assert_eq2!(
        parser_star("*here is italic"),
        Err(NomErr::Error(Error {
            input: "",
            code: ErrorKind::Tag
        }))
    );

    // The underscore parser fails immediately on the opening delimiter.
    assert_eq2!(
        parser_underscore("*here is italic"),
        Err(NomErr::Error(Error {
            input: "*here is italic",
            code: ErrorKind::Tag
        }))
    );
}
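The order dependence can also be reproduced without nom. The following is a minimal sketch using hand-rolled stand-ins: `ParseError`, `delimited_by`, `alt_last_error`, `star`, and `underscore` are hypothetical names invented for this illustration and are not part of nom or of the md_parser code. It only assumes the behavior described above, namely that the error of the last alternative wins.

```rust
// Hand-rolled stand-ins for the combinators discussed above (std only).
// All names here are hypothetical; this is not nom's API.

/// A parse error that records the input position where the parser gave up.
#[derive(Debug, PartialEq)]
struct ParseError<'a> {
    input: &'a str,
}

/// On success: (remainder, matched content), in the same order nom uses.
type ParseResult<'a> = Result<(&'a str, &'a str), ParseError<'a>>;
type Parser = for<'a> fn(&'a str) -> ParseResult<'a>;

/// Matches `<delim>content<delim>` with non-empty content.
fn delimited_by<'a>(delim: char, input: &'a str) -> ParseResult<'a> {
    // The opening delimiter must be the first character.
    let rest = input.strip_prefix(delim).ok_or(ParseError { input })?;
    // Take everything up to the next delimiter, or to the end of the input.
    let end = rest.find(delim).unwrap_or(rest.len());
    let (content, after) = rest.split_at(end);
    if content.is_empty() {
        return Err(ParseError { input: rest });
    }
    // The closing delimiter must follow; otherwise fail at the current position.
    let remainder = after.strip_prefix(delim).ok_or(ParseError { input: after })?;
    Ok((remainder, content))
}

fn star<'a>(input: &'a str) -> ParseResult<'a> {
    delimited_by('*', input)
}

fn underscore<'a>(input: &'a str) -> ParseResult<'a> {
    delimited_by('_', input)
}

/// Tries each parser in order; if all of them fail, the last error is returned.
fn alt_last_error<'a>(parsers: &[Parser], input: &'a str) -> ParseResult<'a> {
    let mut last = Err(ParseError { input });
    for parser in parsers {
        match parser(input) {
            Ok(found) => return Ok(found),
            Err(e) => last = Err(e),
        }
    }
    last
}

fn main() {
    let input = "*here is italic";
    let star_first: [Parser; 2] = [star, underscore];
    let underscore_first: [Parser; 2] = [underscore, star];

    // Same input, same alternatives, different error depending on the order.
    assert_eq!(alt_last_error(&star_first, input), Err(ParseError { input }));
    assert_eq!(alt_last_error(&underscore_first, input), Err(ParseError { input: "" }));
}
```

With the underscore alternative last, the reported error points at the start of the line, which hides the fact that the star alternative got all the way to the missing closing delimiter.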
Additional fixes are required to ensure that the special characters are only recognised at word boundaries. The following examples should be treated as generic text:
test_ing_
_test_ing
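One way to express the word-boundary rule is sketched below. `italic_at_word_boundary` is a hypothetical helper written for this comment, not a function in md_parser, and it assumes that a "word character" means anything alphanumeric. An opening _ needs a boundary on its left and a word character on its right; a closing _ needs the reverse.

```rust
/// Returns the text between a pair of `_` delimiters that both sit on word
/// boundaries, or `None` if the input contains no such pair.
/// Hypothetical helper for illustration only.
fn italic_at_word_boundary(input: &str) -> Option<&str> {
    let chars: Vec<(usize, char)> = input.char_indices().collect();
    // Start of input, end of input, and non-alphanumeric characters count as boundaries.
    let is_boundary = |c: Option<char>| c.map_or(true, |c| !c.is_alphanumeric());
    let is_word = |c: Option<char>| c.map_or(false, |c| c.is_alphanumeric());

    // Byte index just past the opening `_`, once one has been found.
    let mut open: Option<usize> = None;

    for (i, &(pos, ch)) in chars.iter().enumerate() {
        if ch != '_' {
            continue;
        }
        let prev = if i == 0 { None } else { Some(chars[i - 1].1) };
        let next = chars.get(i + 1).map(|&(_, c)| c);

        if let Some(start) = open {
            // Closer: word character on the left, boundary on the right.
            if is_word(prev) && is_boundary(next) {
                return Some(&input[start..pos]);
            }
        } else if is_boundary(prev) && is_word(next) {
            // Opener: boundary on the left, word character on the right.
            open = Some(pos + 1);
        }
    }
    None
}

fn main() {
    // The two examples above stay plain text.
    assert_eq!(italic_at_word_boundary("test_ing_"), None);
    assert_eq!(italic_at_word_boundary("_test_ing"), None);
    // An opener without a closer is not italic either.
    assert_eq!(italic_at_word_boundary("_this should not be italic"), None);
    // A properly delimited word is recognised.
    assert_eq!(italic_at_word_boundary("_italic_ text"), Some("italic"));
}
```

The same check would apply to * by swapping the delimiter; it is kept to a single delimiter here to stay short.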
Code smell emanating from parse_element_plaintext().
I got that code from another MD parser, and I thought it was fishy at the time as well:
https://github.com/r3bl-org/r3bl-open-core/blob/main/tui/src/tui/md_parser/parse_element.rs#L107
My best guess is that it is getting any char that is not one of the special chars (*, _, -, etc.).
- We already have the tag parsers for special characters, so there is no need to check for them here, at least not for the ones that have dedicated parsers running before this one.
- This code was inherited a long time ago, nobody is sure what it does, and everything else got rewritten around it.
- It kind of makes sense now why _this is not italic was crapping out, since that strange function was checking for _ as an invalid character!
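To make the last point concrete, here is a rough sketch of how a plain-text fallback that rejects special characters would behave. This is a guess at the behavior, not the actual parse_element_plaintext() code: `take_plaintext` and the `SPECIAL` set are hypothetical, and the set only contains the three characters named above.

```rust
/// Characters the fallback refuses to consume (assumed, incomplete set).
const SPECIAL: [char; 3] = ['*', '_', '-'];

/// Consumes the longest prefix that contains no special character.
/// Returns (remainder, matched), or `None` when nothing could be consumed.
/// Hypothetical stand-in for the plain-text fallback discussed above.
fn take_plaintext(input: &str) -> Option<(&str, &str)> {
    let end = input
        .find(|c: char| SPECIAL.contains(&c))
        .unwrap_or(input.len());
    if end == 0 {
        // The very first character is special, so the fallback matches nothing.
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}

fn main() {
    // Ordinary text is consumed up to the first special character.
    assert_eq!(take_plaintext("plain _text"), Some(("_text", "plain ")));
    // A line that starts with `_` cannot be consumed at all.
    assert_eq!(take_plaintext("_this is not italic"), None);
}
```

If the italic parser fails because there is no closing _ and the fallback also refuses the leading _, then no parser consumes the line, which would be consistent with the test output earlier in this thread where the whole input came back as the remainder.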
There are changes we have made to the markdown spec. Here's a full list of the extras that we need: https://github.com/r3bl-org/r3bl-open-core/blob/main/tui/src/tui/md_parser/parser.rs#L39
We support most of the standard constructs. The one place where we diverge radically is SMART LISTS.
This was a huge effort ... to make it so that we can track indentation levels across line breaks ... this diverges from markdown spec, since it is a block level construct. We do something similar for code blocks, so we can parse the code block contents separately and then syntax highlight them too!
Extras:
- tags list,
- authors list,
- title value,
- date value,
- smart list
Pictures of smart lists which look at multi-line elements as block elements. This is a divergence from markdown spec, which is mostly single line scoped elements.
@nazmulidris A similar issue can be reproduced as follows: