iamacup/react-native-markdown-display

Allow handing in pre-parsed AST

pke opened this issue ยท 13 comments

pke commented

For my component I'd like to extract all the text links from the markdown to build an action sheet out of the links.

So I would like to pre-parse the markdown using markdown-it for all links and then hand in that AST to the component instead of a markdown-it prop.

I want to prevent thet parsing of the markdown twice, once in my component and second in yours.

Do you think that would be a possible addition?

So we could do two things I think

  1. Make a prop that accepts the output of markdownit.parse - if this exists we just use this instead of running it internally
  2. Make a prop that is the AST, which would replace the output of AstRenderer.js if present

The problem with 2 is, AstRenderer.js is a very changed file - so there is potential for lots of breaking in the future.

What do you think?

pke commented

The output of markdownit.parse is almost what I am using now, but I prefer a less nested AST (preferably flat even) so I am using your exported tokensToAST.

I am currently using this code and it works great but has a performance hit due to double parsing:

const links = useMemo(() => {
  const markdown = MarkdownIt({
    html: false,
    typographer: false,
  })
  const ast = tokensToAST(stringToTokens(children, markdown))
  const collectLinks = (result: FinePrintLink[], node: ASTNode): FinePrintLink[] => {
    if (node.type === "link") {
      result.push({
        uri: node.attributes.href,
        text: node.children?.map(({ content }) => content).join(" "),
      })
    }
    return node.children.reduce(collectLinks, result)
  }
  return ast.reduce(collectLinks, [])
}, [children])

I would not introduce a new prop, but rather define children additionally as union with ASTNode[] or even the return value of markdownit.parse

So my call to Markdown could look like this:

const ast = tokensToAST(stringToTokens(children, markdown))
<Markdown onLinkPress={onPress} style={style}>{ast}</Markdown>

Could also be possible

const tokens = stringToTokens(children, markdown)
// same as
const tokens = MarkdownIt().parse(children)
<Markdown onLinkPress={onPress} style={style}>{tokens}</Markdown>

What do you think?

Hey

Sorry about the delay here - can you take a look at https://github.com/iamacup/react-native-markdown-display/releases/tag/6.2.x and let me know if this is what you were thinking?

Supports this syntax:

const markdownItInstance = MarkdownIt({typographer: true});

const copy = `
# H1

Paragraph
`;

const ast = tokensToAST(stringToTokens(copy, markdownItInstance))
            <Markdown
              debugPrintTree
            >
              {ast}
            </Markdown>

released this in 6.1.5 - re-open if we need to change how it works :) Readme is updated to reflect the above

pke commented

Thanks, I'll test it on Monday and report back here!

pke commented

@iamacup works like a charm! Thanks for fulfilling this feature request!

Two problems in 6.1.6:

  1. Passing in AST nodes is slower than plain markdown string, it's obvious in FlatList scenario. (even the AST nodes is not modified).
  2. Image node can't be parsed in AST nodes way.

p.s.
Currently I'm using React.memo for FlatList item cache, so problem 1 is not that serious, but it's really slow.

pke commented

@helsonxiao thanks for reporting back.

to 1st: Could you provide a source snippet how you use the AST within the FlatList? The markdown component also uses just memoization internally so maybe your AST is re-calculated over and over again because of unstable dependencies?

to 2nd: You mean images are not displayed that way? That shouldn't make any difference. Maybe a test case could be made for that.

just opening to keep track of this in case it turns out to be library related - thanks for handling @pke !!

@helsonxiao thanks for reporting back.

to 1st: Could you provide a source snippet how you use the AST within the FlatList? The markdown component also uses just memoization internally so maybe your AST is re-calculated over and over again because of unstable dependencies?

to 2nd: You mean images are not displayed that way? That shouldn't make any difference. Maybe a test case could be made for that.

  1. I haven't measured it with Performance API but I do feel the difference. I also tried to cache the AST processor or return AST nodes directly. Note: My FlatList Item use react-native-syntax-highlighter for code display, but it's still slower than plain string mode after I removed the highlighter. If it's still confused for you, I will check your codes. Anyway, React.memo is necessary so it's not a big deal.

  2. Try this input. It works well in plain string mode.

const input = "![่ƒŒๅŒ…ๅ››่ฎฒ.png](https://media-test.jiuzhang.com/media/markdown/images/7/16/72644746-c710-11ea-8f89-0242ac1c0003.jpg)\n\n```\n[[java]]\nasfafagagag233\n[[golang]]\nasfgsagag\n```\n\n![่ƒŒๅŒ…ๅ››่ฎฒ.png](https://media-test.jiuzhang.com/media/markdown/images/7/16/72644746-c710-11ea-8f89-0242ac1c0003.jpg)"

image

// turn on astPreprocessor then images disappear
return (
  <Markdown /* astPreprocessor={astPreprocessor} */ rules={rules}>
    {input}
  </Markdown>
);

// here I just return the ast, nothing changed.
// I will use this method to pick up important nodes.
const astPreprocessor = (ast: ASTNode[]) => {
  return ast;
};

const Markdown: React.FC<MarkdownProps> = props => {
  const _props = {
    rules,
    style: MarkdownStyles,
    ...props,
  };

  if (props.astPreprocessor) {
    const ast = tokensToAST(stringToTokens(props.children, markdownItInstance));

    return <_Markdown {..._props}>{props.astPreprocessor(ast)}</_Markdown>;
  }

  return <_Markdown {..._props}>{props.children}</_Markdown>;
};

image

pke commented

About the speed difference:
At which point are you using memoizsation? The result coming back from astPreprocessor should be memoized in your component based on the children.

const {
  children,
  astPreprocessor,
} = props
const content= useMemo(() => {
  astPreprocessor ? astPreprocessor(tokenToAST(stringToTokens(children, markdownItInstance)) : children
}, [children, astPreprocessor])

return <_Markdown markdownIt={markdownItInstance} {...props}>{content}</_Markdown>

Make sure you are also re-using the markdownItInstance you globally defined, or the Markdown component will create its own.

About the images I'd have to check this out in a free minute, but without knowing your styles for images and what the preprocessor actually does with the AST, its hard to predict.
So your preprocessor divides the markdown based on code fences?

I think the markdown input is not entirely settled. stringToTokens only does the job partly. There are still two more steps from this file. https://github.com/iamacup/react-native-markdown-display/blob/master/src/lib/parser.js#L19

I copied them into my project then it works fine ๐Ÿ˜„. It could be nicer if you export them or refine the stringToTokens function. The speed confusion is also resolved. Thanks!

+1 to cleaning up the necessary imports for a custom parser function. At present both cleanupTokens and groupTextTokens need to be imported from the src/lib/util folder.

However I wouldn't be in favour of consolidating the current functions into stringToTokens, as for my use case I need to do some preprocessing before groupTextTokens is called:

import {
  MarkdownIt,
  stringToTokens,
  tokensToAST,
} from 'react-native-markdown-display';
import { cleanupTokens } from 'react-native-markdown-display/src/lib/util/cleanupTokens';
import groupTextTokens from 'react-native-markdown-display/src/lib/util/groupTextTokens';

const markdownIt = MarkdownIt({
  typographer: true,
});

export default function toAST(markdownContent) {
  // Process into an AST, to support the changing of links into blocklinks.
  let tokens = stringToTokens(markdownContent, markdownIt);
  tokens = cleanupTokens(tokens);
  // It's important to blockify the links before grouping the text tokens,
  // as this will exclude the links from being grouped, which would otherwise cause issues on Android.
  tokens = blockifyLinks(tokens);
  tokens = groupTextTokens(tokens);

  return tokensToAST(tokens);
}

// We need links to be treated as 'blocklinks' so that styling works correctly.
function blockifyLinks(tokens) {
  tokens.forEach((token) => {
    if (token.type === 'link') {
      token.block = true;
    }
  });
  return tokens;
}