techprimate/TPPDF

Attributed text from markdown is rendered as regular text


What did you do?

I created a PDF and added an attributed text generated from a markdown string. However, the text is rendered to the PDF document as plain text, without any formatting attributes. Here is the code for reference:

init() {
    let document = TPPDF.PDFDocument(format: .usLetter)
        
    let markdownString = try? NSAttributedString(markdown: "This is a *simple* **test**")
    let textElementObject = PDFAttributedText(text: markdownString!)
    document.add(attributedTextObject: textElementObject)
                       
    let generator = PDFGenerator(document: document)
    let url = try? generator.generateURL(filename: "Example.pdf")
}

Just for completeness, I also tried the following:

let markdownString = try? NSAttributedString(markdown: "This is a *simple* **test**")
document.add(attributedText: markdownString)

What did you expect to happen?

I expected the text to be rendered like this:

This is a simple test

That is, I expected the string to be rendered according to the markdown syntax, with "simple" in italics and "test" in bold.

What happened instead?

The text was rendered like this:

This is a simple test

So the text was rendered without any attributes. Please check the attached PDF for the result.

It is worth noting, however, that the * characters are not rendered. So it seems that the markdown syntax is parsed correctly, but the resulting attributes are simply ignored.

TPPDF Environment

TPPDF version: 2.6.0
Xcode version: 15.4
Swift version: 5

Demo Code / Project

You can download an Xcode project (macOS app) from my repo
The relevant code is here

And here is the generated PDF:
Example.pdf

TL;DR: I believe CoreText is not able to render the Markdown elements, so we need to implement a workaround.

Research & Development Conclusion:

I started off by looking at the NSAttributedString that is rendered in PDFAttributedTextObject.draw() using CTFramesetterCreateWithAttributedString.

It looks like this:

(lldb) po attributedString
 Optional<NSAttributedString>
  - some : This is a {
    NSPresentationIntent = "<NSPresentationIntent 0x60000262e0a0>: Paragraph (id 1)";
}simple{
    NSInlinePresentationIntent = 1;
    NSPresentationIntent = "<NSPresentationIntent 0x60000262e0a0>: Paragraph (id 1)";
} {
    NSPresentationIntent = "<NSPresentationIntent 0x60000262e0a0>: Paragraph (id 1)";
}test{
    NSInlinePresentationIntent = 2;
    NSPresentationIntent = "<NSPresentationIntent 0x60000262e0a0>: Paragraph (id 1)";
}

Looking at the Apple Documentation for NSPresentationIntent, this seems to be a container holding Markdown attributes for the range of characters.

At this point my assumption is that CoreText cannot handle the NSPresentationIntent attributes and therefore ignores them.

Furthermore, the words simple and test have an NSInlinePresentationIntent configured (1 for emphasized, 2 for strongly emphasized), so at least the attributed string matches the expected markdown.
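The same check can be done in code instead of the debugger. The following is a minimal sketch, assuming an Apple platform with iOS 15 / macOS 12 or later; `inlineIntents(in:)` is a hypothetical helper name and is not part of TPPDF or Foundation. It collects every run that carries an inline presentation intent together with its raw value.

```swift
import Foundation

// Hypothetical helper (not part of TPPDF): list the runs that carry an
// inline presentation intent, together with the intent's raw value.
@available(iOS 15.0, macOS 12.0, *)
func inlineIntents(in text: NSAttributedString) -> [(substring: String, rawValue: UInt)] {
    var result: [(substring: String, rawValue: UInt)] = []
    let fullRange = NSRange(location: 0, length: text.length)
    text.enumerateAttribute(.inlinePresentationIntent, in: fullRange) { value, range, _ in
        // Runs without an inline intent report nil and are skipped.
        guard let number = value as? NSNumber else { return }
        result.append((text.attributedSubstring(from: range).string, number.uintValue))
    }
    return result
}

if #available(iOS 15.0, macOS 12.0, *),
   let parsed = try? NSAttributedString(markdown: "This is a *simple* **test**") {
    for run in inlineIntents(in: parsed) {
        print(run.substring, run.rawValue)
    }
}
```

Going by the debugger dump above, this should report the run "simple" with raw value 1 and the run "test" with raw value 2.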

During further research I found the following repository, which reports similar issues:
https://github.com/frankrausch/AttributedStringStyledMarkdown. They work around the issue by replacing the presentation intents with CoreText-compatible attributes:

https://github.com/frankrausch/AttributedStringStyledMarkdown/blob/main/AttributedStringStyledMarkdown/AttributedString%2BStyledMarkdown.swift#L38-L67

I tested the same approach in TPPDF and it seems to resolve the issue. To reproduce, please modify PDFAttributedTextObject.generateAttributedText to replace the incompatible attributes in the attributed string:

func generateAttributedText(generator: PDFGenerator, container: PDFContainer) throws -> NSAttributedString {
...
    } else if let attributedText = attributedText {
        let mutableAttrString = NSMutableAttributedString(attributedString: attributedText.text)
        mutableAttrString.enumerateAttributes(in: NSRange(location: 0, length: mutableAttrString.length)) { attrs, range, _ in
            if #available(iOS 15.0, macOS 12.0, *) {
                if let presentationIntent = attrs[.presentationIntentAttributeName] as? PresentationIntent {
                    // TODO: replace the presentation intent with CoreText compatible attributes
                }
                if let inlinePresentationIntent = attrs[.inlinePresentationIntent] as? UInt {
                    mutableAttrString.removeAttribute(.inlinePresentationIntent, range: range)
                    switch InlinePresentationIntent(rawValue: inlinePresentationIntent) {
                    case .emphasized:
                        mutableAttrString.addAttribute(.font, value: Font.italicSystemFont(ofSize: Font.systemFontSize), range: range)
                    case .stronglyEmphasized:
                        mutableAttrString.addAttribute(.font, value: Font.boldSystemFont(ofSize: Font.systemFontSize), range: range)
                    default:
                        // TODO: implement all presentation intent
                        break
                    }
                }
            }
        }
        return mutableAttrString
    } else {
    ...
}

It should then render the text correctly.
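One caveat with the switch above: it matches exact values only, so a run that is both emphasized and strongly emphasized (presumably raw value 3, for example from `***text***`) would fall through to the default case. A possible way to cover that case is sketched below, assuming an Apple platform with CoreText available. `symbolicTraits(for:)` and `styledFont(from:intent:)` are hypothetical helper names, not TPPDF API, and the sketch derives the styled font from a base font instead of using the system font.

```swift
import CoreText
import Foundation

// Hypothetical helper (not part of TPPDF): translate inline intents into
// CoreText symbolic traits so that combined emphasis is preserved.
@available(iOS 15.0, macOS 12.0, *)
func symbolicTraits(for intent: InlinePresentationIntent) -> CTFontSymbolicTraits {
    var traits: CTFontSymbolicTraits = []
    if intent.contains(.emphasized) { traits.insert(.traitItalic) }
    if intent.contains(.stronglyEmphasized) { traits.insert(.traitBold) }
    return traits
}

// Derive a styled variant of `base`. Falls back to `base` when CoreText
// cannot find a matching face in the font family.
@available(iOS 15.0, macOS 12.0, *)
func styledFont(from base: CTFont, intent: InlinePresentationIntent) -> CTFont {
    let traits = symbolicTraits(for: intent)
    guard !traits.isEmpty else { return base }
    return CTFontCreateCopyWithSymbolicTraits(base, 0, nil, traits, traits) ?? base
}

if #available(iOS 15.0, macOS 12.0, *) {
    let base = CTFontCreateWithName("Helvetica" as CFString, 12, nil)
    let boldItalic = styledFont(from: base, intent: [.emphasized, .stronglyEmphasized])
    print(CTFontCopyPostScriptName(boldItalic))
}
```

Deriving the font from a base font keeps the size and family of whatever font the text already uses, at the cost of silently falling back to the base font when the family has no bold or italic face.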


Please confirm that this approach is viable, so we can then further extend the attribute mapping.

Thank you @philprime for the input and investigating this!

That's an unfortunate limitation of parsing Markdown into an attributed string. Interesting that they followed that approach; it might be to keep the result independent of any font or other configuration that the text might eventually be rendered with.

I think this sounds like a viable approach, and it might be a nice addition to TPPDF to address the issue we have in our PR. How do you want to proceed here, @philprime & @RealLast? Should we make a PR, or do you want to take a stab at this, @philprime?

@PSchmiedmayer I am currently short on time to invest in this.

If this issue has a high priority for you, I need to kindly ask you to create a PR.
As always, I am happy to give feedback.

Thank you @philprime! I will check back with @RealLast about the criticality once he is back from vacation, but I could see this as a nice addition that we can add as part of our work on the Spezi ecosystem; the changes sound like straightforward additions.