matthewwithanm/python-markdownify

Doesn't properly convert image inside table

bertmad3400 opened this issue · 2 comments

Take the following html as an example:

<table> <tbody> <tr> <td> <img src="https://avatars.githubusercontent.com/u/57845632?s=400&u=48a7889c50d234959a518020779ae2dd0c1e6546&v=4"> </td> </tr> </tbody> </table>

When running this through markdownify, the result is not a markdown table containing an image, but a table with a single empty cell, like this:

\n\n| |\n\n

Why this happens is not yet clear to me.

I've hit the same issue. I think this is because once you get to a td element, convert_children_as_inline is set to True, which forces convert_img to return only the content of the image's alt attribute (often blank, sadly).
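To make that explanation concrete, here is a minimal sketch of the behaviour as described. It is a simplified stand-in, not markdownify's actual code: `render_img` is a hypothetical helper, and the URL is a placeholder.

```python
def render_img(alt: str, src: str, convert_as_inline: bool) -> str:
    """Hypothetical model of the described behaviour, not the library's code."""
    if convert_as_inline:
        # Inside a heading or table cell, only the alt text is kept.
        return alt
    # Elsewhere, the full markdown image syntax is emitted.
    return "![%s](%s)" % (alt, src)


# An image without alt text inside a <td> collapses to an empty string,
# which would explain the blank cell in the output above.
cell = render_img(alt="", src="https://example.com/avatar.png", convert_as_inline=True)
print(repr(cell))
```

Under this model, the only ways to keep the image are to give it alt text (which still drops the URL) or to bypass the inline check.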

Edit: creating a custom MarkdownConverter with a tweaked convert_img method seems to be working ok for my needs, but your mileage may vary.

from markdownify import MarkdownConverter


class AlwaysRenderImagesConverter(MarkdownConverter):
    def convert_img(self, el, text: str, convert_as_inline: bool) -> str:
        """Allows images to be rendered in headings and table cells"""
        alt = el.attrs.get("alt", None) or ""
        src = el.attrs.get("src", None) or ""
        title = el.attrs.get("title", None) or ""
        # Escape double quotes so the optional title stays a valid quoted string.
        title_part = ' "%s"' % title.replace('"', r"\"") if title else ""

        # Always emit the full image syntax, ignoring convert_as_inline.
        return "![%s](%s%s)" % (alt, src, title_part)

Thanks for your issue! I added the option keep_inline_images_in; it will be available in the next release (next few days).