trentm/python-markdown2

Table Conversion Issue with Trailing Spaces/Tabs

syntaxsurge opened this issue · 0 comments

Description

When converting Markdown tables to HTML with markdown2 and the 'tables' extra, a table is not recognized if its rows have trailing spaces or tabs after the closing pipe (|). Instead of an HTML table, the output is a single paragraph with line breaks (<br />).

Steps to Reproduce

The issue can be reproduced with the following Python script:

import markdown2

markdown_text = """
| Pros                                   | Cons                                          |  
|-----------------------------------------|------------------------------------------------|  
| Unique and refreshing take on the genre | May not resonate with all viewers              |  
| Cult classic status                      | Over-the-top humor may polarize audiences      |  
| Influential in launching careers         | Niche appeal among comedy aficionados          |
"""

html = markdown2.markdown(markdown_text, extras=['tables'])
print(html)

In this example, every row except the last ends with trailing whitespace after the closing pipe (|). The expected behavior is for the library to ignore trailing spaces or tabs and convert the table to HTML.
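To confirm which rows are affected before filing or debugging, a small check along these lines may help. It is only a sketch using the standard library; the name `rows_with_trailing_whitespace` is hypothetical and not part of markdown2.

```python
import re

# Matches a closing pipe followed by one or more spaces/tabs at the end of a line.
TRAILING_AFTER_PIPE = re.compile(r"\|[ \t]+$")


def rows_with_trailing_whitespace(text):
    """Return (line_number, repr(line)) for each line that ends with
    a pipe followed by trailing spaces or tabs.

    Hypothetical helper for diagnosis only; line numbers are 1-based.
    """
    return [
        (number, repr(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if TRAILING_AFTER_PIPE.search(line)
    ]


if __name__ == "__main__":
    sample = "| a | b |  \n|---|---|\t\n| 1 | 2 |\n"
    for number, shown in rows_with_trailing_whitespace(sample):
        # repr() makes the otherwise invisible trailing characters visible.
        print(number, shown)
```

Running the helper on `markdown_text` from the script above should list the rows whose closing pipe is followed by whitespace.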

Expected Output

The expected output should be an HTML table, something like:

<table>
  <tr>
    <td>Pros</td>
    <td>Cons</td>
  </tr>
  <tr>
    <td>Unique and refreshing take on the genre</td>
    <td>May not resonate with all viewers</td>
  </tr>
  ...
</table>

Actual Output

However, the actual output is:

<p>| Pros                                   | Cons                                          | <br />
|-----------------------------------------|------------------------------------------------| <br />
| Unique and refreshing take on the genre | May not resonate with all viewers              | <br />
| Cult classic status                      | Over-the-top humor may polarize audiences      | <br />
| Influential in launching careers         | Niche appeal among comedy aficionados          |</p>

The table is incorrectly rendered as a paragraph. Each <br /> corresponds to a row ending in two trailing spaces, which Markdown treats as a hard line break.

Impact of the Issue

This issue complicates the processing of Markdown tables, especially when the content is generated dynamically and may inadvertently include trailing spaces or tabs. Parsing that tolerates or ignores these trailing characters would make the 'tables' extra considerably more robust.
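Until the parser handles this, one possible workaround is to clean the input before conversion. The sketch below is not a fix inside markdown2: the helper name `strip_whitespace_after_pipes` is hypothetical, and it assumes `\n` line endings. It removes spaces and tabs only when they follow a closing pipe, so intentional hard line breaks (two trailing spaces) on ordinary text lines are left alone.

```python
import re

# A pipe followed by spaces/tabs at the end of a line (MULTILINE makes
# "$" match at every line end, not just the end of the string).
_PIPE_THEN_WHITESPACE = re.compile(r"\|[ \t]+$", flags=re.MULTILINE)


def strip_whitespace_after_pipes(text):
    """Remove trailing spaces/tabs that directly follow a closing pipe.

    Hypothetical pre-processing helper. Lines that do not end in
    '|' plus whitespace are returned unchanged.
    """
    return _PIPE_THEN_WHITESPACE.sub("|", text)


if __name__ == "__main__":
    sample = "| a | b |  \n|---|---|\t\n| 1 | 2 |\nhard break  \n"
    print(repr(strip_whitespace_after_pipes(sample)))
```

The cleaned text can then be passed to the same call used in the reproduction script, `markdown2.markdown(cleaned, extras=['tables'])`. Note that this sketch also rewrites matching lines inside code blocks, which may or may not be acceptable for a given document.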

Thank you for considering this issue and looking into a potential fix or improvement.