gawel/pyquery

Incorrect handling of whitespace around inline elements for text()


The text() function uses the INLINE_TAGS set to decide how newlines around an element are handled, but this set does not appear to have been updated in quite a while. The last comprehensive list on MDN, from before the page was rewritten in April 2023, includes the following elements; those missing from INLINE_TAGS are marked with a cross:

  • <a>
  • <abbr>
  • <acronym>
  • <audio> ❌
  • <b>
  • <bdi> ❌
  • <bdo>
  • <big>
  • <br>
  • <button>
  • <canvas> ❌
  • <cite>
  • <code>
  • <data> ❌
  • <datalist> ❌
  • <del> ❌
  • <dfn>
  • <em>
  • <embed> ❌
  • <i>
  • <iframe> ❌
  • <img>
  • <input>
  • <ins> ❌
  • <kbd>
  • <label>
  • <map>
  • <mark> ❌
  • <meter> ❌
  • <noscript> ❌
  • <object>
  • <output> ❌
  • <picture> ❌
  • <progress> ❌
  • <q>
  • <ruby> ❌
  • <s> ❌
  • <samp>
  • <script>
  • <select>
  • <slot> ❌
  • <small>
  • <span>
  • <strong>
  • <sub>
  • <sup>
  • <svg> ❌
  • <template> ❌
  • <textarea>
  • <time>
  • <u> ❌
  • <tt>
  • <var>
  • <video> ❌
  • <wbr> ❌
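
For reference, the comparison above can be reproduced with a plain set difference. This is only a sketch: `MDN_INLINE` is transcribed from the list above, and `PRESENT` is a hypothetical stand-in for the current contents of INLINE_TAGS, reconstructed from the entries without a cross rather than imported from pyquery, so it may not match the real set exactly.

```python
# Inline elements from the pre-April-2023 MDN list, as quoted in this issue.
MDN_INLINE = {
    "a", "abbr", "acronym", "audio", "b", "bdi", "bdo", "big", "br", "button",
    "canvas", "cite", "code", "data", "datalist", "del", "dfn", "em", "embed",
    "i", "iframe", "img", "input", "ins", "kbd", "label", "map", "mark",
    "meter", "noscript", "object", "output", "picture", "progress", "q",
    "ruby", "s", "samp", "script", "select", "slot", "small", "span",
    "strong", "sub", "sup", "svg", "template", "textarea", "time", "u", "tt",
    "var", "video", "wbr",
}

# Assumption: stand-in for pyquery's INLINE_TAGS, rebuilt from the unmarked
# entries in the list above (not read from the library itself).
PRESENT = {
    "a", "abbr", "acronym", "b", "bdo", "big", "br", "button", "cite", "code",
    "dfn", "em", "i", "img", "input", "kbd", "label", "map", "object", "q",
    "samp", "script", "select", "small", "span", "strong", "sub", "sup",
    "textarea", "time", "tt", "var",
}

# Tags that MDN listed as inline but that the stand-in set lacks.
missing = sorted(MDN_INLINE - PRESENT)

# A possible updated set would simply be the union of both.
proposed = PRESENT | set(missing)

print(len(missing), "missing:", ", ".join(missing))
```

Swapping `PRESENT` for the actual set from the installed pyquery version would show whether the marked entries are still accurate.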