Parsing bug (unable to parse valid XML)
Closed this issue · 6 comments
xq 1.1.1 fails to parse valid XML, and I suspect this is a regression. I found this after an upgrade.
I use xq in a script to parse RSS from https://mastodon.social/@flameReactor.rss . After upgrading to 1.1.1, xq started generating errors for the RSS. Previous versions had been working.
I narrowed it down to a small subset of the XML that generates the error. Given the input:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:webfeeds="http://webfeeds.org/rss/1.0" xmlns:media="http://search.yahoo.com/mrss/">
<channel>
<title>Flame Reactor</title>
<description>Public posts from @flameReactor@mastodon.social</description>
<link>https://mastodon.social/@flameReactor</link>
<image>
<url>https://files.mastodon.social/accounts/avatars/108/494/405/727/932/593/original/73c676559a899098.png</url>
<title>Flame Reactor</title>
<link>https://mastodon.social/@flameReactor</link>
</image>
</channel>
</rss>
xq reports an unexpected syntax error at the first <link> element and truncates the output there:
$ xq xml2
XML syntax error on line 6: unexpected end element </link>
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:webfeeds="http://webfeeds.org/rss/1.0" xmlns:media="http://search.yahoo.com/mrss/">
<channel>
<title>Flame Reactor</title>
<description>Public posts from @flameReactor@mastodon.social</description>
<link/>https://mastodon.social/@flameReactor</channel>
</rss>
xmllint confirms the input is valid XML:
$ xmllint xml2 > /dev/null && echo VALID
VALID
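For anyone who wants a second, independent check that does not rely on xmllint, here is a minimal sketch using Python's standard-library parser. The `SAMPLE` constant and the `is_well_formed` helper are names made up for this illustration; the XML is the reduced sample from above. It only tests well-formedness and says nothing about how xq parses the file internally.

```python
import xml.etree.ElementTree as ET

# Reduced sample from the report, reproduced verbatim.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:webfeeds="http://webfeeds.org/rss/1.0" xmlns:media="http://search.yahoo.com/mrss/">
<channel>
<title>Flame Reactor</title>
<description>Public posts from @flameReactor@mastodon.social</description>
<link>https://mastodon.social/@flameReactor</link>
<image>
<url>https://files.mastodon.social/accounts/avatars/108/494/405/727/932/593/original/73c676559a899098.png</url>
<title>Flame Reactor</title>
<link>https://mastodon.social/@flameReactor</link>
</image>
</channel>
</rss>
"""


def is_well_formed(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes parse as XML, False on a syntax error."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print("VALID" if is_well_formed(SAMPLE) else "INVALID")
```

If this prints VALID as well, two unrelated parsers agree that `<link>...</link>` is an ordinary element with text content here.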
Dist: Artix, 6.0.12-artix1-1
xq: xq version 1.1.1 (2022-12-19, f0c2fb2)
Thank you for your report. The regression was introduced in 4a1efbc
I was very close to re-reporting this issue, but thought to check closed issues before opening a new one. Would it make sense to keep issues open until the fix is rolled into a new release? I don't want to tell you how to run your own GitHub project, but figured I'd at least mention the idea...
@namespacebrian It usually works that way. The issue is closed as soon as the fix is committed to the master branch. It may produce duplicate bug reports, but it helps to keep track of closed issues (I even do it using a commit message with the corresponding syntax).
@sibprogrammer I think this is still not fixed. Parsing now happens without error, but queries still fail (in the same way).
Given the original report's XML:
$ xq --version
xq version 1.1.2 (2023-01-15, 7c585a2e6ccd7e6e3adc63e4524023389135c938)
$ xq xml2
# xml is parsed and formatted correctly
$ xq -x '//title' xml2
XML syntax error on line 6: unexpected end element </link>
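For comparison, a rough sketch of what a `//title` query should match on the original report's XML, using Python's standard library instead of xq. The names `SAMPLE` and `titles` are only illustrative, and the output format will not be the same as xq's. The point is that the document has two <title> elements, one under <channel> and one under <image>, and both should be returned.

```python
import xml.etree.ElementTree as ET

# XML from the original report, reproduced verbatim.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:webfeeds="http://webfeeds.org/rss/1.0" xmlns:media="http://search.yahoo.com/mrss/">
<channel>
<title>Flame Reactor</title>
<description>Public posts from @flameReactor@mastodon.social</description>
<link>https://mastodon.social/@flameReactor</link>
<image>
<url>https://files.mastodon.social/accounts/avatars/108/494/405/727/932/593/original/73c676559a899098.png</url>
<title>Flame Reactor</title>
<link>https://mastodon.social/@flameReactor</link>
</image>
</channel>
</rss>
"""

root = ET.fromstring(SAMPLE)

# iter("title") walks every descendant named "title" in document order,
# which is roughly what the XPath expression //title selects.
titles = [element.text for element in root.iter("title")]

print(titles)  # ['Flame Reactor', 'Flame Reactor']
```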
Oh, sorry, the same issue here as well. I reported it separately. See #25
All fixed with 1.1.3, thank you!