parse "http://main_test.geekpark.net/rss.rss" failed
allentown521 opened this issue · 14 comments
Caused by: org.jdom2.input.JDOMParseException: Error on line 116: At line 116, column 598: not well-formed (invalid token);
But other RSS clients that do not use Rome handle this feed normally.
I opened this feed in both Chrome and Firefox, and both report an error in the feed's UTF-8. I don't think this is a problem with Rome; Inoreader may well be silent about UTF-8 issues.
You can use String(byte[] bytes, Charset charset) to convert the feed from bytes to a string before passing it to Rome.
You can wrap the string in a StringReader(), as follows:
SyndFeedInput syndFeedInput = new SyndFeedInput();
SyndFeed syndFeed = syndFeedInput.build(new StringReader(feedString));
I don't know how you are getting the url, but you want to read bytes.
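To make the "read bytes, then decode" step concrete, here is a minimal sketch using only the JDK. The class name FeedBytes and its method names are made up for illustration and are not part of Rome. It relies on the documented behaviour of String(byte[] bytes, Charset charset), which replaces malformed input with the charset's replacement character (U+FFFD for UTF-8) instead of failing.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper, not part of Rome: reads a feed as raw bytes and decodes it leniently.
final class FeedBytes {

    // Drain the stream into a byte array without interpreting the content.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Decode with an explicit charset; malformed byte sequences become the
    // charset's replacement character rather than raising an error.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }
}
```

The resulting string could then be wrapped in a StringReader and passed to SyndFeedInput.build(). Note that this hides the bad bytes rather than fixing them, so the affected text will contain replacement characters.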
I just tried this code and it works for me now:
import java.net.URL;

import com.rometools.rome.feed.synd.SyndFeed;
import com.rometools.rome.io.SyndFeedInput;
import com.rometools.rome.io.XmlReader;
final URL url = new URL("http://main_test.geekpark.net/rss.rss");
final SyndFeedInput syndFeedInput = new SyndFeedInput();
final SyndFeed syndFeed = syndFeedInput.build(new XmlReader(url));
The feed XML no longer reports an error in either Chrome or Firefox, so it was well-formed when I tested it this time. It was not well-formed when I tested it before, and I noticed that the error was reported in different parts of the feed each time I reloaded it, which suggests that the bad content was being dynamically generated.
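If the bad content shows up again, one way to pin down where it is would be to run the raw bytes through a strict UTF-8 decoder. The following is only a sketch using the JDK's CharsetDecoder. The class name Utf8Check and its method are hypothetical, and it assumes the feed is meant to be UTF-8. The offset it returns is a byte position in the raw feed, not the line and column that the parser reports.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper, not part of Rome.
final class Utf8Check {

    // Returns the byte offset of the first malformed UTF-8 sequence,
    // or -1 if the whole input decodes cleanly.
    static int firstMalformedOffset(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // UTF-8 never yields more chars than input bytes, so this cannot overflow.
        CharBuffer out = CharBuffer.allocate(bytes.length + 1);
        CoderResult result = decoder.decode(in, out, true);
        // On error, the input buffer is positioned at the start of the bad sequence.
        return result.isError() ? in.position() : -1;
    }
}
```

Running this on each fetch and saving the bytes whenever it returns something other than -1 would also give a reproducible test case.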
Some test cases would be great. Also, did you try this constructor for XmlReader:
public XmlReader(final InputStream is, final boolean lenient)
so:
final SyndFeed syndFeed = syndFeedInput.build(new XmlReader(url.openStream(), true));
I wasn't talking about a test case; I was wondering whether you had tried the 'lenient' flag for XmlReader(). Actually, I don't know what your original code looked like.
OK, the problem is that the bad content has rolled off the feed (I just checked again) and I did not keep a copy, so I can't really dig into this any further. However, I am happy to help if the bad content reappears. I would suggest you close this issue.