freme-project/e-Internationalization

Whitelabel Error Page error when submitting large HTML

Closed · 8 comments

m1ci commented

When submitting

curl -X POST --header "Content-Type: text/html" --header "Accept: text/html" -d @test.html "http://api-dev.freme-project.eu/current/e-entity/freme-ner/documents?language=en&dataset=dbpedia&mode=all" -v

with this file (test.html).

We get

<html><body><h1>Whitelabel Error Page</h1><p>This application has no explicit mapping for /error, so you are seeing this as a fallback.</p><div id='created'>Thu Oct 29 10:28:42 CET 2015</div><div>There was an unexpected error (type=Internal Server Error, status=500).</div><div>String index out of range: 701</div></body></html>

I could reproduce the error and traced it down to the e-Internationalization code. I created a unit test in e-Internationalization that reproduces it. The test is currently commented out, so uncomment it and run the tests to see the error.

@borriellom please take a look at the error and fix it.

Fixed.
I have also committed the file long-html-enriched.ttl for the unit test. It was generated by converting the HTML file to NIF and then enriching it through the FREME NER service.
I made some changes to the NIF generation process as well.

Please check whether it works now.

It is still not working when I execute the command from @m1ci. It seems to download part of the result, then it stalls, and after a while the request times out.

The result is complete; I don't know why it tries to read further bytes. @jnehring do you think it could be an issue in the broker? Smaller files work fine.
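
One way to check the symptom described here (a body that looks complete while the client keeps waiting) is to compare the `Content-Length` the server declares with the number of bytes actually received. The sketch below is only an illustration: `diagnose_length` is a hypothetical helper, not part of FREME, and it assumes the response headers and body have already been captured, for example from the `curl -v` output.

```python
from typing import Mapping, Optional


def diagnose_length(headers: Mapping[str, str], body: bytes) -> str:
    """Compare the declared Content-Length with the bytes received.

    Hypothetical helper for illustration; not part of FREME.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    declared: Optional[str] = lowered.get("content-length")
    if declared is None:
        return "no Content-Length header (chunked or connection-close framing)"
    expected = int(declared.strip())
    received = len(body)
    if received < expected:
        # A client keeps waiting for the missing bytes until it times out.
        return f"short body: {expected - received} bytes missing"
    if received > expected:
        return f"long body: {received - expected} extra bytes"
    return "lengths match"


print(diagnose_length({"Content-Length": "10"}, b"12345678"))
# short body: 2 bytes missing
```

A "short body" result would match a download that stalls and then times out, but it does not say which component set the wrong length.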

The bugfixing period for FREME 0.4 ended today and I am currently preparing the release, so this bug will ship with the live version of FREME 0.4.

We will fix this bug in FREME 0.5.

There was an additional bug in the broker that prevented long HTML from being processed properly. I created a bugfix, and now I can process @m1ci's request. I also created an integration test using @m1ci's request.
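
For anyone who wants to replay the request from a script rather than from curl, a minimal Python sketch follows. `build_request` and `send` are hypothetical helper names, and this is not the integration test mentioned above. The URL and headers are copied from the original curl command. Unlike `curl -d @test.html`, which strips newlines from the file, this sends the file bytes unchanged (the equivalent of `--data-binary`).

```python
import urllib.request

# Endpoint copied from the curl command in the original report.
FREME_NER_URL = (
    "http://api-dev.freme-project.eu/current/e-entity/freme-ner/documents"
    "?language=en&dataset=dbpedia&mode=all"
)


def build_request(html: bytes, url: str = FREME_NER_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request from the report."""
    return urllib.request.Request(
        url,
        data=html,
        headers={"Content-Type": "text/html", "Accept": "text/html"},
        method="POST",
    )


def send(path: str, timeout: float = 60.0) -> int:
    """Post the file at `path` and return the number of bytes received."""
    with open(path, "rb") as handle:
        request = build_request(handle.read())
    # The timeout makes a stalled response fail instead of hanging forever.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return len(response.read())
```

Calling `send("test.html")` raises an error on an HTTP 500 or on a timeout, which makes a stalled or failed response easy to spot in a script.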

@m1ci and @pheyvaer please test and close the issue in case it is resolved.

Works for me.

m1ci commented

Works for me too.