
Error while running EHR phenolyzer

I get the following error when I run the app on my own clinical note:

./ehr_phenolyzer.py -i example/sepsis.txt -p sepsis -n "NCBOannotator" > ehr_phenolyzer.log

Traceback (most recent call last):
File "./ehr_phenolyzer.py", line 127, in
File "/home/shayantan/Desktop/EHR-Phenolyzer/lib/pyncbo_annotator.py", line 53, in run_ncbo_annotator
ncbo_json = json.loads(bopen.open(url_info).read())
File "/home/shayantan/anaconda3/lib/python3.7/urllib/request.py", line 531, in open
response = meth(req, response)
File "/home/shayantan/anaconda3/lib/python3.7/urllib/request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "/home/shayantan/anaconda3/lib/python3.7/urllib/request.py", line 569, in error
return self._call_chain(*args)
File "/home/shayantan/anaconda3/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/home/shayantan/anaconda3/lib/python3.7/urllib/request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 414: Request-URI Too Large

Hi Shayantan,
How big is your input file "sepsis.txt" (how many words does it contain)? Can you run the EHR-Phenolyzer test script test/test_ncbo_annotator.sh successfully in your environment (from EHR-Phenolyzer/test, run: bash test_ncbo_annotator.sh)?


@MenggeZhao thinks this is likely an HTTP GET error caused by the limit on the number of characters in the request URL. We will switch to POST and update the repository soon.

@gangcai Each text is around 200 sentences long (on average). I have 4500 such texts in 'sepsis.txt'. I have successfully run the test sample data and got the files in the "out" folder.

I changed HTTP GET to HTTP POST for calls to the NCBO annotator. I also fixed a bug in the same file that could cause a KeyError.
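
For anyone patching an older checkout, here is a minimal sketch of the GET-to-POST idea using only the standard library. It is not the code in lib/pyncbo_annotator.py. The endpoint URL, the parameter names (text, ontologies, apikey), and the helper name build_annotator_request are assumptions for illustration. The point is only that passing data= to urllib.request.Request moves the note out of the URL and into the request body, which avoids the 414 error.

```python
import urllib.parse
import urllib.request

# Assumed endpoint and parameter names; check them against your NCBO setup.
ANNOTATOR_URL = "http://data.bioontology.org/annotator"


def build_annotator_request(text, api_key, ontologies="HP"):
    """Build a POST request that carries the note in the body, not the URL."""
    params = {"text": text, "ontologies": ontologies, "apikey": api_key}
    body = urllib.parse.urlencode(params).encode("utf-8")
    # Supplying data= makes urllib send the request as POST.
    return urllib.request.Request(ANNOTATOR_URL, data=body)


# Build (but do not send) a request for a long note.
req = build_annotator_request("fever and hypotension " * 5000, "YOUR_API_KEY")
print(req.get_method(), len(req.full_url), len(req.data))
```

To send it, pass the request to urllib.request.urlopen (or to the opener the script already uses) and parse the response with json.loads as before.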

The HTTP protocol itself places no limit on the size of a note sent to the NCBO annotator with the POST method, but the NCBO annotator server may have its own limit. Users should therefore keep each note under 300KB; otherwise the NCBO annotator may fail to respond and the request may end with an HTTP 408 error: Request Timeout.
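
If your input holds many notes, as in the 4500-text file above, one way to stay under that size is to group the notes into batches before submitting them. The sketch below assumes one note per line and measures size in UTF-8 bytes. The helper name batch_notes and the one-note-per-line layout are illustrative assumptions, not part of EHR-Phenolyzer. A single note larger than the limit is placed in its own batch rather than split.

```python
def batch_notes(notes, max_bytes=300 * 1024):
    """Greedily pack notes into newline-joined batches of at most max_bytes."""
    batches, current, size = [], [], 0
    for note in notes:
        # Count the note's UTF-8 size plus one byte for the joining newline.
        note_size = len(note.encode("utf-8")) + 1
        if current and size + note_size > max_bytes:
            batches.append("\n".join(current))
            current, size = [], 0
        current.append(note)
        size += note_size
    if current:
        batches.append("\n".join(current))
    return batches


# Example with synthetic notes of 1000 characters each.
notes = ["a" * 1000] * 1000
batches = batch_notes(notes)
print(len(batches), max(len(b.encode("utf-8")) for b in batches))
```

Each batch can then be written to its own file and passed to ehr_phenolyzer.py with -i, one run per batch.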