louisabraham/har2requests

response["content"]["text"] causes KeyError: 'text'

louisabraham opened this issue · 8 comments

@spider-x reported the following issue in #1

C:\Users\Stefan>har2requests input_booking.har > output_booking.py
Traceback (most recent call last):
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Stefan\AppData\Local\Programs\Python\Python37\Scripts\har2requests.exe\__main__.py", line 9, in <module>
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\click\core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\click\core.py", line 717, in main
    rv = self.invoke(ctx)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\click\core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\click\core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\har2requests\__init__.py", line 177, in main
    entry["request"], entry["response"], entry["startedDateTime"]
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\har2requests\__init__.py", line 67, in from_json
    responseText=response["content"]["text"],
KeyError: 'text'

The har spec says that <content> should contain a "text" field.

@spider-x can you share the har file, for example with https://dpaste.de/? Or at least show the response that doesn't have any text? How did you generate it?

The problem is that if I fix this bug without understanding its cause, the fix could silently introduce errors into other users' code (if there are any other users).

Here is the har file, [deleted]. I generated the har file in Chrome and then also tried it in Firefox.

I deleted your comment because your har file contains cookies that could be used to spoof your identity.

I'll look at your file now.

I'm going to fix it by putting "" when the size is 0, since the text is only needed to detect authorization tokens.

For further reference, I observed the following patterns:

  • When "text" is missing, "size": 0
  • When "size": 0, "text" can be missing or "text": ""
  • "text" is missing exactly when "compression": 0
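
A minimal sketch of that fix, assuming a hypothetical helper named `response_text` (the name and structure are illustrative, not the actual har2requests code). It returns an empty string whenever the recorded size is 0, which covers both observed shapes of an empty body:

```python
def response_text(response):
    """Return the response body, or "" when the HAR entry records no body.

    Hypothetical helper for illustration; not the har2requests source.
    """
    content = response["content"]
    # Per the patterns above, "text" may be absent or "" when size is 0.
    if content["size"] == 0:
        return ""
    return content["text"]


# Both empty-body shapes yield the same result.
print(repr(response_text({"content": {"size": 0}})))              # ''
print(repr(response_text({"content": {"size": 0, "text": ""}})))  # ''
```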

You can now try with version 0.1.0.

I have tried version 0.1.0 and I get this error:

C:\Users\Stefan>har2requests input_booking.har > output_booking.py
Traceback (most recent call last):
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Stefan\AppData\Local\Programs\Python\Python37\Scripts\har2requests.exe\__main__.py", line 9, in <module>
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\click\core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\click\core.py", line 717, in main
    rv = self.invoke(ctx)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\click\core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\click\core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\har2requests\__init__.py", line 177, in main
    entry["request"], entry["response"], entry["startedDateTime"]
  File "c:\users\stefan\appdata\local\programs\python\python37\lib\site-packages\har2requests\__init__.py", line 65, in from_json
    if request["method"] in ["POST", "PUT"]
KeyError: 'postData'

Indeed, there were two more bugs. I fixed them and could run har2requests on your example.
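
The second traceback suggests that `request["postData"]` was read for every POST or PUT, although a HAR entry can omit that field (for example, a POST without a body). The actual fix is not shown in this thread; the following is only a sketch of one way to guard the lookup, with `request_post_data` as a hypothetical helper name:

```python
def request_post_data(request):
    """Return the request's postData dict, or None when there is none.

    Hypothetical helper for illustration; not the har2requests source.
    """
    if request["method"] in ["POST", "PUT"]:
        # .get avoids the KeyError when the entry has no "postData" field.
        return request.get("postData")
    return None


print(request_post_data({"method": "POST"}))  # None
```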

har2requests is very experimental so bugs like this are to be expected.

If you think the behavior can be improved (by identifying more patterns to produce a cleaner output code), don't hesitate to make a feature request.

Also, I suggest you first find the interesting requests in the devtools panel rather than dumping everything as har.

Thank you very much. Your fast updates are amazing. For me, all the requests from the 3rd one onward were interesting, because when I tried it with requests_html it got stuck rendering JavaScript. Thank you very much again :)

The har spec says that <content> should contain a "text" field.

Actually the link you reference states that the text field is optional. I modified line 68 in __init__.py to read:

if response["content"]["size"] > 0 and "text" in response["content"]

This fixed the error for me.
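
For context, that condition could sit in a conditional expression like the sketch below. The surrounding expression, the `""` fallback, and the helper name `response_text_checked` are assumptions based on the earlier traceback, not the actual source. The point is that the membership test also covers an entry whose size is non-zero but whose body was not recorded:

```python
def response_text_checked(response):
    """Hypothetical helper for illustration; not the har2requests source."""
    content = response["content"]
    return (
        content["text"]
        if content["size"] > 0 and "text" in content
        else ""
    )


# A non-zero size without a recorded body no longer raises KeyError.
print(repr(response_text_checked({"content": {"size": 120}})))  # ''
```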