spulec/uncurl

Unable to parse certain binary curl commands. ValueError("No closing quotation")


When there's binary data in the curl request, the quotation marks sometimes confuse shlex and the command won't parse.

For example

curl -i -s -k -X $'POST'
-H $'Host: ayk-web.mo.konami.net' -H $'User-Agent: UnityPlayer/2020.3.18f1 (UnityWebRequest/1.0, libcurl/7.75.0-DEV)' -H $'Accept: */*' -H $'Accept-Encoding: gzip, deflate' -H $'Content-Type: application/octet-stream' -H $'X_acts: User.home' -H $'Atoken: 23528e425359819ac4252c83e73cbc934095818015c1ff83843c9f62ff1d' -H $'X-Unity-Version: 2020.3.18f1' -H $'Content-Length: 194'
--data $'`\x00^\x00\xd6\x1eq\xe1\xbc[$\xf5\xba\x05O\xd7\xf1'\xad=o\xcf\x8a5\xe4[\x0a2r\xc9\xe4\x1c\xdc/WL\x1b\xe1\x13_AjI?\x9d\xe4\xa7\x0f\x8dbSTE\xd2j\xe2{\x8d\xb4\xc7\xf7/_C\xa6B\x07=c\xa4\xd0\x12\xd7\xdf\x01\x98b\xa1\x0f]\xbaK\xb7a\x19o\x84\x1c\xfe\x95\x96\xfc\xb8\x08\xa0\xf4\xb9\xbb\xf1\x0d{"acts":[{"act":"User.home","id":13}],"v":"1.0.2","ua":"Android/7.1.2/SM-G973N","h":558161692}'
$'https://ayk-web.mo.konami.net/ayk/api/User.home'

Traceback (most recent call last):
  File "<pyshell#46>", line 1, in <module>
    uncurl.parse(curl)
  File "D:\python\lib\site-packages\uncurl\api.py", line 74, in parse
    parsed_context = parse_context(curl_command)
  File "D:\python\lib\site-packages\uncurl\api.py", line 30, in parse_context
    tokens = shlex.split(curl_command)
  File "D:\python\lib\shlex.py", line 305, in split
    return list(lex)
  File "D:\python\lib\shlex.py", line 295, in __next__
    token = self.get_token()
  File "D:\python\lib\shlex.py", line 105, in get_token
    raw = self.read_token()
  File "D:\python\lib\shlex.py", line 187, in read_token
    raise ValueError("No closing quotation")
ValueError: No closing quotation
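
The same error can be reproduced with a much shorter command. The sketch below assumes the trigger is a quote character inside a $'...' string, which shlex reads as an ordinary single-quoted string. `try_split` is a hypothetical helper for illustration, not part of uncurl, and example.com stands in for the real host.

```python
import shlex


def try_split(command):
    """Return the shlex tokens, or the error message if shlex gives up."""
    try:
        return shlex.split(command)
    except ValueError as exc:
        return f"ValueError: {exc}"


# Ordinary single quotes parse as expected.
plain = try_split("curl --data 'abc' http://example.com")

# A $'...' string without an inner quote "parses", but the $ leaks into the token.
leaky = try_split("curl -X $'POST'")

# A $'...' string containing \' leaves the quotes unbalanced from shlex's point of view.
broken = try_split(r"curl --data $'a\'b' http://example.com")

print(plain)
print(leaky)
print(broken)
```

This would also explain why commands with "nicer" binary go through: as long as the data contains no quote character, shlex sees balanced quotes and does not raise.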

Some commands, where the binary is a bit nicer, parse just fine, but others do not.

Is there any band-aid solution to this?

When it's pasted directly into curlconverter.com, it spits out the Python code just fine. I'm not sure what the issue is or how they fixed it.

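Your command uses Bash's ANSI-C quoted strings. They $'look like this' and let you use escape codes like \n in Bash; in other shells you have to write the string with an actual newline character instead. shlex doesn't support them: it reads $'...' as an ordinary single-quoted string, where a backslash is not an escape character, so a quote character inside the data ends the string early and leaves the quotes unbalanced.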
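
As a band-aid in pure Python, one option is a small hand-written tokenizer that understands $'...' in addition to plain quotes. The following is a minimal sketch, not what curlconverter does and not part of uncurl: `split_bash` and `_read_ansi_c` are hypothetical names, only a subset of Bash's escapes and quoting rules is handled, and each \xHH byte is mapped to the character with the same code point (Latin-1 style) so the result stays a str.

```python
import re

# Single-character escapes recognised inside $'...' (a subset of Bash's list).
_SIMPLE = {"n": "\n", "t": "\t", "r": "\r", "a": "\a", "b": "\b", "f": "\f",
           "v": "\v", "e": "\x1b", "E": "\x1b", "\\": "\\", "'": "'",
           '"': '"', "?": "?"}
_HEX = re.compile(r"x([0-9A-Fa-f]{1,2})")
_OCT = re.compile(r"([0-7]{1,3})")


def _read_ansi_c(text, i):
    """Decode the body of a $'...' string starting at text[i].

    Returns (decoded_value, index_after_closing_quote).
    """
    out = []
    while i < len(text):
        ch = text[i]
        if ch == "'":
            return "".join(out), i + 1
        if ch == "\\" and i + 1 < len(text):
            hex_m = _HEX.match(text, i + 1)
            oct_m = _OCT.match(text, i + 1)
            if hex_m:
                out.append(chr(int(hex_m.group(1), 16)))
                i = hex_m.end()
            elif oct_m:
                out.append(chr(int(oct_m.group(1), 8) & 0xFF))
                i = oct_m.end()
            elif text[i + 1] in _SIMPLE:
                out.append(_SIMPLE[text[i + 1]])
                i += 2
            else:
                out.append(ch)  # unknown escape: keep the backslash as-is
                i += 1
        else:
            out.append(ch)
            i += 1
    raise ValueError("No closing quotation")


def split_bash(command):
    """Split a command line into tokens, decoding $'...' strings on the way."""
    tokens, current, in_token = [], [], False
    i, n = 0, len(command)
    while i < n:
        ch = command[i]
        if ch.isspace():
            if in_token:
                tokens.append("".join(current))
                current, in_token = [], False
            i += 1
        elif command.startswith("$'", i):
            value, i = _read_ansi_c(command, i + 2)
            current.append(value)
            in_token = True
        elif ch == "'":
            end = command.find("'", i + 1)
            if end == -1:
                raise ValueError("No closing quotation")
            current.append(command[i + 1:end])
            i, in_token = end + 1, True
        elif ch == '"':
            i += 1
            while i < n and command[i] != '"':
                # Inside double quotes a backslash only escapes these characters.
                if command[i] == "\\" and i + 1 < n and command[i + 1] in '"\\$`':
                    i += 1
                current.append(command[i])
                i += 1
            if i >= n:
                raise ValueError("No closing quotation")
            i, in_token = i + 1, True
        elif ch == "\\" and i + 1 < n:
            if command[i + 1] != "\n":  # backslash-newline is a line continuation
                current.append(command[i + 1])
                in_token = True
            i += 2
        else:
            current.append(ch)
            i, in_token = i + 1, True
    if in_token:
        tokens.append("".join(current))
    return tokens


tokens = split_bash(r"curl -X $'POST' --data $'a\'b\x41\n' $'https://example.com/'")
print(tokens)
```

The returned list plays the role of the one shlex.split would have produced. How to hand it to uncurl is left open here, since uncurl calls shlex.split internally.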

In curlconverter we fixed this problem by using an actual bash parser (tree-sitter-bash) implemented in JavaScript and then writing some code that interprets these strings.

If you'd like, you can use curlconverter's parser (sorta) from Python; see curlconverter/curlconverter#322 (comment)