Quote in proxy string causes InvalidURL: Failed to parse.
When attempting to make a request through a proxy configured with a basic username and password, an InvalidURL
exception is raised if the proxy string contains a quote.
This appears to be new in the 2.22.0 release. We downgraded to 2.21.0 and are experiencing no issues.
Expected Result
Expected quote to be allowed in proxy string and for the proxy connection to succeed.
Actual Result
Got the following error:
InvalidURL: Failed to parse: http://user:pass"with"quote@example.org:3128/
Here's the full stacktrace:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 412, in send
    conn = self.get_connection(request.url, proxies)
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 305, in get_connection
    proxy_url = parse_url(proxy)
  File "/usr/local/lib/python3.6/site-packages/urllib3/util/url.py", line 234, in parse_url
    raise LocationParseError(url)
urllib3.exceptions.LocationParseError: Failed to parse: http://user:pass"with"quote@example.org:3128/

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 414, in send
    raise InvalidURL(e, request=request)
requests.exceptions.InvalidURL: Failed to parse: http://user:pass"with"quote@example.org:3128/
Reproduction Steps
import requests
proxies = {'http': 'http://user:pass"with"quote@example.org:3128/'}
requests.get('http://example.org', proxies=proxies)
System Information
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.8"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.6.6"
  },
  "platform": {
    "release": "4.9.125-linuxkit",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.22.0"
  },
  "system_ssl": {
    "version": "1010006f"
  },
  "urllib3": {
    "version": "1.25.2"
  },
  "using_pyopenssl": false
}
@hwstovall Have you tried replacing any quote with %22?
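For anyone wanting to try the %22 suggestion without hand-editing the string, here is a minimal sketch that percent-encodes the credentials before building the proxy URL. The helper name `build_proxy_url` is made up for illustration, and the host and port are simply the ones from the report; whether a particular proxy accepts the encoded credentials is a separate question.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Hypothetical helper: build a proxy URL with percent-encoded credentials."""
    # safe="" makes quote() escape ":", "@", "/" and quotes as well,
    # so none of them can be mistaken for URL delimiters.
    user = quote(username, safe="")
    pw = quote(password, safe="")
    return f"{scheme}://{user}:{pw}@{host}:{port}/"


proxy_url = build_proxy_url("user", 'pass"with"quote', "example.org", 3128)
proxies = {"http": proxy_url}
print(proxy_url)

# The dict can then be passed on as in the reproduction steps, e.g.
# requests.get("http://example.org", proxies=proxies)
```

The same helper also covers the email-address usernames mentioned further down, since "@" is escaped as %40.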
A little research seems to indicate this may be a problem with one of the urllib methods. Perhaps somebody more familiar with the code base could confirm.
Also, a username containing @, for example superuser80@gmail.com, will raise InvalidURL.
Having the same issue with usernames containing @.
Confirmed, but it isn't happening in normal usage on my side. A simple script like this:
import requests

s = requests.Session()
addr = "http://lum-customer-hl_*****f-zone-****-country-ru:******@zproxy.lum-superproxy.io:22225"
s.proxies = {
    "http": addr,
    "https": addr,
}
r = s.get("http://lumtest.com/myip.json")
r.raise_for_status()
print(r.json())
works like a charm. But if I move the same logic to the GAE standard environment, it triggers the above error, with or without the toolbelt monkeypatch, whether or not I'm using the httplib sockets, in production or locally.
My workaround was to simply change the password to one that doesn't contain a special character such as @.
I've found out that the error is triggered by urllib3 at util/url.py line 234 on:
if has_authority and uri_ref.authority is None:
    raise LocationParseError(url)
so I guess it's an external issue, but why is this not happening in every environment?
The issue is actually in urllib3 1.25.3, which gets installed when installing requests 2.22.0 (if you have an old urllib3). I do not see this issue with requests 2.22.0 if I downgrade urllib3 to 1.24.3, which is also the version that requests 2.21.0 installs.
There is definitely a change in how we handle the userinfo section of a URL to strictly match with RFC 3986. We can probably be more relaxed with our parsing of that section.
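To make "strictly match RFC 3986" concrete, here is a rough sketch of the userinfo grammar from that RFC (unreserved characters, percent-escapes, sub-delims, and ":") as a regular expression. It is only an illustration of the rule, not urllib3's actual parser, and the function name `is_strict_userinfo` is made up.

```python
import re

# RFC 3986: userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
_USERINFO_RE = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:]|%[0-9A-Fa-f]{2})*")


def is_strict_userinfo(userinfo):
    """Return True if the string is valid userinfo under the strict grammar."""
    return _USERINFO_RE.fullmatch(userinfo) is not None


# A raw double quote or "@" is outside the grammar; the escaped form is fine.
for candidate in ('user:pass"with"quote',
                  "user:pass%22with%22quote",
                  "superuser80@gmail.com:secret"):
    print(candidate, is_strict_userinfo(candidate))
```

Under this grammar both the quoted password from the original report and the email-style usernames are rejected unless they are percent-encoded first.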
@sethmlarson The issue is easily reproducible if you use requests.get("https://user-with-@-inside:password@somewebsite.com")
Thanks for the reproducer. I'm working on a fix over on urllib3.
Having the same problem. I am using basic auth and quite often the username is an email address.
Quoting the proxy URL allows requests (urllib3) to send the request, but it appears the proxy service cannot handle this, and the request fails with a 407.
Looking forward to the fix.
Passing user credentials as part of the URL is not the proper way to authenticate. It makes it difficult to handle kinds of authentication other than Basic.
Authentication is supposed to be passed as an argument to requests when making connections, so that it is independent of the URL.
You can also refer to the requests authentication docs for more details.
I have also explained the same in my StackOverflow answer about a library that depends on requests internally.
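As a rough illustration of why credentials passed separately sidestep the URL-parsing problem: Basic credentials end up as a base64-encoded header value, so characters like @ or a double quote never need URL escaping. The sketch below builds that value by hand with the standard library; the function name `basic_auth_header` and the UTF-8 encoding are assumptions made for illustration. requests ships `requests.auth.HTTPProxyAuth`, which sets a `Proxy-Authorization` header of this kind, but whether a given proxy setup honors it is not something this thread confirms.

```python
import base64


def basic_auth_header(username, password):
    """Hypothetical helper: build the value of a Basic auth header."""
    # Assumption: credentials are encoded as UTF-8 before base64.
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


headers = {
    "Proxy-Authorization": basic_auth_header("superuser80@gmail.com", 'pass"with"quote')
}

# Decoding the header gives back the credentials unchanged,
# special characters included.
encoded = headers["Proxy-Authorization"].split(" ", 1)[1]
print(base64.b64decode(encoded).decode("utf-8"))
```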
@sethmlarson Does this still have anything to do with urllib3?
Yep, this has everything to do with urllib3. Our URL parser isn't percent-encoding characters within the userinfo section of the URL. I've merged a PR into master on urllib3 that should fix this issue, but I haven't prepared a release yet (new job, moved to a new apartment). I'm trying to get back into OSS.
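For readers wondering what "percent-encoding characters within the userinfo section" might look like, here is a minimal sketch that repairs an existing URL string. It is not the code from the urllib3 PR; the function name `escape_userinfo` is made up, and it assumes the userinfo ends at the last "@" in the authority and contains no "/", "?" or "#".

```python
from urllib.parse import quote

# Characters RFC 3986 allows unescaped in userinfo besides letters and digits,
# plus "%" so that existing percent-escapes are left alone.
_USERINFO_SAFE = "-._~!$&'()*+,;=:%"


def escape_userinfo(url):
    """Hypothetical helper: percent-encode invalid characters in the userinfo."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        return url  # no authority component to inspect
    # The authority runs up to the first "/", "?" or "#".
    positions = [i for i in (rest.find(c) for c in "/?#") if i != -1]
    cut = min(positions, default=len(rest))
    authority, tail = rest[:cut], rest[cut:]
    # Split at the LAST "@" so an "@" inside the username stays in userinfo.
    userinfo, at, hostport = authority.rpartition("@")
    if not at:
        return url  # no credentials in the URL
    return f"{scheme}://{quote(userinfo, safe=_USERINFO_SAFE)}@{hostport}{tail}"


print(escape_userinfo('http://user:pass"with"quote@example.org:3128/'))
print(escape_userinfo("https://user-with-@-inside:password@somewebsite.com"))
```

Splitting at the last "@" is the relaxed part: a strict parser treats the unescaped "@" in the username as an error instead.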
Try percent-encoding it (URL encoding), then make the request.
Could you give this a shot after upgrading urllib3? It should be fixed in recent versions.
python -m pip install --upgrade urllib3
This was resolved in urllib3 1.25.9.
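A quick way to check whether an environment already has that release is to compare the installed version against 1.25.9. This is a minimal sketch using `importlib.metadata` (Python 3.8+, so newer than the 3.6 in the original report); the helper names `version_tuple` and `at_least` are made up, and the parsing only looks at the leading digits of the first three version components.

```python
import re
from importlib import metadata  # standard library from Python 3.8


def version_tuple(version):
    """Turn '1.25.9' into (1, 25, 9), ignoring suffixes such as 'a1'."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def at_least(version, minimum="1.25.9"):
    """Return True if version is greater than or equal to minimum."""
    return version_tuple(version) >= version_tuple(minimum)


try:
    installed = metadata.version("urllib3")
    print("urllib3", installed, "is at least 1.25.9:", at_least(installed))
except metadata.PackageNotFoundError:
    print("urllib3 is not installed")
```

Note that a result of False does not by itself mean the bug is present: per the comments above, 1.24.3 was not affected.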