psf/requests

Quote in proxy string causes InvalidURL: Failed to parse.

Closed this issue · 14 comments

When attempting to make a request through a proxy configured with a basic username and password, an InvalidURL exception is raised if the proxy string contains a quote.

This appears to be new in the 2.22.0 release. We downgraded to 2.21.0 and are experiencing no issues.

Expected Result

Expected quote to be allowed in proxy string and for the proxy connection to succeed.

Actual Result

Got the following error:
InvalidURL: Failed to parse: http://user:pass"with"quote@example.org:3128/

Here's the full stacktrace:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 412, in send
    conn = self.get_connection(request.url, proxies)
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 305, in get_connection
    proxy_url = parse_url(proxy)
  File "/usr/local/lib/python3.6/site-packages/urllib3/util/url.py", line 234, in parse_url
    raise LocationParseError(url)
urllib3.exceptions.LocationParseError: Failed to parse: http://user:pass"with"quote@example.org:3128/

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 414, in send
    raise InvalidURL(e, request=request)
requests.exceptions.InvalidURL: Failed to parse: http://user:pass"with"quote@example.org:3128/

Reproduction Steps

import requests

proxies = {'http': 'http://user:pass"with"quote@example.org:3128/'}
requests.get('http://example.org', proxies=proxies)

System Information

{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.8"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.6.6"
  },
  "platform": {
    "release": "4.9.125-linuxkit",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.22.0"
  },
  "system_ssl": {
    "version": "1010006f"
  },
  "urllib3": {
    "version": "1.25.2"
  },
  "using_pyopenssl": false
}

@hwstovall Have you tried replacing any quote with %22?
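For anyone who wants to try that, here is a minimal sketch of percent-encoding the credentials before putting them into the proxy string. `build_proxy_url` is a hypothetical helper name, not part of requests or urllib3, and whether the encoded form works end to end depends on the urllib3 version and on the proxy itself.

```python
from urllib.parse import quote


def build_proxy_url(user, password, host, port, scheme="http"):
    """Hypothetical helper: percent-encode credentials before embedding them."""
    # safe="" makes quote() encode every reserved character,
    # including '"', '@', ':' and '/'.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"{scheme}://{userinfo}@{host}:{port}/"


proxy = build_proxy_url("user", 'pass"with"quote', "example.org", 3128)
print(proxy)  # http://user:pass%22with%22quote@example.org:3128/

# The encoded string can then be passed as usual:
proxies = {"http": proxy}
```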

A little research seems to indicate this may be a problem with one of the urllib methods. Perhaps somebody more familiar with the code base could confirm.

Also, a username containing @ (for example superuser80@gmail.com) raises InvalidURL.

Having the same issue with usernames containing @.

Confirmed, but on my side it doesn't happen in normal usage. A simple script like this:

import requests

s = requests.Session()
addr = "http://lum-customer-hl_*****f-zone-****-country-ru:******@zproxy.lum-superproxy.io:22225"
s.proxies = {
    "http": addr,
    "https": addr,
}
r = s.get("http://lumtest.com/myip.json")
r.raise_for_status()
print(r.json())

works like a charm. But if I move the same logic into the GAE standard environment, it triggers the above error: with or without the toolbelt monkeypatch, whether or not I'm using the httplib sockets, in production or locally.

My workaround was simply to change the password to one that doesn't contain a special character such as @.
I found that the error is raised by urllib3 in util/url.py, line 234:

    if has_authority and uri_ref.authority is None:
        raise LocationParseError(url)

so I guess it's an external issue, but why is this not happening in every environment?

The issue is rather in urllib3 1.25.3 that gets installed when installing requests 2.22.0 (if you have an old urllib3). I do not see this issue with requests 2.22.0 if I downgrade urllib3 to 1.24.3, which is installed by requests 2.21.0 too.

There is definitely a change in how we handle the userinfo section of a URL to strictly match with RFC 3986. We can probably be more relaxed with our parsing of that section.
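To make the strictness concrete, here is a rough sketch of the userinfo rule from RFC 3986 (`userinfo = *( unreserved / pct-encoded / sub-delims / ":" )`). This is only an illustration of the grammar, not urllib3's actual parser, and `is_strict_userinfo` is a made-up name.

```python
import re

# RFC 3986 character sets allowed in userinfo:
#   unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
#   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
#   plus ":" and percent-encoded octets (%XX)
_USERINFO_RE = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:]|%[0-9A-Fa-f]{2})*")


def is_strict_userinfo(userinfo):
    """Hypothetical check: does the string match the RFC 3986 userinfo rule?"""
    return _USERINFO_RE.fullmatch(userinfo) is not None


print(is_strict_userinfo("user:pass"))                    # True
print(is_strict_userinfo('user:pass"with"quote'))         # False, '"' is not allowed
print(is_strict_userinfo("user-with-@-inside:password"))  # False, '@' is not allowed
print(is_strict_userinfo("user:pass%22with%22quote"))     # True, quotes are percent-encoded
```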

@sethmlarson The issue is easily reproducible if you use requests.get("https://user-with-@-inside:password@somewebsite.com")

Thanks for the reproducer. I'm working on a fix over on urllib3.

Having the same problem. I am using basic auth, and quite often the username is an email address.

Quoting the proxy URL allows requests (urllib3) to send the request, but it appears the proxy service cannot handle this, and the request fails with a 407.

Looking forward to the fix.

Passing user credentials as part of the URL is not the proper way to authenticate. It makes it difficult to handle kinds of authentication other than Basic.

Authentication is supposed to be passed as an argument to requests when making connections, so that it is independent of the URL.

You can also refer to requests' authentication docs for more details.
I have also explained the same in my StackOverflow answer, for a library that depends on requests internally.
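As a rough illustration of why credentials passed separately avoid the URL parsing problem: with Basic auth the username and password end up base64-encoded in a header, so characters like @ or a quote never go through the URL parser. The sketch below builds that header value by hand using only the standard library; `basic_auth_header` is a hypothetical name. In requests itself you would normally pass `auth=(username, password)` and let the library do this, and for a proxy the equivalent header is `Proxy-Authorization`.

```python
import base64


def basic_auth_header(username, password):
    """Hypothetical helper: build a Basic credentials header value (RFC 7617)."""
    # The raw "user:password" pair is base64-encoded as-is, so no
    # percent-encoding of '@' or '"' is needed.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


headers = {"Authorization": basic_auth_header("superuser80@gmail.com", 'pass"with"quote')}
print(headers)
```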

@sethmlarson Does this still have anything to do with urllib3?

Yep, this has everything to do with urllib3. Our URL parser isn't percent-encoding characters within the userinfo section of the URL. I've merged a PR into master on urllib3 that should fix this issue, but I haven't prepared a release yet (new job, moved to a new apartment). I'm trying to get back into OSS.

Try percent-encoding (URL-encoding) the credentials, then make the request.

Could you give this a shot after upgrading urllib3? It should be fixed in recent versions.

python -m pip install --upgrade urllib3

This was resolved in urllib3 1.25.9.
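If you want to check whether an environment already has the fixed release, a small sketch like the following compares the installed version against 1.25.9. `version_tuple` and `has_userinfo_fix` are hypothetical helper names, and the parsing is deliberately naive (it only looks at the first three numeric components).

```python
import re

FIXED_IN = (1, 25, 9)  # release mentioned above as containing the fix


def version_tuple(version):
    """Hypothetical helper: turn '1.25.9' into (1, 25, 9)."""
    parts = [int(p) for p in re.findall(r"\d+", version)[:3]]
    # Pad short versions such as '2.0' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_userinfo_fix(version):
    return version_tuple(version) >= FIXED_IN


try:
    import urllib3
    print(urllib3.__version__, has_userinfo_fix(urllib3.__version__))
except ImportError:
    print("urllib3 is not installed")
```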