johnnychen94/jill.py

bypass ssl check

kerim371 opened this issue · 12 comments

Hi,

I've found some strange behaviour and am writing here in the hope that you can help me.

My app builds python 3.6.7 from source in Release mode (no matter what my app config is, python is always built in release).
Then I run ./python -m pip install jill
Then:
./python -m jill install 1.6.1 --confirm --install_dir ~/Documents/d for the debug app
./python -m jill install 1.6.1 --confirm --install_dir ~/Documents/r for the release app

This command only works for the release app configuration (even though Python is built in release mode in both cases, so I believe the two Pythons are identical). In the debug app I see this error:

./python jill install 1.6.1 --confirm --install_dir ~/Documents
JILL - Julia Installer 4 Linux (MacOS, Windows and FreeBSD) -- Light

----- Download Julia -----
downloading Julia release for 1.6.1-linux-x86_64
downloading from https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.1-linux-x86_64.tar.gz
failed to download julia-1.6.1-linux-x86_64.tar.gz
False

I've tried different Julia versions, and even tried sudo ./python ..., but it fails anyway...

I understand that the information I've provided is insufficient, but I'm confused... Do you have any idea why this might happen?

Ubuntu 20.04

This really looks like a network issue to me:

jill.py/jill/download.py

Lines 30 to 42 in 9b4fad6

        try:
            msg = f"downloading from {url}"
            logging.info(msg)
            print(msg)
            wget.download(url, temp_outpath)
            print()  # for format usage
            msg = f"finished downloading {outname}"
            print(f"{color.GREEN}{msg}{color.END}")
        except (URLError, ConnectionError):
            msg = f"failed to download {outname}"
            logging.info(msg)
            print(f"{color.RED}{msg}{color.END}")
            return False

As you can see, the core operation is delegated to wget, so perhaps you can try and see if it's reproducible with wget.download (maybe with another URL)?

@johnnychen94 thank you for information.

I modified this snippet to also print url and temp_outpath in the jill installed for each of the two Pythons:

        try:
            msg = f"downloading from {url}"
            logging.info(msg)
            print(msg)
            print(url)
            print(temp_outpath)
            wget.download(url, temp_outpath)
            print()  # for format usage
            msg = f"finished downloading {outname}"
            print(f"{color.GREEN}{msg}{color.END}")
        except (URLError, ConnectionError):
            msg = f"failed to download {outname}"
            logging.info(msg)
            print(f"{color.RED}{msg}{color.END}")

and they produce almost the same output:
Working one:

JILL - Julia Installer 4 Linux (MacOS, Windows and FreeBSD) -- Light

----- Download Julia -----
downloading Julia release for 1.6.1-linux-x86_64
downloading from https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.1-linux-x86_64.tar.gz
https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.1-linux-x86_64.tar.gz
/tmp/tmp2nx52p7n/julia-1.6.1-linux-x86_64.tar.gz
100% [......................................................................] 112784227 / 112784227
finished downloading julia-1.6.1-linux-x86_64.tar.gz

and this one produces error:

JILL - Julia Installer 4 Linux (MacOS, Windows and FreeBSD) -- Light

----- Download Julia -----
downloading Julia release for 1.6.1-linux-x86_64
downloading from https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.1-linux-x86_64.tar.gz
https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.1-linux-x86_64.tar.gz
/tmp/tmp61gejwld/julia-1.6.1-linux-x86_64.tar.gz
failed to download julia-1.6.1-linux-x86_64.tar.gz

Have no idea what is happening

I mean, you can just try

import wget
url = "https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.1-linux-x86_64.tar.gz"
wget.download(url)

in both python versions, and also try other valid URLs. It's probably a wget.download issue.
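
Because the except branch in jill's download.py quoted above only prints "failed to download", the underlying cause stays hidden. A minimal sketch, assuming plain urllib from the standard library instead of wget, that separates certificate failures from other network failures; classify_url_error and probe are hypothetical helper names, not part of jill or wget:

```python
import ssl
import urllib.error
import urllib.request


def classify_url_error(err):
    """Describe a URLError: TLS/certificate failure or other network failure."""
    reason = err.reason
    # urllib wraps the low-level exception in URLError; ssl.SSLError covers
    # CERTIFICATE_VERIFY_FAILED and other TLS handshake problems.
    if isinstance(reason, ssl.SSLError):
        return f"SSL problem: {reason}"
    return f"network problem: {reason}"


def probe(url, timeout=30):
    """Try to open url and report what went wrong, if anything."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return "ok"
    except urllib.error.URLError as err:
        return classify_url_error(err)


# Example (needs network access, so it is left commented out):
# print(probe("https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.1-linux-x86_64.tar.gz"))
```

Running probe in both Python builds should show whether the failing one reports an SSL problem or an ordinary connection problem.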

@johnnychen94 thank you, I just tried that. The working Python works fine and downloads Julia. The broken Python gives me an error:

Python 3.6.7 (default, May  9 2021, 02:45:56) 
[GCC 9.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import wget
>>> url = "https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.1-linux-x86_64.tar.gz"
>>> wget.download(url)
Traceback (most recent call last):
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/http/client.py", line 1400, in connect
    server_hostname=server_hostname)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/ssl.py", line 407, in wrap_socket
    _context=self, _session=session)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/ssl.py", line 817, in __init__
    self.do_handshake()
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/ssl.py", line 1077, in do_handshake
    self._sslobj.do_handshake()
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/ssl.py", line 689, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:847)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/site-packages/wget.py", line 526, in download
    (tmpfile, headers) = ulib.urlretrieve(binurl, tmpfile, callback)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/urllib/request.py", line 248, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/home/kerim/Documents/Colada/d/python-install/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:847)>

Thank you very much, I will try to investigate this tomorrow.

also try other valid URLs

Where can I get another URL to try with wget?

Looks like it's an SSL issue. I'm not very familiar with it; perhaps you can try ./python -m pip install --upgrade certifi (I really don't know).


I'm not sure of it, maybe it's related to https://www.python.org/downloads/release/python-360/:

If you are building Python from source, beware that the OpenSSL 1.1.0c release, the most recent as of this update, is known to cause Python 3.6 test suite failures and its use should be avoided without additional patches. It is expected that the next release of the OpenSSL 1.1.0 series will fix these problems. See http://bugs.python.org/issue28689 for more information.

Alright.. I'm closing this as it's not something that we can fix on the jill side. It's more like a python + SSL issue to me.

@johnnychen94 hi,

I figured out what the problem was.
In brief, in the custom Python build the paths to the SSL certificates were wrong (I checked that with the commands import ssl; ssl.get_default_verify_paths()). That is why I was getting the error [SSL: CERTIFICATE_VERIFY_FAILED].
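
A minimal sketch of that check, using only the standard library. The helper name report_verify_paths is hypothetical; ssl.get_default_verify_paths() itself is the real call mentioned above. It reports the resolved CA file and directory together with the defaults compiled into OpenSSL, and whether each location exists on disk:

```python
import os
import ssl


def report_verify_paths():
    """Map each CA location name to (path, exists_on_disk)."""
    paths = ssl.get_default_verify_paths()
    report = {}
    # cafile/capath are the resolved locations (None when they do not exist);
    # openssl_cafile/openssl_capath are the defaults compiled into OpenSSL.
    for name in ("cafile", "capath", "openssl_cafile", "openssl_capath"):
        location = getattr(paths, name)
        report[name] = (location, bool(location) and os.path.exists(location))
    return report


for name, (location, exists) in report_verify_paths().items():
    print(f"{name}: {location} (exists: {exists})")
```

If cafile and capath both come back as None, the interpreter has no CA bundle to verify against, which would match the CERTIFICATE_VERIFY_FAILED error. The returned tuple also names the environment variables (openssl_cafile_env and openssl_capath_env, usually SSL_CERT_FILE and SSL_CERT_DIR) that can point OpenSSL at a valid bundle without rebuilding Python.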

In my particular case I use an application which has many dependencies and is built in CMake superbuild mode, and compiling Python is one of the build steps. So it is very inconvenient to change the SSL certificate path manually during the build (there are some very specific reasons).

To solve my problem I decided to fork jill.py and, following a Stack Overflow solution, I added:

import ssl

ssl._create_default_https_context = ssl._create_unverified_context

Then I added a --bypass_ssl flag which does that.

I understand that hardly anyone else will face the same problem as I did, but I could make a PR with this modification. What do you think, should I make a pull request?

You can find my fork here

According to the Stack Overflow solution, this monkey patch is highly discouraged because it affects all SSL calls, whereas we want to limit it to jill itself. There might be other Python packages using jill to install Julia, and in that case we don't want to affect them. So I'm afraid that I can't accept the patch as currently proposed.

There are two ways I have in mind that might work:

The first way is to pass the keyword to the actual downloader: jill internally uses requests.get and wget.download to fetch data. For requests.get there's a way to skip the SSL check (or modify the path): https://2.python-requests.org/en/master/user/advanced/#ssl-cert-verification. wget.download doesn't support this, so if you want to add this feature, you might also need to use raw requests.get to download the content.
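
A rough sketch of the per-call idea, using urllib from the standard library instead of requests so that it has no third-party dependency. The function name download and its verify and cafile parameters are hypothetical and are not jill's API. The point is that the SSL context is passed to a single call, so nothing global is changed:

```python
import shutil
import ssl
import urllib.request


def download(url, dest, verify=True, cafile=None):
    """Download url to dest, with certificate checks scoped to this call only."""
    # cafile=None means "use the interpreter's default CA locations".
    context = ssl.create_default_context(cafile=cafile)
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(url, context=context) as response:
        with open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    return dest
```

With requests, the corresponding knob is the verify argument of requests.get, which accepts either False or a path to a CA bundle. Passing a bundle path keeps verification on and is the safer option when only the default paths are wrong.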

The second way is to still use the monkey patch, but scope it with a context manager that restores the original setting on exit, for example:

- download_package(...)
+ with ssl_unverified_context():
+     download_package()

jill also uses a context manager (https://github.com/johnnychen94/jill.py/blob/master/jill/utils/mount_utils.py) to provide a clean installation, so if you're not familiar with Python context managers, you can take a look at it.

The second way might be simpler but it might also have undefined behaviors. I'm not very familiar with SSL so I really don't know.
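
A minimal sketch of what such a context manager could look like. The name ssl_unverified_context comes from the diff above, but this body is only an assumption about how it might be written, not the code that was eventually merged:

```python
import contextlib
import ssl


@contextlib.contextmanager
def ssl_unverified_context():
    """Temporarily make urllib's default HTTPS context skip verification."""
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Restore the original factory even if the download raises.
        ssl._create_default_https_context = original


# Usage, with download_package standing in for the real download call:
# with ssl_unverified_context():
#     download_package()
```

The patch is still process-wide while the block is active, so HTTPS requests made from other threads during that window would also skip verification. That may be the kind of surprising behaviour to watch for.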

In brief, in the custom Python build the paths to the SSL certificates were wrong

I'm a little bit skeptical about this. Is there a way to make it correct in the python building? For example, by telling the build script a correct ssl path.

Is there a way to make it correct in the python building? For example, by telling the build script a correct ssl path.

Probably there is, but in my case this is not a solution.
The whole Python build is controlled by a third-party community. They develop their own open-source project, and I use their app as part of mine. They are good specialists and they set the correct SSL path at some build step, but not right after (or while) Python is built, and I need it right after Python is built.
On the other hand, I'm trying not to modify their code, as that would add overhead whenever I update the version of the base app.

So for me the simplest solution is to disable the SSL check when downloading Julia (I'm not good at SSL stuff, but I guess this is harmless as I trust the Julia website).

I'm sorry, I had not seen this reply of yours (so my previous post may be incomplete)...
I will be busy for a few days, but I will think about it.

closed by 031ee35

jill install --bypass-ssl