GRAAL-Research/deepparse

[BUG] `SSLError` when downloading model weights of model type: `bpemb`

ajndkr opened this issue · 6 comments

Describe the bug

When trying to use the deepparse.parser.AddressParser class with model_type="bpemb", the model weights download fails due to an SSLError:

requests.exceptions.SSLError: HTTPSConnectionPool(host='bpemb.h-its.org', port=443): Max retries exceeded with url: /multi/multi.wiki.bpe.vs100000.model (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)')))

To Reproduce

Delete model weights cache, most likely ~/.cache/deepparse, and attempt to initialise the class:

from deepparse.parser import AddressParser
address_parser = AddressParser(model_type="bpemb", attention_mechanism=False)

Expected behavior

The model download should not fail.

Desktop:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.3 LTS
Release:        20.04
Codename:       focal

I am using deepparse==0.9.1.

Additional context

For the moment, I have implemented a dirty fix using a no_ssl_verification (from https://gist.github.com/ChenTanyi/0c47652bd916b61dc196968bca7dad1d) where I initialise the class under this context.

Thank you for you interest in improving Deepparse.

@AjinkyaIndulkar It's BPEmb to fix it since they do the request call. I've opened an issue. Their SSL certificate must have expired.

In the meantime, can you open a PR with your hot-fix so we can release a hot-fix to make it until the bug is fixed?

yeah sure.

@davebulaval I've opened a PR with the hotfix: #157

can you please review?

Done. I've changed the place to wrap around BPEmb instead of deepparse address parser code. Will publish it as soon as tests pass.

Release in 0.9.2. As per BPEmb's mention, they have also pushed a temporary hot-fix that removes SSL certificate verification and will investigate why the certificate is not working since it has been renewed recently.

The patch will be used to handle the potential certificate error.