cloudflare/python-cloudflare

Token authentication fails when using python-cloudflare versions above 2.8.15

snk-nick opened this issue · 8 comments

From the following issue: certbot/certbot#9768

Basically, I've been trying to authenticate with Cloudflare using a token rather than a key. When I replace cloudflare 2.11.7 with 2.8.15 or below, it works flawlessly. Otherwise I get the following error on every single attempt to authenticate:

Error determining zone_id: 6003 Invalid request headers. Please confirm that you have supplied valid Cloudflare API credentials. (Did you copy your entire API token/key? To use Cloudflare tokens, you'll need the python package cloudflare>=2.3.1. This certbot is running cloudflare 2.11.7)

The certbot developers have not been able to reproduce this issue, yet I'm getting 100% reproduction across multiple environments, installs, machines, IP addresses, etc.

Any idea what might be causing this or suggestions on how to extract more information? I haven't been able to get a more detailed error message.

mahtin commented

I tested 2.11.7 and it works as expected.

$ cli4 -V
Cloudflare library version: 2.11.7
$

$ cat ~/.cloudflare/cloudflare.cfg
[CloudFlare]
token = 4███████████████████████████████████████
$

$ cli4 /zones/:██████████████.com/ | jq -r '.id,.name,.created_on'
0███████████████████████████████
██████████████.com
2014-09-29T07:42:42.616389Z
$

So I did this to check 2.8.15 and:

$ pip uninstall cloudflare
Found existing installation: cloudflare 2.11.7
Uninstalling cloudflare-2.11.7:
  Would remove:
    /opt/homebrew/bin/cli4
    /opt/homebrew/lib/python3.11/site-packages/CloudFlare/*
    /opt/homebrew/lib/python3.11/site-packages/cli4/*
    /opt/homebrew/lib/python3.11/site-packages/cloudflare-2.11.7.dist-info/*
    /opt/homebrew/lib/python3.11/site-packages/examples/*
    /opt/homebrew/share/man/man1/cli4.1
Proceed (Y/n)? Y
  Successfully uninstalled cloudflare-2.11.7
$
$ pip install cloudflare==2.8.15
Collecting cloudflare==2.8.15
...
Successfully installed cloudflare-2.8.15
$

Then repeated the same tests:

$ cli4 -V
Cloudflare library version: 2.8.15
$

$ cli4 /zones/:██████████████.com/ | jq -r '.id,.name,.created_on'
0███████████████████████████████
██████████████.com
2014-09-29T07:42:42.616389Z
$

So I don't know what's up - but I would check your cloudflare.cfg (remember - you don't need an email address with a token).

I also cleaned up - so I didn't hiccup on something else later! :)

$ pip install --upgrade cloudflare
Collecting cloudflare
...
    Found existing installation: cloudflare 2.8.15
    Uninstalling cloudflare-2.8.15:
      Successfully uninstalled cloudflare-2.8.15
Successfully installed cloudflare-2.11.7
$

mahtin commented

I installed certbot under macOS using brew install certbot and then installed the dns-cloudflare plugin via pip install certbot-dns-cloudflare. I set up the appropriate ini files and created a certificate. Looking through letsencrypt.log, it seems like it's not using the Python Cloudflare library. It's doing raw https calls.

$ egrep client/v4 letsencrypt.log 
2023-09-18 21:00:40,655:DEBUG:urllib3.connectionpool:https://api.cloudflare.com:443 "GET /client/v4/zones?name=████████████.com&per_page=1 HTTP/1.1" 200 None
2023-09-18 21:00:40,844:DEBUG:urllib3.connectionpool:https://api.cloudflare.com:443 "POST /client/v4/zones/████████████████████████████████/dns_records HTTP/1.1" 200 None
2023-09-18 21:00:40,985:DEBUG:urllib3.connectionpool:https://api.cloudflare.com:443 "GET /client/v4/zones/████████████████████████████████/dns_records?type=TXT&name=_acme-challenge.████████████.com&content=███████████████████████████████████████████&per_page=1 HTTP/1.1" 200 None
2023-09-18 21:00:52,458:DEBUG:urllib3.connectionpool:https://api.cloudflare.com:443 "GET /client/v4/zones?name=████████████.com&per_page=1 HTTP/1.1" 200 None
2023-09-18 21:00:52,597:DEBUG:urllib3.connectionpool:https://api.cloudflare.com:443 "GET /client/v4/zones/████████████████████████████████/dns_records?type=TXT&name=_acme-challenge.████████████.com&content=███████████████████████████████████████████&per_page=1 HTTP/1.1" 200 None
2023-09-18 21:00:52,748:DEBUG:urllib3.connectionpool:https://api.cloudflare.com:443 "DELETE /client/v4/zones/████████████████████████████████/dns_records/████████████████████████████████ HTTP/1.1" 200 None
$

This is confusing because reading over the code at https://github.com/certbot/certbot and the certbot-dns-cloudflare folder, I don't see anything that does raw https calls. That code only uses the Python Cloudflare libraries.

Can you give me a pointer to what's going on?

mahtin commented

Oh, I see certbot/certbot#9768 which is also yours. Now I'm doubly confused by what I see above.

snk-nick commented

Yeah, I'm very confused. Are you able to run the tests I've been using here?

https://github.com/snk-nick/certbot-cf-dns

The certbot dev can't reproduce my issue but every platform I try on using pip or docker fails the same way and has done for everyone else I've asked to test. Rolling back the cloudflare package fixes it immediately.

It's.. strange? I'm about to run the same tests you put up above to see if I get anything different.

Ran the same tests as you within the certbot container and got the same results as I've been getting prior.

$ cli4 -V
Cloudflare library version: 2.11.7
$ cli4 /zones/:domain.com.au | jq -r '.id,.name,.created_on'
cli4: /zones/:█████████.com.au - (6003, '█████████.com.au - 6003 Invalid request headers')
$ pip install cloudflare==2.8.15
<snip>
$ cli4 -V
Cloudflare library version: 2.8.15
$ cli4 /zones/:█████████.com.au | jq -r '.id,.name,.created_on'
a███████████████████████████
█████████.com.au
2022-06-30T00:52:09.613979Z

But if I run your tests in a venv installing with pip, I see some very strange behaviour.

Install cloudflare via pip, works no problem (config file created obviously).

# bin/cli4 -V
Cloudflare library version: 2.11.7
# bin/cli4 /zones/:█████████.com.au | jq -r '.id,.name,.created_on'
███████████████████████████
█████████.com.au
2022-06-30T00:52:09.613979Z

Install certbot and certbot-dns-cloudflare and still no problem.

# bin/pip install certbot certbot-dns-cloudflare
<snip>
# bin/cli4 -V
Cloudflare library version: 2.11.7
# bin/cli4 /zones/:█████████.com.au | jq -r '.id,.name,.created_on'
███████████████████████████
█████████.com.au
2022-06-30T00:52:09.613979Z

Request a certificate, fails.

# export CLOUDFLARE_TOKEN="███████████████████████████████████████"
# export CLOUDFLARE_EMAIL="███████.com.au"
# export CLOUDFLARE_DOMAIN_LIST="█████████.com.au"
# echo "dns_cloudflare_api_token = $CLOUDFLARE_TOKEN" > /cloudflare.ini
# chmod 600 /cloudflare.ini
# cat /cloudflare.ini
dns_cloudflare_api_token = ███████████████████████████████████████
# bin/certbot certonly \
    --dns-cloudflare \
    --dns-cloudflare-credentials /cloudflare.ini \
    --dns-cloudflare-propagation-seconds 30 \
    --agree-tos \
    --no-eff-email \
    --staging \
    -n \
    -m $CLOUDFLARE_EMAIL \
    -d $CLOUDFLARE_DOMAIN_LIST
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Account registered.
Requesting a certificate for █████████.com.au
Error determining zone_id: 6003 Invalid request headers. Please confirm that you have supplied valid Cloudflare API credentials. (Did you copy your entire API token/key? To use Cloudflare tokens, you'll need the python package cloudflare>=2.3.1. This certbot is running cloudflare 2.11.7)
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

Retest using cli4... now it also fails.

# bin/cli4 /zones/:█████████.com.au | jq -r '.id,.name,.created_on'
cli4: /zones/:█████████.com.au - (6003, '█████████.com.au - 6003 Invalid request headers')

Downgrade to 2.8.15 and it works again

# bin/pip install cloudflare==2.8.15
<snip>
# bin/cli4 -V
Cloudflare library version: 2.8.15
# bin/cli4 /zones/:█████████.com.au | jq -r '.id,.name,.created_on'
███████████████████████████
█████████.com.au
2022-06-30T00:52:09.613979Z

And so does certbot

# bin/certbot certonly \
    --dns-cloudflare \
    --dns-cloudflare-credentials /cloudflare.ini \
    --dns-cloudflare-propagation-seconds 30 \
    --agree-tos \
    --no-eff-email \
    --staging \
    -n \
    -m $CLOUDFLARE_EMAIL \
    -d $CLOUDFLARE_DOMAIN_LIST
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Requesting a certificate for █████████.com.au
Waiting 30 seconds for DNS changes to propagate
Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/█████████.com.au/fullchain.pem
Key is saved at:         /etc/letsencrypt/live/█████████.com.au/privkey.pem
This certificate expires on 2023-12-18.
These files will be updated when the certificate renews.

So uh.. yeah. I feel like the problem is indeed certbot, but what exactly is going on I don't know.

mahtin commented

I've played with this now a few times. 2.8.15 and the current version. I can't get it to fail. Honest!

I did, however, now understand that I was wrong in my comment above about "raw https calls". Certbot enables debug logging at the urllib3 level (which sits below the Cloudflare Python library), and hence the log shows debug messages from that level (vs. from the Cloudflare level).
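A minimal sketch of that mechanism, using only the standard logging module. The helper name enable_urllib3_debug is made up for illustration, the format string is just chosen to resemble the log lines above, and this is not a claim about how certbot itself configures its logging.

```python
import logging
import sys


def enable_urllib3_debug(stream=None):
    """Attach a DEBUG handler to the 'urllib3' logger (illustrative helper)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    # Assumed format, picked to look like the letsencrypt.log lines above.
    handler.setFormatter(
        logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")
    )
    logger = logging.getLogger("urllib3")
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return logger
```

Any library that talks HTTP through urllib3 then shows up as urllib3.connectionpool lines, because child loggers pass their records up to the "urllib3" logger. That is why the log looks like raw calls even though a higher-level library is making them.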

In your run you see the "Error determining zone_id: 6003 Invalid request headers..." error. This comes from line 204 in dns_cloudflare.py in the certbot-dns-cloudflare package. It can only happen when you get an error from the zones = self.cf.zones.get(...) call. Assuming you are using a token to connect to the Cloudflare API (which you must be by this point in the code), I believe you don't have a token that's got read access to the /zones API calls. You should check this on the Cloudflare API dashboard. Maybe create a new token?

OK - I tested that by using a token I have that's got no permissions on the zone I was testing with. It produces this error: Unable to determine zone_id for levy.red using zone names: ['levy.red', 'red']. Please confirm that the domain name has been entered correctly and is already associated with the supplied Cloudflare account. Seems legit. So I still can't recreate your issue.

OH - OH - in your script above, please remove the exports. You should not need them as the Cloudflare token is passed via cloudflare.ini only. The exports just confuse the Cloudflare libraries. Ah - so yeah - this is it! ARGGGGGGG!!!!!!

Let me explain: In 2.9.3 the Python libraries synced up with the Cloudflare Go libraries in using CLOUDFLARE_ environment variables for email/key/token. Back in 2.8.15 days the only environment variables that the library used were CF_API_ forms. See https://github.com/cloudflare/python-cloudflare#using-shell-environment-variables for this. That means your quite innocent use of export actually confuses the lower level libraries. To get cli4 to work you should only export CLOUDFLARE_TOKEN. Not the email variable.
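To see what a Python process has actually inherited, a quick check along these lines may help. It is only a sketch: the function name cloudflare_env_names is hypothetical, and the two prefixes are simply the ones mentioned above. It reports names only, never values, so no secrets are printed.

```python
import os

# Prefixes taken from the discussion above; nothing else is assumed.
PREFIXES = ("CLOUDFLARE_", "CF_API_")


def cloudflare_env_names(environ=None):
    """Return the sorted names of Cloudflare-looking environment variables."""
    environ = os.environ if environ is None else environ
    return sorted(name for name in environ if name.startswith(PREFIXES))


if __name__ == "__main__":
    for name in cloudflare_env_names():
        print(name)
```

Only exported shell variables are inherited by child processes, so a plain shell assignment without export will not appear in this list.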

Yeah - try this ...

# export CLOUDFLARE_TOKEN="███████████████████████████████████████"
# CLOUDFLARE_EMAIL="███████.com.au"
# CLOUDFLARE_DOMAIN_LIST="█████████.com.au"

With CLOUDFLARE_EMAIL exported, the code looks for CLOUDFLARE_API_KEY and because it's not there, it fails.

Looking at this further: the logic in certbot's dns_cloudflare.py is in fact different from that of the underlying Cloudflare libraries. I'm not saying which one is right; but actually, the more I think about it, maybe the Cloudflare Python library could handle this better. (Note that this response is clearly a linear dump of my thinking while debugging this - sorry for the lack of brevity).

So ... deep inside the Cloudflare Python library is code from PR #134 that was always going to give us trouble. The extract is here:

            if api_email is None and api_token is not None:
                # post issue-114 - token is used
                headers['Authorization'] = 'Bearer %s' % (api_token)
            elif api_email is None and api_key is not None:
                # pre issue-114 - key is used vs token - backward compat
                headers['Authorization'] = 'Bearer %s' % (api_key)
            elif api_email is not None and api_key is not None:
                # boring old school email/key methodology (token ignored)
                headers['X-Auth-Email'] = api_email
                headers['X-Auth-Key'] = api_key
            elif api_email is not None and api_token is not None:
                # boring old school email/key methodology (token ignored)
                headers['X-Auth-Email'] = api_email
                headers['X-Auth-Key'] = api_token
            else:
                raise CloudFlareInternalError(0, 'coding issue!')

The issue is that if you specify an email in the config or exported environment variables, then it will revert to pre issue-114 mindset and pretend the token is in fact an auth key - which in your case it isn't.
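To make that concrete, here is a standalone sketch that mirrors the branch order of the extract. The function name build_auth_headers, the ValueError, and the example credentials are made up for illustration; the real code lives inside the library and raises its own exception type.

```python
def build_auth_headers(api_email=None, api_key=None, api_token=None):
    """Mirror the branch order of the extract above (illustration only)."""
    headers = {}
    if api_email is None and api_token is not None:
        # Token on its own: sent as a Bearer token.
        headers["Authorization"] = "Bearer %s" % api_token
    elif api_email is None and api_key is not None:
        # Key without email: also sent as a Bearer token (backward compat).
        headers["Authorization"] = "Bearer %s" % api_key
    elif api_email is not None and api_key is not None:
        # Classic email/key pair.
        headers["X-Auth-Email"] = api_email
        headers["X-Auth-Key"] = api_key
    elif api_email is not None and api_token is not None:
        # Email plus token: the token is sent as if it were a key.
        headers["X-Auth-Email"] = api_email
        headers["X-Auth-Key"] = api_token
    else:
        raise ValueError("no usable credentials")
    return headers


token_only = build_auth_headers(api_token="example-token")
email_and_token = build_auth_headers(
    api_email="user@example.com", api_token="example-token"
)
print(token_only)
print(email_and_token)
```

With only a token, the request carries an Authorization header. Add an email and the same token ends up in X-Auth-Key with no Authorization header at all, which matches the failure described in this thread.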

You are in fact the person who triggered this code in a bad way and sadly you have paid the price for that. I'm very sorry. It was all coded in the name of backward-compatibility. Legacy sucks!

So after saying all that, I'm back to the simple fix. Don't export the CLOUDFLARE_EMAIL value.

Please test and let me know.

snk-nick commented

Oh wow, that is crazy! I left CLOUDFLARE_EMAIL in because it was used twice: once to write the cloudflare.ini, and again in the certbot command to set the email for certificate notifications (the script predates tokens and previously used an API key/email combo). So when I changed to a token, I never removed the variable as it was still in use.

And of course, in all of my tests where I hardcoded values, specifically to make sure it wasn't the way the environment variables were being passed through, those variables were still actually sitting there.

Quick test and you're right. Script still needs an email for notifications but a simple switch from CLOUDFLARE_EMAIL to LE_NOTIFY_EMAIL and it's fixed.

Greatly appreciate the effort and deep dive! Perhaps in the future a check could be added that throws a warning if CLOUDFLARE_TOKEN and CLOUDFLARE_EMAIL are both present? Either way that's solved the issue for me, cheers!
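Something like the following is roughly what I have in mind. It is only a sketch, not anything the library does today: the function name warn_on_mixed_credentials is hypothetical, and it checks just the two variable names from my script.

```python
import os
import warnings


def warn_on_mixed_credentials(environ=None):
    """Warn when both a token and an email are set (hypothetical check)."""
    environ = os.environ if environ is None else environ
    if environ.get("CLOUDFLARE_TOKEN") and environ.get("CLOUDFLARE_EMAIL"):
        warnings.warn(
            "CLOUDFLARE_TOKEN and CLOUDFLARE_EMAIL are both set; "
            "the token may be sent as an API key instead of a Bearer token",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

Returning a boolean as well as warning would let a caller decide whether to abort or carry on.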

mahtin commented

Glad it now works. The email is in fact just a Let's Encrypt thingy. Cloudflare just wants a token. So LE_NOTIFY_EMAIL is correct.

The reason you've hit this "bug" is because of a legacy support issue. But maybe it's time to delete that line of code.

Anyway. Glad I could sleuth it. But also, now I know why this all worked for me. Duh!