Binary data in facts.values['ec2_userdata'] breaks custom_facts.rb used with "apply" function in Bolt
mattock opened this issue · 6 comments
Describe the Bug
EC2 userdata as seen by facter can be in binary format. Facter itself seems to be ok with it, but Puppet Bolt - in particular the to_json call in custom_facts.rb - chokes on it with a very unhelpful error message. This makes it impossible to apply Puppet code on such nodes.
Here's a partial sample ec2_userdata fact. The userdata was created by terraform-aws_instance_wrapper:
$ /opt/puppetlabs/bin/facter ec2_userdata
Zo6xS
1eGPFhϨn4mВH+h-ȹ@-PNqL_xq|?ihп*ZK@p/zx;wG|!h|)BtO7`"6\<m?{GHL%bځWcnK^;c2J1'{<
HBpq9<{|Y .DZH'
--- snip ---
I can fully understand why a JSON generator would fail on this garbage, but I'm sure that such userdata can be found elsewhere in the wild.
Expected Behavior
Having binary data in the ec2_userdata fact should not break custom_facts.rb.
Steps to Reproduce
Create a simple manifest such as this:
# notify.pp
notify { 'Testing': }
Try to apply this on an EC2 instance whose ec2_userdata fact includes binary data:
bolt apply --clear-cache --log-level trace -v notify.pp --run-as root -t mynode.example.org
This will fail:
Starting: task apply_helpers::custom_facts on mynode.example.org
Running task apply_helpers::custom_facts with '{"plugins":"Sensitive [value redacted]","_task":"apply_helpers::custom_facts"}' on ["mynode.example.org"]
Running task 'apply_helpers::custom_facts' on mynode.example.org
Initializing ssh connection to mynode.example.org
Opened session
Running '/opt/puppetlabs/bolt/lib/ruby/gems/2.7.0/gems/bolt-3.22.1/libexec/custom_facts.rb' with {"plugins":"Sensitive [value redacted]","_task":"apply_helpers::custom_facts"}
Executing `mkdir -m 700 /tmp/48db0a81-6e90-40d9-a747-6e85577306fd`
Command `mkdir -m 700 /tmp/48db0a81-6e90-40d9-a747-6e85577306fd` returned successfully
Uploading /opt/puppetlabs/bolt/lib/ruby/gems/2.7.0/gems/bolt-3.22.1/libexec/custom_facts.rb to /tmp/48db0a81-6e90-40d9-a747-6e85577306fd/custom_facts.rb
Executing `chmod u+x /tmp/48db0a81-6e90-40d9-a747-6e85577306fd/custom_facts.rb`
Command `chmod u+x /tmp/48db0a81-6e90-40d9-a747-6e85577306fd/custom_facts.rb` returned successfully
Executing `id -g root`
Command `id -g root` returned successfully
Executing `sudo -S -H -u root -p \[sudo\]\ Bolt\ needs\ to\ run\ as\ another\ user,\ password:\ sh -c cd\;\ chown\ -R\ root:0\ /tmp/48db0a81-6e90-40d9-a747-6e85577306fd`
Command `sudo -S -H -u root -p \[sudo\]\ Bolt\ needs\ to\ run\ as\ another\ user,\ password:\ sh -c cd\;\ chown\ -R\ root:0\ /tmp/48db0a81-6e90-40d9-a747-6e85577306fd` returned successfully
Executing `sudo -S -H -u root -p \[sudo\]\ Bolt\ needs\ to\ run\ as\ another\ user,\ password:\ sh -c echo\ f26e32fd-fd2c-4b3d-8641-9af718b6a160\ 1\>\&2\;\ cd\;\ /tmp/48db0a81-6e90-40d9-a747-6e85577306fd/custom_facts.rb`
Command sudo -S -H -u root -p \[sudo\]\ Bolt\ needs\ to\ run\ as\ another\ user,\ password:\ sh -c echo\ f26e32fd-fd2c-4b3d-8641-9af718b6a160\ 1\>\&2\;\ cd\;\ /tmp/48db0a81-6e90-40d9-a747-6e85577306fd/custom_facts.rb failed with exit code 1
Executing `sudo -S -H -u root -p \[sudo\]\ Bolt\ needs\ to\ run\ as\ another\ user,\ password:\ sh -c cd\;\ rm\ -rf\ /tmp/48db0a81-6e90-40d9-a747-6e85577306fd`
Command `sudo -S -H -u root -p \[sudo\]\ Bolt\ needs\ to\ run\ as\ another\ user,\ password:\ sh -c cd\;\ rm\ -rf\ /tmp/48db0a81-6e90-40d9-a747-6e85577306fd` returned successfully
Closed session
{"target":"mynode.example.org","action":"task","object":"apply_helpers::custom_facts","status":"failure","value":{"_output":"","_error":{"kind":"puppetlabs.tasks/task-error","issue_code":"TASK_ERROR","msg":"The task failed with exit code 1 and no stdout, but stderr contained:\n/tmp/48db0a81-6e90-40d9-a747-6e85577306fd/custom_facts.rb:62:in `to_json': source sequence is illegal/malformed utf-8 (JSON::GeneratorError)\n\tfrom /tmp/48db0a81-6e90-40d9-a747-6e85577306fd/custom_facts.rb:62:in `block in <main>'\n\tfrom /opt/puppetlabs/puppet/lib/ruby/2.7.0/tmpdir.rb:89:in `mktmpdir'\n\tfrom /tmp/48db0a81-6e90-40d9-a747-6e85577306fd/custom_facts.rb:11:in `<main>'\n","details":{"exit_code":1}}}}
The failure happens when the facts are converted to JSON in custom_facts.rb:
facts = Puppet::Node::Facts.indirection.find(SecureRandom.uuid, environment: env)
facts.name = facts.values['clientcert']
puts(facts.values.to_json) # The failure occurs on this line
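The failure can probably be reproduced without Bolt or an EC2 instance. The following is only a sketch under assumptions: the hash is a stand-in for facts.values, the byte string is made up (a gzip header followed by filler bytes), and try_to_json is a hypothetical helper that exists only to show the error instead of aborting.

```ruby
require 'json'

# Stand-in for facts.values: one ordinary fact and one holding raw bytes.
# The byte sequence is invented for illustration; it is not valid UTF-8.
facts = {
  'clientcert' => 'mynode.example.org',
  'ec2_userdata' => "\x1F\x8B\x08\x00\xFF\xFE".b
}

# Hypothetical helper: return the JSON text, or a description of the
# generator error if the hash contains a string the generator rejects.
def try_to_json(hash)
  hash.to_json
rescue JSON::GeneratorError => e
  "JSON::GeneratorError: #{e.message}"
end

puts(try_to_json(facts))
```

The exact wording of the error message may differ between versions of the json gem.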
The ec2_userdata fact is the culprit, because making it an empty string works around the problem:
facts = Puppet::Node::Facts.indirection.find(SecureRandom.uuid, environment: env)
facts.name = facts.values['clientcert']
facts.values['ec2_userdata'] = "" # Work around the problem by emptying ec2_userdata fact
puts(facts.values.to_json)
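A less destructive variant of the workaround might be to keep the fact but make it JSON-safe. This is only a sketch of one possible approach, not what Bolt does: json_safe is a hypothetical helper, the "base64:" prefix is an arbitrary convention, and sample_facts stands in for facts.values.

```ruby
require 'json'

# Hypothetical helper (not part of Bolt): recursively copy a facts structure,
# replacing any string that is not valid UTF-8 with a Base64-encoded copy.
def json_safe(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k] = json_safe(v) }
  when Array
    value.map { |v| json_safe(v) }
  when String
    utf8 = value.dup.force_encoding(Encoding::UTF_8)
    # pack('m0') produces Base64 without line breaks.
    utf8.valid_encoding? ? utf8 : "base64:#{[value].pack('m0')}"
  else
    value
  end
end

# Stand-in for facts.values; the bytes are invented for illustration.
sample_facts = {
  'clientcert' => 'mynode.example.org',
  'ec2_userdata' => "\x1F\x8B\x08\x00\xFF\xFE".b
}

puts(json_safe(sample_facts).to_json)
```

In custom_facts.rb this would amount to calling the helper on facts.values before to_json. Anything consuming the fact would then have to know about the prefix and decode it.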
With the workaround, apply works as expected:
$ bolt apply --clear-cache -v notify.pp --run-as root -t mynode.example.org
--- snip ---
Notice: /Stage[main]/Main/Notify[Testing]/message: defined 'message' as 'Testing'
changed: 1, failed: 0, unchanged: 0 skipped: 0, noop: 0
--- snip ---
Environment
- Bolt controller
- Fedora release 35 (Thirty Five)
- Bolt 3.22.1
- Target node
- Ubuntu 20.04
I think the userdata looks wonky because it is gzipped (see here). I can experiment with "gzip = false" and see if that makes things better. Even if it does, I think facter might handle gzipped content more gracefully by decompressing it on the fly. I fail to see how having a binary blob as a fact could be useful without further processing (e.g. passing it directly to a decompression function).
I tested disabling gzip compression of the user data and the problem went away. So the binary data is definitely to blame here.
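For reference, decompressing on the fly could look roughly like the following. This is a sketch of the idea only, not facter code: maybe_gunzip and GZIP_MAGIC are hypothetical names, and the check relies on the two-byte gzip magic number (0x1f 0x8b) at the start of the data.

```ruby
require 'zlib'

# The first two bytes of every gzip stream.
GZIP_MAGIC = "\x1F\x8B".b

# Hypothetical helper: return the decompressed payload if the data looks
# like gzip, otherwise (or if decompression fails) return it unchanged.
def maybe_gunzip(data)
  bytes = data.b
  return data unless bytes.start_with?(GZIP_MAGIC)

  Zlib.gunzip(bytes)
rescue Zlib::Error
  data
end

# Example: a small shell script compressed the way gzipped userdata would be.
compressed = Zlib.gzip("#!/bin/bash\necho hello\n")
puts(maybe_gunzip(compressed))
```

Decompressed userdata is not guaranteed to be valid UTF-8 either, so this would reduce the problem rather than remove it.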
This issue has not had activity for 60 days and will be marked as stale.
If this issue continues to have no activity for 7 days, it will be closed.
Ping. I don't think this issue has just disappeared magically.
This issue has not had activity for 60 days and will be marked as stale.
If this issue continues to have no activity for 7 days, it will be closed.
This issue is stale and has been closed. If you believe this is in error,
or would like the Bolt team to reconsider it, please reopen the issue.