bpg/terraform-provider-proxmox

proxmox_virtual_environment_file snippet creation fails when proxmox is in cluster mode

Closed this issue · 4 comments

Describe the bug
Using proxmox_virtual_environment_file with a Proxmox instance in cluster mode fails.

To Reproduce
I encounter this bug when the following parameters are set to non-default values:

resource "proxmox_virtual_environment_file" "cloud_config" {
  provider     = proxmox-bpg.an_alias
  content_type = "snippets"
  datastore_id = "custom"
  node_name    = "my_node"

  source_file {
    path = local_file.cloud_config_txt.filename
  }
}

For reference, `local_file.cloud_config_txt` refers to something along the lines of:

resource "local_file" "cloud_config_txt" {
  content = templatefile("${path.root}/common/30.cloud-config.txt.tmpl",
    {
      some_var = data.some_ref.something
    })
  filename = "${path.root}/common/generated_file.txt"
}

various tests worked with the proxmox instance not in cluster mode, however I do need to eventually configure it as such (kubernetes Proxmox's CSI requires it).

Terraform says the resource is to be created, and seems to proceed with doing so:

proxmox_virtual_environment_file.cloud_config: Creating...

however it fails:

Error: failed to open remote file /srv/snippets/generated_file.txt: file does not exist

Expected behavior
I would expect the snippet file to be created

Additional context

  • I tried the resource with both source_file and source_raw. A few tests with source_raw led to the same issue in both clustered and non-clustered mode, so I switched to an intermediate local_file resource for that reason.
  • 0.47 works; however, some storage-management features had bugs that were only fixed in later versions.

  • Single or clustered Proxmox: clustered
  • Proxmox version: 8.2.7
  • Provider version (ideally it should be the latest version): it seems all versions after 0.47 are impacted up until 0.67
  • Terraform/OpenTofu version: OpenTofu v1.8.5
  • OS (where you run Terraform/OpenTofu from): darwin_arm64
  • Debug logs (TF_LOG=DEBUG terraform apply): I edited a few things in my examples above, so the logs wouldn't be accurate, but I can prepare a mock setup if debug logs are necessary
bpg commented

Hey @replicajune! 👋🏼

I don't expect any issues with snippets in a PVE cluster; I'm using them myself, in a cluster, on a CephFS datastore.

Could be related to your datastore config.
First of all, have you enabled snippets for the datastore?

Then, can you upload a snippet manually there?

If you can, could you run pvesm status and pvesm list custom (assuming custom is your datastore id), and post output here?

If you see your file in the list output, could you run pvesm path <full file id from the list> for that file and also post the result?
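
Putting those checks together, a minimal diagnostic sketch (assuming custom is your datastore ID and the snippet ends up as generated_file.txt, per your config above):

# pvesm status
# pvesm list custom
# pvesm path custom:snippets/generated_file.txt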

Hello @bpg!

Awesome to hear it should work :) Here are the answers:

  • I can upload the file manually over SSH. My datastore is of directory type.
  • The datastore is also configured to host snippets; I can see the file I uploaded with scp as per the previous point.
  • pvesm status doesn't report anything unusual for that particular datastore, although I should mention that it's a dedicated file system mounted on /srv. Snippets are in /srv/snippets; /srv is the folder configured for the datastore (of type dir).
  • Listing the contents of that datastore reports the manually uploaded snippet, as well as a few LXC templates and cloud images (for each, one is a link to the other, hence the identical file sizes):
# pvesm list custom
Volid                                   Format  Type            Size VMID
custom:iso/2024-10-08.tumbleweed.img     iso     iso       1563230208
custom:iso/tumbleweed.img                iso     iso       1563230208
custom:snippets/generated_file.txt       snippet snippets             794
custom:vztmpl/2024-08.tumbleweed.tar.gz  tgz     vztmpl      71820718
custom:vztmpl/tumbleweed.tar.gz          tgz     vztmpl      71820718
  • Then, pvesm path reports the accurate path of the manually uploaded file:
# pvesm path custom:snippets/generated_file.txt
/srv/snippets/generated_file.txt

Another test I did with that manually uploaded file was to see if tofu would pick it up, and it did. If I then destroy it, tofu is also able to remove it; however, another initial apply loops back to not being able to upload the file.

I think where your setup and mine differ could be the upload mechanism, since I go over SSH to a folder instead of Ceph, as the debug logs seem to report:

2024-11-21T21:07:09.425-0500 [DEBUG] provider.terraform-provider-proxmox_v0.67.0: uploading file to the node datastore via SSH input stream : @module=proxmox content_type=snippets node_address="map[Address:[redacted] Port:[redacted] ]" tf_req_id=a450da56-f6ae-325d-c13d-5eb6a041d5d6 tf_mux_provider=tf5to6server.v5tov6Server tf_provider_addr=registry.terraform.io/bpg/proxmox @caller=/home/runner/work/terraform-provider-proxmox/terraform-provider-proxmox/proxmox/ssh/client.go:292 file_name=generated_file.txt remote_dir=/srv tf_resource_type=proxmox_virtual_environment_file tf_rpc=ApplyResourceChange timestamp=2024-11-21T21:07:09.425-0500

Another thing I notice while reading that debug log is that the remote_dir this library uses to upload the snippet is not the full path but the parent directory (just /srv instead of /srv/snippets).

bpg commented

Could you also check syslog for sudo commands the provider used for upload?
I'm curious how your paths resolved in the end.

I suspect you may need to update your sudoers permissions. The example from the docs

terraform ALL=(root) NOPASSWD: /usr/bin/tee /var/lib/vz/*

assumes a default local datastore config, which is mounted under /var/lib/vz/
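
To confirm where a given datastore is mounted, its entry in the storage config can be checked; a sketch, assuming a directory-type datastore named custom (the path and content lines shown are illustrative):

# grep -A 2 'dir: custom' /etc/pve/storage.cfg
dir: custom
        path /srv
        content snippets,iso,vztmpl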

Try to add a new line to sudoers:

terraform ALL=(root) NOPASSWD: /usr/bin/tee /srv/*

If it works, I'd recommend locking it down a bit more, limiting it to the /srv/snippets/* subfolder for your use case.
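
A tightened rule might look like this (a sketch, assuming snippets are the only content the provider needs to write under /srv):

terraform ALL=(root) NOPASSWD: /usr/bin/tee /srv/snippets/*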

Thanks for mentioning /usr/bin/tee; this helped me find the culprit. I'm keeping the details below in case this issue is of help in the future.

About your follow-up investigation questions:

  • I have a rather permissive sudoers config on that node; my user has an ALL=(ALL) NOPASSWD:ALL setup.
  • Regarding system logs, when I execute tofu, I am not seeing sudo commands passing by. The only two log lines showing up are the following (since it's a testing environment I haven't tightened security much, so the provider uses the root account to interact with Proxmox directly):
Nov 22 07:23:18 [redacted] pvedaemon[3656153]: <root@pam> successful auth for user 'root@pam'
Nov 22 07:23:18 [redacted] sshd[4041704]: Accepted publickey for [redacted] from [redacted] port [redacted] ssh2: [redacted]

If I revert the provider to version 0.47.0, it works again; it breaks again starting with 0.48.0.

Your mentioning /usr/bin/tee got me curious, and looking into the changes between 0.47.0 and 0.48.0 points to a mechanism change in how SSH connections are handled, so I went ahead and enabled debug logging for sshd, and found this:

Nov 22 07:40:34 [redacted] sshd[4045316]: debug3: mm_audit_run_command entering command try_sudo(){ if [ $(sudo -n pvesm apiinfo 2>&1 | grep "APIVER" | wc -l) -gt 0 ]; then sudo $1; else $1; fi }; try_sudo "/usr/bin/tee /srv/snippets/generated_file.txt"

That might help fewer than 0.1% of people, but since I'm using fish as the login shell, this fails very silently: fish can't parse the POSIX function definition in the command above. The very quick fix is to revert my setup to bash.
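
For reference, a minimal way to do that on the node; a sketch, assuming the SSH user is named terraform (run as root):

# chsh -s /bin/bash terraform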

The longer fix could be a number of things (encapsulating the shell bit in an explicit sh subshell?), although I doubt there are many users out there relying on anything other than [b/z]sh, really.
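
To illustrate the subshell idea: the command from the sshd debug log above could hypothetically be wrapped so the login shell (fish included) only has to parse a plain sh invocation, with an explicit POSIX sh running the script. This is not how the provider currently builds the command; note the inner script contains no single quotes, so single-quoting it is safe:

sh -c 'try_sudo(){ if [ $(sudo -n pvesm apiinfo 2>&1 | grep "APIVER" | wc -l) -gt 0 ]; then sudo $1; else $1; fi }; try_sudo "/usr/bin/tee /srv/snippets/generated_file.txt"'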

The documentation does say to use bash as the default shell (https://github.com/bpg/terraform-provider-proxmox/blob/main/docs/index.md?plain=1#L235), although given the above sshd debug log, I think this is a requirement for the user used for the SSH connection too.

Given that finding, I'll change my setup accordingly. I would be curious to know if it's possible to encapsulate such shell commands in an explicit sh/bash subshell in the future, but we can also close this issue if you see fit, especially since the docs already point out the standard-shell requirement.

Thank you for your assistance, and your work on this provider, greatly appreciated!