ocp-power-automation/ocp4-upi-powervs

OCP Cluster creation is failing in powervs using rhcos-413

mdafsanhossain opened this issue · 8 comments

PowerVS VMs go into the shutoff state during cluster creation when rhcos-413.92.202306140611-0-powervs.ppc64le.ova.gz is used as the image.
The Terraform automation gets stuck at 70% and throws a remote-exec provisioner error.

How to reproduce?

var.tfvars

  ibmcloud_zone = "mon01"
  service_instance_id = "8994821d-cdad-4df8-bf2b-d577a890ac24"
  ibmcloud_api_key = ""
  rhel_image_name =  "rhel-86-05162022-tier1"
  rhcos_import_image              = true
  rhcos_import_image_filename     = "rhcos-413.92.202306140611-0-powervs.ppc64le.ova.gz"
  rhcos_import_image_storage_type = "tier1"
  system_type =  "s922"
  network_name =  "ocp-private-network"
  openshift_install_tarball =  "https://mirror.openshift.com/pub/openshift-v4/ppc64le/clients/ocp/4.13.4/openshift-client-linux.tar.gz"
  openshift_client_tarball =  "https://mirror.openshift.com/pub/openshift-v4/ppc64le/clients/ocp/4.13.4/openshift-client-linux.tar.gz"
  pull_secret_file = "/home/ubuntu/pull-secret.txt"
  rhel_subscription_username = ""
  rhel_subscription_password = ""
  ## Small Configuration Template
  bastion   = { memory = "16", processors = "0.5", "count" = 1 }
  bootstrap = { memory = "32", processors = "0.5", "count" = 1 }
  master    = { memory = "32", processors = "0.5", "count" = 3 }
  worker    = { memory = "32", processors = "0.5", "count" = 2 }
  storage_type = "nfs"
  volume_size = "200"
  cluster_id_prefix = "rhcos-test"
  cluster_domain = "nip.io"

I created the cluster using https://github.com/ocp-power-automation/openshift-install-power/blob/devel/openshift-install-powervs

./openshift-install-powervs create -var-file var.tfvars

cc: @yussufsh

Shutting down rhcos VMs is part of the install process. Can you show me what errors you are seeing in the console?

This is the terraform log. VMs in the shut off state are not coming back up.

[verify_data] Found id_rsa & id_rsa.pub in current directory
[powervs_login] Trying to login with the provided IBMCLOUD_API_KEY...
[powervs_login] Targeting 'oss-community-resources-montreal' with Id crn:v1:bluemix:public:power-iaas:mon01:a/108655d3ff9e4489b1c29e83df48623d:8994821d-cdad-4df8-bf2b-d577a890ac24::
[init_terraform] Initializing Terraform plugins and validating the code...
[apply] Running terraform apply... please wait
Attempt: 1/5
[retry_terraform] Encountered below errors:         (70%)
│ Error: remote-exec provisioner error
[retry_terraform] WARN: Issues were seen while running the terraform command. Attempting to run again...
Attempt: 2/5
[retry_terraform] Encountered below errors:         (70%)
│ Error: remote-exec provisioner error
[retry_terraform] WARN: Issues were seen while running the terraform command. Attempting to run again...
Attempt: 3/5
[retry_terraform] Encountered below errors:         (70%)
│ Error: remote-exec provisioner error
[retry_terraform] WARN: Issues were seen while running the terraform command. Attempting to run again...
Attempt: 4/5
[retry_terraform] Encountered below errors:         (70%)
│ Error: remote-exec provisioner error
[retry_terraform] WARN: Issues were seen while running the terraform command. Attempting to run again...
Attempt: 5/5
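
The retry output above can be condensed into a per-attempt summary, which makes it easier to confirm that every attempt fails with the same error. This is a minimal Python sketch, assuming the output format shown above (`Attempt: N/M` lines followed by `│ Error: ...` lines). `summarize_attempts` is a hypothetical helper for illustration and is not part of openshift-install-powervs.

```python
import re

# Patterns assume the console format shown in this issue.
ATTEMPT_RE = re.compile(r"^Attempt:\s*(\d+)/(\d+)")
ERROR_RE = re.compile(r"Error:\s*(.+)$")


def summarize_attempts(log_text):
    """Map each attempt number to the error messages logged under it."""
    attempts = {}
    current = None
    for line in log_text.splitlines():
        attempt = ATTEMPT_RE.match(line.strip())
        if attempt:
            current = int(attempt.group(1))
            attempts[current] = []
            continue
        error = ERROR_RE.search(line)
        if error and current is not None:
            attempts[current].append(error.group(1).strip())
    return attempts


sample = """\
Attempt: 1/5
[retry_terraform] Encountered below errors:         (70%)
│ Error: remote-exec provisioner error
[retry_terraform] WARN: Issues were seen while running the terraform command. Attempting to run again...
Attempt: 2/5
[retry_terraform] Encountered below errors:         (70%)
│ Error: remote-exec provisioner error
"""

for number, errors in summarize_attempts(sample).items():
    print(number, errors)
```

An attempt with an empty error list (such as the final `Attempt: 5/5` above) means the console output ended before any error was reported for it.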

I don't have the logs from the VMs themselves.

As I said, the VMs are supposed to be in the shutoff state during this stage of the install. Can you attach the last log file from the logs folder?
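
One way to locate the last log file is to pick the most recently modified file in the logs folder. This is a minimal Python sketch; the folder name `logs` (relative to the working directory) and the `*.log` pattern are assumptions based on the file names attached to this issue, and `latest_log` is a hypothetical helper.

```python
from pathlib import Path


def latest_log(log_dir="logs", pattern="*.log"):
    """Return the most recently modified log file in log_dir, or None if there is none."""
    candidates = [p for p in Path(log_dir).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Compare by modification time so the newest apply log wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)


newest = latest_log()
print(newest if newest else "no log files found")
```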

What is the status of the bootstrap node? If it is shut down, please start it and rerun the script. Otherwise, check the VM console and see what it reports.

It was in the Warning state. I am creating the cluster from scratch again.

@yussufsh I noticed that the bootstrap node becomes unreachable after it comes back up and it gets stuck.

I tried installing 4.12 this time. It goes to 114% and gets stuck. Bootstrap node is accessible.

ocp4-upi-powervs_20230627075921_apply_1.log
ocp4-upi-powervs_20230627075921_apply_3.log