mongodb/terraform-provider-mongodbatlas

[Bug]: mongodbatlas_advanced_cluster drift after 1.18.1 migration

Closed this issue · 3 comments

cah6 commented

Is there an existing issue for this?

  • I have searched the existing issues

Provider Version

1.18.1

Terraform Version

v1.9.5

Terraform Edition

Terraform Open Source (OSS)

Current Behavior

I followed the migration guide to upgrade to 1.18.1 (https://registry.terraform.io/providers/mongodb/mongodbatlas/latest/docs/guides/advanced-cluster-new-sharding-schema#migrate-advanced_cluster-type-sharded) on a sharded cluster: I removed num_shards = 2 and instead repeated the replication_specs block once per shard. This signals to Terraform that the cluster needs to be updated (expected), and when the change is applied, the Atlas UI's activity tab reports "No changes to the cluster were detected in the update submitted through the public API." (also expected). However, even after the apply, Terraform still thinks there's drift; it seems the API keeps returning the cluster in the old format (a replicationSpecList with numShards = 2), so there's no way to get to a clean state.
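
To be concrete, the schema change from the guide boils down to the following (a sketch; the full before-and-after configurations are in the sections below):

# old sharding schema: shard count expressed via num_shards
replication_specs {
  num_shards = 2
  zone_name  = "Zone 1"
  # region_configs ...
}

# new sharding schema (1.18+): one replication_specs block per shard
replication_specs {
  zone_name = "Zone 1"
  # region_configs ...
}
replication_specs {
  zone_name = "Zone 1"
  # region_configs ...
}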

Terraform configuration to reproduce the issue

resource "mongodbatlas_advanced_cluster" "cluster0" {
  backup_enabled                 = true
  cluster_type                   = "SHARDED"
  encryption_at_rest_provider    = "NONE"
  mongo_db_major_version         = "7.0"
  name                           = "Cluster0"
  paused                         = false
  pit_enabled                    = false
  project_id                     = mongodbatlas_project.main.id
  termination_protection_enabled = false
  version_release_system         = "LTS"

  advanced_configuration {
    javascript_enabled           = true
    minimum_enabled_tls_protocol = "TLS1_2"
    no_table_scan                = false
    oplog_size_mb                = 196800
  }

  dynamic "replication_specs" {
    # repeat this block once per shard; all shards have identical specs
    for_each = range(2)

    content {
      zone_name = "Zone 1"

      region_configs {
        priority      = 7
        provider_name = "GCP"
        region_name   = "CENTRAL_US"

        auto_scaling {
          compute_enabled            = false
          compute_scale_down_enabled = false
          disk_gb_enabled            = true
        }

        electable_specs {
          instance_size = "M30" # changed from our normal size
          node_count    = 3
        }
      }
    }
  }
}

Steps To Reproduce

The configuration above is what our cluster looks like now, but you may need to first create the cluster with the old syntax and then change to the new syntax. That is, create the cluster with:

resource "mongodbatlas_advanced_cluster" "cluster0" {
  backup_enabled                 = true
  cluster_type                   = "SHARDED"
  encryption_at_rest_provider    = "NONE"
  mongo_db_major_version         = "7.0"
  name                           = "Cluster0"
  paused                         = false
  pit_enabled                    = false
  project_id                     = mongodbatlas_project.main.id
  termination_protection_enabled = false
  version_release_system         = "LTS"

  advanced_configuration {
    javascript_enabled           = true
    minimum_enabled_tls_protocol = "TLS1_2"
    no_table_scan                = false
    oplog_size_mb                = 196800
  }

  replication_specs {
    num_shards = 2
    zone_name  = "Zone 1"

    region_configs {
      priority      = 7
      provider_name = "GCP"
      region_name   = "CENTRAL_US"

      auto_scaling {
        compute_enabled            = false
        compute_scale_down_enabled = false
        disk_gb_enabled            = true
      }

      electable_specs {
        instance_size = "M30"
        node_count    = 3
      }
    }
  }
}

then change the config to the one in the "Terraform configuration to reproduce the issue" section above.

That said, I'm not sure this will reliably reproduce the issue, since we made the same change on another cluster and it doesn't produce the same drift. Besides cluster tier, the primary difference is that the other cluster is GEOSHARDED, which may be related.
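
If it does reproduce, the drift survives the apply, so the sequence to observe it should be roughly this (the exact plan output is in my comment below):

terraform apply   # Atlas activity tab: "No changes to the cluster were detected ..."
terraform plan    # still reports drift on the replication_specs / num_shards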

Logs

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Thanks for opening this issue! Please make sure you've followed our guidelines when opening the issue. In short, to help us reproduce the issue we need:

  • Terraform configuration file used to reproduce the issue
  • Terraform log files from the run where the issue occurred
  • Terraform Atlas provider version used to reproduce the issue
  • Terraform version used to reproduce the issue
  • Confirmation if Terraform OSS, Terraform Cloud, or Terraform Enterprise deployment

The ticket CLOUDP-271592 was created for internal tracking.

cah6 commented

Hmm. So we re-ran apply against the same reported drift, which was:

  # mongodbatlas_advanced_cluster.cluster0 will be updated in-place
~ resource "mongodbatlas_advanced_cluster" "cluster0" {
        id                                               = "elided"
        name                                             = "Cluster0"
        # (18 unchanged attributes hidden)

      ~ replication_specs {
            id           = "elided"
          ~ num_shards   = 2 -> 1
            # (4 unchanged attributes hidden)

            # (1 unchanged block hidden)
        }
      + replication_specs {
          + container_id = (known after apply)
          + num_shards   = 1
          + zone_name    = "Zone 1"

          + region_configs {
              + priority      = 7
              + provider_name = "GCP"
              + region_name   = "CENTRAL_US"

              + analytics_auto_scaling (known after apply)

              + analytics_specs (known after apply)

              + auto_scaling {
                  + compute_enabled            = false
                  + compute_scale_down_enabled = false
                  + disk_gb_enabled            = true
                }

              + electable_specs {
                  + instance_size = "M30"
                  + node_count    = 3
                }

              + read_only_specs (known after apply)
            }
        }

        # (2 unchanged blocks hidden)
    }

and this time it seems like there's no terraform plan drift. Maybe the cluster API got fixed to properly apply the change?
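
For anyone wanting to check the same thing without risking changes, a refresh-only plan should show whether the provider still sees drift (plain Terraform CLI, nothing provider-specific):

terraform plan -refresh-only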

I'll close this but will re-open if it somehow drifts back to thinking the tf state and the cluster state are different.

Thanks @cah6 for opening the issue.
I have tried to reproduce this without success. In any case, please reopen this issue or open a new one if it happens again.
Thanks again!