drupalwxt/helm-drupal

Persistent storage and environment specific variable question

nathanpw opened this issue · 2 comments

I have a question, or rather I am looking for confirmation of my understanding of the persistent storage mechanisms available to Drupal via these Helm charts.

Context: We need to differentiate between environments (think dev/test/prod) for a Drupal migration. The migrations contain environment-specific variables that we need to set and access within the containers. Things like URLs or database connections differ between dev, test, and prod. We have a mechanism to set the variables for each environment via our deployment tooling.

Concern: Depending on where these variables are set (say, in code within the containers), they may not persist if the containers are destroyed and come back up in Kubernetes, at least as I understand it.

Assumptions/Understanding:

  • Files in the container are not persistent. Any changes made to files at runtime in Kubernetes in the containers (except the mounted persistent file storage) will be lost if the containers get destroyed and "spin" back up.
  • Linux environment variables within the containers will not persist. Nor will they propagate to all containers in a multi-container/replicated setup?
  • Database is persistent. Could be used (thinking Drupal config) and can be set with the post-install or post-update methods.
  • File storage (mounted storage) is persistent. Although not ideal for this context, it could be used and set with the post-install or post-update methods.
  • Settings.php is persistent and can be modified via "extraSettings" and will persist if the container gets destroyed in Kubernetes and comes back up.

I am pretty sure about the above, but would like to confirm it (especially the settings.php point) and to know whether there are other "persistent storage mechanisms" that we are unaware of and that could be used in this context (persistent environment variables available in the containers).

I appreciate any input, suggestions, or feedback, and all the work that has gone into these charts!

sylus commented

@zachomedia might be the best person to answer this, but I can make an attempt. Based on my limited understanding, I think what you really want is ConfigMaps (https://kubernetes.io/docs/concepts/configuration/configmap/), but I could be misunderstanding.

Assumptions/Understanding:

a) Files in the container are not persistent. Any changes made to files at runtime in Kubernetes in the containers (except the mounted persistent file storage) will be lost if the containers get destroyed and "spin" back up.

This is technically correct: the files in the container image itself are immutable, so if an end user changes them and you restart the container, they revert to their original state. However, any file or directory can be backed by a volume mount instead, which will be persisted. A volume mount can be of several types:

  • Configmap: This is what settings.php and other files like nginx.conf use. Can see how this is done at https://github.com/drupalwxt/helm-drupal/tree/master/drupal/templates/cm
  • Persistent Volume Claim: This is when a disk is attached. It replaces the directory and files at the specified volume mount point, and anything done in this folder structure will be persisted. Can see how this is done at https://github.com/drupalwxt/helm-drupal/tree/master/drupal/templates/pvc. If you have many replicas, take care to check that your disk is ReadWriteMany rather than ReadWriteOnce.
  • Persistent Volume Claim (NFS): This is almost like the regular PVC logic, but instead of a disk we use a cloud service like Azure Files. This allows the container to be stateless, as these mount points are handled automatically on container start and result in no mount waiting time.
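
As a rough illustration of the mount types listed above, here is a generic Kubernetes sketch. It is not taken from the chart's templates: the resource names (`drupal-example`, `drupal-settings`, `drupal-public-files`), the image, the storage size, and the settings.php path are all hypothetical placeholders.

```yaml
# Hypothetical claim backing the public files directory.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: drupal-public-files
spec:
  accessModes:
    - ReadWriteMany        # lets replicas on different nodes mount the same volume
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: drupal-example
spec:
  containers:
    - name: drupal
      image: example.org/drupal:latest   # placeholder image
      volumeMounts:
        # A single file projected from a ConfigMap (assumed path).
        - name: settings
          mountPath: /var/www/html/sites/default/settings.php
          subPath: settings.php
        # A directory backed by the claim above; writes here survive restarts.
        - name: public-files
          mountPath: /var/www/html/sites/default/files
  volumes:
    - name: settings
      configMap:
        name: drupal-settings            # hypothetical ConfigMap holding settings.php
    - name: public-files
      persistentVolumeClaim:
        claimName: drupal-public-files
```

Anything written outside the two mount paths lives in the container's own writable layer and is lost when the container is recreated.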

b) Linux environment variables within the containers will not persist. Nor will it propagate to all containers in a multi container/replication setup?

Environment variables persist across container restarts as long as they are passed in the YAML spec, which the Helm chart supports. However, if an end user logs into the container and adds environment variables there, those will not persist across restarts.
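
To make the "passed in the YAML spec" point concrete, here is a generic Kubernetes sketch showing both an inline variable and variables loaded from a ConfigMap. It is not the chart's own template, and every name and value in it (`migration-env`, `MIGRATION_SOURCE_URL`, `DEPLOY_ENV`, the image) is a hypothetical placeholder.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: migration-env                    # hypothetical name
data:
  MIGRATION_SOURCE_URL: "https://dev.example.org"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: drupal-example
spec:
  replicas: 2                            # every replica receives the same variables
  selector:
    matchLabels:
      app: drupal-example
  template:
    metadata:
      labels:
        app: drupal-example
    spec:
      containers:
        - name: drupal
          image: example.org/drupal:latest   # placeholder image
          env:
            # Declared inline in the spec, so it is re-applied on every restart.
            - name: DEPLOY_ENV
              value: dev
          envFrom:
            # Each key in the ConfigMap becomes an environment variable.
            - configMapRef:
                name: migration-env
```

Because the variables are part of the pod template, they are set again whenever a container is recreated. Note that variables sourced from a ConfigMap are read at container start, so a changed ConfigMap only takes effect once the pods are restarted.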

zachomedia commented

Thank you @nathanpw for your questions! @sylus has provided some good information on this so far, and I'll add to it by responding directly to your assumptions/understandings:

Files in the container are not persistent. Any changes made to files at runtime in Kubernetes in the containers (except the mounted persistent file storage) will be lost if the containers get destroyed and "spin" back up.

This is correct. As @sylus mentions, by default the files in the container are immutable, because the user the container runs as does not have write permission to them. If you are able to edit a file (because you run the container as a different user), the change would be automatically reset when the container is restarted.

The only file system data that persists when changed is in the file mounts: /var/www/html/sites/default/files and /private.

Linux environment variables within the containers will not persist. Nor will it propagate to all containers in a multi container/replication setup?

If you set an environment variable within a running container, then you are correct: it will not persist across restarts, nor will it be present in all of the containers.

However, the helm chart does allow you to specify your own environment variables to make available to the containers:

https://github.com/drupalwxt/helm-drupal/blob/master/drupal/templates/deploy/drupal.yaml#L137

What this looks like in your values passed to helm is:

extraVars:
  - name: MY_CUSTOM_ENVVAR
    value: "my value"

As an aside, it would probably make more sense to move this extraVars setting under drupal, but that's not really relevant to this question. It is also missing from the default values.yaml, so it is not obvious that it is available.

This would allow you to use your deployment tool to set the environment variable to the value you would like in each environment.
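
One way this could look in practice is a small values file per environment. This is only a sketch: the file names, the variable names other than `MY_CUSTOM_ENVVAR`, and the URLs are hypothetical, and `extraVars` is shown at the top level as in the snippet above.

```yaml
# values-dev.yaml (hypothetical file name)
extraVars:
  - name: MY_CUSTOM_ENVVAR
    value: "dev"
  - name: MIGRATION_SOURCE_URL           # hypothetical variable
    value: "https://dev.example.org"
---
# values-prod.yaml (hypothetical file name)
extraVars:
  - name: MY_CUSTOM_ENVVAR
    value: "prod"
  - name: MIGRATION_SOURCE_URL
    value: "https://www.example.org"
```

The two documents are separated by `---` here only for display; in practice each would be its own file, and the deployment tool would pass the matching one to Helm with the `-f` flag for the target environment.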

Database is persistent. Could be used (thinking Drupal config) and can be set with the post-install or post-update methods.

This is correct, the database is persistent. You can run arbitrary drush commands with the drupal.extraInstallScripts and drupal.extraUpgradeScripts settings in the helm chart (https://github.com/drupalwxt/helm-drupal/blob/master/drupal/values.yaml#L84).
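
A minimal sketch of storing an environment-specific value in Drupal config this way is below. It assumes that both settings accept a multi-line string of shell commands and that the environment variable is visible to the container running the scripts; the config object `my_migration.settings`, its `source_url` key, and `MIGRATION_SOURCE_URL` are hypothetical.

```yaml
drupal:
  # Assumption: run once, after the initial site install.
  extraInstallScripts: |-
    drush config:set my_migration.settings source_url "$MIGRATION_SOURCE_URL" -y
  # Assumption: run on each upgrade, so a changed value is re-applied.
  extraUpgradeScripts: |-
    drush config:set my_migration.settings source_url "$MIGRATION_SOURCE_URL" -y
```

Since the value ends up in the database, it is shared by all replicas and survives container restarts.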

File storage (mounted storage) is persistent. Although not ideal for this context, it could be used set with post-install or post update methods.

This is correct.

Settings.php is persistent and can be modified via "extraSettings" and will persist if the container gets destroyed in Kubernetes and comes back up.

This is correct, you can customize the settings.php file using the drupal.extraSettings field in the helm chart (https://github.com/drupalwxt/helm-drupal/blob/master/drupal/values.yaml#L80).

The extra settings are included after the provided settings.php file, which allows you to override an existing value or define a new value.
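
As a sketch of both uses (defining a new value and overriding an existing one), the snippet below assumes that `drupal.extraSettings` takes a multi-line string of PHP. The setting name `migration_source_url` and the fallback URL are hypothetical, and `MY_CUSTOM_ENVVAR` is the variable from the `extraVars` example earlier in this comment.

```yaml
drupal:
  extraSettings: |-
    // New value: read the variable injected through extraVars, with a fallback.
    $settings['migration_source_url'] = getenv('MY_CUSTOM_ENVVAR') ?: 'https://fallback.example.org';
    // Override: wins over any earlier assignment because this block is included last.
    $config['system.performance']['css']['preprocess'] = FALSE;
```

One caveat: `getenv()` only sees the container's variables if PHP-FPM passes the environment through. PHP-FPM's `clear_env` directive defaults to `yes`, so this sketch assumes the image has it set to `no`.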

These are all of the ways of storing data within the Drupal WxT Helm configuration. I hope this additional context on top of what @sylus provided answers your questions! Please let me know if you have any additional questions on this :)