vtctldclient backup not working with defined VitessBackupStorages in cluster
voarsh2 opened this issue · 5 comments
I run:
./vtctldclient --server 192.168.100.103:31487 Backup --allow-primary zone1-30573399
I get:
E0914 01:23:16.047192 3349604 main.go:56] rpc error: code = Unknown desc = TabletManager.Backup on zone1-0030573399 error: unable to get backup storage: no registered implementation of BackupStorage: unable to get backup storage: no registered implementation of BackupStorage
However, in the cluster spec I have defined a hostPath for the backups - and I can see it show up in VitessBackupStorages:
apiVersion: planetscale.com/v2
kind: VitessBackupStorage
metadata:
  creationTimestamp: '2023-09-14T00:52:24Z'
  generation: 1
  labels:
    backup.planetscale.com/location: ''
    planetscale.com/cluster: example
  managedFields:
  - apiVersion: planetscale.com/v2
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:backup.planetscale.com/location: {}
          f:planetscale.com/cluster: {}
        f:ownerReferences:
          .: {}
          k:{"uid":"272e7a69-a91f-4196-ad2d-8930c88c2715"}: {}
      f:spec:
        .: {}
        f:location:
          .: {}
          f:volume:
            .: {}
            f:hostPath:
              .: {}
              f:path: {}
              f:type: {}
    manager: vitess-operator
    operation: Update
    time: '2023-09-14T00:52:24Z'
  - apiVersion: planetscale.com/v2
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:observedGeneration: {}
    manager: vitess-operator
    operation: Update
    subresource: status
    time: '2023-09-14T00:53:06Z'
  name: example-90089e05
  namespace: vitess
  ownerReferences:
  - apiVersion: planetscale.com/v2
    blockOwnerDeletion: true
    controller: true
    kind: VitessCluster
    name: example
    uid: 272e7a69-a91f-4196-ad2d-8930c88c2715
  resourceVersion: '233908598'
  uid: 24068990-be71-49b5-ad17-773f581170a9
spec:
  location:
    volume:
      hostPath:
        path: /mnt/minio-store/vitess-backups
        type: Directory
status:
  observedGeneration: 1
I restarted the primary and saw "--file_backup_storage_root=/vt/backups/example" was added, but this is not the path I specified in the cluster.
Hi @voarsh2,
This is not the best place to try and get support/help. You should instead use the Vitess slack as this kind of thing requires a lot of back and forth: https://vitess.io/community/ There are also many people there from the community that are using the operator in production.
I don't know anything about your setup (k8s version, vitess operator version, etc.), nor what you've done - e.g. the VitessCluster CRD definition you used - nor what you want to do (how you want the backups to be performed).
It's clear that something isn't quite right but w/o any details I cannot say what.
In the meantime you can find the CRD/API reference here: https://github.com/planetscale/vitess-operator/blob/main/docs/api.md
You can see some example walkthroughs here: https://github.com/planetscale/vitess-operator/tree/main/docs
And a blog post: https://vitess.io/blog/2020-11-09-vitess-operator-for-kubernetes/
And the Vitess backup docs: https://vitess.io/docs/17.0/user-guides/operating-vitess/backup-and-restore/
The backups are very configurable and again I have no idea what you've specified. At the Vitess level, the error you shared is Vitess telling you that the component (vtctld, vttablet, vtbackup) has no value for its --backup_storage_implementation flag. What backup implementation are you trying to use, e.g. file, s3, ceph (see https://github.com/planetscale/vitess-operator/tree/main/pkg/operator/vitessbackup)?
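For reference, when the file implementation is wired up, the affected component ends up being started with flags along these lines (an illustrative sketch, not taken from your cluster):

vttablet ... --backup_storage_implementation=file --file_backup_storage_root=/vt/backups/example ...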
Between k8s (each install is a snowflake), Vitess, and the Vitess Operator this gets complicated. This is why Slack is easier for things like this. I know that this is complicated for you as well, and the docs are largely non-existent for the operator, but we'd need much more detail in order to try and help.
I can only guess that perhaps you specified something like this in your CRD:
spec:
  backup:
    engine: xtrabackup
    locations:
    - volume:
        hostPath:
          path: /backup
          type: Directory
But guessing doesn't help. 🙂 After knowing the actual CRD definition, we'd have to look at the pod definitions, logs, etc.
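For example, something along these lines (namespace and pod names are placeholders; the container name may differ in your setup) would give us concrete detail to work from:

# List the pods, then dump the spec and logs of one of the tablet pods.
kubectl -n <namespace> get pods
kubectl -n <namespace> get pod <vttablet-pod> -o yaml
kubectl -n <namespace> logs <vttablet-pod> -c vttablet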
Best Regards
Howdy @mattlord
This is not the best place to try and get support/help. You should instead use the Vitess slack as this kind of thing requires a lot of back and forth: https://vitess.io/community/ There are also many people there from the community that are using the operator in production.
Will look to try Slack next time.
As you pointed out:
spec:
  backup:
    engine: xtrabackup
    locations:
    - volume:
        hostPath:
          path: /mnt/minio-store/vitess-backups
          type: Directory
This is what I used for the Vitess Cluster config.
- Using Vitess v18? (latest, from the latest operator).
- Using the Vitess Operator from this commit: https://github.com/planetscale/vitess-operator/tree/e1c70738371cfc8edfdf432718c66461169615f4
I've read most of those links.
The problem now is that, despite the hostPath, the DB pods have --file_backup_storage_root=/vt/backups/example in their command args, not the path I specified.
So, when running ./vtctldclient --server 192.168.100.103:31487 BackupShard commerce/-
I get:
rpc error: code = Unknown desc = TabletManager.Backup on zone1-2469782763 error: StartBackup failed: mkdir /vt/backups/example: permission denied: StartBackup failed: mkdir /vt/backups/example: permission denied
Notice it's not using the hostPath I specified in the cluster configuration.
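The flag the pods actually received can be double-checked on any running tablet pod with something like this (pod name is a placeholder):

kubectl -n vitess get pod <vttablet-pod> -o yaml | grep backup_storage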
VitessBackupStorage
apiVersion: planetscale.com/v2
kind: VitessBackupStorage
metadata:
  labels:
    backup.planetscale.com/location: ""
    planetscale.com/cluster: example
  name: example-90089e05
  namespace: vitess
spec:
  location:
    volume:
      hostPath:
        path: /mnt/minio-store/vitess-backups
        type: Directory
VitessCluster: example
apiVersion: planetscale.com/v2
kind: VitessCluster
metadata:
  annotations:
    objectset.rio.cattle.io/applied: H4sIAAAAAAAA/3zPwY6sIBCF4Xeptdoq0gjb+w69L4oizR0EIzWdSTq++8SZ/SzPv/iS8wbc04OPlmoBB3vGwtIIMw9Ut9trhg4+Ugng4JGEW/uXP5vwAR1sLBhQENwbsJQqKKmWds3q/zNJYxmOVAdCkcxDqrd0OQFjGLUx/R1Z9cuiYo8jh54mY+JiVz8vFs4OMnrOf3JPbE9wgGQU0TytaraavPKaFCkzTlGrqNXk7ervs50utODG4IC/cNszw29oO9JVXz8P4Ty/AwAA//+KvyL+FgEAAA
    objectset.rio.cattle.io/id: dafd0577-6ae3-443f-a0ed-c177f498b249
  labels:
    objectset.rio.cattle.io/hash: ac73cc2183295cb3b5c3c3701f53f531b98b6291
  name: example
  namespace: vitess
spec:
  backup:
    engine: xtrabackup
    locations:
    - volume:
        hostPath:
          path: /mnt/minio-store/vitess-backups
          type: Directory
  cells:
  - gateway:
      authentication:
        static:
          secret:
            key: users.json
            name: example-cluster-config
      replicas: 3
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
    name: zone1
  images:
    mysqld:
      mysql80Compatible: vitess/lite:latest
    mysqldExporter: prom/mysqld-exporter:v0.11.0
    vtadmin: vitess/vtadmin:latest
    vtbackup: vitess/lite:latest
    vtctld: vitess/lite:latest
    vtgate: vitess/lite:latest
    vtorc: vitess/lite:latest
    vttablet: vitess/lite:latest
  keyspaces:
  - durabilityPolicy: semi_sync
    name: commerce
    partitionings:
    - equal:
        parts: 1
        shardTemplate:
          databaseInitScriptSecret:
            key: init_db.sql
            name: example-cluster-config
          tabletPools:
          - cell: zone1
            dataVolumeClaimTemplate:
              accessModes:
              - ReadWriteOnce
              resources:
                requests:
                  storage: 10Gi
            mysqld:
              resources:
                requests:
                  cpu: 100m
                  memory: 512Mi
            replicas: 3
            type: replica
            vttablet:
              extraFlags:
                db_charset: utf8mb4
                disable_active_reparents: "true"
              resources:
                requests:
                  cpu: 100m
                  memory: 256Mi
  - durabilityPolicy: semi_sync
    name: betawonder3
    partitionings:
    - equal:
        parts: 1
        shardTemplate:
          databaseInitScriptSecret:
            key: init_db.sql
            name: example-cluster-config
          tabletPools:
          - cell: zone1
            dataVolumeClaimTemplate:
              accessModes:
              - ReadWriteOnce
              resources:
                requests:
                  storage: 10Gi
            mysqld:
              resources:
                requests:
                  cpu: 500m
                  memory: 512Mi
            replicas: 1
            type: replica
            vttablet:
              extraFlags:
                db_charset: utf8mb4
                disable_active_reparents: "true"
              resources:
                requests:
                  cpu: 100m
                  memory: 256Mi
    turndownPolicy: Immediate
  updateStrategy:
    type: Immediate
  vitessDashboard:
    cells:
    - zone1
    extraFlags:
      security_policy: read-only
    replicas: 1
    resources:
      limits:
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 128Mi
  vtadmin:
    apiAddresses:
    - http://192.168.100.103:31252
    apiResources:
      limits:
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 128Mi
    cells:
    - zone1
    rbac:
      key: rbac.yaml
      name: example-cluster-config
    readOnly: false
    replicas: 1
    webResources:
      limits:
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 128Mi
https://github.com/planetscale/vitess-operator/blob/main/docs/api.md#planetscale.com/v2.VitessBackup - this doesn't show hostPath as a valid option, but I saw it in the sample YAML of the operator. In any case, I can't see any obvious reason why this is not working - I'm not sure why the DB pods are using /vt/backups when I specify a different path. I might give S3 a try next.
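If I do, I'd expect the location to look roughly like this, going by the api.md reference above (the bucket, region, endpoint, and secret names are placeholders I made up, so the exact field names should be verified against the docs):

spec:
  backup:
    locations:
    - s3:
        bucket: my-backup-bucket
        region: us-east-1
        endpoint: http://192.168.100.103:9000   # e.g. a MinIO endpoint
        authSecret:
          name: s3-auth-secret
          key: auth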
I looked at the code and here is what I found:
The operator uses the volume configuration provided in the YAML to create a volume called vitess-backups.
func fileBackupVolumes(volume *corev1.VolumeSource) []corev1.Volume {
	return []corev1.Volume{
		{
			Name:         fileBackupStorageVolumeName,
			VolumeSource: *volume,
		},
	}
}
Next, the operator mounts this volume at a fixed, hardcoded path in the vtbackup and vtctld pods. The path that is used is /vt/backups.
func fileBackupVolumeMounts(subPath string) []corev1.VolumeMount {
	return []corev1.VolumeMount{
		{
			Name:      fileBackupStorageVolumeName,
			MountPath: fileBackupStorageMountPath,
			SubPath:   subPath,
		},
	}
}
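For context, the two constants used in these snippets resolve to the volume name and mount path mentioned above. As a sketch (values inferred from the behavior described here, not copied from the operator source):

// Inferred values, matching the volume name and the /vt/backups path above.
const (
	fileBackupStorageVolumeName = "vitess-backups"
	fileBackupStorageMountPath  = "/vt/backups"
)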
Since the volume is mounted at /vt/backups, this is what is used in the flags for vtctld and vtbackup:
func fileBackupFlags(clusterName string) vitess.Flags {
	return vitess.Flags{
		"backup_storage_implementation": fileBackupStorageImplementationName,
		"file_backup_storage_root":      rootKeyPrefix(fileBackupStorageMountPath, clusterName),
	}
}
So while taking a backup, vtbackup will try to create a directory with the cluster name, in your case example, and then take the backup there.
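That is, rootKeyPrefix presumably just appends the cluster name to the mount path, something like this sketch (not the operator's exact code):

import "path"

// Presumed behavior of rootKeyPrefix: join the mount path and cluster name.
func rootKeyPrefix(mountPath, clusterName string) string {
	return path.Join(mountPath, clusterName) // path.Join("/vt/backups", "example") -> "/vt/backups/example"
}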
☝️ This explains why you are seeing /vt/backups in the error messages: the vtctld and vtbackup binaries have the volume mounted at this directory.
Unfortunately, I don't know why the volume mount is inaccessible:
rpc error: code = Unknown desc = TabletManager.Backup on zone1-2469782763 error: StartBackup failed: mkdir /vt/backups/example: permission denied: StartBackup failed: mkdir /vt/backups/example: permission denied
One possible reason could be that the volume /mnt/minio-store/vitess-backups doesn't allow all users to create a directory inside it 🤷‍♂️ Maybe only the root user is permitted to create one. Could you try changing the permissions on this directory, or try using a different directory that doesn't have this problem? Even in the e2e tests that Vitess runs to verify that backups work properly, we have to run mkdir -p -m 777 ./vtdataroot/backup to change the permissions on the backup directory we mount, so that all users can create directories inside it.
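Applied to your setup, that would mean something like this on each node backing the hostPath (assuming a world-writable directory is acceptable to you, as it is in our e2e tests):

# Run on the node(s) that provide the hostPath volume.
sudo mkdir -p -m 777 /mnt/minio-store/vitess-backups
sudo chmod 777 /mnt/minio-store/vitess-backups   # in case it already exists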