pingcap/tidb-operator

Backup CR (7.5.1) cannot read the correct size of a backup in MinIO

wuyudian1 opened this issue · 5 comments

Bug Report

What version of Kubernetes are you using?

Client Version: v1.29.3
Server Version: v1.24.6-aliyun.

What version of TiDB Operator are you using?

{
   "GitVersion":"v1.5.2",
   "GitCommit":"456a0273f67ac61212da78956f49f0a4a07e21d8",
   "GitTreeState":"clean",
   "BuildDate":"2024-01-19T03:50:22Z",
   "GoVersion":"go1.21.5",
   "Compiler":"gc",
   "Platform":"linux/amd64"
}

What's the status of the TiDB cluster pods?

NAME                                                              READY   STATUS      RESTARTS   AGE     IP             NODE                       NOMINATED NODE   READINESS GATES
backup-basicai-backup-schedule-minio2-2024-04-24t01-19-00-kmwnp   0/1     Completed   0          6h14m   10.101.0.6     cn-shanghai.192.168.3.81   <none>           <none>
basicai-discovery-6f8785d8b6-mw2xw                                1/1     Running     0          23d     10.101.0.52    cn-shanghai.192.168.3.81   <none>           <none>
basicai-monitor-0                                                 4/4     Running     0          83d     10.101.0.98    cn-shanghai.192.168.3.82   <none>           <none>
basicai-ng-monitoring-0                                           1/1     Running     0          36m     10.101.2.29    cn-shanghai.192.168.3.77   <none>           <none>
basicai-pd-0                                                      1/1     Running     0          23d     10.101.0.88    cn-shanghai.192.168.3.82   <none>           <none>
basicai-tidb-0                                                    2/2     Running     0          23d     10.101.0.35    cn-shanghai.192.168.3.81   <none>           <none>
basicai-tikv-0                                                    1/1     Running     0          23d     10.101.0.139   cn-shanghai.192.168.3.83   <none>           <none>
basicai-tikv-1                                                    1/1     Running     0          23d     10.101.2.42    cn-shanghai.192.168.3.77   <none>           <none>
basicai-tikv-2                                                    1/1     Running     0          23d     10.101.0.91    cn-shanghai.192.168.3.82   <none>           <none>

What did you do?
BackupSchedule:

apiVersion: pingcap.com/v1alpha1
kind: BackupSchedule
metadata:
  name: bas...-schedule-minio2
  namespace: tidb-cluster
spec:
  backupTemplate:
    backoffRetryPolicy:
      maxRetryTimes: 2
      minRetryDuration: 300s
      retryTimeout: 30m
    backupMode: snapshot
    backupType: full
    br:
      cluster: ba..i
      clusterNamespace: tidb-cluster
    calcSizeLevel: all
    resources: {}
    s3:
      bucket: basicai-ops-backup
      endpoint: https://minio-endpoint-bxxxx.alidev.bexxxx.com
      prefix: tidb/alidev
      provider: minio
      region: oss-cn-beijing
      secretName: tidb-backup-to-minio
    volumeBackupInitJobMaxActiveSeconds: 600
  maxReservedTime: 84h
  schedule: 19 1 * * *

Check the backup status: the BACKUPSIZE fields are not correct. Every scheduled backup reports 394 B.

kubectl get backup
NAME                                               TYPE   MODE       STATUS     BACKUPPATH                                                                         BACKUPSIZE   COMMITTS             LOGTRUNCATEUNTIL   TIMETAKEN   AGE
backup-s3-10181726                                 full   snapshot   Complete   s3://baXXX-ops-backup/tidb/alidev/backup10181726                                   1.9 GB       445021209884622860                                  188d
baXXX-backup-schedule-minio2-2024-04-21t01-19-00   full   snapshot   Complete   s3://baXXX-ops-backup/tidb/alidev/baXXX-pd.tidb-cluster-2379-2024-04-21t01-19-00   394 B        449226302064427009                      3m40s       3d5h
baXXX-backup-schedule-minio2-2024-04-22t01-19-00   full   snapshot   Complete   s3://baXXX-ops-backup/tidb/alidev/baXXX-pd.tidb-cluster-2379-2024-04-22t01-19-00   394 B        449248951895588865                      1m50s       2d5h
baXXX-backup-schedule-minio2-2024-04-23t01-19-00   full   snapshot   Complete   s3://baXXX-ops-backup/tidb/alidev/baXXX-pd.tidb-cluster-2379-2024-04-23t01-19-00   394 B        449271601936728065                      1m54s       29h
baXXX-backup-schedule-minio2-2024-04-24t01-19-00   full   snapshot   Complete   s3://baXXX-ops-backup/tidb/alidev/baXXX-pd.tidb-cluster-2379-2024-04-24t01-19-00   394 B        449294262358769665                      2m13s       5h5m

What did you expect to see?
The Backup CR should show the correct BACKUPSIZE.

What did you see instead?
The Backup CR does not fetch the correct BACKUPSIZE.

@wuyudian1
Is the size of backup-s3-10181726 correct while the others are not? If so, what's the difference between backup-s3-10181726 and the others?

Have you tried setting provider: aws?
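
For reference, a minimal sketch of what that change might look like, assuming only the provider field in the s3 block of the BackupSchedule above is switched and every other value is copied from the report unchanged (including the redacted endpoint). Whether this affects the reported size on v1.5.2 is not confirmed in this thread.

```yaml
# Sketch only: s3 block from the reported BackupSchedule with provider set to aws.
# All other values are taken verbatim from the report; nothing else is changed.
spec:
  backupTemplate:
    calcSizeLevel: all
    s3:
      provider: aws            # was: minio
      bucket: basicai-ops-backup
      endpoint: https://minio-endpoint-bxxxx.alidev.bexxxx.com
      prefix: tidb/alidev
      region: oss-cn-beijing
      secretName: tidb-backup-to-minio
```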

I tested with provider: aws and provider: minio in TiDB Operator v1.6.0-beta.1, and both of them can get the correct size of the backup.

@wuyudian1 Can you verify that v1.6.0-beta.1 works well?

Since we now back up TiDB to Alibaba Cloud OSS, I no longer have an environment in which to try the new version.