zilliztech/milvus-backup

[Bug]: milvus-backup restore never leaves the in-progress task state

ZYWNB666 opened this issue · 11 comments

Current Behavior

./milvus-backup create -n backup_test
The IP addresses in the backup configuration file are the Milvus and MinIO addresses of the production environment.
I then copied the backup files to the test environment.

Finally, I changed all the connection addresses in the configuration file to the IP addresses of the test environment:
./milvus-backup restore --restore_index -n backup_test -s _recover

The restore operation runs and the data appears to be restored, but the tool remains stuck in this state.
Although the data has been restored, this state makes me unsure whether I did everything correctly.
My configuration file is as follows

# Configures the system log output.
log:
  level: info # Only supports debug, info, warn, error, panic, or fatal. Default 'info'.
  console: true # whether print log to console
  file:
    rootPath: "backup.log"

http:
  simpleResponse: true

# milvus proxy address, compatible to milvus.yaml
milvus:
  #address: 10.100.24.89
  address: 10.99.122.187 #staging
  port: 19530
  authorizationEnabled: false
  # tls mode values [0, 1, 2]
  # 0 is close, 1 is one-way authentication, 2 is two-way authentication.
  tlsMode: 0
  user: "root"
  password: "Milvus"

# Related configuration of minio, which is responsible for data persistence for Milvus.
minio:
  # cloudProvider: "minio" # deprecated use storageType instead
  storageType: "minio" # support storage type: local, minio, s3, aws, gcp, ali(aliyun), azure, tc(tencent)
  
  #address: 10.98.33.114 # Address of MinIO/S3
  address: 10.109.241.231 # Address of MinIO/S3 staging
  port: 9000   # Port of MinIO/S3
  accessKeyID: minioadmin  # accessKeyID of MinIO/S3
  secretAccessKey: minioadmin # MinIO/S3 encryption string
  useSSL: false # Access to MinIO/S3 with SSL
  useIAM: false
  iamEndpoint: ""
  
  #bucketName: "milvus-bucket" # Milvus Bucket name in MinIO/S3, make it the same as your milvus instance
  bucketName: "milvus-bucket" # Milvus Bucket name in MinIO/S3, make it the same as your milvus instance
  rootPath: "file" # Milvus storage root path in MinIO/S3, make it the same as your milvus instance

  # only for azure
  backupAccessKeyID: minioadmin  # accessKeyID of MinIO/S3
  backupSecretAccessKey: minioadmin # MinIO/S3 encryption string
  
  backupBucketName: "milvus-bucket-backup" # Bucket name to store backup data. Backup data will store to backupBucketName/backupRootPath
  backupRootPath: "backup" # Rootpath to store backup data. Backup data will store to backupBucketName/backupRootPath

backup:
  maxSegmentGroupSize: 20G

  parallelism: 
    # collection level parallelism to backup
    backupCollection: 4
    # thread pool to copy data. reduce it if blocks your storage's network bandwidth
    copydata: 128
    # Collection level parallelism to restore
    restoreCollection: 2
  
  # keep temporary files during restore, only use to debug 
  keepTempFiles: false
  
  # Pause GC during backup through Milvus Http API. 
  gcPause:
    enable: false
    seconds: 7200
    address: http://localhost:9091

Expected Behavior

It should have completed successfully, but it never returned.

Steps To Reproduce

./milvus-backup create -n backup_test
./milvus-backup restore --restore_index -n backup_test -s _recover
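One way to check whether a restore task is still progressing rather than hung is to run the tool as a server (./milvus-backup server, default port 8080) and query its REST API. A minimal sketch; the /api/v1/get_restore endpoint and its id query parameter are taken from the project README and may differ by version:

```python
import json
import urllib.error
import urllib.request


def get_restore_state(restore_id, base="http://localhost:8080"):
    """Query a running `milvus-backup server` for a restore task's state.

    The endpoint path and `id` parameter are assumptions from the project
    README; check the API of your tool version.
    """
    url = f"{base}/api/v1/get_restore?id={restore_id}"
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return json.load(resp)  # response carries state/progress fields
    except (urllib.error.URLError, OSError):
        return None  # server not reachable


if __name__ == "__main__":
    state = get_restore_state("some-restore-id")
    print(state if state else "milvus-backup server not reachable")
```

If the response keeps reporting progress at 70% with an executing state, the task is still waiting on index building rather than failed.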

Environment

Ubuntu 20.04

Anything else?

No response

When progress reaches 70%, the task is in the index-building stage.

What is done at this stage? I don't understand: my data has already been transferred, so what else remains to be done?

This stage has been going on for five hours, and I have to suspect a bug.

Because your command is ./milvus-backup restore --restore_index -n backup_test -s _recover, the task is only considered successful once index building has finished.
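In other words, with --restore_index the CLI keeps the task open, polling until every index reports a built state. A simplified sketch of that idea; the status names and the check function are illustrative, not the tool's actual code:

```python
import itertools
import time


def wait_for_index_build(check_status, poll_interval=0.01, max_polls=1000):
    """Block until check_status() reports the index is built, mimicking how
    `restore --restore_index` only succeeds after index building ends."""
    for _ in range(max_polls):
        if check_status() == "IndexStateFinished":  # illustrative status name
            return True
        time.sleep(poll_interval)  # data already restored; still waiting on index
    return False  # looks "stuck at 70%" if building never finishes


# Simulated backend: the index finishes on the third poll.
statuses = itertools.chain(["InProgress", "InProgress"],
                           itertools.repeat("IndexStateFinished"))
result = wait_for_index_build(lambda: next(statuses))
print(result)  # True
```

This is why a large or complex index can make the command sit at 70% for hours without the restore itself having failed.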

Can you provide some Milvus logs? We need to check why building the index takes so much time.

This is backup's log
backup (2).log

Milvus log is also needed

Probably because there is too much data and the index is too complicated

However, I ran into another problem: the recovered data is twice the size of the data before the backup.

Is it because the collection was already loaded into memory when I backed it up, so there are two copies?

We also encountered the same issue with Milvus 2.4.4 and the latest backup tool (0.4.15).
The restore never completes: we can see the data and the indexes are created, but the restore command stays stuck looping at 70%, so we stopped it, tried to use the collection, and hit many errors on it.
On the other hand, when we restore the same collection from the same backup without --restore_index, the utility finishes successfully; we created the indexes manually and everything worked!
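For reference, that manual-index workaround can be sketched with pymilvus. The connection details, collection name, vector field name, and HNSW parameters below are placeholders rather than values from this thread, and the client calls are guarded so the snippet degrades gracefully when pymilvus or a Milvus instance is unavailable:

```python
# Illustrative index parameters; adjust to match the original collection's index.
index_params = {
    "index_type": "HNSW",
    "metric_type": "L2",
    "params": {"M": 16, "efConstruction": 200},
}

try:
    from pymilvus import Collection, connections

    # Placeholder connection details; point these at the restored instance.
    connections.connect(host="localhost", port="19530")
    # Placeholder collection/field names; `-s _recover` appends that suffix
    # to each restored collection's name.
    Collection("my_collection_recover").create_index(
        field_name="vector", index_params=index_params
    )
except Exception:
    pass  # pymilvus missing or Milvus unreachable in this environment
```

After create_index returns, loading the collection and running a search confirms the index is actually usable.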

Same issue here.
Milvus 2.4.4
cli 0.4.15 (macOS)

@gmoshiko-work @MatanAmoyal1 Hi, I can't reproduce it. Could you please send me the logs of Milvus and the backup tool so that I can look into this issue?

close as stale