Storj S3 restore Error: "cannot find max wal index for restore: missing initial wal segment:"
SubhashBose opened this issue · 13 comments
I am using a Storj S3 bucket for backup. At first everything works as intended: backups are created in the bucket, I can see the files, and I can restore the database with the litestream restore command. But after the litestream replicate task has been running for a while (a few minutes to a few hours) and there have been some DB writes, the backup in S3 somehow gets corrupted, and I get this error when I run litestream restore:
$ ./litestream restore -config ./litestream.conf -o ./xxx/db_restore.sqlite3 -replica storjEU data/db.sqlite3
time=2023-12-11T13:19:05.215Z level=ERROR msg="failed to run" error="cannot find max wal index for restore: missing initial wal segment: generation=93602482baf2a8be index=00000002 offset=4152"
ABOVE FAILS, BUT THIS WORKS:
$ ./litestream restore -config ./litestream.conf -o ./xxx/db_restore.sqlite3 -replica local data/db.sqlite3
Surprisingly, this issue does not happen with my backup/replication copy on the local file system. The local and Storj S3 replications both run at the same time, but only the S3 copy gets corrupted.
Content of my config file:
$ cat litestream.conf
dbs:
  - path: data/db.sqlite3
    replicas:
      - name: local
        path: backup/vwdb.sqlite3
      - name: storjEU
        type: s3
        bucket: sqlitebackup
        path: vwdb.sqlite3
        endpoint: https://gateway.eu1.storjshare.io
        access-key-id: REMOVED
        secret-access-key: REMOVED
        retention: 240h
        snapshot-interval: 24h
These are the generations and snapshots for both my local and remote backups/replications (sorted by time):
GENERATIONS:
name generation lag start end
storjEU 3d90e9d4b97facac 48h45m20s 2023-12-09T11:54:54Z 2023-12-09T11:58:10Z
storjEU 267f62b7ac6583ea 48h6m33s 2023-12-09T12:00:58Z 2023-12-09T12:36:57Z
storjEU d92a36b6531e29d3 47h57m37s 2023-12-09T12:45:52Z 2023-12-09T12:45:53Z
storjEU f3eab2c8ca18e4bd 44h0m26s 2023-12-09T12:57:47Z 2023-12-09T16:43:04Z
storjEU ccc99ebd8b36d938 42h54m29s 2023-12-09T17:17:44Z 2023-12-09T17:49:01Z
storjEU 4d5b7d395596a8a9 41h39m45s 2023-12-09T19:03:44Z 2023-12-09T19:03:45Z
storjEU 3cf628fb9fa425df 41h36m22s 2023-12-09T19:07:07Z 2023-12-09T19:07:08Z
storjEU ae8128da6e295fbc 41h29m28s 2023-12-09T19:13:00Z 2023-12-09T19:14:02Z
storjEU aa222596a50388d4 41h27m3s 2023-12-09T19:16:26Z 2023-12-09T19:16:27Z
storjEU d8483e8bf6421d51 41h20m13s 2023-12-09T19:23:16Z 2023-12-09T19:23:17Z
storjEU 88a9880d95d1baab 40h52m28s 2023-12-09T19:24:20Z 2023-12-09T19:51:02Z
storjEU 0a0aeb3cc3583099 40h36m25s 2023-12-09T20:01:50Z 2023-12-09T20:07:05Z
storjEU b1d30aae7eed5dd6 40h34m54s 2023-12-09T20:07:40Z 2023-12-09T20:08:36Z
storjEU b1d075785725a7d1 40h28m44s 2023-12-09T20:14:45Z 2023-12-09T20:14:46Z
storjEU fdaa76f7c9ec4c47 40h27m57s 2023-12-09T20:15:32Z 2023-12-09T20:15:33Z
storjEU 3d3e2680f79449df 40h27m22s 2023-12-09T20:16:07Z 2023-12-09T20:16:08Z
storjEU cd9029a58d0c1c80 39h58m46s 2023-12-09T20:44:44Z 2023-12-09T20:44:44Z
storjEU 71603124797fd846 39h54m17s 2023-12-09T20:49:12Z 2023-12-09T20:49:13Z
storjEU 4e1a8133df6f8fed 39h51m41s 2023-12-09T20:51:48Z 2023-12-09T20:51:49Z
storjEU db8605682d0af6cf 39h49m10s 2023-12-09T20:54:19Z 2023-12-09T20:54:20Z
storjEU d517b2f8b03fdc39 39h46m40s 2023-12-09T20:56:49Z 2023-12-09T20:56:50Z
local 921b3b26c504a437 7h36m45s 2023-12-10T20:43:42Z 2023-12-11T05:06:46Z
storjEU 921b3b26c504a437 7h36m44s 2023-12-09T20:59:16Z 2023-12-11T05:06:46Z
local 715859894a91836b 6h16m8s 2023-12-11T06:27:23Z 2023-12-11T06:27:23Z
storjEU 715859894a91836b 6h16m6s 2023-12-11T06:27:23Z 2023-12-11T06:27:24Z
storjEU 0a006a8056a4bf38 6h9m18s 2023-12-11T06:34:10Z 2023-12-11T06:34:12Z
local 88fa5abe3b4446dc 6h8m22s 2023-12-11T06:35:09Z 2023-12-11T06:35:09Z
storjEU 88fa5abe3b4446dc 6h8m20s 2023-12-11T06:35:09Z 2023-12-11T06:35:10Z
local d60971259164faaf 4h12m20s 2023-12-11T06:36:00Z 2023-12-11T08:31:11Z
storjEU d60971259164faaf 4h12m19s 2023-12-11T06:36:00Z 2023-12-11T08:31:11Z
local 93602482baf2a8be 0s 2023-12-11T08:40:47Z 2023-12-11T12:43:31Z
storjEU 93602482baf2a8be -1s 2023-12-11T08:40:47Z 2023-12-11T12:43:32Z
SNAPSHOTS:
replica generation index size created
storjEU 3d90e9d4b97facac 0 274106 2023-12-09T11:54:54Z
storjEU 267f62b7ac6583ea 0 274143 2023-12-09T12:00:58Z
storjEU d92a36b6531e29d3 0 274169 2023-12-09T12:45:52Z
storjEU f3eab2c8ca18e4bd 0 274169 2023-12-09T12:57:47Z
storjEU ccc99ebd8b36d938 0 274195 2023-12-09T17:17:44Z
storjEU 4d5b7d395596a8a9 0 274193 2023-12-09T19:03:44Z
storjEU 3cf628fb9fa425df 0 274193 2023-12-09T19:07:07Z
storjEU ae8128da6e295fbc 0 274193 2023-12-09T19:13:00Z
storjEU aa222596a50388d4 0 274195 2023-12-09T19:16:26Z
storjEU d8483e8bf6421d51 0 274195 2023-12-09T19:23:16Z
storjEU 88a9880d95d1baab 0 274195 2023-12-09T19:24:20Z
storjEU 0a0aeb3cc3583099 0 274837 2023-12-09T20:01:50Z
storjEU b1d30aae7eed5dd6 0 275398 2023-12-09T20:07:40Z
storjEU b1d075785725a7d1 0 275394 2023-12-09T20:14:45Z
storjEU fdaa76f7c9ec4c47 0 275394 2023-12-09T20:15:32Z
storjEU 3d3e2680f79449df 0 275394 2023-12-09T20:16:07Z
storjEU cd9029a58d0c1c80 0 275394 2023-12-09T20:44:44Z
storjEU 71603124797fd846 0 275394 2023-12-09T20:49:12Z
storjEU 4e1a8133df6f8fed 0 275394 2023-12-09T20:51:48Z
storjEU db8605682d0af6cf 0 275394 2023-12-09T20:54:19Z
storjEU d517b2f8b03fdc39 0 275394 2023-12-09T20:56:49Z
storjEU 921b3b26c504a437 0 275394 2023-12-09T20:59:16Z
storjEU 921b3b26c504a437 45 277693 2023-12-10T20:59:15Z
local 921b3b26c504a437 45 277693 2023-12-10T20:59:17Z
local 715859894a91836b 0 278165 2023-12-11T06:27:23Z
storjEU 715859894a91836b 0 278165 2023-12-11T06:27:23Z
storjEU 0a006a8056a4bf38 0 278165 2023-12-11T06:34:10Z
local 88fa5abe3b4446dc 0 278165 2023-12-11T06:35:09Z
storjEU 88fa5abe3b4446dc 0 278165 2023-12-11T06:35:09Z
local d60971259164faaf 0 278165 2023-12-11T06:36:00Z
storjEU d60971259164faaf 0 278165 2023-12-11T06:36:00Z
local 93602482baf2a8be 0 278157 2023-12-11T08:40:47Z
storjEU 93602482baf2a8be 0 278157 2023-12-11T08:40:47Z
I am using the latest version of Litestream from the GitHub releases.
Tried on both amd64 and arm64 platforms.
I have made 100% sure that no two separate instances/servers are replicating to the same target S3 path. I am aware that this can produce this particular error (https://litestream.io/tips/#multiple-applications-replicating-into-location-can-corrupt)
Can you please suggest why this is happening? I have searched through old GitHub issues and the entire documentation, but couldn't find any clue.
Might be #522. To debug this, a full Litestream log is necessary, along with a full recursive file listing of the bucket/backend state.
I have several PRs open that fix various issues related to this but to diagnose the exact one I'd need to see what the latest Litestream logged. Even the current recursive file listing would be useful.
Thanks!
(I'm not the maintainer but I dealt with this stuff recently)
Thanks for your reply.
I don't have the full log (yet). I will restart LS, start the trace, and wait for the issue to reappear.
However, this is the current (corrupted) state of the file list in the S3 remote:
119 vwdb.sqlite3/generations/3cf628fb9fa425df/wal/00000000_00000000.wal.lz4
274193 vwdb.sqlite3/generations/3cf628fb9fa425df/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/0a006a8056a4bf38/wal/00000000_00000000.wal.lz4
278165 vwdb.sqlite3/generations/0a006a8056a4bf38/snapshots/00000000.snapshot.lz4
275394 vwdb.sqlite3/generations/3d3e2680f79449df/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/3d3e2680f79449df/wal/00000000_00000000.wal.lz4
275394 vwdb.sqlite3/generations/921b3b26c504a437/snapshots/00000000.snapshot.lz4
277693 vwdb.sqlite3/generations/921b3b26c504a437/snapshots/0000002d.snapshot.lz4
21337 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000007_00001038.wal.lz4
3636 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001f_0000b128.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000c_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000024_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000a_00000000.wal.lz4
3526 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000029_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000d_00000000.wal.lz4
3527 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002f_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000028_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002e_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000035_00000000.wal.lz4
3517 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001b_000060b0.wal.lz4
3106 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000012_00001038.wal.lz4
3104 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000006_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000013_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000002_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000036_00000000.wal.lz4
21334 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000005_00003068.wal.lz4
289 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001a_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000029_00000000.wal.lz4
3107 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000010_00001038.wal.lz4
3099 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000009_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001a_00000000.wal.lz4
3521 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000026_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000019_00000000.wal.lz4
3521 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000025_00001038.wal.lz4
3104 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000d_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000015_00000000.wal.lz4
3736 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001c_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000027_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000025_00000000.wal.lz4
3104 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000013_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002d_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002c_00000000.wal.lz4
3108 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000018_00001038.wal.lz4
3198 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000011_00001038.wal.lz4
3521 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000032_00001038.wal.lz4
21336 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000007_0000c140.wal.lz4
3200 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000b_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000b_00000000.wal.lz4
3099 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000c_00001038.wal.lz4
3511 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001b_000080e0.wal.lz4
3863 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000001_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000018_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002f_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000010_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000022_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000012_00000000.wal.lz4
3524 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000028_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000026_00000000.wal.lz4
3282 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000f_00001038.wal.lz4
3522 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000027_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001d_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000016_00000000.wal.lz4
3529 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002c_00001038.wal.lz4
15150 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000004_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000033_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002b_00000000.wal.lz4
3432 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000001_00003068.wal.lz4
3530 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002d_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000f_00000000.wal.lz4
3529 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002b_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000014_00000000.wal.lz4
3523 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000035_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000030_00000000.wal.lz4
3111 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000015_00001038.wal.lz4
528 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001b_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000007_00000000.wal.lz4
3109 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000014_00001038.wal.lz4
3103 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000007_0000a110.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000004_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001e_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002a_00000000.wal.lz4
7133 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000023_00003068.wal.lz4
3525 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000031_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000005_00000000.wal.lz4
3519 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000020_00001038.wal.lz4
3536 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000021_00001038.wal.lz4
3519 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002a_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000031_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000001_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000003_00000000.wal.lz4
3099 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000008_00001038.wal.lz4
3879 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001e_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000006_00000000.wal.lz4
3106 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000e_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001b_00000000.wal.lz4
3525 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000033_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000011_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000008_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000034_00000000.wal.lz4
26211 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000002_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000021_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000017_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000009_00000000.wal.lz4
3099 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000a_00001038.wal.lz4
3420 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000023_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000032_00000000.wal.lz4
3858 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000003_00001038.wal.lz4
3523 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000034_00001038.wal.lz4
3536 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000024_00001038.wal.lz4
21334 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000016_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001c_00000000.wal.lz4
3829 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001f_00003068.wal.lz4
3110 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000019_00001038.wal.lz4
3111 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000017_00001038.wal.lz4
3528 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000002e_00001038.wal.lz4
8834 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000030_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000020_00000000.wal.lz4
3891 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001f_00001038.wal.lz4
7367 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001d_00001038.wal.lz4
21336 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000026_00003068.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001f_00000000.wal.lz4
3103 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000005_00001038.wal.lz4
3526 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000022_00001038.wal.lz4
3513 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000001d_000090f8.wal.lz4
3735 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/0000000e_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000023_00000000.wal.lz4
119 vwdb.sqlite3/generations/921b3b26c504a437/wal/00000000_00000000.wal.lz4
274195 vwdb.sqlite3/generations/ccc99ebd8b36d938/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/ccc99ebd8b36d938/wal/00000001_00000000.wal.lz4
3104 vwdb.sqlite3/generations/ccc99ebd8b36d938/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/ccc99ebd8b36d938/wal/00000000_00000000.wal.lz4
3115 vwdb.sqlite3/generations/3d90e9d4b97facac/wal/00000001_00001038.wal.lz4
119 vwdb.sqlite3/generations/3d90e9d4b97facac/wal/00000001_00000000.wal.lz4
119 vwdb.sqlite3/generations/3d90e9d4b97facac/wal/00000000_00000000.wal.lz4
4548 vwdb.sqlite3/generations/3d90e9d4b97facac/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/3d90e9d4b97facac/wal/00000002_00000000.wal.lz4
274106 vwdb.sqlite3/generations/3d90e9d4b97facac/snapshots/00000000.snapshot.lz4
278165 vwdb.sqlite3/generations/88fa5abe3b4446dc/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/88fa5abe3b4446dc/wal/00000000_00000000.wal.lz4
119 vwdb.sqlite3/generations/715859894a91836b/wal/00000000_00000000.wal.lz4
278165 vwdb.sqlite3/generations/715859894a91836b/snapshots/00000000.snapshot.lz4
274195 vwdb.sqlite3/generations/d8483e8bf6421d51/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/d8483e8bf6421d51/wal/00000000_00000000.wal.lz4
275394 vwdb.sqlite3/generations/4e1a8133df6f8fed/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/4e1a8133df6f8fed/wal/00000000_00000000.wal.lz4
274169 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/snapshots/00000000.snapshot.lz4
3858 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000002_00001038.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000002_00000000.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000005_00000000.wal.lz4
3100 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000008_00001038.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000008_00000000.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000000_00000000.wal.lz4
3100 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000007_00001038.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000009_00000000.wal.lz4
3100 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000004_00001038.wal.lz4
3734 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000004_00000000.wal.lz4
3100 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000006_00001038.wal.lz4
3431 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000003_00001038.wal.lz4
3099 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000005_00001038.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000001_00000000.wal.lz4
6950 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000001_00001038.wal.lz4
3102 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000001_000090f8.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000007_00000000.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000006_00000000.wal.lz4
119 vwdb.sqlite3/generations/f3eab2c8ca18e4bd/wal/00000003_00000000.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000005_00000000.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000003_00000000.wal.lz4
3522 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000004_00001038.wal.lz4
3521 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000002_00001038.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000000_00000000.wal.lz4
3528 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000001_00001038.wal.lz4
3521 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000003_00001038.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000002_00000000.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000004_00000000.wal.lz4
3529 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000001_00000000.wal.lz4
278157 vwdb.sqlite3/generations/93602482baf2a8be/snapshots/00000000.snapshot.lz4
275394 vwdb.sqlite3/generations/db8605682d0af6cf/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/db8605682d0af6cf/wal/00000000_00000000.wal.lz4
3104 vwdb.sqlite3/generations/b1d30aae7eed5dd6/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/b1d30aae7eed5dd6/wal/00000001_00000000.wal.lz4
119 vwdb.sqlite3/generations/b1d30aae7eed5dd6/wal/00000000_00000000.wal.lz4
275398 vwdb.sqlite3/generations/b1d30aae7eed5dd6/snapshots/00000000.snapshot.lz4
3103 vwdb.sqlite3/generations/0a0aeb3cc3583099/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/0a0aeb3cc3583099/wal/00000002_00000000.wal.lz4
119 vwdb.sqlite3/generations/0a0aeb3cc3583099/wal/00000000_00000000.wal.lz4
26234 vwdb.sqlite3/generations/0a0aeb3cc3583099/wal/00000001_00001038.wal.lz4
119 vwdb.sqlite3/generations/0a0aeb3cc3583099/wal/00000001_00000000.wal.lz4
274837 vwdb.sqlite3/generations/0a0aeb3cc3583099/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/d517b2f8b03fdc39/wal/00000000_00000000.wal.lz4
275394 vwdb.sqlite3/generations/d517b2f8b03fdc39/snapshots/00000000.snapshot.lz4
275394 vwdb.sqlite3/generations/71603124797fd846/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/71603124797fd846/wal/00000000_00000000.wal.lz4
278165 vwdb.sqlite3/generations/d60971259164faaf/snapshots/00000000.snapshot.lz4
3529 vwdb.sqlite3/generations/d60971259164faaf/wal/00000004_00001038.wal.lz4
3541 vwdb.sqlite3/generations/d60971259164faaf/wal/00000003_00001038.wal.lz4
119 vwdb.sqlite3/generations/d60971259164faaf/wal/00000007_00000000.wal.lz4
3625 vwdb.sqlite3/generations/d60971259164faaf/wal/00000002_00001038.wal.lz4
119 vwdb.sqlite3/generations/d60971259164faaf/wal/00000001_00000000.wal.lz4
3546 vwdb.sqlite3/generations/d60971259164faaf/wal/00000002_00005098.wal.lz4
119 vwdb.sqlite3/generations/d60971259164faaf/wal/00000003_00000000.wal.lz4
3849 vwdb.sqlite3/generations/d60971259164faaf/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/d60971259164faaf/wal/00000005_00000000.wal.lz4
3524 vwdb.sqlite3/generations/d60971259164faaf/wal/00000005_00001038.wal.lz4
3528 vwdb.sqlite3/generations/d60971259164faaf/wal/00000006_00001038.wal.lz4
119 vwdb.sqlite3/generations/d60971259164faaf/wal/00000000_00000000.wal.lz4
3841 vwdb.sqlite3/generations/d60971259164faaf/wal/00000002_000070c8.wal.lz4
3420 vwdb.sqlite3/generations/d60971259164faaf/wal/00000001_00001038.wal.lz4
119 vwdb.sqlite3/generations/d60971259164faaf/wal/00000004_00000000.wal.lz4
119 vwdb.sqlite3/generations/d60971259164faaf/wal/00000002_00000000.wal.lz4
119 vwdb.sqlite3/generations/d60971259164faaf/wal/00000006_00000000.wal.lz4
275394 vwdb.sqlite3/generations/b1d075785725a7d1/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/b1d075785725a7d1/wal/00000000_00000000.wal.lz4
274143 vwdb.sqlite3/generations/267f62b7ac6583ea/snapshots/00000000.snapshot.lz4
4694 vwdb.sqlite3/generations/267f62b7ac6583ea/wal/00000001_00001038.wal.lz4
119 vwdb.sqlite3/generations/267f62b7ac6583ea/wal/00000003_00000000.wal.lz4
6949 vwdb.sqlite3/generations/267f62b7ac6583ea/wal/00000001_000060b0.wal.lz4
119 vwdb.sqlite3/generations/267f62b7ac6583ea/wal/00000001_00000000.wal.lz4
3856 vwdb.sqlite3/generations/267f62b7ac6583ea/wal/00000002_00001038.wal.lz4
119 vwdb.sqlite3/generations/267f62b7ac6583ea/wal/00000000_00000000.wal.lz4
3096 vwdb.sqlite3/generations/267f62b7ac6583ea/wal/00000001_0000e170.wal.lz4
119 vwdb.sqlite3/generations/267f62b7ac6583ea/wal/00000002_00000000.wal.lz4
3734 vwdb.sqlite3/generations/267f62b7ac6583ea/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/fdaa76f7c9ec4c47/wal/00000000_00000000.wal.lz4
275394 vwdb.sqlite3/generations/fdaa76f7c9ec4c47/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000000_00000000.wal.lz4
3734 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000006_000070c8.wal.lz4
119 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000001_00000000.wal.lz4
3104 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000006_00005098.wal.lz4
6949 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000001_000070c8.wal.lz4
3201 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000001_00003068.wal.lz4
3105 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000002_00000000.wal.lz4
3857 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000002_00001038.wal.lz4
119 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000007_00000000.wal.lz4
3717 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000003_00001038.wal.lz4
3104 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000006_0000b128.wal.lz4
3734 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000006_00001038.wal.lz4
3719 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000004_00001038.wal.lz4
119 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000004_00000000.wal.lz4
119 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000003_00000000.wal.lz4
3734 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000001_00001038.wal.lz4
3099 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000001_0000f188.wal.lz4
3733 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000005_00001038.wal.lz4
119 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000005_00000000.wal.lz4
119 vwdb.sqlite3/generations/88a9880d95d1baab/wal/00000006_00000000.wal.lz4
274195 vwdb.sqlite3/generations/88a9880d95d1baab/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/4d5b7d395596a8a9/wal/00000000_00000000.wal.lz4
274193 vwdb.sqlite3/generations/4d5b7d395596a8a9/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/aa222596a50388d4/wal/00000000_00000000.wal.lz4
274195 vwdb.sqlite3/generations/aa222596a50388d4/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/ae8128da6e295fbc/wal/00000000_00000000.wal.lz4
119 vwdb.sqlite3/generations/ae8128da6e295fbc/wal/00000001_00000000.wal.lz4
3104 vwdb.sqlite3/generations/ae8128da6e295fbc/wal/00000001_00001038.wal.lz4
3104 vwdb.sqlite3/generations/ae8128da6e295fbc/wal/00000000_00001038.wal.lz4
274193 vwdb.sqlite3/generations/ae8128da6e295fbc/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/cd9029a58d0c1c80/wal/00000000_00000000.wal.lz4
275394 vwdb.sqlite3/generations/cd9029a58d0c1c80/snapshots/00000000.snapshot.lz4
274169 vwdb.sqlite3/generations/d92a36b6531e29d3/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/d92a36b6531e29d3/wal/00000000_00000000.wal.lz4
Are you sure you currently get that exact error?
Looking at the listing, the file it is complaining about is there:
278157 vwdb.sqlite3/generations/93602482baf2a8be/snapshots/00000000.snapshot.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000000_00000000.wal.lz4
3529 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000000_00001038.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000001_00000000.wal.lz4
3528 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000001_00001038.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000002_00000000.wal.lz4 <- it thinks this doesn't exist
3521 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000002_00001038.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000003_00000000.wal.lz4
3521 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000003_00001038.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000004_00000000.wal.lz4
3522 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000004_00001038.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000005_00000000.wal.lz4
I reordered the generation files alphabetically.
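The check above can also be done mechanically. Below is a minimal Python sketch that decodes WAL segment keys into (index, offset) pairs and reports any index that has no segment starting at offset 0. The key layout (`<index hex>_<offset hex>.wal.lz4`) is an assumption inferred from the listing in this thread, not taken from Litestream's source, and the function names are made up for illustration. Note that the `offset=4152` in the error message is `0x1038` in hex, which matches the second segment file of index `00000002`.

```python
import re
from collections import defaultdict

# Assumed key layout, inferred from the listing in this thread:
#   <replica path>/generations/<generation>/wal/<index hex>_<offset hex>.wal.lz4
WAL_KEY = re.compile(
    r"generations/(?P<gen>[0-9a-f]{16})/wal/"
    r"(?P<index>[0-9a-f]{8})_(?P<offset>[0-9a-f]{8})\.wal\.lz4$"
)

def parse_listing(text):
    """Map generation -> WAL index -> sorted list of segment offsets."""
    segments = defaultdict(lambda: defaultdict(list))
    for line in text.splitlines():
        match = WAL_KEY.search(line)
        if match is None:
            continue  # snapshot entries, blank lines, and so on
        index = int(match["index"], 16)
        offset = int(match["offset"], 16)
        segments[match["gen"]][index].append(offset)
    for by_index in segments.values():
        for offsets in by_index.values():
            offsets.sort()
    return segments

def indexes_missing_initial_segment(by_index):
    """Return the WAL indexes that have no segment starting at offset 0."""
    return sorted(i for i, offsets in by_index.items() if 0 not in offsets)

# Three lines copied from the listing in this thread.
listing = """\
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000002_00000000.wal.lz4
3521 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000002_00001038.wal.lz4
119 vwdb.sqlite3/generations/93602482baf2a8be/wal/00000003_00000000.wal.lz4
"""

segments = parse_listing(listing)
by_index = segments["93602482baf2a8be"]
print(by_index[2])                                # [0, 4152]
print(indexes_missing_initial_segment(by_index))  # []
```

An empty result means every index in the listing has its offset-0 segment, which is consistent with the observation that the file the error complains about is present in the bucket.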
Yes, I am sure.
After generating this file list, I re-ran the litestream restore command, and it showed the same error as above: '.... generation=93602482baf2a8be ....'
Is that list from aws s3 ls? I think it should be ordered by key, and that might be the issue here 🤔
Can you do a test where you make a full clone of the bucket to a local path and try restoring from that if it's indeed an ordering problem? Because the file is there it shouldn't get confused like that.
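To illustrate why listing order could matter, here is a small Python sketch of a hypothetical consumer that trusts the order in which keys arrive and records the first segment it sees for each WAL index. This is not Litestream's code; `first_segment_per_index` is an invented name, and the key layout is assumed from the listing in this thread. Because the index and offset are zero-padded, fixed-width hex, plain lexicographic sorting of the keys gives the same order as sorting numerically by (index, offset).

```python
def first_segment_per_index(keys):
    """Naive single pass that trusts the listing order (hypothetical consumer).

    Records the first offset seen for each WAL index, as a reader that
    assumes keys arrive in sorted order might do.
    """
    first = {}
    for key in keys:
        name = key.rsplit("/", 1)[-1]  # e.g. 00000002_00001038.wal.lz4
        index_hex, offset_hex = name.split(".", 1)[0].split("_")
        first.setdefault(int(index_hex, 16), int(offset_hex, 16))
    return first

prefix = "vwdb.sqlite3/generations/93602482baf2a8be/wal/"
unordered = [
    prefix + "00000002_00001038.wal.lz4",
    prefix + "00000002_00000000.wal.lz4",
]

print(first_segment_per_index(unordered))          # {2: 4152}
print(first_segment_per_index(sorted(unordered)))  # {2: 0}
```

With the unordered input, the first segment seen for index 2 starts at offset 4152, the same index and offset that appear in the reported error. Whether this is what actually happens against the Storj gateway is only a hypothesis at this point in the thread.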
The list I am getting is using rclone.
By the way, this is not AWS, this is Storj S3 storage.
However, to get a Litestream log, I have stopped and restarted the litestream replicate command, and I am now dumping its stdout to a file. As expected, this has temporarily fixed the S3 remote's copy, and I can do the restore now. I will report back with the log when the problem reappears in a few hours.
Can you do a test where you make a full clone of the bucket to a local path and try restoring from that if it's indeed an ordering problem? Because the file is there it shouldn't get confused like that.
So I cannot do this test right now. I will do it when I get the error again.
Okay, I could capture the same error again after a few writes to the database.
Here you can find the error message, the generation and snapshot states, the full file list from the S3 remote, and the stdout output captured while `litestream replicate` was running:
https://rentry.co/litestream-s3-bug
As you asked, I downloaded all the files from the S3 remote and ran restore locally, and now the restore works without any error. So Litestream is somehow reading the file list improperly.
--- Below is not relevant anymore, see next comment ---
So, below is the output of restore when I ran it locally. But I am not sure if this is the latest backup copy of the database. Can you please confirm?
I don't understand how Litestream identifies the latest snapshot/WAL, but looking at the snapshots or generations output (given in the link above), this ID 921b3b26c504a437 doesn't look like the latest one.
$ ./litestream restore -o db.sqlite3 file://`pwd`/s3/vwdb.sqlite3
2023/12/11 20:01:36 INFO restoring snapshot replica=file generation=921b3b26c504a437 index=45 path=db.sqlite3.tmp
2023/12/11 20:01:36 INFO restoring wal files replica=file generation=921b3b26c504a437 index_min=45 index_max=54
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=47 elapsed=9.495464ms
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=46 elapsed=17.311597ms
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=51 elapsed=20.57462ms
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=54 elapsed=8.048494ms
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=49 elapsed=25.412532ms
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=52 elapsed=25.311891ms
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=53 elapsed=20.356258ms
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=50 elapsed=27.480586ms
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=45 elapsed=31.722295ms
2023/12/11 20:01:36 INFO downloaded wal replica=file generation=921b3b26c504a437 index=48 elapsed=31.734095ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=45 elapsed=9.191623ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=46 elapsed=5.94864ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=47 elapsed=5.143995ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=48 elapsed=42.707569ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=49 elapsed=6.04216ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=50 elapsed=4.231668ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=51 elapsed=4.620232ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=52 elapsed=5.497517ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=53 elapsed=6.198322ms
2023/12/11 20:01:36 INFO applied wal replica=file generation=921b3b26c504a437 index=54 elapsed=5.555397ms
2023/12/11 20:01:36 INFO renaming database from temporary location replica=file
In continuation of my last comment:
I just discovered that litestream relies on the mod-time of files in the backup directory. This is a huge red flag for me regarding the reliability of a backup system, as mod-times can quite easily get messed up when moving files across storage systems. At the least, the importance of mod-time should be emphasized in the litestream documentation, so that users are aware of the limitation.
Anyway, during my previous test on the local clone of the S3 remote, I didn't preserve the mod-time. That is why it was restoring the wrong snapshot+wal. After I preserved the mod-time while cloning, litestream selected the correct generation/snapshot `7b6ac7d25b62c052` as the latest one.
2023/12/11 22:36:31 INFO restoring snapshot replica=file generation=7b6ac7d25b62c052 index=0 path=db.sqlite3.tmp
2023/12/11 22:36:31 INFO restoring wal files replica=file generation=7b6ac7d25b62c052 index_min=0 index_max=7
2023/12/11 22:36:31 INFO downloaded wal replica=file generation=7b6ac7d25b62c052 index=2 elapsed=13.843453ms
2023/12/11 22:36:31 INFO downloaded wal replica=file generation=7b6ac7d25b62c052 index=7 elapsed=13.694973ms
2023/12/11 22:36:31 INFO downloaded wal replica=file generation=7b6ac7d25b62c052 index=4 elapsed=17.303197ms
2023/12/11 22:36:31 INFO downloaded wal replica=file generation=7b6ac7d25b62c052 index=1 elapsed=9.699906ms
2023/12/11 22:36:31 INFO downloaded wal replica=file generation=7b6ac7d25b62c052 index=3 elapsed=24.300565ms
2023/12/11 22:36:31 INFO downloaded wal replica=file generation=7b6ac7d25b62c052 index=0 elapsed=25.452092ms
2023/12/11 22:36:31 INFO downloaded wal replica=file generation=7b6ac7d25b62c052 index=6 elapsed=24.354965ms
2023/12/11 22:36:31 INFO downloaded wal replica=file generation=7b6ac7d25b62c052 index=5 elapsed=25.452052ms
2023/12/11 22:36:31 INFO applied wal replica=file generation=7b6ac7d25b62c052 index=0 elapsed=12.797726ms
2023/12/11 22:36:31 INFO applied wal replica=file generation=7b6ac7d25b62c052 index=1 elapsed=4.053027ms
2023/12/11 22:36:31 INFO applied wal replica=file generation=7b6ac7d25b62c052 index=2 elapsed=7.097608ms
2023/12/11 22:36:31 INFO applied wal replica=file generation=7b6ac7d25b62c052 index=3 elapsed=4.29443ms
2023/12/11 22:36:31 INFO applied wal replica=file generation=7b6ac7d25b62c052 index=4 elapsed=5.650278ms
2023/12/11 22:36:31 INFO applied wal replica=file generation=7b6ac7d25b62c052 index=5 elapsed=7.029328ms
2023/12/11 22:36:31 INFO applied wal replica=file generation=7b6ac7d25b62c052 index=6 elapsed=6.572925ms
2023/12/11 22:36:31 INFO applied wal replica=file generation=7b6ac7d25b62c052 index=7 elapsed=4.174708ms
2023/12/11 22:36:31 INFO renaming database from temporary location replica=file
To summarize, if I make a copy of the S3 remote files to a local path and run litestream restore on it, then it works fine. But when reading from the S3 remote directly, litestream restore fails with the error cannot find max wal index for restore: missing initial wal segment: .... Can you suggest a remedy for this?
Since it works locally from a copy, my best guess is the same as before: Storj doesn't adhere to the S3 LIST promise of returning all objects in order. That's why I asked whether the listing was from aws s3 ls, because that command should show fairly directly what the S3 API (of Storj) returns.
If the order is in fact not correct then you should file a bug report for Storj that they must return it ordered:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/ListingKeysUsingAPIs.html
List results are always returned in UTF-8 binary order.
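One way to test this guess is to take the keys in the order the gateway returned them and check whether they are in ascending byte order. The following is only a minimal sketch: the key names are hypothetical placeholders (not Litestream's actual layout), and `firstOutOfOrder` is a helper made up for illustration. Go compares strings byte by byte, which corresponds to the UTF-8 binary order described in the AWS documentation.

```go
package main

import (
	"fmt"
	"sort"
)

// firstOutOfOrder returns the position of the first key that sorts before
// its predecessor, or -1 if the keys are already in ascending byte order.
func firstOutOfOrder(keys []string) int {
	for i := 1; i < len(keys); i++ {
		if keys[i] < keys[i-1] {
			return i
		}
	}
	return -1
}

func main() {
	// Hypothetical keys, in the order a listing might have returned them.
	keys := []string{
		"example/wal/00000001",
		"example/wal/00000003",
		"example/wal/00000002",
	}

	if i := firstOutOfOrder(keys); i >= 0 {
		fmt.Printf("listing is not ordered: %q comes after %q\n", keys[i], keys[i-1])
	} else {
		fmt.Println("listing is ordered")
	}

	// The standard library offers the same yes/no check.
	fmt.Println("sorted:", sort.StringsAreSorted(keys))
}
```

To use it against a real bucket, replace the sample keys with the keys from the raw listing, keeping them in the order they were returned.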
I am looking at the Storj S3 API docs, and they mention this:
https://docs.storj.io/dcs/api/s3/s3-compatibility#list-objects
A bucket's paths are end-to-end encrypted. We don't use an ordering-preserving encryption scheme yet, meaning that it's impossible to always list a bucket in lexicographical order (as per S3 specification).
So, maybe this is the issue.
Is it possible for litestream to do the sorting on the client end, rather than relying on the server?
If this is possible, then I think many more storage providers that do not adhere to the exact S3 specification will become compatible with litestream.
I was also considering using Rclone, which can serve any remote as S3. But now I am not sure it would be a good idea.
That is indeed the issue because Litestream expects them to be in order.
I think it's reasonable to add a configuration option for the Litestream S3 client to buffer the list and reorder it client side to improve compatibility with non-AWS implementations. I would probably prefer it over #531 but it's up to @benbjohnson in the end. The PR might suffer from the same issue unless the SDK reorders the results before returning them.
So in practice your data is fine on Storj but Litestream is just a bit confused when restoring.
I opened a PR to fix this, since the change was trivial enough that it doesn't really need additional code or configuration options.
There's only one place where the order of WAL segments matters, and that is on restore. There was already a sort implementation in the code, so I changed the promise of replicas to return unordered results, and it should be all good now.
Thank you!