borgbackup/borg

Frequent "Index object count mismatch." errors on borg (1.4.0) check runs


Have you checked borgbackup docs, FAQ, and open GitHub issues?

Yup.

Is this a BUG / ISSUE report or a QUESTION?

QUESTION: is this a bug?

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

borg 1.4.0

Operating system (distribution) and version.

Debian GNU/Linux 12

Hardware / network configuration, and filesystems used.

VPS. x86
BTRFS.
Running check locally.

How much data is handled by borg?

235GB

Full borg commandline that led to the problem (leave away excludes and passwords)

borg check --progress --verbose /path/to/repo.git

Describe the problem you're observing.

I have had a small number of these repo check errors in the last few months, since moving to borg 1.4.0.

 [32B blob data]
 Index object count mismatch.
 committed index: 1311317 objects
 rebuilt index:   1311321 objects
 ID: 9c3bbb9e94fe0c90768e673504fcbc25c00465d64033a6e982f67a1cb41a70f3 rebuilt index: (44, 254157872)  committed index: <not found>
 ID: 9109b1d58f387c573e6b31b0c23eddae342590445336e2e538de4ceb203a14db rebuilt index: (45, 171568058)  committed index: <not found>
 ID: 4a36c0c71d83e8805d64cdd810e6f7ff403372eab16bd104483cd9ff21db8acc rebuilt index: (44, 254163862)  committed index: <not found>
 ID: 74b46afd51fcf4e7fac4d8d88fa59115e513c3eb8f368702bc5b30f37da81f31 rebuilt index: (45, 171580358)  committed index: <not found>
 Finished full repository check, errors found.

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

I can't reproduce. It seems that an archive is written without error, and then sometimes the check complains of these problems. I don't run the check after every create; the check probably runs every 10 days whereas new archives are written nightly.

Edit: results of repair

finished segment check at segment 6447
Starting repository index check
Finished full repository check, no problems found.
Starting archive consistency check...
Analyzing archive daily-2023-12-31T00:04:33 (1/81)
...
Analyzing archive daily-2024-12-06T00:04:44 (81/81)
4 orphaned objects found!
Deleting 4 orphaned and 81 superseded objects...
Finished deleting orphaned/superseded objects.
Writing Manifest.
Committing repo.
Archive consistency check complete, problems found.

It looks like the rebuilt index has 4 more entries than the committed index.

That means that borg check found data in the segment files that was not in the committed (on-disk) index.

But then it also found 4 orphans (chunks that are present, but not referenced by any archive).

So, overall, it is a harmless / cosmetic issue.
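
To make that concrete for readers who don't know the internals: borg check rebuilds a chunk index by scanning the segment files and compares it with the index that was committed to disk. A minimal, hypothetical sketch of that comparison (made-up names and simplified output, not borg's actual code) looks roughly like this:

# Hypothetical sketch: both indexes map chunk ID -> (segment, offset).
def compare_indexes(committed: dict, rebuilt: dict) -> bool:
    ok = True
    if len(committed) != len(rebuilt):
        print("Index object count mismatch.")
        print(f"committed index: {len(committed)} objects")
        print(f"rebuilt index:   {len(rebuilt)} objects")
        ok = False
    # chunks found in the segment files but missing from the on-disk index
    for chunk_id in rebuilt.keys() - committed.keys():
        print(f"ID: {chunk_id} rebuilt index: {rebuilt[chunk_id]}  committed index: <not found>")
        ok = False
    # chunks in the on-disk index but not found in the segment files
    for chunk_id in committed.keys() - rebuilt.keys():
        print(f"ID: {chunk_id} committed index: {committed[chunk_id]}  rebuilt index: <not found>")
        ok = False
    return ok

In the output above, the 4 IDs have a (segment, offset) position in the rebuilt index but show as <not found> in the committed index, i.e. the data is physically present in the segments but the on-disk index did not know about it.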

Thanks for your reply, and for Borg, which is usually fantastic.

Hmmm. I'm glad it sounds like the archives are ok, but it's more than cosmetic - I have scripts that rely on a successful outcome, so these crash. Also, it takes about 5 hours to do the repair step, which requires a backup beforehand because of all the (cosmetic?) warnings that using "repair" could destroy everything!

I confess I don't really understand the inner workings of Borg, so I have no idea about segments, chunks, commits, or orphans. It seems like a bug to me, but perhaps it's only me experiencing this problem (which could point to a hardware issue, for example).

Orphan chunks can happen, e.g., if an input file has a read error in the middle (so some content chunks were already written to the repo), but the archive item for that file then gets skipped because of it.
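
As a hedged, simplified illustration of that mechanism (FakeRepo and store_chunk are made-up stand-ins, not borg's real create code): content chunks are written to the repository as the file is read, so a read error in the middle can leave already-stored chunks behind with no archive item referencing them.

import hashlib

class FakeRepo:
    """Stand-in for a repository: stores chunks keyed by their ID."""
    def __init__(self):
        self.chunks = {}

    def store_chunk(self, data):
        chunk_id = hashlib.sha256(data).hexdigest()
        self.chunks[chunk_id] = data
        return chunk_id

def backup_file(path, repo, archive_items):
    stored = []
    try:
        with open(path, "rb") as f:
            while True:
                block = f.read(4 * 1024 * 1024)  # pretend fixed-size chunking
                if not block:
                    break
                stored.append(repo.store_chunk(block))  # chunk is now in the repo...
    except OSError as exc:
        print(f"read error, skipping file item for {path}: {exc}")
        return  # ...but no archive item references it -> orphan chunk
    archive_items.append({"path": path, "chunks": stored})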

Not sure that is the root cause here, though - in that case, I think the initial content chunks should still be in the repo index.

Another possibility is that the repo index gets corrupted in memory, so the chunk IDs have bit flips and no longer match the on-disk chunk IDs. But I guess that would look a bit different in the borg check output.

I wonder whether it could be files changing as they're read by borg?

@artfulrobot in borg 1.x a changing file results in a warning, but it does not abort reading the file or skip creation of the file item, so this wouldn't create orphans.

Note:

  • repository.put writes a new segment entry AND also creates a repo index entry, so I guess it can't happen there.
  • maybe it happens in compact_segments by dropping a DEL tag that should be kept. the code there is rather complicated; accidentally/wrongly dropping a DEL might make a previous PUT "exist again" (without it being referenced and without it being in the repo index) - see the toy sketch below.
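
A toy model of that second bullet (purely illustrative, not the real compact_segments code): an index rebuilt by replaying the segment log only "forgets" a chunk while its DEL tag is still in the log, so dropping the DEL too early makes an older PUT visible again without anything referencing it.

def rebuild_index(log):
    # Replay PUT/DEL entries in order, like scanning the segment files.
    index = {}
    for tag, chunk_id in log:
        if tag == "PUT":
            index[chunk_id] = True
        elif tag == "DEL":
            index.pop(chunk_id, None)
    return index

log = [("PUT", "abc123"), ("DEL", "abc123")]
print(rebuild_index(log))        # {} -> the chunk is correctly gone

# buggy compaction that drops the DEL while the older PUT still exists:
compacted = [entry for entry in log if entry[0] != "DEL"]
print(rebuild_index(compacted))  # {'abc123': True} -> the PUT "exists again"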

@artfulrobot do you see this regularly?

if so, could you run borg compact --debug instead of just borg compact and keep the log output of that, so it can be compared to the borg check output?

@ThomasWaldmann I don't see it daily/weekly, but I've had it maybe 4 times in the last 4 months?

I've added --debug to my scripts, if it happens again I'll report back. Thanks for your time + work.