nsqio/go-diskqueue

Possible data loss and/or corruption if disk queue metadata is out of sync

fazalmajid opened this issue · 3 comments

We found a data loss/corruption issue in nsqd 0.3.7 before diskqueue was split in its own module, but AFAIK it is still present in diskqueue.

If a diskqueue's metadata is not in sync with the data file (e,.g nsqd was terminated abruptly before it had the chance to sync metadata), on restart the diskqueue will start reading messages from d.readPos, and writing at d.writePos from the metadata, either or both not the end of file. At some point incoming messages will overwrite messages still being read, the read buffer will have a part of the old message then read the new message, and this race will cause corruption (because the next message size will be reading essentially random data from the new message), cause lost messages and the file being marked as bad.

Our fix was to check the file size against the metadata and if d.writePos is < the file size, we force rotation to a new file, while allowing reading messages to the end of file (possibly causing duplicate messages, but that's better than losing messages and NSQ has no guarantees of once-only delivery anyway):

@@ -449,6 +449,26 @@ func (d *diskQueue) retrieveMetaData() error {
 	d.nextReadFileNum = d.readFileNum
 	d.nextReadPos = d.readPos
 
+	// if the metadata was not sync'd at the last shutdown of nsqd
+	// then the file might actually be larger than the writePos, in
+	// which case the safest thing to do is skip to the next file for
+	// writes, and let the reader salvage what it can
+	fileName = d.fileName(d.writeFileNum)
+	fileInfo, err := os.Stat(fileName)
+	if err != nil {
+		return err
+	}
+	fileSize := fileInfo.Size()
+	if d.writePos < fileSize {
+		d.logf("ERROR: DISKQUEUE(%s) out of sync metadata %d < %d, skipping",
+			d.name, d.writePos, fileSize)
+		d.writeFileNum += 1
+		d.writePos = 0
+		if d.writeFile != nil {
+			d.writeFile.Close()
+			d.writeFile = nil
+		}
+	}
 	return nil
 }

This diff is against the 0.3.7 diskqueue.go, but should be readily transposable to go-diskqueue.

I think this makes sense, great catch. Do you want to open a pull request to submit the change?

Sorry for the slow response, I checked in the PR #31 with the fix

thanks, let's continue the discussion there