GCP cloud storage downloaded file corruption
Cai-Chen opened this issue · 4 comments
Hi, we recently hit an intermittent issue where the size of a file downloaded via the storage SDK differs from the size of the object in GCP Cloud Storage. Our initial investigation pointed us to here (code): when an exception is thrown mid-download, the retry does not update the read position, so data ends up duplicated/corrupted.
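To make the failure mode concrete, here is a minimal, self-contained sketch (hypothetical code, not the SDK's internals): the "ranged read" is simulated with an array copy, and one transient failure is injected after bytes have already been consumed.

import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class RetryPositionSketch {
    private static final byte[] REMOTE = "0123456789".getBytes();

    public static void main(String[] args) {
        ByteArrayOutputStream local = new ByteArrayOutputStream();
        long position = 0;          // offset the next "request" starts from
        boolean failedOnce = false;

        while (position < REMOTE.length) {
            try {
                // Simulated ranged read of up to 4 bytes starting at `position`.
                int chunk = Math.min(4, REMOTE.length - (int) position);
                local.write(REMOTE, (int) position, chunk);
                if (!failedOnce && position == 4) {
                    failedOnce = true;
                    // Failure after the bytes were already handed over...
                    throw new IOException("simulated socket timeout");
                }
                position += chunk;  // ...so this advance never happens.
            } catch (IOException retryable) {
                // The retry re-issues the request from the stale `position`,
                // so bytes 4..7 are written to `local` a second time.
            }
        }
        // Prints "local=14 remote=10": the local copy has 4 duplicated bytes.
        System.out.println("local=" + local.size() + " remote=" + REMOTE.length);
    }
}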
To verify, we wrote a simple test:
package test;

import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.ReadChannel;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.StorageOptions;
import org.testng.annotations.Test;
import org.testng.collections.Lists;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;

public class DownloadTest {

    @Test
    public void testDownload() throws Exception {
        GoogleCredentials credentials = GoogleCredentials
                .fromStream(new FileInputStream("/path/secret.json"))
                .createScoped(Lists.newArrayList("https://www.googleapis.com/auth/cloud-platform"));
        var storage = StorageOptions.newBuilder().setCredentials(credentials).build().getService();

        var bucket = "test-bucket";
        var name = "test.file";
        var blobId = BlobId.of(bucket, name);
        final File localFilePath = Paths.get("/local/path/test.file").toFile();

        // Stream the object from the bucket to a local file in 64 KiB chunks.
        try (final ReadChannel inputChannel = storage.reader(blobId)) {
            localFilePath.getParentFile().mkdirs();
            try (FileChannel fileChannel = new FileOutputStream(localFilePath).getChannel()) {
                ByteBuffer bytes = ByteBuffer.allocate(64 * 1024);
                while (inputChannel.read(bytes) > 0) {
                    ((Buffer) bytes).flip(); // cast keeps this compilable against Java 8 targets
                    fileChannel.write(bytes);
                    bytes.clear();
                }
            }
        }
    }
}
To reproduce: set a breakpoint in java.nio.channels.Channels, and when debugging this test hits it, manually throw a java.net.SocketTimeoutException. Then remove the breakpoint and resume the program to let it finish, and finally compare the size of the local file with the object in the bucket.
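For the last step, one way to compare the sizes programmatically (standard google-cloud-storage API; Blob is com.google.cloud.storage.Blob, and the storage client, bucket, and local path are the same placeholders as in the test above):

Blob blob = storage.get(BlobId.of("test-bucket", "test.file"));
long remoteSize = blob.getSize();                             // object size in the bucket
long localSize = new File("/local/path/test.file").length();  // downloaded size on disk
System.out.println("remote=" + remoteSize + " local=" + localSize);
// After the injected SocketTimeoutException, localSize comes out larger
// than remoteSize because the retried range was written twice.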
I know this debugger-based hack is not a perfect way to reproduce the issue, but it is just our first investigation, and the problem is hard to reproduce externally.
Could this be a false alarm?
Thanks.
Thanks for the report, and the repro. I was able to translate your repro into an integration test and add it to our suite.
After that I was able to fix the tracking error in the read logic.
Fix is in #2303
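For anyone curious about the shape of the change, the idea is roughly the following sketch (illustrative only, using a hypothetical wrapper rather than the library's real class; the actual patch is in #2303): advance the tracked position only by bytes that were actually delivered, so a retried request resumes at the correct offset.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical wrapper, not the library's real class: the offset used to
// resume a retried read is advanced only by bytes actually handed back.
final class PositionTrackingReader {
    private final ReadableByteChannel delegate;
    private long position; // offset the next (re)issued request starts from

    PositionTrackingReader(ReadableByteChannel delegate) {
        this.delegate = delegate;
    }

    int read(ByteBuffer dst) throws IOException {
        int read = delegate.read(dst);
        if (read > 0) {
            position += read; // count only bytes actually delivered
        }
        return read;
    }

    long resumeOffset() {
        return position;
    }
}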
Fair warning: we are currently in a code freeze for releases due to Thanksgiving in the US. The next release of the library will be sometime in December.
Cool, thanks for your quick response.