jeanbmar/s3-sync-client

Filters option different from AWS CLI

rktyt opened this issue · 3 comments

rktyt commented
// aws s3 sync --dryrun --delete --exclude 'excl/*' /path/to/local/dir s3://my-target-bucket
// Note: The results of the above command and the following code are different
import { S3Client } from "@aws-sdk/client-s3";
import S3SyncClient from "s3-sync-client";

process.env.S3_BUCKET_NAME = "my-target-bucket";
process.env.S3_REGION = "us-east-1";
const sourceDir = "/path/to/local/dir";

const client = new S3Client({ region: process.env.S3_REGION });
const { sync } = new S3SyncClient({ client });

const res = await sync(sourceDir, `s3://${process.env.S3_BUCKET_NAME}`, {
  filters: [
    {
      exclude(key) {
        return key.startsWith("excl/");
      },
    },
  ],
  del: true,
  partSize: 100 * 1024 * 1024,
  dryRun: true,
});

console.log(res);

With the AWS CLI, an object such as excl/path/to/file that exists on the S3 side is not deleted, whereas s3-sync-client deletes it.
This behavior is probably fine as a design choice, but please state it clearly in the documentation.
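
To make the difference concrete, here is a minimal sketch of the two deletion behaviors. It is a simplified model under my own assumptions, not the actual code of either tool: the function names and the sample keys are hypothetical, and the "source only" variant is just one plausible explanation for the result reported above.

```javascript
// Hypothetical model of the two deletion behaviors; not the library's code.
const exclude = (key) => key.startsWith("excl/");

// Made-up sample data: keys found locally and keys already in the bucket.
const localKeys = ["index.html", "excl/local-only.txt"];
const bucketKeys = ["index.html", "stale.txt", "excl/path/to/file"];

// Assumed model of the reported 3.0.0 result: the filter is applied to the
// source only, so every bucket key without a surviving source key is deleted.
function deletionsFilteringSourceOnly(source, target, isExcluded) {
  const kept = new Set(source.filter((key) => !isExcluded(key)));
  return target.filter((key) => !kept.has(key));
}

// AWS CLI behavior: excluded keys are ignored on both sides, so they are
// never candidates for deletion.
function deletionsLikeCli(source, target, isExcluded) {
  const kept = new Set(source.filter((key) => !isExcluded(key)));
  return target.filter((key) => !isExcluded(key) && !kept.has(key));
}

console.log(deletionsFilteringSourceOnly(localKeys, bucketKeys, exclude));
// → [ 'stale.txt', 'excl/path/to/file' ]
console.log(deletionsLikeCli(localKeys, bucketKeys, exclude));
// → [ 'stale.txt' ]
```

In both variants stale.txt is removed; only the handling of the excluded excl/ prefix differs.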


Versions
aws cli: 2.4.3
s3-sync-client: 3.0.0

You are right, and this behavior will cause issues for people doing a 1-1 transition from the AWS CLI.

I will think about a solution that provides options for both the current behavior and the CLI behavior.

Ref for ideas:
aws/aws-cli#4923

@jeanbmar any update on the above? It would be great if this bug could be fixed, as IMO this is necessary functionality. For example, I adjusted the bucket-with-bucket.js code and added this block after the diff is computed:

    deleted.forEach((deleted) => deleted.applyDeleteFilters(filters));
    const excludedDeletedObjects = sourceObjects.filter((sourceObject) => !sourceObject.isIncluded())

I think this covers most use cases: for deletion, everything is included by default, so we only need to apply the exclude filters to the items that are about to be deleted.

If such a solution makes sense to you, I can open a PR and do it properly.
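
The idea described above, applying only the exclude filters to the deletion candidates, could be sketched roughly as follows. This is a standalone illustration, not the patch itself: `protectExcluded` and the sample keys are hypothetical names, and the only thing assumed about the filters is the `{ exclude(key) }` shape used earlier in this thread.

```javascript
// Hypothetical helper: drop every deletion candidate that matches at least
// one exclude filter. Entries without an exclude function are skipped.
function protectExcluded(deletionCandidates, filters) {
  const excludes = filters
    .map((filter) => filter.exclude)
    .filter((fn) => typeof fn === "function");
  return deletionCandidates.filter(
    (key) => !excludes.some((isExcluded) => isExcluded(key))
  );
}

// Made-up sample data.
const filters = [{ exclude: (key) => key.startsWith("excl/") }];
const deletionCandidates = ["stale.txt", "excl/path/to/file"];

console.log(protectExcluded(deletionCandidates, filters));
// → [ 'stale.txt' ]
```

One limitation of this sketch: it ignores any include filter that re-includes a previously excluded key, which is presumably why it covers most use cases rather than all of them.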

Fixed in 4.2.0, now matching CLI behavior by default.
A new deleteExcluded option is also available.

await b.test('does not delete excluded files', async () => {
  await syncClient.send(
    new SyncBucketWithLocalCommand({
      localDir: path.join(DATA_DIR, 'def/jkl'),
      bucketPrefix: BUCKET,
      del: true,
      filters: [{ exclude: (key) => key.startsWith('xmot') }],
    })
  );
  const objects = await syncClient.send(
    new ListBucketObjectsCommand({
      bucket: BUCKET,
    })
  );
  assert(hasObject(objects, 'xmot') === true);
});
await b.test('deletes excluded files', async () => {
  await syncClient.send(
    new SyncBucketWithLocalCommand({
      localDir: path.join(DATA_DIR, 'def/jkl'),
      bucketPrefix: BUCKET,
      del: true,
      deleteExcluded: true,
      filters: [{ exclude: (key) => key.startsWith('xmot') }],
    })
  );
  const objects = await syncClient.send(
    new ListBucketObjectsCommand({
      bucket: BUCKET,
    })
  );
  assert(hasObject(objects, 'xmot') === false);
});