awslabs/cdk-serverless-clamscan

Allow bucket deletion on CloudFormation stack rollback

ellisium opened this issue · 7 comments

Hello,

I'm trying this scanner solution and it works well, except that when my stack encounters an issue (I hit the VPC instance limit), CloudFormation tries to roll back the stack. The rollback then gets blocked, and I can see error messages when it tries to delete rClamScanVirusDefsBucket, and sometimes the bucket policy as well.
In a CloudFormation context, all resources should be deletable.

Side note: the scanner seems to take 20 to 30 seconds per file. Can you confirm whether that is normal performance? I set scanFunctionMemorySize to 10GB and reservedConcurrency to 3.

Thx

Hello!

  1. An S3 bucket can fail to delete when it still contains objects, as is the case for the definitions bucket. You will have to clear out the objects manually and delete the stack again.
  2. The bucket policy failing to delete is a known behavior caused by how the policy is constructed. Deleting the stack again should resolve this issue.
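
For the manual cleanup in point 1, here is a minimal sketch of emptying a bucket before retrying the stack deletion. It assumes an S3 client that exposes the boto3 methods `get_paginator("list_object_versions")` and `delete_objects`. The function name `empty_bucket` and the bucket name in the usage note are hypothetical and are not part of this construct.

```python
def empty_bucket(s3_client, bucket_name, batch_size=1000):
    """Delete every object version and delete marker in a bucket.

    `s3_client` is assumed to behave like boto3.client("s3").
    Returns the number of entries submitted for deletion.
    """
    paginator = s3_client.get_paginator("list_object_versions")
    deleted = 0
    for page in paginator.paginate(Bucket=bucket_name):
        # Unversioned buckets still list objects here, with VersionId "null".
        entries = page.get("Versions", []) + page.get("DeleteMarkers", [])
        targets = [{"Key": e["Key"], "VersionId": e["VersionId"]} for e in entries]
        # delete_objects accepts at most 1000 keys per request.
        for start in range(0, len(targets), batch_size):
            batch = targets[start:start + batch_size]
            s3_client.delete_objects(
                Bucket=bucket_name,
                Delete={"Objects": batch, "Quiet": True},
            )
            deleted += len(batch)
    return deleted
```

With boto3 installed, this could be called as `empty_bucket(boto3.client("s3"), "my-virus-defs-bucket")` (placeholder bucket name) before deleting the stack again.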

Performance depends on the size and type of the file and can also be affected by Lambda cold starts. In my own testing I have found that this solution is not as performant as running ClamAV in a container or on an EC2 instance. The performance tradeoff is something to consider when deciding whether to use this solution.

Hello, the stack deletion failure does not happen randomly; it happens every time.
The first resource that blocks is the bucket policy:

rClamScanVirusDefsBucketPolicy04128771 DELETE_FAILED API: s3:DeleteBucketPolicy Access Denied

Unfortunately, the stack has to be deleted twice every time because of the bucket policy.

You may also need to delete any objects in the S3 buckets if they are blocking deletion.

We'll use a shared stack to avoid the bucket deletion issue.
Regarding performance, I dug into the code and noticed that you invoke the scanner as a command rather than using a daemon service. According to some articles online, clamdscan is much faster than clamscan.
clamscan reads the database on every invocation (a kind of cold-start init), which clamdscan can avoid.
WDYT?

I believe that clamdscan does not make sense in this case, since a new Lambda function is invoked per scan, meaning a new daemon would have to be created on each invocation.

I'm happy to be incorrect though if clamdscan does indeed end up with a faster scan in this case.
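
Whether a daemon would help comes down to how often a scan lands on an already-warm execution environment. The sketch below is a rough back-of-the-envelope model, not a measurement. It assumes that a daemon started during a cold start stays available when the environment is reused, and it ignores daemon startup overhead and memory cost. The function name and all numbers are hypothetical.

```python
def expected_scan_seconds(db_load_s, scan_s, cold_start_fraction, use_daemon):
    """Estimate the average time per file under a simple cost model.

    db_load_s:           time to load the virus database (illustrative)
    scan_s:              time to scan one file once the database is loaded
    cold_start_fraction: share of invocations that start a fresh environment
    use_daemon:          True models clamdscan, False models clamscan
    """
    if not 0.0 <= cold_start_fraction <= 1.0:
        raise ValueError("cold_start_fraction must be between 0 and 1")
    if use_daemon:
        # The database is loaded only when a new environment is created.
        return cold_start_fraction * db_load_s + scan_s
    # clamscan reloads the database on every invocation.
    return db_load_s + scan_s


# Made-up numbers: 20 s to load the database, 5 s to scan a file.
always_cold = expected_scan_seconds(20.0, 5.0, 1.0, use_daemon=True)
mostly_warm = expected_scan_seconds(20.0, 5.0, 0.25, use_daemon=True)
no_daemon = expected_scan_seconds(20.0, 5.0, 0.25, use_daemon=False)
```

Under this model the daemon gives no benefit when every invocation is a cold start (`always_cold` equals the clamscan figure), and the gain grows as more invocations reuse a warm environment.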

I understand, since it's event-based on file/S3 object creation. However, could we keep a Lambda up for the daemon and run clamdscan on the create event as well? I guess we could improve performance this way. The downside is that keeping a Lambda alive for the daemon is something of an anti-pattern and has a cost impact. This could be mitigated with a timer on the last event before letting the daemon Lambda scale down.

I think that pattern is better suited to an ECS/Fargate-based solution than to this Lambda-based one. An ECS/Fargate solution may be the better pattern for those with high-volume scanning workloads.