nlnwa/warchaeology

The compression-level flag is a bool

Closed this issue · 4 comments

Hey, I was trying to set the compression level of the WARC file in the deduper using the compression-level flag, but got this error:

Error: invalid argument "9" for "--compression-level" flag: strconv.ParseBool: parsing "9": invalid syntax

After looking through the code, it seems the flag is defined as a bool, not an int.
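To illustrate why the message mentions strconv.ParseBool, here is a minimal sketch of what a boolean flag roughly does with its argument compared to an integer flag. The helper names parseAsBool and parseAsInt are hypothetical and are not taken from the warchaeology code.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseAsBool is a hypothetical helper that mimics how a boolean flag
// converts its argument: only values such as "true", "false", "1", "0" parse.
func parseAsBool(arg string) (bool, error) {
	return strconv.ParseBool(arg)
}

// parseAsInt is a hypothetical helper that mimics how an integer flag
// converts its argument.
func parseAsInt(arg string) (int, error) {
	return strconv.Atoi(arg)
}

func main() {
	// "9" is not a valid boolean, so this branch reports an error.
	if _, err := parseAsBool("9"); err != nil {
		fmt.Println("bool flag:", err)
	}
	// "9" is a valid integer, so this branch reports the parsed level.
	if level, err := parseAsInt("9"); err == nil {
		fmt.Println("int flag:", level)
	}
}
```

The error returned by parseAsBool("9") has the same text as the tail of the message above, which is consistent with the flag being registered as a bool.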

Also, why does the dedup command create many smaller files?

Hi

Thank you for reporting this issue. It does indeed appear that CompressionLevel should have accepted an integer instead of a bool. I will see if I can create a quick fix for it.

Also, why does the dedup command create many smaller files?

Are the created files about 1 GB each? That appears to be the default size of the files the command creates.

From cmd/dedup/dedup.go

cmd.Flags().StringP(flag.FileSize, "S", "1GB", flag.FileSizeHelp)

I created a branch that might solve the issue: fix/compression-level-as-int-not-bool. Would it be possible for you to check if this branch solves your issue?
As of now, we do not have suitable test data in the repo to test this functionality, so I have no idea whether this solves the issue.
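Flag parsing on its own can be checked without any WARC test data. The following is a rough sketch of that idea using Go's standard flag package as a stand-in for the CLI library the project uses. It is not the code on the branch: parseLevel, the default of -1, and the help text are all assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseLevel is a hypothetical helper: it parses args with an integer
// "compression-level" flag and returns the parsed value.
func parseLevel(args []string) (int, error) {
	fs := flag.NewFlagSet("dedup", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // suppress usage output on parse errors

	// The default of -1 is an assumption, not the project's actual default.
	level := fs.Int("compression-level", -1, "gzip compression level")

	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *level, nil
}

func main() {
	// An integer argument is accepted once the flag is registered as an int.
	level, err := parseLevel([]string{"--compression-level", "9"})
	fmt.Println(level, err)

	// A non-numeric argument is still rejected, now with an integer parse error.
	_, err = parseLevel([]string{"--compression-level", "fast"})
	fmt.Println(err != nil)
}
```

A check like this only covers argument handling. Whether the chosen level actually reaches the WARC writer still has to be verified against real files.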

I will check it in an hour or so.