Washi1337/AsmResolver

Section names with invalid UTF-8 code points fail the 8-byte length limit check

Closed this issue · 5 comments

Avroke commented

Problem Description

When PE section names are parsed, the ArgumentException "Section name cannot be longer than 8 characters." may be thrown.
However, some legitimate files may have a section name longer than 8 characters.
I also hope that malware with such a section can still be parsed.

Proposal

Could you please add an option to disable this check (for example, a Boolean parameter)?
In the worst case, simply crop the name to the first 8 characters.

Alternatives

No response

Additional Context

No response

This check exists because the specification of the PE file format states that section headers can only contain names of up to 8 characters. There is a clause stating that the name can also be an offset into a string table, but executable images do not support this.

That being said, Microsoft can be inconsistent between specifications and their implementations, so feel free to point me to a sample for which this indeed is the case and then we can discuss how this could be implemented.

For example, here are 3 benign files with this problem: Download link

VirusTotal results:
7ca49c093ac66d4b28841b5a2222e53010f4e5f11d745330cf762d6f64ca379e
423ba8faabb19d9683cbad8d0cfe21c9cb66b52f02b12ee3574a2f7d453b444b
55d3ff6e4fb7791cc8f88d10d4fa1e71820e12c48242abb4c4049d025d1c4fa8

I can provide you with many more.

The download link asks for access permission.

Avroke commented

I've just changed the link, thank you for reporting it to me!

Thanks for submitting the test files.

A quick inspection of all these files shows that every section really does have a name of at most 8 characters, as the spec requires. I have verified this with three different PE header parsers: your VirusTotal links, PE-Bear and CFF Explorer.

The reason AsmResolver breaks, however, is that the section names are suffixed with invalid UTF-8 code points (notice the bytes after the zero terminator as reported by e.g. VirusTotal). This makes the call to Encoding.UTF8.GetByteCount return a value larger than 8, failing the assert.
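To illustrate the effect, here is a minimal standalone sketch. It is not AsmResolver's actual code: the class name, the method names and the sample bytes are all made up for this example. It assumes the 8-byte name field is decoded in full and the resulting string is then measured with Encoding.UTF8.GetByteCount. The second method shows one possible way to avoid the problem by cutting the field at the zero terminator, which is not necessarily how the real fix works.

```csharp
using System;
using System.Text;

static class SectionNameDemo
{
    // Decode the whole name field, then count the bytes needed to re-encode it.
    public static int ReencodedByteCount(byte[] field) =>
        Encoding.UTF8.GetByteCount(Encoding.UTF8.GetString(field));

    // Hypothetical workaround: decode only the bytes before the first zero terminator.
    public static string NameBeforeTerminator(byte[] field)
    {
        int end = Array.IndexOf(field, (byte)0);
        if (end < 0) end = field.Length; // no terminator: the name fills the field
        return Encoding.UTF8.GetString(field, 0, end);
    }

    static void Main()
    {
        // ".text", a zero terminator, then two bytes that are never valid in UTF-8.
        byte[] field = { 0x2E, 0x74, 0x65, 0x78, 0x74, 0x00, 0xFF, 0xFE };

        Console.WriteLine(ReencodedByteCount(field));   // 12: each invalid byte became U+FFFD (3 bytes)
        Console.WriteLine(NameBeforeTerminator(field)); // .text
    }
}
```

The field is exactly 8 bytes on disk, but the decoder replaces each invalid byte with U+FFFD, which takes three bytes in UTF-8, so the re-encoded length grows past the limit.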

I will change this issue to a bug report, and have a fix up shortly that filters these invalid code-points. For 6.0 in the future, we probably want to change the Name property to be of type Utf8String, as it does support retaining these invalid code-points.