gimphash is a proposed method to calculate an imphash equivalent for Go binaries. Its name stands for Go-import-hash.
Golang binaries contain their dependencies as part of the executable. These dependencies include both standard library packages and third-party packages, and they can be used, analogous to a classical imphash, to identify a Golang project.
The dependencies can be listed using the pclntab that is part of each Golang binary (also see this blog post by Mandiant). The pclntab contains a number of interesting elements for reverse engineering; for the gimphash we will use the function names that are contained there.
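As a rough illustration of reading function names from the pclntab, the sketch below uses Go's standard `debug/elf` and `debug/gosym` packages. It is not the repository's proof-of-concept code. It assumes an ELF binary that still carries a `.gopclntab` section and was built with a Go version that the running toolchain's `debug/gosym` understands. PE files and binaries with renamed sections would require locating the pclntab by other means. The name `listFunctionNames` is hypothetical.

```go
package main

import (
	"debug/elf"
	"debug/gosym"
	"fmt"
	"os"
)

// listFunctionNames returns the function names stored in the pclntab of the
// ELF binary at path, in the order in which the function table lists them.
// Assumption: the pclntab lives in a section named ".gopclntab".
func listFunctionNames(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	sec := f.Section(".gopclntab")
	if sec == nil {
		return nil, fmt.Errorf("%s: no .gopclntab section", path)
	}
	pclntab, err := sec.Data()
	if err != nil {
		return nil, err
	}
	text := f.Section(".text")
	if text == nil {
		return nil, fmt.Errorf("%s: no .text section", path)
	}

	// A nil symbol table is accepted; the functions are then taken
	// from the pclntab itself.
	table, err := gosym.NewTable(nil, gosym.NewLineTable(pclntab, text.Addr))
	if err != nil {
		return nil, err
	}

	names := make([]string, 0, len(table.Funcs))
	for _, fn := range table.Funcs {
		names = append(names, fn.Name)
	}
	return names, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: listfuncs <elf-binary>")
		return
	}
	names, err := listFunctionNames(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```

Relying on the section name keeps the sketch short, but a robust tool would likely search for the pclntab header itself, since section names can be stripped or altered.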
- Locate the pclntab within a Golang binary
- Enumerate Golang functions using the functab within the pclntab and iterate over their names:
  - Ignore function names starting with `go.` or `type.` (compile artefacts, runtime internals)
  - If a function name contains `vendor/`, discard that substring and everything before it (e.g. transform `vendor/golang.org/x/text` to `golang.org/x/text`)
  - Ignore function names containing `internal/`
  - Ignore function names that start with one of the following standard library packages:
    - `runtime`
    - `sync`
    - `syscall`
    - `type`
    - `time`
    - `unicode`
    - `reflect`
    - `strconv`
  - Ignore function names that are not public or where their receivers are not public. In order to do so:
    - Find the last `/` in the function name. If no `/` is found, use the start instead. Starting from that position, find the next `.`.
    - Extract the base function name as everything after that `.`. If no `.` was found, use everything after the `/` index calculated in the previous step.
    - Ignore the function if the first alphanumeric character in the base function name is a lower case character.
    - If another `.` exists within the base function name, ignore the function if the first alphanumeric character after that `.` is a lower case character.
  - Store the resulting name, if it was not ignored so far, in an ordered list
- Calculate the SHA-256 hash over the concatenated names (no delimiter)
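The filtering and hashing steps above can be sketched in Go as follows. This is a minimal sketch based only on the description in this README, not the repository's proof-of-concept code. The names `normalize`, `gimphash` and `firstAlnumIsLower` are hypothetical, as are the sample function names in `main`. Two readings are assumptions: the first occurrence of `vendor/` is used, and "ordered list" is taken to mean that names stay in the order in which they were enumerated.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
	"unicode"
)

// ignoredPrefixes holds the standard library packages listed above.
// They are matched as plain string prefixes, as the text describes.
var ignoredPrefixes = []string{
	"runtime", "sync", "syscall", "type", "time", "unicode", "reflect", "strconv",
}

// firstAlnumIsLower reports whether the first letter or digit in s is a
// lower case letter.
func firstAlnumIsLower(s string) bool {
	for _, r := range s {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			return unicode.IsLower(r)
		}
	}
	return false
}

// normalize applies the filter rules to a single function name. It returns
// the name to store and true, or "" and false if the name is ignored.
func normalize(name string) (string, bool) {
	if strings.HasPrefix(name, "go.") || strings.HasPrefix(name, "type.") {
		return "", false
	}
	// Assumption: cut at the first occurrence of "vendor/".
	if i := strings.Index(name, "vendor/"); i >= 0 {
		name = name[i+len("vendor/"):]
	}
	if strings.Contains(name, "internal/") {
		return "", false
	}
	for _, p := range ignoredPrefixes {
		if strings.HasPrefix(name, p) {
			return "", false
		}
	}
	// Base name: everything after the first "." that follows the last "/".
	// Without a "/", the search starts at the beginning of the name.
	base := name[strings.LastIndex(name, "/")+1:]
	if dot := strings.Index(base, "."); dot >= 0 {
		base = base[dot+1:]
	}
	if firstAlnumIsLower(base) {
		return "", false // non-public function or receiver
	}
	if dot := strings.Index(base, "."); dot >= 0 && firstAlnumIsLower(base[dot+1:]) {
		return "", false // non-public method on a public receiver
	}
	return name, true
}

// gimphash hashes the kept names, concatenated without a delimiter and in
// input order, with SHA-256 and returns the digest as a hex string.
func gimphash(names []string) string {
	h := sha256.New()
	for _, n := range names {
		if kept, ok := normalize(n); ok {
			h.Write([]byte(kept))
		}
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	names := []string{ // hypothetical sample names
		"go.buildid",
		"runtime.main",
		"main.main",
		"main.Run",
		"example.com/app/vendor/example.org/lib.(*Client).Do",
	}
	for _, n := range names {
		if kept, ok := normalize(n); ok {
			fmt.Println("keep:", kept)
		}
	}
	fmt.Println("gimphash:", gimphash(names))
}
```

Of the sample names, only `main.Run` and the vendored method survive the filter; the latter is stored as `example.org/lib.(*Client).Do`.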
This repository contains proof-of-concept code in the following languages:
- C
- Go
The release section contains prebuilt binaries for Windows and Linux.
Run the Gimphash calculator on a single file
./c_gimphash_linux /mnt/malware-repo/Godoh/godoh-windows64.exe
8200e76e42c4e9cf2bb308d76c017cbdcde5cbbf95e99e02b14d05e7b21505f3 /mnt/malware-repo/Godoh/godoh-windows64.exe
Run the Gimphash calculator on a malware repository
find /mnt/malware-repo/ -type f -exec ./go_gimphash_linux {} \; 2>/dev/null
...