detect malicious program behaviors

Primary language: YARA. License: Apache License 2.0 (Apache-2.0).

bincapz

Enumerates program capabilities and malicious behaviors using fragment analysis.

Features

  • 15,300+ rules that detect everything from ioctls to malware
  • Analyzes binaries from any architecture
    • arm64, amd64, riscv, ppc64, sparc64
  • CI/CD-friendly
  • Diff-friendly output via Markdown, JSON, or YAML
  • Integrates YARA forge for rules by Avast, Elastic, FireEye, Google, Nextron, and others
  • Support for archives
    • .apk, .gem, .gz, .jar, .tar.gz, .tar.xz, .tar, .tgz, and .zip
  • Support for OCI images
  • Support for scripting languages such as bash, PHP, Perl, Ruby, NodeJS, and Python
  • Tuned for especially excellent performance with Linux programs

Shortcomings

  • Minimal rule support for Windows and Java (help wanted!)
  • Early in development; output is subject to change

Requirements

A container runtime environment such as Podman or Docker, or local developer tools:

  • go 1.21+
    • Install via the official installer, goenv, Homebrew, or a preferred package manager
  • pkg-config - included in many UNIX distributions
  • yara

Installation

Containerized

docker pull cgr.dev/chainguard/bincapz:latest

Local

Install YARA (dependency):

brew install yara || sudo apt install libyara-dev \
  || sudo dnf install yara-devel || sudo pacman -S yara

Install bincapz:

go install github.com/chainguard-dev/bincapz@latest

Usage

To inspect a binary, pass it as an argument to dump a list of predicted capabilities:

bincapz /bin/ping

There are flags for controlling output and filtering out rules (see the Supported Flags section). Here's the --format=markdown output:

| RISK | KEY | DESCRIPTION | EVIDENCE |
| --- | --- | --- | --- |
| MEDIUM | combo/net/scan_tool | may scan networks | connect, gethostbyname, port, scan, socket |
| MEDIUM | net/interface/list | list network interfaces | freeifaddrs, getifaddrs |
| MEDIUM | net/ip/string | converts IP address from byte to string | inet_ntoa |
| MEDIUM | net/socket/connect | initiate a connection on a socket | _connect |
| LOW | net/hostname/resolve | resolve network host name to IP address | gethostbyname2 |
| LOW | net/icmp | ICMP (Internet Control Message Protocol), aka ping | ICMP |
| LOW | net/interface/get | get network interfaces by name or index | if_nametoindex |
| LOW | net/ip | access the internet | invalid packet |
| LOW | net/ip/multicast/send | send data to multiple nodes simultaneously | multicast |
| LOW | net/ip/resolve | resolves network hosts via IP address | gethostbyaddr |
| LOW | net/ip/send/unicast | send data to the internet | unicast |
| LOW | net/socket/receive | receive a message from a socket | recvmsg |
| LOW | net/socket/send | send a message to a socket | _send, sendmsg, sendto |
| LOW | process/userid/set | set real and effective user ID of current process | setuid |

To only show output for the most suspicious behaviors, use --min-risk=high, which shows only "HIGH" or "CRITICAL" behaviors.
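The threshold behavior described above can be pictured with a small sketch. It assumes only the level ordering listed for --min-risk (any, low, medium, high, critical); the function name `at_least` and the sample rows are hypothetical and are not part of bincapz.

```python
# Risk levels in ascending order, as listed for the --min-risk flag.
LEVELS = ["any", "low", "medium", "high", "critical"]


def at_least(rows, min_risk):
    """Keep (risk, key) rows whose risk is at or above min_risk."""
    threshold = LEVELS.index(min_risk.lower())
    return [
        (risk, key)
        for risk, key in rows
        if LEVELS.index(risk.lower()) >= threshold
    ]


# Hypothetical rows, with keys borrowed from the sample outputs in this README.
ROWS = [
    ("MEDIUM", "net/socket/connect"),
    ("LOW", "net/icmp"),
    ("CRITICAL", "evasion/xor/user_agent"),
]

print(at_least(ROWS, "high"))
```

With a threshold of "high", only the CRITICAL row survives; with "any", every row is kept.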

Diff mode to detect supply-chain attacks

Let's say you are a company that is sensitive to supply-chain compromises. You want to make sure an update doesn't introduce unexpected capability changes. There's a --diff mode for that:

bincapz --diff old_ffmpeg.dylib new_ffmpeg.dylib

Here is a result using the 3CX compromise as a test case. Each line that begins with a "+" represents a newly added capability.

Changed: . [⚠️ MEDIUM → 🚨 CRITICAL]

20 new behaviors

| RISK | KEY | DESCRIPTION | EVIDENCE |
| --- | --- | --- | --- |
| +CRITICAL | 3P/signature_base/3cxdesktopapp/backdoor | Detects 3CXDesktopApp MacOS Backdoor component, by X__Junior (Nextron Systems) | $op1, $op2, %s/.main_storage, %s/UpdateAgent |
| +CRITICAL | 3P/signature_base/nk/3cx | Detects malicious DYLIB files related to 3CX compromise, by Florian Roth (Nextron Systems) | $xc1, $xc2, $xc3 |
| +CRITICAL | 3P/signature_base/susp/xored | Detects suspicious single byte XORed keyword 'Mozilla/5.0' - it uses yara's XOR modifier and therefore cannot print the XOR key, by Florian Roth | $xo1 |
| +CRITICAL | 3P/volexity/iconic | Detects the MACOS version of the ICONIC loader., by threatintel@volexity.com | $str1, $str2, $str3 |
| +CRITICAL | evasion/xor/user_agent | XOR'ed user agent, often found in backdoors, by Florian Roth | $Mozilla_5_0 |
| +MEDIUM | exec/pipe | launches program and reads its output | _pclose, _popen |
| +MEDIUM | fs/permission/modify | modifies file permissions | chmod |
| +MEDIUM | net/http/cookies | access HTTP resources using cookies | Cookie, HTTP |
| +MEDIUM | net/url/request | requests resources via URL | NSMutableURLRequest |
| +MEDIUM | ref/path/hidden | hidden path generated dynamically | %s/.main_storage |
| +MEDIUM | shell/arbitrary_command/dev_null | runs commands, discards output | "%s" >/dev/null |
| +LOW | compression/gzip | works with gzip files | gzip |
| +LOW | env/HOME | Looks up the HOME directory for the current user | HOME, getenv |
| +LOW | fs/lock/update | apply or remove an advisory lock on a file | flock |
| +LOW | kernel/dispatch/semaphore | Uses Dispatch Semaphores | dispatch_semaphore_signal |
| +LOW | kernel/hostname/get | get computer host name | gethostname |
| +LOW | net/http/accept/encoding | set HTTP response encoding format (example: gzip) | Accept-Encoding |
| +LOW | random/insecure | generate random numbers insecurely | _rand, srand |
| +LOW | ref/path/home_library | path reference within ~/Library | /System/Library/Frameworks/CoreFoundation, /System/Library/Frameworks/Foundation |
| +LOW | sync/semaphore/user | uses semaphores to synchronize data between processes or threads | semaphore_create, semaphore_signal, semaphore_wait |

If you like to do things the hard way, you can also generate your own diff using JSON keys.

bincapz --format=json <file> | jq '.Files[].Behaviors | keys'
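As a rough illustration of the "hard way", the sketch below compares the behavior keys of two JSON reports. It assumes the layout implied by the jq expression, namely a top-level "Files" collection whose entries each carry a "Behaviors" mapping keyed by behavior name; the actual schema may differ between versions. The function names and the inline sample reports are hypothetical.

```python
import json


def behavior_keys(report):
    """Collect every behavior key found under Files -> Behaviors."""
    files = report.get("Files") or {}
    # Assumption: Files maps a path to a per-file report; tolerate a list too.
    entries = files.values() if isinstance(files, dict) else files
    keys = set()
    for entry in entries:
        keys.update((entry.get("Behaviors") or {}).keys())
    return keys


def diff_behaviors(old_report, new_report):
    """Return the behavior keys added to and removed from the new report."""
    old, new = behavior_keys(old_report), behavior_keys(new_report)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Hypothetical, heavily trimmed reports standing in for real JSON output.
OLD = json.loads(
    '{"Files": {"old.dylib": {"Behaviors": {"compression/gzip": {}}}}}'
)
NEW = json.loads(
    '{"Files": {"new.dylib": {"Behaviors": '
    '{"compression/gzip": {}, "exec/pipe": {}, "ref/path/hidden": {}}}}}'
)

print(diff_behaviors(OLD, NEW))
```

In practice the two reports would be loaded with json.load from files produced by running bincapz --format=json on each version.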

Supported Flags

  • --all: ignore nothing, show all
  • --data-files: include files that are detected as non-program (binary or source) files
  • --diff: show capability drift between two files
  • --format string: Output type. Valid values are: json, markdown, simple, terminal, yaml (default "terminal")
  • --ignore-self: ignore the bincapz binary
  • --ignore-tags string: Rule tags to ignore
  • --min-file-risk: only show results for files that meet this risk level (any,low,medium,high,critical)
  • --min-risk: minimum suspicion level to report (any,low,medium,high,critical)
  • --oci: scan OCI images
  • --omit-empty: omit files that contain no matches
  • --profile: capture profiling/tracing information for bincapz
  • --stats: display statistics for risk level and programkind
  • --third-party: include third-party rules, which may have licensing restrictions (default true)
  • --verbose: turn on verbose output for diagnostic/troubleshooting purposes

Samples

bincapz samples are stored in the separate bincapz-samples repository because of their size. The samples were originally stored in this repository, but they bloated the Git history and made the repository difficult to pull.

The samples repository is cloned when running make test, and the contents of test_data are copied into the resulting samples directory, so the tests run as usual. To update sample test data, run make refresh-sample-testdata, which writes updated test data to the files in test_data; those files can then be committed.

FAQ

How does it work?

bincapz behaves similarly to the initial triage step most security analysts use when faced with an unknown binary: a cursory strings inspection. It has several advantages over human analysis: it can match raw byte sequences, decrypt data, and draw on a library of 15,300+ YARA rules that combines the experience of security engineers worldwide.

This strategy works, as every program leaves traces of its capabilities in its contents, particularly on UNIX platforms. These fragments are typically libc or syscall references or error codes. Scripting languages are easier to analyze due to their cleartext nature and are also supported.
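To make the idea of fragment analysis concrete, here is a deliberately simplified sketch. It is not how bincapz is implemented (bincapz uses YARA rules); it only mimics a strings-style pass over raw bytes followed by a lookup in a tiny, hypothetical fragment table whose keys are borrowed from the sample output earlier in this README.

```python
import re

# Hypothetical fragment -> capability table. Real rules are far richer.
FRAGMENTS = {
    b"gethostbyname": "net/hostname/resolve",
    b"setuid": "process/userid/set",
    b"popen": "exec/pipe",
    b"chmod": "fs/permission/modify",
}

# Runs of at least four printable ASCII bytes, similar to the strings utility.
PRINTABLE = re.compile(rb"[\x20-\x7e]{4,}")


def capabilities(data):
    """Map each matched capability key to the strings that triggered it."""
    found = {}
    for text in PRINTABLE.findall(data):
        for fragment, key in FRAGMENTS.items():
            if fragment in text:
                found.setdefault(key, set()).add(text.decode("ascii"))
    return found


# Hypothetical stand-in for the contents of a binary.
SAMPLE = b"\x00\x01setuid\x00\x7fELF\x00gethostbyname2\x00ab\x00"

print(capabilities(SAMPLE))
```

A real binary could be passed in the same way, by reading its bytes with open(path, "rb").read(). Substring matching like this is crude and prone to false positives, which is part of why rule languages such as YARA exist.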

Why not properly reverse-engineer binaries?

Mostly because fragment analysis is so effective. Capability analysis through reverse engineering is challenging to get right, particularly for programs that execute other programs, such as malware that executes /bin/rm. Capability analysis through reverse engineering that supports a wide array of file formats also requires significant engineering investment.

Why not just observe binaries in a sandbox?

The most exciting malware only triggers when the right conditions are met. Nation-state actors, in particular, are fond of time bombs and locale detection. bincapz will enumerate the capabilities, regardless of conditions.

Why not just analyze the source code?

Sometimes you don't have it! Sometimes your CI/CD infrastructure is the source of compromise. Source-code-based capability analysis is also complicated for polyglot programs, or programs that execute external binaries, such as /bin/rm.

How does bincapz work for packed binaries (UPX)?

bincapz alerts when an obfuscated or packed binary is detected, such as those generated by upx. Fragment analysis may still work to a lesser degree. For the full story, we recommend unpacking binaries first.

What related software is out there?

bincapz was initially inspired by mandiant/capa. While capa is a fantastic tool, it only works on x86-64 binaries (ELF/PE), and does not work for macOS programs, arm64 binaries, or scripting languages. https://karambit.ai/ and https://www.reversinglabs.com/ offer capability analysis through reverse engineering as a service. If you require more than what bincapz can offer, such as Windows binary analysis, you should check them out.

How can I help?

If you find malware that bincapz doesn't surface suspicious behaviors for, send us a patch! All of the rules are defined in YARA format, and can be found in the rules/ folder.

Verifying commits and tags

In addition to contributed code, automated PRs and commits can be verified by following these steps.

Troubleshooting

Profiling

bincapz can be profiled by running it with --profile=true. This generates timestamped profiles in an untracked profiles directory:

bash-5.2$ ls -l profiles/ | grep -v "total" | awk '{ print $9 }'
cpu_329605000.pprof
mem_329605000.pprof
trace_329605000.out

The profiles and traces can be inspected via go tool pprof and go tool trace, respectively.

For example, the memory profile can be inspected by running:

go tool pprof -http=:8080 profiles/mem_<timestamp>.pprof

Error: ld: library 'yara' not found

If you get this error at installation:

ld: library 'yara' not found

The yara C library is required:

brew install yara || sudo apt install libyara-dev || sudo dnf install yara-devel || sudo pacman -S yara

Additionally, ensure that the installed YARA version is 4.3.2.

If this version is not available via package managers, manually download the release from here and build it from source by following these steps.

Once YARA is installed, run sudo ldconfig -v to refresh the dynamic linker cache so that the library can be found.