codeql-sample-polkit

All stages of exploring the polkit CVE-2021-4034 using codeql


The polkit pkexec bug

Overview

This repository examines the polkit pkexec bug using CodeQL. It has

  • instructions for building the databases
  • the resultant databases
  • a sequence of queries illustrating an approach to find this bug

These are done:

  • [X] the polkit source / database build
  • [X] codeql query for vulnerable source
  • [X] CFG illustration

Still to be done:

  • [ ] codeql query enhancements to also handle patched source
  • [ ] command-line instructions

The Bug

The polkit pkexec bug (CVE-2021-4034) starts from an out-of-bounds access on argv and builds on it. The out-of-bounds part of the problem is something we can examine with the CodeQL range analysis library.

pkexec’s main() function in polkit/src/programs/pkexec.c has the structure

435 main (int argc, char *argv[])
436 {
...
534   for (n = 1; n < (guint) argc; n++)
535     {
...
568     }
...
610   path = g_strdup (argv[n]);
...
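
To make the structure above concrete, here is a minimal C sketch of what happens to n when the loop body never runs. The helper names index_after_option_loop and argv_slot_readable are hypothetical, the loop body is left empty (the real loop parses options), and unsigned int stands in for guint.

```c
#include <stdbool.h>

/* Hypothetical helper: value of n once the option loop is done,
   mirroring `for (n = 1; n < (guint) argc; n++)` with an empty body. */
unsigned int index_after_option_loop(int argc)
{
    unsigned int n;

    for (n = 1; n < (unsigned int) argc; n++) {
        /* option parsing elided */
    }
    return n;
}

/* argv holds argc entries plus a terminating NULL at argv[argc],
   so the readable slots are 0..argc inclusive. */
bool argv_slot_readable(int argc, unsigned int n)
{
    return argc >= 0 && n <= (unsigned int) argc;
}
```

With argc == 0 the loop is skipped, n stays at 1, and argv[1] lies past the terminating NULL stored in argv[0]. With argc == 1 the loop is also skipped, but argv[1] is the NULL slot and still readable.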

Main ideas:

  • Use simple range analysis on argc.
  • Limit rhs / lhs of expressions to those involving argc.
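
These two ideas can be illustrated with a toy interval model in C. This is only a sketch of the reasoning, not CodeQL's SimpleRangeAnalysis; the Interval type and both helpers are hypothetical names introduced here.

```c
#include <limits.h>
#include <stdbool.h>

/* Closed interval [lo, hi] of values a variable may hold. */
typedef struct {
    long lo;
    long hi;
} Interval;

/* Narrow x on the path where the test `x >= k` succeeded. */
Interval assume_at_least(Interval x, long k)
{
    if (x.lo < k)
        x.lo = k;
    return x;
}

/* argv[index] is readable for every possible argc only if
   0 <= index <= smallest possible argc (argv[argc] is the NULL slot). */
bool index_always_readable(Interval argc, long index)
{
    return index >= 0 && index <= argc.lo;
}
```

Starting from the full int range { INT_MIN, INT_MAX } for argc, index 1 is not provably readable. After a guard that rejects argc < 1, modeled as assume_at_least(argc, 1), it is.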

Versions to check:

  • All Polkit versions from 2009 onwards are vulnerable; first version in May 2009 (commit c8c3d83, “Add a pkexec(1) command”).
  • We can get a database from lgtm; the current one (as of 2022-02-11) is ...srcVersion_a6bedfd..., but it is already past the polkit patch:
    commit a6bedfd09b7bba753de7a107dc471da0db801858 (origin/master, origin/HEAD, master)
    Author: Xi Ruoyao <xry111@mengyan1223.wang>
    Date:   Thu Jan 27 10:16:32 2022 +0000
    
        jsauthority: port to mozjs-91
    
    commit a2bf5c9c83b6ae46cbd5c779d3055bff81ded683
    Author: Jan Rybar <jrybar@redhat.com>
    Date:   Tue Jan 25 17:21:46 2022 +0000
    
        pkexec: local privilege escalation (CVE-2021-4034)
        

    And we can see that the problem is fixed by an early check on argc (the excerpt is the pkcheck.c hunk of the commit):

    commit a2bf5c9c83b6ae46cbd5c779d3055bff81ded683
    Author: Jan Rybar <jrybar@redhat.com>
    Date:   Tue Jan 25 17:21:46 2022 +0000
    
        pkexec: local privilege escalation (CVE-2021-4034)
    
    diff --git a/src/programs/pkcheck.c b/src/programs/pkcheck.c
    index f1bb4e1..768525c 100644
    --- a/src/programs/pkcheck.c
    +++ b/src/programs/pkcheck.c
    @@ -363,6 +363,11 @@ main (int argc, char *argv[])
       local_agent_handle = NULL;
       ret = 126;
     
    +  if (argc < 1)
    +    {
    +      exit(126);
    +    }
    +
       /* Disable remote file access from GIO. */
       setenv ("GIO_USE_VFS", "local", 1);
        
  • So we need the source code to build our own databases, one pre-patch and one post-patch.

The next section goes through the build steps, using a Docker container.

Build polkit and CodeQL DB

We need the build setup for polkit before we can get a codeql database.

Operating system options for building:

  • macOS is worth a try, but this becomes tricky early on. Using brew to get dependencies works to a point, but the mozjs-78 dependency is a specific version of spidermonkey and building that is not practical.
    # autoconf... a little tricky on a mac
    brew install autoconf automake libtool gtk-doc
    export PATH="/usr/local/opt/libtool/libexec/gnubin:$PATH"
    ./autogen.sh 
    
    # Use meson?
    brew install meson ninja intltool glib gobject-introspection 
        
  • Linux is the native environment for polkit, but which one? The mozjs-78 dependency is a specific version of spidermonkey; also, polkit is not used by all distributions:
    • Debian uses PolicyKit, not polkit.
    • Ubuntu:
      • 18.04 is also missing mozjs78 (only mozjs52)
      • 22.04 has mozjs78

Ubuntu 22.04 can be run in a number of ways: on hardware, in a VM (VMware, VirtualBox, Multipass, etc.), or in a Docker container on another host. For this problem, we can use a Docker container and include the codeql command-line tools as well.

The container is defined in ./Dockerfiles; here is the build sequence:

# Base image for setting up the qlbuild container
docker pull ubuntu:jammy
docker images
docker run --cpus 4 -m 8GB -ti ubuntu:jammy

# To-be-customized image
docker build -t qlbuild .

Note: when using Docker Desktop on Windows or macOS, the memory and CPU limits must first be raised in its settings. Once they are set, the container is run with

# Run as daemon so it stays around even when disconnecting. 
docker run -d -p 127.0.0.1:2020:22 --cpus 8 -m 16GB qlbuild

# And connect
ssh -p 2020 test@localhost

Building on Ubuntu 22.04

# ---------------------------------
# System setup/install, as root:
echo "deb-src http://archive.ubuntu.com/ubuntu/ jammy main restricted" >> /etc/apt/sources.list
apt-get update
apt-get install -y zile build-essential git cmake \
        meson ninja-build \
        libmozjs-78-0 libmozjs-78-dev \
        libdbus-1-3 libdbus-1-dev
apt-get build-dep -y  policykit-1
apt install unzip

# polkit version a2bf5c9c also needs some extras
apt install duktape duktape-dev
# older meson into /usr/local/bin
pip3 install meson==0.60.3
# Or get the source and use that:
#     wget https://github.com/mesonbuild/meson/archive/refs/tags/0.60.3.tar.gz
#     tar zxf 0.60.3.tar.gz
#     etc.

# ---------------------------------
# codeql setup -- still root

# grab -- retrieve and extract codeql cli and library
# Usage: grab version platform prefix
grab() {
    version=$1; shift
    platform=$1; shift
    prefix=$1; shift
    mkdir -p $prefix/codeql-$version &&
        cd $prefix/codeql-$version || return

    # Get cli
    wget "https://github.com/github/codeql-cli-binaries/releases/download/$version/codeql-$platform.zip"
    # Get lib
    wget "https://github.com/github/codeql/archive/refs/tags/codeql-cli/$version.zip"
    # Fix attributes
    if [ `uname` = Darwin ] ; then
        xattr -c *.zip
    fi
    # Extract
    unzip -q codeql-$platform.zip
    unzip -q $version.zip
    # Rename library directory for VS Code
    mv codeql-codeql-cli-$version/ ql
    # Remove archives
    rm codeql-$platform.zip
    rm $version.zip
}    

grab v2.7.6 linux64 /opt
grab v2.6.3 linux64 /opt

# ---------------------------------
# As user test:
# Get polkit source
cd /tmp && git clone https://gitlab.freedesktop.org/polkit/polkit.git

# Build version 0.119
cd /tmp/polkit
git checkout 0.119 
git clean -fxd

meson setup builddir
meson compile -C builddir

find builddir -name pkexec -ls
: 139269     76 -rwxr-xr-x   1 test     root        76696 Feb 12 03:06 builddir/src/programs/pkexec

# ---------------------------------
# Build codeql database for version 0.119 
cd /tmp/polkit
git checkout 0.119 
git clean -fxd

# Run the configuration step as usual, without codeql
cd /tmp/polkit && rm -fR builddir
meson setup builddir

# Run the build step under codeql
export CODEQL=/opt/codeql-v2.7.6/codeql/codeql
$CODEQL --version

$CODEQL database create  --language=cpp -s . -j 8 -v \
        polkit-0.119.db \
        --command='meson compile -C builddir'

# Wait for 
# TRAP import complete (10.2s).
# Successfully created database at /tmp/polkit/polkit-0.119.db.

# And a quick check to make sure pkexec was seen:
unzip -v polkit-0.119.db/src.zip |grep pkexec
: 29713  Defl:N     8477  72% 2022-02-14 20:12 bb39f235  tmp/polkit/src/programs/pkexec.c

# ---------------------------------
# Build codeql database for version a2bf5c9c, the patched version (and still using
# mozjs-78)
cd /tmp/polkit
git checkout a2bf5c9c 
git clean -fxd

# Run the configuration step as usual, without codeql
cd /tmp/polkit && rm -fR builddir
/usr/local/bin/meson setup builddir

# With meson 0.61, configuration runs into the error
#   actions/meson.build:3:5: ERROR: Function does not take positional arguments.
# quick search leads to 
#   https://lore.kernel.org/all/20220111222135.693a88f2@windsurf/T/
# and from there to
#   [1/1] package/gobject-introspection: bump to version 1.70.0

# Run the build step under codeql
export CODEQL=/opt/codeql-v2.7.6/codeql/codeql
$CODEQL --version

$CODEQL database create  --language=cpp -s . -j 8 -v \
        polkit-a2bf5c9c.db \
        --command='/usr/local/bin/meson compile -C builddir'

# Wait for 
# TRAP import complete (7.2s).
# Successfully created database at /tmp/polkit/polkit-a2bf5c9c.db.

# And a quick check to make sure pkexec was seen:
unzip -v polkit-a2bf5c9c.db/src.zip |grep pkexec
:   30136  Defl:N     8647  71% 2022-02-14 21:27 6af18604  tmp/polkit/src/programs/pkexec.c

Copy the db to a permanent place on the host

# Copy from the container
mkdir -p ~/local/polkit && cd ~/local/polkit 
scp -rq -P 2020  test@localhost:/tmp/polkit/polkit-0.119.db .
scp -rq -P 2020  test@localhost:/tmp/polkit/polkit-a2bf5c9c.db .

# Keep originals
zip -rq polkit-0.119.zip polkit-0.119.db 
zip -rq polkit-a2bf5c9c.zip polkit-a2bf5c9c.db

Next up, setting up for query development.

Query development setup

Queries can be explored via the codeql cli by itself, or using the codeql cli + the VS Code plugin. For both cases, install the cli (see the grab() function above), and either extract the databases from ./db or build them as described in Build polkit and CodeQL DB.

In the following, we assume this directory structure for the databases:

.
├── polkit-0.119.db
│   ├── codeql-database.yml
│   ├── db-cpp
│   ├── log
│   └── src.zip
├── polkit-0.119.zip
├── polkit-a2bf5c9c.db
│   ├── codeql-database.yml
│   ├── db-cpp
│   ├── log
│   └── src.zip
└── polkit-a2bf5c9c.zip

The query

The query is developed incrementally in ./argv-out-of-bounds-*.ql.

The first steps in ./argv-out-of-bounds-0.ql use the AST and variable references to narrow results to the known parts of the problem, as in the following.

declaration      | 435 main (int argc, char *argv[])
                 | 436 {
                 | ...
init other var;  | 534   for (n = 1;
compare to argc  |            n < (guint) argc;
update other var |            n++)
                 | 535     {
                 | ...
                 | 568     }
                 | ...
indexed read     | 610   path = g_strdup (argv[n]);
                 | ...
                 | 629   if (path[0] != '/')
                 | 630     {
                 | ...
                 | 632       s = g_find_program_in_path (path);
                 | ...
indexed write    | 639       argv[n] = path = s;
                 | 640     }

Exploration of values starts in ./argv-out-of-bounds-1.ql with an attempt at using the SimpleRangeAnalysis library via

lowerBound(cmp.getLeftOperand().getFullyConverted()),  "left lower bound",
lowerBound(cmp.getRightOperand().getFullyConverted()), "right lower bound"

The bounds are correct bounds for the types, but too general – they include the possible results of iteration. We are only interested in initial bounds that are statically determinate, those before any iteration happens.

Put another way, this is not a general data flow problem; we only want to check initial value propagation along certain execution paths. The for loop complicates this, as do the operations within it.

We really want to see execution paths that bypass the loop altogether. That is done in the latter parts of ./argv-out-of-bounds-1.ql, using a SsaDefinition.
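
A toy C model shows why the merged bounds are too general. The Range type, join, and the two helper functions are hypothetical; they only mimic how an analysis combines the initial definition of n with its loop update, and they are not part of the CodeQL libraries. The sketch assumes argc_max >= 2.

```c
/* Closed range [lo, hi] of values for the loop counter n. */
typedef struct {
    unsigned long lo;
    unsigned long hi;
} Range;

/* Smallest range covering both inputs: what a merge point sees. */
Range join(Range a, Range b)
{
    Range r;

    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}

/* n on the path that skips the loop body: only the initializer n = 1. */
Range n_bypassing_loop(void)
{
    Range r = { 1, 1 };
    return r;
}

/* n after at least one `n++` in `for (n = 1; n < argc; n++)`:
   it was in [1, argc_max - 1] before the update, so [2, argc_max] after. */
Range n_after_update(unsigned long argc_max)
{
    Range r = { 2, argc_max };
    return r;
}
```

The joined range [1, argc_max] is sound, but it hides the fact that on the bypass path n is exactly 1. That exact value is what matters for the index check, which is why the bypass path is looked at on its own.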

The query ./argv-out-of-bounds-2.ql cleans up the exploration from ./argv-out-of-bounds-1.ql and correctly identifies all the locations using n with a known index value n > 0.

This query reports results on the vulnerable version of the code, polkit-0.119.db. Next, it needs to be checked and enhanced so it reports nothing on the patched version, polkit-a2bf5c9c.db.