semgrep/semgrep-rules

[Rule] Dependency confusion

Sjord opened this issue · 2 comments

Sjord commented

Consider the following snippet in this Dockerfile:

RUN pip install model_lstm==${API_MODEL_LSTM_VERSION} \
    --index-url https://${FURY_TOKEN}@pypi.fury.io/${FURY_USERNAME}/ \
    --extra-index-url https://pypi.org/simple/

This installs the package model_lstm. It configures two sources for packages, pypi.fury.io and pypi.org. Presumably model_lstm is meant to be loaded from the private pypi.fury.io, since it's not in PyPI, the public package index. However, pip gives --index-url no priority over --extra-index-url: it collects candidates from both and picks the best matching version. Anyone can add a package called model_lstm to the public PyPI repository, and if the version number is high enough, it will be installed when building this Docker image. So this allows an attacker to install their own code instead of the code from the private repo.
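To illustrate why the higher version wins, here is a minimal sketch in Python. It is a simplified model, not pip's actual resolver: the function names, the index contents, and the version numbers are all made up, and it ignores version pins, pre-releases, and wheel preferences.

```python
# Simplified model of an installer that merges candidates from several indexes.
# Hypothetical data and helper names; this is NOT pip's real resolution logic.

def parse_version(version):
    """Turn a plain dotted version like '1.3.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def pick_release(indexes, package):
    """Return (index_name, version) of the highest version found in any index."""
    candidates = [
        (parse_version(version), index_name)
        for index_name, packages in indexes.items()
        for version in packages.get(package, [])
    ]
    if not candidates:
        return None
    version, index_name = max(candidates)
    return index_name, ".".join(str(part) for part in version)


# The private index holds the real package; an attacker uploads a much
# higher version under the same name to the public index.
indexes = {
    "private": {"model_lstm": ["1.2.0", "1.3.0"]},
    "public": {"model_lstm": ["99.0.0"]},
}

print(pick_release(indexes, "model_lstm"))
```

In this model the index a release comes from plays no role in the choice, which is the property the attack relies on.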

I think Semgrep can find vulnerabilities like this, for example by searching for --extra-index-url in Dockerfiles, where the URL specifies credentials.
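As a rough sketch of that search, written in plain Python rather than as an actual Semgrep rule: the function name and regular expressions below are hypothetical, and the check is deliberately broader than the one described above, since it reports every --extra-index-url on a pip install line whether or not credentials are involved. It also does not understand shell quoting, so a quoted value such as "$(cat ...)" is only captured up to the first space.

```python
import re

# Hypothetical helper patterns; a real rule would need proper Dockerfile parsing.
CONTINUATION = re.compile(r"\\\s*\n")          # backslash line continuation
PIP_INSTALL = re.compile(r"\bpip3?\s+install\b")
EXTRA_INDEX = re.compile(r"--extra-index-url(?:\s+|=)(\S+)")


def find_extra_index_urls(dockerfile_text):
    """Return the --extra-index-url values used in RUN ... pip install lines."""
    # Join continued lines so each RUN instruction sits on one logical line.
    joined = CONTINUATION.sub(" ", dockerfile_text)
    findings = []
    for line in joined.splitlines():
        stripped = line.strip()
        if not stripped.upper().startswith("RUN"):
            continue
        if not PIP_INSTALL.search(stripped):
            continue
        findings.extend(EXTRA_INDEX.findall(stripped))
    return findings


# Made-up Dockerfile content modelled on the snippet above.
sample = (
    "FROM python:3\n"
    "RUN pip install model_lstm==1.0 \\\n"
    "    --index-url https://example.invalid/simple/ \\\n"
    "    --extra-index-url https://pypi.org/simple/\n"
    "RUN pip install requests\n"
)

print(find_extra_index_urls(sample))
```

Each hit is only a lead for manual review: whether it is exploitable depends on whether the package names exist on every configured index.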

The above example is with Python's pip, but dependency confusion is possible with most dependency tools.

Sjord commented

Another example:

RUN --mount=type=secret,id=pip_index_url \
    pip install --extra-index-url "$(cat /run/secrets/pip_index_url)" -r full_requirements.txt --no-cache-dir

Another one:

RUN --mount=type=secret,id=netrc,target=/root/.netrc pip3 install --no-cache-dir \
                 --extra-index-url https://artifactory.algol60.net/artifactory/csm-python-modules/simple \
                 --trusted-host artifactory.algol60.net -r requirements.txt

inkz commented

@Sjord that's a great idea! It can probably be done for more than just Python/pip.