github-org-base-image-fetcher
Base Image Fetcher is a Go CLI utility that fetches the names of the base container images used across all repositories in a GitHub organization, or in one specific repository of that organization. The container image names can then be fed into a vulnerability scanning tool such as Trivy (Issue #1 is open for this feature).
Utility in action
go run main.go --org=manisbindra --fileName=Dockerfile --generateOutputFile --noOfWorkers=10
The above command scans all files in the GitHub organization "manisbindra" that have "Dockerfile" in their file names, and parses those files to retrieve the distinct base container images in use. The --generateOutputFile argument additionally writes a JSON file that lists each distinct container image along with every file in which it is used. The command output is shown below:
2022/09/11 23:35:47 fileCount: 6
2022/09/11 23:35:47 Reducing number of workers to number of files......
2022/09/11 23:35:47 All workers started...
2022/09/11 23:35:47 All files added for processing to worker queue...
2022/09/11 23:35:48 All files processed...
2022/09/11 23:35:48 All workers stopped...
2022/09/11 23:35:48 Writing container images, file mapping to output file...
2022/09/11 23:35:48 Printing name of distinct container image names...
Distinct container images:
--------------------------
openjdk:18-slim-buster
openjdk:18-jdk-alpine3.14
golang:1.10-stretch
python:2.7-slim
mcr.microsoft.com/dotnet/sdk:5.0
mcr.microsoft.com/dotnet/aspnet:5.0
alpine:latest
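The parsing step described above can be pictured with a small Go sketch. This is not the utility's actual code: the function name baseImages is hypothetical, and the handling of --platform flags and of multi-stage aliases (FROM build, where build was named by an earlier AS) is an assumption about what a reasonable parser would do. Build arguments such as FROM ${BASE} are not resolved.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// baseImages (hypothetical helper) returns the distinct images named in the
// FROM instructions of a Dockerfile, in order of first appearance.
// Assumption: a FROM that refers to an earlier build stage alias is not a
// base image and is skipped.
func baseImages(dockerfile string) []string {
	seen := map[string]bool{}   // images already reported
	stages := map[string]bool{} // stage aliases declared with AS
	var images []string

	scanner := bufio.NewScanner(strings.NewReader(dockerfile))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 || !strings.EqualFold(fields[0], "FROM") {
			continue
		}
		// Skip optional flags such as --platform=linux/amd64.
		args := fields[1:]
		for len(args) > 0 && strings.HasPrefix(args[0], "--") {
			args = args[1:]
		}
		if len(args) == 0 {
			continue
		}
		image := args[0]
		isStage := stages[strings.ToLower(image)]
		// Remember the alias of this stage: FROM image AS name.
		if len(args) >= 3 && strings.EqualFold(args[1], "AS") {
			stages[strings.ToLower(args[2])] = true
		}
		if isStage || seen[image] {
			continue
		}
		seen[image] = true
		images = append(images, image)
	}
	return images
}

func main() {
	dockerfile := `FROM golang:1.10-stretch AS builder
RUN go build -o /app .
FROM alpine:latest
COPY --from=builder /app /app
`
	for _, image := range baseImages(dockerfile) {
		fmt.Println(image)
	}
}
```

Run against the sample Dockerfile in main, the sketch prints golang:1.10-stretch followed by alpine:latest; the COPY --from line is ignored because it is not a FROM instruction.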
Command Line Arguments
-
org: The GitHub organization to scan for files. This argument is mandatory.
-
fileName: By default, files with "Dockerfile" in their name are parsed for base container images. Use this argument to match a different file name.
-
repoName: Optional. Supply this argument to scan only a specific repository in the organization.
-
ghToken: Required to scan private GitHub repositories. Note: if this argument is supplied together with --generateOutputFile, the generated output file will contain the token as part of the file download URLs.
-
generateOutputFile: Specify this argument to enable generation of the output JSON file.
-
outputFile: The output file name, including the full path. The output file maps each distinct container image name to the list of files the image is used in, in the following format:
{
  "alpine:latest": [
    "https://raw.githubusercontent.com/maniSbindra/k8s-delete-validation-webhook/98e33089208081360dbccd384ffdbb815b98ca2a/build/Dockerfile-local-go-build",
    "https://raw.githubusercontent.com/maniSbindra/k8s-delete-validation-webhook/98e33089208081360dbccd384ffdbb815b98ca2a/build/Dockerfile"
  ],
  "golang:1.10-stretch": [
    "https://raw.githubusercontent.com/maniSbindra/k8s-delete-validation-webhook/98e33089208081360dbccd384ffdbb815b98ca2a/build/Dockerfile"
  ]
}
-
noOfWorkers: The number of goroutines used to parse the files. When scanning an organization with a large number of repositories and Dockerfiles, a higher value such as 10 is recommended. The default value is 1.
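To illustrate what noOfWorkers controls, here is a minimal worker pool sketch in Go. It is an assumption about how such a pool could be structured, not the utility's own code: processFiles and the parse callback are hypothetical names. The sketch caps the worker count at the number of files, which mirrors the "Reducing number of workers to number of files" log line, and collects the same image-to-files mapping that the output file contains.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// processFiles (hypothetical) runs parse over every file using at most
// noOfWorkers goroutines and returns a map from image name to the files
// that use it. File lists are sorted so the result is deterministic.
func processFiles(files []string, noOfWorkers int, parse func(file string) []string) map[string][]string {
	if noOfWorkers < 1 {
		noOfWorkers = 1
	}
	// More workers than files would only sit idle.
	if noOfWorkers > len(files) {
		noOfWorkers = len(files)
	}

	jobs := make(chan string)
	result := map[string][]string{}
	var mu sync.Mutex // guards result
	var wg sync.WaitGroup

	for i := 0; i < noOfWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for file := range jobs {
				images := parse(file)
				mu.Lock()
				for _, image := range images {
					result[image] = append(result[image], file)
				}
				mu.Unlock()
			}
		}()
	}

	for _, file := range files {
		jobs <- file
	}
	close(jobs) // lets the workers' range loops end
	wg.Wait()

	for image := range result {
		sort.Strings(result[image])
	}
	return result
}

func main() {
	// Stand-in for downloading and parsing each Dockerfile.
	fake := map[string][]string{
		"build/Dockerfile":                {"golang:1.10-stretch", "alpine:latest"},
		"build/Dockerfile-local-go-build": {"alpine:latest"},
	}
	files := []string{"build/Dockerfile", "build/Dockerfile-local-go-build"}

	result := processFiles(files, 10, func(file string) []string { return fake[file] })

	images := make([]string, 0, len(result))
	for image := range result {
		images = append(images, image)
	}
	sort.Strings(images)
	for _, image := range images {
		fmt.Println(image, result[image])
	}
}
```

A mutex around a shared map keeps the sketch short; sending results back over a second channel would work equally well.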
More Sample commands in action
-
Full organization scan across public repositories in organization (with no output file)
$ go run main.go --org=manisbindra --fileName=Dockerfile --noOfWorkers=10
2022/09/12 00:01:08 fileCount: 6
2022/09/12 00:01:08 Reducing number of workers to number of files......
2022/09/12 00:01:08 All workers started...
2022/09/12 00:01:08 All files added for processing to worker queue...
2022/09/12 00:01:08 All files processed...
2022/09/12 00:01:08 All workers stopped...
2022/09/12 00:01:08 Printing name of distinct container image names...
Distinct container images:
--------------------------
openjdk:18-slim-buster
openjdk:18-jdk-alpine3.14
mcr.microsoft.com/dotnet/sdk:5.0
mcr.microsoft.com/dotnet/aspnet:5.0
python:2.7-slim
golang:1.10-stretch
alpine:latest
-
Scan of single public repository
$ go run main.go --org=manisbindra --repoName=k8s-delete-validation-webhook --fileName=Dockerfile
2022/09/12 00:03:37 fileCount: 2
2022/09/12 00:03:37 All workers started...
2022/09/12 00:03:37 All files added for processing to worker queue...
2022/09/12 00:03:37 All files processed...
2022/09/12 00:03:37 All workers stopped...
2022/09/12 00:03:37 Printing name of distinct container image names...
Distinct container images:
--------------------------
alpine:latest
golang:1.10-stretch
-
Specific private repository scan
$ go run main.go --org=manisbindra --repoName=privateRepoName --fileName=Dockerfile --generateOutputFile --outputFile="./tempfile.json" --ghToken=$GITHUB_TOKEN
2022/09/11 23:57:20 fileCount: 2
2022/09/11 23:57:20 All workers started...
2022/09/11 23:57:20 All files added for processing to worker queue...
2022/09/11 23:57:21 All files processed...
2022/09/11 23:57:21 All workers stopped...
2022/09/11 23:57:21 Writing container images, file mapping to output file...
2022/09/11 23:57:21 Printing name of distinct container image names...
Distinct container images:
--------------------------
node:lts
mcr.microsoft.com/azure-cli
node:lts-alpine
-
Full Scan across all private and public repositories in organization
$ go run main.go --org=manisbindra --fileName=Dockerfile --generateOutputFile --outputFile="./tempfile.json" --noOfWorkers=10 --ghToken=$GITHUB_TOKEN
2022/09/12 00:05:13 fileCount: 18
2022/09/12 00:05:13 All workers started...
2022/09/12 00:05:13 All files added for processing to worker queue...
2022/09/12 00:05:14 All files processed...
2022/09/12 00:05:14 All workers stopped...
2022/09/12 00:05:14 Writing container images, file mapping to output file...
2022/09/12 00:05:14 Printing name of distinct container image names...
Distinct container images:
--------------------------
mcr.microsoft.com/dotnet/sdk:5.0
golang:1.15
scratch
openjdk:17-slim
mcr.microsoft.com/dotnet/sdk:6.0
mcr.microsoft.com/dotnet/aspnet:6.0-alpine
ghcr.io/cse-labs/webvalidate:latest
nginx
node:lts
ghcr.io/cse-labs/k3d:latest
openjdk:8-jdk-alpine
alpine:latest
golang:1.10-stretch
openjdk:18-jdk-alpine3.14
mcr.microsoft.com/vscode/devcontainers/universal:1-focal
mcr.microsoft.com/azure-cli
python:2.7-slim
openjdk:18-slim-buster
node:lts-alpine
maven:3-openjdk-17-slim
mcr.microsoft.com/dotnet/aspnet:5.0
gcr.io/distroless/static:nonroot
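Since the output file is meant to feed other tools, the following Go sketch shows one way a consumer might read it. It assumes only the JSON shape documented for outputFile (image name mapped to a list of file URLs). The function names are hypothetical, and redactURL rests on the further assumption that a token embedded in a download URL sits in the query string or the userinfo part; check your own output file before relying on it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"sort"
)

// parseOutput decodes the output file contents: image name -> file URLs.
func parseOutput(data []byte) (map[string][]string, error) {
	var mapping map[string][]string
	if err := json.Unmarshal(data, &mapping); err != nil {
		return nil, err
	}
	return mapping, nil
}

// imageNames returns the distinct image names in sorted order.
func imageNames(mapping map[string][]string) []string {
	names := make([]string, 0, len(mapping))
	for name := range mapping {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// redactURL drops the query string, fragment, and userinfo from a URL.
// Assumption: this is where a token would be embedded.
func redactURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return "[unparseable URL]" // never echo a URL we could not clean
	}
	u.RawQuery = ""
	u.Fragment = ""
	u.User = nil
	return u.String()
}

func main() {
	// In practice the bytes would come from os.ReadFile("./tempfile.json").
	data := []byte(`{
	  "alpine:latest": ["https://example.com/org/repo/sha/Dockerfile?token=SECRET"],
	  "golang:1.10-stretch": ["https://example.com/org/repo/sha/Dockerfile"]
	}`)

	mapping, err := parseOutput(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, name := range imageNames(mapping) {
		fmt.Println(name)
		for _, file := range mapping[name] {
			fmt.Println("  used in:", redactURL(file))
		}
	}
}
```

The sorted image names could then be passed one at a time to a scanner, for example as the argument to trivy image.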