sourcegraph-scripts

Scripts for Sourcegraph search results. Useful for static analysis <3


Scripts for scraping Sourcegraph search results. The script scripts/json-to-raw-url.sh extracts raw GitHub file URLs from the JSON output of src-cli, and scripts/github_downloader.py downloads each of those files from GitHub.

Example Usage

$ src search -stream -json '${{github.event.comment.body}} file:.github/workflows COUNT:100000' | ./scripts/json-to-raw-url.sh | python3 scripts/github_downloader.py
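To illustrate the URL-extraction step in the pipeline above, here is a minimal Python sketch of turning one line of streamed JSON into a raw GitHub URL. It is not the code in scripts/json-to-raw-url.sh. The function name raw_url_from_event is hypothetical, and the field names "repository", "path" and "commit" are assumptions about the src-cli output rather than something this README confirms.

```python
import json
from urllib.parse import quote


def raw_url_from_event(line):
    """Return a raw.githubusercontent.com URL for one JSON line, or None.

    Assumption: each match event carries "repository" (for example
    "github.com/owner/repo"), "path", and optionally "commit".
    """
    event = json.loads(line)
    repo = event.get("repository", "")
    path = event.get("path")
    # Skip events that are not file matches or are not hosted on GitHub.
    if not repo.startswith("github.com/") or not path:
        return None
    owner_repo = repo[len("github.com/"):]
    # Fall back to the default branch when no commit is present.
    ref = event.get("commit") or "HEAD"
    # quote() leaves "/" intact, so the directory structure is preserved.
    return f"https://raw.githubusercontent.com/{owner_repo}/{ref}/{quote(path)}"
```

Pinning the URL to the commit from the search result, when one is available, keeps the downloaded file identical to the one Sourcegraph matched, even if the default branch has since changed.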

Why is this so useful?

This allows security researchers to run static analysis tools across a large number of GitHub repositories fetched via Sourcegraph. Here's an example of running Semgrep on the downloaded files:

$ semgrep --config "p/github-actions" out

The output will include full repository file paths, allowing us to easily identify the vulnerable repositories.
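Because the paths identify the repository, findings can be grouped per repository. The sketch below does this for a report produced with Semgrep's --json flag. It assumes the downloader writes files as out/<owner>/<repo>/<file path>, which this README does not state, so adjust the path handling to the actual layout. The function name findings_by_repo is hypothetical.

```python
import json
from collections import defaultdict
from pathlib import PurePosixPath


def findings_by_repo(semgrep_json, root="out"):
    """Group Semgrep rule IDs by "owner/repo".

    Assumption: files live under <root>/<owner>/<repo>/<file path>.
    """
    report = json.loads(semgrep_json)
    grouped = defaultdict(list)
    for result in report.get("results", []):
        parts = PurePosixPath(result["path"]).parts
        # Ignore paths that do not match the assumed layout.
        if len(parts) < 4 or parts[0] != root:
            continue
        grouped[f"{parts[1]}/{parts[2]}"].append(result["check_id"])
    return dict(grouped)
```

For example, after running semgrep --config "p/github-actions" --json out > findings.json, passing the contents of findings.json to findings_by_repo returns a dictionary that maps each repository to the rules it triggered.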

How to install

$ git clone https://github.com/KarimPwnz/sourcegraph-scripts.git
$ cd sourcegraph-scripts
$ pip install -r requirements.txt