
Primary language: C++. License: Apache License 2.0 (Apache-2.0).

scip-clang: SCIP indexer for C and C++ (Status: Beta)

scip-clang is a precise code indexer based on Clang 16, which supports cross-repository code navigation for C, C++ and CUDA in Sourcegraph.

Here are some code navigation examples:

[Screenshot: Boost cross-repository Find References]
[Screenshot: Chromium code navigation]


Supported Platforms

Binary releases are available for x86_64 Linux (glibc 2.16 or newer) and x86_64 macOS (supported on arm64 macOS via Rosetta).

We're exploring Windows support.

Codebases that are routinely compiled with GCC, Clang, or both are supported. For codebases built exclusively with GCC, compatibility should be as good as Clang's own compatibility with GCC (i.e. most features should work, with graceful degradation for features that don't).

Extra requirements for indexing CUDA

When indexing CUDA code, an installation of Clang is required (using your OS package manager or otherwise), and the clang executable must be available on PATH, so that Clang's CUDA-related headers can be found. We recommend Clang 16 or newer, but in our testing, headers from Clang 14 also work.

The CUDA SDK must also be installed.

Supported Build Systems

scip-clang currently supports indexing using a JSON compilation database. CMake, Bazel and Meson support emitting this format for compatibility with clang-based tooling. Projects which use Make or other build systems may be able to use Bear to intercept compilation commands and generate a compilation database.

We're interested in exploring more native Bazel support in the future.

The use of pre-compiled headers is not supported, as the format of pre-compiled headers varies across compilers and individual compiler versions.

Quick Start

The easiest way to use scip-clang, once you have a JSON compilation database, is to invoke scip-clang from the project root like so:

scip-clang --compdb-path=path/to/compile_commands.json

WARNING: You must invoke scip-clang from the project root, not from a subdirectory, even when you only want to index a subdirectory. If you only want to index a subdirectory, filter out unnecessary entries in the compilation database.

If you see any errors, see the Troubleshooting section.

If all goes well, indexing will generate a file index.scip which can be uploaded to a Sourcegraph instance using src-cli v4.5 or newer.

# See https://docs.sourcegraph.com/cli/references/code-intel/upload
# Make sure to authenticate earlier or provide an access token
src code-intel upload -file=index.scip
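
Before uploading, it can help to confirm that the index was actually written and that credentials are in place. The following is a minimal POSIX sh sketch, not part of scip-clang or src-cli: the function name ready_to_upload is hypothetical, and it assumes you authenticate through the SRC_ENDPOINT and SRC_ACCESS_TOKEN environment variables that src-cli reads.

```shell
# Hypothetical helper: succeed only if the index file exists and is non-empty,
# and the src-cli environment variables (assumed auth method) are set.
ready_to_upload() {
  scip_index=${1:-index.scip}
  if [ ! -s "$scip_index" ]; then
    echo "missing or empty index: $scip_index" >&2
    return 1
  fi
  if [ -z "${SRC_ENDPOINT:-}" ] || [ -z "${SRC_ACCESS_TOKEN:-}" ]; then
    echo "SRC_ENDPOINT and SRC_ACCESS_TOKEN must both be set" >&2
    return 1
  fi
  return 0
}

# Example:
#   ready_to_upload index.scip && src code-intel upload -file=index.scip
```

If you authenticate some other way, drop the environment variable check and keep only the file check.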

See the Usage section for step-by-step instructions.

System Requirements

  1. About 2MB of temporary space for every translation unit (TU) in the compilation database. To estimate the total:
    echo "$(perl -e "print $(jq 'length' build/compile_commands.json) / 512.0") GB"
  2. On Linux, about 2MB of space in /dev/shm per core (check with df -h /dev/shm). This can be an issue in particular when using Docker on a machine with a high core count, as the default size of /dev/shm in Docker is 64MB. See also: how to troubleshoot low disk space for IPC.
  3. 2GB RAM per core is generally sufficient.
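
The three rules of thumb above can be combined into one rough estimate. This is a sketch in POSIX sh under stated assumptions: the function name estimate_requirements is hypothetical, the TU count and core count are passed in by the caller, and the figures are simply the per-TU and per-core numbers from the list, not measured values.

```shell
# Hypothetical helper: print rough resource needs for a given number of
# TUs (compilation database entries) and cores used for indexing.
estimate_requirements() {
  est_tu_count=$1
  est_cores=$2
  echo "temp_space_mb=$((est_tu_count * 2))"  # about 2MB of temporary space per TU
  echo "dev_shm_mb=$((est_cores * 2))"        # about 2MB of /dev/shm per core (Linux only)
  echo "ram_gb=$((est_cores * 2))"            # about 2GB of RAM per core
}

# Example (assumes jq and nproc are installed; on macOS, use
# "sysctl -n hw.ncpu" instead of nproc):
#   estimate_requirements "$(jq 'length' build/compile_commands.json)" "$(nproc)"
```

If dev_shm_mb comes out above 64 and you are indexing inside Docker, the default /dev/shm size will likely be too small.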

Usage

Generating a compilation database

  • CMake: Add -DCMAKE_EXPORT_COMPILE_COMMANDS=ON to the cmake invocation. For typical projects, the overall invocation will look like:

    cmake -B build -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
  • Bazel: Use either hedronvision/bazel-compile-commands-extractor or grailbio/bazel-compilation-database. Caveat: The grailbio generator sometimes accidentally adds unexpanded Make variables in compilation commands, so you may need to remove them as a preprocessing step, before invoking scip-clang.

  • Meson: Use the Ninja backend, which generates a compilation database.

  • Nix + Make: When using Make under Nix, in our testing, the compilation database generated by Bear (recommended below) omits some flags needed to find headers from libc and libstdc++/libc++. Using mini_compile_commands instead avoids that.

  • Make or other build systems: Use Bear to wrap a build system invocation that builds all the code. For example:

    bear -- make all

    In our testing on Linux, Bear works with Boost's B2 build system as well.

    Some other tools which may work include:

    • compiledb (Linux, macOS, Windows): For Make-style systems, reportedly faster than Bear as it doesn't require a clean build.
    • compile-db-gen (Linux): Uses strace.
    • clade (Linux, macOS, partial Windows support).

    We have not tested any of these.

The official Clang docs may also have additional suggestions for generating a compilation database.
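
For the Bazel caveat above, a quick way to see whether a generated compilation database still contains unexpanded Make variables is to search it for $(NAME)-style tokens. This is a minimal sketch: the function name find_unexpanded_make_vars is hypothetical, and the pattern is an assumption about what such variables look like, so review the matches by hand before removing anything.

```shell
# Hypothetical helper: list the distinct $(NAME)-style tokens found in a
# compilation database. Prints nothing if there are none.
find_unexpanded_make_vars() {
  grep -o '\$([A-Za-z_][A-Za-z0-9_]*)' "$1" | sort -u
}

# Example:
#   find_unexpanded_make_vars compile_commands.json
```

An empty result suggests no preprocessing is needed; otherwise each listed token is a candidate for removal before invoking scip-clang.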

Building code

Large projects typically use various forms of code generation. scip-clang re-runs type-checking, so it needs access to generated code. This means that scip-clang should preferably run after building compilation artifacts.

Initial scip-clang testing

For large codebases, we recommend first testing scip-clang on a subset of a compilation database with diagnostics turned on. For example:

# Using jq (https://github.com/stedolan/jq)
jq '.[0:5]' build/compile_commands.json > build/small_compdb.json
# Invoke scip-clang from the project root
scip-clang --compdb-path=build/small_compdb.json --show-compiler-diagnostics

WARNING: You must invoke scip-clang from the project root, not from a subdirectory, even when you only want to index a subdirectory. If you only want to index a subdirectory, filter out unnecessary entries in the compilation database.
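
Filtering the compilation database down to one subdirectory can be done with jq. The sketch below assumes that each entry's "file" field contains the subdirectory path as a substring, which holds for both relative and absolute paths but may also match unrelated paths with the same fragment. The function name filter_compdb_by_dir and the src/parser/ path in the example are hypothetical.

```shell
# Hypothetical helper: keep only the entries whose "file" path contains
# the given subdirectory, and print the filtered database to stdout.
# Usage: filter_compdb_by_dir <subdir> <input.json>
filter_compdb_by_dir() {
  jq --arg dir "$1" '[.[] | select(.file | contains($dir))]' "$2"
}

# Example (still run from the project root):
#   filter_compdb_by_dir "src/parser/" build/compile_commands.json > build/parser_compdb.json
#   scip-clang --compdb-path=build/parser_compdb.json
```

Including the trailing slash in the subdirectory argument reduces accidental matches against sibling directories with a shared prefix.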

Known diagnostics when indexing CUDA
  1. If you see an error related to the texture template, the most likely cause is that the installed Clang is older than Clang 16. See llvm/llvm-project#61340
  2. If you see any errors related to GCC headers, that's a known issue. It shouldn't affect indexer correctness.
  3. If you see an error related to an unknown flag, you can generally ignore it. scip-clang skips all known NVCC-specific flags as they generally don't affect the semantics of code navigation. We can easily add more flags to skip here if needed.

If there are errors about missing system or SDK headers, install the relevant system dependencies.

If there are errors about missing generated headers, make sure to build your code first.

If there are any other errors, such as standard library or platform headers not being found, please report an issue.

Running scip-clang on a single repo

scip-clang --compdb-path=build/compile_commands.json

The --show-compiler-diagnostics flag is deliberately omitted here, since scip-clang is still able to index code in the presence of compiler errors, and any errors in headers will get repeated for each translation unit in which the header is included.

Setting up cross-repo code navigation

See the cross-repository setup docs.

Troubleshooting

See the Troubleshooting docs.

Reporting issues

Create a new GitHub issue with any relevant logs attached.

Sourcegraph customers may ask their Customer Engineers for help with filing an issue confidentially, as the log may contain sensitive information such as file names.
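
To have a log ready to attach, one option is to capture the indexer's output in a file. The following is a small POSIX sh sketch: the function name run_with_log and the log file name are hypothetical, and it simply redirects both stdout and stderr of whatever command it is given.

```shell
# Hypothetical helper: run a command, save its stdout and stderr to a log
# file, print the log afterwards, and return the command's exit status.
# Usage: run_with_log <logfile> <command> [args...]
run_with_log() {
  log_file=$1
  shift
  "$@" >"$log_file" 2>&1
  log_status=$?
  cat "$log_file"
  return "$log_status"
}

# Example:
#   run_with_log scip-clang.log scip-clang --compdb-path=build/compile_commands.json --show-compiler-diagnostics
```

Output only appears once the command finishes, so for long indexing runs you may prefer to follow the log file from another terminal. Review the log for sensitive paths before attaching it to a public issue.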

Documentation

Run scip-clang --help to see documentation for different flags.

A CHANGELOG is also available.

Contributing