Static Builds Of Libclang 13

Table Of Contents

Introduction

This package consists of a set of CMake scripts that download and compile libclang into a single static archive containing all LLVM and third party dependencies so applications which link against it can be easily deployed. The build also produces a bundled shared library which allows faster iteration during development time by avoiding link time slowdowns. The build takes only about 7 minutes on my modestly provisioned 7 year old i5/16GB RAM Thinkpad instead of 5-7 hours for a normal from-scratch LLVM build.

I tested it on Manjaro Linux with GCC 11.1.0, macOS Mojave and Windows 10 using the MS Visual C++ toolchain. As an aside the Windows build is only possible because the Zig project generously provides prebuilt statically linked LLVM libraries for Windows, if you are benefiting please consider contributing, it’s immensely annoying and time-consuming to build LLVM and clang from scratch and we should support Andy Kelley for saving us the trouble.

A small easy-to-build demo app clang_visitor that parses some C++ is also included to help you get started. And to convince you the build works as advertised below are the minimal runtime dependencies on each of the supported platforms when the app is linked against the static archive:

  • Linux
> $ ldd ./clang_visitor
	linux-vdso.so.1 (0x00007ffce8bfb000)
	libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fdb9954a000)
	libm.so.6 => /usr/lib/libm.so.6 (0x00007fdb99404000)
	libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fdb993ff000)
	libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fdb993dd000)
	libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fdb993c3000)
	libc.so.6 => /usr/lib/libc.so.6 (0x00007fdb991fd000)
	/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fdb9ea9b000)
  • macOS Mojave:
> otool -L clang_visitor
clang_visitor:
	/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
	/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 400.9.4)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
  • … and Windows 10
> dumpbin.exe /DEPENDENTS clang_visitor.exe
Microsoft (R) COFF/PE Dumper Version 14.25.28614.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file clang_visitor_static.exe

File Type: EXECUTABLE IMAGE

  Image has the following dependencies:

    VERSION.dll
    KERNEL32.dll
    SHELL32.dll
    ole32.dll
    OLEAUT32.dll
    ADVAPI32.dll
    VCRUNTIME140.dll
    VCRUNTIME140_1.dll
    api-ms-win-crt-stdio-l1-1-0.dll
    api-ms-win-crt-runtime-l1-1-0.dll
    api-ms-win-crt-heap-l1-1-0.dll
    api-ms-win-crt-utility-l1-1-0.dll
    api-ms-win-crt-environment-l1-1-0.dll
    api-ms-win-crt-string-l1-1-0.dll
    api-ms-win-crt-convert-l1-1-0.dll
    api-ms-win-crt-time-l1-1-0.dll
    api-ms-win-crt-math-l1-1-0.dll
    api-ms-win-crt-locale-l1-1-0.dll
    api-ms-win-crt-filesystem-l1-1-0.dll

  Summary

      2C4000 .data
      123000 .pdata
     15CB000 .rdata
       7D000 .reloc
        1000 .rsrc
     2105000 .text

Motivation

Currently the best way to statically analyze C and C++ source is libclang. Unfortunately applications built against libclang aren’t very portable or easy to deploy because of dependencies on third party libraries like ncurses and z3 and the libclang shared library itself. Package managers do a decent job of orchestrating the install but it’s still hard to deploy an application that’s pinned to a specific version of libclang or to ship binaries between Linux distributions. There’s always containers or Nix or Guix but I think asking people to get up to speed on purely functional package managers or have Docker running just to use libclang apps is a non-starter. With this package all you need is CMake on macOS and Linux and additionally Visual Studio Build Tools on Windows.

Getting Started

Below are some instructions on getting up and running on Linux, macOS and Windows 10. Everything beyond that is the full build as a literate program and only interesting if you care about implementation details. If you just want to use this package it can be safely skipped. Enjoy!

Linux and macOS

Building

First make sure you have a cmake version greater that 3.13:

> cmake --version
cmake version 3.17.0

CMake suite maintained and supported by Kitware (kitware.com/cmake).

and gcc~/~g++ > 10:

> g++ --version
g++ (Debian 10.2.1-6) 10.2.1 20210110
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Clone this repo, create a build directory inside it and run the build and install:

> git clone https://github.com/deech/libclang-static-build
> cd libclang-static-build
> mkdir build; cd build
> cmake .. -DCMAKE_INSTALL_PREFIX=..
> make install

The install step copies all the artifacts to the directory into which you cloned this repo just above the build directory. Nothing else on the system is touched.

Once it’s done installing there will be 3 new directories in repo directory, lib, include and share. The first contains a big libclang static archive with all dependencies bundled and shared versions of those libraries for quicker compilation during development, the second contains the libclang headers and the third has two directories share/doc/examples/static and share/doc/examples/shared both of which contain a couple of identical small examples that shows how to create static and shared libclang apps.

Running The Examples

The two example directories share/doc/examples/static and share/doc/examples/shared both of which contain an identical small example program that walks a C++ header file containing an enum, the first builds a large (100MB+) statically linked executable with minimal dependencies and the second a much smaller shared executable which depends libclang, ncurses and z3 at runtime. To build them run make in their respective directories:

> cd libclang-static-build
> cd doc/example/static
> make
> ./clang_visitor
Cursor spelling, kind: __ENUM__, macro definition
Cursor spelling, kind: Enum, EnumDecl
Cursor spelling, kind: RED, EnumConstantDecl
Cursor spelling, kind: , UnexposedExpr
Cursor spelling, kind: , IntegerLiteral
Cursor spelling, kind: , IntegerLiteral
Cursor spelling, kind: GREEN, EnumConstantDecl
Cursor spelling, kind: , UnexposedExpr
Cursor spelling, kind: , BinaryOperator
Cursor spelling, kind: , BinaryOperator
Cursor spelling, kind: , IntegerLiteral
Cursor spelling, kind: , IntegerLiteral
Cursor spelling, kind: BLUE, EnumConstantDecl
Cursor spelling, kind: , UnexposedExpr
Cursor spelling, kind: , BinaryOperator
Cursor spelling, kind: , BinaryOperator
Cursor spelling, kind: RED, DeclRefExpr
Cursor spelling, kind: GREEN, DeclRefExpr

Windows 10

Building

First install CMake and Build Tools For Visual Studio 2019, then clone this repo, create a build directory inside it, run the build and install:

> git.exe clone https://github.com/deech/libclang-static-build
> cd libclang-static-build
> mkdir build
> cd build
> cmake.exe .. -Thost=x64 -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=.. -DCMAKE_BUILD_TYPE=Release -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD="AVR" -DLLVM_ENABLE_LIBXML2=OFF -DLLVM_USE_CRT_RELEASE=MT
> "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\MSBuild\Current\Bin\MSBuild.exe" /m -p:Configuration=Release INSTALL.vcxproj

At the final step I needed to give the full path to MSBuild.exe even though I asked MS Build Tools to add it to the PATH so I reproduced it here so you don’t have to go hunt it down.

There should now be 3 new directories in the repo directory, lib, include, and share. The first contains clang_static_bundled.lib which is a 400MB static archive, include has all the headers needed to build libclang apps and share has a single statically linked demo app.

Running The Example

The example directory share/doc/examples/static contains a small demo that parses out a C++ enum from a header file. To build it:

> cd libclang-static-build\share\doc\examples\static
> mkdir build
> cd build
> cmake.exe -G "Visual Studio 16 2019" .. -DCMAKE_INSTALL_PREFIX=..
> "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\MSBuild\Current\Bin\MSBuild.exe" /m -p:Configuration=Release INSTALL.vcxproj

The directory above the build directory now contains bin which contains the example app clang_visitor.exe:

>cd ..\bin
>clang_visitor_static.exe
Cursor spelling, kind: __ENUM__, macro definition
Cursor spelling, kind: Enum, EnumDecl
Cursor spelling, kind: RED, EnumConstantDecl
Cursor spelling, kind: , IntegerLiteral
Cursor spelling, kind: GREEN, EnumConstantDecl
Cursor spelling, kind: , BinaryOperator
Cursor spelling, kind: , IntegerLiteral
Cursor spelling, kind: , IntegerLiteral
Cursor spelling, kind: BLUE, EnumConstantDecl
Cursor spelling, kind: , BinaryOperator
Cursor spelling, kind: RED, DeclRefExpr
Cursor spelling, kind: GREEN, DeclRefExpr

Implementation

At a high level to build a bundled shared and static library I grab the prebuilt clang+LLVM static archives and libclang sources, build the latter from scratch locally and then bundle it along with all the prebuilt archives into one large library that an executable can link against.

Additionally on Linux and macOS libclang depends z3 and ncurses. While the former has official prebuilt releases the latter does not and so we have to build from source locally. Both are then folded into the resulting library.

On Windows 10 the situation is nicer because the Zig project provides prebuilt LLVM archives with no dependency on z3 so the build actually goes quite a bit faster than the other platforms. Do support Zig if you can.

And finally we generate a small demo that traverses a C++ header; on Linux and macOS it’s a standard Make project and a CMake project on Windows.

Preamble

cmake_minimum_required(VERSION 3.13)
project(libclang-static-build)
list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake/modules")
set(LIBCLANG_EXAMPLES "${CMAKE_CURRENT_SOURCE_DIR}/cmake/examples")
if(NOT (MSVC OR APPLE OR UNIX))
  message(FATAL_ERROR "This build currenly works only with macOS, Microsoft Visual Studio and Linux.")
endif()
if(APPLE OR UNIX)
  find_program(CMAKE_LIBTOOL libtool)
  if(NOT CMAKE_LIBTOOL)
    message(FATAL_ERROR "'libtool' is necessary for building static archives")
  endif()
  include(LinuxMacosBuild)
else()
  include(MSVCBuild)
endif()

Linux and macOS

Clang and NCurses Download URLs

“Reproducibility” is achieved by hard-coding the URLs from which to get the dependencies, I’m sure there’s more principled ways but this works ok for now.

if(APPLE)
  set(LIBCLANG_PREBUILT_URL https://github.com/llvm/llvm-project/releases/download/llvmorg-13.0.0/clang+llvm-13.0.0-x86_64-apple-darwin.tar.xz)
else()
  set(LIBCLANG_PREBUILT_URL https://github.com/llvm/llvm-project/releases/download/llvmorg-13.0.0/clang+llvm-13.0.0-x86_64-linux-gnu-ubuntu-20.04.tar.xz)
endif()
set(CLANG_SOURCES_URL https://github.com/llvm/llvm-project/releases/download/llvmorg-13.0.0/clang-13.0.0.src.tar.xz)
set(NCURSES_SOURCES_URL https://ftp.gnu.org/pub/gnu/ncurses/ncurses-6.2.tar.gz)
if(APPLE)
  set(Z3_PREBUILT_URL https://github.com/Z3Prover/z3/releases/download/z3-4.8.7/z3-4.8.7-x64-osx-10.14.6.zip)
else()
  set(Z3_PREBUILT_URL https://github.com/Z3Prover/z3/releases/download/z3-4.8.7/z3-4.8.7-x64-ubuntu-16.04.zip)
endif()

Download Libclang, NCurses and Z3

The dependencies are then downloaded and unpacked at build time

include(Download)
message(STATUS "Downloading ncurses sources, prebuilt z3 & prebuilt libclang with sources; this is ~500MB, please be patient, 'libclang_prebuilt' will take several minutes ...")
set(NCURSES_SOURCE_DIR)
download(ncurses_sources ${NCURSES_SOURCES_URL} NCURSES_DOWNLOAD_DIR)
set(LIBCLANG_SOURCES_DIR)
download(clang_sources ${CLANG_SOURCES_URL} LIBCLANG_SOURCES_DIR)
set(Z3_PREBUILT_DIR)
download(z3_prebuilt ${Z3_PREBUILT_URL} Z3_PREBUILT_DIR)
set(LIBCLANG_PREBUILT_DIR)
download(libclang_prebuilt ${LIBCLANG_PREBUILT_URL} LIBCLANG_PREBUILT_DIR)

Configure NCurses as an external project

Since ncurses does not provide prebuilt static archives we build it locally using a recipe stolen from Arch scripts:

include(ExternalProject)
ExternalProject_Add(ncurses
  SOURCE_DIR ${NCURSES_DOWNLOAD_DIR}
  CONFIGURE_COMMAND <SOURCE_DIR>/configure --enable-rpath --prefix=${CMAKE_INSTALL_PREFIX} --with-shared --with-static --with-normal --without-debug --without-ada --enable-widec --disable-pc-files --with-cxx-binding --without-cxx-shared --with-abi-version=5
  BUILD_COMMAND make
  INSTALL_COMMAND ""
  )

Setup CMake Paths And Includes

This part is why I used CMake for this project in the first place, the LLVM project provides CMake scripts that contain useful functions and macros which take care of the nitty gritty C++ compiler and inclusion flags that allow building libclang from source, without them this project would have been impossible.

list(APPEND CMAKE_MODULE_PATH "${LIBCLANG_PREBUILT_DIR}/lib/cmake/clang")
list(APPEND CMAKE_MODULE_PATH "${LIBCLANG_PREBUILT_DIR}/lib/cmake/llvm")
list(APPEND CMAKE_MODULE_PATH "${LIBCLANG_SOURCES_DIR}/cmake/modules")
include(LibClangBuild)
include(HandleLLVMOptions)
include(AddLLVM)
include(AddClang)
include(GatherArchives)

Build A Static Libclang

For some reason macOS needs to be told to use C++14 and it doesn’t hurt to include it for Linux as well:

set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-rtti")

get_libclang_sources_and_headers populates the last three arguments with absolute paths to headers, libclang sources and the included LLVM archives.

get_libclang_sources_and_headers(
  ${LIBCLANG_SOURCES_DIR}
  ${LIBCLANG_PREBUILT_DIR}
  LIBCLANG_SOURCES
  LIBCLANG_ADDITIONAL_HEADERS
  LIBCLANG_PREBUILT_LIBS
  )

Now we have to tell the CMake recipe where to find the shared libraries for the z3 and ncurses dependencies:

include_directories(${LIBCLANG_PREBUILT_DIR}/include)

ExternalProject_Get_Property(ncurses BINARY_DIR)
set(NCURSES_BINARY_DIR ${BINARY_DIR})
set(NCURSES_SHARED_LIB)
if(APPLE)
  set(NCURSES_SHARED_LIB ${NCURSES_BINARY_DIR}/lib/libncursesw.dylib ${NCURSES_BINARY_DIR}/lib/libncursesw.5.dylib)
else()
  set(NCURSES_SHARED_LIB ${NCURSES_BINARY_DIR}/lib/libncursesw.so ${NCURSES_BINARY_DIR}/lib/libncursesw.so.5 ${NCURSES_BINARY_DIR}/lib/libncursesw.so.5.9)
endif()
unset(BINARY_DIR)

if(APPLE)
  set(Z3_SHARED_LIB ${Z3_PREBUILT_DIR}/bin/libz3.dylib)
else()
  set(Z3_SHARED_LIB ${Z3_PREBUILT_DIR}/bin/libz3.so)
endif()

add_clang_library is a libclang provided CMake function that does all the hard work of generating Makefiles to build a clang and LLVM based library or executable. It’s used twice, once to generate a static archive and once more for a shared library. I’m building it twice because building with both SHARED and STATIC in a single call seems to produce objects compiled with -fPIC so building the shared library fails. I’m probably doing something wrong but this seems to work for now.

add_clang_library(libclang
  SHARED
  OUTPUT_NAME clang
  ${LIBCLANG_SOURCES}
  ADDITIONAL_HEADERS ${LIBCLANG_ADDITIONAL_HEADERS}
  LINK_LIBS
  ${LIBCLANG_PREBUILT_LIBS} ${NCURSES_SHARED_LIB} dl pthread z
  LINK_COMPONENTS ${LLVM_TARGETS_TO_BUILD}
  DEPENDS ncurses
  )

add_clang_library(libclang_static
  STATIC
  OUTPUT_NAME clang_static
  ${LIBCLANG_SOURCES}
  ADDITIONAL_HEADERS ${LIBCLANG_ADDITIONAL_HEADERS}
  DEPENDS ncurses
  )

set_target_properties(libclang PROPERTIES VERSION 12)

This bit is pretty much copy-pasta’ed from the CMake build scripts that come with clang sources probably doesn’t do much.

if(APPLE)
  set(LIBCLANG_LINK_FLAGS " -Wl,-compatibility_version -Wl,1")
  set_property(TARGET libclang APPEND_STRING PROPERTY
               LINK_FLAGS ${LIBCLANG_LINK_FLAGS})
else()
  set_target_properties(libclang
    PROPERTIES
    DEFINE_SYMBOL _CINDEX_LIB_)
endif()

On MacOS libtool can reliably nest multiple archives into one by simply passing them in as arguments.

Unfortunately Linux is more complicated. Due to limitations of ar I have to make a thin archive, a static archive which doesn’t actually contain the other archives but references them, by calling ar with the T (for thin) argument.

Then I copy all archives I need to bundle into one directory ALL_ARCHIVES_DIRECTORY and build the thin archive there because a thin archive finds referenced archives by where they were relative to it when it was created. At install time the thin archives are the referenced archives are copied the same final location so the relative paths are intact and apps can transparently link with the thin archive.

The call to gather_archives populates ALL_ARCHIVES_DIRECTORY with a hard-coded path local to the CMake build directory, ALL_ARCHIVE_NAMES with a list of the archives basenames (just the filename without parents) so they can be passed to ar and ALL_ARCHIVE_PATHS with a list of archives with their full paths which is used at install time to copy them to the same location. This is not good code.

if(APPLE)
  add_custom_target(
    libclang_bundled ALL
    COMMAND ${CMAKE_LIBTOOL} -static -o libclang_bundled.a
              ${CMAKE_CURRENT_BINARY_DIR}/libclang_static.a
              ${LIBCLANG_PREBUILT_LIBS}
              ${NCURSES_BINARY_DIR}/lib/libncursesw.a
              ${Z3_PREBUILT_DIR}/bin/libz3.a
    DEPENDS ncurses libclang libclang_static
  )
else()
  gatherArchives(
    ALL_ARCHIVES_DIRECTORY
    ALL_ARCHIVE_NAMES
    ALL_ARCHIVE_PATHS
    ${CMAKE_CURRENT_BINARY_DIR}/libclang_static.a
    ${LIBCLANG_PREBUILT_LIBS}
    ${NCURSES_BINARY_DIR}/lib/libncursesw.a
    ${Z3_PREBUILT_DIR}/bin/libz3.a
  )
  add_custom_target(
    gather_archives ALL
    COMMAND ${CMAKE_COMMAND} -E make_directory ${ALL_ARCHIVES_DIRECTORY}
    COMMAND ${CMAKE_COMMAND} -E copy
      ${CMAKE_CURRENT_BINARY_DIR}/libclang_static.a
      ${LIBCLANG_PREBUILT_LIBS}
      ${NCURSES_BINARY_DIR}/lib/libncursesw.a
      ${Z3_PREBUILT_DIR}/bin/libz3.a
      ${ALL_ARCHIVES_DIRECTORY}
    DEPENDS ncurses libclang libclang_static
  )
  add_custom_target(
    libclang_bundled ALL
    COMMAND ${CMAKE_AR} crsT libclang_bundled.a ${ALL_ARCHIVE_NAMES}
    WORKING_DIRECTORY ${ALL_ARCHIVES_DIRECTORY}
    DEPENDS gather_archives
  )
endif()

Now it’s time to create the example app, Makefiles are generated using the Makefile templates covered in Static Makefile and Shared Makefile and they are all copied into an overall examples directory in CMAKE_CURRENT_BINARY_DIR, the CMake build directory.

That CMAKE_OSX_SYSROOT thing is simply so libclang headers can find the time.h on macOS. I’m really not sure why it isn’t in the standard location.

set(MAKEFILE_LIBCLANG_INCLUDE ${CMAKE_INSTALL_PREFIX}/include)
if(APPLE)
  set(MAKEFILE_LIBCLANG_INCLUDE "${MAKEFILE_LIBCLANG_INCLUDE} -I${CMAKE_OSX_SYSROOT}/usr/include")
endif()
set(MAKEFILE_LIBCLANG_LIBDIR ${CMAKE_INSTALL_PREFIX}/lib)

file(MAKE_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/examples/static)
configure_file(${LIBCLANG_EXAMPLES}/Makefile_static.in ${CMAKE_CURRENT_BINARY_DIR}/examples/static/Makefile)
if(APPLE)
  configure_file(${LIBCLANG_EXAMPLES}/Makefile_shared_macos.in ${CMAKE_CURRENT_BINARY_DIR}/examples/shared/Makefile)
else()
  configure_file(${LIBCLANG_EXAMPLES}/Makefile_shared.in ${CMAKE_CURRENT_BINARY_DIR}/examples/shared/Makefile)
endif()

file(COPY ${LIBCLANG_EXAMPLES}/clang_visitor.c DESTINATION ${CMAKE_CURRENT_BINARY_DIR}/examples/static)
file(COPY ${LIBCLANG_EXAMPLES}/sample.H DESTINATION ${CMAKE_CURRENT_BINARY_DIR}/examples/static)
file(COPY ${LIBCLANG_EXAMPLES}/clang_visitor.c DESTINATION ${CMAKE_CURRENT_BINARY_DIR}/examples/shared)
file(COPY ${LIBCLANG_EXAMPLES}/sample.H DESTINATION ${CMAKE_CURRENT_BINARY_DIR}/examples/shared)

Now everything is installed relative to the CMAKE_INSTALL_PREFIX, the libraries under lib, clang headers under include/clang-c and examples under share/doc.

if(APPLE)
  set(LIBCLANG_INSTALL_LIBS
    ${CMAKE_CURRENT_BINARY_DIR}/libclang_bundled.a
    ${Z3_PREBUILT_DIR}/bin/libz3.a
    ${Z3_SHARED_LIB}
    ${NCURSES_BINARY_DIR}/lib/libncursesw.a
    ${NCURSES_SHARED_LIB}
  )
else()
  set(LIBCLANG_INSTALL_LIBS
    ${ALL_ARCHIVES_DIRECTORY}/libclang_bundled.a
    ${ALL_ARCHIVE_PATHS}
    ${Z3_SHARED_LIB}
    ${NCURSES_SHARED_LIB}
  )
endif()

install(PROGRAMS ${LIBCLANG_INSTALL_LIBS} DESTINATION lib)
install(DIRECTORY ${LIBCLANG_PREBUILT_DIR}/include/clang-c DESTINATION include)
install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/examples DESTINATION share/doc)

Windows

The Windows build has the same overall idea as Linux/macOS it’s just different enough that it’s easier to start over than share code.

The initial bits are similar, this time the prebuilt LLVM is downloaded from the Zig project and it has all the dependencies built in so nothing more is needed except the clang sources.

set(LIBCLANG_PREBUILT_URL https://ziglang.org/deps/llvm+clang+lld-13.0.0-x86_64-windows-msvc-release-mt.tar.xz)
set(CLANG_SOURCES_URL https://github.com/llvm/llvm-project/releases/download/llvmorg-13.0.0/clang-13.0.0.src.tar.xz)

include(Download)
message(STATUS "Downloading prebuilt libclang with sources; this is ~500MB, please be patient, 'libclang_prebuilt' will take several minutes ...")
download(clang_sources ${CLANG_SOURCES_URL} LIBCLANG_SOURCES_DIR)
download(libclang_prebuilt ${LIBCLANG_PREBUILT_URL} LIBCLANG_PREBUILT_DIR)

Then I pull the the LLVM CMake scripts into scope just as before

list(APPEND CMAKE_MODULE_PATH "${LIBCLANG_PREBUILT_DIR}/lib/cmake/clang")
list(APPEND CMAKE_MODULE_PATH "${LIBCLANG_PREBUILT_DIR}/lib/cmake/llvm")
list(APPEND CMAKE_MODULE_PATH "${LIBCLANG_SOURCES_DIR}/cmake/modules")

set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
include(LibClangBuild)
include(HandleLLVMOptions)
include(AddLLVM)
include(AddClang)

get_libclang_sources_and_headers(
  ${LIBCLANG_SOURCES_DIR}
  ${LIBCLANG_PREBUILT_DIR}
  LIBCLANG_SOURCES
  LIBCLANG_ADDITIONAL_HEADERS
  LIBCLANG_PREBUILT_LIBS
  )
include_directories(${LIBCLANG_PREBUILT_DIR}/include)
add_clang_library(libclang
  SHARED
  STATIC
  OUTPUT_NAME clang
  ${LIBCLANG_SOURCES}
  LINK_LIBS ${LIBCLANG_PREBUILT_LIBS} Version
  ADDITIONAL_HEADERS ${LIBCLANG_ADDITIONAL_HEADERS}
  )

set_target_properties(libclang PROPERTIES VERSION 12)

This bit is important, without it every object file spews a inconsistent DLL linkage warning. More importantly for reasons I don’t understand, I have to do this as opposed to how the LLVM project does it: set_target_properties(libclang PROPERTIES DEFINE_SYMBOL _CINDEX_LIB_)

target_compile_definitions(obj.libclang PUBLIC "_CINDEX_LIB_")

While the clang build function provided by LLVM produces a static library called clang_static.lib any app that links against it also requires libclang.dll at runtime, so we have just created a 400MB static library that doesn’t do anything. I guess static libs that also carry a dependency on DLLs is a common idiom on Windows but defeats the purpose of this project.

However I found an intermediate static archive obj.libclang.lib that also gets generated and does what I want so I pass that to the archive tool instead. I’m sure there is a better way but this seems to work for now.

LIBCLANG_INSTALL_LIBS is used at install time to copy the fat archive to the right place.

find_program(lib_tool lib)
if(NOT lib_tool)
  get_filename_component(CXX_COMPILER_DIRECTORY "${CMAKE_CXX_COMPILER}" PATH)
  set(lib_tool "${CXX_COMPILER_DIRECTORY}/lib.exe")
endif()
set(AR_COMMAND ${lib_tool} /NOLOGO /OUT:${CMAKE_CURRENT_BINARY_DIR}/clang_static_bundled.lib "${CMAKE_CURRENT_BINARY_DIR}/obj.libclang.dir/Release/obj.libclang.lib" ${LIBCLANG_PREBUILT_LIBS})

add_custom_target(libclang_static_bundled ALL
  COMMAND ${AR_COMMAND}
  DEPENDS libclang
  BYPRODUCTS ${CMAKE_CURRENT_BINARY_DIR}/clang_static_bundled.lib
  )
set(LIBCLANG_INSTALL_LIBS ${CMAKE_CURRENT_BINARY_DIR}/clang_static_bundled.lib)

And now for the example demo, I fill out the template CMake build recipe, copy the sources and build scripts to an examples directory in the local build directory …

set(CMAKE_MSVC_LIB_DIR ${CMAKE_INSTALL_PREFIX}/lib)
set(CMAKE_MSVC_INCLUDE_DIR ${CMAKE_INSTALL_PREFIX}/include)
configure_file(${LIBCLANG_EXAMPLES}/CMakeLists.MSVC.in ${CMAKE_CURRENT_BINARY_DIR}/examples/static/CMakeLists.txt)
file(COPY ${LIBCLANG_EXAMPLES}/sample.H DESTINATION ${CMAKE_CURRENT_BINARY_DIR}/examples/static/bin)
file(COPY ${LIBCLANG_EXAMPLES}/clang_visitor.c DESTINATION ${CMAKE_CURRENT_BINARY_DIR}/examples/static)
file(COPY ${LIBCLANG_EXAMPLES}/README.txt DESTINATION ${CMAKE_CURRENT_BINARY_DIR}/examples/static)

… and install the library, clang headers and demos under CMAKE_INSTALL_PREFIX\{lib,include,share} respectively

install(PROGRAMS ${LIBCLANG_INSTALL_LIBS} DESTINATION lib)
install(DIRECTORY ${LIBCLANG_PREBUILT_DIR}/include/clang-c DESTINATION include)
install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/examples DESTINATION share/doc)

Other Helper Modules

Build Time Downloads (Download.cmake)

include(FetchContent)
function (download name url source_dir)
  FetchContent_Declare(${name} URL ${url})
  if(NOT ${name}_POPULATED)
    message(STATUS "* Downloading ${name} from ${url}")
    FetchContent_Populate(${name})
  endif()
  set(${source_dir} ${${name}_SOURCE_DIR} PARENT_SCOPE)
endfunction()

Libclang sources, headers and static libs (LibClangBuild.cmake)

These are the clang source files and headers needed to build libclang, most have been copied wholesale from the CMakeLists.txt provided with the project.

set(LIBCLANG_SOURCE_PATH tools/libclang)
set(LIBCLANG_INCLUDE_PATH include/clang-c)
set(LIBCLANG_SOURCE_FILES
  ARCMigrate.cpp
  BuildSystem.cpp
  CIndex.cpp
  CIndexCXX.cpp
  CIndexCodeCompletion.cpp
  CIndexDiagnostic.cpp
  CIndexHigh.cpp
  CIndexInclusionStack.cpp
  CIndexUSRs.cpp
  CIndexer.cpp
  CXComment.cpp
  CXCursor.cpp
  CXIndexDataConsumer.cpp
  CXCompilationDatabase.cpp
  CXLoadedDiagnostic.cpp
  CXSourceLocation.cpp
  CXStoredDiagnostic.cpp
  CXString.cpp
  CXType.cpp
  Indexing.cpp
  FatalErrorHandler.cpp
)
set(LIBCLANG_ADDITIONAL_HEADER_FILES
  CIndexDiagnostic.h
  CIndexer.h
  CXCursor.h
  CXLoadedDiagnostic.h
  CXSourceLocation.h
  CXString.h
  CXTranslationUnit.h
  CXType.h
  Index_Internal.h
)
set(LIBCLANG_INDEX_H Index.h)

This list of static archives to an libclang app needs to link against took some experimentation, apparently we need all them in this approximate order to link successfully, I have no idea why I just tried stuff until it worked.

set(LIBCLANG_LINK_LIBS
  clangAST
  clangBasic
  clangDriver
  clangFrontend
  clangIndex
  clangLex
  clangSema
  clangSerialization
  clangTooling
  clangARCMigrate
  LLVMAArch64CodeGen
  LLVMAArch64AsmParser
  LLVMAArch64Desc
  LLVMAArch64Disassembler
  LLVMAArch64Info
  LLVMAArch64Utils
  LLVMAMDGPUCodeGen
  LLVMAMDGPUAsmParser
  LLVMAMDGPUDesc
  LLVMAMDGPUDisassembler
  LLVMAMDGPUInfo
  LLVMAMDGPUUtils
  LLVMARMCodeGen
  LLVMARMAsmParser
  LLVMARMDesc
  LLVMARMDisassembler
  LLVMARMInfo
  LLVMARMUtils
  LLVMBPFCodeGen
  LLVMBPFAsmParser
  LLVMBPFDesc
  LLVMBPFDisassembler
  LLVMBPFInfo
  LLVMHexagonCodeGen
  LLVMHexagonAsmParser
  LLVMHexagonDesc
  LLVMHexagonDisassembler
  LLVMHexagonInfo
  LLVMLanaiCodeGen
  LLVMLanaiAsmParser
  LLVMLanaiDesc
  LLVMLanaiDisassembler
  LLVMLanaiInfo
  LLVMMipsCodeGen
  LLVMMipsAsmParser
  LLVMMipsDesc
  LLVMMipsDisassembler
  LLVMMipsInfo
  LLVMMSP430CodeGen
  LLVMMSP430AsmParser
  LLVMMSP430Desc
  LLVMMSP430Disassembler
  LLVMMSP430Info
  LLVMNVPTXCodeGen
  LLVMNVPTXDesc
  LLVMNVPTXInfo
  LLVMPowerPCCodeGen
  LLVMPowerPCAsmParser
  LLVMPowerPCDesc
  LLVMPowerPCDisassembler
  LLVMPowerPCInfo
  LLVMRISCVCodeGen
  LLVMRISCVAsmParser
  LLVMRISCVDesc
  LLVMRISCVDisassembler
  LLVMRISCVInfo
  LLVMSparcCodeGen
  LLVMSparcAsmParser
  LLVMSparcDesc
  LLVMSparcDisassembler
  LLVMSparcInfo
  LLVMSystemZCodeGen
  LLVMSystemZAsmParser
  LLVMSystemZDesc
  LLVMSystemZDisassembler
  LLVMSystemZInfo
  LLVMWebAssemblyCodeGen
  LLVMWebAssemblyAsmParser
  LLVMWebAssemblyDesc
  LLVMWebAssemblyDisassembler
  LLVMWebAssemblyUtils
  LLVMWebAssemblyInfo
  LLVMX86CodeGen
  LLVMX86AsmParser
  LLVMX86Desc
  LLVMX86Disassembler
  LLVMX86Info
  LLVMXCoreCodeGen
  LLVMXCoreDesc
  LLVMXCoreDisassembler
  LLVMXCoreInfo
  LLVMCore
  LLVMSupport
  clangFormat
  clangToolingInclusions
  clangToolingCore
  clangFrontend
  clangDriver
  LLVMOption
  clangParse
  clangSerialization
  clangSema
  clangEdit
  clangRewrite
  clangAnalysis
  clangASTMatchers
  clangAST
  clangLex
  clangBasic
  LLVMAArch64Desc
  LLVMAArch64Info
  LLVMAArch64Utils
  LLVMMIRParser
  LLVMAMDGPUDesc
  LLVMAMDGPUInfo
  LLVMAMDGPUUtils
  LLVMARMDesc
  LLVMARMInfo
  LLVMARMUtils
  LLVMHexagonDesc
  LLVMHexagonInfo
  LLVMLanaiDesc
  LLVMLanaiInfo
  LLVMipo
  LLVMVectorize
  LLVMIRReader
  LLVMAsmParser
  LLVMInstrumentation
  LLVMLinker
  LLVMSystemZDesc
  LLVMSystemZInfo
  LLVMWebAssemblyDesc
  LLVMWebAssemblyInfo
  LLVMGlobalISel
  LLVMAsmPrinter
  LLVMDebugInfoDWARF
  LLVMSelectionDAG
  LLVMCodeGen
  LLVMScalarOpts
  LLVMAggressiveInstCombine
  LLVMInstCombine
  LLVMBitWriter
  LLVMTransformUtils
  LLVMTarget
  LLVMAnalysis
  LLVMProfileData
  LLVMTextAPI
  LLVMObject
  LLVMBitReader
  LLVMCore
  LLVMRemarks
  LLVMBitstreamReader
  LLVMMCParser
  LLVMMCDisassembler
  LLVMMC
  LLVMBinaryFormat
  LLVMDebugInfoCodeView
  LLVMDebugInfoMSF
  LLVMSupport
  LLVMCFGuard
  LLVMFrontendOpenMP
  LLVMDemangle
  LLVMAVRCodeGen
  LLVMAVRAsmParser
  LLVMAVRDisassembler
  LLVMAVRDesc
  LLVMAVRInfo
  LLVMPasses
  LLVMCoroutines
  LLVMSupport
  LLVMObjCARCOpts
  )

Add absolute path to sources and headers (LibClangBuild.cmake)

function(get_libclang_sources_and_headers clang_source_path clang_prebuilt_path result_sources result_headers result_required_libs)
  list(TRANSFORM LIBCLANG_SOURCE_FILES PREPEND ${clang_source_path}/${LIBCLANG_SOURCE_PATH}/ OUTPUT_VARIABLE RES)
  set(${result_sources} ${RES} PARENT_SCOPE)
  unset(RES)
  list(TRANSFORM LIBCLANG_ADDITIONAL_HEADER_FILES PREPEND ${clang_source_path}/${LIBCLANG_SOURCE_PATH}/ OUTPUT_VARIABLE RES)
  list(TRANSFORM LIBCLANG_INDEX_H PREPEND ${clang_source_path}/${LIBCLANG_INCLUDE_PATH}/ OUTPUT_VARIABLE RES1)
  list(APPEND RES ${RES1})
  set(${result_headers} ${RES} PARENT_SCOPE)
  unset(RES)
  if(MSVC)
    list(TRANSFORM LIBCLANG_LINK_LIBS PREPEND ${clang_prebuilt_path}/lib/ OUTPUT_VARIABLE RES)
    list(TRANSFORM RES APPEND .lib OUTPUT_VARIABLE RES)
  else()
    list(TRANSFORM LIBCLANG_LINK_LIBS PREPEND ${clang_prebuilt_path}/lib/lib OUTPUT_VARIABLE RES)
    list(TRANSFORM RES APPEND .a OUTPUT_VARIABLE RES)
  endif()
  set(${result_required_libs} ${RES} PARENT_SCOPE)
  unset(RES)
endfunction()

Gather Names Of Static Archives And Common Directory

function (gatherArchives all_archives_directory all_archive_names all_archive_paths)
  set(ALL_ARCHIVES_DIRECTORY_LOCAL ${CMAKE_CURRENT_BINARY_DIR}/_all_archives)
  foreach(archive_path ${ARGN})
    get_filename_component(archive_name ${archive_path} NAME)
    list(APPEND ARCHIVE_NAMES_LOCAL ${archive_name})
    list(APPEND ARCHIVE_PATHS_LOCAL ${ALL_ARCHIVES_DIRECTORY_LOCAL}/${archive_name})
  endforeach()
  set(${all_archives_directory} ${ALL_ARCHIVES_DIRECTORY_LOCAL} PARENT_SCOPE)
  set(${all_archive_names} ${ARCHIVE_NAMES_LOCAL} PARENT_SCOPE)
  set(${all_archive_paths} ${ARCHIVE_PATHS_LOCAL} PARENT_SCOPE)
endfunction()

Examples

Static Makefile

CC=@CMAKE_C_COMPILER@
CFLAGS=-I@MAKEFILE_LIBCLANG_INCLUDE@
LIBS=-L@MAKEFILE_LIBCLANG_LIBDIR@ -lclang_bundled -lstdc++ -lm -ldl -lpthread -lz
OBJ=clang_visitor.o

%.o: %.c
	$(CC) -c -o $@ $< $(CFLAGS)

clang_visitor: $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS) $(LIBS)

.PHONY: clean

clean:
	rm *.o clang_visitor

Shared Makefile

CC=@CMAKE_C_COMPILER@
CFLAGS=-I@MAKEFILE_LIBCLANG_INCLUDE@
LIBS=-L@MAKEFILE_LIBCLANG_LIBDIR@ -lclang -lstdc++ -lm -ldl -lpthread -Wl,-rpath=@MAKEFILE_LIBCLANG_LIBDIR@
OBJ=clang_visitor.o

%.o: %.c
	$(CC) -c -o $@ $< $(CFLAGS)

clang_visitor: $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS) $(LIBS)

.PHONY: clean

clean:
	rm *.o clang_visitor

Shared Makefile MacOS

CC=@CMAKE_C_COMPILER@
CFLAGS=-I@MAKEFILE_LIBCLANG_INCLUDE@
LIBDIR=@MAKEFILE_LIBCLANG_LIBDIR@
LIBS=-lclang -lz3 -lstdc++ -ldl -lpthread
OBJ=clang_visitor.o

%.o: %.c
	$(CC) -c -o $@ $< $(CFLAGS)

clang_visitor: $(OBJ)
	$(CC) -o $@ $^ $(CFLAGS) -L$(LIBDIR) $(LIBS); \
	install_name_tool -change libz3.dylib $(LIBDIR)/libz3.dylib $@; \
	install_name_tool -add_rpath $(LIBDIR) $@;
.PHONY: clean

clean:
	rm *.o clang_visitor

CMakeLists MSVC

cmake_minimum_required(VERSION 3.13)
project(clang_visitor)
add_library(LibclangStatic SHARED IMPORTED)
set_property(TARGET LibclangStatic PROPERTY IMPORTED_LOCATION "@CMAKE_MSVC_LIB_DIR@/clang_static_bundled.lib")
set_property(TARGET LibclangStatic PROPERTY IMPORTED_IMPLIB "@CMAKE_MSVC_LIB_DIR@/clang_static_bundled.lib")
include_directories("@CMAKE_MSVC_INCLUDE_DIR@")
add_executable(clang_visitor clang_visitor.c)
target_link_libraries(clang_visitor LibclangStatic Version)
target_compile_definitions(clang_visitor PUBLIC -D_CINDEX_LIB_)
target_link_options(clang_visitor PUBLIC /NODEFAULTLIB:libcmt.lib)
install(TARGETS clang_visitor)

Windows README

To build this project:
> mkdir build
> cd build
> "C:\Program Files\CMake\bin\cmake.exe" -G "Visual Studio 16 2019" .. -DCMAKE_INSTALL_PREFIX=..
> "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\MSBuild\Current\Bin\MSBuild.exe" /m -p:Configuration=Release INSTALL.vcxproj

To run:
> cd ..\bin
> clang_visitor.exe

Sample C++ File

#ifndef __ENUM__
#define __ENUM__

enum Enum
{
  RED = 10,
  GREEN = 10 << 2,
  BLUE = RED + GREEN
};


#endif // __ENUM__

Example Visitor

#include <clang-c/Index.h>
#include <clang-c/CXString.h>
#include <stdio.h>
#include <stdlib.h>

enum CXChildVisitResult visitor(CXCursor cursor, CXCursor parent, CXClientData data) {
    CXSourceLocation location = clang_getCursorLocation( cursor );
    if(!clang_Location_isFromMainFile(location))
        return CXChildVisit_Continue;
    CXString cxspelling = clang_getCursorSpelling(cursor);
    const char* spelling = clang_getCString(cxspelling);
    CXString cxkind = clang_getCursorKindSpelling(clang_getCursorKind(cursor));
    const char* kind = clang_getCString(cxkind);
    printf("Cursor spelling, kind: %s, %s\n", spelling, kind);
    clang_disposeString(cxspelling);
    clang_disposeString(cxkind);
    return CXChildVisit_Recurse;
}

int main(int argc, char** argv) {
    CXIndex idx = clang_createIndex(1,1);
    CXTranslationUnit tu = clang_createTranslationUnitFromSourceFile(idx, "sample.H", 0, 0, 0, 0);
    clang_visitChildren(clang_getTranslationUnitCursor(tu), visitor, 0);
    return 0;
}

Issues with GCC < 10 on Linux

When building if you get linker errors that look like:

undefined reference to `std::_Sp_make_shared_tag::_S_eq(std::type_info const&)

that means you have a gcc/g++ version less than 10 and need to upgrade. I ran into this issue with Debian Buster which is still on 8.3.0 and moving to Bullseye worked.

If upgrading isn’t possible the libclang 10 static build will work with older versions.