This post describes how to compile a single C++ source file to an
object file with the Clang API. Here is the code. It behaves like a
simplified clang executable that handles -c and -S.
We need an LLVM and Clang installation that provides both
lib/cmake/llvm/LLVMConfig.cmake and lib/cmake/clang/ClangConfig.cmake.
You can get these from system packages (the development packages may be
required) or build LLVM yourself; I'll skip the detailed steps here.
For a DIY build, I set a prebuilt Clang as CMAKE_CXX_COMPILER, just a
habit of mine. llvm-project isn't guaranteed to build warning-free with
GCC, since GCC's -Wall -Wextra produces many false positives and
LLVM developers avoid cluttering the codebase to silence them.
    % echo 'void f() {}' > a.cc
    % out/debug/cc -S a.cc && head -n 5 a.s
            .file   "a.cc"
            .text
            .globl  _Z1fv                           # -- Begin function _Z1fv
            .p2align        4
            .type   _Z1fv,@function
    % out/debug/cc -c a.cc && file a.o
    a.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
Anonymous files
The input source file and the output ELF file are stored in the
filesystem. We could create a temporary file and delete it with the RAII
class llvm::FileRemover.
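To make the RAII idea concrete without requiring an LLVM build, here is a minimal standard-library sketch. ScopedFileRemover and writeThenCleanUp are hypothetical names invented for this illustration; the real llvm::FileRemover is declared in llvm/Support/FileUtilities.h and may differ in detail.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <utility>

// Hypothetical stand-in for llvm::FileRemover (not the LLVM class itself):
// removes the named file when the guard goes out of scope, unless
// release() was called first.
class ScopedFileRemover {
public:
  explicit ScopedFileRemover(std::filesystem::path p) : path_(std::move(p)) {}
  ScopedFileRemover(const ScopedFileRemover &) = delete;
  ScopedFileRemover &operator=(const ScopedFileRemover &) = delete;

  ~ScopedFileRemover() {
    if (armed_) {
      std::error_code ec; // swallow errors: a destructor must not throw
      std::filesystem::remove(path_, ec);
    }
  }

  // Keep the file after all, e.g. when the output should be preserved.
  void release() { armed_ = false; }

private:
  std::filesystem::path path_;
  bool armed_ = true;
};

// Writes `source` to `p`, reports whether the file existed while the guard
// was alive, and lets the guard delete it on return.
bool writeThenCleanUp(const std::filesystem::path &p,
                      const std::string &source) {
  ScopedFileRemover guard(p);
  {
    std::ofstream out(p);
    out << source;
  } // the stream is flushed and closed here
  return std::filesystem::exists(p);
}
```

The point of the pattern is that every early return or error path after the guard is constructed still cleans up the temporary file.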
If inline assembly is used, we will also need the AsmParser
library:
    LLVMInitializeX86AsmParser();
We could also call the LLVMInitializeAll* functions instead,
which initialize all supported targets (those selected at build time
via LLVM_TARGETS_TO_BUILD).
Here are some notes about the LLVMX86 libraries:
LLVMX86Info: llvm/lib/Target/X86/TargetInfo/
LLVMX86Desc: llvm/lib/Target/X86/MCTargetDesc/ (depends on LLVMX86Info)
LLVMX86AsmParser: llvm/lib/Target/X86/AsmParser (depends on LLVMX86Info and LLVMX86Desc)
LLVMX86CodeGen: llvm/lib/Target/X86/ (depends on LLVMX86Info and LLVMX86Desc)
EmitAssembly and EmitObj
The code supports two frontend actions, EmitAssembly (-S) and
EmitObj (-c).
You could also use the API in
clang/include/clang/FrontendTool/Utils.h, but that would
pull in another library, clangFrontendTool (different from
clangFrontend).
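The mapping from flag to action can be sketched independently of Clang. This is only a model of the option handling: ActionKind, Job, replaceExtension and parseArgs are hypothetical names, and deriving the output name by swapping the input's extension is an assumption based on the a.cc to a.s / a.o session shown earlier.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of the two supported actions.
enum class ActionKind { EmitAssembly, EmitObj };

struct Job {
  ActionKind kind;
  std::string input;
  std::string output;
};

// "a.cc" -> "a.s"; a dot that belongs to a directory component is ignored.
std::string replaceExtension(const std::string &input, const std::string &ext) {
  const auto slash = input.find_last_of('/');
  const auto dot = input.find_last_of('.');
  const bool hasExt = dot != std::string::npos &&
                      (slash == std::string::npos || dot > slash);
  return (hasExt ? input.substr(0, dot) : input) + ext;
}

// Picks the action from -S / -c and treats the last non-flag argument as the
// input file. Other flags (e.g. -Wall) are skipped in this sketch.
std::optional<Job> parseArgs(const std::vector<std::string> &args) {
  std::optional<ActionKind> kind;
  std::string input;
  for (const std::string &a : args) {
    if (a == "-S")
      kind = ActionKind::EmitAssembly;
    else if (a == "-c")
      kind = ActionKind::EmitObj;
    else if (!a.empty() && a[0] != '-')
      input = a;
  }
  if (!kind || input.empty())
    return std::nullopt; // neither -S nor -c, or no input file
  const char *ext = (*kind == ActionKind::EmitAssembly) ? ".s" : ".o";
  return Job{*kind, input, replaceExtension(input, ext)};
}
```

In the real program the selected kind would decide which Clang frontend action object to construct; the sketch stops at the decision itself.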
Diagnostics
The diagnostics system is quite complex. We have DiagnosticConsumer,
DiagnosticsEngine, and DiagnosticOptions.
We define a simple DiagnosticConsumer that handles
notes, warnings, errors, and fatal errors. When macro expansion comes
into play, we report two key locations:
The physical location (fileLoc), where the expanded token triggers an
issue, matching Clang's error line, and
The spelling location within the macro's replacement list
(sm.getSpellingLoc(loc)).
Although Clang also highlights intermediate locations for chained
expansions, our simple approach offers a solid approximation.
    % cat a.h
    #define FOO(x) x + 1
    % cat a.cc
    #include "a.h"
    #define BAR FOO
    void f() {
      int y = BAR("abc");
    }
    % out/debug/cc -c -Wall a.cc
    a.cc:4:11: warning: adding 'int' to a string does not append to the string
    ./a.h:1:18: note: expanded from macro
    a.cc:4:11: note: use array indexing to silence this warning
    ./a.h:1:18: note: expanded from macro
    a.cc:4:7: error: cannot initialize a variable of type 'int' with an rvalue of type 'const char *'
    % clang -c -Wall a.cc
    a.cc:4:11: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
        4 |   int y = BAR("abc");
          |           ^~~~~~~~~~
    a.cc:2:13: note: expanded from macro 'BAR'
        2 | #define BAR FOO
          |             ^
    ./a.h:1:18: note: expanded from macro 'FOO'
        1 | #define FOO(x) x + 1
          |                ~~^~~
    a.cc:4:11: note: use array indexing to silence this warning
    a.cc:2:13: note: expanded from macro 'BAR'
        2 | #define BAR FOO
          |             ^
    ./a.h:1:18: note: expanded from macro 'FOO'
        1 | #define FOO(x) x + 1
          |                  ^
    a.cc:4:7: error: cannot initialize a variable of type 'int' with an rvalue of type 'const char *'
        4 |   int y = BAR("abc");
          |       ^   ~~~~~~~~~~
    1 warning and 1 error generated.
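The two-line output format of the simple consumer can be modeled without any Clang types. The sketch below is not the consumer itself: Level, Loc, Diag and render are hypothetical names, and the locations are plain structs rather than clang::SourceLocation values resolved through a SourceManager.

```cpp
#include <string>
#include <vector>

// Hypothetical, Clang-free model of what the simple consumer prints.
enum class Level { Note, Warning, Error, Fatal };

struct Loc {
  std::string file;
  unsigned line;
  unsigned col;
};

struct Diag {
  Level level;
  std::string message;
  Loc fileLoc;     // physical location of the expanded token
  bool fromMacro;  // true if the location came from a macro expansion
  Loc spellingLoc; // location inside the macro's replacement list
};

const char *levelName(Level l) {
  switch (l) {
  case Level::Note:    return "note";
  case Level::Warning: return "warning";
  case Level::Error:   return "error";
  case Level::Fatal:   return "fatal error";
  }
  return "unknown";
}

std::string formatLoc(const Loc &l) {
  return l.file + ":" + std::to_string(l.line) + ":" + std::to_string(l.col);
}

// One line for the diagnostic at its physical location, plus one note for
// the spelling location when a macro is involved. Intermediate expansion
// steps are not reported.
std::vector<std::string> render(const Diag &d) {
  std::vector<std::string> lines;
  lines.push_back(formatLoc(d.fileLoc) + ": " + levelName(d.level) + ": " +
                  d.message);
  if (d.fromMacro)
    lines.push_back(formatLoc(d.spellingLoc) +
                    ": note: expanded from macro");
  return lines;
}
```

Feeding it the warning from the session above (fileLoc a.cc:4:11, spelling location ./a.h:1:18) yields the same two lines that out/debug/cc printed.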
We call a convenience function, CompilerInstance::ExecuteAction,
which wraps lower-level APIs such as BeginSourceFile, Execute, and
EndSourceFile. However, it will print
1 warning and 1 error generated.
unless we set ShowCarets to false.
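The summary-line behavior described above can be modeled as follows. This is a simplified sketch rather than Clang's code: statsLine is a hypothetical helper, and its showCarets parameter stands in for the ShowCarets field of DiagnosticOptions.

```cpp
#include <string>

// Hypothetical model of the trailing summary line: nothing is printed when
// carets are disabled or when there were no warnings and no errors.
std::string statsLine(unsigned warnings, unsigned errors, bool showCarets) {
  if (!showCarets || (warnings == 0 && errors == 0))
    return "";
  std::string s;
  if (warnings)
    s += std::to_string(warnings) + (warnings == 1 ? " warning" : " warnings");
  if (warnings && errors)
    s += " and ";
  if (errors)
    s += std::to_string(errors) + (errors == 1 ? " error" : " errors");
  return s + " generated.";
}
```

With one warning and one error this produces the line seen at the end of the clang output, and an empty string once showCarets is false.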
clang::createInvocation
clang::createInvocation, renamed from
createInvocationFromCommandLine in 2022, combines
clang::driver::Driver::BuildCompilation and
clang::CompilerInvocation::CreateFromArgs. While it saves a
few lines for certain tasks, it lacks the flexibility we need for our
specific use cases.