How to Debug LLVM?

Abstract

Debugging remains a significant topic in software engineering. In system software such as compilers, it is especially difficult to pinpoint the root cause of a problem and fix it.
Based on my recent work on LLVM, I'd like to share some experience with it.

Classification of Bugs

Compiler bugs can be classified mainly into crashes, mis-compilations, and missed optimizations.

For example, suppose a C file triggers one of these bugs.

If it crashes clang, which is the easiest case to pinpoint, clang dumps a stack trace. From the stack trace, we can determine whether it is a frontend, middle-end, or backend problem.
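As a rough illustration of reading a stack trace this way, here is a minimal sketch that guesses the stage from the first meaningful frame of a saved trace. The function name classify_trace and the matching rules (clang:: for the frontend, SelectionDAG / llvm::Machine* / llvm::AsmPrinter for the backend, any other llvm:: frame for the middle-end) are my own assumptions, not an LLVM facility, and real traces will sometimes need a human look.

```shell
#!/bin/bash
# classify_trace FILE
# Hypothetical helper: guess which compiler stage crashed by scanning the
# stack trace top-down and classifying the first frame that names a
# clang:: or llvm:: symbol. The patterns are heuristics, not an official list.
classify_trace() {
  local line
  while IFS= read -r line; do
    case "$line" in
      *llvm::sys::*)
        continue ;;                      # signal-handler / trace-printing frames
      *clang::*)
        echo "frontend"; return 0 ;;
      *SelectionDAG*|*llvm::Machine*|*llvm::AsmPrinter*)
        echo "backend"; return 0 ;;
      *llvm::*)
        echo "middle-end"; return 0 ;;
    esac
  done < "$1"
  echo "unknown"
}

# Usage (file name is illustrative):
#   clang -O2 -c xxx.c 2> trace.txt
#   classify_trace trace.txt
```

Only the first matching frame is used because a middle-end crash triggered through clang still has clang:: frames further down the stack.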

Or, if someone reports a mis-compiled assembly file, we have to reduce the input first (use llvm-reduce if it is LLVM IR) and then validate each stage of the whole compilation: for example, the AST, the LLVM IR after each pass, and the assembly after every step in the backend. In this way, we can pinpoint which module caused the mis-compilation.
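One way to narrow down the offending pass without reading every dump by hand is to bisect over LLVM's -opt-bisect-limit option, which stops running optional passes after the N-th one. The sketch below is only the binary search itself; first_bad_limit and the is_bad callback are hypothetical names, and it assumes the result is already wrong at the upper bound and that the behaviour is monotonic (once wrong, it stays wrong for larger limits).

```shell
#!/bin/bash
# first_bad_limit MAX CMD...
# Hypothetical helper: binary-search the smallest N in 1..MAX for which
# "CMD... N" exits 0 (meaning: the output is already wrong at limit N).
first_bad_limit() {
  local lo=1 hi="$1" mid
  shift
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi) / 2 ))
    if "$@" "$mid"; then
      hi=$mid              # still wrong: the culprit is at or before mid
    else
      lo=$(( mid + 1 ))    # still correct: the culprit is after mid
    fi
  done
  echo "$lo"
}

# Usage sketch (is_bad, input.ll and check_output.sh are placeholders):
#   is_bad() {
#     opt -O2 -opt-bisect-limit="$1" -S input.ll -o out.ll 2>/dev/null
#     ! ./check_output.sh out.ll      # succeeds when the output is wrong
#   }
#   first_bad_limit 500 is_bad
```

Running opt once more with the reported limit prints the bisect log on stderr, whose last "running pass" line names the suspect.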

Missed optimizations are handled similarly to mis-compilations. However, it is harder to decide whether a report is a real, useful missed optimization. Some kinds of missed optimizations contribute nothing to real-world optimization quality, and fuzzers often generate such cases:

IR that is too large. For this kind of IR, passes like CSE, GVN and DSE only simplify it partially because of cost and compile-time limits.
No real motivation. The optimizations in LLVM are designed mostly for real-world applications. For this reason, nonsensical missed cases are not considered at all unless they become a recurring pattern.
Hard to debug. Complex test cases always need reduction before they can be located precisely in a specific module.
Won't fix. Optimization is a recursively unsolvable problem, and there are always cases that a compiler can't handle at all, such as eliminating every common subexpression or simplifying every expression. Most optimizations in LLVM are heuristic or based on experience, which means LLVM can't handle all cases.

Some Tools

llc/opt --print-before=[crash pass] [ir]: Dumps the IR right before the pass that causes the crash. The dump is written to stderr, so use 2> dump.txt to save it to a file.
opt -O2 -print-before-all / opt -O2 -print-after-all: Dump the IR before/after each pass that runs.
llvm-reduce [ir] --test=test.sh: llvm-reduce is an IR reduction tool based on the delta debugging algorithm. It keeps a reduction only if test.sh returns 0 (0 means the input is still interesting), which makes the IR easier to analyze.
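A dump produced by -print-before-all or -print-after-all can get long, so it helps to split it by pass. The sketch below assumes each section starts with a header line containing "*** IR Dump Before ..." or "*** IR Dump After ...", which is what recent opt builds print; the helper names list_dumps and extract_dump are hypothetical.

```shell
#!/bin/bash
# list_dumps FILE: print every dump header with its line number.
list_dumps() {
  grep -n -E '\*\*\* IR Dump (Before|After) ' "$1"
}

# extract_dump FILE N: print the N-th dump section (header included),
# i.e. everything from the N-th header up to the next header.
extract_dump() {
  awk -v n="$2" '/\*\*\* IR Dump (Before|After) /{c++} c==n' "$1"
}

# Usage (file names are illustrative):
#   opt -O2 -print-after-all -S input.ll -o /dev/null 2> dump.txt
#   list_dumps dump.txt
#   extract_dump dump.txt 3 > after-third-pass.ll
```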

My Workflow

When I come across an IR file that crashes clang/opt, I first look at the stack trace. The stack trace usually indicates which function or class exposes the error.

Here we assume the error is exposed by an optimization pass op1. Then we enter:

opt --print-before=op1 -O2 -S [ir-file] -o /dev/null 2> [ir-before-op1.ll]

Or if it's a C file, we enter:

clang -mllvm -print-before=instcombine xxx.c -O2 -g0 2> xxx.ll

-O2 can be replaced by whichever optimization level or pipeline triggers the problem. The dumped output serves as a reproducer, and we then run op1 alone on it:

opt --passes=op1 -S [ir-before-op1.ll]

If we reproduce it successfully, we move on to reducing it:

Write a test.sh:

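#!/bin/bash
# Exit 0 (interesting) only if the error message still appears.
# Crash diagnostics go to stderr, so merge stderr into the pipe.
opt --passes=op1 -S "$1" 2>&1 | grep "something related to error"

# For a missed optimization, write a test case and check it with FileCheck
# FileCheck $1 | grep "something related"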
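For crashes, a slightly stricter interestingness test also requires that the tool actually fails, so that a reduction which merely mentions the message is not kept. This is a minimal sketch under my own assumptions: the function name is_interesting and the OPT, PASS and PATTERN variables are hypothetical, and the pass name op1 is the placeholder used throughout this post.

```shell
#!/bin/bash
# is_interesting FILE
# Hypothetical helper: succeed (exit 0) only if the tool exits non-zero on
# FILE AND its combined stdout/stderr contains the fixed string $PATTERN.
is_interesting() {
  local out
  # If the tool succeeds, the crash is gone: not interesting.
  out=$("${OPT:-opt}" --passes="${PASS:-op1}" -S "$1" -o /dev/null 2>&1) && return 1
  printf '%s\n' "$out" | grep -q -F -- "$PATTERN"
}

# Usage inside test.sh (the pattern is a placeholder):
#   PATTERN="something related to error"
#   is_interesting "$1"
```

Remember to make test.sh executable (chmod +x test.sh) before handing it to llvm-reduce.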

And launch llvm-reduce:

llvm-reduce [ir-file] --test=test.sh

Finally we get a reduced.ll. With this file, the problem is much easier to analyze.

How to Get Command-Line Arguments and IR-Dump Files When Bootstrapping

Refer to discourse

XChy's Blog