Suppose we have a crash while compiling huge application from source, e.g. a Python package with native C++ code. A source file fails to compile with the following message:
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /opt/llvm/bin/clang-14 ...
...
Segmentation fault (core dumped)
Analyzing source files with many include files is often not feasible. Instead, we can deploy automatic test case reduction tools such as LLVM bugpoint or C-Reduce. C-Reduce is actually generic, and can evaluate bugs of any compiler. But in this usecase we focus particulary on Clang.
Preparation
First preparatory step for C-Reduce is to gather all source code into a single file, which means to perform the preprocessing step. This is required for C-Reduce to work with just a single file, and be able to modify it. Preprocessing could be performed just by adding -E
in the end of the “Program arguments” command of the crash report above. We will rename the output file to bug.cpp
.
Building C-Reduce
Assuming that we may need to work inside CI environment, such as Manylinux CentOS Docker container, we will use yum and compile C-Reduce from source:
git clone https://github.com/csmith-project/creduce.git
cd creduce/
- Each major verson of LLVM introduced breaking changes, so maintainers provide branches for each LLVM version.
- Our LLVM is already version 14, but it is still compatible with version 13
git checkout llvm-13.0
Install prerequisites:
yum install ncurses-devel flex perl-Exporter-Lite cpan
cpan -i Getopt::Tabular Regexp::Common File::Which
Perform the build, using LLVM toolchain provided in the user-specified folder:
PATH=/opt/llvm/bin:$PATH ./configure --prefix=$(pwd)/install
make -j8
make install
Usage
Prepare bug.sh
evaluation script for C-Reduce to work on:
#!/bin/sh
/opt/llvm/bin/clang-14 -cc1 ... -O3 bug.cpp && \
! /opt/llvm/bin/clang-14 -cc1 ... -O0 bug.cpp
In our case, build succeeds with -O3
and fails with -O0
, so we combine these two outcomes into a logical expression and feed it into C-Reduce, also providing the source file name for reducing:
LD_LIBRARY_PATH=/opt/llvm/lib:$LD_LIBRARY_PATH creduce/install/bin/creduce bug.sh bug.cpp
LD_LIBRARY_PATH
should additionally provide the location of LLVM libs.
Result
# creduce/install/bin/creduce bug.sh bug.cpp
===< 41280 >===
running 3 interestingness tests in parallel
===< pass_unifdef :: 0 >===
===< pass_comments :: 0 >===
===< pass_ifs :: 0 >===
===< pass_includes :: 0 >===
===< pass_line_markers :: 0 >===
(3.5 %, 5281788 bytes)
(5.4 %, 5178731 bytes)
...
===< pass_indent :: final >===
(100.0 %, 256 bytes)
===================== done ====================
After several hours of work, the original ~100K lines of code have been automatically reduced to just a few lines, for example just into this small test case that crashes the compiler:
extern "C" __attribute__((device)) void __ockl_atomic_add_noret_f32(float *,
float);
__attribute__((device)) void a(float *address, float b) {
__ockl_atomic_add_noret_f32(address, b);
}
References
Dmitry Mikushin
If you need help with Machine Learning, Computer Vision or with GPU computing in general, please reach out to us at Applied Parallel Computing LLC.