Introduction

BrowserStack Code Quality uses a Compilation Database (CDB) to scan C/C++ repositories in strict mode. It provides a tool, embold-trace, to generate the CDB from the build process. embold-trace lets users configure which compilers to intercept during the build. It then intercepts the build process and generates the CDB based on the configured compilers.

Supported Platforms

Before you begin, verify that you are running a supported operating system version.

Prerequisites

  • Linux
    strace: a tool for debugging and troubleshooting programs on Unix-like operating systems such as Linux.
  • Installation of strace
    For Ubuntu→ $ sudo apt-get install strace
    For CentOS/Red Hat→ $ sudo yum install strace

How to use

  • To run embold-trace, prefix your build command with it:
    $ embold-trace [embold-trace-options] {your build command} [build-options]
  • Use `--help` to list the embold-trace options.

MSBuild on Windows

embold-trace has two modes of operation for tracing MSBuild projects:

  1. Using .tlog files → Uses the tracker log files generated by MSBuild, of the form CL.command.1.tlog, to generate the CDB.
    This mode is enabled with the -t option of embold-trace:
    $ embold-trace -t msbuild project.sln
  2. Trace files → Uses intermediate trace files to generate the CDB.
    This is the default mode of operation.

Output Directory

embold-trace can generate the CDB in a custom directory with the -o option:
$ embold-trace -o {custom/directory} msbuild project.sln

Configure embold-trace

embold-trace uses JSON configuration files to configure which compilers to intercept. There are two types of configuration files.

1. Top-level configuration file:

This file lists the compiler names to intercept and the corresponding translation file name (excluding the extension). embold-trace provides compiler configuration for all the standard compilers.
The default name is 'embold-trace-default.json'.

  • Source Extensions

The default configuration file lists the supported extensions for C/C++ source files. If an extension you need is not already present, add it to this configuration file.

  • Translation file:
    It contains the compiler options used to specify an include directory, include a file, and define a macro.
    The keys in the translation JSON file are described below:

    includeDirPrefix
      Value: Option used to specify an include directory.
      Example: -I (GNU Family)

    defineMacroPrefix
      Value: Option used to define a macro.
      Example: -D (GNU Family)

    includeFilePrefix
      Value: Option used to include a single file during compilation.
      Example: -include (GNU Family)

    sourceFilePrefix
      Value: Prefix used to specify source files in the compile command.
      Description: The Green Hills compiler uses the flag --source to specify the source file in the compile command, for example:
      $ ease850 --source=abc.cpp
      Example: --source (GH Compiler)

    optionFilePrefix
      Value: Prefix used to specify the build system options file. The options file contains compiler options for include directories, macro definitions, etc.
      Description: Instead of specifying options like -I and -D directly in compile commands, some compilers or build systems use a special file, called an option file, which contains the compiler flags. This file is then passed in the compile command using a prefix. The value of this key is that prefix, for example:
      $ cctc --option-file=option-file.txt abc.cpp
      Example: --option-file (Tasking Tri-core)

    optionFilePattern
      Value: A pattern used to specify the build system options file. Each build system or compiler uses a different name format for its options file, so a regex should be specified as the value.
      Description: Option files can also be specified using a pattern instead of a prefix. For example, the CMake build system generates response files with names like includes_CXX.rsp for C++ compilation and includes_C.rsp for C compilation. These files contain the include and define options required for compilation. They are usually, but not necessarily, supplied with an '@' character at the start:
      "command": "g++ @CMakeFiles/TestProject/includes_C.rsp -o CMakeFiles\TestProject\test.o -c C:\TestProject\test.c"
      A regex that matches these file names should be provided as the value of optionFilePattern in the translation file. If embold-trace finds a file matching the pattern in a compile command, it substitutes the content of the option file into the compile command so that Embold can parse it correctly.
      Example: .*includes_CXX.rsp (CMake Build)
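
To make the optionFilePattern behaviour more concrete, here is a minimal Python sketch of the idea: find tokens in a compile command that match the regex and replace them with the contents of the file they name. This is not embold-trace's implementation. The function name expand_option_files, the whole-token matching, the stripping of a leading '@', and the sample file contents are all assumptions made for illustration.

```python
import re
import shlex


def expand_option_files(command, pattern, read_option_file):
    """Replace option-file tokens in a compile command with their contents.

    Hypothetical helper: 'pattern' plays the role of optionFilePattern and
    'read_option_file' is any callable that returns a file's text.
    """
    regex = re.compile(pattern)
    expanded = []
    for token in shlex.split(command):
        if regex.fullmatch(token):  # assumption: the whole token must match
            # Assumption: a leading '@' is a marker, not part of the path.
            path = token[1:] if token.startswith("@") else token
            expanded.extend(shlex.split(read_option_file(path)))
        else:
            expanded.append(token)
    return expanded


# Illustrative stand-in for files on disk (contents are made up).
fake_files = {"CMakeFiles/TestProject/includes_CXX.rsp": "-I/opt/include -DDEBUG"}

result = expand_option_files(
    "g++ @CMakeFiles/TestProject/includes_CXX.rsp -c test.cpp",
    r".*includes_CXX.rsp",
    fake_files.__getitem__,
)
```

Here result holds the command tokens with the response file replaced by the -I and -D options it contained, which is the form a downstream parser needs.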

Where to find

By default, all the provided configuration files (top-level and translation) are located in the directory that contains the embold-trace binary. Any new translation file should also be added to the same directory.
However, you can keep all the configuration files in a separate directory. In this case, pass the top-level configuration file path to embold-trace on invocation using the -c option:
$ embold-trace -c {top/level/config/file/path} {your build command} [build-options]

Adding Non-Standard Compiler Name

If you use a standard compiler from the Supported Compilers list, but under a non-standard name, you must edit the default configuration file.

Example:
If GCC 4.9 is installed and the compiler name is 'gcc-4.9', add an entry for 'gcc-4.9' to the top-level configuration file.

Adding Unsupported Compiler

If your compiler name is not found in the top-level configuration file and it is not a standard compiler like GCC or Clang, then that compiler is not supported by embold-trace by default.

Supporting an unsupported compiler involves two steps:

  1. Add an entry in the top-level configuration file
  2. Add corresponding translation file

1. Add an entry in the top-level configuration file

An entry must be added to the top-level configuration file with this compiler name and the corresponding translation file name (excluding the extension).

Example:
Suppose your new compiler is named 'cctc' and takes the following options:
"-inc": to specify an include directory
"-def": to define a macro
"-ifile": to include a file during compilation

The entry maps 'cctc' to tasking_tricore, where tasking_tricore is a JSON translation file named tasking_tricore.json in the same directory as the top-level configuration file.

  • The tasking_tricore.json file maps these options to the corresponding translation keys.

Suppose the new compiler takes includeFile of the form @test/.includeDirs_c23dewff34.txt. Hence, the regex provided is ".*.includesDirs_.*.txt".

Supported Compilers

Compiler            Compiler name               Translation config name
GCC                 gcc, g++, cc, c++, clang    gnu.json
Clang               clang, clang++              gnu.json
Green Hills (GH)    ease850                     gh.json
Tasking Tri-core    cctc                        tasking_tricore.json
MSBuild             cctc                        tasking_tricore.json

Locate Compilation Database

After the build finishes successfully, the Compilation Database is generated in the current working directory where the build was run. The CDB is a file named compile_command.json. If embold-trace generates the CDB successfully, it prints a log message showing the number of compile entries in the CDB.
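
To cross-check the entry count reported in the log, the generated file can be inspected with a few lines of Python. This is a sketch that assumes the CDB is a JSON array with one object per compile entry, as in the Clang JSON Compilation Database format. The helper name count_compile_entries and the sample entries are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def count_compile_entries(cdb_path):
    """Return the number of compile entries in a compilation database file."""
    with open(cdb_path, encoding="utf-8") as handle:
        entries = json.load(handle)
    # Assumption: the CDB is a JSON array with one object per compile command.
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of compile entries")
    return len(entries)


# Demo on a throwaway file; the entries below are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "compile_command.json"
    sample.write_text(
        json.dumps([
            {"directory": "/repo", "command": "g++ -I. -c a.cpp", "file": "a.cpp"},
            {"directory": "/repo", "command": "g++ -I. -c b.cpp", "file": "b.cpp"},
        ]),
        encoding="utf-8",
    )
    entry_count = count_compile_entries(sample)
```

In practice you would pass the path of the file in your build directory instead of the temporary sample.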

Locate Configuration Files

By default, all the provided configuration files (top-level and translation) are located in the directory that contains the embold-trace binary. Any new translation file should also be added to the same directory.

However, you can keep all the configuration files in a separate directory. In this case, pass the top-level configuration file path to embold-trace on invocation using the -c option:

$ embold-trace -c {top/level/config/file/path} {your build command} [build-options]
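
When embold-trace is driven from a build script, the command line can be assembled programmatically. The Python sketch below uses only the options described in this document (-t, -o and -c) and places them before the build command. The helper name build_trace_command is hypothetical, and it is an assumption that these options can be combined in a single invocation.

```python
import shlex


def build_trace_command(build_command, output_dir=None, config_path=None, use_tlog=False):
    """Assemble an embold-trace argument list (hypothetical helper)."""
    argv = ["embold-trace"]
    if use_tlog:
        argv.append("-t")              # MSBuild .tlog mode
    if output_dir is not None:
        argv += ["-o", output_dir]     # custom CDB output directory
    if config_path is not None:
        argv += ["-c", config_path]    # top-level configuration file path
    argv += list(build_command)        # build command and its own options go last
    return argv


argv = build_trace_command(["msbuild", "project.sln"], output_dir="out/cdb", use_tlog=True)
command_line = shlex.join(argv)  # printable form, e.g. for logging
```

The resulting list can be handed to subprocess.run(argv, check=True); the sketch stops short of executing anything.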

Configuring Unknown Compiler

Configuring embold-trace for an unknown compiler is an iterative process.

Steps to configure

  1. Run the build using embold-trace (see the How to use section).
  2. After the build, embold-trace prints the number of non-configured executables and logs all of them to a text file, embold-trace-unknown-exe.txt, which is created in the current working directory. Not all of the executables are compilers, so you need to identify which ones are.
  3. If your compiler appears in the non-configured list, go to the next step. Otherwise, configuration is done and no further action is needed.
  4. Create a translation file for your compiler.
  5. Add the compiler name and translation file name entry to embold-trace-default.json.
  6. Run embold-trace with the trace file as input (there is no need to run the build again).
  7. Go to step 2.

Example: Configuring g++

Assumption: No compilers are configured, i.e. embold-trace-default.json is empty.

Our sample repository is cppcheck. We will build the repository using embold-trace, configure the g++ compiler, and create the compilation database.

Step 1: Build repository using embold-trace

Step 2: Build output

In the build output, the embold-trace intermediate trace file path is C:\Users\Hemant\AppData\Local\Temp\embold-trace-log_1590487077. There are 4 non-configured executables, and they are written to C:\Workspace\repos\cppcheck\build_make\embold-trace-unknown-exe.txt. Also, no commands have been intercepted, because no compilers are configured.

Step 3: Inspect the contents of C:\Workspace\repos\cppcheck\build_make\embold-trace-unknown-exe.txt

There are 4 entries in the unknown executable list. Not all of the executables are compilers; only g++ is. So we go to the next step and configure g++.

Step 4: Create a translation file gnu.json (the filename can be anything) for the g++ compiler
For all GNU family compilers such as gcc and g++:
includeDirPrefix → -I
defineMacroPrefix → -D
includeFilePrefix → -include

So gnu.json contains these three keys with the values above.
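
As a rough illustration, the Python sketch below produces such a file from the three key/value pairs listed in this step. It assumes the translation file is a flat JSON object with string values. The exact layout that embold-trace expects is not shown in this document, so compare the output against one of the translation files shipped with the tool before using it.

```python
import json

# Keys and values are taken from the step above; the flat-object layout
# is an assumption, not a confirmed embold-trace schema.
gnu_translation = {
    "includeDirPrefix": "-I",
    "defineMacroPrefix": "-D",
    "includeFilePrefix": "-include",
}

gnu_json_text = json.dumps(gnu_translation, indent=2)

# To write the file next to the embold-trace binary, something like:
# with open("gnu.json", "w", encoding="utf-8") as handle:
#     handle.write(gnu_json_text)
```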

Step 5: Add compiler name (g++) and translation file name (gnu) entry to embold-trace-default.json

Step 6: Rerun embold-trace with a new configuration and trace file as input

This time the compilation database is generated and the non-configured executable count is reduced by 1, because we have configured the g++ compiler. Note that we did not run the build again. Contents of C:\Workspace\repos\cppcheck\build_make\embold-trace-unknown-exe.txt:

There is no longer an entry for g++, so g++ is configured correctly. The other executables do not need to be configured. The embold-trace configuration is therefore complete and we have a valid compilation database.

Step 7: No need to go to step 2

Likewise, any other compiler can be configured iteratively.

Non-configured Executable file format

It is a text file that gives the user hints about probable compilers in the build. It contains one unique executable path per line. Each line is divided into 3 parts separated by semicolons. The meaning of each part is explained in the image below.

The first thing to check is whether an executable is a compiler, by inspecting the second part. Configure all* executables marked "C" first, as they are probably compilers. Executables with the most occurrences that are marked "N" might be custom compilers, but these can only be identified by build engineers.

*Some compiler executables internally invoke another executable, which can be ignored. For example, g++ internally invokes cc1plus.
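
The triage described above can be scripted. The Python sketch below parses each line into three parts and lists "C" entries first, then the rest by descending occurrence count. The field order (path; C/N marker; occurrence count), the integer third field, and the sample lines are assumptions inferred from the description, not a confirmed file format.

```python
from typing import NamedTuple


class UnknownExe(NamedTuple):
    path: str         # assumed first part: executable path
    kind: str         # second part: "C" (probable compiler) or "N"
    occurrences: int  # assumed third part: number of invocations


def parse_unknown_exe_line(line):
    """Split one line into its three semicolon-separated parts."""
    # rsplit keeps any ';' inside the path attached to the path field.
    path, kind, occurrences = line.strip().rsplit(";", 2)
    return UnknownExe(path, kind, int(occurrences))


def rank_candidates(lines):
    """Order entries: "C" first, then by occurrence count, highest first."""
    entries = [parse_unknown_exe_line(line) for line in lines if line.strip()]
    return sorted(entries, key=lambda e: (e.kind != "C", -e.occurrences))


# Hypothetical sample lines, for illustration only.
sample_lines = [
    "/usr/bin/make;N;3",
    "/usr/bin/g++;C;12",
    "/usr/bin/ar;N;7",
]
ranked = rank_candidates(sample_lines)
```

Entries at the top of ranked are the ones to configure first; entries further down still need a build engineer's judgement.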

Custom Source Extensions

By default, embold-trace supports the following C/C++ extensions:
"cc", "c++", "c", "cpp", "cxx", "cplusplus"
However, users can add custom extensions to the configuration file.

For example, '850' and 'pc' can be added as custom C extensions and 'pcpp' as a custom C++ extension.
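
The effect of adding custom extensions can be sketched in Python as a simple membership check on the file suffix. The default list is taken from this section. The function name is_source_file and the case-sensitive comparison are assumptions, and embold-trace's own matching rules may differ.

```python
from pathlib import PurePath

# Default extensions as listed in this section.
DEFAULT_EXTENSIONS = {"cc", "c++", "c", "cpp", "cxx", "cplusplus"}


def is_source_file(filename, custom_extensions=()):
    """Return True if the file's extension is a default or custom one."""
    allowed = DEFAULT_EXTENSIONS | set(custom_extensions)
    suffix = PurePath(filename).suffix  # e.g. ".cpp", or "" if there is none
    # Assumption: extensions are compared case-sensitively, without the dot.
    return suffix[1:] in allowed


custom = ("850", "pc", "pcpp")
default_only = is_source_file("module.850")
with_custom = is_source_file("module.850", custom)
```

With only the defaults, module.850 is not treated as a source file; once '850' is added as a custom extension, it is.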