Before scanning a large repository (more than 2 million lines of code), adjust the memory settings. In particular, tune the -m option of docker run and the ANALYSER_XMX environment variable. Note that the right values may not be known upfront when the docker run command is issued. We recommend the values shown in the table below:

| Lines of code | ANALYSER_XMX | -m | Embold Server/UI Scan (RAM) | Remote Scan (RAM) |
| --- | --- | --- | --- | --- |
| Up to 1 million | 6GB | 12GB | 16GB | 16GB |
| 2 to 10 million | 15GB | 30GB | 32GB | 32GB |

The following docker run command is an example in which the -m and ANALYSER_XMX values have been increased.

docker run -m 30GB -d -p 3000:3000 --name BrowserStackCodeQuality -e gamma_ui_public_host=http://: -e RISK_XMX=-Xmx1024m -e ACCEPT_EULA=Y -e ANALYSER_XMX=-Xmx15360m -v /home/${USER}/BrowserStackCodeQuality/gamma_data:/opt/gamma_data -v /home/${USER}/BrowserStackCodeQuality/gamma_psql_data:/var/lib/postgresql -v /home/${USER}/BrowserStackCodeQuality/logs:/opt/gamma/logs browserstack/code-quality:$BROWSERSTACK_CQ_VERSION
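In the command above, -Xmx15360m is the 15GB from the table expressed in megabytes (15 x 1024 = 15360). The following is a minimal sketch of that conversion; the helper name gb_to_xmx is hypothetical and not part of the product.

```shell
# Hypothetical helper (not part of Code Quality): convert a whole number
# of gigabytes into a JVM -Xmx option expressed in megabytes.
# $1 = size in GB
gb_to_xmx() {
  echo "-Xmx$(( $1 * 1024 ))m"
}

# Build the value to pass to docker run as -e ANALYSER_XMX=...
ANALYSER_XMX="$(gb_to_xmx 15)"
echo "ANALYSER_XMX=${ANALYSER_XMX}"   # prints ANALYSER_XMX=-Xmx15360m
```

The resulting value can then be passed to docker run as -e ANALYSER_XMX="${ANALYSER_XMX}".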

A significant part of the memory allocated to the Docker container with -m is used by the various Code Quality processes, including the UI, the controller process, and the processes that perform the actual source code analysis. Among these, the analyzer process, whose memory limit is set through ANALYSER_XMX, uses the most memory.

In most cases, the required ANALYSER_XMX value is proportional to the number of lines of code (LOC) scanned in a repository. The ANALYSER_XMX value should not exceed 70% of the value allocated with -m. Also, the environment variables passed to Docker, such as ANALYSER_XMX and RISK_XMX, are mutually exclusive for a given analysis.
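The 70% limit can be checked before starting the container. The sketch below assumes both values are given in megabytes; the function name xmx_within_limit is hypothetical.

```shell
# Hypothetical helper: succeeds (exit status 0) when ANALYSER_XMX is at
# most 70% of the container memory limit, and fails (status 1) otherwise.
# $1 = ANALYSER_XMX in MB, $2 = docker -m value in MB
xmx_within_limit() {
  [ $(( $1 * 100 )) -le $(( $2 * 70 )) ]
}

# 15360 MB (15GB) against 30720 MB (30GB) is 50%, so this passes.
if xmx_within_limit 15360 30720; then
  echo "ok: ANALYSER_XMX is within 70% of -m"
else
  echo "warning: ANALYSER_XMX exceeds 70% of -m"
fi
```

Integer arithmetic is used so the check runs in a plain POSIX shell without external tools.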

For example, a repository with 1 million lines of code requires at least 6 to 7GB for ANALYSER_XMX. That said, other factors can affect the required value: for instance, a high level of code duplication within the repository may require setting ANALYSER_XMX to 8GB and -m to 12GB.

Therefore, the general rules of thumb to follow when setting the ANALYSER_XMX value are:

  1. For a repository with fewer than 1 million lines of code, set ANALYSER_XMX to 35-37% of the total memory allocated to the container.
  2. For a repository with fewer than 3 million lines of code, set ANALYSER_XMX to 40-42% of the total memory allocated to the container.
  3. For a repository with fewer than 5 million lines of code, set ANALYSER_XMX to 50-55% of the total memory allocated to the container.
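The rules of thumb above can be sketched as a small shell function that prints a suggested ANALYSER_XMX range in megabytes. This is only an illustration: the function name suggest_xmx_range is hypothetical, values are rounded down by integer division, and repositories of 5 million lines or more are reported as not covered because the list gives no percentage for them.

```shell
# Hypothetical helper: print "low-high" (in MB) for ANALYSER_XMX.
# $1 = lines of code, $2 = container memory (-m) in MB
suggest_xmx_range() {
  if [ "$1" -lt 1000000 ]; then
    lo_pct=35; hi_pct=37
  elif [ "$1" -lt 3000000 ]; then
    lo_pct=40; hi_pct=42
  elif [ "$1" -lt 5000000 ]; then
    lo_pct=50; hi_pct=55
  else
    # The rules of thumb do not list a percentage for this size.
    echo "no rule of thumb listed for $1 lines of code" >&2
    return 1
  fi
  echo "$(( $2 * lo_pct / 100 ))-$(( $2 * hi_pct / 100 ))"
}

# 800,000 lines of code in a container started with 16384 MB
suggest_xmx_range 800000 16384   # prints 5734-6062
```

Treat the printed range as a starting point; as noted earlier, factors such as code duplication can push the required value higher.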

Note: For C++ repositories, we recommend performing a Remote scan in Strict mode to achieve greater accuracy.