Your First Scan

In this guide, you will learn how to create a project, link a repository and scan it using a sample open source project. Before you get started, an active Embold account is needed. (Don’t have one? Sign up free here).

Creating a Project

Projects are what Embold uses to store and manage your code; they contain the repositories you link to Embold. You can create as many projects as you need and add as many repositories to each as you like. For today’s tutorial, we’ll create a single project and link a single repository.

Steps to add a project:

If you’re not there already, log into your Embold account (yourdomain.mygamma.io). Navigate to the Projects page and click Create A New Project.

Enter Project Details:

You should now see a popup with two fields: project name and project description. Normally, you would enter information relevant to your own project, but for this tutorial, we’ll use the following:

  1. In the Project Name field, enter “Apache”.
  2. In the Project Description field, enter “Apache Commons Text is a library focused on algorithms working on strings.”

Once you have entered the information, click Create Project.
You just created your first project on Embold!



Linking a Repository

Repositories are the storage containers for your source code. Most of today’s repositories are stored using services like GitHub, GitLab, Bitbucket, etc. (view a full list of supported services here). In order to analyze any code on Embold, a repository must first be linked.

Steps to link a repository:

Inside your new project (Apache), click Link Repositories.

You should now see a popup with a series of forms representing the information Embold needs to link and scan a repository. (Some of your fields may differ from the ones presented; they depend on the account and/or repository type chosen.)

Set the Account Type:

As the name suggests, here’s where you set the type of version control account your repository is stored on (example: Git, Bitbucket, TFS, Zip, SVN, etc.). Currently, free, open source Embold accounts are limited to public GitHub repositories only.

If not already selected, set Account/Repository Type to Git.

Get the Repository URL:

Since the project we want to analyze is hosted on GitHub, we’ll need to use its URL to access it. We can do this by going to the project page and copying the link that is displayed when clicking the Clone or download button.

Alternatively, copy https://github.com/apache/commons-text.git. Then paste the GitHub URL into the URL field:

Enter the Username and Password:

As mentioned, whether certain fields are required or not is based on the type of account and/or repository.
In this tutorial, we are analyzing a public repository. Hence, we don’t need to enter a username and password.
However, keep in mind that these fields would be required if we were trying to analyze a private repository.

If presented with the username and password fields, leave these blank.

Name the Repository:

This mandatory field gives you and your teammates a good indication of what’s inside. A good practice is to mirror the name the repository already has, in this case Commons-Text.

In the Repository Name field, type Commons-Text.

Set the Language:

In order for Embold to properly analyze a repository, it needs to know which language it was written in. This is easily set via the dropdown in the popup.
Set Language to Java.

Click Link Repository.
You just linked your first repository!



Scanning a Repository

Scans are the foundation of Embold: once started, Embold will analyze the given source code, generate metrics, identify code issues and find software design anti-patterns. So let’s run a scan on the repository we just set up.

Run a scan

Inside the Commons-Text repository (in the Apache project), click the Scan button.

Scan selection

After clicking the Scan button, a new popup opens that lets you specify which branch or tag of your Git repository you want to scan. It also shows how many of your scan credits are already used and how much of your scan history is occupied. By default, the ‘master’ branch is selected. To make your results comparable to ours, navigate to the Tags tab and select commons-text-1.2. Click Scan in the bottom right corner of the popup to start the actual scan process.

View scan progress

After clicking the Scan button, the analysis will be scheduled. You can track the scan status, along with all other currently running scans, from the scan queue tab, which is accessible from the menu on the left side. Clicking on an item in this list (in this example, ‘Commons-Text’) opens the detailed progress monitor, which you can use to follow the scan progress.

After some time, the scan status should reach Completed.

All previous scans, along with several other repository-specific settings, can be found in the repository menu. To view this menu, navigate inside the Apache project and click on the three dots in the upper right corner of the Commons-Text repository card.

Congratulations! You just finished your first scan! You can now navigate back to the repository to have a look at the findings.



Identifying Problems

Get an overview of your software and identify the most important issues.


Get an Overview of Your Software

1. You need an active Embold account. (Don’t have one? Sign up free here).
2. Identifying Problems builds on the lessons learned in the previous articles. To get the most out of this article, we recommend you finish Your First Scan first.

Embold rating system

The Embold rating system is a numeric representation of the quality of the software. The rating is calculated on every level: for a function, a method, a component, a package, and the software as a whole. It ranges from -5 to 5, where -5 indicates very poor quality and 5 indicates that the software is exceptionally well designed. When we analyzed the Commons-Text repository for version 1.2 to prepare this article, it had an overall rating of 2.74, which is quite good. The rating is shown on the project and repository level as the number inside the colored circle. This lets you quickly compare the quality of your projects and repositories once you have scanned more than one.

What affects the Embold rating?

Click on the “Commons-Text” repository tile. This will bring you right into the results of the previous scan of this repository. More specifically, it will show you the Repository Dashboard for the last scan. The first areas to focus on are the ratings displayed in the tiles. What is shown is the overall rating, then the rating for certain aspects of the code, such as the design, code issues, metrics, and duplication. All of these ratings contribute to the overall Embold rating.

Reading the Repository Overview

The Repository Overview shows the general state of your software at a glance. There are two ways to get to this view: you can either click on the “Overall Rating” tile, or you can navigate to the node summary bar at the top of your page and click on the third node, which shows the overall rating number.
The top section of the overview page shows general information about the repository, such as the overall rating, the rating history across scans, the total lines of code (LoC), the number of components, and the number of hotspots. The green-yellow line between the rating and the other information displayed on the page indicates the rating of the scan. As you scan the repository again after each update, more lines of different height and color appear, showing the trend across updates. Below that, you can see that the Commons-Text repository has 27,095 LoC, 13,808 of which are executable; 160 components (classes in Java); and no hotspots. A hotspot is a component with a negative overall rating.

Under the general overview, there is more detailed information on the individual sub-ratings: design, metrics, duplication, and code issues. With a rating of 4.68, the design of the code is well made and has minimal issues. The metrics rating is quite low at 0.62, which is due for the most part to Number of Methods (NOM), Complexity, and Response for Class (RFC) violations. Duplication is rated 3.48, with 21 clones of an average size of 58 lines, and code issues have a rating of 3.88 with 30 high-priority issues, but no critical ones.
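
The rating scale and the hotspot rule can be sketched in a few lines of Python. This is a minimal illustration, not Embold’s implementation: the `Component` class and the `is_hotspot` helper are hypothetical names, and the only rules encoded are the ones stated in this guide (ratings range from -5 to 5, and a hotspot is a component with a negative overall rating). `ExampleHotspot` and its rating are made up for the example.

```python
from dataclasses import dataclass

RATING_MIN, RATING_MAX = -5.0, 5.0  # rating scale described in this guide


@dataclass(frozen=True)
class Component:
    """Hypothetical container for a component and its overall rating."""
    name: str
    overall_rating: float

    def __post_init__(self):
        # Reject values that fall outside the documented scale.
        if not RATING_MIN <= self.overall_rating <= RATING_MAX:
            raise ValueError(f"rating {self.overall_rating} is outside [-5, 5]")


def is_hotspot(component: Component) -> bool:
    """A hotspot is a component with a negative overall rating."""
    return component.overall_rating < 0


components = [
    Component("StrSubstitutor", 0.68),   # rating reported in this guide for version 1.2
    Component("StrBuilder", 0.73),       # rating reported in this guide for version 1.2
    Component("ExampleHotspot", -1.20),  # made-up value for illustration
]
hotspots = [c.name for c in components if is_hotspot(c)]
print(hotspots)  # ['ExampleHotspot']
```

Note that a component such as StrBuilder, at 0.73, is close to the hotspot boundary but does not cross it.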

Using the Node Summary Bar

The same information as in the Repository Overview is displayed in the Node Summary Bar at the top of the screen. By hovering over this bar, you can see the same ratings and key findings.

Navigating through your code

You can navigate through your source code tree with the path above the Node Summary Bar, or you can directly open the Tree Navigation tray by clicking the icon in the top left.


View and Interpret the Heatmap

What Does the Heatmap Show?

The Heatmap can be accessed from the top navigation bar by clicking the box icon next to the fire icon:

Clicking on it will bring you to the Heatmap page, which shows the main components of your repository. Each component is represented as a rectangle, where the size of the rectangle stands for the size (lines of code) of this component, and the color represents the rating. By hovering over the rectangles with your mouse, the name of the component is shown, as well as its rating on the rating scale.

Use of Heatmap

The heatmap can be used for two things:
1. Understanding the overall quality of your software.
2. Identifying components to investigate further.

The slider at the top of the Heatmap can be used to highlight the specific components that fall under the different ratings. For example, it can be used to highlight all components with a rating lower than 1.5.
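
Conceptually, the slider acts as a threshold filter over component ratings. Here is a rough sketch, assuming the ratings are available as a plain name-to-rating mapping. The `highlight_below` function is a hypothetical name, and every rating except StrBuilder’s is invented for illustration.

```python
def highlight_below(ratings: dict, threshold: float) -> list:
    """Return the names of components rated strictly below the threshold."""
    return sorted(name for name, rating in ratings.items() if rating < threshold)


ratings = {
    "StrBuilder": 0.73,     # rating reported in this guide
    "ExampleUtils": 3.90,   # made-up value
    "ExampleParser": 1.20,  # made-up value
}
highlighted = highlight_below(ratings, 1.5)
print(highlighted)  # ['ExampleParser', 'StrBuilder']
```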

The large orange components indicate issues that need further investigation. When you click on one of the components, you are brought into the Component Explorer. There, you can navigate through the different code issues and see the exact lines where the problem resides. In this case, the component StrBuilder seems to be a promising candidate for a deeper investigation since it has a rating of only 0.73 (nearly a hotspot) and is quite large.
To learn how you can use the Component Explorer to improve the score, have a look at Improving Hotspots.


Identify your top priorities with the component list

What does the component list show?

The component list can be accessed from the Node Summary bar by clicking on the list icon:

You will be redirected to a page that shows a list of all components of your software sorted by ascending Overall Score. The components with the worst score are at the top. Next to the overall score, the sub-ratings for Design, Metrics, Duplication and Code Issues are shown.

The list can be sorted by any of these ratings to allow you to find the top issues depending on which category you want to improve first, such as design quality.
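
Sorting by a sub-rating amounts to ordering the same rows by a different key. A minimal sketch, assuming each row is a dictionary; the component names, column names, and all rating values here are hypothetical and chosen only for illustration:

```python
rows = [
    {"name": "ComponentA", "overall": 0.9, "design": 4.1, "duplication": 2.0},
    {"name": "ComponentB", "overall": 2.5, "design": -0.5, "duplication": 4.8},
    {"name": "ComponentC", "overall": 1.7, "design": 3.0, "duplication": 0.4},
]


def worst_first(rows, column):
    """Sort ascending so the lowest-rated components come first."""
    return sorted(rows, key=lambda row: row[column])


by_overall = [row["name"] for row in worst_first(rows, "overall")]
by_design = [row["name"] for row in worst_first(rows, "design")]
print(by_overall)  # ['ComponentA', 'ComponentC', 'ComponentB']
print(by_design)   # ['ComponentB', 'ComponentC', 'ComponentA']
```

The two orderings differ, which is why the component at the top of the list depends on the category you want to improve first.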

How to use the component list?

In our example with Apache Commons-Text, the three worst components by overall rating are StrSubstitutor, StrBuilder and StrBuilderAppendInsertTest. Since the third component contains software tests, it is better to focus on the other two. To get to the code level, click on the component name and you will be redirected to the Component Explorer.

When you sort by design rating, there are two other components that have the worst rating and will be at the top of the list.

How to assess the risk?

The Embold Risk rating is displayed in the first column, next to the component name. While the Overall Rating represents the software quality of one component, the Risk rating is about how much impact this component would have on your software should something go wrong. For some projects, it is acceptable for a component to have a low Overall Rating if the risk related to it is low. But if a component has a high risk rating with a low Overall Score, it should be dealt with quickly.
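
The triage rule described above can be written down as a simple predicate. This is only a sketch: this guide does not state how Embold scales the Risk rating or where to draw the line, so the 0-to-1 risk scale, both thresholds, and the function name below are assumptions made for illustration.

```python
def needs_quick_action(overall_rating: float, risk: float,
                       low_rating: float = 1.0, high_risk: float = 0.7) -> bool:
    """True when a component is both low-rated and high-risk.

    low_rating and high_risk are hypothetical cut-offs, and risk is
    assumed to lie between 0 (no impact) and 1 (maximum impact).
    """
    return overall_rating < low_rating and risk >= high_risk


print(needs_quick_action(overall_rating=0.5, risk=0.9))  # True: low rating, high risk
print(needs_quick_action(overall_rating=0.5, risk=0.1))  # False: low rating, but low risk
print(needs_quick_action(overall_rating=4.0, risk=0.9))  # False: high risk, but well rated
```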

All in all, StrSubstitutor seems to be the component that needs the most attention. It has the lowest overall Embold Score, the second lowest design score, an above average risk and is not a testing component. To find out how Embold helps to improve components, have a look at Improving Hotspots.

More advanced features

The parameter menu on the right side of the Component List can be used for more advanced features. You can filter the displayed rows by their Hotspot status, or by various component types.

The Component List can also be used to display more information than just the ratings: it can display all calculated metrics and the duplication details as well. This can be selected from the Columns drop-down menu on the right side. You can view more information on the metrics here.


Dive through Issue Distributions

The previous sections showed how to identify potential low-quality components that need more attention, either through the Heatmap or the Component List. In this section, the focus is on two different aspects:
1. Identifying problematic packages
2. Identifying problem classes that lead to bad quality

What are the available Distribution Screens?

The distribution screens can be accessed from the top navigation bar:
Embold currently shows five different types of distributions:

  • Design issue distribution
  • Code issue distribution
  • Metrics distribution
  • Duplication distribution
  • Hotspot distribution

Please click on “51 Code Issues” to access the code issue distribution now.

What does the Code Issue Distribution screen show?

The distribution screen always shows the number of issues affecting the current package level that was selected from the Tree Navigation. Use the navigator on the top right and follow this path: src → main → java → org → apache → commons → text, then select “text” (unsure how to do this? Please look at this section to learn how to use the Tree Navigation). Your screen should look like this:

You can see that 12 high-priority code issues have been found in the StrSubstitutor.java file. The parameter menu on the right side of the screen can be used to change to a different view; use it and select “Modules vs Issue Types”.

This graph shows how often certain types of code issues have been found. For Commons-Text, the issues identified most often have been “ConstructorCallsOverridableMethod” and “AvoidBranchingStatementAsLastInLoop”. These issues are always specific to the programming language used, so these issues are specific to Java. To find out more about these issues, click on the graph. For general information on each of these issue types please visit our code issues documentation.


Assessing Changes

Quickly analyse and understand the impact of code changes in your software.


View the impact of change

Before you get started

Before starting this article, make sure you have:
1. An active Embold account (Don’t have one? Sign up free here).
2. Gone through Your First Scan and Identifying Problems. Assessing Changes builds off of the lessons learned in the previous articles. To get the most out of this article we recommend you finish the other two first.

Scanning multiple snapshots

The Change Overview screen allows you to compare the quality of two different versions of your software. For this, you need to have two snapshots in Embold. A snapshot represents the state of your repository at a given point in time and is created as a result of scanning your software. If you have not done so already, please follow Your First Scan to create your first snapshot of Apache Commons-Text version 1.2. Next, perform a second scan of Commons-Text and scan the Tag “commons-text-1.3”. You can do this by clicking on Scan in the top right corner after entering your repository:

After this, select the Tag “commons-text-1.3” and click on “Scan”. After the scan is finished, a second snapshot representing Apache Commons-Text 1.3 is created.

Using the Change Overview screen

Navigate to the Change Overview screen using the top navigation bar.

Select the two snapshots you want to compare. By default, the current snapshot is selected as the right side of the comparison, which represents version 1.3 of Commons-Text. Select the previous snapshot for the left side of the comparison.

The change summary shows that 7160 executable lines of code were added between version 1.2 and 1.3, and the overall score decreased by 1.29 points. Use the drop-down menu to see the changes in the sub-ratings for metrics, design, duplication and code issues.

In the change overview, all improvements and deteriorations of hotspots are displayed. In version 1.3, five new hotspot components were added and three existing components became hotspots. Clicking on them shows a more detailed view of the problems.

What does it tell me?

The Change Overview shows that between release 1.2 and 1.3, the software quality of Commons-Text decreased quite significantly. By looking at the Duplication rating it becomes evident that a lot of duplicated code was added. Additionally, five new hotspot components were added, which have a rating below zero. A negative change this drastic should be discussed within the development team to identify the root causes and find ways of establishing a continuous improvement process.

As a summary, the change overview is mainly used for two purposes:
1. Reviewing newly developed code, e.g. for a new feature or a pull request.
2. Periodically reviewing the overall quality, e.g. after each sprint or release cycle.


Dig down to detailed changes

Using the Changed Component List

The previous section shows how to use the Change Overview to quickly assess the impact of code changes; this part shows how to analyze detailed changes and their impact. To see this, click on the Changed Component List icon in the navigation bar:

This list is similar to the standard component list introduced in Identifying Problems. The difference is that, in addition to showing current scores and values, it also shows the change in specific values between the versions selected in the top drop-down.

The component with the name “StrBuilderTest” has a rating of -0.37 in the snapshot created on Sep 21. Compared to the snapshot on Sep 20, it decreased by 2 points, indicated by the red number.

Utilizing advanced features of the changed component list

In addition to sorting the list by the current score value, it can also be sorted by the difference value. This can help answer questions like “Which component decreased most in its design score” or “Which component eliminated most of its previous duplications”.
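
The difference value is simply the current score minus the previous one, and sorting by it surfaces the biggest movers. A small sketch under stated assumptions: the `rating_changes` helper is a hypothetical name, the previous rating of StrBuilderTest (1.63) is derived from the stated decrease of 2 points, and the two `Example` components are invented.

```python
previous = {"StrBuilderTest": 1.63, "ExampleA": 2.00, "ExampleB": 3.10}
current = {"StrBuilderTest": -0.37, "ExampleA": 2.40, "ExampleB": 2.90}


def rating_changes(previous, current):
    """Difference (current - previous) for components present in both snapshots."""
    return {
        name: round(current[name] - previous[name], 2)
        for name in current
        if name in previous
    }


changes = rating_changes(previous, current)
most_decreased = min(changes, key=changes.get)
most_improved = max(changes, key=changes.get)
print(most_decreased, changes[most_decreased])  # StrBuilderTest -2.0
print(most_improved, changes[most_improved])    # ExampleA 0.4
```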

Another advanced feature is changing the columns to show the specific metrics instead of the scores. This can be done by changing the Column dropdown to Metrics in the right parameter menu. This can answer questions like “Which component increased its Complexity the most” or “Which component gained the most new lines of code”.


Improving Hotspots

Analyse component level problems and derive refactoring recommendations.


Using the component explorer to fix issues

Before you get started

Before starting this tutorial, make sure you have:
1. An active Embold account (Don’t have one? Sign up free here).
2. Gone through Your First Scan, Identifying Problems, and Assessing Changes. The Improving Hotspots article builds on the lessons learned in these previous articles.

Analysing the component rating

In Identifying Problems, the component StrSubstitutor from Apache Commons-Text 1.2 was identified as potentially problematic and in need of further investigation. To navigate to this component, make sure the first snapshot is selected in the dropdown, and then use the component list or the heatmap.

At the top of the Component Explorer, the node summary bar shows the Embold scores for this specific component. StrSubstitutor has an overall rating of 0.68, a design rating of 3.34, a code issue rating of -0.19, and so on. By hovering over these ratings, additional information is displayed. For example, hovering over the design rating reveals that this component is considered to be a Brain Class. A Brain Class is a class that holds too much complexity, and often contains one or more Brain Methods.

Hovering over the Metrics score opens up the actual metric values for this component. StrSubstitutor has a complexity of 102, consists of 452 statements, 57 methods and so on. The values in red indicate that a certain metric violates our threshold and is therefore considered sub-optimal. Find out more about our metrics and their thresholds in our documentation.

Investigating problematic areas of the code

Below the component scores lives a view of the source code from this snapshot. The code itself is displayed on the right side, whereas all code specific findings are on the left side. There are three different types of findings:
1. Method level design issues
2. Duplication
3. Code Issues

1. Method level design issues

There is a multitude of method level design issues that Embold can detect. For a full overview, please visit our anti-pattern documentation. One interesting finding inside of StrSubstitutor is the Brain Method substitute. A brain method is a method that tends to centralize the functionality of its owner class. It tends to be long as it carries out most of the tasks of the class that are supposed to be distributed among several methods.
Fixing a method level design issue depends on the type of issue. In this case, the issue can be solved by simplifying the “substitute” method. It should either be re-written or split up into multiple, simpler methods.
Fixing the Brain Method issue will lead to a higher design score, and could even resolve the component level design issue (Brain Class) of this component.
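
To make the idea of splitting up a long method concrete, here is a sketch of what the result of such a refactoring could look like. It is written in Python for brevity and is not the Commons-Text code: the `${name}` syntax handling is deliberately simplified, and `find_variable`, `resolve`, and this `substitute` are hypothetical functions. The point is only that the top-level function composes small helpers, each with a single responsibility.

```python
def find_variable(text, start):
    """Locate the next ${name} marker at or after start.

    Returns (begin, end, name), or None when no complete marker remains.
    """
    begin = text.find("${", start)
    if begin == -1:
        return None
    end = text.find("}", begin + 2)
    if end == -1:
        return None
    return begin, end + 1, text[begin + 2:end]


def resolve(name, values):
    """Look up a variable; unknown names are left as written."""
    return values.get(name, "${" + name + "}")


def substitute(text, values):
    """Replace every ${name} marker by composing the two helpers above."""
    parts, position = [], 0
    while (match := find_variable(text, position)) is not None:
        begin, end, name = match
        parts.append(text[position:begin])  # literal text before the marker
        parts.append(resolve(name, values))
        position = end
    parts.append(text[position:])  # trailing literal text
    return "".join(parts)


result = substitute("Hello ${user}, welcome to ${place}!", {"user": "Ada"})
print(result)  # Hello Ada, welcome to ${place}!
```

Each helper can now be read and tested on its own, which is the property a Brain Method lacks.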

2. Duplication

After clicking on Duplication, all areas of the source code are shown that appear in more than one place. In this example, there is one block of 22 lines of code that appears two times in the same class. In other examples, there could also be duplicated blocks across different components. Duplicated code is highlighted in blue in the component explorer.

To fix this issue, the source code of the two duplicated sections should be rewritten in a way that does not require the same code twice. This could be done by moving the duplicated code into a new function that is responsible for the common functionality and is called by the two functions that currently contain the duplicated code.
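
The extraction described above might look like the following sketch. The functions are invented for illustration and are unrelated to the actual duplicated block in StrSubstitutor: `normalize` holds the logic that would otherwise be repeated, and the two callers keep only what is specific to them.

```python
def normalize(value: str) -> str:
    """Shared logic that would otherwise be duplicated in both callers."""
    return " ".join(value.strip().lower().split())


def format_title(title: str) -> str:
    """Caller 1: reuses normalize, then applies title casing."""
    return normalize(title).title()


def make_key(title: str) -> str:
    """Caller 2: reuses normalize, then builds a hyphenated key."""
    return normalize(title).replace(" ", "-")


print(format_title("  Apache   COMMONS text "))  # Apache Commons Text
print(make_key("  Apache   COMMONS text "))      # apache-commons-text
```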

3. Code Issues

The third section of the Component Explorer lists all code issues found by Embold. Code issues are violations of certain code rules and can be configured for each repository. For a full list of code issues, please visit our documentation.

Twelve high priority and three medium priority issues were found in the StrSubstitutor component. Clicking on an issue in the left bar brings you directly to the location where the selected issue has been found. Hovering over the {x} symbol in the editor gutter reveals additional information on the specific issue, and clicking on the hover text will display a full, in-depth description.

Fixing a code issue depends on the specific issue that has been found. The issue above could be fixed by not reassigning the method parameter “priorVariables”, but instead working with a new local variable inside the method. This is also described in the popup that opens when you click on the issue.
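
The general shape of that fix, sketched in Python rather than Java: instead of rebinding the parameter, the function copies it into a local variable and works on the copy. The `collect_variables` function is hypothetical and only borrows the parameter name from the finding; it does not reproduce the Commons-Text method.

```python
def collect_variables(name, prior_variables=None):
    """Return the variables seen so far, including name.

    The parameter prior_variables is never reassigned or modified;
    all work happens on the local list seen.
    """
    seen = list(prior_variables) if prior_variables is not None else []
    seen.append(name)
    return seen


prior = ["user"]
updated = collect_variables("place", prior)
print(updated)  # ['user', 'place']
print(prior)    # ['user']
```

Because the parameter keeps its original value throughout the method, a reader never has to track which of two meanings it currently has.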


Analyze dependencies

After navigating to the Component Explorer of a certain component, the Dependency View can be accessed via the top menu:

The Dependency View can show all incoming or outgoing dependencies of one component. Outgoing dependencies describe the components that are used by the current component; incoming dependencies show which other components use the current component. The color of the connecting line between two components indicates the rating of the connected component. This makes it easy to see which low-rated components are used by the current component. StrSubstitutor depends on StrBuilder, StrLookup and StrMatcher.
StrBuilder has a medium rating, indicated by the orange color of the connection, so this is a dependency that should be carefully watched when developing on StrSubstitutor. To view multiple levels of dependencies, you can click the little circles after each component name. This can be especially useful when analyzing a dependency-based design issue such as Global/Local Butterflies or Global/Local Breakables.
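
The relationship between the two directions can be sketched with a small graph: incoming dependencies are the outgoing ones read backwards. The outgoing list for StrSubstitutor is taken from this guide; the StrBuilder entry and the `incoming_dependencies` helper are assumptions added for illustration.

```python
from collections import defaultdict

outgoing = {
    "StrSubstitutor": ["StrBuilder", "StrLookup", "StrMatcher"],  # from this guide
    "StrBuilder": ["StrMatcher"],  # assumed for illustration
}


def incoming_dependencies(outgoing):
    """Invert an outgoing-dependency map: for each component, who uses it?"""
    incoming = defaultdict(list)
    for source, targets in outgoing.items():
        for target in targets:
            incoming[target].append(source)
    return dict(incoming)


incoming = incoming_dependencies(outgoing)
print(incoming["StrMatcher"])  # ['StrSubstitutor', 'StrBuilder']
print(incoming["StrBuilder"])  # ['StrSubstitutor']
```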


Using the Partitioning Editor

When to use the partitioning editor

There are many issues that are a result of classes being too large. This can range from components that are slightly above the recommended lines of code threshold (like StrSubstitutor) to large components with noncohesive functionality (like God Classes).
A common strategy for dealing with such components is breaking them up into multiple smaller components. This way, each component is providing a clean abstraction and cohesive functionality. Unfortunately, doing this kind of refactoring can be quite complicated, especially when dealing with very large classes. The Partitioning Editor is intended to make this kind of refactoring work more efficient.

How to use the partitioning editor

The partitioning editor can be accessed by clicking on the corresponding button in the top menu on the component level of your source code tree:

Based on a partitioning algorithm and language processing, it recommends a way to split up the current component into multiple smaller components. With the range input on the top, the granularity of the partitioning algorithm can be defined. The higher the range, the finer the recommended partitioning.

The screenshot above shows one sample partitioning of the StrSubstitutor component at a granularity of 8. It shows that it is possible to split this component into about three smaller components as well as a few minor ones. When clicking on one of the bubbles, it shows which attributes and methods of the component are suggested to be put into that partition.
